Archive for the ‘Security’ Category.

Released M2Crypto 0.19

I just pushed out M2Crypto 0.19. This ends the longest hiatus in releases (almost a year since 0.18.2) since I took over the project; apologies for the delays. I highlighted the best parts about 0.19 in an earlier post, so I won’t repeat them here. I need to make one clarification regarding Python 2.6 support: the optional timeout parameter added to many network modules is not yet supported in M2Crypto 0.19. I just noticed this too late for this release.

In preparation for 0.19 I did the first ever code coverage analysis of M2Crypto. I installed the latest coverage and nose, and ran the M2Crypto unit tests. At first I got 72%. I then added some tests on trunk, and got the number to 75%. Then I added some docstrings and was surprised to note the figure jumped to 78%. Now I just need to write some more docstrings to break the magical 80% code coverage limit ;)

While nose and coverage were surprisingly easy to set up and run, finding out the specific lines of code that were not covered was not very user friendly. For that I installed figleaf. The workflow then became:

nosetests --with-coverage --cover-package=M2Crypto

which wrote the file .figleaf in the current directory. Then I ran:

figleaf2html -d build/fig .figleaf

which produced HTML files in the build/fig directory. The HTML files showed the source code, formatted such that it was easy to see what was covered and what not. Basically non-covered lines were colored red.

Update: It seems I messed up the figleaf instructions. The above nosetests line will not produce .figleaf. I know of two ways to produce that. The first one is to add two more options to the nosetests command, which then becomes:

nosetests --with-coverage --cover-package=M2Crypto --with-figleafsections --figleaf-file=.figleaf

Unfortunately, trying to process this with figleaf2html leads to:

Traceback (most recent call last):
  File "/usr/bin/figleaf2html", line 8, in <module>
    load_entry_point('figleaf==0.6.1', 'console_scripts', 'figleaf2html')()
  File "/usr/lib/python2.5/site-packages/figleaf-0.6.1-py2.5.egg/figleaf/annotate_html.py", line 256, in main
    coverage = figleaf.combine_coverage(coverage, d)
  File "/usr/lib/python2.5/site-packages/figleaf-0.6.1-py2.5.egg/figleaf/__init__.py", line 89, in combine_coverage
    keys.update(set(d2.keys()))
AttributeError: CodeTracer instance has no attribute 'keys'

The second way, which actually works, is to use figleaf directly:

figleaf --ignore-pylibs setup.py test -q

in the M2Crypto source tree. Then figleaf2html will work. The downside is that setup.py and test files are included in coverage.

Common SSL Misconceptions

There seems to be a single fundamental misunderstanding about Secure Sockets Layer (SSL), or Transport Layer Security (TLS) as the newer standard is called, and that is that given insecure DNS, it is possible to perform a man-in-the-middle attack on any SSL connection. Obviously if this were true, SSL would not be used, so that should immediately make you suspicious.

There are many checks that an SSL implementation must do. Typically SSL libraries will do these checks without application developers needing to worry about them too much. These checks prevent many classes of hacking attempts, for example by foiling tools that automatically create certificates by duplicating all human readable fields for connections they try to spoof, because these generated certificates are not issued by known certificate authorities. But what often seems to be missing by default, and is also often omitted by literature describing SSL deployment, is the post-connection check done right after the SSL handshake. Typically this check is to make sure that you connected to the host you wanted to connect to, and is done by checking the hostname field of the certificate returned by the peer. (There are other kinds of checks that could be done too, like verifying the peer’s certificate fingerprint is in an expected set of fingerprints.) Without this check it would be possible to perform a MITM attack by requesting a valid certificate from a certificate authority for any domain, and using that to spoof connections to any other domain.
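To make the post-connection check concrete, here is a minimal sketch in plain Python. It assumes the SSL library has already verified the certificate chain and handed you the names from the certificate plus its DER bytes; the function names are made up for this illustration, and the wildcard handling is deliberately simplified compared to what real libraries do.

```python
import hashlib


def hostname_matches(pattern, hostname):
    """Return True if a certificate name such as 'www.example.com' or
    '*.example.com' covers hostname. Simplified: a wildcard is accepted
    only as the entire leftmost label."""
    pattern_labels = pattern.lower().rstrip(".").split(".")
    host_labels = hostname.lower().rstrip(".").split(".")
    if len(pattern_labels) != len(host_labels):
        return False
    if pattern_labels[0] == "*":
        # Require at least two fixed labels so "*.com" never matches.
        return len(pattern_labels) >= 3 and pattern_labels[1:] == host_labels[1:]
    return pattern_labels == host_labels


def post_connection_check(cert_names, cert_der, hostname, allowed_fingerprints=None):
    """Raise ValueError unless the peer certificate is acceptable for hostname.

    cert_names: names taken from the peer certificate (hypothetical input).
    cert_der: the raw DER bytes of the peer certificate.
    allowed_fingerprints: optional set of hex SHA-256 digests to pin against.
    """
    if not any(hostname_matches(name, hostname) for name in cert_names):
        raise ValueError("certificate is not valid for %s" % hostname)
    if allowed_fingerprints is not None:
        fingerprint = hashlib.sha256(cert_der).hexdigest()
        if fingerprint not in allowed_fingerprints:
            raise ValueError("unexpected certificate fingerprint")
```

The point is only that the check compares the host you asked for against what the certificate says, and that it has to run before any application data is sent.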

There is also some confusion about unsecured HTML pages that submit data to a secure URL, or secured pages that submit data to a non-secure URL. In both of these cases it is possible to attack the unsecured connections, rendering SSL useless. Unfortunately many banks and ecommerce sites still use these techniques. There are a couple of workarounds for end users regarding these: one is to just try and submit the login page without any data, which in some cases leads to an SSL-protected version of the page. Also in some cases you can try changing the http protocol to https manually in the urlbar. And if you have ever encountered a secured login page, it would be a good idea to bookmark the page for future use (rather than trying to remember and type the URL or, worse, following a link from some page).

It should be noted that given massive DNS vulnerabilities like what Dan Kaminsky found (let’s hope there aren’t more lurking around), it is possible to attack even SSL protected sites, because the attackers can get domain validated certificates for any domain. Only sites serving pure EV content would be safe from tampering, but even then this requires users to use products that support EV and be aware that the site is supposed to be EV protected (and stop if the site comes up without EV indicators).

Update: Dan Kaminsky pointed out in a comment that even an EV site can be attacked by script in another window (assuming the attacker redirected the other window into a domain validated site with same URL as the EV site), so one additional condition needs to be tacked on to ensure security: the browser should have just one window open, on the pure EV site. It could be relaxed a little by saying all browser windows and tabs must be on pure EV sites.

How to Replace Python’s socket.ssl with M2Crypto’s SSL Implementation

It seems like I started a mini-series about “hidden” M2Crypto tools and modules…

Python’s socket.ssl is not secure. If you need any real security you need to look for 3rd party packages (things will improve a little with Python 2.6).

Sometimes you are faced with a library that does SSL, but uses Python’s socket.ssl in a way that you can’t easily replace. For this purpose I wrote a little helper module using M2Crypto. Basically you just need to import this socklib.py before you import the module that is using Python’s socket.ssl, and call socklib.setSSLContextFactory() with a context factory that creates secure SSL contexts. With that, your SSL usage just became secure.

The socklib.py implementation is for client side only. It would be easy to expand it for servers, though. It may also lack some features, but it filled the need I had so that is where I stopped. I wrote it for Python 2.5 and haven’t thought about what would need to be changed for 2.6.

Root Certificates for Python Programs using Python

OpenSSL itself does not come with root certificates, which means that if you use OpenSSL for anything that requires those certificates (like SSL for example) you will need to get those certificates from somewhere else. This concerns most Pythonistas needing SSL since most Python programs use OpenSSL for SSL.

Most if not all Linux distributions include various sets of root certificates in OpenSSL-friendly formats. Windows also comes with root certificates, but to get access to them you would need to use the Windows-specific APIs.

The Curl project produced a crazy little script that can convert the certdata.txt from the NSS project (from Mozilla) into PEM format, suitable for OpenSSL. The Curl project also provided a converted certdata.txt file for download. Unfortunately the converted file was from a very old version of the certdata.txt file (when I first looked at it). I figured M2Crypto should have its own utility to do this conversion, so I ended up porting the script into Python. The dirty little certdata2pem.py script uses M2Crypto for certificate handling.
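The core of such a conversion can be sketched with nothing but the standard library. This is not the certdata2pem.py script itself (that one uses M2Crypto); it is a rough illustration that assumes each certificate in certdata.txt appears as a CKA_VALUE MULTILINE_OCTAL block of backslash-escaped octal bytes terminated by END. It ignores the trust records entirely, so it says nothing about which certificates should actually be trusted. The function names are invented for this sketch.

```python
import base64
import re
import textwrap


def octal_block_to_der(octal_lines):
    # Each byte is written as a backslash followed by three octal digits.
    values = re.findall(r"\\([0-3][0-7]{2})", "".join(octal_lines))
    return bytes(int(value, 8) for value in values)


def der_to_pem(der):
    # PEM is base64 of the DER bytes, wrapped at 64 columns, in an envelope.
    body = "\n".join(textwrap.wrap(base64.b64encode(der).decode("ascii"), 64))
    return "-----BEGIN CERTIFICATE-----\n%s\n-----END CERTIFICATE-----\n" % body


def certdata_to_pem(text):
    """Yield one PEM string per CKA_VALUE MULTILINE_OCTAL block in text."""
    block = None
    for line in text.splitlines():
        line = line.strip()
        if line == "CKA_VALUE MULTILINE_OCTAL":
            block = []          # start collecting octal lines
        elif block is not None and line == "END":
            yield der_to_pem(octal_block_to_der(block))
            block = None        # back to skipping everything else
        elif block is not None:
            block.append(line)
```

Writing the yielded strings into one file gives a bundle in the format OpenSSL can load as a CA file.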

I used my script to get root certificates for Chandler.

fcgi.py Exposes Python Tracebacks by Default

I was testing a Python web application that was using FastCGI deployment on Dreamhost, when I found myself looking at a souped up Python Traceback in my browser. At first I couldn’t understand why that was happening. As far as I knew I was running with full production settings and as such I would have expected a terse internal server error message.

Looking at the HTML source of the error page I discovered reference to cgitb. But as far as my code was concerned I did not set that. I tried specifically disabling that in my script but that made no difference. In a momentary act of desperation I did a find for all cgitb.py files under my account and made the cgitb.enable() function do nothing. Yet I was still seeing the tracebacks.

After a bit of scratching my head and throwing different words at Google it occurred to me to take a look at the fcgi.py script. Oops. The [WSGI]Server class has an error() method whose docstring states that it “May and should be overridden”. No $%^, the default just plasters all the dirty little secrets for the world to see! I’d like to see something like Debug[WSGI]Server that pretty prints the error, and leave the [WSGI]Server the production class. The naming would make it clear that you should not be using the debug version in production. As it is now, I wonder how many people actually read all the way towards the bottom of the 1331 line file to discover this gem.
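If overriding error() in fcgi.py is not convenient, one alternative is to wrap the WSGI application itself so that nothing ever reaches the server’s default error handler. The following is a sketch of that different technique (plain WSGI middleware, standard library only), not the fcgi.py API; the class name and logger name are made up for illustration.

```python
import logging
import sys

log = logging.getLogger("webapp")  # hypothetical logger name


class TerseErrorMiddleware(object):
    """Wrap a WSGI application so unhandled exceptions are logged on the
    server and the client sees only a generic 500 response."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        try:
            result = self.app(environ, start_response)
            try:
                # Materialize the body so errors raised while iterating
                # are caught here as well.
                return list(result)
            finally:
                if hasattr(result, "close"):
                    result.close()
        except Exception:
            # The traceback goes to the server log, never to the browser.
            log.exception("unhandled error in WSGI application")
            body = b"Internal Server Error"
            headers = [("Content-Type", "text/plain"),
                       ("Content-Length", str(len(body)))]
            # exc_info lets the server replace headers if none were sent yet.
            start_response("500 Internal Server Error", headers, sys.exc_info())
            return [body]
```

You would then hand TerseErrorMiddleware(application) to the server instead of the bare application. Buffering the whole body is a simplification that would not suit large streaming responses.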

I also added a warning to the Dreamhost documentation regarding Python FastCGI.

Countdown to M2Crypto 0.19 Begins

I just pushed out the first beta for the M2Crypto 0.19 release. The plan is to release 0.19 as quickly as possible following Python 2.6 release.

The road to 0.19 has been surprisingly long, and I didn’t intend it that way. While I was taking a break after 0.18.2 release, I found out it was time to find a new job. With the job search, and later ramping up with the new job, there just wasn’t much time and energy left to put in M2Crypto. But I have settled in with the changes, and it is high time to roll out the bug fixes and new features in M2Crypto that many people have worked hard for.

In my opinion the 0.19 release highlights are as follows:

  • Python 2.6 support
  • Fixed SSL deadlocks caused by GIL handling changes done in 0.18
  • Wrappers for OpenSSL ENGINE_* functions, which enable smart card usage
  • Wrappers for OpenSSL OBJ_* functions, making it easier to deal with X.509 certificates
  • Fixed crash that prevented encryption using public key from X.509
  • Fixed several functions and methods that failed silently or with wrong errors
  • Switched to writing private keys in more secure manner

You might want to take a look at the full change log as well.

I have done most of my development on a 64-bit Ubuntu Linux machine except for the last week or so since that machine died. Over the weekend and this week I have tested on 32-bit Ubuntu Linux and Cygwin. The Python versions I have covered are 2.4.x, 2.5.x and 2.6 release candidates. OpenSSL versions were late 0.9.8 series (0.9.8g or so). SWIG 1.3.33 or thereabouts. I would especially appreciate it if someone could test on Mac, and using native Windows Python. Also tests using 0.9.7 series OpenSSL and SWIG version < 1.3.30 would be a big help.

You can grab the sources from the M2Crypto homepage, or just do easy_install M2Crypto.

EV Certificate Sites Still Vulnerable to DNS Hacks

Extended Validation Certificates (EV) were invented to fix the mess of what became of SSL in the race to provide ever cheaper certificates. Historically browsers were displaying just a lock icon for any certificate that was issued by a trusted certificate authority (CA), and since there were no standard levels of verification, the least validated and cheapest (sometimes even free) domain validated certificates (DV) won a big chunk of the market. While DV certificates are fine for hobby forums and the like, you would really want more from a bank or an ecommerce site. The trouble is, you can’t easily tell if the certificate is DV or if more checks have been done. With EV there are standardized guidelines for the minimum level of checking that is needed, and browser vendors are on board by displaying different UI for EV sites. The expectation is that EV certificates will give a high assurance that the entity you think you are talking to really is that entity.

Dan Kaminsky’s recent DNS vulnerability find highlights the fact that DNS is still not secure. It is possible to spoof DNS and get a DV certificate for any domain, and then use another DNS spoof to redirect traffic to a site containing attack code. Now the problem arises when sites using EV certificates mix content from sites with DV certificates. There are some high profile sites doing this, by embedding Google Analytics scripts, or advertisements. So sites mixing DV content on an EV page are actually vulnerable to DNS hacks still!

There is currently discussion going on in the Mozilla security forums on how to fix this. One way would be to state that an EV site can only load content from sites that are controlled by the same entity as the main EV site. After all, the idea with EV is that you can be really sure who you are dealing with, but if you have content coming from multiple sources it is no longer so clear. Another option could be to require EV site to load content only from other EV sites, regardless of who controls the other sites. You’d naturally need to tell the user who all the parties are they are talking to, but this will quickly result in a messy UI. And at least I would like to know which entity controls what part of the page I am looking at, but this would be a hard problem to solve with dynamic content.
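To get a feel for how much third-party content a page pulls in, something like the following sketch can list the hosts a page loads resources from. It is a rough illustration using only the standard library: it looks at a handful of tags in static HTML, knows nothing about EV versus DV certificates, and misses anything added by scripts at run time. The class and function names are invented for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ResourceHostCollector(HTMLParser):
    """Collect the host names that script, img, iframe and link tags point to."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("script", "img", "iframe"):
            url = attributes.get("src")
        elif tag == "link":
            url = attributes.get("href")
        else:
            url = None
        if url:
            # Resolve relative and protocol-relative URLs against the page.
            host = urlparse(urljoin(self.page_url, url)).hostname
            if host:
                self.hosts.add(host)


def third_party_hosts(page_url, html):
    """Return hosts referenced by the page other than the page's own host."""
    collector = ResourceHostCollector(page_url)
    collector.feed(html)
    return collector.hosts - {urlparse(page_url).hostname}
```

Every host in the result is another party whose certificate, and whose DNS, the page is implicitly relying on.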

Security Vulnerabilities in Online Advertisement Systems

Since I started experimenting with online advertisements myself, I started noticing a worrying trend of practices that are vulnerable to malicious activities. Security holes are of course not limited to online advertising software and services, but the very nature of it makes it potentially more harmful than run-of-the-mill security vulnerabilities.

First issue is of course that there is obviously money involved, which will get the criminal elements involved right away. The potential victims are often individuals hoping for quick and easy extra income using point and click solutions without the know-how to vet the security of the solutions. And at least in the United States, to be able to receive income, individuals will need to give their name, address, phone number, social security number and potentially other tax documentation to the entity through which they get the advertisements to publish on their sites. All that personal information just waiting to be captured and used for identity theft. It is also the user’s expectation that since all this personal data and money is involved, the software and service providers would be especially careful about security, but unfortunately this does not always seem to be the case. The users cannot be expected to be enforcing security either; the dancing pigs problem is well known.

In my experience the smaller and less experienced software and service providers are on the whole more likely to suffer from security vulnerabilities, but the big players are not immune either.

Here are three examples of issues I have noticed:

  1. A marketing plugin for WordPress provides two ways to fetch the ad information. The one they recommend uses PHP to fetch PHP code from their host over http and execute it without any checks. If an attacker can inject their own code, the attacker’s code can run with the web server’s permissions, reading login and password information from the WordPress installation at a minimum. Depending on webserver configuration, it might also have access to other areas of the file system. This is an especially worrying attack now that Kaminsky’s DNS vulnerability has been disclosed, which makes the code injection easy. The fix for this issue would be to provide PHP code that would only fetch ad parameters that would be validated before being used, and nothing received over the network would ever be executed. The connections could also be protected with SSL, which would be an even better guarantee that no bogus data was returned.
  2. A Google AdSense competitor requires their users to log in and submit all their personal information over an unsecured connection!
  3. Many Google AdSense account holders will use the same Google account to log in to their Google Analytics pages. (In fact, they will probably use the same account for GMail and every other Google service that requires an account.) However, if you go to https://www.google.com/analytics, you will be redirected to the unsecured login page, giving malicious parties a way to get the user’s account information and access to the victim’s AdSense information and other services. The workaround is to log in through https://www.google.com/analytics/home. This issue has been discovered before by other people and reported to Google.
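The fix suggested in the first example (fetch only parameters, validate them, execute nothing) could look roughly like this. The plugin in question is PHP; this sketch uses Python to stay consistent with the rest of the blog, and the JSON format, field names and limits are all hypothetical, chosen only to show the shape of the validation.

```python
import json
import re

# Hypothetical schema for an ad response; a real service would define its own.
ALLOWED_KEYS = {"ad_id", "width", "height", "target_url"}
SAFE_ID = re.compile(r"[A-Za-z0-9_-]{1,32}")
SAFE_URL = re.compile(r"https://[A-Za-z0-9.-]+(/[A-Za-z0-9._~/-]*)?")


def parse_ad_parameters(raw):
    """Parse a JSON response and return a validated dict, or raise ValueError.

    Nothing received from the network is ever passed to eval or exec.
    """
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict) or set(data) != ALLOWED_KEYS:
        raise ValueError("unexpected fields in ad response")
    if not (isinstance(data["ad_id"], str) and SAFE_ID.fullmatch(data["ad_id"])):
        raise ValueError("bad ad_id")
    for key in ("width", "height"):
        value = data[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not 1 <= value <= 2000:
            raise ValueError("bad %s" % key)
    if not (isinstance(data["target_url"], str) and SAFE_URL.fullmatch(data["target_url"])):
        raise ValueError("bad target_url")
    return data
```

The validated values would then be inserted into a fixed template (with proper HTML escaping) rather than being treated as code.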

I have some recommendations for online publishers to make it safer to operate in the advertisement business:

  1. If possible, form a company. Company information is public anyway, so there is not much harm in divulging this to others. This isn’t such an appealing prospect for people starting in the publisher business because of the extra paperwork and in some cases the significant amount of money and time that is required to start and run a company.
  2. Make sure that you are giving and viewing personal information only through a secure connection. If you can’t see a link to a secure area, try changing http to https and hitting enter. In some cases you will be lucky and get the secure version of the page. If that does not work, email the service provider and ask if they have a secure version. Even if they don’t, this will give them incentive to work on implementing the secure version. You could also use other means of sending your information to the company, like calling or sending a letter by courier for example. But keep in mind that they might add your information on your behalf to the database that serves the data unsecured over the network. There is also quite a bit of competition in the ad space, so look at the competitors as well.
  3. When given a choice of ad injection method, as a rule of thumb HTML markup is safer than doing anything on the server side, so opt for the HTML version (this would typically be a Javascript tag). If server side is the only option, you would do well to either check the code yourself or ask someone else to go through the code looking for security vulnerabilities. This is a big topic in itself, and there are several books written on the topic, but one easy thing to check for is the use of “eval” or “exec” which are available in most languages. Any use of such a thing is a warning sign. Basically the code should be treating all the data that it reads over the network as potentially dangerous and do sanity checks on it before using it. A sort of a corollary of this advice is to avoid using plugins to manage ads, and instead do this by hand by inserting the ad markup in the desired locations. Of course this is not possible if you maintain large sites.
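The “look for eval or exec” check from the last recommendation is easy to automate as a first pass. Below is a minimal sketch that flags lines containing a call-like use of either word; the function name is made up, and a plain text search like this will also flag matches inside comments or strings and will miss obfuscated calls, so treat hits as places to read more closely rather than as a verdict.

```python
import re

# Match eval( or exec( as whole words, so names like curl_exec( or
# retrieval( are not flagged.
RISKY_CALL = re.compile(r"\b(eval|exec)\s*\(")


def find_risky_calls(source):
    """Return (line_number, stripped_line) pairs where eval( or exec( appears."""
    return [(number, line.strip())
            for number, line in enumerate(source.splitlines(), 1)
            if RISKY_CALL.search(line)]
```

Running it over the text of each file in a plugin gives a short list of lines worth inspecting by hand.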