M2Crypto Has New Maintainer

I am excited to announce that M2Crypto has a new maintainer and homepage! Last summer, Martin Paljak agreed to take over the project, since I had not been able to dedicate enough time to keep things going.

Martin is a long-time M2Crypto contributor and was a security expert long before his contributions to M2Crypto. He has probably forgotten more about software security than I ever knew, so I know the project will be in safe hands.

There is also a new version of M2Crypto available on PyPI, so go check it out!

Beware of cPickle

The Python pickle module provides a way to serialize and deserialize Python objects. A major downside of the pickle format is that it is not secure, meaning you should never deserialize pickles received from untrusted sources.

There is also a cPickle version of the pickle module, which implements the algorithm in C and is much faster than the pure Python module. This speed opens up some surprising use cases for cPickle beyond the obvious application save format: it turns out cPickle can be the fastest way to make a deep copy of nested structures. The speed can also make cPickle attractive as a data format between trusted servers.
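As a rough sketch of the deep-copy trick (the helper name is mine, it only works on picklable objects, and it is worth benchmarking against copy.deepcopy on your own data):

import cPickle

def fast_deepcopy(obj):
    # Round-trip through cPickle; for large nested structures of plain
    # containers this often beats copy.deepcopy.
    return cPickle.loads(cPickle.dumps(obj, cPickle.HIGHEST_PROTOCOL))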

There is an issue that you need to watch out for in the cPickle module, though. When you are serializing to or deserializing from a string using the dumps and loads functions respectively, the functions do not release the GIL! This took me by surprise: I did not expect anything in the stdlib to hold on to the GIL for anything that could potentially take a long time. You can try this out easily by creating a multithreaded application where one thread calls cPickle.dumps on a multi-megabyte data structure while the other threads, for example, print to the screen. You will see that while dumps is running, the other threads are stopped.
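Here is a rough version of that experiment (the structure size and sleep times are arbitrary; tune them until the stall is visible on your machine):

import threading
import time
import cPickle

# Build a structure big enough that dumps() takes a noticeable time.
data = [{'n': i, 'values': range(100)} for i in xrange(200000)]

def heartbeat():
    # This should tick steadily, but stalls while dumps() holds the GIL.
    for _ in range(30):
        print time.time()
        time.sleep(0.1)

t = threading.Thread(target=heartbeat)
t.start()
blob = cPickle.dumps(data, cPickle.HIGHEST_PROTOCOL)
t.join()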

Luckily there is an easy workaround: use the dump and load functions with a cStringIO buffer or another file-like object.
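Something along these lines should do (the function names are mine):

from cStringIO import StringIO
import cPickle

def dumps_friendly(obj):
    # Same result as cPickle.dumps(), but serializing into a file-like
    # buffer avoids the GIL-holding string variant.
    buf = StringIO()
    cPickle.dump(obj, buf, cPickle.HIGHEST_PROTOCOL)
    return buf.getvalue()

def loads_friendly(data):
    return cPickle.load(StringIO(data))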

Note that I haven’t checked if this problem applies to Python 3.x.

Decorator to Log Slow Calls

One of the most common examples given for Python decorators is a decorator that tracks how long the execution of the wrapped function took. While this is very useful in and of itself, it can cause issues when you want to use it in production.

The issue I faced was this: while trying to find out why my servers were too slow (only under production loads), I first added the simple timing decorator to everything I thought might be slow, hoping to catch all the slow calls and maybe find some patterns. This approach worked in the sense that I did find the slow parts quickly, but it produced far more log output than before, and I wasn't really interested in most of the timing information.

What I really wanted was a timing decorator that would log only when the wrapped callable took too long to execute. But there were still some calls that I always wanted to log, for accurate statistics. I figured the best approach was to make my decorator take a threshold argument with a reasonable default that I could override when needed.

Now, while I have written decorators before, this was the first one that called for optional arguments. Python treats decorators that take no arguments slightly differently from those that require arguments, which complicates things a bit. The sample in the Python decorator library is almost scary! I think my approach is nice and simple, yet fairly sophisticated:

from time import time
import logging
import functools

log = logging.getLogger(__name__)

def time_slow(f=None, logger=log, threshold=0.01):
    def decorated(f):
        @functools.wraps(f)
        def wrapper(*args, **kw):
            start = time()
            try:
                ret = f(*args, **kw)
            finally:
                # Measure even if the call raises, but log only slow calls.
                duration = time() - start
                if duration > threshold:
                    logger.info('slow: %s %.9f seconds', f.__name__, duration)
            return ret
        return wrapper
    # Bare @time_slow passes the function in as f; @time_slow(...) does not.
    if f is not None:
        return decorated(f)
    return decorated

This decorator can be applied bare, without any arguments, or with a custom logger or threshold value. In other words, both of these work:

@time_slow
def myfast():
    pass
 
@time_slow(threshold=0)
def myslow():
    from time import sleep
    sleep(0.001)

and typically only calling myslow would produce log output. I chose 0.01 seconds as a reasonable default threshold, but this of course depends a lot on the use case. The log line includes the slow function's name, as well as the duration formatted with 9 decimals in order to avoid exponential notation, which makes the log output easier to work with (sort -n, for example). I have used just a single threshold, but an easy improvement would be to pass a list of thresholds and log at different levels depending on the duration, as sketched below.
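A sketch of that variant, reusing the imports from above (the threshold values and levels are just placeholders):

THRESHOLDS = ((1.0, logging.WARNING),
              (0.1, logging.INFO),
              (0.01, logging.DEBUG))

def time_slow_levels(f=None, logger=log, thresholds=THRESHOLDS):
    def decorated(f):
        @functools.wraps(f)
        def wrapper(*args, **kw):
            start = time()
            try:
                return f(*args, **kw)
            finally:
                duration = time() - start
                # Thresholds are ordered slowest first; log at the level
                # of the first one exceeded.
                for limit, level in thresholds:
                    if duration > limit:
                        logger.log(level, 'slow: %s %.9f seconds',
                                   f.__name__, duration)
                        break
        return wrapper
    if f is not None:
        return decorated(f)
    return decorated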

M2Crypto Supports OpenSSL 1.0.x

I was supposed to release a new M2Crypto version in the summer of 2010, but "real life" got in the way, and the delay stretched all the way until this week. I finally decided I'd better push out a new release even though there was just one significant change: OpenSSL 1.0.x support. I felt this was really important, since OpenSSL 1.0.x has been out for almost a year now, and it is starting to get difficult to deal with software that works only with pre-1.0.x versions.

Unfortunately I made a mistake with my first release to PyPI: I used the setup.py commands to build, sign, and upload a source distribution, but I did this from a tree I had svn exported. Due to the way the M2Crypto setup.py was constructed, this meant the tarball was missing vital files. Yesterday I made a new 0.21.1 release from a Subversion checkout, so the tarball now includes everything.

Long Silence

If you have been following my blog, you have probably noticed that I haven't been writing, or in fact doing anything publicly, in the last few months. A lot happened in my personal life (for example, I moved to Santa Clara), but more importantly, I was buried in work.

It is weird how I always read about other people working insane hours at startups, but it had never really happened to me. Sure, especially early in my career I put in a few all-nighters, but not consistently. This time was different.

I had been working for several months on a large project, of which I was the main architect and developer, and we finally deployed it into production. And all hell broke loose! Some problems could have been avoided with a better test setup (especially load testing); some were due to my inexperience with heavily loaded systems; some we could chalk up to third-party software not quite ready for prime time; some to running software on VMs when we should have run on bare hardware; some to bizarre VM freezes (minutes at a time, which is obviously disastrous for a server); and some were lessons learned about the Python GIL and multithreaded applications. Add various other bits and pieces, all made worse by our steady and respectable growth in both the number of customers and the sizes of their data sets. I plan to write about some of that in later posts.

Since customers were being affected, and we didn't see the full scope of the problems in advance, I thought I would work extra hours and fix the issues. I was being optimistic, assuming that the most pressing issue, the one that was killing us, was also the one that would get us over the hump. This turned into days, then weeks, then months of 60-70 hour weeks, 6-7 days a week, on 5-6 hours of sleep, fixing issue after issue after issue. Combine this with the move to a new city and things were really crazy for a while. I don't understand how I did it, given how little time we had to test on many occasions, but I believe I never introduced a catastrophic bug that got rolled into production (if you discount the initial deployment ;). Finally, just before Christmas, things started humming the way I had planned, and the way they had worked in our tests months before. Of course there is still a lot of work to do to improve things, but it can be addressed at a more sustainable pace.

Luckily the work I was doing was very interesting, or I wouldn't have been able to keep it up for such a long period. In retrospect I can finally say that I have now personally faced the problem of scaling, and while I've always said it is a good problem to have, I now also realize that trying to solve it can easily lead to exhaustion, because the pressure to fix things quickly is enormous.

While I learned a lot and will be able to avoid some of these issues in the future, the experience also clearly showed what kinds of testing we still need to do better. Besides being better for the users of the software, that is also better for developer health and sanity…

I am still recovering from that ordeal, both mentally and physically, but I have been feeling much better. I do realize I am suffering from some burnout, but I am also starting to get the itch to continue developing my own software, both free and paid. An M2Crypto release has been pending for around six months now, and my Android apps are badly in need of an update.

I am also months behind on some personal correspondence. Sorry! I will try to get my inbox in order in the next couple of weeks as well.

EV Guidelines Holding Up

About a month ago I was reading a post about how one can use a browser with all the certificates it ships with disabled. While this can work, it is definitely not for the lazy or for anyone unfamiliar with the issues. The author's comment that it was trivial to get an EV certificate with nothing but email verification got me worried, since this should not be possible under the EV guidelines (PDF).

I tried to comment, but comments had already been closed. I tried to find the author's email address, but could not (in the couple of minutes of searching I had available), so I filed a bug with Mozilla to track this down. Luckily it turns out the comment was misleading, and the EV guidelines were followed. Thanks to Kathleen Wilson for tracking down the post author and clarifying things!

SSL in Python 2.7

It has been almost two years since I wrote about the state of SSL in Python 2.6. If you haven't read that yet, I suggest you start there and then continue here, since I will mostly be talking about things that have changed, or that I have discovered, since then.

The good news is that things have improved in the stdlib ssl module. The bad news is that it is still missing some critical pieces to make SSL secure.

Python 2.7 lets you specify the ciphers to use explicitly, rather than just relying on the defaults that come with the SSL version selection. Additionally, if you compile the ssl module with OpenSSL 1.0 or later, using ssl.PROTOCOL_SSLv23 is safe (as in, it will not pick the insecure SSLv2 protocol) as long as you don't enable SSLv2-specific ciphers (see the ssl module documentation for details).
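For example, a minimal client-side sketch (the cipher string and the ca.pem path are placeholders):

import socket
import ssl

sock = socket.create_connection(('www.google.com', 443))
tls = ssl.wrap_socket(
    sock,
    ssl_version=ssl.PROTOCOL_SSLv23,   # safe with OpenSSL 1.0+, see above
    ciphers='HIGH:!aNULL:!eNULL',      # new in 2.7: explicit cipher list
    cert_reqs=ssl.CERT_REQUIRED,
    ca_certs='ca.pem')
# Note: this verifies the certificate chain but not the hostname;
# see the Clients section below.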

Servers

With that out of the way, there isn't really much difference in how you would write a simple SSL server with Python 2.7 compared to what I wrote in 2008. If you know your ssl module was compiled with OpenSSL 1.0 you can pick ssl.PROTOCOL_SSLv23 for maximum compatibility. Otherwise you should stick with an explicit version other than v2.

The documentation for the ssl module in 2.7 has improved a lot, and includes good sample code for servers here.
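For reference, a bare-bones echo server in the same spirit as the docs' example (cert.pem and key.pem are assumed to exist):

import socket
import ssl

listener = socket.socket()
listener.bind(('', 8443))
listener.listen(5)
while True:
    conn, addr = listener.accept()
    # Wrap each accepted connection; PROTOCOL_SSLv23 assumes an ssl
    # module built against OpenSSL 1.0+.
    tls = ssl.wrap_socket(conn, server_side=True,
                          certfile='cert.pem', keyfile='key.pem',
                          ssl_version=ssl.PROTOCOL_SSLv23)
    try:
        tls.write(tls.read())   # echo one message back
    finally:
        tls.close()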

The M2Crypto code hasn't changed. The next M2Crypto release will add support for OpenSSL 1.0.

Clients

Checking the peer certificate's hostname is still the weak point of the ssl module. The SSL version selection situation has improved slightly, as explained above. Otherwise, follow the example I wrote in 2008.
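In practice that means comparing the fields of getpeercert() yourself. A deliberately naive sketch (it ignores wildcard certificates, among other subtleties, so don't treat it as complete):

def hostname_matches(cert, hostname):
    # Prefer subjectAltName DNS entries; fall back to the subject CN.
    for typ, value in cert.get('subjectAltName', ()):
        if typ == 'DNS' and value == hostname:
            return True
    for rdn in cert.get('subject', ()):
        for key, value in rdn:
            if key == 'commonName' and value == hostname:
                return True
    return False

This only makes sense together with cert_reqs=ssl.CERT_REQUIRED, since otherwise getpeercert() returns an empty dict.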

Again, the M2Crypto API hasn’t changed.

Lately I have been working with pycurl at Egnyte, so I decided to give a client example using that module.

import pycurl

c = pycurl.Curl()
c.setopt(pycurl.URL, 'https://www.google.com')
c.setopt(pycurl.HTTPGET, 1)
# Verify the server certificate chain against the CA bundle...
c.setopt(pycurl.SSL_VERIFYPEER, 1)
c.setopt(pycurl.CAINFO, 'ca.pem')
# ...and require that the certificate matches the hostname (2 = strict).
c.setopt(pycurl.SSL_VERIFYHOST, 2)

try:
    c.perform()
finally:
    c.close()

I am not a big fan of pycurl, due to the difficulty of getting it compiled and its non-Pythonic API. But it is based on the very powerful curl library, so it comes full-featured out of the box.

Other Resources

Besides the Python crypto libraries capable of doing SSL that I mentioned in my SSL in Python 2.6 article, I have since found pycurl. Another find on the Python crypto front is cryptlib.

Mike Ivanov wrote a great series about crypto in Python: part 2, part 3 (link to part 1 seems to have rotted). Mike also produced a comparison of different Python crypto libraries (PDF).

The future is also looking bright for the ssl module. The upcoming Python 3.2 ssl module will include support for SSLContext objects, which let you set options for multiple SSL connections at once, selectively disable SSL versions, and check the OpenSSL version as well.
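Going by the 3.2 documentation, usage should look roughly like this:

import ssl  # Python 3.2+

ctx = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
ctx.options |= ssl.OP_NO_SSLv2     # selectively disable a protocol version
ctx.set_ciphers('HIGH:!aNULL')     # one cipher policy for every connection
print(ssl.OPENSSL_VERSION)         # inspect the OpenSSL build

# Every socket wrapped from the context shares the same settings:
# tls = ctx.wrap_socket(sock)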

Space-Time Tradeoff

It is a pretty common strategy in programming circles to make programs run faster by making them use more space (for example, by caching some results from long computations so they don’t need to be done too often). But there is also another kind of space-time tradeoff: something not very elegant is taking a little too much space, but cleaning it up would take too much time and there are more important things to do. It occurred to me that I do this space-time tradeoff in my life outside of programming as well.

I have gathered piles and piles of once-important papers, magazines, books, and so forth that I don't think I will ever read. Likewise, I own a pile of clothes I will probably never wear again. I am sure I have lots of other things I basically never use, but I still have them stored somewhere, taking up unnecessary space. But if I wanted to get rid of all this stuff, I would need to spend quite a bit of time going through each potentially unneeded thing, deciding whether I was ever going to use it and, if not, what I should do with it. So I reason that it is better not to worry about the unnecessary junk, as long as I have a place to put down the few important-for-now-at-least things…

Or it could just be that I am trying to rationalize being a pack-rat.