Office Resource Finder

Have you ever been looking for a colleague, a printer, or just about any other resource in an office and been frustrated that there was no map showing you where to go? I have good news for you. I just released Solu, which is a “Self-service Office resource Locator and Updater”. Or “cubicle finder”. Or whatever you want to call it.

This project started as a perfect example of “scratch your own itch”. I was doing some research on how to do REST with Python, and remembered jj recommended Werkzeug at some point, so I took a look. Around the same time I was chatting with our new admin, and noticed the office blueprints on the IT cubicle wall. This led the two of us to discuss how nice it would be if there was a map to help us new people find our way around. I ended up saying I could program such a system in 20 hours.

So I ended up doing a bit more than research, and studied Werkzeug and jQuery. And although it wasn’t strictly necessary, I also spent quite a bit of time researching how to make an easy-to-deploy package of the application, and how to make it easy to try out. In the end I spent almost exactly 20 hours on the first version, but I realized I could have done it in about half the time had I used Pylons, mostly because I already knew Pylons and also because the things I struggled with in Werkzeug were already provided by Pylons, or easily integrated into Pylons. After the initial version I’ve spent another 20 hours fixing bugs, writing tests, and in general getting it to a stage where I felt good about releasing it. Even so, it is missing some pieces I know how to do in Pylons but still don’t know for sure how to handle in Werkzeug.

Since I developed this application partially on SpikeSource time, it was nice of them to let me Open Source it to everyone’s benefit.

Some of the things I really enjoyed about Werkzeug include:

  • Interactive Python debugger in the browser
  • Small enough to make the ramp up period quick
  • Good documentation
  • Enough features to provide almost everything I needed
  • Easy to swap template system from Jinja to Mako (since I didn’t want to spend the time to learn Jinja too)
  • Writing tests was easy
  • Easy to work with Unicode
  • Great tutorial which fit my needs almost perfectly

Some of the things I found frustrating about Werkzeug:

  • No builtin mechanism to load application options from a file
  • No help to package the application
  • No cross-site request forgery (CSRF) protection out of the box
  • No internationalization/localization (i18n/l10n) support out of the box
  • No samples or recommendations on how to pass application and options to views and templates
  • No samples or recommendations on how to implement sessions

In the end I used ConfigObj to load the options from an ini file. This was a little trickier than I thought at first, because I wanted the manage script to work so that I could run with Werkzeug’s server while testing, but I also wanted to keep WSGI deployment possible. Packaging took some extra reading about setuptools. CSRF is still a little open, although I got pointers to zine.utils.forms and also found werkzeug.contrib.sessions, which should make it possible to do it the way I was thinking about. I don’t see CSRF as a huge issue at this point, though, given the data this application handles. I also got some pointers on how to implement a get_app() function to get the application object from anywhere, so getting the options stored in the app became easy. I still have some open questions about localization, but those might go away once I actually try to do that.
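
To make the options and get_app() idea a bit more concrete, here is a minimal sketch. It is not Solu's actual code: it swaps in the standard library's configparser for ConfigObj so that it runs without extra packages, and the names Application, make_app, the [app] section and the title option are all made up for illustration.

```python
import configparser

_app = None  # module-level reference, set once by make_app()


class Application:
    """Hypothetical WSGI application that carries its options with it."""

    def __init__(self, options):
        self.options = options

    def __call__(self, environ, start_response):
        body = ("Map title: %s" % self.options["title"]).encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]


def make_app(ini_text):
    """Parse ini-style text and build the application object."""
    global _app
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    _app = Application(dict(parser["app"]))
    return _app


def get_app():
    """Return the application object from anywhere (views, templates, ...)."""
    if _app is None:
        raise RuntimeError("make_app() has not been called yet")
    return _app
```

Both a manage script (for the development server) and a WSGI entry point could call make_app() with the contents of the same ini file, and views would then reach the options through get_app().options.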

There are some obvious improvements to Solu, like fixing the CSRF issues, i18n/l10n, multiple and resizable maps, dealing with errors in a more user-friendly way, and overall making it pretty. I could see someone wanting to pull the information from a company LDAP database or some such, and hooking this up with a general employee database.

Since I spent so much time making it easy to see what the application is about by providing a website with screenshots, a demo site, easy installation and instructions on how to run your own application, I am interested in hearing how I did.

Released M2Crypto 0.19

I just pushed out M2Crypto 0.19. This ends the longest hiatus in releases (almost a year since 0.18.2) since I took over the project; apologies for the delays. I highlighted the best parts about 0.19 in an earlier post, so I won’t repeat them here. I need to make one clarification regarding Python 2.6 support: the optional timeout parameter added to many network modules is not yet supported in M2Crypto 0.19. I just noticed this too late for this release.

In preparation for 0.19 I did the first ever code coverage analysis of M2Crypto. I installed the latest coverage and nose, and ran the M2Crypto unit tests. At first I got 72%. I then added some tests on trunk, and got the number to 75%. Then I added some docstrings and was surprised to note the figure jumped to 78%. Now I just need to write some more docstrings to break the magical 80% code coverage limit ;)

While nose and coverage were surprisingly easy to set up and run, finding out the specific lines of code that were not covered was not very user friendly. For that I installed figleaf. The workflow then became:

nosetests --with-coverage --cover-package=M2Crypto

which wrote the file .figleaf in the current directory. Then I ran:

figleaf2html -d build/fig .figleaf

which produced HTML files in the build/fig directory. The HTML files showed the source code, formatted such that it was easy to see what was covered and what not. Basically non-covered lines were colored red.

Update: It seems I messed up the figleaf instructions. The above nosetests line will not produce .figleaf. I know of two ways to produce that. The first one is to add two more options to the nosetests command, which then becomes:

nosetests --with-coverage --cover-package=M2Crypto --with-figleafsections --figleaf-file=.figleaf

Unfortunately trying to process this with figleaf2html leads to:

Traceback (most recent call last):
  File "/usr/bin/figleaf2html", line 8, in <module>
    load_entry_point('figleaf==0.6.1', 'console_scripts', 'figleaf2html')()
  File "/usr/lib/python2.5/site-packages/figleaf-0.6.1-py2.5.egg/figleaf/annotate_html.py", line 256, in main
    coverage = figleaf.combine_coverage(coverage, d)
  File "/usr/lib/python2.5/site-packages/figleaf-0.6.1-py2.5.egg/figleaf/__init__.py", line 89, in combine_coverage
    keys.update(set(d2.keys()))
AttributeError: CodeTracer instance has no attribute 'keys'

The second way, which actually works, is to use figleaf directly:

figleaf --ignore-pylibs setup.py test -q

in the M2Crypto source tree. Then figleaf2html will work. The downside is that setup.py and test files are included in coverage.

Protocol Testing with Doctests

A couple of years ago I was asked to write some tests (and maybe a test framework, I can’t remember) for Cosmo, the Chandler Server. I think the only tools we had used for testing until then were litmus and manual testing, maybe with some unit tests. The protocols to be tested included ticket, CMP, and others.

The thought of expanding litmus, which was a tool written in C, did not sound too promising. A command-line tool to test the protocols was also what we were mainly after at that point, so GUI tools were not needed. And it was going to be easier to automate command-line tools with Tinderbox.

Around that time I had been really sold on doctests, so I figured that by writing some helper functions the protocol tests for Cosmo could be done easily as Python doctests. Developing the “framework” was fast, I was testing within minutes, and the time spent on the helper functions probably still amounts to less than a day, total. I based the helper functions on httplib, but urllib2 or httplib2 would probably make the experience even easier. I named the tool “silmut”, which is an anagram of litmus, and also a word in the Finnish language, meaning “buds”.

In the end we collected each protocol’s tests into its own file. At the top we had some code to do common initialization, and after that followed the actual tests. Here’s an example of a test:

View account
 
    >>> r = request('GET', '%s/cmp/user/%s' % (path, user1), headers=authHeaders)
    >>> r.status # GET account ok
    200

I was actually surprised to note that these tests are still in the Cosmo sources, and apparently being maintained. Just goes to show you don’t always need to be too fancy to get the job done.
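
For readers who want to try the approach, here is a minimal, self-contained sketch of the idea. It is not the silmut code: the request helper below is a guess at what such a helper might look like, it uses http.client (the Python 3 name for httplib), and the tiny DemoHandler server, the user name and the URL layout exist only so the doctests have something to talk to.

```python
import doctest
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Stand-in server: 200 for /cmp/user/<name>, 404 for everything else."""

    def do_GET(self):
        status = 200 if self.path.startswith("/cmp/user/") else 404
        self.send_response(status)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the test output quiet


# Bind to port 0 so the OS picks a free port on the loopback interface.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address


def request(method, url, body=None, headers=None):
    """Send one request and return the response (hypothetical helper)."""
    conn = http.client.HTTPConnection(host, port)
    conn.request(method, url, body=body, headers=headers or {})
    response = conn.getresponse()
    response.read()
    conn.close()
    return response


# The protocol tests themselves are plain text with embedded doctests.
PROTOCOL_TESTS = """
View account

    >>> r = request('GET', '/cmp/user/%s' % user1)
    >>> r.status # GET account ok
    200

Unknown resource

    >>> r = request('GET', '/no/such/path')
    >>> r.status
    404
"""


def run_tests():
    """Run the doctests with the helper and test data as globals."""
    globs = {"request": request, "user1": "alice"}
    test = doctest.DocTestParser().get_doctest(
        PROTOCOL_TESTS, globs, "protocol", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    return runner.summarize(verbose=False)


results = run_tests()
server.shutdown()
```

In a real setup the tests would live in one text file per protocol and point at an actual server instead of the stand-in.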

Common SSL Misconceptions

There seems to be a single fundamental misunderstanding about Secure Sockets Layer (SSL), or Transport Layer Security (TLS) as the newer standard is called, and that is that given insecure DNS, it is possible to perform a man-in-the-middle (MITM) attack on any SSL connection. Obviously if this were true, SSL would not be used, so that should immediately make you suspicious.

There are many checks that an SSL implementation must do. Typically SSL libraries will do these checks without application developers needing to worry about them too much. These checks prevent many classes of attacks, for example by foiling tools that automatically create certificates by duplicating all human-readable fields for the connections they try to spoof, because these generated certificates are not issued by known certificate authorities. But what often seems to be missing by default, and is also often omitted by the literature describing SSL deployment, is the post-connection check done right after the SSL handshake. Typically this check makes sure that you connected to the host you wanted to connect to, and is done by checking the hostname field of the certificate returned by the peer. (There are other kinds of checks that could be done too, like verifying that the peer’s certificate fingerprint is in an expected set of fingerprints.) Without this check it would be possible to perform a MITM attack by requesting a valid certificate from a certificate authority for any domain, and using that to spoof a connection to any other domain.
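
As a rough illustration of what the post-connection check does, here is a simplified sketch. It is not how any particular SSL library implements the check and should not replace a library's own verification. The function name hostname_matches is made up, the certificate is assumed to be a dictionary in the shape that the standard library's SSLSocket.getpeercert() returns, and the wildcard handling is deliberately naive.

```python
def hostname_matches(cert, expected_host):
    """Return True if expected_host is one of the names in the certificate."""
    expected = expected_host.lower()

    # Prefer subjectAltName DNS entries; fall back to commonName.
    names = [value for kind, value in cert.get("subjectAltName", ())
             if kind == "DNS"]
    if not names:
        for rdn in cert.get("subject", ()):
            for key, value in rdn:
                if key == "commonName":
                    names.append(value)

    for name in names:
        name = name.lower()
        if name == expected:
            return True
        # Simplified wildcard rule: "*.example.org" covers exactly one
        # extra label, so "a.example.org" but not "a.b.example.org".
        if name.startswith("*."):
            head, _, rest = expected.partition(".")
            if head and rest == name[2:]:
                return True
    return False


# A made-up certificate for demonstration purposes.
cert = {
    "subject": ((("commonName", "www.example.com"),),),
    "subjectAltName": (("DNS", "www.example.com"), ("DNS", "*.example.org")),
}
ok = hostname_matches(cert, "www.example.com")
spoofed = hostname_matches(cert, "bank.example.net")
```

The certificate here is perfectly valid for www.example.com, yet the check rejects it for bank.example.net, which is exactly the situation the paragraph above describes.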

There is also some confusion about unsecured HTML pages that submit data to a secure URL, or secured pages that submit data to a non-secure URL. In both of these cases it is possible to attack the unsecured connection, rendering SSL useless. Unfortunately many banks and ecommerce sites still use these techniques. There are a couple of workarounds for end users: one is to just try to submit the login page without any data, which in some cases leads to an SSL-protected version of the page. In some cases you can also try changing the http protocol to https manually in the URL bar. And if you have ever encountered a secured login page, it would be a good idea to bookmark the page for future use (rather than try to remember to type the URL or, worse, follow a link from some page).

It should be noted that given massive DNS vulnerabilities like what Dan Kaminsky found (let’s hope there aren’t more lurking around), it is possible to attack even SSL protected sites, because the attackers can get domain validated certificates for any domain. Only sites serving pure EV content would be safe from tampering, but even then this requires users to use products that support EV and be aware that the site is supposed to be EV protected (and stop if the site comes up without EV indicators).

Update: Dan Kaminsky pointed out in a comment that even an EV site can be attacked by script in another window (assuming the attacker redirected the other window into a domain validated site with the same URL as the EV site), so one additional condition needs to be tacked on to ensure security: the browser should have just one window open, on the pure EV site. It could be relaxed a little by saying all browser windows and tabs must be on pure EV sites.

How to Replace Python’s socket.ssl with M2Crypto’s SSL Implementation

It seems like I started a mini-series about “hidden” M2Crypto tools and modules…

Python’s socket.ssl is not secure. If you need any real security you need to look for 3rd party packages (things will improve a little with Python 2.6).

Sometimes you are faced with a library that does SSL, but uses Python’s socket.ssl in a way that you can’t easily replace. For this purpose I wrote a little helper module using M2Crypto. Basically you just need to import this socklib.py before you import the module that is using Python’s socket.ssl, and call socklib.setSSLContextFactory() with a context factory that creates secure SSL contexts, and your SSL usage just became secure.

The socklib.py implementation is for client side only. It would be easy to expand it for servers, though. It may also lack some features, but it filled the need I had so that is where I stopped. I wrote it for Python 2.5 and haven’t thought what would need to be changed for 2.6.

Root Certificates for Python Programs using Python

OpenSSL itself does not come with root certificates, which means that if you use OpenSSL for anything that requires those certificates (like SSL for example) you will need to get those certificates from somewhere else. This concerns most Pythonistas needing SSL since most Python programs use OpenSSL for SSL.

Most if not all Linux distributions include various sets of root certificates in OpenSSL-friendly formats. Windows also comes with root certificates, but to get access to them you would need to use the Windows-specific APIs.

The Curl project produced a crazy little script that can convert the certdata.txt from the NSS project (from Mozilla) into PEM format, suitable for OpenSSL. The Curl project also provided a converted certdata.txt file for download. Unfortunately the converted file was from a very old version of the certdata.txt file (when I first looked at it). I figured M2Crypto should have its own utility to do this conversion, so I ended up porting the script into Python. The dirty little certdata2pem.py script uses M2Crypto for certificate handling.
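
The core of such a conversion can be sketched in a few lines. This is not the certdata2pem.py script and it does not use M2Crypto: it assumes that certificates appear in certdata.txt as blocks of backslash-octal escapes between a "CKA_VALUE MULTILINE_OCTAL" line and an "END" line, it relies on the standard library's ssl.DER_cert_to_PEM_cert for the PEM wrapping, and the sample bytes are placeholders rather than a real certificate.

```python
import re
import ssl


def octal_block_to_der(lines):
    """Decode lines of backslash-octal escapes into raw bytes."""
    data = bytearray()
    for line in lines:
        for match in re.finditer(r"\\([0-3][0-7]{2})", line):
            data.append(int(match.group(1), 8))
    return bytes(data)


def extract_der_blocks(text):
    """Yield the bytes of every CKA_VALUE MULTILINE_OCTAL ... END block."""
    block = None
    for line in text.splitlines():
        if line.startswith("CKA_VALUE MULTILINE_OCTAL"):
            block = []
        elif block is not None and line.strip() == "END":
            yield octal_block_to_der(block)
            block = None
        elif block is not None:
            block.append(line)


# Placeholder input in the assumed format; the bytes are NOT a real certificate.
SAMPLE = r"""
CKA_LABEL UTF8 "Placeholder CA"
CKA_VALUE MULTILINE_OCTAL
\060\202\001\012
\002\001\000
END
"""

pem_certs = [ssl.DER_cert_to_PEM_cert(der) for der in extract_der_blocks(SAMPLE)]
```

A real converter would also want to look at the trust settings recorded for each certificate, and would parse each certificate to confirm it is well formed before writing it out.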

I used my script to get root certificates for Chandler.

Python Syntax Influencing New Languages

To this day C++ is the language I have programmed in the longest (although my Python experience is catching up fast), and at some point I even thought it would be the only programming language I would ever need and use. I actively stayed away from Python, mainly because I had heard about the forced indentation and had had bad experiences with Fortran before. But within two weeks after being forced to use Python I was sold. The Python syntax is definitely one of the attractions. So even though Python itself hasn’t (yet) taken over the world of programming languages, I am happy to see Python influencing new languages.

A while back jj pointed out Reia, which is a language for the Erlang virtual machine. The syntax looks a lot like Python’s, which almost makes me want to play with it. (The concepts of Erlang make it really attractive with the multicore architectures and all, but just reading the Wikipedia article on Erlang made my head hurt because of the syntax.)

Today I was reading about Delight, which is a Python-like syntax for the D programming language. (This is kind of ironic, because D is the nicer C++.) I can’t say I am sold on all of the ideas of Delight, but I do welcome any attempts to make other programming languages more Pythonic in syntax if nothing else.

Not that long ago languages marketed themselves by having C-like syntax to make it easier to switch. I am wondering if Python is becoming the new C in that respect.

fcgi.py Exposes Python Tracebacks by Default

I was testing a Python web application that was using FastCGI deployment on Dreamhost, when I found myself looking at a souped up Python Traceback in my browser. At first I couldn’t understand why that was happening. As far as I knew I was running with full production settings and as such I would have expected a terse internal server error message.

Looking at the HTML source of the error page I discovered a reference to cgitb. But as far as my code was concerned I did not set that. I tried specifically disabling it in my script but that made no difference. In a momentary act of desperation I did a find for all cgitb.py files under my account and made the cgitb.enable() function do nothing. Yet I was still seeing the tracebacks.

After a bit of scratching my head and throwing different words at Google it occurred to me to take a look at the fcgi.py script. Oops. The [WSGI]Server class has an error() method whose docstring states that it “May and should be overridden”. No $%^, the default just plasters all the dirty little secrets for the world to see! I’d like to see something like Debug[WSGI]Server that pretty-prints the error, leaving [WSGI]Server as the production class. The naming would make it clear that you should not be using the debug version in production. As it is now, I wonder how many people actually read all the way towards the bottom of the 1331-line file to discover this gem.
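
The kind of override I have in mind can be sketched as follows. To keep the example runnable on its own it does not import fcgi.py: BaseServer and FakeRequest are stand-ins written for this illustration, and the assumption that error() receives a request object with a writable stdout should be checked against the fcgi.py you actually deploy.

```python
import io
import sys
import traceback


class BaseServer:
    """Stand-in for the server class; NOT the real fcgi.py code."""

    def error(self, req):
        # Mimics the problematic default: send the traceback to the client.
        req.stdout.write("Content-Type: text/plain\r\n\r\n"
                         + traceback.format_exc())


class ProductionServer(BaseServer):
    """Override error() so that clients only ever see a terse message."""

    def error(self, req):
        # Keep the details on the server side only.
        traceback.print_exc(file=sys.stderr)
        req.stdout.write("Status: 500 Internal Server Error\r\n"
                         "Content-Type: text/plain\r\n\r\n"
                         "Internal Server Error\n")


class FakeRequest:
    """Stand-in request object with a writable stdout."""

    def __init__(self):
        self.stdout = io.StringIO()


def handle(server, req):
    """Simulate a handler that fails and let the server report the error."""
    try:
        raise ValueError("secret database password")
    except ValueError:
        server.error(req)
    return req.stdout.getvalue()


leaky_output = handle(BaseServer(), FakeRequest())
terse_output = handle(ProductionServer(), FakeRequest())
```

With the real module, the same idea would be to subclass its server class, override error() in this manner, and run the subclass in production.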

I also added a warning to the Dreamhost documentation regarding Python FastCGI.