Archive for the ‘Python’ Category.

Pulling Android Market Sales Data Programmatically

Android Market handles sales through Google Checkout. I haven’t tried selling anything online any other way, but what this setup provides for me as the seller leaves a lot to be desired. One issue is getting the data needed to file taxes.

Google provides a Google Checkout Notification History API that lets you programmatically query sales data. For my purposes the API requests are really simple: just post a small XML document with the date range I am interested in, get back XML documents that contain my data. If there is more data that fits in a single response, look for an element that specifies the token for the next page and keep pulling until you get all data.

Below is a really simple Python script that uses M2Crypto to handle the SSL part of the connection (needed since Python doesn’t do secure SSL out of the box). You will also need to grab CA certificates. Save the script, the certificates (as cacert.pem) and the gnotif.ini described in the script below all in the same directory. When you execute the script, it will ask for start and end dates (in YYYY-MM-DD format) and then fetch all the data, saving it in response-N.xml files, where N is a number.

#!/usr/bin/env python
# Script to query Google Checkout Notification History
# Supporting file gnotif.ini:
# [gnotif]
# merchant_id = YOUR_MERCHANT_ID_HERE
# merchant_key = YOUR_MERCHANT_KEY_HERE
import base64
import re
from ConfigParser import ConfigParser
from M2Crypto import SSL, httpslib

XML = """\
<notification-history-request xmlns="http://checkout.google.com/schema/2">
%(query)s
</notification-history-request>"""

HOST = 'checkout.google.com'
ENVIRONMENT = '/api/checkout/v2/reports/Merchant/'

config = ConfigParser()
config.read('gnotif.ini')
MERCHANT_ID = config.get('gnotif', 'merchant_id')
MERCHANT_KEY = config.get('gnotif', 'merchant_key')

rawstr = r"""<next-page-token>(.*)</next-page-token>"""
compile_obj = re.compile(rawstr, re.MULTILINE)

auth = base64.encodestring('%s:%s' % (MERCHANT_ID, MERCHANT_KEY))[:-1]

ctx = SSL.Context('sslv3')
# If you comment out the next 2 lines, the connection won't be secure
ctx.set_verify(SSL.verify_peer | SSL.verify_fail_if_no_peer_cert, depth=9)
if ctx.load_verify_locations('cacert.pem') != 1: raise Exception('No CA certs')

start = raw_input('Start date: ')
end = raw_input('End date: ')
data = XML % {'query': """<start-time>%(start)s</start-time>
<end-time>%(end)s</end-time>""" % {'start': start, 'end': end}}

i = 0
while True:
    c = httpslib.HTTPSConnection(host=HOST, port=443, ssl_context=ctx)
    c.request('POST', ENVIRONMENT + MERCHANT_ID, data,
              {'content-type': 'application/xml; charset=UTF-8',
               'accept': 'application/xml; charset=UTF-8',
               'authorization': 'Basic ' + auth})
    r = c.getresponse()
    result = r.read()
    f = open('response-%d.xml' % i, 'w')
    f.write(result)
    f.close()
    print i, r.status
    match_obj = compile_obj.search(result)
    if match_obj:
        i += 1
        data = XML % {'query': """<next-page-token>%s</next-page-token>"""
                      % match_obj.group(1)}
    else:
        break

As you take a look at the data you will probably notice that you only get the sale price information, but nothing about the fees Google deducts. Officially it is a flat 30%, but I have found that a number of my sales have the fee at 5%. So we need to get this information somehow. Luckily there is a checkbox you can toggle in your Google Checkout Merchant Settings. Unfortunately there is a bug, and the transaction fee shows as $0 for Android Market sales. I have reported this to Google, and they acknowledged it, but there is no ETA on when it will be fixed.
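To put the fee difference in concrete terms, here is a tiny back-of-the-envelope helper. The function is just my illustration; the 30% and 5% rates are the two I have observed, not an official fee schedule:

```python
def net_payout(price, fee_rate):
    """Sale price minus Google Checkout's transaction fee."""
    return price * (1 - fee_rate)

price = 1.99
for rate in (0.30, 0.05):  # officially a flat 30%, but 5% on some of my sales
    print('fee %2d%%: payout %.4f' % (rate * 100, net_payout(price, rate)))
```

On a $1.99 sale the difference is about fifty cents per transaction, so it adds up in the tax paperwork.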

I also haven’t found any way to programmatically query when and how much Google Checkout actually paid me. (I can get this information from my bank, but it would be nice to query for it with the Checkout API as well.)

Last but certainly not least, working with the monster XML files returned from the Google Checkout API is a real pain. If someone has a script to turn those into a format that could be imported into a spreadsheet or database, that would be nice…
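In the meantime, here is the kind of thing I have in mind, sketched with the stdlib’s ElementTree and csv modules. The element names in the sample document below are made up for illustration; real Checkout responses use different, more deeply nested elements, so the extraction would need to be adapted:

```python
import csv
from io import StringIO
from xml.etree import ElementTree

# Made-up sample data; real Checkout responses look different.
SAMPLE = """\
<orders>
  <order><number>1</number><date>2010-01-15</date><amount>1.99</amount></order>
  <order><number>2</number><date>2010-01-16</date><amount>1.99</amount></order>
</orders>"""

def to_csv(xml_text, out):
    """Flatten one <order> element per CSV row."""
    root = ElementTree.fromstring(xml_text)
    writer = csv.writer(out)
    writer.writerow(['number', 'date', 'amount'])
    for order in root.iter('order'):
        writer.writerow([order.findtext('number'),
                         order.findtext('date'),
                         order.findtext('amount')])

out = StringIO()
to_csv(SAMPLE, out)
print(out.getvalue())
```

The resulting CSV imports directly into any spreadsheet or database loader.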

Buildbot Slave on Windows XP

Today I installed Buildbot on Windows XP and even got it to run successfully as build slave. It was both harder and easier than I expected. Or more specifically, installing all the dependencies was harder than I thought, but configuring and running the slave was easier than I thought.

I have run Buildbot server and slaves on Linux based systems without any problems. Once you have setuptools installed, it is just a matter of easy_install buildbot and creating the configuration files. I thought once I had Python and setuptools installed on Windows it would be equally simple. Not so.

First of all, I tried to use a somewhat nonstandard Python 2.5, for which there was no registry entry. This meant that none of the dependencies that had their own executable installers worked. When I tried the zipped versions, these also failed, because apparently my Python was compiled with a different version of Visual C++ than the extensions I was trying to install. I got past this by grabbing a standard Python installer and going from there.

Next I tried to install setuptools from the exe installer. That failed because it could not find msvcrt71.dll. I downloaded that and put one copy in the Python\DLLs directory and one in the directory from which I was trying to run the installers. Then I was able to get setuptools installed.

After I modified PATH to include the Python and Python\Scripts directories I tried to use easy_install, but it failed to install any package that contained native code. I also added .py to PATHEXT as described in the Buildbot README.w32 file. I continued by grabbing the exe installers of the dependencies and running them one by one. This list might be longer than strictly necessary, but I collected it from various pages describing how to run Buildbot on Windows: pywin32, zope.interface, Twisted, pycrypto and pyopenssl. I used the latest stable release of each.

The next step was Buildbot itself. I downloaded the sources, unpacked them, and ran python setup.py install.

For a change, easy_install worked for the next two packages I wanted to use for the actual tests: easy_install nose, easy_install coverage.

The buildbot create-slave command worked fine, but when I first tried to start my slave it would just sit there, doing nothing and giving no errors. It turned out I had given the wrong machine name as the buildmaster; I would have expected an error when trying to start. Once I fixed that, the Buildbot slave was running correctly. (Note that I have not yet tried to run it as a service.)

Finally it was just a matter of fixing broken tests and installing any dependencies the code I was testing had (like .NET 3.5). It feels good to have some assurance that the changes I make on Linux won’t hose the Windows world.

Secure Password Scheme for Turbogears2 Application with repoze.who and bcrypt

I am working on a little Turbogears2 application, and wanted to use the repoze.who and repoze.what packages, which integrate nicely with Turbogears2 and have gathered a fair amount of positive feedback. Usage seems simple, but I was quite disappointed in how password hashes are done by default, especially after having read Thomas Ptacek’s educational rant about secure password schemes.

It turns out that the authentication code generated for you when you tell paster quickstart you want authentication uses the weak kind of scheme Thomas warns about. Luckily the correct way he advises is easy to put in by using the py-bcrypt module. If you want pointers, see repoze.who Issue 85, or the py-bcrypt homepage.

Unfortunately py-bcrypt is kind of annoying to put in dependencies. The PyPI entry is named bcrypt, but the download page links to py-bcrypt, so the setuptools machinery cannot find the right package. This can be worked around relatively easily by adding the py-bcrypt download link to setup.cfg:

[easy_install]
find_links =
# ...

and then in setup.py you do something like this:

install_requires = [
    "TurboGears2 >= 2.0.3",
    "Catwalk >= 2.0.2",
    "Babel >= 0.9.4",
    # can be removed if use_toscawidgets = False
    "toscawidgets >= 0.9.7.1",
    "zope.sqlalchemy >= 0.4",
    "repoze.tm2 >= 1.0a4",
    "repoze.what-quickstart >= 1.0",
]

try:
    import bcrypt
except ImportError:
    install_requires.append("py-bcrypt >= 0.1")

setup(
    # ...
    install_requires = install_requires,
    # ...
)

Now when you deploy your TG2 app with easy_install or similar tools, it will download the py-bcrypt package the first time, and won’t bother you if it already exists.

Testing CherryPy 3 Application with Twill and Nose

I’ve been working on a CherryPy application for a few days, and wanted to write some tests. Surprisingly, I could not find any tutorials or documentation on how I should test a CherryPy application. Unfortunately I also missed the last section on the CherryPy Testing page; why is CherryPy application testing added as an afterthought? Wouldn’t it make more sense to start the testing section with how people can write tests for their CherryPy applications, rather than first explaining how to test CherryPy itself? Oh well, at least I learned something new…

Since I got tests working with Twill first I decided to document my experience, and switch to the CherryPy way later if it makes more sense. The CherryPy Essentials book apparently has a section on testing, so reading that would probably clarify a lot of things.

There is a brief tutorial on how to test CherryPy 2 application with twill, but the instructions need some tweaking to work with CherryPy 3.

On Ubuntu 8.04 I first created a virtualenv 1.3.1 without site packages. I am running Python 2.5.2, and I have the following packages installed in the virtualenv: setuptools 0.6c9, CherryPy 3.1.2, twill 0.9 and nose 0.11.1. The additional packages were installed with easy_install.

My directory structure is as follows: hello.py at the top level, with a tests/ directory next to it. The contents of hello.py are simply:

import cherrypy

class HelloWorld:
    def index(self):
        return "Hello world!"
    index.exposed = True

if __name__ == '__main__':
    cherrypy.quickstart(HelloWorld())

Running python hello.py will start the web server, and I can see the greeting in my browser at the URL http://localhost:8080.

The tests directory has two files. __init__.py is empty. The test module follows closely the tutorial by Titus, but modified to work with CherryPy 3. The CherryPy 3 upgrade instructions and the CherryPy mod_wsgi instructions showed the way.

from StringIO import StringIO

import twill
import cherrypy

from hello import HelloWorld

class TestHelloWorld:
    def setUp(self):
        # configure cherrypy to be quiet ;)
        cherrypy.config.update({ "environment": "embedded" })

        # get WSGI app.
        wsgiApp = cherrypy.tree.mount(HelloWorld())

        # initialize the engine
        cherrypy.engine.start()

        # install the app at localhost:8080 for wsgi_intercept
        twill.add_wsgi_intercept('localhost', 8080, lambda: wsgiApp)

        # while we're at it, snarf twill's output.
        self.outp = StringIO()
        twill.set_output(self.outp)

    def tearDown(self):
        # remove intercept.
        twill.remove_wsgi_intercept('localhost', 8080)

        # shut down the cherrypy server.
        cherrypy.engine.exit()

    def test_hello(self):
        script = "find 'Hello world!'"
        twill.execute_string(script, initial_url='http://localhost:8080/')

Now you’d expect that this would work by simply running the nosetests command. Mysteriously, I got an import error on twill (and after I removed that import, also an import error on cherrypy). A look at sys.path showed that I was somehow picking up the older nose I had installed into the system Python, even though which nosetests claimed it was finding the virtualenv nosetests. Still, I had to give the explicit path to my virtualenv nosetests before the tests would run without import errors.

All in all, testing CherryPy applications turned into a longer adventure than I anticipated. I ran into a number of unexpected difficulties, but I finally got it working and learned about twill as a bonus. Thanks for the tip, JJ!

Turbogears2 on Dreamhost

It has been almost two years since I tested Turbogears 1 on Dreamhost. Back then it was quite difficult for me to get it running. But some additional personal experience and improvements in Turbogears2 have made it a breeze. I tested with Turbogears 2.0 although I upgraded to 2.1a2 at some point.

First you need to get virtualenv installed, which is pretty simple after you have downloaded and unpacked the source tarball: python2.5 virtualenv.py $HOME. (I wanted it installed in $HOME, but you could use alternative locations as well.) This will install setuptools, but somehow not virtualenv itself, so then you just do easy_install virtualenv. You will also need PasteDeploy, so do: easy_install PasteDeploy.

Next steps might be different for installing a Turbogears2 egg/application, but I used these instructions to install the wiki-20 tutorial in development mode. (To install a properly packaged app you probably just need to do: easy_install app_tarball; paster make-config yourapp production.ini and follow the instructions from FastCGI onwards.)

After that you just follow tg2 automatic installation instructions.

Then use paster quickstart to create a new project template, cd to the created directory, and run python setup.py develop to download any missing dependencies and set things up for debugging and development.

Edit the files as instructed in the tutorial. Then run paster setup-app development.ini.

After that it is time to create the production ini: paster make-config Wiki20 production.ini.

Next step is getting this running with FastCGI. Create wiki20.fcgi in the webroot directory:

from fcgi import WSGIServer # you could also use flup etc.
from paste.deploy import loadapp

real_app = loadapp('config:/home/your-username/path/to/production.ini')

def myapp(environ, start_response):
    environ['SCRIPT_NAME'] = '' # get rid of the .fcgi in urls
    return real_app(environ, start_response)

server = WSGIServer(myapp)
server.run()

There are a couple of points of note here:

  • I am using fcgi.py, which seems to be slightly more reliable than flup on Dreamhost. You’d better edit it so that errors won’t show private information to everyone.
  • There is a trick to get rid of the .fcgi part from the URLs.

Next we’ll need a .htaccess file:

# Enable Dreamhost stats
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} ^/(stats|failed_auth\.html).*$ [NC]
RewriteRule . - [L]
</IfModule>

# FastCGI
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^wiki20\.fcgi/ - [L]
RewriteRule ^(.*)$ wiki20.fcgi/$1 [L]
</IfModule>

Now when you go to your site, the first visit is going to take a while while your app loads, but after that things will be snappy as long as the app stays in memory.

Using M2Crypto with boto – Secure Access to Amazon Web Services

Many companies run services in the Amazon cloud infrastructure, so it makes an attractive target for criminals as well. You need to make sure that you really are talking to the right Amazon servers when you use the cloud services.

boto seems to have emerged as the winner in the scramble to develop Python libraries to deal with Amazon Web Services (AWS). By default, boto will use the stdlib httplib.HTTPSConnection. This is a problem, because the stdlib does not provide secure SSL out of the box. However, boto designers have made it easy to plug in alternative SSL implementations that conform to the httplib.HTTPSConnection interface. M2Crypto provides this in httpslib.HTTPSConnection.

The first step is to get CA certificates that we can use to verify that the Amazon servers we will be talking to have valid certificates issued by trusted certificate authorities.

Amazon offers various services that boto provides access to, so the exact details vary a little bit (namely what connection class to instantiate). I’ll use the SimpleDB as an example, because the first 25 machine hours per month are free so it makes a great test system (you still need to sign up for AWS and provide credit card information).

#!/usr/bin/env python
import sys
from M2Crypto import httpslib, SSL
from boto.sdb.connection import SDBConnection
def https_connection_factory(host, port=None, strict=0, **ssl):
    """HTTPS connection factory that creates secure connections
    using M2Crypto."""
    ctx = SSL.Context('tlsv1')
    ctx.set_verify(SSL.verify_peer | SSL.verify_fail_if_no_peer_cert, depth=9)
    if ctx.load_verify_locations('cacert.pem') != 1:
        raise Exception('No CA certs')
    return httpslib.HTTPSConnection(host, port=port, strict=strict,
                                    ssl_context=ctx)
def create_connection(aws_access_key_id, aws_secret_access_key):
    """Create SimpleDB connection."""
    conn = SDBConnection(aws_access_key_id=aws_access_key_id,
                         aws_secret_access_key=aws_secret_access_key,
                         https_connection_factory=(https_connection_factory, ()))
    return conn
if __name__ == '__main__':
    # Sample usage
    if len(sys.argv) != 3:
        sys.exit('Usage: %s aws_access_key_id aws_secret_access_key' % sys.argv[0])
    conn = create_connection(*sys.argv[1:])
    domain = conn.create_domain('mytest')
    item, key, value = 'item1', 'key1', 'value1'
    domain.put_attributes(item, {key: value})
    assert value == domain.get_attributes(item)[key]
    print 'Usage:', conn.get_usage()

The sample application takes your AWS access key and secret access key as parameters, and it assumes a cacert.pem file containing the CA certificates is in the same directory. Typically a run of the application uses less than 0.006 hours of Amazon computing facilities, so you could run it over 4500 times a month without charge.

Update: I mixed up units, which Mocky pointed out; fixed above.

M2Crypto 0.20.2 for Ancient OpenSSL

M2Crypto has been claiming support for OpenSSL 0.9.7, but it turned out I wasn’t testing with quite that old an OpenSSL version. Recently M2Crypto got support for RSA PSS, but it turns out this was added in OpenSSL 0.9.7h, and you could not build or run M2Crypto against older OpenSSL versions. Arguably you should not use those old OpenSSL versions, but apparently there are people who can’t help it. And since M2Crypto claims support all the way back to 0.9.7, it made sense to make it so.

The M2Crypto trunk and 0.20.2 now omit the RSA PSS functions if you have too old an OpenSSL. Additionally, to prevent this kind of error from happening in the future, I added a “minreq” (for Minimum Requirements) Tinderbox client that builds and tests M2Crypto trunk using Python 2.3, OpenSSL 0.9.7 and SWIG 1.3.28 (the current minimum requirements) on Ubuntu 8.04.

Is Python the Apple of Programming Languages?

Is Python the Apple of programming languages? I don’t know whom to credit for the expression, but I first heard it from JJ, and it certainly rings true to me. Python code looks somewhat polished compared to many other languages, mostly due to the whitespace rules and the lack of curly brackets. Python is trendy, generally just works, is pretty consistent, opinionated (there is generally one obvious way to do it) and is governed by a dictator. When you think about those adjectives in the computer hardware sector, you can’t help but think of Apple. Many of those points are not unique to Python, nor can Python claim the top spot in trendiness, but the combination seems to define Python uniquely, in my mind at least.

There is at least one obvious difference when I am comparing the Python programming language to Apple. If your needs or opinions differ from Apple’s, you are out of luck. But since Python is open source you can take it and make it fit your needs better with enough determination. Python is also easy to extend.

The reason I have been thinking about the aesthetics of Python is that when I am faced with a task that requires me to use another programming language, I immediately become disgusted by the thought, because the other language does not feel as elegant as Python. I’ve also been meaning to learn Ruby, and even started to read an online book about it, but my interest waned after I realized Ruby code did not look as nice as Python (to me at least). Some of the resistance against other languages is no doubt the natural resistance to change, which I try to combat consciously by doing something out of my immediate comfort zone every once in a while. Maybe an interesting project where Ruby clearly was the best option would help…

JJ and I have also discussed how you can look at a piece of Python code on someone’s monitor from across the room, and make a pretty good guess about the quality of the code in general just by seeing how the structure looks from far away. I guess that is true to some extent with other programming languages, but I have never experienced that as strongly with anything but Python.

Don’t get me wrong, though. I do know many other languages, and I am actively coding in other languages as well, but it is a constant battle to force myself to do so. I do get occasional surprises of joy with other languages, like when I discover I can do something even easier than with Python. Of course releasing something is always a high, regardless of the language used.

I have heard that people who have used Apple products have a hard time migrating, because they tend to experience the lack of polish in other products more severely compared with people who have not used Apple products. I wonder if Apple products are more popular with Pythonistas than with Rubyists and others.

I guess the question of the day is, if you have tasted Python, will you ever be able to enjoy any other programming language?