Pulling Android Market Sales Data Programmatically

Android Market handles sales through Google Checkout. I haven’t tried selling anything else online before, but what this setup provides for me as the seller leaves a lot to be desired. One issue you will have trouble with is getting the data needed to file taxes.

Google provides a Google Checkout Notification History API that lets you programmatically query sales data. For my purposes the API requests are really simple: just post a small XML document with the date range I am interested in, get back XML documents that contain my data. If there is more data that fits in a single response, look for an element that specifies the token for the next page and keep pulling until you get all data.

Below is a really simple Python script that uses M2Crypto to handle the SSL parts for the connection (needed since Python doesn’t do secure SSL out of the box). You will also need to grab certificates. You should save the script as gnotif.py, save the certificates as cacert.pem and create gnotif.ini as described in the script below all in the same directory. When you execute it, it will ask for start and end date (in YYYY-MM-DD format) and then fetch all the data, saving them in response-N.xml files, where N is a number.

#!/usr/bin/env python
# Script to query Google Checkout Notification History
# http://code.google.com/apis/checkout/developer/Google_Checkout_XML_API_Notification_History_API.html
 
# Supporting file gnotif.ini:
#[gnotif]
# merchant_id = YOUR_MERCHANT_ID_HERE
# merchant_key = YOUR_MERCHANT_KEY_HERE
 
import base64
import re
from ConfigParser import ConfigParser
 
from M2Crypto import SSL, httpslib
 
ENVIRONMENT = "https://checkout.google.com/api/checkout/v2/reports/Merchant/"
XML = """\
<notification-history-request xmlns="http://checkout.google.com/schema/2">
%(query)s
</notification-history-request>
"""
 
config = ConfigParser()
config.read('gnotif.ini')
MERCHANT_ID = config.get('gnotif', 'merchant_id')
MERCHANT_KEY = config.get('gnotif', 'merchant_key')
 
rawstr = r"""<next-page-token>(.*)</next-page-token>"""
compile_obj = re.compile(rawstr, re.MULTILINE)
 
auth = base64.encodestring('%s:%s' % (MERCHANT_ID, MERCHANT_KEY))[:-1]
 
ctx = SSL.Context('sslv3')
# If you comment out the next 2 lines, the connection won't be secure
ctx.set_verify(SSL.verify_peer | SSL.verify_fail_if_no_peer_cert, depth=9)
if ctx.load_verify_locations('cacert.pem') != 1: raise Exception('No CA certs')
 
start = raw_input('Start date: ')
end = raw_input('End date: ')
 
data = XML % {'query': """<start-time>%(start)s</start-time>
<end-time>%(end)s</end-time>""" % {'start': start, 'end': end}}
 
i = 0
 
while True:
    c = httpslib.HTTPSConnection(host='checkout.google.com', port=443, ssl_context=ctx)
    c.request('POST', ENVIRONMENT + MERCHANT_ID, data,
             {'content-type': 'application/xml; charset=UTF-8',
              'accept': 'application/xml; charset=UTF-8',
              'authorization': 'Basic ' + auth})
 
    r = c.getresponse()
 
    f=open('response-%d.xml' % i, 'w')
    result = r.read()
    f.write(result)
    f.close()
 
    print i, r.status
 
    c.close()
 
    match_obj = compile_obj.search(result)
    if match_obj:
        i += 1
        data = XML % {'query': """<next-page-token>%s</next-page-token>""" % match_obj.group(1)}
    else:
        break

As you take a look at the data you will probably notice that you are only getting the sale price information, but no information about the fees that Google is deducting. Officially it is a flat 30%, but I have found out a number of my sales have the fee as 5%. So we need to get this information somehow. Luckily you can toggle a checkbox in your Google Checkout Merchant Settings. Unfortunately there is a bug, and the transaction fee shows as $0 for Android Market sales. I have reported this to Google, and they acknowledged it, but there is no ETA on when this will be fixed.

I also haven’t found any way to programmatically query when and how much did Google Checkout actually pay me. (I can get this info from my bank, but it would be nice to query for that with the Checkout API as well.)

Last but certainly not least, working with the monster XML files returned from Google Checkout API is a real pain. If someone has a script to turn those into a format that could be imported into a spreadsheet or database that would be nice…

Similar Posts:

7 Comments

  1. Polyclef:

    Hey…I’m the author of CheckoutCheck, an app in the Android Market that fetches Google Checkout information for particular parameters (e.g. date ranges, payment status, etc.), so I’m reasonably familiar with the Checkout API. Some people aren’t crazy about my app…comments say it’s slow, doesn’t work for large date ranges, doesn’t work for particular people. But I basically just wrote it for myself to check completed transactions for individual days. I also wanted to be able to have a nice summary of how many of each particular app I sold on a particular day, something that the Google Checkout website and exported information didn’t provide. That was a pain, because I had to parse and cross-reference two different XML files from Google Checkout to get the exact report I wanted.

    Anyway, if you or anyone else is interested in the code, I’m willing to share it. I may just go ahead and put it up on Google Code and see if anyone else is willing to continue to develop and improve it.

  2. Peter:

    I just download the csv files that google checkout provides and import them into excel. then i add subtotals at each change in date and graph it. it’s not perfect but at least i can see total sales amounts per day.

  3. Heikki Toivonen:

    @Polyclef: I would be very interested in seeing your code, so please put it up somewhere. I am not so interested in an Android app, but something that I can run on my desktop and get the data I need. I am sure there are more people in my situation, so I am pretty sure we could collaborate on something.

    @Peter: Those csv downloads are fine, except they only go back two months. If you forgot to do that at some point during the year, you must use the notification history API to obtain the data. If Google provided the csv downloads for 18 months like they do for the notification history API, I would be ok with it. The current situation sucks.

  4. Polyclef:

    I’ve put the code up on Google Code:

    https://code.google.com/p/checkoutcheck/

    It’s pretty clunky and may not be very readable, but like I said, it works for me. If you have any questions about it, shoot me an email at polyclefsoftware@gmail.com. Hope someone finds it useful.

  5. Alocaly:

    Hi all,

    This technique is fine for getting data from google Checkout…
    Can we also apply it to get data from the market page ?
    So we could have some graph of the evolution of DL, and active installs ?

    Hum… Looking at the code, I feel there is no open API like the checkout one available for the market…
    Am I correct ??

    Emmanuel / Alocaly

  6. Heikki Toivonen:

    @Polyclef: Thanks, I took a brief look earlier but I will need to spend some more time on the code.

    @Alocaly: AFAIK there is no open API for the market.

  7. Sean:

    I second the idea of a desktop app that will download and mash up the sales data from Google Checkout. I use AppViz for sales on the iPhone app store and would love to have a tool like that for our Android sales. Right now it’s a complete pain to manually download the data and import it into excel…