CaltrainPy 0.5.1 Is Out

Caltrain will update their schedules and fares on Monday, March 2, 2009. They were kind enough to provide the new information online a bit early, so I went ahead and updated my Python Caltrain schedule application and library. Simply easy_install caltrain to get it.

If you run the application, it will now quit when you press ‘q’ or Esc. Naturally, the new timetable is included in this release. Caltrain changed their HTML schedule format a bit, which broke the scraper in the previous release; the new release is more robust because it collects all the text inside a table cell regardless of the HTML markup around it:

''.join(th.findAll(text=True))
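The same idea can be sketched with nothing but the standard library. This is not the CaltrainPy code: it assumes Python 3 and uses html.parser instead of BeautifulSoup, and the names CellText and cell_text are made up for illustration.

```python
from html.parser import HTMLParser


class CellText(HTMLParser):
    """Collect every text node, ignoring whatever tags wrap it."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for each run of text between tags, however deeply nested.
        self.parts.append(data)


def cell_text(fragment):
    """Return all text in an HTML fragment, joined in document order."""
    parser = CellText()
    parser.feed(fragment)
    parser.close()
    return ''.join(parser.parts)


# A plain header cell and one wrapped in extra markup (made-up samples).
plain = cell_text('<th>San Francisco</th>')
nested = cell_text('<th><b>San</b> <font size="2">Francisco</font></th>')
```

Both calls give the same string, so the scraper no longer cares whether Caltrain wraps a station name in extra tags.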

Interestingly enough, I did both the 0.5 and 0.5.1 releases today. Right after I released 0.5, I ran caltrain again in another terminal that was using a different virtualenv with the latest BeautifulSoup. It turns out BS 3.1 switched to HTMLParser, which cannot handle all the bad markup that SGMLParser did. The Caltrain page has this gem in it, which killed HTMLParser:

</font color>

A line of preprocessing took care of that.
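Roughly, that kind of preprocessing could look like the sketch below. The exact line in CaltrainPy may well differ; the pattern and the clean_end_tags name are my assumptions here, and the idea is simply to strip attribute-like junk from closing tags before the parser sees the page.

```python
import re

# Assumed pattern: a closing tag followed by stray text, e.g. </font color>.
# Well-formed closing tags such as </font> do not match and are left alone.
_BAD_END_TAG = re.compile(r'</\s*([a-zA-Z][a-zA-Z0-9]*)\s+[^>]*>')


def clean_end_tags(html):
    """Rewrite malformed closing tags like </font color> to </font>."""
    return _BAD_END_TAG.sub(r'</\1>', html)


# Made-up sample in the spirit of the Caltrain page.
cleaned = clean_end_tags('<font color="red">7:05</font color>')
```

Run it on the raw page text first, then hand the cleaned string to the parser.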

Incidentally, before Monday I will need to update Caltroid, which is a port of CaltrainPy for Android, and the online version, which is mainly targeted at Windows Mobile. It is going to be a busy weekend.

2 Comments

  1. Tom Brown:

    https://www.heikkitoivonen.net/blog/2008/03/14/caltrain-schedule-feeds-available/

    calendar.txt:

    SN01272009,0,0,0,0,0,1,1,20090302,20190302
    ST01272009,0,0,0,0,0,1,0,20090302,20190302
    WD01272009,1,1,1,1,1,0,0,20090302,20190302
    SN,0,0,0,0,0,1,1,20090101,20090301
    ST,0,0,0,0,0,1,0,20090101,20090301
    WD,1,1,1,1,1,0,0,20090101,20090301

    Yummy 😉

  2. Heikki Toivonen:

    Yeah, at some point I need to switch my schedule applications to the real feeds instead of scraping the website. One of these days…