Much ado about open data

Disclaimer: The content of this post is purely the opinion of the author, and not of any current or previous employer. I have not spoken with the Maryland Transit Administration or The Transit App before writing this.

The tech and civic journosphere lit up on Wednesday evening with the revelation that the Maryland Transit Administration’s bus tracking project had been ‘hacked’ to ‘save Baltimore $600,000 in one day.’ Hyperbole ensued. If you aren’t familiar with the story, it can be summed up in the following quotes from The Transit App and the MTA’s rebuttal:

Baltimore’s data wasn’t made available in a developer-friendly format…This means that Baltimore’s real-time tracking data isn’t compatible with the most popular commuter apps… When reporters asked the MTA why they opted to only show the info on their mobile webpage … the MTA responded that it would have been too expensive.

Why are we still working on it? Well, our data is not in GTFS-RT format. (… General Transit Feed Specification-Real Time – is [a] data format used by developers to make transit apps.) …Our CAD/AVL system pre-dates Google. That’s why GTFS-RT was never a requirement… The cost to convert our CAD/AVL data prior to the development of the interface was going to cost MD taxpayers an additional $600,000.

I’d like to applaud both sides for being civil. Many similar discussions have devolved into toxicity. What I’ve found lacking in this discussion is nuance, along with actual lessons for the industry as a whole. That’s why I’m throwing my hat in the ring.

The news coverage on this has not explained, for those who do not know, what an API is. Application Programming Interfaces are standard methods for applications to talk to each other. They are what power Gmail in your web browser, and they provide data to nearly every Internet-connected app that runs on your smartphone.
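For readers who have never seen one, here is a minimal sketch of what an app does with an API response. The payload, its field names, and the coordinates are invented for illustration and do not reflect the MTA's or any vendor's actual format; a real app would fetch the text over HTTP rather than hard-code it.

```python
import json

# Hypothetical response body, invented for this example. A real app
# would receive text like this from a server over HTTP.
SAMPLE_RESPONSE = """
{"route": "8", "vehicles": [
  {"id": "1234", "lat": 39.2904, "lon": -76.6122, "heading": 90},
  {"id": "5678", "lat": 39.3000, "lon": -76.6100, "heading": 180}
]}
"""


def parse_vehicles(body):
    """Return (vehicle id, latitude, longitude) tuples from a JSON body."""
    payload = json.loads(body)
    return [(v["id"], v["lat"], v["lon"]) for v in payload["vehicles"]]


for vehicle_id, lat, lon in parse_vehicles(SAMPLE_RESPONSE):
    print(f"Bus {vehicle_id} is at {lat}, {lon}")
```

The point is only that the data arrives in a predictable structure, so any developer can write a few lines to read it.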

Let’s return to the MTA’s point. GTFS-RT is relatively new as data standards go (July 2012), and while it is now common, adoption has been slow. Transit APIs, however, are not new. Major CAD/AVL providers have provided APIs for years, and industry experience has shown that whatever the format, developers are ready and willing to consume them. In the US, Clever Devices released Chicago’s Bus Tracker API in 2009, as did NextBus for many of its clients. Since that time, having APIs as part of AVL infrastructure has become best practice. While one can fault the MTA for not making it a requirement, I will give them a pass. Technology is evolving faster than many in the public sector can keep up with, and their in-house experience is with a system that is “indeed, almost pre-Internet.”
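To give a sense of what "converting" a vendor feed involves, here is a rough sketch that maps a made-up vendor record onto the field layout of a GTFS-RT vehicle position. The input keys (`bus_id`, `route`, `lat`, `lon`, `reported_at`) are assumptions invented for this example, not the MTA's data. Real GTFS-RT feeds are serialized as Protocol Buffers, so the plain dictionary below only mirrors the structure.

```python
import time


def to_gtfs_rt_feed(vendor_vehicles, now=None):
    """Map hypothetical vendor records onto a GTFS-RT-shaped dictionary.

    The input keys are invented for illustration. The output keys follow
    the GTFS-RT message layout (header, entity, vehicle, position), but
    this is a plain dict, not an encoded protobuf feed.
    """
    timestamp = int(now if now is not None else time.time())
    entities = []
    for v in vendor_vehicles:
        entities.append({
            "id": v["bus_id"],
            "vehicle": {
                "trip": {"route_id": v["route"]},
                "position": {"latitude": v["lat"], "longitude": v["lon"]},
                "timestamp": v["reported_at"],
                "vehicle": {"id": v["bus_id"]},
            },
        })
    return {
        "header": {"gtfs_realtime_version": "2.0", "timestamp": timestamp},
        "entity": entities,
    }


# Example with one invented record.
sample = [{"bus_id": "1234", "route": "8", "lat": 39.29, "lon": -76.61,
           "reported_at": 1424900000}]
feed = to_gtfs_rt_feed(sample, now=1424900060)
print(feed["entity"][0]["vehicle"]["position"])
```

A production conversion also has to handle trip matching, stale data, and serialization, which is where the real effort goes.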

AVL and passenger information are now commodities. Some vendors use that fact to steer their clients toward a poor choice, one in which not everything is included. This leads me to the question: while perfectly legal, is it ethical for a company with years of experience in a sector to sell a product that does not fully meet industry best practices?

If a vendor tries to sell a product and won’t be forthcoming about APIs, this means either:

  1. You’re speaking to a salesperson,
  2. They’re trying to sell you something outdated, or
  3. They’re lying to you.

What Chris Whong discovered was that the MTA’s Bus Tracker does, in fact, have an API: it is what he reverse engineered. This is what has not been said. The real question is why a vendor with decades of experience in the field, one that openly touts its transparency on its website, would not offer its clients a product that meets best practices.