Much ado about open data

Disclaimer: The content of this post is purely the opinion of the author, and not of any current or previous employer. I have not spoken with the Maryland Transit Administration or The Transit App before writing this.

The tech and civic journosphere lit up on Wednesday evening with the revelation that the Maryland Transit Administration’s bus tracking project had been ‘hacked’ to ‘save Baltimore $600,000 in one day.’ Hyperbole ensued. If you aren’t familiar with the story, it can be summed up in the following quotes from The Transit App and the MTA’s rebuttal:

Baltimore’s data wasn’t made available in a developer-friendly format…This means that Baltimore’s real-time tracking data isn’t compatible with the most popular commuter apps… When reporters asked the MTA why they opted to only show the info on their mobile webpage … the MTA responded that it would have been too expensive.

Why are we still working on it? Well, our data is not in GTFS-RT format. (… General Transit Feed Specification-Real Time – is [a] data format used by developers to make transit apps.) …Our CAD/AVL system pre-dates Google. That’s why GTFS-RT was never a requirement… The cost to convert our CAD/AVL data prior to the development of the interface was going to cost MD taxpayers an additional $600,000.

I’d like to applaud both sides for being civil. Many similar discussions have devolved into the realm of toxicity. What I’ve found lacking in this discussion are nuance and actual lessons to be learned for the industry as a whole. That’s why I’m throwing my hat in the ring.

The news coverage on this has not explained, for those who do not know, what an API is. Application Programming Interfaces are standard methods for applications to talk to each other. They are what power Gmail in your web browser, and they provide data to nearly every Internet-connected app that runs on your smartphone.
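
To make that concrete, here is a minimal sketch in Python of what "talking to an API" amounts to on the app's side. The response body and its field names (`vehicles`, `route`, `lat`, `lon`, `reported_at`) are made up for illustration and are not any vendor's real schema. A real app would fetch the text over HTTP instead of reading it from a constant.

```python
import json

# Hypothetical response body from a made-up bus-tracking API endpoint.
# The field names are illustrative assumptions, not a real vendor schema.
SAMPLE_RESPONSE = """
{
  "vehicles": [
    {"id": "1201", "route": "8", "lat": 39.2904, "lon": -76.6122, "reported_at": 1424900000},
    {"id": "1317", "route": "3", "lat": 39.3290, "lon": -76.6090, "reported_at": 1424900012}
  ]
}
"""


def vehicles_on_route(response_body, route):
    """Return the vehicles in an API response that are serving `route`."""
    payload = json.loads(response_body)
    return [v for v in payload["vehicles"] if v["route"] == route]


if __name__ == "__main__":
    for vehicle in vehicles_on_route(SAMPLE_RESPONSE, "8"):
        print(vehicle["id"], vehicle["lat"], vehicle["lon"])
```

The point is that once the data arrives in a predictable structure, filtering it or drawing it on a map takes a few lines.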

Returning to the MTA’s point: GTFS-RT is relatively new as data standards go (July 2012) and, while now common, was slow to be adopted. Transit APIs, however, are not new. Major CAD/AVL providers have offered APIs for years, and industry experience has shown that whatever the format, developers are ready and willing to consume them. In the US, Clever Devices released Chicago’s Bus Tracker API in 2009, and NextBus did the same for many of its clients. Since that time, having APIs as part of AVL infrastructure has become best practice. While one can fault the MTA for not making it a requirement, I will let them pass. Technology is evolving faster than many in the public sector can keep up with, and their in-house experience is with a system that is “indeed, almost pre-Internet.”
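
For readers wondering what "converting to GTFS-RT" involves conceptually, the sketch below reshapes vendor-style records into the nesting that GTFS-realtime uses for vehicle positions (a feed header plus a list of entities, each holding a vehicle with a trip, a vehicle descriptor, and a position). The input field names (`bus_id`, `route`, `lat`, `lon`, `reported_at`) are assumptions of mine, not the MTA's actual data. The output is a plain dict rather than the protobuf message a real feed would serve, so treat it as an illustration of the mapping and not as a working feed.

```python
import time
from typing import Optional


def to_gtfs_rt_shape(vendor_vehicles, feed_timestamp: Optional[int] = None) -> dict:
    """Reshape hypothetical vendor records into a GTFS-realtime-style dict.

    The result mirrors the FeedMessage / FeedEntity / VehiclePosition
    layout; a real feed would be serialized as protobuf instead.
    """
    if feed_timestamp is None:
        feed_timestamp = int(time.time())
    entities = []
    for record in vendor_vehicles:
        entities.append({
            "id": record["bus_id"],
            "vehicle": {
                "trip": {"route_id": record["route"]},
                "vehicle": {"id": record["bus_id"]},
                "position": {
                    "latitude": record["lat"],
                    "longitude": record["lon"],
                },
                "timestamp": record["reported_at"],
            },
        })
    return {
        "header": {"gtfs_realtime_version": "2.0", "timestamp": feed_timestamp},
        "entity": entities,
    }


if __name__ == "__main__":
    sample = [{"bus_id": "1201", "route": "8", "lat": 39.2904,
               "lon": -76.6122, "reported_at": 1424900000}]
    print(to_gtfs_rt_shape(sample, feed_timestamp=1424900030))
```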

AVL and passenger information are now commodities. Some vendors use that fact to steer their clients toward a poor choice, one where not everything is included. This leads me to the question: while purely legal, is it ethical for a company with years of experience in a sector to sell a product that does not fully meet industry best practices?

If a vendor tries to sell a product and won’t be forthcoming about APIs, this means either:

  1. You’re speaking to a salesperson,
  2. They’re trying to sell you something outdated, or
  3. They’re lying to you.

What Chris Whong discovered was that, in fact, the MTA’s Bus Tracker does have an API– it’s what he reverse engineered. This is what has not been said. Why a vendor with decades of experience in the field, one that on their website openly touts their transparency, would not openly offer their clients a product that meets best practices is the real question.

AVL: Monitoring

Monitoring systems are invisible, yet crucial. They help find out if there is a problem, and also help pin down the source of a failure so it can be fixed. Ideally, all of the boxes from the introductory post should have some point at which they can be monitored, both on input and output interfaces. In this post, I will briefly introduce the concepts in monitoring and explain why AVL systems, whether turn-key or custom-built, should have a monitoring component, and why that component should evolve after delivery.

The four core concepts of monitoring systems are metrics, checks, alerts, and alarms.
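
As a rough illustration of the first two concepts only, the sketch below treats "seconds since a vehicle last reported its position" as a metric and "is that age under a threshold?" as a check. That reading of the terms, the function name, and the 120-second default are my assumptions for the example, not definitions taken from any product. Alerts and alarms are left out on purpose.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    vehicle_id: str
    age_seconds: float  # the metric: time since the last position report
    ok: bool            # the check: is the metric within the threshold?


def check_position_freshness(last_report_times, now, max_age_seconds=120.0):
    """Evaluate a freshness check for every vehicle.

    `last_report_times` maps a vehicle ID to the time (in seconds) of its
    most recent position report; `now` is the current time in seconds.
    """
    results = []
    for vehicle_id, reported_at in sorted(last_report_times.items()):
        age = now - reported_at
        results.append(CheckResult(vehicle_id, age, age <= max_age_seconds))
    return results


if __name__ == "__main__":
    for result in check_position_freshness({"1201": 1000.0, "1317": 700.0}, now=1060.0):
        print(result)
```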

AVL: Assignment

Assignment functionality takes location data and turns it into information by linking that location to a scheduled piece of work (e.g. a transit trip or a delivery itinerary). It is a crucial, yet opaque component of an AVL system. Each AVL product will have a different take on the problem, with different requirements from input data and operator/supervisor interaction.

Assignment sits on a continuum between Explicit and Implicit (or Inferred). Explicit assignments are made by an operator or system that assigns a piece of work to a vehicle. They are often provided by a vehicle operator entering their assignment into a device on the vehicle, or by a dispatch system with prior knowledge of which vehicle(s) will perform which assignment. Inferred assignments are made by a process that analyzes the vehicle’s recent behavior to guess what piece of work it is serving. Inferring assignments is more complicated in dense networks, where several assignments can be plausible at once.
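
To give a feel for the inferred end of that continuum, here is a deliberately naive sketch. It scores each candidate trip by how far the vehicle's recent positions are from where the schedule says that trip should be at the same time, picks the closest, and flags the result as ambiguous when the runner-up is nearly as good. The data layout (tuples of time, latitude, longitude), the function names, and the 100-metre margin are all assumptions made for this example. Real products use far richer logic.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def mean_error_m(observations, scheduled_points):
    """Average distance from each observation to the scheduled point closest in time."""
    total = 0.0
    for t, lat, lon in observations:
        _, s_lat, s_lon = min(scheduled_points, key=lambda p: abs(p[0] - t))
        total += haversine_m(lat, lon, s_lat, s_lon)
    return total / len(observations)


def infer_assignment(observations, candidate_trips, ambiguity_margin_m=100.0):
    """Return (best_trip_id, is_ambiguous) for a vehicle's recent observations.

    `observations` is a non-empty list of (time, lat, lon) tuples and
    `candidate_trips` maps a trip ID to its scheduled (time, lat, lon) points.
    """
    scores = sorted(
        (mean_error_m(observations, points), trip_id)
        for trip_id, points in candidate_trips.items()
    )
    best_score, best_trip = scores[0]
    is_ambiguous = len(scores) > 1 and scores[1][0] - best_score < ambiguity_margin_m
    return best_trip, is_ambiguous
```

The `is_ambiguous` flag is the interesting part: on a dense corridor two trips can fit almost equally well, and a system that hides that uncertainty will sometimes be confidently wrong.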

If you are procuring a CAD/AVL/RTPI system, questions to ask about its assignment system are:

What is the interaction that operators are expected to have with the system? Do they have to interact with a new device? What does that interface look like? Specifically, how many new keys do they have to press? Will there be labor relations issues as a result?

Does the system rely on assignment data from an external system (e.g. dispatcher entry)?

What validation is done against operator- or dispatcher-entered data? Can an operator enter incorrect data?

What level of intelligence is on the vehicle? Most “computer-on-a-bus” solutions download the entire schedule to the bus and can store data for later transmission when they lose connectivity. Simpler on-board systems do not store any information on the vehicle; they send only the information they have. How and when are data on vehicles synchronized?

How flexible can the system be when deviating from expectations? For example, how does it handle detours? What about extra unscheduled service?

What is the assignment system telling you? Does it ever display expected (rather than actual) information? Does it tell you the difference between the two?
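
On the validation question above, the sketch below shows the kind of sanity checks I would hope to see applied to an operator-entered assignment. The notion of a "block" with a start and end time in minutes after midnight, the 30-minute early login allowance, and every name in the code are hypothetical choices for illustration.

```python
def validate_block_entry(entered_block, scheduled_blocks, now_minutes,
                         already_assigned, early_login_minutes=30):
    """Return a list of problems with an operator-entered block ID.

    `scheduled_blocks` maps a block ID to (start, end) in minutes after
    midnight, and `already_assigned` is the set of block IDs that another
    vehicle has logged into. An empty list means the entry looks valid.
    """
    window = scheduled_blocks.get(entered_block)
    if window is None:
        return ["block is not in today's schedule"]
    problems = []
    start, end = window
    if not (start - early_login_minutes <= now_minutes <= end):
        problems.append("block is not in service at this time")
    if entered_block in already_assigned:
        problems.append("block is already assigned to another vehicle")
    return problems


if __name__ == "__main__":
    schedule = {"801": (360, 840), "802": (420, 900)}
    print(validate_block_entry("801", schedule, now_minutes=345, already_assigned={"802"}))
    print(validate_block_entry("810", schedule, now_minutes=345, already_assigned=set()))
```

A system that accepts any number typed into the on-board device, with none of these checks, will show passengers predictions for a trip the bus is not actually running.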

AVL: Communication

Communication is the most “magical” of an AVL system’s components: it carries information over the ether from vehicle to server, and when it works, it just works. It is also the best understood of these components, so this post will focus solely on things to watch out for.

Cellular:
Most AVL systems are now using cellular communications. Hardware is easy to acquire, costs are somewhat reasonable, and speeds are more than enough for low bandwidth applications.

One thing to watch out for is the “sunsetting” of 2G and 3G networks. In the US, wireless carriers will be shutting down 2G and 3G networks as soon as 2016 (AT&T), and possibly as late as 2020-21 (Verizon). At this point, solutions that provide no upgrade path to 4G should not be considered for the long run.

Private/Packet Radio:
Packet radio is what has traditionally powered AVL communication. Older AVL-over-radio systems suffer from very long update intervals– often 5-10 minutes between position updates. Most new systems have much shorter intervals, but it is crucial that you know what the update interval is, regardless of technology or age. Some unscrupulous vendors are still selling systems of a previous generation.
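
If you can get a log of position reports out of a demo or pilot system, measuring the update interval yourself is straightforward. The sketch below assumes you have, for one vehicle, a list of report times in seconds. The function names and the choice to report the median and the maximum are mine.

```python
from statistics import median


def update_intervals(report_times):
    """Gaps in seconds between consecutive position reports from one vehicle."""
    ordered = sorted(report_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def summarize_intervals(report_times):
    """Return the median and longest gap, or None if there are fewer than two reports."""
    gaps = update_intervals(report_times)
    if not gaps:
        return None
    return {"median_s": median(gaps), "max_s": max(gaps)}


if __name__ == "__main__":
    # Three reports 30 seconds apart, then a five-minute silence.
    print(summarize_intervals([0, 30, 60, 360]))
```

The median tells you what the system does on a good day. The maximum tells you what passengers see when coverage drops, which is often the number that matters.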

Mesh Networking:
Long touted as the next big thing, mesh networking has never taken off commercially. The industry has not given up hope; it is still seeing new research. Be wary if it is proposed; the supplier had better have a good testing and backup plan.

AVL, CAD, and Real-Time Passenger Info for Beginners

Real-time Automated Vehicle Location (AVL) has now become a commodity product. Unfortunately, some companies are using commodity as an excuse to pass off incomplete or unsuitable products to unsuspecting buyers.

This series introduces the concept of AVL and its extensions, Computer Aided Dispatching (CAD) and Real-Time Passenger (or Customer) Information (RTPI). Each of the following sections introduces a key component. The goal of these posts is not to convey mastery, but to give you an idea of the right questions to ask. For a more generic introduction, see TransitWiki.

The components I will cover are:

Location locates a vehicle,
Communication carries locations to a server,
Assignment informs about what that vehicle is doing,
Dispatcher User Interfaces are used to manage service,
Passenger User Interfaces help passengers,
APIs allow developers to connect systems and build new interfaces, and
Monitoring systems keep the system up and running.

The difference between AVL, CAD/AVL, and RTPI products is the combination of these components. I’ll refer to a solution that includes all of them as the “Full Stack”. You’ll find their definitions below the fold.