Examples of creating GTFS-rt from existing systems

There are several common methods of creating GTFS-rt from an existing system that does not support it out of the box. They typically depend on the system exposing good data, ideally with trip assignments included. This post outlines those methods and points to example implementations.

Webservice API

Most off-the-shelf AVL systems provide some form of access to their data through a web-accessible API. Modern websites use APIs to avoid requesting an entire page over and over again: they request only the data that changes, and the rest of the page does not need to be reloaded.

GTFS-rt can often be created by gathering up data from these APIs, even when the vendor doesn’t want it.

Caveats: Getting data for all routes, trips, and vehicles may require many separate calls. The least modern sites require screen scraping.
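As a rough illustration of the API approach, the sketch below merges several per-route responses into one structure shaped like a GTFS-rt FeedMessage. The endpoint URL and the JSON field names (vehicles, id, lat, lon, ts, trip) are hypothetical stand-ins for whatever a given vendor exposes. The result is a plain dictionary that uses GTFS-rt field names; a real producer would serialize it to protobuf.

```python
import json
import time
import urllib.request

# Hypothetical endpoint; substitute the vendor's real per-route URL.
ROUTE_URL = "https://avl.example.com/api/routes/{route_id}/vehicles"


def fetch_route(route_id):
    """Fetch one route's raw JSON text (one call per route, per the caveat)."""
    with urllib.request.urlopen(ROUTE_URL.format(route_id=route_id), timeout=10) as resp:
        return resp.read().decode("utf-8")


def vehicles_to_feed(route_payloads, now=None):
    """Merge per-route JSON payloads into one FeedMessage-shaped dict."""
    now = int(time.time()) if now is None else int(now)
    entities = {}
    for payload in route_payloads:
        for v in json.loads(payload)["vehicles"]:
            vehicle = {
                "vehicle": {"id": str(v["id"])},
                "position": {"latitude": float(v["lat"]), "longitude": float(v["lon"])},
                "timestamp": int(v["ts"]),
            }
            # Trip assignment is optional: include it only when the API provides one.
            if v.get("trip"):
                vehicle["trip"] = {"trip_id": str(v["trip"])}
            # Keyed by vehicle id so a vehicle listed under two routes appears once.
            entities[str(v["id"])] = {"id": str(v["id"]), "vehicle": vehicle}
    return {
        "header": {
            "gtfs_realtime_version": "2.0",
            "incrementality": "FULL_DATASET",
            "timestamp": now,
        },
        "entity": [entities[key] for key in sorted(entities)],
    }
```

In use, something like vehicles_to_feed(fetch_route(r) for r in route_ids) would run on a timer, with route_ids taken from the agency's static GTFS.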

Examples:

Database access

AVL products typically include a database for post-hoc analysis. As long as that database is updated as information comes in, GTFS-realtime can be produced with nothing more than a query.

Caveats: Database administrators hate the thought of regular queries happening at short intervals. Don’t worry, that’s what database replication is for. In short, replication allows you to copy a database, in whole or in part, to another server. It is typically used either to back up a server to a remote location or to spread the load on a database across multiple servers. The latter is a perfect argument against an overzealous DBA.
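To make the "nothing more than a query" idea concrete, here is a minimal sketch using an in-memory SQLite database as a stand-in for the AVL database (or, better, a read replica of it). The table name vehicle_positions and its columns are assumptions for illustration; a real AVL schema will differ. The query picks the newest row per vehicle and drops reports older than a cutoff.

```python
import sqlite3
import time

# Hypothetical schema: vehicle_positions(vehicle_id, trip_id, lat, lon, reported_at)
LATEST_POSITIONS_SQL = """
SELECT p.vehicle_id, p.trip_id, p.lat, p.lon, p.reported_at
FROM vehicle_positions AS p
JOIN (
    SELECT vehicle_id, MAX(reported_at) AS latest
    FROM vehicle_positions
    GROUP BY vehicle_id
) AS m ON m.vehicle_id = p.vehicle_id AND m.latest = p.reported_at
WHERE p.reported_at >= ?
ORDER BY p.vehicle_id
"""


def feed_from_database(conn, now=None, max_age_seconds=300):
    """Build a FeedMessage-shaped dict from the newest row for each vehicle."""
    now = int(time.time()) if now is None else int(now)
    rows = conn.execute(LATEST_POSITIONS_SQL, (now - max_age_seconds,)).fetchall()
    entities = []
    for vehicle_id, trip_id, lat, lon, reported_at in rows:
        vehicle = {
            "vehicle": {"id": str(vehicle_id)},
            "position": {"latitude": lat, "longitude": lon},
            "timestamp": reported_at,
        }
        if trip_id is not None:  # trip assignment may be missing
            vehicle["trip"] = {"trip_id": trip_id}
        entities.append({"id": str(vehicle_id), "vehicle": vehicle})
    return {
        "header": {
            "gtfs_realtime_version": "2.0",
            "incrementality": "FULL_DATASET",
            "timestamp": now,
        },
        "entity": entities,
    }


def demo_connection():
    """Create a throwaway database with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vehicle_positions "
        "(vehicle_id TEXT, trip_id TEXT, lat REAL, lon REAL, reported_at INTEGER)"
    )
    conn.executemany(
        "INSERT INTO vehicle_positions VALUES (?, ?, ?, ?, ?)",
        [
            ("bus-1", "T1", 40.10, -74.20, 1000),
            ("bus-1", "T1", 40.11, -74.21, 1030),  # newer report for bus-1
            ("bus-2", None, 40.00, -74.00, 1020),  # no trip assignment
            ("bus-3", "T9", 41.00, -75.00, 100),   # stale, filtered out
        ],
    )
    return conn
```

Calling feed_from_database(demo_connection(), now=1060) yields entities for bus-1 and bus-2 only. Pointing the same function at a replica keeps the polling load off the production server.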

Examples:

Queue

Systems that are built to be distributed across many different components or servers typically use message queues as a way to pass information between those components without re-inventing the wheel. A common use for a queue in a real-time system is to distribute information in… real time. Any application that needs the data can subscribe to the queue and receive it without placing additional strain on the data producer.

There are many queue implementations, each with its own benefits and pitfalls. Some common ones include:

  • ZeroMQ and RabbitMQ (the former a library designed to be built directly into code, the latter a standalone message broker),
  • Apache Kafka (loved by the big data set for its ability to not just send data, but transform it as well),
  • Amazon SNS and SQS (each a queue as a service),
  • CORBA (please don’t use this if you have a choice), and
  • MQTT (designed for embedded devices).

A GTFS-rt producer connects to a queue and builds a GTFS-rt message from the data it receives. Every N seconds, the producer produces a GTFS-rt feed and sends it off for distribution.
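The consume-then-publish loop described above might look roughly like the following. Python's in-process queue.Queue stands in for a real message queue client, and the message fields (vehicle_id, trip_id, lat, lon, ts) are hypothetical. The producer keeps only the newest message per vehicle and emits a full feed once per interval.

```python
import json
import queue
import time


class QueueFeedProducer:
    """Keeps the latest message per vehicle and emits a feed every N seconds."""

    def __init__(self, source, interval_seconds=15):
        self.source = source  # anything with get_nowait(), e.g. queue.Queue
        self.interval_seconds = interval_seconds
        self.latest = {}      # vehicle_id -> most recent message

    def drain(self):
        """Consume everything currently waiting on the queue without blocking."""
        while True:
            try:
                raw = self.source.get_nowait()
            except queue.Empty:
                return
            msg = json.loads(raw)
            current = self.latest.get(msg["vehicle_id"])
            # Ignore messages that arrive out of order.
            if current is None or msg["ts"] >= current["ts"]:
                self.latest[msg["vehicle_id"]] = msg

    def build_feed(self, now=None):
        """Turn the current per-vehicle state into a FeedMessage-shaped dict."""
        now = int(time.time()) if now is None else int(now)
        entities = []
        for vehicle_id in sorted(self.latest):
            msg = self.latest[vehicle_id]
            vehicle = {
                "vehicle": {"id": vehicle_id},
                "position": {"latitude": msg["lat"], "longitude": msg["lon"]},
                "timestamp": msg["ts"],
            }
            if msg.get("trip_id"):
                vehicle["trip"] = {"trip_id": msg["trip_id"]}
            entities.append({"id": vehicle_id, "vehicle": vehicle})
        return {
            "header": {
                "gtfs_realtime_version": "2.0",
                "incrementality": "FULL_DATASET",
                "timestamp": now,
            },
            "entity": entities,
        }

    def run(self, publish, cycles):
        """Every interval: drain the queue, build a feed, hand it to publish()."""
        for _ in range(cycles):
            self.drain()
            publish(self.build_feed())
            time.sleep(self.interval_seconds)
```

Here publish is whatever callable hands the feed off for distribution, for example one that serializes it and writes it where a web server can pick it up. With a real broker, drain() would be replaced by that client's own subscribe or poll call.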

Caveats: The system needs to have been built from the ground up to include a queue. Surprisingly, a number of the big AVL players use them, and they’re not always hidden. This is how some of the first web-based RTPI products got off the ground: a piece of custom software shipped data from an AVL queue to the system that handled distribution.

Examples:

onebusaway-nyc-gtfsrt
