Transitland · 3 min read

How current is that schedule?

The GTFS Best Practices guide recommends publishing data for the current date plus at least 7 days in the future. We've used the Transitland archives to assess how many feeds meet this goal.

💡
This blog post is based on a presentation by Ian Rees and a panel discussion along with representatives from Google, Transit app, and MobilityData regarding "How to Make Sure Transit Riders Get New Schedules from GTFS Static Datasets on Time?" at the 2024 International Mobility Data Summit.

Transitland monitors schedule data for thousands of public transit operators. This data is collected and archived — currently at a pace of tens of thousands of updates per year:

The number of feed versions in the Transitland archive continues to grow year by year.

We rely on operators to set their own policies for updates to their schedule data, and the time period covered by that schedule data. Operators may publish data that contains scheduled service for the next week, the next month, or even the next year. In some instances, operators publish data that only contains service that begins at a future date. The period a feed covers is a critical, and often hidden, factor in how useful the data is to riders and other data consumers.

The "7-day rule"

Data that changes frequently, or that only contains schedules for the next few days, runs up against an important constraint: data consumers may need many hours or even days to process an update and make it available (for example, to rebuild a routing graph used in trip-planning software). This lag means a service change may already be in effect on the ground before it can be communicated to riders through apps and websites.

Schedules that contain less than a week of future scheduled service can also force consumers to make (often faulty) assumptions about trips on future dates.

Currently, the GTFS Best Practices guide recommends publishing data for the current date plus at least 7 days in the future; we'll call this the "7-day rule."
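
To make the rule concrete, here is a minimal Python sketch of the check. It is an illustration only, not the code used for the analysis in this post: the function names `parse_gtfs_date` and `meets_coverage_rule` are hypothetical, and it assumes a feed's service can be summarized as a single continuous range from a start date to an end date (real feeds may have gaps inside that range).

```python
from datetime import date, datetime, timedelta


def parse_gtfs_date(value: str) -> date:
    """Parse a GTFS date string (YYYYMMDD) into a date object."""
    return datetime.strptime(value, "%Y%m%d").date()


def meets_coverage_rule(service_start: date, service_end: date,
                        as_of: date, days: int = 7) -> bool:
    """Return True if service is already active on `as_of` and
    continues for at least `days` more days.

    Assumption: service is one continuous range with no gaps.
    """
    starts_in_time = service_start <= as_of
    lasts_long_enough = service_end >= as_of + timedelta(days=days)
    return starts_in_time and lasts_long_enough


if __name__ == "__main__":
    start = parse_gtfs_date("20240101")
    end = parse_gtfs_date("20240131")
    fetched = date(2024, 1, 10)
    for window in (7, 30, 90):
        print(window, meets_coverage_rule(start, end, fetched, days=window))
```

The same function covers the 30-day and 90-day windows by changing `days`.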

The percentage of feed versions in Transitland's archive by year (on the x-axis) that have 7 days, 30 days, or 90 days of forward-looking schedule coverage as of when they were fetched by Transitland.

Fortunately, this is a mostly "good news" situation.

Using Transitland's feed archive to assess update frequency

Averaged across all transit operators in Transitland, about 95% of static schedule updates follow the 7-day rule. This number remains about 90% even when expanding the future service window to 30 days.

It's also a good picture when disaggregating and looking at the individual operator level: The vast majority of operators have 100% of schedule updates following the 7-day rule.
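
As a rough illustration of how figures like these can be computed, the sketch below calculates each operator's share of feed versions that follow the rule, then averages those shares across operators. The `FeedVersion` record and both function names are hypothetical, the sample data is made up, and this is not Transitland's actual pipeline. It also assumes each feed version's service is one continuous date range.

```python
from collections import defaultdict
from datetime import date, timedelta
from typing import NamedTuple


class FeedVersion(NamedTuple):
    """Hypothetical summary of one archived feed version."""
    operator: str
    fetched_on: date
    service_start: date
    service_end: date


def share_meeting_rule(versions, days=7):
    """Per-operator fraction of feed versions with `days` of forward coverage."""
    counts = defaultdict(lambda: [0, 0])  # operator -> [passing, total]
    for v in versions:
        ok = (v.service_start <= v.fetched_on
              and v.service_end >= v.fetched_on + timedelta(days=days))
        counts[v.operator][0] += int(ok)
        counts[v.operator][1] += 1
    return {op: passing / total for op, (passing, total) in counts.items()}


def average_across_operators(shares):
    """Unweighted mean of the per-operator shares."""
    return sum(shares.values()) / len(shares)


if __name__ == "__main__":
    # Made-up sample data for two operators.
    sample = [
        FeedVersion("A", date(2024, 1, 10), date(2024, 1, 1), date(2024, 3, 31)),
        FeedVersion("A", date(2024, 3, 20), date(2024, 3, 15), date(2024, 6, 30)),
        FeedVersion("B", date(2024, 1, 10), date(2024, 1, 1), date(2024, 1, 31)),
        FeedVersion("B", date(2024, 1, 25), date(2024, 2, 1), date(2024, 2, 29)),
    ]
    shares = share_meeting_rule(sample, days=7)
    print(shares, average_across_operators(shares))
```

Averaging per-operator shares gives every operator equal weight, whatever its publishing frequency. Pooling all feed versions instead would weight frequent publishers more heavily.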

The most common exceptions are:

  1. operators that have a tendency to publish data that begins in the future, with no scheduled service on the date of publication
  2. operators that fail to update the values in feed_info.txt which explicitly set the range of valid service dates and remove guesswork and ambiguity for the data consumer.
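
A producer could screen for both exceptions before publishing. The sketch below is one possible approach, not an official validator: `check_feed_info` is a hypothetical helper that reads the text of a feed_info.txt file and returns warnings based on the optional `feed_start_date` and `feed_end_date` fields. It only inspects the first data row and does not look at calendar.txt or calendar_dates.txt.

```python
import csv
import io
from datetime import date, datetime


def check_feed_info(feed_info_csv: str, published_on: date) -> list:
    """Return a list of warning strings for the contents of one feed_info.txt."""
    rows = list(csv.DictReader(io.StringIO(feed_info_csv)))
    if not rows:
        return ["feed_info.txt has no data row"]
    row = rows[0]

    # Both fields are optional in GTFS, so either may be absent or blank.
    start = (row.get("feed_start_date") or "").strip()
    end = (row.get("feed_end_date") or "").strip()

    warnings = []
    if not start or not end:
        warnings.append("feed_start_date/feed_end_date not set")
    if start and datetime.strptime(start, "%Y%m%d").date() > published_on:
        warnings.append("service begins after the publication date")
    if end and datetime.strptime(end, "%Y%m%d").date() < published_on:
        warnings.append("feed_end_date is already in the past")
    return warnings


if __name__ == "__main__":
    text = (
        "feed_publisher_name,feed_publisher_url,feed_lang,"
        "feed_start_date,feed_end_date\n"
        "Example Transit,https://example.com,en,20240201,20240630\n"
    )
    print(check_feed_info(text, date(2024, 1, 15)))
```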

Even if these exceptions apply to comparatively few feeds, it's still unfortunate if they occur in feeds that are important to a data consumer's use-case (e.g., to provide up-to-date trip-planning for travelers on an important transit operator).

Even more carefully characterizing updates

This provides an optimistic baseline for the current state of GTFS updates. For an even more holistic picture, we also need to incorporate a few additional metrics:

These questions will be explored in a future blog post.

💡
The take-away from this blog post for transit data producers: Please check that your GTFS feed always includes service for the date on which it is published, and that it includes at least 7 days of forward-looking schedule coverage (if not 30 or 90 days).
