Interline Technologies is the maintainer of the Transitland open data platform. Transitland aggregates approximately 1,000 static GTFS feeds from across the United States (as well as real-time feeds, plus feeds from international sources). To power the platform's APIs, our staff and external contributors maintain the Transitland Atlas feed registry repository, which is publicly hosted on GitHub under a permissive license. This feed registry maintains lists of public GTFS feed URLs in addition to associated metadata (including US NTD IDs). The following comments regarding the proposed NTD reporting changes are based on our experience creating and operating these open-data systems since 2015.
Re Section B. Additional Data Within Publicly Hosted General Transit Feed Specification (GTFS) Datasets
Thank you to agencies and NTD for beginning to collect GTFS feeds from reporting agencies. The 2023 Annual Database General Transit Feed Specification (GTFS) Weblinks dataset is a useful public release.
To further increase the use of this weblinks dataset in future years, please guide as many agencies as possible to host their GTFS feeds at URLs that are public and stable. That is:
- URLs - hosting their GTFS feed on an agency-controlled website (instead of submitting feed versions as archives via email)
- public - a URL that third-parties can download from (rather than a private submission to NTD)
- stable - a URL that remains the same, while the GTFS feed archive itself is changed as the agency releases new schedules/versions (i.e., no dates in the file name)
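As a rough illustration of the "stable" criterion, the sketch below flags feed URLs whose file name appears to embed a date. The function name, the regular expression, and the example URLs are our own assumptions for illustration; this is a heuristic, not a check that NTD or Transitland is known to run.

```python
import re
from urllib.parse import urlparse

# Hypothetical heuristic: matches YYYYMM or YYYYMMDD (with optional -, _ or .
# separators) for years 1900-2099. It is not an exhaustive date detector.
DATE_PATTERN = re.compile(
    r"(?<!\d)(?:19|20)\d{2}[-_.]?(?:0[1-9]|1[0-2])"
    r"(?:[-_.]?(?:0[1-9]|[12]\d|3[01]))?(?!\d)"
)


def looks_date_stamped(url: str) -> bool:
    """Return True if the last path segment of the URL seems to contain a date."""
    filename = urlparse(url).path.rsplit("/", 1)[-1]
    return bool(DATE_PATTERN.search(filename))


if __name__ == "__main__":
    # Example URLs are invented for illustration.
    for candidate in (
        "https://example.org/gtfs/google_transit.zip",
        "https://example.org/gtfs/gtfs_2024-03-15.zip",
    ):
        label = "date-stamped" if looks_date_stamped(candidate) else "looks stable"
        print(candidate, "->", label)
```

A URL that passes this check can still change for other reasons, so it is best read as a quick screening aid rather than a guarantee of stability.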
Finally, please keep in mind that it's best for agencies to report to NTD the exact same GTFS feed versions that they share with third-party navigation apps and open-data aggregation platforms. This ensures that all data consumers are working from the same information, and makes it easier for data producers to focus their efforts. This is even more relevant for agencies that also produce GTFS Realtime feeds (which need to match certain identifiers within the associated static feeds). Our feedback regarding adding NTD IDs is also based on this goal: agencies should not have to produce feed files for NTD reporting that are separate from those they already produce for trip-planning and operational purposes.
Re Align agency_id and NTD ID Within GTFS File
We support the proposed goal of adding NTD IDs to GTFS feeds. NTD IDs are a useful "crosswalk" between the rider-facing data in a GTFS feed and the operational/planning/financial datasets in the NTD. However, we recommend that the means of adding NTD IDs to GTFS feeds be carefully considered. Interline strongly agrees with related feedback submitted from other commenters, including MBTA and the MobilityData non-profit (of which Interline is also a member).
We recommend:
- NTD IDs be added to GTFS feeds in a flexible manner that can map on to an entire feed, an agency within a feed, or a subset of routes within a feed. This is necessary because rider-facing "brand names" of transit agencies do not always line up 1-to-1 with NTD reporting IDs, nor do NTD reporting IDs necessarily line up 1-to-1 with the way an agency's internal data systems organize the records they produce in agency.txt.
- NTD IDs be added to GTFS feeds in a way that does not conflict with existing data-producing systems. This is toward the goal of agencies reporting their existing GTFS feeds to NTD, rather than creating "one off" feed versions that they customize and submit once a year to NTD. Based on our experience working with a wide range of transit agencies, we recommend that data-producers have the choice to either add an entirely new file or add columns to existing files. For some agencies, it may be simpler to add a single hard-coded file to their feed, while for other agencies, it may be simpler to add an extra custom column to the relevant files in their feed.
- NTD IDs be added to GTFS feeds in a way that does not conflict with existing data-consuming systems. Consumers can choose to accept or ignore new columns and/or new files in GTFS feeds. This makes it simpler for them to "opt in" rather than to have to adapt to a change in the meaning of agency_id.
Our specific recommendation for your consideration:
- Give reporters two approaches for adding NTD IDs to their feeds.
  - The first approach will be to add a file named us-ntd-ids.txt (or similar) to their feeds. This would be a CSV file in which each row maps an NTD ID to a chosen entity (e.g., an agency or a route). For a simple situation in which there is one NTD ID for a single agency in a feed, this file would only have one row (after the header row); for a complex situation, there may be multiple rows defining how different agencies and/or routes are mapped to different NTD IDs.
  - The second approach will be to add a us_ntd_id column (or similar) to existing feed_info.txt, agency.txt, and/or routes.txt files. Data producers could add this column to just the files in which they want to list one or more NTD IDs.
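To make the first approach more concrete, here is a minimal sketch of what such a mapping file might look like and how a consumer might read it. The column names (ntd_id, agency_id, route_id), the convention that a row with neither agency_id nor route_id applies to the whole feed, and the NTD ID values themselves are all our assumptions for illustration; the actual layout would be for NTD and the GTFS community to settle.

```python
import csv
import io

# Hypothetical contents of a us-ntd-ids.txt file. The IDs are made-up placeholders.
SAMPLE_US_NTD_IDS_TXT = """ntd_id,agency_id,route_id
99991,,
99992,agency_b,
99993,agency_b,route_42
"""


def parse_ntd_mappings(text: str) -> list:
    """Parse mapping rows and label each with the scope it applies to."""
    mappings = []
    for row in csv.DictReader(io.StringIO(text)):
        ntd_id = (row.get("ntd_id") or "").strip()
        if not ntd_id:
            continue  # skip rows that carry no NTD ID
        agency_id = (row.get("agency_id") or "").strip() or None
        route_id = (row.get("route_id") or "").strip() or None
        # Assumed convention: the most specific populated column sets the scope.
        if route_id:
            scope = "route"
        elif agency_id:
            scope = "agency"
        else:
            scope = "feed"
        mappings.append(
            {"ntd_id": ntd_id, "scope": scope,
             "agency_id": agency_id, "route_id": route_id}
        )
    return mappings


if __name__ == "__main__":
    for mapping in parse_ntd_mappings(SAMPLE_US_NTD_IDS_TXT):
        print(mapping)
```

In the simple case of a single agency with a single NTD ID, the file would contain only the header and the first data row. Under the second approach, the same IDs would instead appear in a us_ntd_id column of whichever existing files the producer chooses.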
Our overall recommendation is to provide flexibility to data-producers in where and how they add US NTD IDs. This will avoid the uncertainty that comes with changing existing fields (such as the meaning of agency_id). This will also reduce the technical trade-offs for agencies with complicated existing IT systems, which may limit how they can change or customize their GTFS feeds.
This proposal does slightly increase the burden on data-consumers, who must extract the relevant NTD IDs from GTFS feeds and figure out how to apply them to feed contents. We believe that is a reasonable burden for data-consumers to take on. Transitland already performs similar operations across the thousands of GTFS feeds it consumes. In exchange, the benefit of this approach is that it encourages agencies to fit NTD IDs into their production systems (rather than producing one-off feed versions just for NTD reporting).
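As an illustration of the kind of work this would ask of data-consumers, the sketch below resolves the NTD ID for a route by preferring the most specific mapping available. The precedence order (route, then agency, then feed-wide default), the function name, and the sample IDs are our assumptions; this is not a description of how Transitland implements it.

```python
from typing import Dict, Optional


def resolve_ntd_id(
    route_id: str,
    agency_id: Optional[str],
    by_route: Dict[str, str],
    by_agency: Dict[str, str],
    feed_default: Optional[str] = None,
) -> Optional[str]:
    """Return the NTD ID for a route, using the most specific mapping found.

    Assumed precedence: a route-level ID overrides an agency-level ID, which
    overrides a feed-wide ID. Returns None when nothing applies.
    """
    if route_id in by_route:
        return by_route[route_id]
    if agency_id is not None and agency_id in by_agency:
        return by_agency[agency_id]
    return feed_default


if __name__ == "__main__":
    # Made-up placeholder IDs and identifiers.
    by_route = {"route_42": "99993"}
    by_agency = {"agency_b": "99992"}
    print(resolve_ntd_id("route_42", "agency_b", by_route, by_agency, "99991"))
    print(resolve_ntd_id("route_7", "agency_b", by_route, by_agency, "99991"))
    print(resolve_ntd_id("route_1", "agency_a", by_route, by_agency, "99991"))
```

The lookup works the same way whether the mappings came from a separate file or from extra columns in existing files, which is part of why we consider the added burden on consumers modest.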
Finally, we recommend reviewing uptake of these two approaches by agencies after a year or two. Perhaps one approach will prove sufficient on its own, and the requirement can be narrowed in the future. Still, we recommend beginning by giving reporting agencies more flexibility.
Re shapes.txt File (Geospatial Drawing of Routes) as Part of GTFS Submission
We agree that recommending agencies add shapes.txt files for all routes/trips will be beneficial to data consumers, especially trip-planners, GIS analyses, and visualizations. However, in our experience, the quality of shapes.txt geometries can vary wildly. Imprecise and inaccurate shapes can often be worse than no shapes. Data consumers such as Transitland and trip planning apps already have to run automated checks to decide whether to ingest the shapes.txt records for a given trip or to ignore them.
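To give a sense of what such an automated check can involve, here is a deliberately coarse sketch: it rejects a shape if any stop on the trip lies farther than a threshold from every vertex of the shape. The 200 m threshold, the function names, and the nearest-vertex simplification are our assumptions for illustration; this is not the check that Transitland or any particular trip planner runs, and a production check would measure distance to the shape's line segments rather than only its vertices.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    radius_m = 6371000.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


def shape_is_plausible(shape_points, stop_points, max_stop_offset_m=200.0):
    """Coarse plausibility test for one trip's shape.

    shape_points and stop_points are lists of (lat, lon) tuples. A shape with
    fewer than two vertices, or with any stop farther than max_stop_offset_m
    from its nearest vertex, is treated as not worth ingesting.
    """
    if len(shape_points) < 2:
        return False
    for stop_lat, stop_lon in stop_points:
        nearest = min(
            haversine_m(stop_lat, stop_lon, lat, lon) for lat, lon in shape_points
        )
        if nearest > max_stop_offset_m:
            return False
    return True


if __name__ == "__main__":
    # Invented coordinates for illustration.
    shape = [(37.000, -122.0), (37.001, -122.0), (37.002, -122.0)]
    print(shape_is_plausible(shape, [(37.000, -122.0001), (37.002, -122.0001)]))
    print(shape_is_plausible(shape, [(37.050, -122.0)]))
```

Because sparse shapes can fail a nearest-vertex test even when they are accurate, the threshold would need tuning against real feeds before being relied upon.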
The shapes guidance webpage published by MobilityData is worth sharing with agencies. It's also worth reiterating to agencies the value of the GTFS "best practices."
Thank you for accepting this feedback.