A number of transit agencies in the greater Baltimore region have overlapping service areas, and as a result some stops are shared between operators and/or networks. Typically, multiple signs for the different services are fixed to the same post, and the stop usually has a different reference number in each system it serves. If there is a shelter, it is operated by just one of the agencies that uses the stop.
Locally, Maryland Transit Administration (MTA) stops are the most thoroughly mapped, and the MTA functions as the “primary” operator for the region, with connecting agencies often functioning as supplements to it rather than alternatives. In auditing the bus stop information on the map, I had initially thought adding shared stop information to a highway=bus_stop node would involve using operator namespaces like ref:MTA= so that it is clear which agency uses that stop number. The distinction between a shared stop node and two different stop nodes for overlapping services seems like an important one, as in some cases an overlapping service can have a stop very close to another rather than sharing one. This can make a difference in the logistics of connecting between services or assessing some of the finer details of bus stop placement.
A leaning tower of bus stop signs at Washington Monument in Baltimore
However, on further consideration, this proved to be more complicated than I had assumed. How would you decide which namespace to use, and make it clear whether that namespace refers to the operator or the network? Would it make sense to re-tag non-shared stops with the namespace for consistency? What do you do if the different services have different names for the same stop? Although some existing local tagging schemes use operator namespaces for shared stops, it seems like it would be a challenge for data consumers to use this information. I am not aware of any that know how to read information about a stop coded in this way.
The tentative approach I am considering is based on the idea that bus stop nodes are an abstraction with some similarities to address nodes. Address nodes may have ranges of house numbers in them, in the same way bus stops may have a list of route_ref designators that share a service, but it would be considered wrong to combine address nodes that have different street names, postcodes, or city names. (These can occur in the same building or place, but would still be added separately.) Though they share a location they are still different features as they have incompatible properties. Shared bus stops could be placed as separate nodes for different services in a row that is perpendicular to the way the routes run on.
The finer position of the bus stop node does matter because it affects how the bus stops and boards at a location, and where a stop sits within a block can make a difference for routing. (For example, proximity to a crossing can affect the time it takes to reach the stop.) The logic behind putting them in a row perpendicular to the way the routes are on is that the boarding position is preserved while the nodes do not have to be on top of each other. The position of a bus stop sign across the narrow width of a sidewalk is not completely immaterial, but it is not as significant as the position along its length. We don’t literally put stopping positions adjacent to the roadway either, so this is kind of similar.
This was a stop where I think it made sense to do this. The Sheppard Pratt Hospital Eastbound stop in the suburb of Towson, Maryland had been used by the Maryland Transit Administration’s Baltimore area local bus network alone until late 2021, when Baltimore County launched their own bus transit operation. The new Baltimore County Loop shares this stop now, but has a different sign on the same post, a different ref number, and even a different name within the system. There is no perfect place to put them but I just positioned them with the bus stop sign sandwiched in between so they are still at the same point on the length of the sidewalk. This approach seems more likely to be understood as a shared stop configuration by passengers and existing data consumers. These are both in the same stop area relation, so if you wanted to determine where shared stops are and what their operators are you could refer to that instead of trying to parse a more unwieldy shared tagging scheme.
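As a rough illustration of that last point, here is a minimal sketch of how a data consumer could find shared stops from stop area membership alone. It assumes the relations and nodes have already been loaded into plain Python dictionaries. The IDs, the ref values, the operator value for the Loop, and the function name shared_stops are all placeholders invented for the example, not real data.

```python
from collections import defaultdict

# Hypothetical, hand-written data. The ref values are placeholders,
# not real stop numbers from either system.
nodes = {
    1: {"highway": "bus_stop", "operator": "Maryland Transit Administration", "ref": "MTA-REF"},
    2: {"highway": "bus_stop", "operator": "Baltimore County", "ref": "LOOP-REF"},
    3: {"highway": "bus_stop", "operator": "Maryland Transit Administration", "ref": "OTHER-REF"},
}
stop_areas = {
    100: {"tags": {"type": "public_transport", "public_transport": "stop_area"}, "members": [1, 2]},
    101: {"tags": {"type": "public_transport", "public_transport": "stop_area"}, "members": [3]},
}


def shared_stops(stop_areas, nodes):
    """Return {relation_id: {operator: [refs]}} for every stop area whose
    bus stop members belong to more than one operator."""
    result = {}
    for rel_id, rel in stop_areas.items():
        by_operator = defaultdict(list)
        for node_id in rel["members"]:
            tags = nodes.get(node_id, {})
            # Only count bus stop nodes that actually name an operator.
            if tags.get("highway") == "bus_stop" and "operator" in tags:
                by_operator[tags["operator"]].append(tags.get("ref"))
        if len(by_operator) > 1:
            result[rel_id] = dict(by_operator)
    return result


print(shared_stops(stop_areas, nodes))
```

Only relation 100 is reported, because it is the only stop area with bus stop nodes from two different operators. No special shared-stop tag has to be parsed.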
Discussion
Comment from Minh Nguyen on 24 May 2022 at 22:33
Thanks for this detailed description of the problem. My metropolitan area has some 40 different public transportation agencies that all overlap in exciting ways. For example, this train station is shared by two regional commuter railroads and an Amtrak-branded service run by a consortium of local governments. They share the same platforms, more or less, but all have different names, codes, and websites for the same station. One station is especially confusing because one of the railroads calls it by one name, but the railroad is part of the Amtrak network that calls it by a different name.
For the most part, we’ve been handling this situation using ad-hoc subkeys like name:Caltrain and railway:ref:ACE. However, this is unsatisfying because such subkeys stand no chance of ever being consumed by data consumers, especially if there’s any difference in how the network is spelled as part of a network value versus as a subkey.
I don’t think we should duplicate nodes to handle these situations. It’s still one bus stop, just with multiple signs and multiple services. The one feature principle comes to mind: duplicating the bus stops would throw off any statistics about the distribution of bus stops, and duplicating stop positions would require fudging some positions at heavily shared stops. (I would favor mapping multiple coincident traffic sign nodes if you’re getting into that level of micromapping, but that’s because there are multiple physical signs.)
Instead, I think it would be elegant to add the single bus stop node and single stop position node to multiple public_transport=stop_area relations, each corresponding to a different network, which in turn can be part of a single public_transport=stop_area_group relation. The network tag on the bus stop itself can establish the order in which the information should be listed when labeling the stop.
Redundant stop areas don’t seem like a big problem to me, because stop areas are abstractions anyways. By analogy, the multiple bus routes that serve this stop can have different networks and route numbers, but there’s no ambiguity as to which network corresponds to which route number, because each route has its own relation. That said, it would be nice to hear the opinion of someone more familiar with public transportation renderers, routers, or QA tools.
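To make the proposal a little more concrete, here is a minimal sketch of how a consumer might read it. It assumes each per-network stop_area relation carries its own network, name, and ref tags, and that the node's network tag is a semicolon-separated list. Both are assumptions of this sketch rather than an established convention. The names, refs, IDs, and the function labels_for_stop are placeholders.

```python
# Hypothetical data: one shared stop node that is a member of two
# per-network stop_area relations. Names and refs are placeholders.
node_id = 1
node_tags = {"public_transport": "platform", "network": "Caltrain;ACE"}
stop_areas = [
    {"tags": {"public_transport": "stop_area", "network": "ACE",
              "name": "Name used by ACE", "ref": "REF-B"},
     "members": [1]},
    {"tags": {"public_transport": "stop_area", "network": "Caltrain",
              "name": "Name used by Caltrain", "ref": "REF-A"},
     "members": [1]},
]


def labels_for_stop(node_id, node_tags, stop_areas):
    """Return [(network, name, ref), ...] for one stop, ordered by the
    node's own network tag."""
    # Listing order comes from the node's network tag.
    order = [n.strip() for n in node_tags.get("network", "").split(";") if n.strip()]
    by_network = {}
    for rel in stop_areas:
        tags = rel["tags"]
        if node_id in rel["members"] and "network" in tags:
            by_network[tags["network"]] = (tags.get("name"), tags.get("ref"))
    # Networks named on the node come first, in that order; any stop area
    # network missing from the node's tag is appended alphabetically.
    ordered = [n for n in order if n in by_network]
    ordered += sorted(n for n in by_network if n not in order)
    return [(n, by_network[n][0], by_network[n][1]) for n in ordered]


for network, name, ref in labels_for_stop(node_id, node_tags, stop_areas):
    print(network, name, ref)
```

Each name and ref stays attached to exactly one network through its relation, so nothing depends on how a network is spelled inside a subkey.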
Comment from عثمان ਉਸਮਾਨ bgo_eiu on 25 May 2022 at 20:41
Well, duplicating a stopping position is not something I would suggest because a stop can have multiple stopping locations even if it isn’t shared. A shared stopping position node would make sense where two services don’t have enough overlap in their timetables to arrive or depart at the same time, whereas if they do overlap often, one would have to stop further away from the stop sign/platform.
Here, the two stops on the main road serve 5 routes between them. There used to be a single bus stop name and sign, but the MTA split it into two at some point in a way that has some kind of vague logic to it. The 3 routes serving the stop on the left serve a north/central corridor of Baltimore and its suburbs, and the 2 on the right serve a more eastward region. This is a layover point for operators, and all of these routes are frequent enough that buses for all five of them can land at once, and between operators turning on the lights after the bus leaves or buses just not having working lights you’re just kind of expected to know which route will be in which location. When Baltimore County started its own suburban service, they just had to put a stop in the parking lot driveway because there’s no room anywhere else. The way I would set up the stop group relations here is one for each stop that has its own sign, so that it can be associated with its multiple stopping locations, and then a big group for this bus bay overall. (The biggest gap on the main road here is for the route served by an extended accordion bus.)
I’m curious what happens at bus arrivals at some of the monster stops in the Bay Area or greater LA because it seems like the spatial and timetable considerations that make that possible must be kind of different.
The situation where one signpost is used by different services which stop in the same place seems different in that the stopping position being the same means the same thing to passengers using either service, but the bus stop itself contains information/context related to the destination rather than just the location. Here, one of these stops will take you to York, Pennsylvania, on Rabbit Transit, the primary bus provider for that region which offers a pretty infrequent long distance service, and the other one is most frequently served by a route which goes south into Baltimore proper, but happens to be on the same side because it turns around after that intersection. (Unhelpfully, this intersection is called York & Pennsylvania out of coincidence. Most people wouldn’t guess that you can literally use it to travel to the place called York in Pennsylvania.) With this it’s hard to see them as the same object because they serve different purposes - each agency could move these respective stops to different locations and still consider them the same stop because of the routes they serve and their role within the network. Using the address analogy, I kind of see this as analogous to having one entrance (stopping position) but two nodes in a building for different POIs on floor 1 and floor 2.
For what it’s worth, before I just corrected it, there were four bus stop nodes here. I see two now and then from duplicate imports or leftovers from defunct systems, but not sure how that happened. If any statistics about bus stop distribution get thrown off, the data was off to begin with.
Rail stations are a bit different in that they don’t necessarily imply anything about the destination; with buses it can often be described as “the stop to get to work” or “the stop to go to school,” whereas rail stops are rarely described that way. A lot of transit routers and apps get this wrong and make the mistake of merging stops by proximity, or showing nearest bus stops, which is not the best configuration for bus transit. The information you need as a passenger is which routes serve your destination, and where those stops are. If that’s a mile away, that’s fine; trying to put you on a transfer circuit based on “close” stops, as Transit app, Google Maps, or similar do, only leads to confusion.
I actually don’t use official names on rail stations. The MTA made this a somewhat easy decision by renaming stations, forgetting they renamed them, and forgetting the official name so often that one sign or timetable to the next will have a different name on it. This results in the absurd situation where they’ve reused the same name on different stations, and assigned names to stations that refer to places other than where they’re located. That situation is not tenable and can only lead to people getting lost, so I just try to pick a name that I know any train operator or passenger would be able to tell which station it is from. I haven’t checked to see if stations served by different operators are referred to by different names between their services as well, but I’m kind of less inclined to say the station name even belongs to a transit agency - the public can have their own name for it, and it still has a name if it’s disused or being transferred between operators. It seems similar to the park name situation in that way; DC and Philly’s parks in particular have official names obscure or confusing enough that locals don’t recognize them. I’m completely unfamiliar with transit in California so this could be way off base, but if that were a station local to me, I’d go with “Santa Clara Transit Center” as the name because all the mapped bus routes there call it that, so that’s the name I would need to know regardless of what service I plan on using.
That said, not everybody even finds having one node for one rail station with one service/operator intuitive either by the looks of it. A lot of metro map schematics duplicate nodes for stops where you can transfer between lines, and looking at the DC Metro on OSM, the triple-transfer station of L’Enfant Plaza has in fact been mapped as three station nodes despite being one place. Google Maps does a node per platform, but because they don’t check these things, they have two nodes even at some single-tracking stations with only one platform.
Comment from عثمان ਉਸਮਾਨ bgo_eiu on 25 May 2022 at 20:50
OK, I did check, and Amtrak and the MARC regional commuter service share the BWI passenger rail stop (different from BWI Business District station, and “vanilla” BWI station, which are both closer to BWI airport). Now that I’ve noticed, I’m thinking of just renaming it to BWI Airport Amtrak/MARC station, because that’s how the station is signed in BWI Airport itself and they actually have to pick an unambiguous name, whereas the signs at the station itself, what MARC calls it, and what Amtrak calls it are all slightly different.
Comment from EvanSiroky on 4 June 2022 at 21:35
Having multiples of the same stop in GTFS datasets is incredibly common. There are a number of efforts underway to try to remedy this. In Norway they have a national stop registry. I believe there is something similar in the UK.
There is additional work to do this at a global scale by the Mobility Data organization, through the creation of shared identifiers.
Comment from عثمان ਉਸਮਾਨ bgo_eiu on 5 June 2022 at 09:32
Yes - documenting the duplicates, missing stops, and stops which don’t exist in the GTFS data is something I aim to do. I am not yet sure where the best place to put that information is, other than OSM, but it seems necessary to have an independent data source to compare against the official data.
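A minimal sketch of what that comparison could look like, assuming the agency's GTFS stops.txt is available and that the independently surveyed stops carry the same stop code in their ref tag. The sample rows, coordinates, codes, the 10 m threshold, and the function names are made up for illustration. Nearby pairs are only flagged as candidates for a manual check, since two real stops can legitimately sit close together.

```python
import csv
import io
from itertools import combinations
from math import asin, cos, radians, sin, sqrt

# Hypothetical sample in GTFS stops.txt format; names, codes, and
# coordinates are invented for the example.
GTFS_STOPS = """stop_id,stop_code,stop_name,stop_lat,stop_lon
A1,1001,Example St & 1st Ave,39.30000,-76.60000
A2,1002,Example St & 1st Ave,39.30001,-76.60000
A3,1003,Example St & 9th Ave,39.31000,-76.60000
"""
# Hypothetical ref values collected from independently mapped stops.
OSM_REFS = {"1001", "1003", "1004"}


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    p1, p2 = radians(lat1), radians(lat2)
    a = sin((p2 - p1) / 2) ** 2 + cos(p1) * cos(p2) * sin(radians(lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(a))


def compare(gtfs_text, osm_refs, threshold_m=10.0):
    """Report stop codes found on only one side, plus GTFS stop pairs
    that lie within threshold_m of each other."""
    stops = list(csv.DictReader(io.StringIO(gtfs_text)))
    codes = {s["stop_code"] for s in stops}
    near = [
        (a["stop_code"], b["stop_code"])
        for a, b in combinations(stops, 2)
        if haversine_m(float(a["stop_lat"]), float(a["stop_lon"]),
                       float(b["stop_lat"]), float(b["stop_lon"])) <= threshold_m
    ]
    return {
        "only_in_gtfs": codes - osm_refs,   # possibly unmapped, or not on the ground
        "only_in_osm": osm_refs - codes,    # possibly missing from the official feed
        "near_duplicates": near,            # candidates for a manual check
    }


print(compare(GTFS_STOPS, OSM_REFS))
```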