
At long last, I’ve done a complete pass over municipal and CDP boundaries in New York State. Barring errors and omissions (I daresay there must be some) every incorporated community and every CDP in the state has had its border checked against NYSGIS Civil Division Boundaries and TIGER/Line 2021 respectively, almost always resolving conflicts in favour of the former. All have place=* nodes representing them, with the node a label member of the boundary relation.

Populations are updated as of the 2020 Census. GNIS, FIPS, NYS SWIS, Wikipedia and Wikidata links are provided.

Most of the remaining work that I’d have to do before I consider the job to be done has to do with the tagging on the place=* nodes. Right now, they’re a hodgepodge. Most of them came in from the TIGER import of 2008 with place=* representing their form of government. This is NOT an indication of the significance of the place. Brentwood, Long Island, a bustling community of over 60,000 souls, is tagged place=hamlet because it does not have home rule. Geneva, a sleepy lakefront village of 3400 inhabitants or so, is tagged place=city because it has a city charter.

For a first stratification, I’d propose simple thresholding on population:

  • City: At least 50000 inhabitants.

    This would encompass New York, Buffalo, Yonkers, Rochester, Syracuse, Albany, Schenectady, Utica, White Plains and Troy. The largest communities not to make the cut would be Niagara Falls and Binghamton. The ‘city’ tag would also fall on the suburban communities of Ramapo, Amherst, New Rochelle, Cheektowaga, Mount Vernon, Brentwood, Clay, Hempstead, Town of Tonawanda, Levittown, and Irondequoit.

  • Town: >4800 inhabitants.

    This was a somewhat arbitrary cutoff. I wanted it to include Saranac Lake (pop. 4887) because that community has the only hospital for many miles around, and has an airport with scheduled, albeit infrequent, service. The threshold could be set higher if the manual work of identifying the sites of such facilities as hospitals, universities, airports, major markets, and so on were to be attempted, but I’d consider that to be Out of Scope.

  • Village: >1000 inhabitants.

    Totally arbitrary, there’s a long tail and you have to cut it off somewhere.

  • Hamlet: Smaller.
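The thresholds above can be sketched as a small function. This is only an illustration of the proposed first stratification, not a tool that was actually run; the function name is hypothetical, and the Brentwood and Geneva figures are the approximate ones quoted earlier in this post.

```python
def classify_place(population: int) -> str:
    """Return a proposed place=* value from a census population count."""
    if population >= 50_000:   # City: at least 50000 inhabitants
        return "city"
    if population > 4_800:     # Town: >4800 inhabitants
        return "town"
    if population > 1_000:     # Village: >1000 inhabitants
        return "village"
    return "hamlet"            # Hamlet: smaller


# Populations as quoted in the post (the last two are approximate).
for name, pop in [("Saranac Lake", 4887), ("Brentwood", 60_000), ("Geneva", 3_400)]:
    print(name, classify_place(pop))
```

A pure population cutoff like this is deliberately crude; any one-off exceptions for regionally important places would have to be layered on top by hand.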

There are some tagging anomalies that also need attention.

  1. For townships that didn’t have an identifiable population center with the same name as the township, I reimported label nodes from GNIS. I tagged these with not:place=town place=region to indicate the fact. I selected region because it was available as a JOSM preset, but I now realize that the Wiki mentions place=municipality, and that seems to be a better fit. I’ll make this change as well.

  2. The only correct use of place=suburb among the objects I’ve examined is on the five boroughs of New York City, which fit the OSM definition. There are other communities that are mistagged place=suburb because they are near to a major city, but that’s not correct tagging.

  3. CDP’s that don’t correspond to identifiable unincorporated communities (for instance, the ones that represent resident university campuses) are tagged place=locality and this should most likely be left alone. CDP’s that represent portions of a city or surround subdivisions, I’ve retagged place=neighbourhood and these too should be left alone.

  4. Somewhat controversially, I’ve left boundaries of most CDP’s as boundary=administrative. I know for certain that the ones in Nassau County, at the very least, actually are administrative subdivisions without home rule - the towns of Hempstead, North Hempstead, and Oyster Bay all designate hamlets, and often promulgate things like parking regulations and zoning ordinances by calling out the hamlets by name rather than repeating the boundaries in each piece of legislation. I figured that in doubtful cases, it’s better to show the boundaries than to hide them.

  5. Even more controversially, most incorporated communities have an office=government node taking the administrative role, and showing the location of the town administration (the town hall or equivalent) and contact information for general inquiries (usually the town clerk’s office). This is a total abuse of the tag - it’s supposed to identify the capitAl, not the capitOl. Nevertheless, it provides useful information, and I believe that instead of deleting the relation members wholesale, it would probably be better to rename the role.
    Does anyone think it would be worthwhile to work up a proposal for a seat role (or something similar - the Naming of Names is an area that I try very hard to steer clear of)?
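The retagging described in item 1 of the list above is mechanical enough to sketch. This is a minimal illustration over a plain tag dictionary, not the JOSM workflow actually used; the function name and the sample node are hypothetical.

```python
def retag_township_label(tags: dict) -> dict:
    """Return a copy of the tags with place=region changed to
    place=municipality, but only on the GNIS-derived township label
    nodes, which are recognisable by not:place=town."""
    new_tags = dict(tags)
    if tags.get("place") == "region" and tags.get("not:place") == "town":
        new_tags["place"] = "municipality"
    return new_tags


# Hypothetical label node for a township with no same-named population center.
node_tags = {"name": "Example Township", "place": "region", "not:place": "town"}
print(retag_township_label(node_tags))
```

Checking not:place=town as well as place=region keeps the change from touching any unrelated place=region nodes.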


Discussion

Comment from Minh Nguyen on 24 August 2022 at 10:32

The thought you’re putting into this boundary mapping and cleanup effort is setting a great example for us to follow in other states that have their own vagaries.

This was a somewhat arbitrary cutoff. I wanted it to include Saranac Lake (pop. 4887) because that community has the only hospital for many miles around, and has an airport with scheduled, albeit infrequent, service. The threshold could be set higher if the manual work of identifying the sites of such facilities as hospitals, universities, airports, major markets, and so on were to be attempted, but I’d consider that to be Out of Scope.

You’ve just justified a one-off exception for Saranac Lake, which would allow you to set a rounder overall threshold that doesn’t sound so arbitrary. Some mappers may be inclined to second-guess or ignore arbitrary-sounding rules.

Somewhat controversially, I’ve left boundaries of most CDP’s as boundary=administrative. I know for certain that the ones in Nassau County, at the very least, actually are administrative subdivisions without home rule - the towns of Hempstead, North Hempstead, and Oyster Bay all designate hamlets, and often promulgate things like parking regulations and zoning ordinances by calling out the hamlets by name rather than repeating the boundaries in each piece of legislation. I figured that in doubtful cases, it’s better to show the boundaries than to hide them.

It sounds like these particular imported CDPs are coincidentally coincident to real places that should have been mapped as administrative areas but, like minor civil divisions, were omitted from the TIGER boundary import. You may want to add border_type=* so that someone doesn’t come along, see “CDP” inside tiger:NAMELSAD, and think it only represents a CDP and therefore should be retagged as boundary=census.

This is a total abuse of the tag - it’s supposed to identify the capitAl, not the capitOl. Nevertheless, it provides useful information, and I believe that instead of deleting the relation members wholesale, it would probably be better to rename the role.

I’ve been using operator and operator:wikidata to associate a government’s headquarters with the boundary relation representing the government’s jurisdictional area. The operator:wikidata tag of the amenity=townhall or office=government would match the wikidata tag of the boundary.[1] I find this approach to be more flexible in cases where a government’s offices aren’t centralized in a single building, typical of county governments in some states. It’s also consistent with tagging for company headquarters, park offices, university administrative offices, etc.

If the office must be a member of the boundary relation, then a seat role would be an improvement. But this comes uncomfortably close to site relation semantics, for something that isn’t as compact as a site.

  1. More precisely, there would be separate Wikidata items for the place versus its government, linked by the authority and applies to jurisdiction properties. Data consumers would need to consult the Wikidata API or a database extract or query the Wikidata Query Service to determine the relationship between the office and the boundary. But so far I’ve yet to come across a compelling articulation of why a data consumer would need to automatically associate these things anyways. 
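As a rough illustration of the simple case described in this comment, where the office’s operator:wikidata equals the boundary’s wikidata tag, a data consumer could pair them up as below. This is a sketch over plain dictionaries; the function name, element layout, and QID placeholders are all hypothetical, and it ignores the separate place-versus-government items mentioned in the footnote.

```python
def offices_for_boundary(boundary_tags: dict, offices: list) -> list:
    """Return the offices whose operator:wikidata matches the
    boundary relation's wikidata tag."""
    qid = boundary_tags.get("wikidata")
    if qid is None:
        return []
    return [o for o in offices if o["tags"].get("operator:wikidata") == qid]


# Hypothetical data; the QIDs are placeholders, not real Wikidata items.
boundary = {"boundary": "administrative", "wikidata": "Q_EXAMPLE_TOWN"}
offices = [
    {"id": 1, "tags": {"amenity": "townhall", "operator:wikidata": "Q_EXAMPLE_TOWN"}},
    {"id": 2, "tags": {"office": "government", "operator:wikidata": "Q_OTHER_TOWN"}},
    {"id": 3, "tags": {"office": "government", "operator:wikidata": "Q_EXAMPLE_TOWN"}},
]
print([o["id"] for o in offices_for_boundary(boundary, offices)])
```

Because the match is on a tag rather than a relation membership, any number of offices can point at the same government.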

Comment from gdt on 24 August 2022 at 13:02

I have long thought that place=foo and admin boundaries are not definitely related even though in most cases they match. The first is the hierarchy of “settlements” which one can determine by ignoring government and seeing where people live, and the second is government. Granted, typically governments are organized around where people live, so in New England there are town centers and town boundaries. But there are also secondary villages within towns, that historically were somewhat separate culturally.

So when putting place= and associating population, is that the population of some admin thing, or does one essentially tile with place= and count population in each polygon?

And then there is quarter/neighborhood as place, which are meant to be sub-parts of city, vs town/village/hamlet which aren’t. So in Acton, MA, it is a town (I think, <50K) in osm-speak, and there is South Acton and West Acton which are not separate by government but which have old town centers. In counting their population, is it removed from Acton’s? I think this situation needs a sub-part of town vs village, as in the modern world admin and locality are messily intertwined.

Also there is place=locality for places that have names but it’s not about people living there, but that can be avoided.

Comment from gdt on 24 August 2022 at 13:18

Also, I thought that OSM had population cutoffs for these terms, and I’d rather see an exception than a tweaked threshold. If you look at the one you want to promote, is the underlying reality that the number of people who consider that they sort of live there, even if outside some admin polygon, is higher?

Also, city and town are relative. In sparse regions, a place with a hospital and airport is a big deal as you say. That level of people, adjacent to a big city, is not worthy of promotion.

Comment from ke9tv on 24 August 2022 at 15:44

@Minh Nguyen:

I’m ok with a rounder arbitrary threshold with an exception for Saranac Lake. It turns out that “about 5000” seemed like the right level anyway, but I’m equally good with saying “5000, but make sure Saranac Lake is included because of its very high regional importance.”

Your suggestion of border_type to label the type of government that the border bounds is a good one. So good, in fact, that it’s already there. I didn’t trouble labeling Cities, Towns and Villages with CDP since they’re all CDP’s, but you’ll see a handful of just hamlet (mostly dissolved Villages that still have legal boundaries but have devolved to the Town government), CDP (for CDP’s that I know are not governmental at all, like university campuses or unnamed suburban regions like ‘Ithaca Northwest’ or cases where the CDP is misaligned to a political unit and I can’t fix it - as with my home town of Niskayuna), and hamlet:CDP for both the cases that I know are civil subdivisions and the cases where I’m not sure.

I retained City of XXX, Town of XXX and Village of XXX consistently in name=* simply because border rendering looks too weird without it. It’s quite common in New York to have City of Plattsburgh be the chartered city, and Town of Plattsburgh be the remainder of the township, which has its own government. The two governments are independent of each other, and both slot in at admin_level=7, so that’s no guide. It looks really strange to see a rendered political boundary with ‘Plattsburgh’ on both sides!

I did NOT put City of, Town of or Village of on place nodes with only a handful of exceptions. Town of Tonawanda is NOT an accident; it’s actually a name that’s in common use to distinguish it from the adjoining cities of Tonawanda and North Tonawanda. I’m blanking on the name, but there’s also a township in the Finger Lakes that’s named for a village OUTSIDE the town boundaries (the township got subdivided). (That’s one of the ones that got an artificial place=region (to be place=municipality soon) from the (previously non-imported) GNIS entry for the township.) Clearly, the Towns that were remainders when Cities were chartered also get this treatment. Town of Hempstead also got handled carefully because of its high significance: Town of Hempstead (admin_level=7) has about 800,000 inhabitants. It likes to bill itself as “America’s Largest Township.” But none of the inhabitants associates with it as a “home town”. If they say that they live in “Hempstead”, they’re talking about the Village of Hempstead within the town, itself a significant suburban city of about 60,000.

In general, it’s hard to deal with place nodes that represent different levels of the administrative hierarchy: the Village of Schoharie within the Town of Schoharie within the County of Schoharie. In general, I retained the smallest place because that’s what people identify with, and kept it as the label of all the same-named objects. It didn’t seem to make sense to have multiple nodes at the same location (and if they’re exactly at the same location, it gives a lot of the electronic tools heartburn). In all the cases I spotted where town and village are named alike (a couple of hundred), residents of the town outside the village generally identify as “coming from” the village or hamlet that they live in - unless they really are in the hinterland, but even then, they’ll often qualify their answer to say that they live outside the village.

If there were tiger:NAMELSAD tags on any of these things, they were gone before I got here. I knew that it was a TIGER import because so many had TIGER in the source tag, but there were no tiger:* tags on any of them. There were a lot of gnis:* tags on the places, and I got rid of them, saving only the feature ID. Nobody needs the redundant information of which state and county the features are in, or the fact that the node represents a “Populated Place”. Most particularly, nobody needs to have every place node state that New York is the 36th state in alphabetical order or that Wyoming County is the 61st county of the state.

It’s an easy enough operation to push operator and operator:wikidata down the admin_centre link. One use case I had in mind for going the other way is, “I just moved to town; where do I go for voter registration, dog licensing, property tax information, etc?” In all cases in New York, the town or city clerk’s office is a starting point. The clerk (an elected position) is the official custodian of records (and the boss of pretty much all the local bureaucrats). I don’t know how easy it would be to get that from the OSM data if we were to turn the relationship upside down the way you suggest.

Comment from ke9tv on 24 August 2022 at 16:02

@gdt:

Arbitrary population cutoffs were removed from the Wiki for place=* quite some time ago. Mappers weren’t following them at all, and were instead using something like the Christaller model that you suggest.

As far as fitting the OSM terms into a Christaller model, I don’t have any source of data for ‘number of people who say they live in a place’. What I have - what all of us have - is census figures, tied to arbitrary polygons. Generally, of course, these polygons follow the political boundaries.

In all these cases of administrative regions/CDP’s, the population - and the source for it - is tagged on the boundary, so that’s unambiguous information about what population was counted, who counted it, and when it was tabulated. (And in all cases in New York at admin_level>6 that’s now the 2020 Census. It appears someone else already did the counties.) That’s really the best any of us can do.

As for the population on place nodes, that pretty much is tagging for the renderer, but it’s information that at least some renderers use. At least, it’s not “lying to the renderer.” population=* on a place node is asserting “there is an enumeration region of this name containing this point which has the given population.”

It’s not ideal, it’s a starting point. It’s surely better than the mess we have, in which Geneva (pop. <4000) is a city because it has a city charter, while Brentwood (pop. >60000) is a hamlet because its only local government is the Town of Islip. I don’t think we’re going to do significantly better than a somewhat arbitrary framework without a lot more effort than I’ve put into this, and I’ve been working on this project for over half a year at this point.

Comment from Minh Nguyen on 24 August 2022 at 21:12

population=* on a place node is asserting “there is an enumeration region of this name containing this point which has the given population.” […] It’s not ideal, it’s a starting point.

Last year, there was a proposal to use the Census Bureau’s urbanized areas as the basis for population tags on place nodes. Urbanized areas ignore jurisdictional boundaries in favor of population distribution, which theoretically would line up better with what the place nodes represent, but the messy reality is that populated places are also a function of commerce and industry, which the urbanized area definitions don’t consider, and sometimes downright arbitrariness.

Ultimately, there’s no purely data-driven method for correctly sizing every place label on a map without some degree of human judgment. As you say, the population tags are just a starting point. It may be good enough for the “long tail” of places that a data consumer wouldn’t know how to classify manually.

It’s an easy enough operation to push operator and operator:wikidata down the admin_centre link. One use case I had in mind for going the other way is, “I just moved to town; where do I go for voter registration, dog licensing, property tax information, etc?” In all cases in New York, the town or city clerk’s office is a starting point. The clerk (an elected position) is the official custodian of records (and the boss of pretty much all the local bureaucrats). I don’t know how easy it would be to get that from the OSM data if we were to turn the relationship upside down the way you suggest.

To me, this use case doesn’t sound fundamentally different than searching for your state legislator’s constituent service office, police precinct, school board office, or power utility office. In general, we aren’t mapping service areas as boundaries, but some government offices happen to have service areas that conform to an administrative boundary. Even so, it’s up to the user to do their homework about which local office can help them.

In some states, things get too complicated to express in tags. For example, San José’s water utility – a bona fide part of city government – serves only 12% of the city, not including where I live. For most purposes, the county sheriff’s office serves unincorporated areas but not cities and towns. In a neighboring county, the county’s public health department doesn’t serve one city that has their own public health department. There’s a contract city nearby that contracts with other governments to provide basic services and generally doesn’t provide services “in house”.

As long as there’s a distinct item for the government as opposed to the place, then both the boundary and office could be tagged with the same operator:wikidata, making that a little easier. But I don’t think there’s very much a data consumer should infer based on that relationship.

Comment from ke9tv on 26 August 2022 at 02:17

As promised, place=region is dead; long live place=municipality.

I’m still working through the details of how to do a mechanically-aided edit to push operator and operator:wikidata down from the boundary onto the seat of government, and then break the admin_centre link. (My usual automate via JOSM tricks won’t quite work, because the API I’m using doesn’t edit relation memberships.) I do not want to do this as two thousand more changesets!
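The shape of that edit can be sketched on plain dictionaries, independent of JOSM or any real editing API. Everything here is illustrative: the function name and data layout are hypothetical, and using the boundary’s name=* as the operator=* value is an assumption rather than something settled in this thread.

```python
def push_operator_down(relation: dict, nodes_by_id: dict) -> None:
    """Copy operator tags from a boundary relation onto its admin_centre
    node(s), then drop those members from the relation (edits in place)."""
    name = relation["tags"].get("name")
    qid = relation["tags"].get("wikidata")
    kept = []
    for member in relation["members"]:
        if member["type"] == "node" and member["role"] == "admin_centre":
            tags = nodes_by_id[member["ref"]]["tags"]
            # setdefault leaves any operator tags already present untouched.
            if name:
                tags.setdefault("operator", name)
            if qid:
                tags.setdefault("operator:wikidata", qid)
        else:
            kept.append(member)
    relation["members"] = kept


# Hypothetical boundary relation and seat-of-government node.
nodes = {10: {"tags": {"office": "government"}}, 11: {"tags": {"place": "town"}}}
rel = {
    "tags": {"name": "Town of Example", "wikidata": "Q_EXAMPLE_TOWN"},
    "members": [
        {"type": "node", "ref": 11, "role": "label"},
        {"type": "node", "ref": 10, "role": "admin_centre"},
        {"type": "way", "ref": 500, "role": "outer"},
    ],
}
push_operator_down(rel, nodes)
print(nodes[10]["tags"], [m["role"] for m in rel["members"]])
```

Batching many relations per changeset would be a separate concern that this sketch does not address.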

Comment from they on 23 December 2023 at 16:31

While not based strictly off population, the US Census Bureau’s statistical areas seem to align well with common usage of the place=city tag in OSM - an important regional hub. Statistical areas were used for identifying regional centers in New Mexico’s highway=trunk network using this guidance: “a city is considered the urban core of any Metropolitan Statistical Area (MSA) or Micropolitan Statistical Area (μSA) that is not part of a Combined Statistical Area (CSA) that contains a MSA. If the μSA in question is part of a CSA that does not contain a MSA, then the largest μSA in that CSA shall be considered a city.” Most cities in New Mexico follow these criteria. The two exceptions - Deming and Grants - are towns of about 10,000 people located within a commutable distance to a much larger metropolitan area. A little more under the radar, when someone decided to tag every incorporated place in Wyoming as place=city, I used these same guidelines to clean that up, leaving only the Census’ statistical areas as cities.
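One possible reading of the quoted guidance can be written out as a small function. The wording is ambiguous, so this sketch assumes that the CSA restriction applies only to micropolitan areas and that the core of every MSA counts as a city. The class, the function name, and the sample areas are all hypothetical and are not real Census data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StatArea:
    name: str
    kind: str                  # "MSA" or "uSA" (micropolitan)
    population: int
    csa: Optional[str] = None  # name of the enclosing CSA, if any


def core_is_city(area: StatArea, all_areas: list) -> bool:
    """Apply the quoted rule under the assumption stated above."""
    if area.kind == "MSA" or area.csa is None:
        return True
    siblings = [a for a in all_areas if a.csa == area.csa]
    if any(a.kind == "MSA" for a in siblings):
        return False  # uSA inside a CSA that contains an MSA
    # CSA made up only of uSAs: only the largest one qualifies.
    return area == max(siblings, key=lambda a: a.population)


areas = [
    StatArea("Alpha", "MSA", 200_000, "X"),
    StatArea("Beta", "uSA", 30_000, "X"),
    StatArea("Gamma", "uSA", 20_000),
    StatArea("Delta", "uSA", 25_000, "Y"),
    StatArea("Epsilon", "uSA", 15_000, "Y"),
]
print([a.name for a in areas if core_is_city(a, areas)])
```

A stricter reading, in which an MSA inside a CSA is also excluded, would need only the first condition changed.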

For New York, it might make sense to only consider metropolitan areas as cities. That would align with your threshold of 50,000 inhabitants, except that the threshold would apply not to the population within the city limits but to the entire metropolitan statistical area. This would result in places like Binghamton, Ithaca, and Watertown retaining their city status while preventing places like Clay, a suburb of Syracuse with just over 50,000 inhabitants, from getting tagged as cities.

With that said, I think this mechanical edit will be a big improvement overall, and the cities will be easy to clean up after the fact.

Comment from RDreher on 20 May 2025 at 04:06

This is not how the state of New York works and I don’t like or approve of the way you have ruined the map. Population is an arbitrary and stupid way to decide if something is a village, town, city or not. It does not take into account density, number of shops and businesses, how walkable a place is, all the sorts of things that might factor in to whether a place feels like a sleepy village or an urban metropolis, or anything in between. On the other hand, the legal designations that the state of New York uses for our municipalities are clear and meaningful rules. More importantly, they form the language that we are used to using about our places. A New York town is usually a low-density rural place, while a village is actually usually a higher density place within that town. When my friend moved to Glenville and felt far away from us in Scotia, I and our friends would say “remember when he lived in the village?” Any New Yorker looking at the map after what you’ve done to it won’t understand what the heck they are seeing. There’s no reason the place= designation needs to reflect population. Population is listed as its own piece of information, and it makes exceedingly more sense if place= designates some formal, legal, and meaningful information to the people to whom that place is home.

Comment from Minh Nguyen on 20 May 2025 at 15:26

If you’re looking at a place=* tag to understand official designations, please consider looking at the associated boundary relation’s border_type=* tag instead. Tags like place=town are intended to represent population centers, somewhat irrespective of government structures. This may make place nodes less useful to you in orienting yourself on the map, but it’s a tradeoff in favor of harmonizing a system beyond state lines.

If we are interested in refining the population-based heuristics to account for density and business activity, first we need to choose a geographic extent independent of official municipal boundaries. This proposal calls for a more rigorous basis for place classification that generalizes nationally without as many fudge factors. It is a result of many difficult debates on the forum and elsewhere. But you don’t need to wait for this to happen to start using border_type=* today.

Comment from RDreher on 20 May 2025 at 16:43

And why should place= be used “somewhat irrespective of government structures??” Would you call something an apartment building just because it is large? Would you call a garden “farmland” because it has some carrots growing in it? Would you call any large church a “cathedral” irrespective of whether or not it’s the seat of a bishop? Would you call a Japanese shrine a park because you can’t understand how they worship?

Houses, apartments, etc are not defined by size. They are defined by the socially, economically, and politically defined systems that govern their use. A single family home could be a big McMansion being shared by a large multi-generation family, or an apartment building could be three or so separated stories with one tenant each that looks a lot like a house from the outside. Humans are not ants, you can’t just categorize our built environment like you’re some space alien from above who doesn’t understand us. Roads, buildings, land use regions, everything else on the map is informed by the ways the locals use and categorize those things. I studied anthropology, and anthropologists know how imposing outside categories when describing a culture leads to misunderstanding, and rather they must try to understand the way that culture organizes the world, and then find the best translations for that.

In the state of New York, the way we organize the world is legally defined and clear. These legal terms also are how we locally understand our places. Nobody in their right mind would call Scotia, NY a “town,” as we all know it as “the village.” Latham, NY is getting built up recently, but it’s still mostly just suburban sprawl and strip malls, not much of a walkable village or “town” in the OSM sense. Rather, it’s just one named center of the sprawling suburban landscape that is the Town of Colonie, that is to say it’s a hamlet. You can’t understand that just by looking at its population. The village of Woodridge, NY is dead for most of the year, until summer hits and the Jewish community floods it with life and traffic. All its businesses are mostly open seasonally. Is Woodridge an itty-bitty backwoods “hamlet” in the OSM sense, or “village” or “town?” You can’t make these calls in any meaningful way without understanding the local customs, laws, and terminology. On the other hand, the State of New York has already created a system with which we designate our settlements, and it’s clear and it’s precise, and it’s how we actually talk about our own places. Yet that useful and more culturally informed method of designation was washed away by one guy with his bots and his OCD.

It is absolutely clear to me that the current way of dealing with hamlets, villages, towns, and cities on OSM is absolutely imperialistic and uninformative. Anthropologists used to be in the business of “categorizing” people and their cultures with broad brush strokes to compare humans: this culture lives in a tribe, while this one a more developed “kingdom,” and these in a complex “state.” Can you see why this approach is condescending and fails to understand the intricate social structures particular to each individual culture? Right now this is what you are doing with the way place= terms are being used. Anthropology today now favors a more relativistic approach, one that tries to see cultures on their own. For the people of the State of New York and anywhere in the US, that means that if we say it’s a village and not a town, then goddamnit it’s a village!

Comment from Minh Nguyen on 20 May 2025 at 16:59

Please don’t get too caught up in the literal word place. It’s just an unfortunate keyword in our tagging ontology, a database column name. A data consumer can choose to use or ignore that classification. Most renderers ignore it, only using the node for its location, name, and population. Most geocoders ignore it as long as the node is linked to a boundary relation.

And yes, to some extent, any vaguely nationally or internationally harmonized classification system is going to be “imperialistic and uninformative”, to use your words. This particular set of keywords started in the UK, like the rest of OSM. You have no idea how problematic it is for OpenHistoricalMap, which has to shoehorn precolonial societies and more into OSM’s tagging system! Hence border_type=*, an utterly locally oriented key. We Americans are big fans of that key and are trying to position it as an alternative. Please use it.

Comment from RDreher on 20 May 2025 at 17:57

Ok, you’ve quenched my fury a bit. I commend you on your disarming ability to sympathize with my cause. Real conflict resolution skills there.

I understand now how this flaw in the system and its practices I’ve stumbled into and have gotten worked up about is sort of a “known bug,” that it’s a feature of how the convoluted system has evolved and that the community is already figuring out ways to better work around it. It’s rather infuriating that basically the terminology is convoluted and confusing in that way, and as you can tell from my experience it is exceedingly unintuitive for people just getting started contributing.

I still don’t like it of course, and what I don’t understand is, if the exact designation of place= doesn’t matter that much, why the heck did ke9tv here have to go and change every place in New York to be based entirely on population? Can we just get rid of place= designations for municipalities entirely, and just use something more generic? If we can’t do that, and it’s a matter of which obsessive compulsion wins out, then ke9tv’s obsessive compulsion to make the place= match population size is no more valid than my desire to have this designation aligned with their official designations. If it doesn’t actually matter, why not do it the way I’d like? Ugh, it’s just so frustratingly arbitrary and obnoxious.

Comment from Minh Nguyen on 20 May 2025 at 18:16

Thanks for your patience. We’ve all been there.

Arbitrary is a good word for it. We can’t quite get rid of the place=* key, because nothing else would indicate the node represents a population center (even an informal one like a neighborhood). But I do think of it as a backwards compatibility shim for the most part. Kevin’s retagging from a few years ago reset what had previously been a chaotic, difficult to explain patchwork of tagging up to that point – not at all aligned with official designations either. We’ve had to do similar stuff in other parts of the country, especially New England and the Midwest where people came with their own ideas about how to shoehorn local terminology into the existing software-supported tags.

The proposal I linked above would attempt to repurpose place=* values based on Census Bureau concepts, which are their own ball of fun. The idea is to solve another problem we have, which is that you can’t determine based on OSM data whether a given place is a suburb in the North American sense of the word. (place=suburb means something different than what you’d expect, because keywords largely use British English for historical reasons.)

Comment from RDreher on 20 May 2025 at 18:32

I look forward to any progress you get changing how place= works so that it might reflect something more meaningful to the actual locality. I for the time being need to find a way to just get over it, as obviously I have better, more healthy things to do with my life than moan about how OSM tags work. I realized from the start that I would just be screaming into the void getting worked up about this, but I hope you can understand at least why I had such an emotional reaction to this seemingly benign thing.

Comment from gdt on 20 May 2025 at 19:21

The key definitional aspect, which I think is the reason for ~all disagreement, is that there are two more or less unrelated concepts.

One is units of government and administrative boundaries.

The second is what human geographers call “populated places” which is (perhaps was long ago) about clusters of dwellings, totally independent of administrative boundaries. This is what the village/town/city is about.

Now, we blur these. In the US, we blur them partly because almost always, admin boundaries were established to align with the places. And, because there has been so much development that in between village centers there is just a sea of houses, rather than farmland or forest. So talking about where a place ends is really difficult.

I would say the path forward is in being clear about tagging for admin boundaries, and not insisting on using any particular state’s label for the kind of entity. In MA level=8 is either a city or a town, and every bit of MA is in exactly one such entity. They are the same thing, but cities have city councils and usually mayors, and a town has a town meeting and a Board of Selectmen (often renamed SelectBoard around me, but I’m in MA). Trying to put this in tags doesn’t really make sense because the next state has different rules and you really want tags to make sense for data consumers without state-specific lookup tables. I know NY is more complicated.

What we should be doing about place names is less clear. There are village centers with names, separate from admin boundaries, and people know those names. They belong on the map. But the idea that they have populations is very difficult because they are nodes, not polygons, and really should not be polygons.

The big point is that names for places as points and admin boundaries are really separate.

Comment from RDreher on 21 May 2025 at 03:28

Minh Nguyen, gdt, you two are the only people I’ve talked to about this so far that at least understand my perspective and admit the current system has its problems. Against my better judgement I have been posting comments on changes concerning this issue up and down the map today, and the vast majority of people just obliviously refuse to understand the relativistic nature of place terminology and that municipalities are the purview of socially constructed reality, not a reality we can easily define through universal and objective principles. I don’t think ke9tv gets this either.

Let’s face it, all of us are basically broken, neurodivergent people who have found that contributing to this map can momentarily satiate that unquenchable burning in our brain that needs to impose meaningful order on the anxiety-inducing chaos that is life. I get it. What I take issue with is a naive mentality that insists on imposing artificial order on what is supposed to be a messy, organic reality. Every one of us probably had that time in life when we could not get over the fact that English spelling is not perfectly rational and phonetic, that it has weird quirks and exceptions. I sure was that kid once, but eventually I learned linguistics and the beautiful and organic evolution that led to our quirky spellings, and now I see it as wonderful, not infuriating. ke9tv seems to be displaying that kind of narrow and obsessive understanding of what hamlet, village, town, and city should mean. He calls New York’s designations “messy” and the result of “historical accident.” What he calls “messy” I call our reality. What he calls “accident,” I call organic development. It’s like he learned what these words “should” mean as a kid and refuses to evolve his understanding to fit actual social usage, and then just resents it. To me, the forms of government ABSOLUTELY are what defines a village versus a town versus a city, because these are what form the foundation of our social reality and constructed environments, and trying to organize it any other way imposes a naive and arbitrary order that is unnatural and meaningless to me.

For example, if you live in a village in NY, you very likely have sidewalks, because the statewide village law stipulates regulations that encourage sidewalks. Towns often do not. You can often know you have left the Village of Scotia when the sidewalk ends. This physical aspect of our built environment is bound up in the political structure of our community. Yes, we have tiny cities and large villages, but so what? Why should we abhor this reality and insist on a much more simplistic and rudimentary understanding of what these terms should mean? Just like the actual intricacies of English spelling can have a perfectly valid logic of their own, the way the State of New York designates its places has a perfectly valid reasoning to it; it’s just not the simplistic one the OSM guidelines and current community practice espouse.

The more I think about it, using the US Census Bureau concepts for place=* sounds more and more like a great idea to me, so I encourage you to keep pushing it, or some similar reform. Since I understand now that border_type=* is doing the much more important work of giving the actual state designations, the incorporated/unincorporated distinction of the US Census Bureau would make place=* designations work together with border_type=* designations to make the nature of different states’ hamlets/villages vs. villages/boroughs more clear, making the thing do something actually useful instead of just arbitrarily organizing places into childish categories.

I’m too much of a miserable, anti-social nutjob to join forums and engage more deeply with the community on this. I have to take care of my own mental health and dealing with thoughtless conformists insisting I’m wrong does not help. I thank you two for at least engaging with my thoughts and feelings, and kindly explaining to me how the current state of things came to be.

Comment from gdt on 21 May 2025 at 12:12

It is not correct to assume that because NYS labels something (as an administrative boundary/entity) a village, it corresponds to what OSM calls a village. Tags mean what they are defined to mean, as if they are uppercase defines in a C header file. In particular, they do not change meanings as one moves to different jurisdictions with different legal definitions of the terms.

In MA, there is more or less no definition of village. A ‘town’ is merely a municipality (level8) with a particular form of government. Some of them feel like osm-cities. Some MA-cities feel like osm-towns. Some MA-towns feel like OSM-hamlet because if you blink you won’t notice them.

So my advice is stop expecting the OSM tags to match the local words. The point of OSM is to have a consistent semantic representation, not to encode local words.

Comment from ke9tv on 22 May 2025 at 22:15

I think you and I are mostly in agreement.

TL;DR: Population is a terrible surrogate for the nebulous ‘local importance’ that place=* is supposed to represent. The form of government is an even worse surrogate (demonstrated with examples below). Since the only real use of place=* is to choose the typeface on small-scale maps, and to inform geocoders in areas where the municipal boundaries are not available, it’s wise to follow the OSM convention so that small-scale maps on the OSM web site will have at least some reasonable choices of lettering on place names. I didn’t choose the place=* convention, and I’m not trying to defend it. I simply chose to follow it, having absolutely no control over how the OSM renderers interpret it, and hoped that if I inverted the importance of the places, mappers familiar with the places would fix it.

Form of government (County/Borough, City, Town, Village, Hamlet)

You want to be able to access the form of government and precise municipal boundary for hamlets, villages, towns, cities, and boroughs/counties in New York. You have that information in the boundary relation, including having the form of government in border_type. That tag is local - across the border in the New England states, there are only vestigial counties, carved up into towns and cities; in Pennsylvania, boroughs are a different thing from what they are in New York, and so on.
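To make the split concrete, here is a minimal Python sketch of how a data consumer might read the two pieces of information separately. The tag dictionaries and the name "Exampleville" are invented for illustration, not copied from the OSM database, and the helper names are hypothetical. It only assumes the keys already mentioned in this thread: boundary, border_type and place.

```python
# Hypothetical tag sets, invented for illustration (not real OSM data).
boundary_relation_tags = {
    "type": "boundary",
    "boundary": "administrative",
    "border_type": "village",   # local legal form of government
    "name": "Exampleville",
}
label_node_tags = {
    "place": "town",            # OSM-wide significance class, not the legal form
    "name": "Exampleville",
}


def form_of_government(relation_tags):
    """Return the local designation from border_type, or None if absent
    or if the relation is not an administrative boundary."""
    if relation_tags.get("boundary") != "administrative":
        return None
    return relation_tags.get("border_type")


def significance_class(node_tags):
    """Return the place=* value used for labelling, or None if absent."""
    return node_tags.get("place")


print(form_of_government(boundary_relation_tags))  # legal form
print(significance_class(label_node_tags))         # label class
```

The two values are allowed to disagree, which is the whole point: the legal form lives on the relation, the label class on the node.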

Census Bureau concepts

OSM supports the tagging boundary=census if you want to try to map the Census Bureau metropolitan (and micropolitan) statistical areas and the CDPs. I haven’t tried - that was not the purpose of the project that revised the municipal boundaries. In parts of the state, the CDPs coincide with the towns’ designation of boundaries of hamlets, and TIGER offered the best available representation of the boundaries (since the Census Bureau does try to support aggregating statistics for designated communities, even if the communities lack home rule).

Arbitrariness

place=* is what it is. I don’t particularly like it, but I tried to make it consistent with the rest of OSM, because it’s used in the renderer on the main map. (I don’t like that either!) We agree that it’s pretty arbitrary, but how big to make the lettering on a municipal name on a small-scale map is pretty arbitrary - and that seems to be the only significant thing that place=* is used for around here.

Historical Accidency

I will continue to argue that ‘form of government’ is not as informative as you make it out. It is not in any way indicative statewide of the character of the community. ‘Town’ is the least indicative: it ranges from Town of Red House (mostly forest, population less than 100) to Town of Hempstead (suburban to urban, population about 800,000 in a comparable land area), so we’re talking nearly four orders of magnitude in population density, and comparable differences in just about any feature of the human geography that you could name, apart from the form of government. About all that can be said is that a Town is the government you get if you don’t vote to incorporate. (In fact, some of our Towns don’t have central business districts or anywhere to hang a ‘place=*’ node that would make sense. They get nodes with ‘place=municipality not:place=town’ on the centroids of their land area, because every admin boundary is expected to have a ‘place’ node but we don’t want them rendered inappropriately. An example is Town of Tusten; the Tusten settlement was burnt to the ground in the French and Indian War and never rebuilt. The town offices are in the Village of Narrowsburg. (My brother’s farm is in the rural hinterland of the township, nearer to Yulan than to Narrowsburg.) There needs to be a place=* node for the township by OSM convention, but we use municipality because we basically want it never to render. If we were doing a custom rendering for a political map, we’d use the boundary relation in any case!)
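As a quick check of the "nearly four orders of magnitude" figure, here is the arithmetic as a small Python sketch. It uses only the rounded numbers quoted above (100 as an upper bound for Red House, 800,000 for Hempstead) and takes "comparable land area" to mean the density ratio is roughly the population ratio, which is an approximation.

```python
import math

# Rounded figures from the comment above, not official census counts.
red_house_pop = 100        # "less than 100", so this is an upper bound
hempstead_pop = 800_000    # "about 800,000"

# With comparable land areas, the density ratio is roughly the population ratio.
ratio = hempstead_pop / red_house_pop
orders_of_magnitude = math.log10(ratio)

print(f"ratio is about {ratio:,.0f}, or {orders_of_magnitude:.1f} orders of magnitude")
```

Because Red House is below 100, the true ratio is somewhat larger than this estimate.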

Cultural difference between Towns and Villages

You argue that Towns are more rural than Villages. That does tend to be true away from the major cities, where villages incorporated so as to be able to provide government services for a settlement that would not be feasible for the entire township. In the suburban counties surrounding New York City, the cultural relation between Town and Village is often inverted. Often, Villages incorporated to preserve their ‘small-town’ character while urbanization proceeded apace around them. Near where I grew up (in the hamlet of Inwood, Town of Hempstead), a number of villages (Woodsburgh, Hewlett Neck, Hewlett Harbor, Hewlett Bay Park, …) are considerably less densely populated than nearby hamlets (Inwood, Woodmere, …) and have correspondingly fewer amenities, are zoned for much larger lots, lack multi-family dwellings (and often have no business districts at all!) and so on. Village of Cedarhurst is much more urban than the other Villages in that area, partly because of history: it was once a hub joining multiple rail lines, and partly because it is landlocked; most of the neighbouring communities have waterfront. Village of Lawrence is something of a mixed bag; it has a business district near the train station, but its southern part is almost indistinguishable from neighbouring Woodsburgh.

Many of the smaller Cities (ranging from Sherrill, which has largely abandoned its city charter in favour of being administered as if it were a Village, to ones like Glen Cove or Cohoes) are hard to distinguish culturally from nearby Villages or Hamlets. Often, the significant difference is largely in the fact that they incorporated before the uniform Village Law was passed in 1911, and therefore there was no way for them to cede any part of their governmental authority to the Towns that they were carved out from. (Villages have some latitude in what services their governments may provide. All depend on the Towns for certain functions.)

The Village that started the whole thing

I have no objection to the reclassification of Woodbridge. I have no personal experience with that particular village. If it has a summer population of outsized importance compared with its size as determined by the count of permanent residents, by all means upgrade it, and thanks for the additional information. I object only to the flat assertion that “it’s legally a village, and that’s that.”

Conclusion: more similarities than differences?

You and I are 100% in agreement that place=* is arbitrary and sweeps too much under the rug. I simply don’t believe that the form of government is more informative - at least not in a way that can be applied uniformly across the state. New York has the diversity of OSM in miniature: what a Town or Village represents, culturally, is far too diverse to reduce to ‘well, the law says it’s a village!’

I suspect that our differences might stem from the possibility that I’ve experienced more diverse places in the state. I was born in Queens; spent most of my childhood in Town of Hempstead, and now live in Town of Niskayuna. My daughter and son-in-law live in Town of Glenville - but self-identify as living in Scotia despite being a block or two outside the Village. (They’re in the neighbourhood south of Mohawk Avenue that is outside the Village boundary, but is the result of Scotia growing beyond its assigned limits. Many of their services are provided by the Village by agreement with the Town.) My brother lives in the quite rural Town of Tusten - which has no live settlement of the same name (its offices are in the Village of Narrowsburg) and intentionally has a place=* node that does not render on the main map. I have been hiking in the Town of Hardenburgh - nearly uninhabited; its population is only a few hundred, it has no central settlement, and its only municipal office is a highway maintenance garage. But I have family in some ‘hamlets’ in Suffolk County that are cities in their own right.

Comment from gdt on 22 May 2025 at 23:01

@ke9tv: Very well said!
