I’ve just computed some statistics on the planet file from the 14th of October. Here they are:
This table shows the number of nodes, both tagged and untagged, that are referenced by ways and relations. You can see that nearly 97% of the 3.5 billion nodes are untagged, and most of these (88%) are part of exactly one way or relation. For example, when you trace a building, you add four untagged nodes that belong only to that closed way.
98.4% of all nodes are part of something, but only 12% (424 million) have two or more parent objects. This could help with designing data storage for nodes.
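For anyone who wants to reproduce this kind of breakdown on a small extract, here is a minimal sketch of the counting. It is not the script I used for the planet: the toy `nodes`, `ways` and `relations` dictionaries and the helper names `count_parents` and `bucket` are made up for illustration, and it assumes a way counts once as a parent of a node even when the node appears in it twice (as the first node of a closed way does).

```python
from collections import Counter

# Toy data (hypothetical): node id -> tags, way id -> node refs,
# relation id -> list of (member type, ref) pairs.
nodes = {1: {}, 2: {}, 3: {}, 4: {}, 5: {"highway": "crossing"},
         6: {}, 7: {"amenity": "cafe"}}
ways = {100: [1, 2, 3, 4, 1],   # a closed way, like a traced building
        101: [4, 5],
        102: [5, 2]}
relations = {200: [("n", 7), ("w", 100)]}


def count_parents(ways, relations):
    """Count how many ways and relations reference each node id."""
    parents = Counter()
    for node_refs in ways.values():
        # Assumption: a way is one parent, however often it repeats a node.
        parents.update(set(node_refs))
    for members in relations.values():
        parents.update({ref for mtype, ref in members if mtype == "n"})
    return parents


def bucket(nodes, parents):
    """Tally nodes by (is tagged, parent count capped at 2 meaning '2 or more')."""
    table = Counter()
    for node_id, tags in nodes.items():
        table[(bool(tags), min(parents.get(node_id, 0), 2))] += 1
    return table


table = bucket(nodes, count_parents(ways, relations))
for (tagged, n_parents), count in sorted(table.items()):
    label = "tagged" if tagged else "untagged"
    print(f"{label:9} parents={'2+' if n_parents == 2 else n_parents}: {count}")
```

On a real planet file the per-node counter would not fit comfortably in a Python dictionary, so something more compact would be needed; the sketch only shows the bucketing logic.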
There are equal numbers of tagged nodes that are not part of anything and tagged nodes that are part of some element. The interesting ones are the 9 million tagged nodes that are part of two or more ways. Taginfo says there are 2.5 million crossings and 860 thousand traffic signals, so those account for about a third of them.
Finally, we have a million untagged nodes that are not part of anything. I wonder when someone will put on their OSM saviour cape and a programmer’s hat and rid us of these.
Discussion
Comment from ff5722 on 18 October 2016 at 22:20
I haven’t bothered to learn Overpass syntax yet, but I found these two scripts:
Combining these should give all nodes that have no tags and are not part of a way…
Comment from SimonPoole on 18 October 2016 at 22:39
The redaction process created a large number of orphaned untagged nodes. A typical example is a road that was redacted away while its nodes were not, because they were created or moved by somebody who accepted the CTs. As a result, the nodes may still carry residual geometry information (in how they are arranged) and should only be removed once that aspect has been checked.
The other source of such nodes is, of course, (broken) imports. Unfortunately, there is no penalty for not cleaning up after you have messed up.
Comment from ImreSamu on 18 October 2016 at 22:57
The Taginfo version: http://taginfo.openstreetmap.org/reports/database_statistics
A few days later (2016-10-18 00:58 UTC):
Comment from Zverik on 19 October 2016 at 08:23
Thanks Imre, I didn’t know Taginfo had those statistics. I did this for the number of references, though.
Simon, thanks for reminding me about the redaction, I forgot how many orphaned nodes it left. Of course, my last remark about removing these was sarcasm: I certainly do not want anybody to do mass deletions.
ff5722, nice scripts, thanks for sharing!
Comment from SK53 on 20 October 2016 at 12:47
Only a million lonely nodes seems quite small by older standards. When Cadastre first came out I was cleaning up a hundred thousand or so at a time. Matt (zere) used to have a duplicated node map too, and duplicates were a big problem, particularly with TIGER, NHD & landuse imports in the US (more or less until ogr2osm fixed most of those issues).
Comment from ianlopez1115 on 21 October 2016 at 08:00
@ff5722, I did a bit of research and some tweaking based on previous examples, and here’s what I was able to come up with: an Overpass query looking for nodes without tags that do not belong to ways or areas, here.