
History of all Tags

Posted by tyr_asd on 31 August 2016 in English. Last updated on 1 September 2016.

TL;DR: head over to http://taghistory.raifer.tech/ for usage graphs of arbitrary OSM tags over time (by number of OSM objects).

In OpenStreetMap, tags define what an object is. Whether it is a mountain, a river, a house, or a postbox: every map feature has its own tag (or set of tags).

OSM doesn’t have a fixed set of object categories. Over time, an increasingly faceted and diverse set of features got mapped in OSM, so the number of different tags grew. At the same time, the tagging of a specific thing sometimes changes: features that used to be mapped with one tag get newer, better and more refined tags. That’s OpenStreetMap evolving.

Of course, OpenStreetMap is also still growing, but not all the tags are getting more widely used at the same pace: For example, while it’s quite possible that most of the world’s railway stations are already mapped in OSM, there are still many juicy pastures left to be mapped out there.

a friendly goat

While there are superb tools for exploring the current state of all tags used in OSM (most notably Taginfo, but also the Overpass API to some extent), until now it was quite difficult to get a good picture of how the data evolved. For example, questions such as when a specific tag first came into use, when an obsolete tag was overtaken by a different one, or which tags have gained traction lately are difficult to answer with OSM’s current tool set.

For some of these questions, people programmed their own solutions, each answering their own question, such as how many kilometres of Italy’s roads were in OSM over time (link), or how many buildings have been mapped in Austria (link). Similarly, the OSM-Analytics platform has recently started to provide such statistics for arbitrary regions for a limited set of map features (currently one can choose between buildings and roads, but there are plans to add more in the near future). What all of those tools have in common is that they can’t handle the full variety of tags that’s so essential in OSM.

To step into the gap between tools like taginfo (where the full variety of OSM’s tags is so beautifully visible – stay tuned for Jochen’s talk at SotM in a couple of weeks!) and the more specialized tools like osm-analytics, I’ve created taghistory, which allows one to get a historical usage graph for each of OSM’s tags (with daily granularity) and to compare different tags against each other:

highway=ford vs. ford=yes

The tool is currently at a very early stage; there are many things left to do and improve. It’s also important to note that the historical usage of a tag is currently defined only as the respective number (count) of OSM objects! As with the statistics produced by taginfo, this metric is subject to some limitations, most notably that one cannot directly compare the number of tags used for different linear and polygonal features such as roads, land cover, etc., because such features are typically divided up into many OSM objects of different sizes. For example, an existing road may be split into two pieces when a new turn restriction is added, so that the count of each of the tags used on the road (even obsolete ones) increases by one in the OSM database. That means that one needs to pay close attention when comparing tags that are typically used on such features, even when comparing subtags that are typically used on the same kind of parent object (e.g. different values of the highway tag).
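To make the splitting effect concrete, here is a minimal sketch in Python. It is not taghistory's actual code: the dictionary layout for ways and the helper names `count_tags` and `split_way` are assumptions made purely for illustration. It counts objects per tag before and after a single road is split in two.

```python
from collections import Counter


def count_tags(objects):
    """Count how many objects carry each key=value tag."""
    counts = Counter()
    for obj in objects:
        for key, value in obj["tags"].items():
            counts[f"{key}={value}"] += 1
    return counts


def split_way(way, at_index, new_id):
    """Split a way at a node index; both halves keep all tags and share the split node."""
    first = {"id": way["id"], "nodes": way["nodes"][: at_index + 1], "tags": dict(way["tags"])}
    second = {"id": new_id, "nodes": way["nodes"][at_index:], "tags": dict(way["tags"])}
    return [first, second]


# Hypothetical road with one current tag and one obsolete tag.
road = {
    "id": 1,
    "nodes": [10, 11, 12, 13],
    "tags": {"highway": "residential", "created_by": "JOSM"},
}

before = count_tags([road])
after = count_tags(split_way(road, at_index=2, new_id=2))

print(before)
print(after)
```

Nothing about the road changed on the ground, yet every tag on it, including the obsolete `created_by`, is now counted twice.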

That being said, have lots of fun while digging into the depths of OSM tags’ history. Here’s the link to the tool again: http://taghistory.raifer.tech/ (and the link to the project’s source code repository and issue tracker: https://github.com/tyrasd/taghistory). What’s your favourite tag? I find the created_by graph quite interesting:

history of the usage of the created_by tag


Discussion

Comment from mvexel on 31 August 2016 at 21:33

Very cool!! Interesting to explore the data that way. It’s fun to try and recreate what happened, for example here:

centerturnlane

My guess:

* People started mapping ways with center_turn_lane=yes
* Someone decided that those tags needed to go away and wrote a bot
* Mappers decided to use it anyway :)

Oh but wait:

center-centre

New guess:

* Someone decided that the correct spelling was centre_turn_lane and wrote a bot to rename the tags
* Mappers decided to use center_turn_lane anyway :)

Comment from tyr_asd on 1 September 2016 at 06:58

Math1985 has more interesting examples on his osm diary page: osm.org/user/Math1985/diary/39404

Comment from Alecs01 on 1 September 2016 at 19:00

Excellent tool, thanks!

Comment from d1g on 10 September 2016 at 14:43

tyr, I had the idea to include Google results for an “amenity=public_building” query

e.g.

Full timeline:

2006-03-24 wiki: amenity=public_building added to map features

2007-10-16 JOSM: amenity=public_building added

2015-11-15 JOSM: office=administrative, office=government added

2016-03-02 wiki: office=administrative and amenity=public_building deprecated

2016-04-01 JOSM: amenity=public_building dropped and deprecation warning added

These should be drawn as vertical lines with numbers, where every number is linked to an external resource so one can see whether there were any mistakes during the discussion.

We (or Math1985) shouldn’t have to fiddle with the wiki or any other resource to see when a tag was added/removed/mentioned for the first time.

We definitely need such a tool.

Comment from d1g on 10 September 2016 at 15:08

For example, “payment:troika” was discussed deep in the public transport thread http://forum.openstreetmap.org/viewtopic.php?pid=596732#p596732

http://taginfo.openstreetmap.org/keys/payment%3Atroika#overview

There are no discussions of it on the tagging list, on the wiki, or anywhere else in OSM.

Comment from d1g on 10 September 2016 at 15:25

… but wait, if you search more, “payment:troika” is used in OsmAnd already https://github.com/osmandapp/OsmAnd-resources/commit/996c7727287ebadbea0919d83be4a2d4fa8adccc#diff-bc091b281dee9cb9288fad5990fe5538

and was discussed in some more minor discussions at other channels

Comment from GRUBERND on 15 September 2016 at 19:04

lovely tool. i guess you are using stats from a database analysis. how about counting the nodes associated to way/polygon objects instead of the objects themselves? this would totally eliminate the statistical jumps through splitting, merging and other operations.

Comment from tyr_asd on 15 September 2016 at 21:11

Funny idea, that could indeed partially improve the issue with split ways. Still, a proper solution would have to track the actual length and/or area of the respective objects.
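As a rough illustration of the three metrics discussed here (object count, node count, and length), the following Python sketch applies each of them to a road before and after a split. The coordinates are invented, the helper names are hypothetical, and the haversine formula on a spherical Earth is an assumed simplification for measuring length.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def way_length_m(coords):
    """Sum of the segment lengths along a way."""
    return sum(haversine_m(p, q) for p, q in zip(coords, coords[1:]))


def metrics(ways):
    """Object count, per-way node count and total length for a list of ways."""
    return {
        "objects": len(ways),
        "nodes": sum(len(w) for w in ways),
        "length_m": sum(way_length_m(w) for w in ways),
    }


# Made-up road with four nodes, then the same road split at its third node.
road = [(48.20, 16.37), (48.21, 16.37), (48.22, 16.38), (48.23, 16.38)]
pieces = [road[:3], road[2:]]

metrics_before = metrics([road])
metrics_after = metrics(pieces)

print(metrics_before)
print(metrics_after)
```

In this toy case the object count doubles, the node count rises only from 4 to 5 because the shared split node is counted once per way, and the total length stays the same.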

Comment from Jojo4u on 30 September 2016 at 12:24

Under which licence do the generated charts stand?

Comment from tyr_asd on 2 October 2016 at 10:05

@Jojo4u: You’re free to do everything you want with the generated charts as long as you comply with ODbL’s minimal requirement for produced works, i.e. citing OSM as the data source. A link back to this blog article and/or the website taghistory.raifer.tech is very much appreciated, though. :)

Comment from joost schouppe on 9 November 2016 at 14:25

Would it be hard to implement permalinking to the charts one makes?

Comment from tyr_asd on 10 November 2016 at 17:14

@Joost, probably not too hard. There’s already a ticket on github for that, where any progress will be documented: https://github.com/tyrasd/taghistory/issues/6

Comment from Polarbear on 16 December 2016 at 21:30

How often is the demo site http://taghistory.raifer.tech/ updated? It seems to be stuck at some time in October or so?

Comment from tyr_asd on 17 December 2016 at 22:48

Sorry, currently there are no updates! :( I’ve been looking into doing updates via Overpass’ augmented diffs, but I’ve run into some issues which need to be resolved upstream before it can work (see links in https://github.com/tyrasd/taghistory/issues/10). The alternative of reprocessing the history dump every week or month is currently also not really an option because of my limited computing resources.

Comment from SafwatHalaby on 24 October 2017 at 14:50

I added this to the quality assurance page: osm.wiki/Quality_assurance#OSM_Tag_History

Comment from SafwatHalaby on 24 October 2017 at 14:52

@tyr_asd have you considered processing daily planet diffs?

Comment from tyr_asd on 25 October 2017 at 09:49

@SafwatHalaby: yes, but regular planet diffs don’t contain all necessary data to keep this kind of data up to date (because they don’t include the tags the modified osm objects had before the diff). Overpass’ augmented diffs could in principle work, but they have other technical issues, see: https://github.com/tyrasd/taghistory/issues/10

Comment from SafwatHalaby on 25 October 2017 at 11:30

I think in theory you can have all the info.

- Start with an initial OSM DB populated from a planet file, starting from where you ceased updating your current data.
- When a new diff arrives, the old state of each node is in your OSM DB, and the new state can be extracted from the diff. These are used to update the statistics.
- Apply the diff to the OSM DB planet file, and now you can repeat the process the next day.

I don’t know how easy or complicated that’d be in practice.
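The steps in this comment can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the "OSM DB" is a plain dictionary from object id to tags, a diff is a list of `(object_id, new_tags)` pairs with `None` marking a deletion, and `apply_diff` is a hypothetical helper, not part of any existing tool.

```python
from collections import Counter


def apply_diff(db, tag_counts, diff):
    """Update tag counts and the local DB in place from one diff.

    db         -- {object_id: {key: value}}, the state before the diff
    tag_counts -- Counter of (key, value) -> number of objects
    diff       -- list of (object_id, new_tags); new_tags is None for a deletion
    """
    for obj_id, new_tags in diff:
        # The old state comes from the local DB, since the diff does not carry it.
        for tag in db.get(obj_id, {}).items():
            tag_counts[tag] -= 1
        if new_tags is None:
            db.pop(obj_id, None)
        else:
            for tag in new_tags.items():
                tag_counts[tag] += 1
            db[obj_id] = dict(new_tags)
    # Drop tags that are no longer used at all.
    for tag in [t for t, c in tag_counts.items() if c == 0]:
        del tag_counts[tag]
    return tag_counts


# Made-up initial state, as if loaded from a planet file.
db = {
    "way/1": {"center_turn_lane": "yes"},
    "node/7": {"amenity": "post_box"},
}
tag_counts = Counter(tag for tags in db.values() for tag in tags.items())

# One day's diff: a retagged way, a deleted node and a newly created node.
diff = [
    ("way/1", {"centre_turn_lane": "yes"}),
    ("node/7", None),
    ("node/8", {"amenity": "post_box"}),
]
apply_diff(db, tag_counts, diff)
print(dict(tag_counts))
```

The sketch also shows why a full copy of the data is needed: without `db`, there is no way to know which counts to decrement when an object is modified or deleted.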

Comment from tyr_asd on 25 October 2017 at 18:40

Yes, sure. But one of my main design goals for this tool was to avoid having to set up and run any kind of database containing the full OSM data, so that’s not really an option for me, unfortunately.

But if anyone out there already runs a (daily) updated OSM DB which could produce deltas of the counts of tags in their db (for little extra processing cost), please contact me – I’d love to use that data for updating the taghistory service.

Comment from dieterdreist on 22 June 2018 at 09:44

Martin, this is such a great tool, I am using it all the time! Thank you very much also for the recent update (because outdated data makes it less useful, obviously). Would it be complicated to add a permalink function? (one single tag / key would already be much better than nothing). It would make it easier to share findings with others, where space is limited (e.g. mailing lists don’t like picture attachments).

Comment from marc__marc on 7 August 2018 at 22:27

@tyr_asd did you have a POC to test your idea of a weekly/daily run?

- What are the memory requirements and CPU usage of the current script?
- With an already minutely-updated OSM DB, what would be the little extra processing cost to produce the deltas of the tag counts you need? Do you already have the script, or does it need to be created?
- I like the idea of using taginfo to avoid the need to parse the same file twice for the same kind of info… but I understand that this may need additional dev time

Comment from Grillo on 12 August 2019 at 22:17

Since mid 2018 this tool doesn’t seem to work properly anymore, as in no exact numbers are given. Any ideas why?

Comment from MalgiK on 6 February 2020 at 17:52

Thanks a lot for adding the perma-link functionality :-)

Comment from mueschel on 17 July 2020 at 17:47

Could we get another update of the database? I like the tool, but current data is already a year old.

For regular updates: You can get the total numbers of each key from Taginfo. The API provides a convenient way to download one table with all ~80k keys currently in use. You can’t query the past history, but it should be perfectly fine for a daily or weekly update of numbers.
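A rough Python sketch of this snapshot approach follows. Several things here are assumptions to verify against the Taginfo API documentation: the endpoint URL, the payload shape `{"data": [{"key": ..., "count_all": ...}]}`, and the hypothetical helper names. The download function is defined but not called, and the sample payloads are invented.

```python
import json
from urllib.request import urlopen

# Assumed endpoint for the table of all keys; check the Taginfo API docs.
TAGINFO_KEYS_URL = "https://taginfo.openstreetmap.org/api/4/keys/all"


def fetch_payload(url=TAGINFO_KEYS_URL):
    """Download the key table as parsed JSON (needs network; not called below)."""
    with urlopen(url) as response:
        return json.load(response)


def key_counts(payload):
    """Extract {key: object count} from the assumed payload shape."""
    return {row["key"]: row["count_all"] for row in payload["data"]}


def record_snapshot(history, day, counts):
    """Append one (day, count) point per key to the running history."""
    for key, count in counts.items():
        history.setdefault(key, []).append((day, count))
    return history


# Two invented daily payloads standing in for real API responses.
day1 = {"data": [{"key": "highway", "count_all": 100}, {"key": "ford", "count_all": 3}]}
day2 = {"data": [{"key": "highway", "count_all": 101}, {"key": "ford", "count_all": 4}]}

history = {}
record_snapshot(history, "2020-07-17", key_counts(day1))
record_snapshot(history, "2020-07-18", key_counts(day2))
print(history["ford"])
```

Such snapshots only extend the time series forward from the day collection starts, and this sketch covers keys only, not individual key=value tags.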

Comment from DaveF on 2 November 2020 at 21:05

Hi Another request to update the database as I’ve found this to be a useful tool. Would it take much time to incorporate Taginfo’s data?

Comment from tyr_asd on 16 November 2020 at 11:19

Taginfo now also features historic development data in its new “chronology” tab (see https://blog.jochentopf.com/2020-11-08-10-years-of-taginfo.html). It’s limited to tag keys and the “most frequent tags”, but I think this should already cover most needs for current tag count statistics (and for the rest one can still use https://api.ohsome.org). PS: Taghistory’s web interface has now been updated to also fetch taginfo’s chronology data if available.

Comment from Matija Nalis on 13 May 2023 at 22:07

So, does that “Taghistory’s web interface is updated now to also fetch taginfo’s chronology data if available” mean that taghistory now too only works for “most frequent tags”?

I.e. https://github.com/tyrasd/taghistory/issues/34

Comment from tyr_asd on 14 May 2023 at 08:50

@Matija Nalis, well, it still shows the history for all tags up to some time in 2018. But I guess that after 5 years the added usefulness of that partial data is quite limited.
