Design of tag transformer (unabbreviator) plugin for JOSM
Posted by JoshD on 8 August 2011 in English
I've thought about making a plugin for JOSM to unabbreviate street suffixes. I've put my thoughts down on a wiki page and mentioned it on the dev@ and josm-dev@ lists, but I thought I'd mention it here too, since I'm asking more for design input than for development advice (but I'll take that as well!).
Read (and edit!) the original wiki page here:
http://wiki.openstreetmap.org/wiki/JOSM/Plugins/TextTransform
These are some notes about the creation of a JOSM plugin which can perform text transformations on tags. A possible name is TextTransformer or TagTransformer.
Use cases
Un-abbreviating
The motivating idea for this plugin is to un-abbreviate street types, such as "St" -> "Street", "Rd" -> "Road". As abbreviations vary from region to region, there needs to be a way to choose between different abbreviation dictionaries and logic (street type at beginning, at end, etc.). By default (or by design) this should not apply to objects that have tiger:reviewed=no.
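A minimal sketch of the "street type at end" logic, written in Python for brevity (a real JOSM plugin would be Java). The dictionary contents, the function name unabbreviate, and the use of a plain dict for an object's tags are all assumptions for illustration, not part of any existing plugin.

```python
# Hypothetical suffix dictionary; a real one would be chosen per region.
US_SUFFIXES = {"St": "Street", "Rd": "Road", "Ave": "Avenue"}


def unabbreviate(tags, suffixes=US_SUFFIXES):
    """Return a copy of tags with a trailing street-type abbreviation expanded."""
    # Skip unreviewed TIGER data, as suggested above.
    if tags.get("tiger:reviewed") == "no":
        return dict(tags)
    name = tags.get("name", "")
    head, sep, last = name.rpartition(" ")
    # Only the last word is treated as a street type, with or without a
    # trailing period; a one-word or missing name is left alone.
    expanded = suffixes.get(last.rstrip("."))
    if not sep or expanded is None:
        return dict(tags)
    return {**tags, "name": f"{head} {expanded}"}
```

Because only the final word is looked up, unabbreviate({"name": "Main St"}) gives {"name": "Main Street"}, while a name that merely starts with "St." is returned unchanged. A region that puts the street type first would need a different function, not just a different dictionary.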
Fixing spelling
Changing key name
Perhaps a key was used incorrectly or changed name.
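As a rough illustration, a key rename could look like the following Python sketch. The function name rename_key and the choice to leave an object alone when the new key already exists are assumptions, not something the wiki page specifies.

```python
def rename_key(tags, old_key, new_key):
    """Return a copy of tags with old_key renamed to new_key."""
    # Leave the tags alone if there is nothing to rename, or if the new
    # key already exists (overwriting it could silently lose data).
    if old_key not in tags or new_key in tags:
        return dict(tags)
    renamed = dict(tags)
    renamed[new_key] = renamed.pop(old_key)
    return renamed
```

For example, rename_key({"adr:street": "Main Street"}, "adr:street", "addr:street") moves the value to the correctly spelled key; the misspelled key here is a made-up example.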
Normalizing phone numbers
This example is obviously just for North America.
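One possible shape for such a normalizer, sketched in Python. The target format +1-XXX-XXX-XXXX and the function name normalize_nanp_phone are assumptions; the post does not say which output format is intended.

```python
import re


def normalize_nanp_phone(value):
    """Return value as +1-XXX-XXX-XXXX, or unchanged if it does not look North American."""
    # An explicit country code other than +1 means this is not our case.
    stripped = value.strip()
    if stripped.startswith("+") and not stripped.startswith("+1"):
        return value
    digits = re.sub(r"\D", "", stripped)
    # Drop a leading country code 1 when 11 digits are present.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    # Anything that is not exactly 10 digits (extensions, short codes)
    # is left for manual review.
    if len(digits) != 10:
        return value
    return f"+1-{digits[:3]}-{digits[3:6]}-{digits[6:]}"
```

Values that do not fit the pattern are returned untouched, which matches the idea that ambiguous cases should be reviewed individually instead of changed en masse.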
Implementation
The GUI mockups above give an indication of what the dialog would look like. The plugin will only consider selected objects, so we can take advantage of all the powers of JOSM's built-in search tool. "Transformers" will themselves be plugin-like, or at the very least implement a common base class. A simple signature might be OsmObject transform(OsmObject input), as it could then perhaps automatically filter by region, or ignore objects with a certain ID number, last author, version, etc.
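The base-class idea might be sketched as follows, in Python for brevity (a real JOSM plugin would be Java). OsmObject here is a hypothetical stand-in dataclass, not JOSM's own class, and TrimValues and preview are made-up names used only to show how a transformer that sees the whole object can filter on its ID and how a dialog could collect changes for review.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class OsmObject:
    """Hypothetical stand-in for a JOSM primitive; not JOSM's real class."""
    id: int
    version: int = 1
    tags: dict = field(default_factory=dict)


class Transformer(ABC):
    """Common base class that every transformer would implement."""

    @abstractmethod
    def transform(self, obj: OsmObject) -> OsmObject:
        """Return a transformed copy of obj, or obj itself if nothing applies."""


class TrimValues(Transformer):
    """Example transformer: strips stray whitespace from every tag value."""

    def __init__(self, ignored_ids=()):
        self.ignored_ids = set(ignored_ids)

    def transform(self, obj):
        # Taking the whole object (not just a string) allows filtering by ID.
        if obj.id in self.ignored_ids:
            return obj
        return replace(obj, tags={k: v.strip() for k, v in obj.tags.items()})


def preview(transformer, selection):
    """Return (before, after) pairs for the selected objects that would change."""
    pairs = []
    for obj in selection:
        after = transformer.transform(obj)
        if after.tags != obj.tags:
            pairs.append((obj, after))
    return pairs
```

The preview function only computes proposed changes; nothing is applied until the user approves the pairs, which fits the intent that changes are reviewed before being made.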
Discussion
Comment from Chaos99 on 8 August 2011 at 06:17
Hi,
I just can't see how your tool will know what is the right name for a street. At least here (Germany) we tag by the rule 'What it says on the street sign', which may be any form of abbreviated, non-abbreviated or even mixed forms of street names. So if a street is called Willi-Brand-Str. this may as well be the official name. Expanding this to Willi-Brand-Strasse would be wrong. To know what's right you either need to look at the street sign or compare to some kind of official import data.
I do see a use for the tool with the phone number as target. This is a pure formatting issue without impact on content. But those transformations are already done by bots, not by JOSM plugins. So all tags get changed, not just the ones generated by JOSM.
Just my two cents....
Comment from z-dude on 8 August 2011 at 08:29
Also, expanding it to be "Willi-Brand-Street" would be wrong as well.
Also, from the documentation wiki "St. could be Street or Saint" so St. Elmo's Fire would be renamed "Street Elmo's Fire" if you automate things too much.
osm.wiki/Key:name
Comment from JoshD on 8 August 2011 at 10:40
I wasn't planning on getting into the politics or technical details of un-abbreviating, as there is quite a bit to it, especially if you keep up with the mailing lists. Conventions and exact implementation will vary by country, which is why in my example I made it clear that each region (country, state, city, etc.) will need their own un-abbreviating script/dictionary.
For the US, I would likely use the logic from balrog-kun's script, using the USPS abbreviation list. For ambiguous cases I can mark them such that they can't be changed en masse, but rather must be individually reviewed (selected).
And keep in mind that the design of this tool is such that changes are reviewed before being made. Also this is meant for the average user to run over a small region that they're working on, not a bot that requires technical expertise and is run over entire countries or the world.
Comment from Chaos99 on 8 August 2011 at 10:47
So it's more of a 'search-and-replace' function for JOSM? OK, then I see where you want to go. Of course experienced users could do this with a regex, but for the novice mapper a GUI could be helpful.
I haven't used them, but aren't there already plugins/functions to, let's say, add a source tag to all ways which are also tagged as building=yes? I remember reading such things on the JOSM mailing list ...
Comment from JoshD on 8 August 2011 at 10:58
Chaos99: Yes, this would basically be a find-and-replace function for JOSM, with preview. It would be a little more than that, because it would allow you to remove a tag and add a new one (e.g. changing a key name).
I did just discover that the CommandLine plugin for JOSM provides a regular expression search and replace; however, I think it's worth having a dedicated dialog to make the functionality available to an average user.
Comment from Pink Duck on 8 August 2011 at 12:08
Chaos99: While it is fine to map what is shown on the street sign, it is less useful for database search engines to have to automatically expand/guess abbreviations in order to match search terms with a canonical form. It is better for the full form to be provided by the mapper somewhere. While the signs are sometimes abbreviated due to physical space limitations, the OSM database has no such issue. With full street name information, renderers can selectively abbreviate via an abbreviation lookup table where necessary.
Comment from JoshD on 8 August 2011 at 12:16
Pink Duck, Chaos99: If you wish to discuss/debate abbreviations more thoroughly, consider reading and responding to the recent thread on the talk@ list with the subject "shortened names", which you can read here on Nabble, all 94 messages (!).
I hope any further comments will consider the design of a general tag transformer, rather than focusing on one specific transformation (such as un-abbreviation).
Comment from maxolasersquad on 8 August 2011 at 12:34
I would definitely find such a tool very useful. I've been going along my city and renaming the streets as appropriate. If I had a tool that would list out each way with a highway key and a suggested new name that I could just mass approve, that would be exceptionally helpful for me.
Comment from chriscf on 9 August 2011 at 14:20
It would be useful to be able to transform the key as well as the value. This would allow things such as "if 'name' matches /^[A-Z]\d+$/, change key from 'name' to 'ref'".
In general, the consensus is that abbreviations are generally a bad idea. The one exception is with directions in the US, which often appear twice in an address.
Comment from chriscf on 9 August 2011 at 14:21
Actually, disregard that - I've now spotted an example that didn't load in my browser the first time around.
Comment from marscot on 9 August 2011 at 22:09
Ave to Avenue, prk to Park. It would also be good to find things I have put in and forgot to tag the source for, but I can't find them now.