Summary: What if AI created the changeset comments? We could send locations, tag types, and quantities to a model and get a comment back. The AI would have to run locally with small models to keep costs down, and its output would have to be validated by the user.
Problem 1: Time. Assume that 1,000 users each create 2 changesets per day, and that each changeset comment takes 3.5 seconds to write. 1,000 users × 2 changes × 3.5 seconds per change = 7,000 seconds ≈ 1.9 hours. Under these assumptions, OSM users collectively spend about 1.9 hours per day writing changeset comments.
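A quick back-of-the-envelope check of that arithmetic (the user count, edit rate, and seconds-per-comment figures are all the assumptions stated above):

```python
# Back-of-the-envelope time estimate (all inputs are assumptions from above).
users = 1_000              # assumed active users per day
changes_per_user = 2       # assumed changesets per user per day
seconds_per_comment = 3.5  # assumed time to write one changeset comment

total_seconds = users * changes_per_user * seconds_per_comment
print(total_seconds)         # 7000.0 seconds
print(total_seconds / 3600)  # ~1.94 hours per day, collectively
```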
Problem 2: Skill Outsourcing Users should spend time on the things AI can’t do.
Problem 3: Server-Side Peer Review. We already have human-generated changeset comments. We could also create AI-generated changeset comments for the same edits and then ask the AI, “Are these 2 changeset comments so different that the edit looks malicious?” (A rough sketch of this check follows.)
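A minimal sketch of what that server-side comparison might look like, assuming a plain string-similarity measure and an arbitrary threshold; a real system would likely need something semantic rather than character-level:

```python
from difflib import SequenceMatcher

def looks_suspicious(human_comment: str, ai_comment: str,
                     threshold: float = 0.3) -> bool:
    """Flag a changeset when the human-written comment and an
    AI-generated comment describe very different things.

    SequenceMatcher returns a 0..1 similarity ratio; the 0.3 cutoff
    is an arbitrary placeholder, not a tuned value.
    """
    similarity = SequenceMatcher(
        None, human_comment.lower(), ai_comment.lower()
    ).ratio()
    return similarity < threshold

# The two comments below broadly agree, so this prints False.
print(looks_suspicious(
    "Added sidewalks and crossings in Mappleville",
    "Added sidewalks, marked crossings, and religious areas "
    "in Mappleville and Bobville, MN."))
```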
General AI Inputs: 1. Location: Where did the user map? 2. Feature Types: What tags did the user use?
AI Prompt: “You are an AI system. A user made edits in OpenStreetMap, a collaborative mapping project. They mapped locations[Mappleville, MN, USA; Bobville, MN, USA] with tags[50xSidewalks, 20xMarkedCrossings, & 10xReligious Areas]. You will create a changeset comment that concisely tells human reviewers what this changeset was about in 3 sentences or less. Exact numbers are not important. Changesets describe changes, so don’t request anything. Don’t mention anything that is common across all changesets.”
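One way an editor could assemble that prompt from the general inputs above; the function and its parameters are an illustrative sketch, not any existing editor API:

```python
def build_prompt(locations: list[str], tag_counts: dict[str, int]) -> str:
    """Fill the prompt template above with a changeset's locations
    and tag usage counts. Names and structure are hypothetical."""
    loc_part = "; ".join(locations)
    tag_part = ", ".join(f"{n}x{tag}" for tag, n in tag_counts.items())
    return (
        "You are an AI system. A user made edits in OpenStreetMap, "
        "a collaborative mapping project. "
        f"They mapped locations[{loc_part}] with tags[{tag_part}]. "
        "You will create a changeset comment that concisely tells human "
        "reviewers what this changeset was about in 3 sentences or less. "
        "Exact numbers are not important. Changesets describe changes, "
        "so don't request anything. Don't mention anything that is "
        "common across all changesets."
    )

print(build_prompt(
    ["Mappleville, MN, USA", "Bobville, MN, USA"],
    {"Sidewalks": 50, "MarkedCrossings": 20, "Religious Areas": 10}))
```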
AI Response (https://www.meta.ai/): “Added sidewalks, marked crossings, and religious areas in Mappleville and Bobville, MN. Improved pedestrian and accessibility mapping. Enhanced local community information.”
Specific AI Inputs for Locations: 1. Cities[1 to 5], States[1 to 5], Countries[1 to 5]. 2. Is this a place with unclear boundaries? (What if somebody maps the ocean?) 3. What is the size of the bounding box for this edit in km?
Specific AI Inputs for Feature Types: Tags[1 to 6] & corresponding Quantities
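Taken together, the location and feature-type inputs might be bundled into a single structure like the following; the dataclass and its field names are assumptions, not an existing schema:

```python
from dataclasses import dataclass, field

@dataclass
class ChangesetSummaryInput:
    """Hypothetical bundle of the AI inputs listed above."""
    cities: list[str] = field(default_factory=list)       # 1 to 5
    states: list[str] = field(default_factory=list)       # 1 to 5
    countries: list[str] = field(default_factory=list)    # 1 to 5
    unclear_boundaries: bool = False   # e.g. edits over the ocean
    bbox_size_km: float = 0.0          # size of the edit's bounding box
    tag_quantities: dict[str, int] = field(default_factory=dict)  # 1 to 6 tags
```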
Algorithms: 1. Sort the changeset’s tags by how frequently each was used, in descending order, with a limit of 5. 2. For each city, how often was each tag used? Create a table unless the table is huge. (A sketch of both steps follows.)
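A minimal sketch of both steps using the standard library; the input format and the cutoff for what counts as a “huge” table are assumptions:

```python
from collections import Counter

def top_tags(tag_counts: dict[str, int], limit: int = 5) -> list[tuple[str, int]]:
    """Step 1: sort tags by usage, descending, keeping at most `limit`."""
    return Counter(tag_counts).most_common(limit)

def tags_per_city(edits: list[tuple[str, str]], max_cells: int = 50):
    """Step 2: count tag usage per city from (city, tag) pairs.
    Returns None when the table would be 'huge' (cutoff is arbitrary)."""
    table: dict[str, Counter] = {}
    for city, tag in edits:
        table.setdefault(city, Counter())[tag] += 1
    cells = sum(len(counts) for counts in table.values())
    return table if cells <= max_cells else None

# [('Sidewalks', 50), ('MarkedCrossings', 20), ('Religious Areas', 10)]
print(top_tags({"Sidewalks": 50, "MarkedCrossings": 20, "Religious Areas": 10}))
```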
Complexities of the process: 1. Disputed Boundaries: What if this was the changeset that changed the border? 2. Large Edits: Do not run this over changesets larger than 500 edits (see the guard sketch after this list). 3. Malicious Inputs: Somebody names a building tag after a war crime, and the AI receives that as an input. What does the AI say? 4. Resource Allocation: Developer time could be better spent doing something else. 5. Irregular Edits: “I will use every tag in OSM only once. I will map an area the size of a continent.”
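For complexities 2 and 5, a simple guard before the AI is ever invoked might look like this; the 500-edit cutoff comes from the list above, while the bounding-box limit is an assumed placeholder:

```python
MAX_EDITS = 500          # from the list above: skip very large changesets
MAX_BBOX_KM = 1_000.0    # assumed cutoff for continent-sized edits

def should_generate_comment(edit_count: int, bbox_size_km: float) -> bool:
    """Skip AI comment generation for oversized or irregular changesets."""
    if edit_count > MAX_EDITS:
        return False
    if bbox_size_km > MAX_BBOX_KM:
        return False
    return True

print(should_generate_comment(42, 12.0))     # True: a normal edit
print(should_generate_comment(10_000, 5.0))  # False: too many edits
```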
Complexities of AI in general: 1. Uncommon Languages: Are these models only good at the 5 biggest languages? 2. Edit Safety: The user mapped religious areas in 2 different nations that share a disputed border and are at war. 3. Money: Laptops with TPUs are not common in 2024 (but will be in 2030). Mobile editors with TPUs are not common in 2024 (but will be on high-end phones in 2030). Running AI costs money. Who will pay for it?
Solutions: 1. AI runs locally on a TPU. 2. If you use the outputs of an AI for changeset comments, you are responsible for safety.
Disclaimers: 1. I don’t work in AI. 2. I describe what I don’t have the resources to build. 3. I assume that developer resources should focus on high priority tasks.
Expected Development Difficulty: 1. Web to TPU is hard: Graphics have standard libraries (OpenGL). AI TPUs are not common and don’t have standard libraries. 2. This can create giant tables if you are not careful.
The benefits of manual changesets: 1. Spam is harder to create in bulk. 2. Self-reflection is encouraged. 3. Individuality is good to see. 4. Changeset comments are the alternative to Change Approval Board (CAB) meetings; they are supposed to take effort.
TLDR: OpenStreetMap (OSM) edits could be aided by AI-generated changeset comments, potentially saving users a collective 1.9 hours daily. AI could analyze edit locations and feature types to generate concise comments, freeing users to focus on tasks that require human expertise. However, implementing AI-generated comments requires addressing complexities like disputed boundaries, TPU libraries, and malicious inputs.
Discussion
Comment from SomeoneElse on 26 April 2024 at 17:02
We already have factual “this is just what was added” comments from things such as StreetComplete. That’s not so bad, in the context of a StreetComplete changeset, where it’s obvious that someone is answering questions on their phone (because that’s what StreetComplete is).
That would be less useful to generalise to other changesets, because it’s missing the “why”. A completely random sample shows that people do put a fair bit of description into changeset comments that simply couldn’t be determined “by AI”, like here.
Comment from FargoColdYa on 26 April 2024 at 20:53
Hello @SomeoneElse. Thank you for the advice. These are great examples.
Comment from H@mlet on 29 April 2024 at 14:21
Hi.
I definitely don’t put into changeset comments information that can easily be found in the changeset itself, such as location, type of object modified / added / deleted…
So AI generated changeset comments sounds like a bad idea.
But AI generated changeset description, as an additional functionality (new tag in the changeset, or just displayed) might be nice.
Sometimes (especially on mobile) it’s hard to find out what the changeset is about by looking at it, so a few sentences description might be useful.
It could be a feature in OsmCha for example, or on the OSM.org website at some point.
Regards.
Comment from spughetti on 13 July 2024 at 16:18
Adding to H@mlet’s comment. Mentioning the location of the changeset in the changeset comment is unnecessary; the bbox already shows the location of the changeset in a way that’s far easier to understand than text. The objects that were modified/added/deleted are important IMO, as it can take a while to figure that out from the changeset data itself.
As for the results.
“Added sidewalks, marked crossings, and religious areas in Mappleville and Bobville, MN. Improved pedestrian and accessibility mapping. Enhanced local community information.”
The last 2 sentences in this comment are unnecessary fluff and were probably just added to meet the 3-sentence mark you specified. I’d recommend a prompt that specifies to keep the changeset comment as short as possible without leaving out any crucial information.
I’m personally not a big fan of AI, especially its integration into OSM, so I prefer a system that will recreate the same results each time. But this is a nice proof of concept nonetheless.