A minute of facts about the duration of changesets
Posted by TrickyFoxy on 31 January 2024 in English
Disclaimer: I used changesets through August 2023 to calculate the statistics.
- 84% of changesets closed within a minute
- 99.6% closed within two hours
Diagram of the distribution of changeset durations (count/duration in seconds):
The upper part is in the form of a table:
- only 1201 were open for more than 12 hours
- from 2020 to August 2023, only 53 were open for more than 12 hours (of which 39 were made by wheelmap_visitor, 2 by StreetComplete, and the rest by JOSM).
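For readers who want to reproduce figures like these, here is a minimal sketch of the calculation, assuming each changeset has been reduced to its creation and closing timestamps. The function name `share_closed_within` and the three sample rows are made up for illustration; they are not the data behind the numbers above, and whether "within a minute" includes exactly 60 seconds is a choice made here.

```python
from datetime import datetime, timedelta

def share_closed_within(changesets, limit):
    """Return the fraction of changesets whose duration is at most `limit`."""
    if not changesets:
        return 0.0
    within = sum(1 for created, closed in changesets if closed - created <= limit)
    return within / len(changesets)

# Hypothetical sample of (created_at, closed_at) pairs, not real OSM data.
sample = [
    (datetime(2023, 8, 1, 12, 0, 0), datetime(2023, 8, 1, 12, 0, 5)),   # 5 seconds
    (datetime(2023, 8, 1, 12, 0, 0), datetime(2023, 8, 1, 12, 30, 0)),  # 30 minutes
    (datetime(2023, 8, 1, 12, 0, 0), datetime(2023, 8, 2, 11, 0, 0)),   # 23 hours
]

print(share_closed_within(sample, timedelta(minutes=1)))
print(share_closed_within(sample, timedelta(hours=2)))
```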
Warning, question:
- Are we sure we want to spend the whole day monitoring what the user does in their changeset?
Discussion
Comment from Xvtn on 31 January 2024 at 16:51
I had no idea until now that changeset creation time was even recorded. Interesting!
Comment from Pieter Vander Vennet on 1 February 2024 at 01:07
A changeset is opened by sending some metadata to the server (e.g. contributor ID, editor, comment, …).
Then, one or more ‘changeset XML’-files can be uploaded.
Then, the changeset is closed if:
https://mapcomplete.org (which I develop) exploits this behaviour by not closing a changeset and by trying to reuse a changeset as much as possible.
Comment from kmpoppe on 1 February 2024 at 05:38
Hi, and thank you for taking the time to analyze this!
I have seen that table before, haven’t I? On GitHub or on Discourse?
Anyway:
Granted, I don’t know where to find the code exactly, but I guess there’s not much “monitoring” involved. You’ll probably see a process that checks every N seconds whether there are changesets that match Pieter’s description of points 2 and 3 (either 1 hour since the last upload or 24 hours since creation) and then shuts those changesets down. I doubt that what happens before those times is monitored in any shape, form or fashion.
StreetComplete does the same: it creates its own little database of “OpenChangesets”, grouped by the “ElementEditType” (i.e. the “Task” or “Question” that the user was asked), and updates the changesets with changes that fit the same edit type, as long as the changeset isn’t older than 20 minutes. After that time it closes its own changesets automatically.
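A minimal sketch of that reuse strategy, based only on the description above and not on StreetComplete's actual source: one open changeset is remembered per edit type and reused until it is 20 minutes old. The names `OpenChangesetCache`, `open_changeset` and `close_changeset` are hypothetical stand-ins for whatever requests a real client would make, and closing the stale changeset lazily on the next request is a simplification chosen here.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(minutes=20)  # age limit taken from the description above

class OpenChangesetCache:
    """Remembers one open changeset per edit type and reuses it while fresh."""

    def __init__(self, open_changeset, close_changeset):
        self._open = open_changeset    # hypothetical callable: edit_type -> id
        self._close = close_changeset  # hypothetical callable: id -> None
        self._by_type = {}             # edit_type -> (changeset_id, opened_at)

    def changeset_for(self, edit_type, now):
        entry = self._by_type.get(edit_type)
        if entry is not None:
            changeset_id, opened_at = entry
            if now - opened_at < MAX_AGE:
                return changeset_id    # still fresh: reuse it
            self._close(changeset_id)  # too old: close it before opening a new one
        changeset_id = self._open(edit_type)
        self._by_type[edit_type] = (changeset_id, now)
        return changeset_id

# Usage with fake callables standing in for real API requests.
ids = iter(range(1, 100))
closed = []
cache = OpenChangesetCache(lambda edit_type: next(ids), closed.append)
t0 = datetime(2024, 2, 1, 12, 0)
first = cache.changeset_for("example_quest", t0)
again = cache.changeset_for("example_quest", t0 + timedelta(minutes=5))
later = cache.changeset_for("example_quest", t0 + timedelta(minutes=25))
print(first, again, later, closed)
```

In this run the second request reuses the first changeset, while the third request finds it older than 20 minutes, closes it and opens a new one.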
Comment from TrickyFoxy on 1 February 2024 at 08:56
I forgot to write about one unpleasant feature of open changesets: while they are open, you can’t comment on them. I wouldn’t risk rolling them back either. You can find a full story about the problems of changesets here https://youtu.be/aRcHLKbXlcM
I published it only in the Telegram chats: https://t.me/OpenStreetMapDev/7668 https://t.me/ruosm/792186
The problem is that you already suggest that open changesets should be handled in a special way. This already sounds weird.
As you can see, the changesets at around 3600 seconds are the ones that were closed automatically. They come either from StreetComplete users who made no further edits as part of the quest, or from broken connections.
@NorthCrab, in its implementation of API 0.7, offers one interesting thing: sending changes with a single HTTP request. Take a look at the current API and realize how overcomplicated it is :(
Open changesets in their current form are a strange and inconvenient thing. But I think they can be made better.
Comment from Pieter Vander Vennet on 1 February 2024 at 11:01
Tbf, for some use cases (e.g. the *Complete editors) it makes it possible to avoid many changesets.
Comment from mmd on 1 February 2024 at 11:15
Five years ago we already discussed adding an optional “close_changeset=true” attribute to the osmChange header. This would, as the name says, close the changeset as part of the upload, without the need to send an additional changeset close message. Unlike the proposed API 0.7 changes, it wouldn’t introduce an incompatible change, since it’s an optional attribute only.
Link: https://github.com/openstreetmap/openstreetmap-website/issues/2201
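As a rough illustration of that proposal, the sketch below builds an osmChange document carrying the optional attribute. Note that `close_changeset` is only the attribute suggested in the linked issue, not something the current API is known to accept; the helper name `build_osm_change`, the `generator` value and the node values are invented for the example.

```python
import xml.etree.ElementTree as ET

def build_osm_change(changeset_id, close_changeset=False):
    """Build a tiny osmChange document that creates one node."""
    root = ET.Element("osmChange", version="0.6", generator="example-sketch")
    if close_changeset:
        # Proposed optional attribute from the linked issue (hypothetical).
        root.set("close_changeset", "true")
    create = ET.SubElement(root, "create")
    # Placeholder id -1 for a newly created node; the coordinates are made up.
    ET.SubElement(create, "node", id="-1", lat="0.0", lon="0.0",
                  changeset=str(changeset_id))
    return ET.tostring(root, encoding="unicode")

print(build_osm_change(12345, close_changeset=True))
```

Because the attribute is optional, the same helper without `close_changeset=True` produces an upload of the kind clients send today, which is the backwards-compatibility point made above.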
Comment from Andy Allan on 1 February 2024 at 14:17
It’s much simpler than that - there’s no extra monitoring process involved. Whenever something happens to the changeset (e.g. open, diff upload, individual element update, etc), its closed_at attribute is updated.
https://github.com/openstreetmap/openstreetmap-website/blob/e83f0bd13121ab520c68d3a49a3f0f59a1266cd2/app/models/changeset.rb#L186-L198
Then the next time you try to do something (e.g. another diff upload) the code just checks if the changeset closed_at has already passed - if so, the changeset is closed, if not, the closed_at is updated again, etc. The “close changeset” method just checks if the changeset is still open, and if so, sets the closed_at to right now.
https://github.com/openstreetmap/openstreetmap-website/blob/e83f0bd13121ab520c68d3a49a3f0f59a1266cd2/app/models/changeset.rb#L69-L76
So there are no moving parts within the codebase, no ‘watch’ process and not even an extra update to the db to close each changeset. It’s a clever design (and not something I was involved with!).
I think the more important bit is the side effects on other systems, for example changeset comments, or 3rd-party analysis tools, that might be waiting for a changeset to close before triggering an alert etc. There’s a case to be explored if 24 hours is too high an upper bound for changesets to be kept open (of course, a changeset also needs activity every 60 minutes for every one of those 24 hours, since the changeset closed_at is only extended 60 minutes at a time - so the default is to keep it open for 1 hour (reasonable?) with an upper limit of 24 hours (debatable?)).
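The mechanism described in this comment can be modelled in a few lines. This is a simplified Python sketch of the idea, not a port of the Ruby code linked above: the class and method names are invented, and the 1-hour extension and 24-hour upper limit are the values mentioned in this thread.

```python
from datetime import datetime, timedelta, timezone

IDLE_TIMEOUT = timedelta(hours=1)    # closed_at is pushed this far ahead
MAX_OPEN_TIME = timedelta(hours=24)  # upper bound counted from creation

class ChangesetSketch:
    """Toy model: "open" just means closed_at is still in the future."""

    def __init__(self, now):
        self.created_at = now
        self.closed_at = now + IDLE_TIMEOUT

    def is_open(self, now):
        # A pure comparison: nothing is written when the deadline passes.
        return self.closed_at > now

    def touch(self, now):
        """Called on activity (e.g. a diff upload); extends the deadline."""
        if not self.is_open(now):
            raise RuntimeError("changeset is already closed")
        self.closed_at = min(now + IDLE_TIMEOUT, self.created_at + MAX_OPEN_TIME)

    def close(self, now):
        """Explicit close: move the deadline to the current time."""
        if self.is_open(now):
            self.closed_at = now

start = datetime(2024, 2, 1, 12, 0, tzinfo=timezone.utc)
cs = ChangesetSketch(start)
cs.touch(start + timedelta(minutes=30))           # deadline moves to 13:30
print(cs.is_open(start + timedelta(minutes=80)))  # 13:20, before the deadline
print(cs.is_open(start + timedelta(minutes=95)))  # 13:35, after the deadline
```

Nothing in the sketch runs in the background: whether a changeset counts as open is decided only at the moment someone asks.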
Comment from kmpoppe on 1 February 2024 at 14:51
So, if the opening client hasn’t closed the changeset and it would be closeable (1h, 24h), even loading the CS on the website or via the API wouldn’t trigger the closing, but only trying to upload data into the CS again?
Comment from Andy Allan on 1 February 2024 at 15:32
Hmm, not quite.
Remember that all changesets - open or closed - have a closed_at date, it’s just that initially it’s one hour in the future (you can think of it more like “will_be_closed_at”) and often that time has passed already (so more like “was_closed_at”) and the only difference is whether that timestamp is before or after Time.now.utc. There are no updates to the database when a changeset automatically closes, the “will_be_closed_at” timestamp was already saved in the database, either during changeset open or during the last successful update.
The only ways to close a changeset are to a) wait for the closed_at timestamp to pass or b) update the closed_at timestamp to be Time.now.utc by calling the changeset/close API method - which is just an express version of a) for the impatient!
It’s one of these parts of the API where the mental model of a changeset (two states, open vs closed, and various actions like ‘close’ and ‘automatically close’) and the actual code implementation (a predetermined closed_at time, which can be in the future, and can be updated in certain limited circumstances) are quite different. The mental model is useful for mappers and there’s nothing wrong with it, but when you look at the code / database it’s quite different.
Comment from kmpoppe on 1 February 2024 at 15:45
NOW I got you! Thanks for clearing that up.