Peer reviewed paper on gender differences in OSM editing now available online

Interesting read, in particular your analysis and discussion of the literature in the beginning and in the end. I am however somewhat underwhelmed by the analysis of the results of the survey. The idea of connecting information from a survey with analysis of the editing activity of the participants is in principle a very promising approach but i think you severely underuse the possibilities of it. In particular i think although you say you were mindful of the inherent limitations of surveying in this context you do not sufficiently reflect on how the lack of unbiased representation in the results of your survey affects your results. For example Figure 1 and 2 indicate a significant difference in average seniority of mappers who participated in the survey between men and women. This could be in itself an interesting point of analysis and discussion (possibly indicating either most women mappers became only interested in the project more recently or that female mappers have a significantly lower retention rate - that is stay active as mappers for a shorter time on average). Since it is fairly trivial that long time mappers of all genders will have on average different work patterns than newcomers it is not unlikely that the differences you observe do not actually have any relationship to the mapper being a man or a woman but can fully be explained by systematically different seniority of the male and female mappers who have participated in your survey.

Along the same line - you mention that you inquired about 5 demographic indicators (gender, age, educational background, location and country of origin) - but you do not seem to discuss the results of this (either in terms of differences between male and female participants or the representation of overall demographics in OSM).

I appreciate that you are often critical of bias and preconceptions you can find in the literature but i miss a critical reception of the widespread misunderstanding of the free form tagging system in OSM - in particular in Stephens (2013) - which you cite a lot. This is a scheme that runs through the vast majority of the scientific literature on OSM. The free form tagging scheme and the principle of any tags you like is one of the core principles and unique features of OSM and - as much as it is hated and despised by some people - has been central in making OSM what it is today. Discussing either data quality or social dynamics in the project without appreciation of this is always at least significantly incomplete.

Somewhat related to this your analysis of the numbers of different feature types does not seem to make the necessary differentiation between tagged nodes and untagged nodes. An untagged node in OSM is essentially a coordinate pair. Editing an untagged node is equivalent to making a geometry modification (of any kind of geometry, no matter if node, linear way, polygons or other types of relations). Without making this differentiation counting numbers of nodes edited is of relatively little use.
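The differentiation is straightforward to make when counting. A minimal sketch, assuming the edited data is available as OSM XML; the sample document and the function name `count_nodes_by_tagging` are made up for illustration and are not taken from the paper:

```python
import xml.etree.ElementTree as ET

# Made-up sample in OSM XML: two untagged nodes (pure geometry)
# and one tagged node (a feature in its own right).
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="73.40" lon="-23.10"/>
  <node id="2" lat="73.41" lon="-23.09"/>
  <node id="3" lat="73.42" lon="-23.08">
    <tag k="natural" v="peak"/>
  </node>
</osm>"""


def count_nodes_by_tagging(osm_xml):
    """Return (tagged, untagged) node counts for an OSM XML string."""
    root = ET.fromstring(osm_xml)
    tagged = untagged = 0
    for node in root.iter("node"):
        # A node with at least one <tag> child is a mapped feature;
        # a node without any is just a coordinate pair.
        if node.find("tag") is not None:
            tagged += 1
        else:
            untagged += 1
    return tagged, untagged


tagged, untagged = count_nodes_by_tagging(SAMPLE_OSM)
print(f"tagged nodes: {tagged}, untagged nodes: {untagged}")
```

Reporting the two counts separately would let edits to untagged nodes be treated as geometry modifications rather than as edits to point features.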

One other very useful addition to your methodological approach in general that comes to mind is that you could also analyze the whole mapper community (i.e. those who have edited in OSM but have not responded) as a third reference group. You could even differentiate this by indicators like seniority or region of main activity and thereby try to assess how much these factors affect editing patterns independent of possible distortions by selective response to your survey.

What would also be interesting to look at in the context of how mapper demographics affect the content of the database is how organized mapping projects fare in that regard (which meanwhile represent a quite significant fraction of editing activities in OSM). As mentioned here organized mapping projects usually instruct mappers collectively what to map and how to map it. In case of commercial organized mapping this is obviously largely determined by business interests. In the context of organized humanitarian mapping a frequent point of criticism is that such efforts project a European/American mindset about geography and cartography on other parts of the world. It would be interesting to analyze if organized efforts in a similar fashion project either a male or female perspective and priorities on mapping.

Import of Southeast Alaska Hydrology

There are good reasons why NHD coastal hydrography data has not been imported in a wider fashion into OSM in the past. Many of the feature classes of NHD have no equivalent in the OSM tagging system and therefore cannot be imported without a case-by-case review and new classification. In coastal areas this in particular applies to feature codes 36400 and 53700 but beyond that of course also feature codes 46000 to 46007 where importing as waterway=stream is very often wrong. osm.org/way/697889834 for example is not a waterway=stream, it is a waterway=river (hint: if it can be seen on satellite imagery in a densely forested area it is always a river).

Never mapping walking dunes

Not mapping something because it is not constant over time is not an idea i would subscribe to. Based on this principle much of the existing mapping of tidal flats, rivers or crevasses on glaciers as well as a significant percentage of the world coastline would need to be removed. Even for rapidly changing structures the pattern of these structures is often a characteristic property of the local geography and it often is rather constant (like the area and direction in which crevasses form on a glacier or the direction and spacing between dunes).

On the utility of Sentinel-2 satellite images

As said - you can specify a specific strip timestamp or you can specify a date range (like 2019-05-01T00:00:00Z/2019-05-15T00:00:00Z). You can use the code-de browser or the ESA Open Access Hub to identify the image you want. And you can use the WMS in JOSM then. How useful this is for mapping depends on the location and what you specifically look for.

On the utility of Sentinel-2 satellite images

The nodata color of this WMS is white and there obviously is no image around Nottingham with that exact time stamp. There is a good, mostly cloud free image covering Nottingham at 2019-04-20T11:21:19.024Z - but as said the standard tone mapping does not really give you anywhere near the full potential.

On the utility of Sentinel-2 satellite images

General agreement on using open data imagery more extensively for mapping - either exclusively or as supplement to higher resolution data for recognizing recent changes.

But i kind of disapprove of advertising commercial services here that lure in users with free offers and then, once people have gotten used to them, start demanding money.

If you want Sentinel-2 imagery processed ready for use code-de offers free full resolution WMS services where you can select individual images or date ranges. For the most recent image for the area you look at you can in JOSM use for example:

wms:https://geoservice.code-de.org/Sentinel2/wms?SERVICE=WMS&VERSION=1.1.1&REQUEST=GetMap&FORMAT=image/jpeg&TRANSPARENT=true&STYLES=raster&LAYERS=Sentinel2:S2_MSI_L1C&TIME=2019-05-18T15:07:21.024Z&SRS={proj}&WIDTH={width}&HEIGHT={height}&BBOX={bbox}

You can also specify a time range. Note this is all using the standard tone mapping used for previews of Sentinel-2 images. This does not make use of the full potential of the data in particular in dark areas (like here) or bright areas. Processing the images yourself for a specific application (which you can do since it is open data obviously) allows you to extract a lot more information.
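For switching between a single timestamp and a time range, the URL can be assembled programmatically. This is only a sketch: the endpoint, layer name and parameters are copied from the URL above, the start/end range syntax is the one code-de accepts for date ranges, and the helper name `josm_wms_url` is made up. Whether the service still responds at this address is not checked here.

```python
from urllib.parse import urlencode

# Endpoint copied from the URL above.
BASE_URL = "https://geoservice.code-de.org/Sentinel2/wms"


def josm_wms_url(time_spec):
    """Build a JOSM WMS imagery URL for a timestamp or a start/end range."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "FORMAT": "image/jpeg",
        "TRANSPARENT": "true",
        "STYLES": "raster",
        "LAYERS": "Sentinel2:S2_MSI_L1C",
        "TIME": time_spec,
    }
    # Leave ':' and '/' unescaped so the result reads like the URL above.
    query = urlencode(params, safe=":/")
    # These placeholders are filled in by JOSM, so they are appended as-is.
    placeholders = "SRS={proj}&WIDTH={width}&HEIGHT={height}&BBOX={bbox}"
    return f"wms:{BASE_URL}?{query}&{placeholders}"


# Single image by timestamp, then a two week range.
print(josm_wms_url("2019-05-18T15:07:21.024Z"))
print(josm_wms_url("2019-05-01T00:00:00Z/2019-05-15T00:00:00Z"))
```

The resulting string can be pasted into JOSM as a custom WMS imagery entry.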

Relatively crude pixel statistics mosaics like the EOX imagery are generally not advisable for mapping purposes because of the high incidence of artefacts and high noise levels, which make interpretation very difficult, leave little actual information and bring a high likelihood of mistaking artefacts for actual features. This might still be convenient for a quick look but investing time into tracing from such imagery is universally a bad idea. Editors should IMO display a prominent warning along these lines when mappers use such layers.

North East Greenland

There are two changeset discussions related to this:

osm.org/changeset/68256914 osm.org/changeset/68269698

which already contain a few significant points on the matter. In particular

  • the importance of diligence on the legal aspects
  • the selection of fitting tags for the features
  • the need for conflation with existing data to avoid duplicates.

In general North Eastern Greenland has seen a significant amount of mapping during the past 1-2 years by 4rch moving it from one of the worst mapped areas of Greenland to one of the most accurately mapped. If you map physical geography there please don’t use the 20 year old legacy images you get in Bing and other global image layers, use either the Greenland mosaic i prepared:

http://maps.imagico.de/#map=6/73.428/-23.093&lang=en&l=greenland&r=osmim&o=2&ui=8

which is available in all common editors or any newer open data imagery (from Sentinel-2 or Landsat 8) plenty of which is available (though requires custom preparation).

Disputed boundary tagging sprint (2019-03)

To be frank - your effort is highly problematic because it is calling for systematically entering non-verifiable data into OSM. Doing so without first having and concluding an open discussion with the community about revising our core principles is something i would consider very close to calling for vandalizing OSM.

Johnparis with his proposal took the appropriate approach to discuss his ideas and be open to adjusting them from feedback received. You on the other hand present finished instructions here and call for implementing them without offering to have an open discussion first.

Now it is quite possible that your approach could be more successful since there is a significant fraction of the OSM community who want to abolish verifiability in the OSM sense or at least downgrade it to a suggestion. It is even possible that there meanwhile would be a majority for this among mappers due to changing demographics. But that does not make it the right approach. And it is rather short sighted because the social cohesion of the OSM community fundamentally depends on verifiability - which you try to punch out of your way so to speak with your approach. I wrote about this subject more extensively.

Consequently this:

But there isn’t yet a consensus in OpenStreetMap about which approach to take

is a gross mischaracterization of the situation. There is no consensus at the moment on whether disputed boundaries should be recorded in OSM at all.

My suggestion to you: Collect the disputed boundary data in a separate database outside the main OSM db. You could do this either within existing projects (like wikidata) or as a separate new database that is compatible to OSM so can be used with the same tools. This would be the respectful and responsible way to pursue your interests here and could serve as a blueprint for other types of non-verifiable data people might want to record.

The questionable definition of "active mapper" - Wer ist ein "aktiver Mapper"?

Well - as far as the Contributor Terms are concerned the whole thing is fairly academic, because

  • i have not yet heard anyone who considers the practical application of this part of the CTs a realistic possibility - because (a) the hurdle is extremely high and (b) the requirement “a natural person” is very hard to verify universally.
  • the number of user accounts similar to your examples is presumably fairly small compared to the accounts without edits in the last year on the one hand and really active mappers on the other.

My hope is that at some point we will get to active mappers automatically receiving voting rights in the OSMF and that we then develop a solid and robust definition of what an active mapper is. That is of course not that easy, because there is an extremely wide range of mapping behaviour you want to cover while at the same time keeping out any spammer and astroturfing accounts. But with a bit of brainpower this is certainly doable.

Label painting guide continued

I am not a local of the area nor have i ever visited it so i have no way to reliably verify the name from here.

I would suggest not concentrating on name tag verifiability. That is a completely different topic which has nothing to do with the subject of this diary entry. If you want to reject verifiability as a concept because you have no way to verify names from a distance i have no issue with that.

Label painting guide continued

I have not expressed any position on name tagging here. I have no issue with the verifiability of the name Baffin Bay.

Label painting guide continued

This is what just happened - you demonstrated that it is false and needs to be improved.

No - and you selectively quoting what you like and leaving out the rest does not make it right. If i demonstrated something it is that the self-referential cultivation of completely made-up ideas is spilling over to OSM from Wikipedia.

As linked to i have written at length about the idea of verifiability in OpenStreetMap. I don’t really feel like explaining the fundamentals again here.

Everyone is free to map stuff the way shown. What i try to do here is give readers a bit of an idea why this is a bad idea and a bad development for the project. Not everyone will understand that - as said irony is a tricky thing. And some people just want to paint labels in the map (which i perfectly understand).

Label painting guide continued

Nice to see someone is falling for my trap (sorry).

Please check if the following applies:

☐ you have understood that verifiability in OSM means independent verifiability based on the observable geographic reality.

☐ you are aware that the polygon painting in OSM is not anywhere near the IHO declaration.

A bit of analysis of the OSMF board election results

The raw data is relatively easy to interpret: All lines containing a vote start with 1 so you can just grep for ^1 to get the data lines. After the 1 follows the vote with the candidates represented as numbers. The first empty place on the ballot is marked with a 0.
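A minimal sketch of that reading procedure, assuming the fields on each line are separated by whitespace. The sample data is made up and the real file is not reproduced here. The check on the first field is slightly stricter than grepping for ^1, since it skips lines that start with a larger number such as 10.

```python
from collections import Counter

# Made-up sample in the layout described above.
SAMPLE = """some header line
1 3 5 2 0
1 3 5 0
1 6 0
0
"""


def parse_ballots(text):
    """Return one list of candidate numbers (in preference order) per ballot."""
    ballots = []
    for line in text.splitlines():
        fields = line.split()
        # Vote lines start with a 1; everything else is skipped.
        if not fields or fields[0] != "1":
            continue
        ranking = []
        for field in fields[1:]:
            if field == "0":  # first empty place on the ballot
                break
            ranking.append(int(field))
        ballots.append(ranking)
    return ballots


ballots = parse_ballots(SAMPLE)
first_choices = Counter(b[0] for b in ballots if b)
first_second_pairs = Counter(tuple(b[:2]) for b in ballots if len(b) >= 2)
print(first_choices.most_common())
print(first_second_pairs.most_common())
```

Counting first choices and pairs of first and second choice this way is enough to reproduce tallies like the ones discussed here.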

It is clear that since Tobias got by far the most first priority votes most of the identical ballots were with him on position one. Of the 189 people who voted Tobias first as i mentioned 82 voted Joost second and of these as you say 11 had no third choice and 29 had Stereo on third and then empty. This seems a pretty natural distribution if you take into account the political similarities (i.e. that people who voted for Tobias first have a higher preference for some candidates than for others).

Regarding “So no statistical conspicuity for sold voted” - that is not something you can necessarily observe as an outsider. The technique Rory discussed would be along the lines of “You should vote for A first and for us to verify you actually did so please fill the rest of your ballot with the following random sequence”. Since there are 6!=720 possible orderings of the remaining places for a given first position candidate, there is a relatively high chance that no other voter coincidentally votes the same combination. But only the person who actually instructed someone to vote this way would be able to detect if the instructed voter obeyed the instruction.
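To put a rough number on this: a small sketch under the deliberately crude assumption that every other voter fills the remaining places completely and uniformly at random. Real ballots are neither complete nor uniform, as the distribution of second choices shows, so this is an illustration and not an estimate. The function names are made up.

```python
import math


def orderings(remaining_candidates):
    """Number of ways to order the remaining places on a ballot."""
    return math.factorial(remaining_candidates)


def prob_sequence_unique(n_other_voters, remaining_candidates=6):
    """Probability that none of the other voters picks one specific ordering.

    Assumes (unrealistically) independent, uniformly random orderings.
    """
    p_match = 1 / orderings(remaining_candidates)
    return (1 - p_match) ** n_other_voters


print(orderings(6))
# 188 = the 189 voters with Tobias first, minus the instructed voter.
print(prob_sequence_unique(188))
```

Under these assumptions a prescribed sequence would more often than not be unique among the ballots with the same first choice, which is what makes it usable as a marker.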

Similar things apply for the possibility of collective voting instructions. Such instructions would usually call for who to vote for on position one and maybe two but there is not that much gain in instructing people to a specific whole sequence. The distribution of pairs of first and second choice is - as i analyzed - pretty broad and while there are combinations more frequent than others (which is natural given similarity and dissimilarity in what candidates represent) there is no single one that stands out specifically.

If you think about how much effort it would have taken to change the election to a different outcome - the distance between Joost and Miriam in the end was about 45 votes. That is about how many votes you would have needed to add or remove to change the result - assuming that this specific change (Miriam instead of Joost) is what you want to accomplish. All other potential goals would have been much more expensive to accomplish.

OSMF Board election 2018 - Answers provided after deadline

I added links to this and other statements by candidates made after November 30 on

osm.wiki/User:Imagico/Analysis_of_OSMF_board_candidates_2018/#Tell_us_a_little_about_your_OSM_activities

Some numbers about mailing lists

I would suggest focusing on recent years and not the full duration of the archive since that could lead to a bias towards the old timers. Long-time experience is a valid argument for a candidate but should not be mixed with recent activity.

I think it would be good to include other channels than mailing lists - you mentioned the forum and wiki activities already - diary entries and comments to them are another venue, so are changeset discussions. I think it is fine to exclude proprietary platforms because their use for OSM community discourse is problematic anyway.

And it is always important to keep in mind that in communication quantity does not necessarily say something about quality. Some people comment a lot of things while others contribute more rarely but provide more thoughtful messages.

Membership Working Group Updates

I understand the difficulties. My main concern here is that the decision of the board on accepting requests for waivers due to financial hardship should be subject to external supervision and transparency and the best way to do that is to have numbers for that being publicly reported on a regular basis. This should be separate for the two reasons for waivers currently allowed.

Obviously it will take some time until a routinely working system to track and document decisions is established. I just wanted to make sure this is on the radar of the MWG.

Towards a dedicated public issue tracking/project management system for OSM

My impression is that while github and github clones are in total fairly programmer centric (though you can argue about to what extent this applies to the issue tracker aspect of them alone) phabricator seems more management centric. Since Wikimedia has a much higher degree of centralized management than OSM you would need to think about to what extent it makes sense to use a tool designed specifically for that. In general i would prefer using software that is being developed for a wide audience and avoid tools that are developed for a specific application outside of OSM that is in the future not unlikely to develop in a very different direction.

But ultimately i think this should be about collecting the options we have, finding out the pros and cons and then making an informed decision.

Membership Working Group Updates

Thanks for the update.

That Iran is on top of your list is both nice to see and logical since it is the country from your list with the most active mapper community. It is also the most active mapping community in the Middle East in total.

Even more important than the relatively formal decisions on the technical waiver approvals are of course the financial hardship cases where a subjective decision is to be made. Am i right to assume that even if the board makes the decisions on those the MWG will be able to provide reporting on this in the future?

The most surreal and memorable OSMF board meeting yet

Regarding the structure of OSMF governance and checks and balances - i think we have discussed this before - i generally tend towards a direct democracy and am fairly critical towards multiple levels of indirect control (like local mappers -> local chapters -> OSM parliament -> executive board). But the need to facilitate communication across language and culture barriers is of course paramount and this is obviously a difficulty with direct democracy.

I think it would be good to work on and discuss different options in this domain but any such scenario would mean the current board and the OSMF members essentially giving up a large part of their privileges and i am not sure this is realistic.

Regarding the basis of my assessments about what the OSMF needs - yes, i make fairly bold statements here and do not support all of them with a lot of arguments. My assessments about this kind of thing are usually based on plausible scenarios i see. But it is always possible there are scenarios i do not see. Therefore i always look for others describing possibilities that are new to me. If you have a scenario for how the current development of the OSMF without any major changes leads to a future with the OSMF supporting the local OSM contributors all over the world in their diverse needs without this being overshadowed by external interests i would be eager to hear it.