OpenStreetMap Carto Complexity

Posted by pnorman on 16 November 2015 in English.

This is a repost from my blog

I often refer to OpenStreetMap Carto as the largest, most complex open multi-contributor map style, but what does that mean?

Broken down, it means:

  • It’s the largest open stylesheet. If you measure in code size, features rendered, or complexity, nothing else is close;

  • It’s the largest multi-contributor map style that doesn’t have a company dictating what is worked on. This means we get merge conflicts. They got so bad we changed the technology we use to define layers to make them solvable; and

  • It’s the largest style using OpenStreetMap data. Some proprietary styles like OpenCycleMap, MapQuest Open, and Mapbox Streets are complex, but none of them render the range of features we do.

This complexity didn’t come out of nowhere. It’s been building since contributions shot up in October 2014. That is when we introduced YAML layer definitions, which made the style much easier to edit and streamlined the process of merging new features.

The style is large enough that no one person can understand it all. I know I can’t, and I’m a maintainer. There are too many parts, and too many interdependencies between them.

How does this style stack up against other big Mapnik styles which show a range of features? Styles like OpenStreetMap “FR” Carto and OpenStreetMap Carto German which try to showcase all of OSM data are forked versions of OpenStreetMap Carto, but there are some truly independent styles we can look at.

Not all styles use YAML layers, so to make the measurements consistent I processed layers defined in JSON through a bit of Python:

python -c 'import sys, yaml, json; print(yaml.safe_dump(json.load(sys.stdin)))'

This is the reverse of osm-carto yaml2mml.py and gives the layers in the same YAML form.

Stamen took part in the design of CartoDB Basemaps as well as their own maps, and all three make use of some variation of High Road, which simplifies roads so that you only ever see three road classifications at any one zoom, with the classifications changing by zoom level.
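As a rough illustration of that idea, the sketch below maps an OSM highway value to one of three classes depending on zoom. The class names, zoom breakpoints, and groupings are all invented for illustration; they are assumptions, not High Road's actual rules.

```python
# Hypothetical zoom breakpoints and groupings (assumptions for illustration,
# NOT taken from High Road). Entries are ordered from highest min_zoom down.
ZOOM_CLASSES = [
    (15, {"motorway": "highway", "trunk": "major_road", "primary": "major_road",
          "secondary": "major_road", "tertiary": "minor_road",
          "residential": "minor_road"}),
    (12, {"motorway": "highway", "trunk": "major_road", "primary": "major_road",
          "secondary": "minor_road", "tertiary": "minor_road"}),
    (9, {"motorway": "highway", "trunk": "major_road", "primary": "minor_road"}),
]


def road_kind(highway, zoom):
    """Return one of three road classes for this zoom, or None if not drawn."""
    for min_zoom, classes in ZOOM_CLASSES:
        if zoom >= min_zoom:
            # The first matching zoom band decides; unlisted roads are dropped.
            return classes.get(highway)
    return None  # below the lowest breakpoint nothing is classified
```

The style itself then only needs rules for three classes, while the lookup decides which real-world roads fall into each class at a given zoom.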

Mapbox Streets’ heavy use of SQL is unusual. They are using triggers to post-process osm2pgsql data into multiple tables, simplify, and transform tagging. This novel approach probably brings with it interesting maintenance challenges, and normally I’d recommend using Imposm or osm2pgsql Lua transforms.

Cycle.travel uses 285 lines of Lua for one of the most sophisticated treatments of cycle-related tags, used for rendering surface quality. It would take significantly more SQL to do the same work in layer queries.

Surprisingly, Mapnik XML line counts are comparable to CartoCSS line counts, so we can look at the Mapnik XML stylesheet from 2012, MapQuest Open, and OpenTopoMap, three full-featured Mapnik XML stylesheets.

What’s shocking is the line count of osm-carto compared to everything else. The next three most complex CartoCSS styles have about the same number of lines combined.
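For anyone who wants to make a similar comparison, here is a minimal sketch of one way to count lines per file type in a style directory. It is not the script used for the numbers above; the list of extensions and the choice to skip blank lines are assumptions.

```python
from pathlib import Path


def count_lines(path, skip_blank=True):
    """Count lines in one text file, optionally ignoring blank lines."""
    with open(path, encoding="utf-8", errors="replace") as f:
        return sum(1 for line in f if not skip_blank or line.strip())


def style_line_count(style_dir, extensions=(".mss", ".yaml", ".sql", ".lua")):
    """Sum line counts per extension for matching files under style_dir.

    The default extensions are an assumption about what counts as style code.
    """
    totals = {}
    for path in sorted(Path(style_dir).rglob("*")):
        if path.is_file() and path.suffix in extensions:
            totals[path.suffix] = totals.get(path.suffix, 0) + count_lines(path)
    return totals
```

Calling style_line_count on a checkout of a style returns a dictionary keyed by extension, so CartoCSS, layer definitions, SQL, and Lua can be compared separately.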

The choice of Imposm vs osm2pgsql or the use of intermediate vector tiles doesn’t seem to change style complexity.

Thanks to Richard Fairhurst, AJ Ashton, and Andy Allan for numbers for their stylesheets. Komяpa provided some MapCSS numbers, but I ultimately didn’t use them since I wasn’t sure MapCSS and CartoCSS line counts were comparable.

Discussion

Comment from imagico on 16 November 2015 at 08:36

There are two important things your analysis misses, I think:

  • In addition to Mapbox Streets there are also other styles that use preprocessing. Like OpenTopoMap. In fact you could say through the coastlines all styles make some use of external data preprocessing that is based on additional code. This is of course the same for all styles.
  • In addition code complexity is largely influenced by the feature set offered by the underlying software. Styles vary in what versions of the various tools they require for example and if they use custom extensions (like for PostGIS). Quite a lot of the code complexity in osm-carto is there to work around limitations of the capabilities of the software used.
  • many styles use external non-osm data which often essentially means externalizing processing complexity.

Comment from pnorman on 22 November 2015 at 02:36

In addition to Mapbox Streets there are also other styles that use preprocessing. Like OpenTopoMap

I hadn’t counted those lines, but there’s only about 50 there. And of those, only about 10 lines seem to be ones you should run, the rest are a SQL statement that doesn’t seem to do anything.

CartoDB Basemaps also use some preprocessing, but not trigger-based like Mapbox Streets.

3k lines of preprocessing triggers and other SQL is, in my experience, unique.

In fact you could say through the coastlines all styles make some use of external data preprocessing that is based on additional code

None of them are maintaining their own coastline code, except perhaps to import coastlines into PostgreSQL. I think I counted it where they’re doing that but shouldn’t have, because I didn’t count osm-carto’s get-shapefiles.sh, which is also probably longer than the equivalent in other styles.

Quite a lot of the code complexity in osm-carto is there to work around limitations of the capabilities of the software used

All the styles use PostgreSQL, PostGIS, and Mapnik. I don’t think any of them are making any assumptions about versions, unless they happen to create databases and use the CREATE EXTENSION syntax. I guess some of the CartoCSS styles would also work with MapServer and magnacarto instead of Mapnik and carto, but no one is targeting that.

many styles use external non-osm data which often essentially means externalizing processing complexity

With the possible exception of CartoDB, they’re not editing the external sources. If they don’t have to edit them, it doesn’t add to the maintenance burden.