OpenStreetMap

pnorman's Diary

Recent diary entries

Reporting Shortbread problems

Posted by pnorman on 25 July 2025 in English.

A style using the Shortbread vector tiles is now live on OpenStreetMap.org. There have been some minor issues with the roll-out, but nothing that caused me to panic. I’ve been taking a couple of days to unwind and relax, as this has been a major milestone.

A feature like this includes work across multiple projects, so people have been asking where to report issues.

  1. Any issues that are present on the OpenStreetMap website but not on the vector demo page should be reported to openstreetmap-website. Some issues reported have been with attribution controls and strange panning at extremely high zooms.

  2. Style issues (e.g. colors, symbology, fonts) are handled by the VersaTiles Colorful and VersaTiles Eclipse styles. Both can be found in a VersaTiles repository.

  3. What is supposed to be in the tiles is defined in the Shortbread Vector Tile Schema 1.0. There are lists of what features should appear, and where they should appear.

  4. The code responsible for what is actually in the tiles is contained in the Street Spirit repo. If something is missing that is supposed to be in Shortbread, this needs fixing. This also handles simplification and other parts of generating the tiles.

  5. The software that serves the tiles is Tilekiln. If the wrong HTTP headers are being sent, this is where you report it.

  6. The chef code pulls everything together on the OSMF servers. It’s unlikely you should report an issue directly here.

  7. Some font rendering issues are part of MapLibre GL JS. This is most likely to come up in Southeast Asia.

Identifying changed coastline

Posted by pnorman on 19 June 2025 in English.

Just a quick blog post on some coastline work I was doing.

For the OSMF Shortbread vector tiles I had to identify when coastlines have changed. The solution I came up with isn’t specific to Shortbread, but is useful for anyone using the tiled ocean shapefiles.

I’m going to start by assuming that the old ocean data is in water_polygons and the new data is in loading.water_polygons. Other parts of my code already handle this. The shapefiles are loaded into tables that have the columns x int, y int, way geometry.

To start I want to find any geometries that have changed. For geometries in the new data that aren’t in the old data, I can get this with a LEFT JOIN. I want a set of geometries that includes any geometries from the new data that aren’t in the old. This set can be made by excluding any geometries in the old data that have identical x, y, and binary identical geometry to a new geometry. It’s possible this set includes extra geometries, but that’s okay.

A RIGHT JOIN would find geometries in the old data that aren’t in the new. Combining these gives a FULL OUTER JOIN. If I then collect the geometries in each shapefile tile, I can compare the old and new collections to find the difference.

SELECT ST_SymDifference(ST_Collect(old.way), ST_Collect(new.way)) AS dgeom
    FROM water_polygons AS old
    FULL OUTER JOIN loading.water_polygons AS new
        ON old.way = new.way AND old.x = new.x AND old.y = new.y
    WHERE new.way IS NULL OR old.way IS NULL
    GROUP BY COALESCE(old.x, new.x), COALESCE(old.y, new.y)

This gets me the difference in geometries for the entire world in about two minutes. But I need tiles, which is its own complication.
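The join logic can also be illustrated outside the database. The following is a minimal Python sketch, not the code used for the tiles. It assumes each table has been read into a list of (x, y, wkb) tuples, where wkb is the binary form of the geometry, and the function name changed_shapefile_tiles is made up for illustration. Rows that appear in only one of the two datasets correspond to the unmatched rows of the FULL OUTER JOIN, and grouping them by (x, y) gives the shapefile tiles whose geometries need comparing.

```python
from collections import defaultdict


def changed_shapefile_tiles(old_rows, new_rows):
    """Group rows present in only one dataset by their (x, y) shapefile tile.

    Each row is an (x, y, wkb) tuple. The wkb bytes are compared exactly,
    mirroring the binary-identical comparison in the SQL join condition.
    """
    old_set, new_set = set(old_rows), set(new_rows)
    changed = defaultdict(lambda: {"old": [], "new": []})
    # Rows that exist only in the old data.
    for x, y, wkb in old_set - new_set:
        changed[(x, y)]["old"].append(wkb)
    # Rows that exist only in the new data.
    for x, y, wkb in new_set - old_set:
        changed[(x, y)]["new"].append(wkb)
    return dict(changed)


# Made-up example data: tile (0, 0) is unchanged, (0, 1) was modified,
# (1, 1) was removed and (2, 2) was added.
old = [(0, 0, b"A"), (0, 1, b"B"), (1, 1, b"C")]
new = [(0, 0, b"A"), (0, 1, b"B2"), (2, 2, b"D")]
result = changed_shapefile_tiles(old, new)
print(sorted(result))
```

Unchanged rows never reach the comparison step, which is what keeps the expensive geometry work limited to shapefile tiles that actually differ.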

See full entry

Performance testing vector tiles

Posted by pnorman on 10 June 2025 in English.

Load testing vector tiles

As part of bringing the new vector tile servers into production, I had to benchmark their performance. Since there’s a cache in front of the servers, it’s challenging to benchmark them accurately. Although we’ve never had a heavy load on the vector tile servers, we’ve been running raster tile servers for years.

All tile requests on the standard layer are logged, and from those logs, I can generate a list of tiles to benchmark the vector tile servers. The logs are stored as Parquet files, which I query using Amazon Athena, a hosted Presto database.

Vector tiles and raster tiles typically have different scales at the same zoom level. To convert raster tile requests to equivalent vector tile requests, I divide the x and y coordinates by 2 and decrease the zoom level by 1. I also skip zoom 0 raster tile requests to simplify the process, as these don’t affect performance since zoom 0 is always cached.

The OSMF Shortbread tiles have a maximum zoom of 14. Higher zoom levels are achieved by overzooming on the client side. Requests from zoom 1 to 15 should have their zoom level lowered by 1. Requests from zoom 16 to 19 need their zoom level decreased by the difference between their level and 14. I divide the x and y coordinates by 2 the appropriate number of times to match the new zoom level.
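The arithmetic can be checked with a small Python sketch. The function name raster_to_vector and the max_vector_zoom parameter are made up for illustration. Dividing a coordinate by 2 and discarding the remainder is the same as shifting it right by one bit.

```python
def raster_to_vector(z, x, y, max_vector_zoom=14):
    """Map a raster tile request to the vector tile that covers it.

    Zoom 0 raster requests are rejected, since they are skipped in the text.
    """
    if z < 1:
        raise ValueError("zoom 0 raster requests are skipped")
    # Zooms 1 to 15 drop one level. Higher zooms are clamped to the maximum
    # vector zoom, because the client overzooms from there.
    shift = 1 if z <= max_vector_zoom + 1 else z - max_vector_zoom
    return z - shift, x >> shift, y >> shift


print(raster_to_vector(15, 5, 7))     # one level lower
print(raster_to_vector(19, 100, 65))  # clamped to zoom 14
```

Under these assumptions, a zoom 19 raster request is shifted by five bits, so 32 × 32 raster tiles map to the same zoom 14 vector tile.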

Filtering to have only cache misses gets me a request list on the backend servers.

SELECT
    -- Vector zoom: one level lower than the raster zoom, capped at 14
    CASE WHEN z > 15 THEN 14 ELSE z - 1 END AS v_z,
    -- Halve x and y once per zoom level removed
    bitwise_right_shift(x, CASE WHEN z > 15 THEN z - 14 ELSE 1 END) AS v_x,
    bitwise_right_shift(y, CASE WHEN z > 15 THEN z - 14 ELSE 1 END) AS v_y,
    z, x, y
FROM fastly_success_logs_v1
WHERE year = 2025 AND month = 5 AND day = 1 AND hour = 1
    AND z >= 1
    AND cachehit = 'MISS';

Unfortunately, this is the wrong list.

See full entry

It’s difficult to write in all map style languages. A style written in JSON, like MapLibre GL, has a few extra pain points because JSON is not designed for editing by humans.

Some “common” style languages are

  • CartoCSS
  • Mapnik XML
  • MapCSS
  • MapServer
  • MapLibre GL/Mapbox GL

Some, like CartoCSS, are designed for human editing, while others, like Mapnik XML, serve as a lower-level language. MapLibre GL falls into the latter category of not being designed for editing by humans. MapLibre GL preprocessors like glug were designed to help with this, but none of them have taken off. Other style projects like openstreetmap-americana have taken a different route. Their developers have written a program in JavaScript that generates the style.

I’m taking a different route. I’m creating a language that uses minimal pre-processing of its input to produce MapLibre GL. I don’t aim to solve every difficulty with MapLibre GL, only the ones that impact me the most. The end result will be a pre-processing language.

The biggest problems I encounter when writing MapLibre GL are

  1. No comments

    Comments are essential so other readers understand what’s written.

  2. Everything has to be in one file.

    With large styles this is a burden. More than one file makes it easier to edit.

  3. Having to repeat definitions instead of using a variable.

    Something like a color or symbol definition might appear a dozen times in the style. If you want to change it, you need to make sure you got all the occurrences.

  4. Inability to make versions of the style in different colors.

    When you only want to change a few superficial elements of the style, you want to contain those changes to one file.

  5. Not having support for more colorspaces

    I work in perceptual colorspaces like Lch. It’s a lot of converting that the computer should automate.
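To make problems 3 and 4 concrete, here is a minimal Python sketch of variable substitution over a style fragment. It is only an illustration and not the syntax of the language described above: the "$name" convention, the substitute function, and the example color are all assumptions.

```python
def substitute(node, variables):
    """Return a copy of a style fragment with "$name" strings replaced."""
    if isinstance(node, dict):
        return {key: substitute(value, variables) for key, value in node.items()}
    if isinstance(node, list):
        return [substitute(item, variables) for item in node]
    if isinstance(node, str) and node.startswith("$"):
        # A KeyError here flags a variable that was never defined.
        return variables[node[1:]]
    return node


# The color is defined once and used by two layers.
variables = {"water": "#aad3df"}
layers = [
    {"id": "ocean", "type": "fill", "paint": {"fill-color": "$water"}},
    {"id": "river", "type": "line", "paint": {"line-color": "$water"}},
]
resolved = substitute(layers, variables)
print(resolved[0]["paint"]["fill-color"])
```

With this approach, a version of the style in different colors only needs a different variables dictionary, which keeps those changes in one place.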

What issues have you found when writing MapLibre GL styles?

This blog post explains how I handle a typical bug report for the new OSMF Shortbread tiles. Here, I focus on the report from SomeoneElse that “island” seems to be missing from “place_labels”.

After verifying that the report is correct, I set up my editor environment. It’s useful to have an environment that syntax highlights Jinja SQL files, as well as other files. I use a Visual Studio Code-based editor with the Better Jinja plugin.

The issue is in the place_labels layer. After checking Shortbread, I see that place=island should show at zoom 10 or higher, so there is a bug. Tilekiln creates tiles by reading definitions from shortbread.yaml, so I check there for the place_labels definition.

place_labels:
    description: Holds label points for populated places.
    fields:
        kind: Value of OSM place tag
        name: *name
        name_en: *name_en
        name_de: *name_de
        population: Value of OSM population tag
    sql:
    - minzoom: 4
      maxzoom: 14
      file: shortbread_original/place_labels.04-14.sql.jinja2

This file shows that for zooms 4 to 14, the SQL for the layer is in shortbread_original/place_labels.04-14.sql.jinja2. Since this file is in shortbread_original, osm2pgsql-themepark created it, and it remains unchanged.

SELECT
        ST_AsMVTGeom(geom, {{unbuffered_bbox}}, {{extent}}, {{buffer}}) AS way,
        name,
        name_de,
        name_en,
        kind,
        population
    FROM place_labels
    WHERE geom && {{bbox}}
        AND {{zoom}} >= minzoom
    ORDER BY population desc

There aren’t any obvious bugs in the SQL. There’s no filtering out of islands, so either the data isn’t making it into the place_labels table or it has the wrong zoom. The data is loaded by osm2pgsql, and shortbread.lua tells osm2pgsql how to do that.

themepark:add_topic('shortbread/places')

See full entry

Minutely Shortbread tiles

Posted by pnorman on 29 February 2024 in English. Last updated on 5 March 2024.

I’ve put up a demo page showing my work on minutely updated vector tiles. This demo is using my work for the tiles and the VersaTiles Colorful stylesheet.

With this year being the year of OpenStreetMap vector maps I’ve been working on making vector tile maps that update minutely. Most maps don’t need minutely updates and are fine with daily or, at most, weekly. Minutely updates on OpenStreetMap.org are a crucial part of the feedback cycle where mappers can see their edits right away and get inspired to map more often. Typically a mapper can make an edit and see their edit when reloading after 90-180 seconds, compared to the days or weeks of most OSM-based services, or the months or years of proprietary data sources.

Updating maps once a week can be done with a simple architecture that takes the OSM file for the planet and turns it into a single file containing all the tiles for the world. This can scale to daily updates, but not much faster. To do minutely updates we need to generate tiles one-by-one, since they change one-by-one. When combined with the caching requirements for osm.org, this is something no existing software solved.

For some time I’ve been working on Tilekiln, a small piece of software which leverages the existing vector tile generation of PostGIS, the standard geospatial database. Tilekiln is written specifically to meet the unique requirements of a default layer on osm.org. Recently, I’ve been working for the OSMF on setting up minutely updated vector tiles using the Shortbread schema. A schema is a set of definitions for what goes in the vector tiles, and Shortbread is a CC0-licensed schema that anyone can use and that already has existing styles.

See full entry

I’ve been looking at how many tiles are changed when updating OSM data in order to better guide resource estimations, and have completed some benchmarks. This is the technical post with details; I’ll be doing a high-level post later.

Software like Tilemaker and Planetiler is great for generating a complete set of tiles, updated about once a day, but they can’t handle minutely updates. Most users are fine with daily or slower updates, but OSM.org users are different, and minutely updates are critical for them. All the current minutely ways to generate map tiles involve loading the changes and regenerating tiles when data in them may have changed. I used osm2pgsql, the standard way to load OSM data for rendering, but the results should be applicable to other ways including different schemas.

Using the Shortbread schema from osm2pgsql-themepark I loaded the data with osm2pgsql and ran updates. osm2pgsql can output a list of changed tiles (“expired tiles”) and I did this for zoom 1 to 14 for each update. Because I was running this on real data, sometimes an update took longer than 60 seconds to process if it was particularly large, and in this case the next run would combine multiple updates from OSM. Combining multiple updates reduces how much work the server has to do at the cost of less frequent updates, and this has been well documented since 2012, but no one has looked at the impact from combining tiles.
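A rough Python sketch of why combining matters for tiles: a tile expired by two consecutive updates is regenerated twice if the updates are handled separately, but only once if they are combined. The function name and the tile lists are made up for illustration and are not measurements.

```python
def expired_tile_counts(updates):
    """Compare handling each update separately with handling them combined.

    `updates` is a list of iterables of (z, x, y) expired tiles, one per update.
    Returns (tiles regenerated separately, tiles regenerated when combined).
    """
    separate = sum(len(set(tiles)) for tiles in updates)
    combined = len(set().union(*updates))
    return separate, combined


# Two made-up updates that both expire tile (14, 100, 200).
minute_1 = [(14, 100, 200), (14, 100, 201)]
minute_2 = [(14, 100, 200), (14, 5, 5)]
separate, combined = expired_tile_counts([minute_1, minute_2])
print(separate, combined)
```

How large the saving is on real data depends on how often consecutive updates touch the same tiles, which is the kind of thing the benchmark measures.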

To do this testing I was using a Hetzner server with 2x1TB NVMe drives in RAID0, 64GB of RAM, and an Intel i7-8700 @ 3.2 GHz. Osm2pgsql 1.10 was used, the latest version at the time. The version of themepark was equivalent to the latest version.

The updates were run for a week from 2023-12-30T08:24:00Z to 2024-01-06T20:31:45Z. There were some interruptions in the updates, but I did an update without expiring tiles after the interruptions so they wouldn’t impact the results.

To run the updates I used a simple shell script

See full entry