Load testing vector tiles
As part of bringing the new vector tile servers into production, I had to benchmark their performance. Since there’s a cache in front of the servers, it’s challenging to benchmark them accurately. Although we’ve never had a heavy load on the vector tile servers, we’ve been running raster tile servers for years.
All tile requests on the standard layer are logged, and from those logs, I can generate a list of tiles to benchmark the vector tile servers. The logs are stored as Parquet files, which I query using Amazon Athena, a hosted Presto database.
Vector tiles and raster tiles typically have different scales at the same zoom level. To convert raster tile requests to equivalent vector tile requests, I divide the x and y coordinates by 2 and decrease the zoom level by 1. I also skip zoom 0 raster tile requests to simplify the process, as these don’t affect performance since zoom 0 is always cached.
The OSMF shortbread tiles have a maximum zoom of 14. Lower scales (higher zoom levels) are achieved by overzooming on the client side. Requests from zoom 1 to 15 should have their zoom level lowered by 1. Requests from zoom 16 to 19 need their zoom level decreased by the difference between their level and 14. I divide the x and y coordinates by 2 the appropriate number of times to match the new zoom level.
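The conversion rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code used in production; the function name `raster_to_vector` and the constant `MAX_VECTOR_ZOOM` are hypothetical names introduced here.

```python
# Hypothetical helper illustrating the raster-to-vector tile mapping.
MAX_VECTOR_ZOOM = 14  # maximum zoom of the shortbread tiles, per the text


def raster_to_vector(z, x, y):
    """Map a raster tile z/x/y to the vector tile that covers it."""
    if z < 1:
        # Zoom 0 raster requests are skipped in the text, so reject them here.
        raise ValueError("zoom 0 raster tiles are not converted")
    # Zooms 1-15 drop one level; zooms 16 and up clamp to the maximum vector zoom.
    shift = z - MAX_VECTOR_ZOOM if z > 15 else 1
    # Shifting right by one bit is an integer division by 2.
    return (z - shift, x >> shift, y >> shift)


print(raster_to_vector(10, 1, 1))
print(raster_to_vector(19, 1000, 2000))
```

Right-shifting by `shift` bits performs the "divide by 2 the appropriate number of times" step in one operation, which is the same idea a bitwise shift expresses in SQL.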
Filtering for cache misses only gives me the list of requests that reach the backend servers.
SELECT
  -- Raster zooms 1-15 map one level down; zooms 16-19 clamp to vector zoom 14.
  CASE WHEN z > 15 THEN 14 ELSE z - 1 END AS v_z,
  -- Each right shift by one bit halves the coordinate.
  bitwise_right_shift(x, CASE WHEN z > 15 THEN z - 14 ELSE 1 END) AS v_x,
  bitwise_right_shift(y, CASE WHEN z > 15 THEN z - 14 ELSE 1 END) AS v_y,
  z, x, y
FROM fastly_success_logs_v1
WHERE year = 2025 AND month = 5 AND day = 1 AND hour = 1
  AND z >= 1
  AND cachehit = 'MISS';
Unfortunately, this is the wrong list.
If a user requests the tiles 10/0/0, 10/0/1, 10/1/0, and 10/1/1 at the same time on a raster map, the same view on a vector tile map would have requested only one tile, 9/0/0. Something similar happens above zoom 14. When the user is viewing a higher zoom, they’re still requesting zoom 14 tiles. I can’t deduplicate all requests because if a user requests a tile that results in a cache miss and then returns the next day to request the same tile, it could still result in a cache miss. Similarly, different POPs might serve two users, so their requests shouldn’t be deduplicated either.
In a day, I see 216,107,962 cache misses. After deduplication by tile, IP, user-agent, and time to the nearest hour, that drops to 214,960,391. This means my numbers will be off by about 0.5%, which is fine.
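As a rough illustration of that deduplication key, here is a small Python sketch. The record layout (tuples of tile, IP, user-agent, and hour bucket) and the function name `dedup_requests` are assumptions made for this example; the real work happens in the Athena query. The last lines redo the arithmetic on the two counts quoted above.

```python
def dedup_requests(requests):
    """Keep one request per (tile, ip, useragent, hour) combination.

    `requests` is an iterable of tuples; this layout is hypothetical.
    """
    seen = set()
    unique = []
    for request in requests:
        if request not in seen:
            seen.add(request)
            unique.append(request)
    return unique


# Four raster tiles that map to vector tile 9/0/0, same client, same hour,
# plus the same client coming back an hour later.
sample = [
    ((9, 0, 0), "192.0.2.1", "ExampleBrowser", 1),
    ((9, 0, 0), "192.0.2.1", "ExampleBrowser", 1),
    ((9, 0, 0), "192.0.2.1", "ExampleBrowser", 1),
    ((9, 0, 0), "192.0.2.1", "ExampleBrowser", 1),
    ((9, 0, 0), "192.0.2.1", "ExampleBrowser", 2),
]
print(len(dedup_requests(sample)))

# Share of cache misses removed by deduplication, using the counts from the text.
misses = 216_107_962
deduplicated = 214_960_391
reduction_percent = (misses - deduplicated) / misses * 100
print(f"{reduction_percent:.2f}%")
```

The four simultaneous requests collapse into one, while the request in the following hour is kept, which matches the reasoning that a later visit can still be a cache miss.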
Using this deduplication, along with a bit of string formatting, I get a list of URLs to test.
WITH tiles AS (
  SELECT
    -- Same raster-to-vector mapping as before, over the whole day.
    CASE WHEN z > 15 THEN 14 ELSE z - 1 END AS v_z,
    bitwise_right_shift(x, CASE WHEN z > 15 THEN z - 14 ELSE 1 END) AS v_x,
    bitwise_right_shift(y, CASE WHEN z > 15 THEN z - 14 ELSE 1 END) AS v_y,
    useragent, ip, year, month, day, hour
  FROM fastly_success_logs_v1
  WHERE year = 2025 AND month = 5 AND day = 1
    AND z >= 1
    AND cachehit = 'MISS'
)
SELECT concat('https://dribble.openstreetmap.org/shortbread_v1/', CAST(v_z AS varchar), '/', CAST(v_x AS varchar), '/', CAST(v_y AS varchar), '.mvt')
FROM tiles
-- Grouping deduplicates by tile, client, and hour.
GROUP BY v_z, v_x, v_y, useragent, ip, year, month, day, hour
Exporting the results and stripping quotes from the CSV format, I get a 1GB list of tiles.
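One way to strip the quotes is with Python's `csv` module, sketched below. This is not necessarily how the list was produced. The function name `urls_from_csv` is hypothetical, and the sketch assumes the export has a single quoted column with one header row; the header name `_col0` in the sample is also an assumption.

```python
import csv
import io


def urls_from_csv(csv_file, has_header=True):
    """Yield the first column of each row with the CSV quoting removed."""
    reader = csv.reader(csv_file)
    if has_header:
        next(reader, None)  # skip the column-name row, if the export has one
    for row in reader:
        if row:  # ignore blank lines
            yield row[0]


# Small in-memory stand-in for the exported file.
sample = io.StringIO(
    '"_col0"\n'
    '"https://dribble.openstreetmap.org/shortbread_v1/9/0/0.mvt"\n'
    '"https://dribble.openstreetmap.org/shortbread_v1/14/31/62.mvt"\n'
)
for url in urls_from_csv(sample):
    print(url)
```

Because the function is a generator, it reads the file row by row, so a 1GB export would not need to fit in memory.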
For my initial load test I tried Siege. Because the 1GB list of tiles was too big for Siege to handle, I pulled out the top million URLs and used Siege’s option to select random URLs from the file.
siege -f 20250501.txt -b --no-parser -r once --internet -c 200
Quick tests show the performance levels off at around 100 tiles rendered per second or 1.5k tiles served per second. At this point, I hit the limits of Siege and my testing machine. This is better than a raster tile server, so it’s sufficient for an initial release.
Discussion
Comment from Firefishy on 11 June 2025 at 11:43
Nice work!
Comment from avena701 on 25 June 2025 at 20:25
I’d be interested to know whether you’ve evaluated the client-side performance (user experience) of the tilekiln-shortbread-demo. I’m quite eager to use this technology and have set up a slightly modified clone of the demo for a specific application for a friend. However, he finds that it loads too slowly (compared to other maps on the same hardware) and has asked me to use a raster map instead. 😕
I’ve also noticed that the tilekiln-shortbread-demo (https://pnorman.github.io/tilekiln-shortbread-demo) performs noticeably worse on my hardware than other maps, but still within somehow acceptable limits (1-4s). On some browsers (e.g. Bromite on my Fairphone / Chromium on Desktop), it can even take around 20–30 seconds to load.
I don’t know where to look for bottlenecks and if there is anything I can do on the web developer side?