After many years, I finally decided to dive back into running a full planet import using the Overpass API.
To keep things budget-friendly, I rented a modest Intel-based server for around 40€/month — nothing fancy, just dual data center SSDs and 64GB of ECC RAM.
What surprised me was the runtime: the initial import using release 0.7.58.5 took a whopping 33.5 hours — at least 10 hours longer than I had anticipated. I managed to shave that down to 26.5 hours by tweaking some settings, like enabling LZ4 compression across the board and increasing the chunk size parameter. It helped, but clearly, there’s still room for optimization.
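For reference, a minimal sketch of what such a tuned import invocation can look like with the stock tooling. The flag names (`--compression-method`, `--map-compression-method`, `--flush-size`) reflect my understanding of the release's `update_database` tool; the paths, planet file name, and flush size are placeholders, so check your own build before running:

```shell
# Hedged sketch: initial planet import with LZ4 compression enabled
# across the board. Paths and the flush size are placeholders.
DB_DIR=/opt/overpass/db

bunzip2 -c planet-latest.osm.bz2 \
  | ./bin/update_database --db-dir="$DB_DIR" --meta \
      --compression-method=lz4 --map-compression-method=lz4 \
      --flush-size=16
```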
I continued testing with my own experimental Overpass fork, which includes support for PBF, multithreading, and many other changes under the hood. Initial measurements looked quite promising, with a total runtime of 10.5 hours. After some further analysis and improvements to a few data structures, the import took 7 hours and 23 minutes. Peak memory consumption was still acceptable at 22 GB. I tried different settings to achieve lower memory consumption at the cost of longer processing time (e.g. 8 hours with 13 GB peak memory).
Depending on compression settings, the final planet database was in the range of 230–265 GB.
Detailed results are available on this wiki page: osm.wiki/User:Mmd/Planet_import_challenge_22
That’s all for today.
Discussion
Comment from pnorman on 12 October 2022 at 08:11
Does overpass not read the history PBFs for historical data?
Comment from mmd on 12 October 2022 at 19:12
Historical object versions are a challenging topic. I reran some tests, starting with a 2012 planet and subsequently applying daily diffs in PBF format. I ended up processing the years 2012–2017 at about 600x speed (1 day = 600 OSM days), with a package size of 3 days. Using the official release is much slower and works with XML files only.
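As a rough illustration, applying a directory of (XML) change files with the stock release could look something like the sketch below. The `update_from_dir` tool and its `--osc-dir` and `--version` flags are my assumptions about the stock tooling; the PBF diff path is specific to the experimental fork and not shown here:

```shell
# Hedged sketch: apply .osc change files from a directory to an
# existing db. All paths and the version timestamp are placeholders.
./bin/update_from_dir --osc-dir=/opt/overpass/diffs \
    --db-dir=/opt/overpass/db --version=2012-01-04 --meta
```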
Using a full history planet instead doesn’t work. The importer wasn’t designed with this use case in mind and uses way too much memory.
Comment from PierZen on 14 October 2022 at 18:24
Hi mmd, for those who are not familiar with these programming tools, could you describe what type of import is done exactly with the OSM planet file: is it loaded into an Overpass db, or something else?
Comment from mmd on 15 October 2022 at 14:03
The goal of the import is to set up a new Overpass db using a recent OSM planet file. As usual, the import process writes all nodes, ways and relations to disk, and includes object metadata (user, timestamp, object version number, …). Once the import has finished, you can use the db to run some queries.
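Once the import has finished, a query can be run directly against the local db with `osm3s_query`, which reads Overpass QL from stdin. The bounding box, tag filter, and paths below are arbitrary examples:

```shell
# Hedged sketch: run an Overpass QL query against the freshly
# imported local db. Paths and the query itself are placeholders.
echo '[out:json];node["amenity"="drinking_water"](50.7,7.1,50.8,7.2);out;' \
  | ./bin/osm3s_query --db-dir=/opt/overpass/db
```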
In case you don’t want to go through this process, there are also database clones available for download (see the docs for details).