It starts with the planet - downloading OSM the right way
Posted by pnorman on 17 January 2018 in English. This is a repost of an entry on my blog.
To do something with OpenStreetMap data, we have to download it first. This can be the entire dataset from planet.openstreetmap.org or a smaller extract from a provider like Geofabrik. If you’re doing this manually, it’s easy: a single curl or wget command will do, or you can download it from the browser. If you want to script it, it’s a bit harder. You have to worry about error conditions and what can go wrong, and make sure everything can run unattended. So, to make sure we can do this, we write a simple bash script.
The goal of the script is to download the OSM data to a known file name, and return 0 if successful, or 1 if an error occurred. Also, to keep track of what was downloaded, we’ll make two files with information on what was downloaded and what state it’s in: state.txt and configuration.txt. These will be compatible with osmosis, the standard tool for updating OpenStreetMap data.
Before doing anything else, we specify that this is a bash script, and that if anything goes wrong, the script is supposed to exit.
#!/usr/bin/env bash
set -euf -o pipefail
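To see why these options matter: -e exits on a failing command, -u treats unset variables as errors, -f disables filename globbing, and pipefail makes a pipeline fail if any stage fails. The following is a minimal sketch of the pipefail part only, not part of the download script; the variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Illustration only: variable names here are hypothetical.

# Without pipefail, a pipeline's status is that of its last command,
# so the failure of `false` is hidden by the succeeding `true`.
set +o pipefail
false | true
without_pipefail=$?

# With pipefail, any failing stage makes the whole pipeline fail.
set -o pipefail
with_pipefail=0
false | true || with_pipefail=$?

echo "without pipefail: ${without_pipefail}, with pipefail: ${with_pipefail}"
```

This matters later because a failed download piped into another command would otherwise go unnoticed.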
Next, we put the information about what’s being downloaded, and where, into variables. It’s traditional to use the Geofabrik Liechtenstein extract for testing, but the same scripts will work with the planet.
PLANET_FILE='data.osm.pbf'
PLANET_URL='http://download.geofabrik.de/europe/liechtenstein-latest.osm.pbf'
PLANET_MD5_URL="${PLANET_URL}.md5"
We’ll be using curl to download the data, and every time we call it, we want to add the options -s and -L. Respectively, these make curl silent and cause it to follow redirects. Two files are needed: the data, and its md5 sum. The md5 file looks something like 27f7... liechtenstein-latest.osm.pbf. The problem with this is that we’re saving the file as $PLANET_FILE, not liechtenstein-latest.osm.pbf. A bit of manipulation with cut fixes this.
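As a rough sketch of that manipulation, assuming the md5 file holds a hash, whitespace, and the original file name: the hash below is a made-up placeholder, and md5_line, md5_hash and md5_fixed are names chosen for this example only. It runs without network access so the cut step can be checked in isolation.

```shell
#!/usr/bin/env bash
set -euf -o pipefail

PLANET_FILE='data.osm.pbf'

# Placeholder standing in for the downloaded .md5 contents (not a real checksum).
md5_line='0123456789abcdef0123456789abcdef  liechtenstein-latest.osm.pbf'

# Keep only the first space-delimited field, which is the hash.
md5_hash="$(printf '%s\n' "${md5_line}" | cut -d' ' -f1)"

# Rebuild the line in the "<hash>  <file>" form, using the local file name.
md5_fixed="${md5_hash}  ${PLANET_FILE}"
printf '%s\n' "${md5_fixed}"
```

In the real script, the placeholder line would presumably be replaced by the output of curl -s -L "${PLANET_MD5_URL}", and the rebuilt line written to a file that md5sum -c can check against $PLANET_FILE.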