I’m working on publishing a summary of sites using tile.osm.org and want to know what format would be most useful for people.
The information I’ll be publishing is requests/second, requests/second that were cache misses, and domain. The first two are guaranteed to be numbers, while the last one is a string that will typically be a domain name like www.openstreetmap.org, but could in theory be a hostile value, such as one containing a space.
The existing logs, which list tiles and their request counts, are formatted as z/x/y N, where z/x/y are the tile coordinates and N is the number of accesses.
My first thought was TPS TPS_MISS DOMAIN, space-separated like the existing logs. This would work, with the downside that it’s not very future-proof. Because the domain can theoretically contain a space, it has to be last. This means that any future additions will require re-ordering the columns, breaking existing usage. Additionally, I’d really prefer to have the domain at the start of the line.
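To make the constraint concrete, here is a minimal sketch of how a consumer might read the space-separated layout in Python. The function name parse_line and the "bad domain" value are made up for illustration. Limiting the split to the first two spaces is what keeps a domain with a space in it intact, and it only works because the domain is the last column.

```python
def parse_line(line):
    """Parse one 'TPS TPS_MISS DOMAIN' line into (tps, tps_miss, domain)."""
    # Split on at most two spaces; any later spaces stay inside the domain.
    tps, tps_miss, domain = line.rstrip("\n").split(" ", 2)
    return float(tps), float(tps_miss), domain


print(parse_line("1453.99 464.1 www.openstreetmap.org"))
# Made-up hostile value: the space survives only because the domain is last.
print(parse_line("1.5 0.25 bad domain"))
```

Adding a fourth column after the domain would break this approach, since there would be no way to tell where the domain ends.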
A couple of options are
- CSV, with escaping
- tab-delimited
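As a rough sketch of what the CSV option could look like for a Python consumer, the snippet below writes and reads domain-first rows with the standard csv module. The column order and the hostile second row are assumptions for illustration, not a decided format. The writer quotes a field only when it contains a delimiter, quote, or newline, so ordinary domains stay unquoted.

```python
import csv
import io

# Assumed column order: domain, requests/second, cache-miss requests/second.
rows = [
    ("www.openstreetmap.org", 1453.99, 464.1),
    ('bad domain, with "quotes"', 1.5, 0.25),  # made-up hostile value
]

# Write the rows; only the second domain needs quoting.
buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerows(rows)
text = buf.getvalue()
print(text)

# Read them back; the csv module undoes the quoting.
parsed = [(d, float(t), float(m)) for d, t, m in csv.reader(io.StringIO(text))]
print(parsed)
```

Passing delimiter="\t" to both csv.writer and csv.reader gives a tab-delimited variant with the same quoting behaviour. In either case the domain can sit in the first column and new columns can be appended later.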
Potential users, what would work well with the languages and libraries you prefer?
An example of the output right now is
1453.99 464.1 www.openstreetmap.org
310.3 26.29 localhost
136.46 39.68 dro.routesmart.com
123.65 18.54 www.openrailwaymap.org
107.98 0.05 www.ad-production-stage.com
96.64 1.78 r.onliner.by
91.42 0.16 solagro.org
87.83 1.53 tvil.ru
84.88 12.98 eae.opekepe.gov.gr
74.0 2.32 www.mondialrelay.fr
63.44 1.93 www.lightningmaps.org
63.22 14.01 nakarte.me
55.1 0.74 qualp.com.br
52.77 11.25 apps.sentinel-hub.com
46.68 4.07 127.0.0.1
46.3 1.96 www.gites-de-france.com
43.47 1.15 www.anwb.nl
42.46 10.52 dacota.lyft.net
41.13 6.63 www.esri.com
40.84 0.69 busti.me