
Carnildo's Diary

Recent diary entries

Sigh

Posted by Carnildo on 23 September 2020 in English

I spend hours studying news reports and carefully tracing building outlines in Malden, noting which ones did or didn’t survive the Babb fire.

And then a bunch of wanna-be do-gooders come by and crap out HOT-quality mapping, complete with duplicate buildings, vehicles mapped as buildings, scribbled outlines, and nothing but self-congratulatory hashtags for edit summaries. I think I’m going to just revert the whole batch.

Location: Malden, Whitman County, Washington, 99149, United States

What the robots.txt file does

Posted by Carnildo on 24 June 2019 in English

Disclaimer: I am not an OSM website developer. All information here was obtained by looking at the OSM GitHub repository and poking at the OSM website.

There’s been some controversy recently over the contents of the OpenStreetMap robots.txt file. I think it might be informative to look at what the file actually does.

Allow: /user/

This does nothing. “Allow” lines in a robots.txt file permit the crawling of URLs that would otherwise be denied, but there’s nothing in the file that would deny the /user hierarchy.
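One way to check this claim is with Python's standard-library `urllib.robotparser`. The sketch below is only an illustration: the two rule sets are trimmed-down stand-ins rather than the full OSM file, the `User-agent: *` header is assumed, and `can_crawl`, `ExampleBot`, and the sample paths are hypothetical names made up for the example.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Illustrative rule sets (assumption: not the complete OSM robots.txt).
WITH_ALLOW = """\
User-agent: *
Allow: /user/
Disallow: /api/
"""

WITHOUT_ALLOW = """\
User-agent: *
Disallow: /api/
"""

if __name__ == "__main__":
    for label, rules in (("with Allow", WITH_ALLOW), ("without Allow", WITHOUT_ALLOW)):
        # A hypothetical user page and a hypothetical API path.
        print(label, can_crawl(rules, "/user/Carnildo"), can_crawl(rules, "/api/0.6/map"))
```

Under both rule sets the user page comes back as crawlable, since no Disallow rule covers it, so removing the Allow line changes nothing for that path.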

Disallow: /traces/tag/
Disallow: /traces/page/

These are various alternate ways of searching the GPS traces that have been uploaded to the site. The main trace listing is still accessible.
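Disallow rules match by path prefix, which is why the main listing stays reachable. Here is a rough sketch using Python's `urllib.robotparser`; the `User-agent: *` header, the `blocked_paths` helper, the `ExampleBot` agent name, and the sample paths are all assumptions for illustration, not taken from the site.

```python
from urllib.robotparser import RobotFileParser

# Only the two trace-related rules, under an assumed catch-all user agent.
TRACE_RULES = """\
User-agent: *
Disallow: /traces/tag/
Disallow: /traces/page/
"""

# Hypothetical paths: the main listing plus one URL under each blocked prefix.
SAMPLE_PATHS = ["/traces", "/traces/tag/hiking", "/traces/page/2"]


def blocked_paths(robots_txt: str, paths: list[str], agent: str = "ExampleBot") -> list[str]:
    """Return the subset of `paths` that `agent` may not fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(agent, p)]


if __name__ == "__main__":
    print(blocked_paths(TRACE_RULES, SAMPLE_PATHS))
```

Only the paths that begin with `/traces/tag/` or `/traces/page/` are reported as blocked; `/traces` itself matches neither prefix.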

Disallow: /trace/

This is the API endpoint for accessing GPS traces. It is not intended to be displayed in a web browser, and contains nothing useful for a search engine.

Disallow: /api/

This is the API endpoint for editing the map. It is not intended to be displayed in a web browser, and contains nothing useful for a search engine.

Disallow: /edit

This is the URL for the in-browser editor. Everything under this URL is behind a login barrier, and it contains nothing useful for a search engine.

Disallow: /message

This is the URL hierarchy for the on-site PM system. Everything under this URL is behind a login barrier, and it contains nothing useful for a search engine.

Disallow: /login

This is the above-mentioned login barrier. It contains nothing useful for a search engine.

Disallow: /history

This is the visual history browser. The contents change far too rapidly to meaningfully index on a search engine.

Disallow: /geocoder

This is the on-site search system. Search engines searching search engines never ends well.

Disallow: /browse

Disallow: /*lat=
Disallow: /*node=
Disallow: /*way=
Disallow: /*relation=
