
Google Summer of Code 2022

Posted by tareqpi on 24 May 2022 in English

Hi everyone, my name is Tareq Al-Ahdal. I am a computer science undergraduate student at Universiti Teknologi Malaysia. Recently, I got accepted into Google Summer of Code 2022 as an open source contributor with OpenStreetMap. This summer I will work on enhancing Nominatim, OpenStreetMap’s geocoding software, which lets us find the address of a location from its name and vice versa.

Nominatim currently ranks search results using a computed importance value that reflects each location’s perceived importance. This value is derived from the popularity of the location’s Wikipedia article. However, not every location on earth has its own Wikipedia article. Locations without one have no importance value, so the ranking of search results that include them is inaccurate. OpenStreetMap has data on the number of times users accessed each location on the map, which is a good indicator of how popular a place is. The aim of my work is to integrate this data into Nominatim’s computation of the importance value so that search results become more accurate, helping users find the places they are looking for in less time.
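As a rough illustration of the idea above, here is a minimal Python sketch of one possible way to fold view counts into an importance value. It is not Nominatim’s actual formula: the function names, the log scaling, and the rule of falling back to views only when no Wikipedia-based value exists are all assumptions made for this example.

```python
import math
from typing import Optional


def views_to_importance(views: int, max_views: int) -> float:
    """Map a raw view count onto [0, 1] with a log scale (assumed normalisation)."""
    if views <= 0 or max_views <= 0:
        return 0.0
    return min(1.0, math.log1p(views) / math.log1p(max_views))


def combined_importance(
    wiki_importance: Optional[float], views: int, max_views: int
) -> float:
    """Hypothetical rule: keep the Wikipedia-based value when it exists,
    otherwise derive an importance from how often the place was viewed."""
    if wiki_importance is not None:
        return wiki_importance
    return views_to_importance(views, max_views)
```

With this sketch, a place that has no Wikipedia article but is viewed often still gets a non-zero importance instead of none at all.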

I will use this diary to keep you updated about my work. Please feel free to reach out if you have any questions regarding my work or anything else you have in mind.

Location: Taman Tun Dr Ismail, Kuala Lumpur, 60000, Malaysia

Discussion

Comment from bryceco on 24 May 2022 at 18:01

If anyone else is curious about the map access statistics referenced here, it’s the number of times each map tile has been served by the tile server: https://planet.openstreetmap.org/tile_logs/

Comment from tareqpi on 24 May 2022 at 21:46

Yes, thank you @bryceco 🙌

Comment from mmd on 25 May 2022 at 15:11

Tiles are mostly served by the Fastly CDN these days. IIRC, tile_logs only includes tiles which haven’t been cached by the CDN and need to be re-rendered. This might result in quite skewed data. I’d highly recommend getting in touch with @pnorman to discuss those topics before starting actual implementation work.

Comment from tareqpi on 25 May 2022 at 22:57

Ok, I will talk to him. Thank you @mmd

Comment from pnorman on 27 May 2022 at 05:10

No, the tile logs are from the CDN and are an accurate count of successful requests for tiles. Prior to 2021-04-13 the logs were only from the second layer of the old CDN.

The logs include successful requests on tiles where there were at least 10 requests, and the requests came from at least 3 distinct IPs. Most of them represent real views, but there are some artifacts.
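The thresholds described above can be sketched in Python as follows. This only illustrates the rule and is not the code that produces the published logs; the function name, the constant names, and the input shape (pairs of tile path and client IP) are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

MIN_REQUESTS = 10      # a tile needs at least this many successful requests
MIN_DISTINCT_IPS = 3   # and requests from at least this many distinct IPs


def publishable_tiles(requests: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count requests per tile and keep only tiles that pass both thresholds.

    `requests` is a hypothetical stream of (tile, client_ip) pairs,
    one pair per successful request.
    """
    counts: Dict[str, int] = defaultdict(int)
    ips: Dict[str, Set[str]] = defaultdict(set)
    for tile, ip in requests:
        counts[tile] += 1
        ips[tile].add(ip)
    return {
        tile: n
        for tile, n in counts.items()
        if n >= MIN_REQUESTS and len(ips[tile]) >= MIN_DISTINCT_IPS
    }
```

A tile requested many times by only one or two clients is dropped, which is why very sparsely viewed areas do not appear in the logs at all.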
