Logotip de l'OpenStreetMap OpenStreetMap

As a US user, I don't run into many accented characters. I'm usually blissfully ignorant of character encoding issues, happily using the same ASCII codes as I did 25 years ago on my Apple II.

In loading the USGS points of interest, especially for Puerto Rico, I am encountering various accented vowels and ñ (n~, hope I got it right here :-). The source data is encoded as iso-8859-1, while JOSM works in utf-8. I noticed this when I loaded my OSM file into JOSM, and got little squares instead of legible letters.

No problem, I thought, I can add an encoding line to the start of the OSM file, saying it is iso-8859-1. No variations of upper/lower case or with/without dashes made any difference. It still came up with little boxes for accented chars in JOSM. (This is JOSM version 1504). I think JOSM only takes UTF-8 from disk files, and doesn't obey the XML encoding specified in the file.

Doing a hex dump on the file showed single-byte values for the special characters, so I was sure it was iso-8859-1 and not utf-8.

Plan B: make the file really UTF-8. In Vim, I loaded the file, then did

:set fileencoding=utf-8

and saved the file. That did the trick. Hexdump showed multi-byte characters, and JOSM showed them correctly on the screen. Problem solved.

Icona de correu electrònic Icona de Bluesky Facebook Icon Icona de LinkedIn Icona de Mastodon Icona de Telegram Icona de X

Discussió

Comentari de Firefishy el 31 Març 2009 a les 07.41

Another way...
iconv - codeset conversion

$ iconv -f ISO-8859-1 -t UTF-8 [file]

Inicia sessió per a fer un comentari