
As a US user, I don't run into many accented characters. I'm usually blissfully ignorant of character encoding issues, happily using the same ASCII codes as I did 25 years ago on my Apple II.

While loading the USGS points of interest, especially for Puerto Rico, I keep running into accented vowels and ñ (n~, hope I got it right here :-). The source data is encoded as ISO-8859-1, while JOSM works in UTF-8. I noticed this when I loaded my OSM file into JOSM and got little squares instead of legible letters.

No problem, I thought: I can add an encoding line to the start of the OSM file, saying it is ISO-8859-1. No variation of upper/lower case, or with/without dashes, made any difference. JOSM still showed little boxes for the accented characters. (This is JOSM version 1504.) I think JOSM only reads UTF-8 from disk files and doesn't obey the XML encoding specified in the file.
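For comparison, here is a minimal Python sketch of how a parser that does honor the XML encoding declaration reads ISO-8859-1 bytes. The one-node sample file and its values are made up for illustration, and this says nothing about how JOSM itself reads files.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-node OSM fragment, stored as ISO-8859-1 bytes:
# the "ñ" in "Peñuelas" becomes the single byte 0xF1.
raw = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    '<osm>'
    '<node id="-1" lat="18.06" lon="-66.72">'
    '<tag k="name" v="Peñuelas"/>'
    '</node>'
    '</osm>'
).encode("iso-8859-1")

# The parser reads the declaration and decodes the bytes accordingly.
root = ET.fromstring(raw)
name = root.find("node/tag").get("v")
print(name)  # Peñuelas
```

A reader that ignores the declaration and assumes UTF-8 would instead hit an invalid byte at the ñ, which matches the little boxes described above.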

A hex dump of the file showed single-byte values for the special characters, so I was sure it was ISO-8859-1 and not UTF-8.
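The same check can be sketched in Python. ISO-8859-1 stores ñ as the single byte F1, UTF-8 stores it as the two bytes C3 B1, and a lone F1 byte is not valid UTF-8. The helper name `looks_like_utf8` is made up for this example.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


latin1_bytes = "ñ".encode("iso-8859-1")
utf8_bytes = "ñ".encode("utf-8")

print(latin1_bytes.hex())             # f1
print(utf8_bytes.hex())               # c3b1
print(looks_like_utf8(latin1_bytes))  # False
print(looks_like_utf8(utf8_bytes))    # True
```

Note that a pure ASCII file passes this check too, since ASCII is a subset of UTF-8. The test only says something once accented characters are present.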

Plan B: make the file really UTF-8. In Vim, I loaded the file, then did

:set fileencoding=utf-8

and saved the file. That did the trick. Hexdump showed multi-byte characters, and JOSM showed them correctly on the screen. Problem solved.
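For anyone without Vim handy, the same conversion can be sketched in a few lines of Python. The function name and file names are placeholders, and the sketch assumes the input really is ISO-8859-1. Every byte sequence decodes without error under that encoding, so a file that was already UTF-8 would be silently double-encoded.

```python
from pathlib import Path


def latin1_to_utf8(src: str, dst: str) -> None:
    """Re-encode a file from ISO-8859-1 to UTF-8 (hypothetical helper)."""
    raw = Path(src).read_bytes()
    text = raw.decode("iso-8859-1")   # one byte per character
    Path(dst).write_bytes(text.encode("utf-8"))


# Example call, with placeholder file names:
# latin1_to_utf8("points_latin1.osm", "points_utf8.osm")
```

If the file starts with an XML declaration that still names ISO-8859-1, that line would need updating as well, since this sketch only changes the bytes.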


Discussion

Comment from Firefishy on 31 March 2009 at 07:41

Another way...
iconv - codeset conversion

$ iconv -f ISO-8859-1 -t UTF-8 [file]
