When an image whose description metadata contains non-English text is uploaded to the iNaturalist website, the text imported into the “Note” field appears misencoded.
Steps to Reproduce
Take a photo and set the EXIF “rdf:Description” metadata (named Title in Windows) to a non-English string (e.g., in Cyrillic, Chinese, or other non-Latin charset).
Upload the photo to the iNaturalist website via the standard upload interface.
Check the “Note” field associated with the uploaded image.
Expected Behavior
The non-English “rdf:Description” should be imported into the “Note” field correctly, preserving the original characters.
Observed Behavior
The text in the “Note” field is garbled, misencoded, or replaced with unrelated characters.
Additional Notes
The “rdf:Description” is UTF-8 by default, but it appears to be handled as if it were ANSI.
Have any steps been taken to address this bug? It’s quite surprising that in 2025, the site still does not handle UTF-8 correctly - something that has been standard for over two decades. Would you recommend that I open an issue on GitHub regarding this?
I just tested this, and it works fine for me. Are you sure the tool you’re using to edit the image metadata isn’t the problem? I used exiftool to add the relevant tags. To be clear, these are the XMP dc:description and dc:title tags, as defined here: XMP - Dublin Core Schema.
It would help if you could post the complete XMP data from an example image that demonstrates the problem (i.e. as produced by the tool you use for editing the title/description). In particular, it would be interesting to see the values of the xml:lang attributes (if any).
The exact exiftool command I used for setting the tags was:
PS: one thing I forgot to ask is what your system locale settings are. I don’t use Windows myself, but my understanding is that its default encoding isn’t UTF-8. Have you explicitly changed the settings yourself? The garbled text shown in your image looks very similar to what would be produced by decoding UTF-8 encoded text using the Windows cp1252 encoding. The locale on my Linux system is LANG=en_GB.UTF-8, which perhaps explains why I don’t have the problems you have.
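For illustration, the cp1252 theory is easy to check outside of iNaturalist. The following is a minimal Python sketch, not anything taken from the site. The sample string is an arbitrary assumption. It encodes the text as UTF-8 and then decodes the same bytes as cp1252, which shows the kind of garbling that results.

```python
# Sample text is an arbitrary choice for illustration only.
original = "Привет"

# Encode as UTF-8 (as the metadata is stored), then decode the same bytes
# with the wrong codec, as a cp1252-based reader would.
utf8_bytes = original.encode("utf-8")
garbled = utf8_bytes.decode("cp1252")

# Reversing the mistake recovers the text, which shows that no data was
# lost and only the decoding step was wrong.
recovered = garbled.encode("cp1252").decode("utf-8")

print(garbled)
print(recovered)
```

If the garbled output resembles what appears in the “Note” field, that would support the idea that the bytes themselves are intact and only the decoding is wrong.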
I attempted to set the description metadata using both Adobe Photoshop’s File Info tool and the standard Windows file property editor, both of which, to my understanding, generate de-facto standard XMP metadata. Locale is en-US. However, neither method was successful.
In production, I use Adobe Lightroom along with the Jeffrey Metadata Wrangler plugin, which reliably produces metadata that displays correctly in both Photoshop and Windows. Despite this, the metadata still fails to import properly on iNaturalist.
– Here I clean all metadata in the reference image
exiftool.exe -all= -overwrite_original test-noexif.jpg
copy test-noexif.jpg test-photoshop.jpg
copy test-noexif.jpg test-windows.jpg
– Then I edit test-photoshop.jpg with Adobe Photoshop → File Info and Save
exiftool.exe -xmp -w txt -b test-photoshop.jpg
– Then I edit test-windows.jpg with RightClick/Properties and Apply changes
exiftool.exe -xmp -w txt -b test-windows.jpg
The relevant tags are all encoded as UTF-8, but the iNaturalist uploader is attempting to decode some of them using Windows cp1252. (EDIT: actually, having now looked at the iNaturalist source code, it’s even less sophisticated than that. The raw bytes of some tags are just treated as ASCII text - regardless of the encoding - and all the code does is strip out any null bytes).
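To make that concrete, here is a rough Python sketch of the behaviour as I understand it. It is not the iNaturalist code. The function name is hypothetical, and mapping each byte to the character with the same code point is my assumption about what “treated as ASCII” amounts to in practice.

```python
def naive_tag_to_text(raw: bytes) -> str:
    """Hypothetical stand-in for the behaviour described above."""
    # Drop null bytes, then map every remaining byte to the character with
    # the same code point (equivalent to a Latin-1 decode). No attempt is
    # made to detect or honour the real encoding.
    return raw.replace(b"\x00", b"").decode("latin-1")


# Sample values are made up for illustration.
ascii_tag = b"Red fox\x00"
utf8_tag = "Лиса".encode("utf-8") + b"\x00"

print(naive_tag_to_text(ascii_tag))        # plain ASCII survives unchanged
print(repr(naive_tag_to_text(utf8_tag)))   # each UTF-8 byte becomes its own character
```

Under this assumption, pure ASCII values come through intact, while every multi-byte UTF-8 character is split into two or more unrelated characters.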
I had a look at the metadata of your image files, and found that the title/description/subject isn’t only written to the XMP section: there are also three other tags added in the EXIF section. The one that’s causing the problem is the ImageDescription tag. Once that’s removed, the uploader reads everything else correctly. The two other EXIF tags are XPTitle and XPSubject. These are Windows-specific, but according to the spec they can be encoded using either UTF-8 or ANSI, and have the data type int8u. By contrast, the ImageDescription tag has a data type of string, which only supports ASCII (see: Standard Exif Tags).
So the bug here is that UTF-8 encoded data is being written to a tag which does not really support it. Given that, it’s not unreasonable for the uploader to refuse to guess the encoding and just treat the data as ASCII text. It uses a third-party library (piexifjs) to load the EXIF data, so it would be understandable if the current behaviour wasn’t changed. In the meantime, I suppose the ideal work-around from your end would be to configure your image metadata editing software to just never write to the ImageDescription tag (assuming that’s possible).
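If the editing software can’t be configured that way, another option might be to strip the tag just before uploading. The following is only a sketch: the helper name is hypothetical, and it assumes exiftool is installed and on the PATH. It builds an exiftool command that deletes the EXIF ImageDescription tag and nothing else, so the XMP title and description are left alone.

```python
def build_strip_command(image_path: str) -> list[str]:
    """Build an exiftool command that deletes only EXIF:ImageDescription."""
    return [
        "exiftool",
        "-EXIF:ImageDescription=",  # assigning an empty value deletes the tag
        "-overwrite_original",      # same flag as in the commands earlier in the thread
        image_path,
    ]


if __name__ == "__main__":
    # test-photoshop.jpg is the file name used earlier in this thread.
    command = build_strip_command("test-photoshop.jpg")
    print(" ".join(command))
    # To actually run it (requires exiftool on the PATH):
    #   import subprocess
    #   subprocess.run(command, check=True)
```

It would be safest to try this on a copy of the image first, since -overwrite_original does not keep a backup file.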