Weird encoding error when exporting data in species_guess

Using the export page linked below, the species_guess column sometimes contains seemingly random, incoherent symbols. I haven't looked for a pattern in which observations are affected, but the result is consistent from one export to the next: the same observation always shows the same symbols. This is certainly an encoding error of some sort. Of my 1000 most recent observations, 18 had this error.

https://www.inaturalist.org/observations/export

Using this query string, the first row of the export should show the error:
q=urochloa+mutica&search_on=names&quality_grade=any&identifications=any&user_id=kevinfaccenda

In Excel:
[image]
In Notepad:
[image]

Curiously, pasting the Chinese characters into Google reveals that they are the Chinese common name of the observed taxon. How the heck did that get into a column of Latin names?

This is not an iNat encoding error: as you demonstrate, the characters are displayed correctly in Notepad. The issue is Excel failing to read the UTF-8 characters. Rather than opening the CSV file directly, use Excel's data/text import wizard and set the file's encoding to UTF-8.
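
If the import wizard is inconvenient, another option is to re-save the export with a UTF-8 byte-order mark, which Excel uses to detect the encoding when a CSV is opened directly. This is only a minimal sketch: it assumes the export really is UTF-8 (as the Notepad screenshot suggests), and the function name and file names are placeholders, not anything iNaturalist provides.

```python
def add_utf8_bom(src, dst):
    """Copy a UTF-8 CSV to dst with a leading byte-order mark (BOM).

    Assumption: src is UTF-8. "utf-8-sig" strips a BOM on read if one is
    already present and always writes one, so the output has exactly one.
    """
    # newline="" keeps the original line endings untouched.
    with open(src, "r", encoding="utf-8-sig", newline="") as f:
        text = f.read()
    with open(dst, "w", encoding="utf-8-sig", newline="") as f:
        f.write(text)


# Hypothetical file names, for illustration only:
# add_utf8_bom("observations-export.csv", "observations-export-excel.csv")
```

The data itself is unchanged; only three bytes are added at the start of the file.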


That sounds like it’s similar to this bug report from 2020 regarding common names downloading in random languages:

https://forum.inaturalist.org/t/checklists-download-in-random-languages-despite-no-changes-to-settings-seen-in-the-taxon-common-name-field/10980

Fair enough. However, there still should not be Chinese characters in a field that holds Latin names, so there is still a bug here.
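
To look for a pattern in which observations are affected, something like the following could list the rows whose species_guess contains non-ASCII characters. It is a rough sketch, assuming the export is a UTF-8 CSV with a header row; the function name and file name are made up for illustration. Note that it would also flag legitimate non-ASCII text, such as a hybrid sign (×) or accented common names.

```python
import csv


def rows_with_non_ascii(path, column="species_guess"):
    """Return the CSV rows (as dicts) whose `column` value is not pure ASCII.

    Assumption: the file is UTF-8, with or without a BOM, and has a header row.
    """
    with open(path, encoding="utf-8-sig", newline="") as f:
        return [
            row
            for row in csv.DictReader(f)
            # Missing or empty values count as ASCII and are skipped.
            if not (row.get(column) or "").isascii()
        ]


# Hypothetical file name, for illustration only:
# for row in rows_with_non_ascii("observations-export.csv"):
#     print(row["species_guess"])
```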