There are a bunch of duplicate lexica that require continual maintenance because any user can add a new lexicon.
At this point there should probably be some sort of staff approval process, so we don’t end up with the current situation: separate entries in the dropdown for “Spanish”, “Español”, “Español (Chile)”, “Español chileno”, “Spanish (Peru)”, and both “Russian” and “русский”.
There should be just one Russian and the languages should be translated in the dropdown based on the viewer’s language, or display the native name next to the translated name, e.g. “Russian / русский” or “Inglese / English”.
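To make the labelling idea concrete, here is a minimal sketch of how a dropdown label could be built. The function name, the translation tables and the locale codes are all made up for illustration; this is not how the site actually stores or renders lexicon names.

```python
# Hypothetical translation tables: viewer locale -> name of the language in that locale.
RUSSIAN = {"en": "Russian", "it": "Russo", "ru": "русский"}
ENGLISH = {"en": "English", "it": "Inglese"}


def lexicon_label(native_name: str, translations: dict, viewer_locale: str) -> str:
    """Return 'Translated / native', or just the native name when both are the same
    or no translation exists for the viewer's locale."""
    translated = translations.get(viewer_locale)
    if translated is None or translated == native_name:
        return native_name
    return f"{translated} / {native_name}"


print(lexicon_label("русский", RUSSIAN, "en"))  # Russian / русский
print(lexicon_label("English", ENGLISH, "it"))  # Inglese / English
print(lexicon_label("English", ENGLISH, "en"))  # English
```

With something like this there is only one stored lexicon per language, and what differs per viewer is the label.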
If a common name needs to be applied to a particular place, there is already a system for that. Rather than creating a new lexicon for Peruvian Spanish, assign the name to Spanish and set it as default to a particular place: Peru.
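As a rough illustration of that suggestion, the sketch below maps the regional and duplicate lexicon labels mentioned above onto one canonical lexicon plus an optional place. The alias table and the `canonicalize` helper are hypothetical, written only to show the shape of the idea, and say nothing about the site's real data model.

```python
from typing import Optional, Tuple

# Hypothetical alias table: label as entered -> (canonical lexicon, place or None).
LEXICON_ALIASES = {
    "Español": ("Spanish", None),
    "Español (Chile)": ("Spanish", "Chile"),
    "Español chileno": ("Spanish", "Chile"),
    "Spanish (Peru)": ("Spanish", "Peru"),
    "русский": ("Russian", None),
}


def canonicalize(lexicon: str) -> Tuple[str, Optional[str]]:
    """Return (canonical lexicon, place); unknown labels pass through with no place."""
    cleaned = lexicon.strip()
    return LEXICON_ALIASES.get(cleaned, (cleaned, None))


print(canonicalize("Spanish (Peru)"))  # ('Spanish', 'Peru')
print(canonicalize("русский"))         # ('Russian', None)
print(canonicalize("Spanish"))         # ('Spanish', None)
```

A name entered under “Spanish (Peru)” would then be stored as a Spanish name with Peru as the place where it is the default.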
Can’t you add the most common lexicons in advance and block access for normal users?
(We have 40 lexicons now).
I don’t think that’s a good idea. There are a lot of common names in ethnic languages used mainly by indigenous people, and normal users should be allowed to request new lexicons for these common names.
With a new approval system, normal users can just flag a curator to add the new lexicon they want.
There are several hundred lexicons available. There are 40 languages into which the site has been translated. They are not the same thing.
I think the problem is getting worse. A brief look through the current list shows:
[note that Australia has many indigenous languages]
Español (Costa Rica)
Modern Greek (1453 )
Sotho ( Northern)
Manx English Dialect
And that was just what stood out as probably duplicates/wrong (though some might be legitimate, apologies in advance if that’s the case).
I think we’re going to go with this (https://github.com/inaturalist/inaturalist/issues/2675). Should solve most of these issues and not require extra staff time.
So no cleanup will be done? Maybe start with the lexicons in Jwideness’s list above that have fewer than 25 records…
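For what it’s worth, finding cleanup candidates under a record threshold is simple if you have the lexicon of every name as a plain list. The sketch below assumes exactly that; the function name is hypothetical and the counts in the example are invented, not real site data.

```python
from collections import Counter


def rarely_used_lexicons(name_lexicons, threshold=25):
    """Return (lexicon, count) pairs, sorted by lexicon, for lexicons used by
    fewer than `threshold` names."""
    counts = Counter(name_lexicons)
    return sorted((lex, n) for lex, n in counts.items() if n < threshold)


# Invented example data: one entry per common name, holding that name's lexicon.
sample = ["Spanish"] * 30 + ["Español (Costa Rica)"] * 3 + ["Indigenous"]
print(rarely_used_lexicons(sample))
# [('Español (Costa Rica)', 3), ('Indigenous', 1)]
```

A list like this would only be a starting point for review, since a small lexicon can still be a legitimate one.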
There’s a name for muskrat in a lexicon called “Indigenous”, which says almost nothing about where this name is from.
I think Botswana still has 40 lexicons yet to be added to iNat. That is just one country.
A newly added lexicon that is on this list should be accepted without requiring approval.
Perhaps this list only shows the standard 3-letter symbols for language names and not a standardised American English name for a language. Fortunately most languages are not on this list.
A difficulty I am having with this: in trying to learn the local names for taxa in the Dominican Republic, I have my settings set to show Spanish, with the location specified as Dominican Republic. Unfortunately, I am now seeing Spanish names for all sorts of taxa which do not occur in the Dominican Republic. My guess is that if there is no name specific to the Dominican Republic, it defaults to the general Spanish name, which is not what I am looking for. Adding to the difficulty is that unless I go to each taxon page, I have no way of knowing if the Spanish name I see is the one used in the Dominican Republic, or one used in the Spanish-speaking world generally.
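The behaviour I am guessing at, and the two things I would want instead (knowing whether a name is place-specific, and being able to switch the fallback off), could look roughly like the sketch below. The `CommonName` class, the `resolve_name` function and the placeholder names are all hypothetical; this is my assumption about the logic, not the site’s actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonName:
    text: str
    lexicon: str
    places: frozenset = frozenset()  # places where this name is the default


def resolve_name(names, lexicon, place, allow_fallback=True):
    """Return (name, is_place_specific); (None, False) when nothing applies."""
    in_lexicon = [n for n in names if n.lexicon == lexicon]
    # Prefer a name explicitly set as the default for the requested place.
    for n in in_lexicon:
        if place in n.places:
            return n.text, True
    # Otherwise optionally fall back to any name in the same lexicon.
    if allow_fallback and in_lexicon:
        return in_lexicon[0].text, False
    return None, False


# Placeholder names, not real common names.
names = [
    CommonName("nombre general", "Spanish"),
    CommonName("nombre dominicano", "Spanish", frozenset({"Dominican Republic"})),
]
print(resolve_name(names, "Spanish", "Dominican Republic"))          # ('nombre dominicano', True)
print(resolve_name(names, "Spanish", "Peru"))                        # ('nombre general', False)
print(resolve_name(names, "Spanish", "Peru", allow_fallback=False))  # (None, False)
```

The second value in the result is the piece I am missing today: a way to tell a place-specific name from a general fallback without opening each taxon page.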
I could see this difficulty being a reason why someone might have added those separate regional lexicons in the first place.