Clean up currently available lexicons

Si lozi = Silozi


synonyms = {
“Aou 4 Letter Codes” => [“Aou 4 Letter Codes”],

Another (recently added?) duplicate lexicon that needs to be merged into Scientific Names is Scientific Name. Its existence makes it take a little longer to tab through the new taxon name page when adding scientific name synonyms.


Made a GitHub issue here:


Setswapong is correct; Setswapng is a misspelling that should be removed.


There are at least two names with the Setswapng lexicon:

Selete (Botswana) → Selete?


I’m looking into this today. I have a script that does a lot of this stuff, so I’m trying to update that script and to automate it so it runs once a month or something.

However, it’s not at all clear to me what should be considered the canonical lexicon for any given set of synonyms. Is it reasonable to just choose the first name for the language on the English Wikipedia page for that language? For example, that would mean “Kwangali,” “RuKwangali,” and “Ru Kwangali” would all become “Kwangali.”

I’m not going to do anything about “Luo” if there are multiple languages with that name. There are only ~12 names with that lexicon. It seems like people have been using “Tongan” for the Pacific island language and “Tonga” for the African language, so I don’t think there’s anything to do there.

Lexicons like “Und” and “Indigenous” will just be left alone since they need manual attention. I guess I could synonymize “Creole (English)” and “English (Creole)” but they’re both equally useless, right? There are a bunch of English creoles out there and presumably they could have different names for the same taxa.
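For illustration only, the approach described above could be sketched roughly like this. This is not the actual script: the constant names, the `canonical_lexicon` helper, and the choice of canonical names are all assumptions made for the example, using lexicons mentioned in this thread.

```ruby
# Hypothetical data: canonical lexicon => variants that should be merged into it.
SYNONYMS = {
  "Kwangali" => ["RuKwangali", "Ru Kwangali"],
  "Silozi" => ["Si lozi"],
  "Scientific Names" => ["Scientific Name"]
}.freeze

# Lexicons that need manual attention and are never rewritten automatically.
SKIP = ["Und", "Indigenous", "Luo"].freeze

# Invert the hash once so each variant points at its canonical lexicon.
CANONICAL_FOR = {}
SYNONYMS.each do |canonical, variants|
  variants.each { |variant| CANONICAL_FOR[variant] = canonical }
end

# Returns the canonical lexicon, or the input unchanged when it is
# on the skip list or not a known variant.
def canonical_lexicon(lexicon)
  return lexicon if SKIP.include?(lexicon)
  CANONICAL_FOR.fetch(lexicon, lexicon)
end

puts canonical_lexicon("Ru Kwangali")
puts canonical_lexicon("Tongan")
```

Anything not listed passes through untouched, so “Tonga” and “Tongan” stay separate unless someone adds them on purpose.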


I had a quick look at the script, so I may have missed it.
Occasionally there are names that don’t have a lexicon, and as far as I’m aware there is no way to search for them. Would it be possible to assign them to something like ‘Und’ or ‘Other’ so they can be found in the DwC-A export for correction where possible?


A few more for the script
Español (Uruguay)
Wissenschaftliche Namen → Scientific Names

український → Ukrainian ?
臺灣閩南語 → Taiwanese Hokkien ?
Swahili (Individual Language) → Swahili ?


It should be impossible to create taxon names without a lexicon, but there are some that exist from before we instituted that rule. I don’t want to give them a lexicon of “Und” or something because that basically violates the rule, but I can include them in the DwC-A under the language und (a real ISO 639 code!). That would meet your need, right?
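As a rough sketch of that idea, assuming a simple record shape: the `TaxonName` struct, the `LANGUAGE_CODES` table, and `export_language` are all hypothetical names, not the real export code. Falling back to `und` for a lexicon with no known code is also an assumption of this example.

```ruby
# Hypothetical record shape; the real taxon name model is not shown in this thread.
TaxonName = Struct.new(:name, :lexicon)

# Hypothetical lexicon-to-language-code table with a few illustrative entries.
LANGUAGE_CODES = {
  "English" => "en",
  "German" => "de",
  "Swahili" => "sw"
}.freeze

# Language code to write in the export. The stored lexicon is left as-is;
# names with no lexicon (nil or blank) get "und", the ISO 639 code for
# "undetermined". Unknown lexicons also get "und" (an assumption here).
def export_language(taxon_name)
  lexicon = taxon_name.lexicon.to_s.strip
  LANGUAGE_CODES.fetch(lexicon, "und")
end

puts export_language(TaxonName.new("Simba", "Swahili"))
puts export_language(TaxonName.new("Old import", nil))
```

The point of doing it at export time is that the database record still has no lexicon, so the rule against placeholder lexicons is not bent.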

The first/top name in the language infobox on English Wikipedia pages should be the exonym, though I’m not sure how reliable this is. Another option would be to use the name given in ISO 639 or Glottolog.

Yes, the few I can recall were imports from a while back.

Yes, that would work

I would like that discussion to be on a flag, so local people can say which is the correct name for their language. Or at least a locally based scientist.


To be clear, you are asking for the English name of the language, right?

(As an example, for the language spoken in Germany, you would want “German”, not “Deutsch”, since language names are translated in Crowdin.)


The English name for the lexicon; see also the topic title. Georgian and Macedonian might be exceptions.

Ju Hoan, Ju|'hoan, Juǀ’Hoan → Juǀ’hoan
Swati, Si Swati → Swazi
Oshi Kwanyama, Oshikwanyama, Kwanyama (Ovamboland) → Kwanyama
Papiamentu → Papiamento
Punjab → Punjabi

Thank you. I have corrected all three mistakes. Selete (Botswana) should be removed from the lexicon, as well as the misspelling Setswapng. They were my mistakes.


Please consider keeping prefixes for names of southern African languages. The prefix shows the word is a language. Tswana would be wrong and perhaps meaningless without the correct prefix Se, and only understandable or acceptable to an English person without the prefix. What is wrong with Setswana as the iNat language name? Can’t iNat language names in Africa be made afrocentric rather than anglocentric just to keep the English natives happy? Perhaps iNat should consider removing superfluous suffixes. Do we really need SH at the end of a language name? Can’t the names Engl, Wel, Ir, Span, Swed, Dan, Turk be used if suffixes and prefixes are an iNat taboo? It would be much simpler and words would be shorter.
Why does iNat feel a strong need to perpetuate the linguistic mistakes of the colonial British in southern Africa, while African governments are trying to reafricanise the names of their languages and shake off the colonial legacy of fake or wrong English names? This implies there is a need to select the correct, most acceptable and up-to-date name for each language. Is it going to be Swazi, Siswati or Swati? Which is most correct and acceptable to somebody living in Eswatini (not colonial Swaziland)? Names of countries get changed and corrected, so why not languages as well? Please consider keeping the correct language prefixes!


Please keep the Botswana language Selete and don’t remove the prefix to change the language name to Lete. Lete is a language name in Ghana. Prefixes are sometimes useful since they tell us which part of Africa the language name belongs to. They also group similar languages/dialects together in the language list. Does iNat need to emulate the mistakes of Wiki by removing important prefixes?

Coming from iKapa (Cape Town) where the local language is isiXhosa - it amuses me that the website is called

Africa rules!

iKapa - altho the Wiki should use i not I

I wonder why and who chose to name us ‘i Naturalist’ ?


In fact those language names should all be searchable, with the main entry under the single ‘default accepted English’ version, in the same way that we can search for taxonomy synonyms.
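A minimal sketch of what such a search could look like, assuming a hand-maintained alias table. `ALIASES`, `search_key`, and `find_language` are made-up names, and which spelling counts as the main entry is exactly what this thread is debating, so the choices below are placeholders only.

```ruby
# Hypothetical alias table: main entry => other names people may search for.
ALIASES = {
  "Swazi" => ["Swati", "Si Swati", "Siswati"],
  "Kwanyama" => ["Oshi Kwanyama", "Oshikwanyama"]
}.freeze

# Fold case and drop whitespace so "Si Swati" and "siswati" share one key.
def search_key(name)
  name.downcase.gsub(/\s+/, "")
end

# Index every spelling, including the main entry itself, under its search key.
SEARCH_INDEX = {}
ALIASES.each do |main_entry, others|
  ([main_entry] + others).each do |name|
    SEARCH_INDEX[search_key(name)] = main_entry
  end
end

# Returns the main entry for any known spelling, or nil when nothing matches.
def find_language(query)
  SEARCH_INDEX[search_key(query)]
end

puts find_language("si swati")
puts find_language("Oshikwanyama")
```

Swapping the main entry (say, to Siswati) only means changing a hash key; every other spelling stays findable.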
A living language is a life form, and the name changes as history unfolds. And is rewritten. (Or reinvented)

I wonder how many people using iNat in English (or Spanish or …) are working with their second, or even third language?


Ju Hoan, Ju|'hoan, Juǀ’Hoan → Juǀ’hoan

Why use a letter L to replace a click symbol? Surely Wiki is wrong to rename the language in this way, or to leave out the | because some find it difficult to pronounce.
