The state of common names

I was curious to see how many common names were being added to iNat and how regularly, so here are the results for anyone else interested :)

Common names Jan-Feb 2023

Lexicons and name counts from the Darwin Core archives exported on the first of each month.
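For anyone who wants to reproduce counts like these, here is a minimal sketch in Python. It assumes the vernacular names have been extracted from the archive as CSV with a `lexicon` column; the column names and the sample rows are illustrative only, not copied from the real export.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a vernacular-names CSV from the archive.
# The column names (id, vernacularName, lexicon) are assumptions.
SAMPLE_CSV = """id,vernacularName,lexicon
1,Red Fox,English
1,Renard roux,French
2,Gray Wolf,English
2,Wolf,German
3,House Sparrow,English
"""

def count_names_per_lexicon(csv_text):
    """Return a Counter mapping lexicon -> number of name rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["lexicon"] for row in reader)

counts = count_names_per_lexicon(SAMPLE_CSV)

# List lexicons from most to fewest names, numbered by position.
for rank, (lexicon, n) in enumerate(counts.most_common(), start=1):
    print(rank, lexicon, n)
```

With a real export, the same function could be fed the text of each names file in turn and the resulting counters summed.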

Top Lexicons

Lexicons with 1000 or more names
# Lexicon Names | # Lexicon Names
1 English 264269 | 28 Zulu 3444
2 Chinese (Simplified) 103690 | 29 Catalan 3322
3 Czech 79041 | 30 Mandarin Chinese 3119
4 Japanese 67843 | 31 Turkish 3045
5 Spanish 60465 | 32 Indonesian 3018
6 Russian 58783 | 33 Setswana 2705
7 Chinese (Traditional) 55805 | 34 Croatian 2378
8 French 47528 | 35 Slovenian 2353
9 German 44454 | 36 Vermont Flora Codes 2273
10 Dutch 37134 | 37 Bulgarian 1955
11 Finnish 36257 | 38 Slovak 1612
12 Portuguese 31243 | 39 Latvian 1540
13 Swedish 28584 | 40 Nahuatl 1511
14 Norwegian 27025 | 41 Armenian 1430
15 Danish 20854 | 42 Xhosa 1401
16 Afrikaans 19031 | 43 Maori 1368
17 Korean 18575 | 44 Tagalog 1354
18 Arabic 15212 | 45 Malay (Individual Language) 1307
19 Thai 13501 | 46 Belarusian 1290
20 Polish 13316 | 47 Malayalam 1254
21 Italian 13129 | 48 Aou 4 Letter Codes 1230
22 Lithuanian 12880 | 49 Slovene 1230
23 Estonian 12143 | 50 AOU 4-Letter Codes 1143
24 Ukrainian 9400 | 51 Ojibwe 1013
25 Hebrew 7215 | 52 Hodges Number 1012
26 Hungarian 7156 | 53 Visayan 1006
27 Greek 4747 | 54 Hawaiian 1003

Slovene is a ‘bad’ lexicon; these names should be in the Slovenian lexicon. While a common name in the wrong lexicon is still searchable, it won’t show at the top of the species/taxon page. Adding the two together, ignoring any duplicates, puts Slovenian in 28th position.
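The 28th-position figure can be checked with a quick sketch. The counts are copied from positions 27 to 35 of the table above; the function name is made up for illustration, and the sum assumes no name appears in both lexicons.

```python
# Counts copied from the Top Lexicons table above (positions 27 to 35 only).
names = {
    "Greek": 4747,
    "Zulu": 3444,
    "Catalan": 3322,
    "Mandarin Chinese": 3119,
    "Turkish": 3045,
    "Indonesian": 3018,
    "Setswana": 2705,
    "Croatian": 2378,
    "Slovenian": 2353,
}
SLOVENE = 1230       # the 'bad' lexicon to fold into Slovenian
FIRST_POSITION = 27  # Greek's position in the full table

def position_after_merge(names, target, extra, first_position):
    """Return (position, total) for `target` after adding `extra` names."""
    merged = dict(names)
    merged[target] += extra  # assumes no name appears in both lexicons
    ordered = sorted(merged, key=merged.get, reverse=True)
    return first_position + ordered.index(target), merged[target]

position, total = position_after_merge(names, "Slovenian", SLOVENE, FIRST_POSITION)
print(position, total)  # prints "28 3583"
```

Any duplicates would lower the combined total, but Slovenian stays ahead of Zulu (3444) as long as fewer than 139 names are shared between the two lexicons.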

Active Lexicons

Lexicons with 30 or more new names
# Lexicon Change | # Lexicon Change
1 Chinese (Simplified) 3472 | 14 Dutch 289
2 Portuguese 1985 | 15 Arabic 252
3 French 1751 | 16 Thai 248
4 English 1381 | 17 Italian 191
5 German 1075 | 18 Scientific Name 115
6 Russian 781 | 19 Korean 110
7 Danish 587 | 20 Hebrew 78
8 Japanese 562 | 21 Swedish 53
9 Lithuanian 530 | 22 Kannada 49
10 Chinese (Traditional) 490 | 23 Ukrainian 47
11 Hungarian 451 | 24 Quechua 46
12 Polish 440 | 25 Slovak 37
13 Spanish 290 |

Another ‘bad’ lexicon: ‘Scientific Name’ should be ‘Scientific Names’.

New Lexicons

Lexicons created during the past month
Lexicon Names | Lexicon Names | Lexicon Names
Afar 1 | Hidareb 1 | Ontario Plant Codes 29
Bilen 2 | Kunama 1 | Saho 1
Español (Uruguay) 1 | Lamba 1 | Slovenčina 1
Ganda 1 | Mzimba 1 | Tigre 2

‘Bad’ lexicons strike again: ‘Español (Uruguay)’ should be Spanish with the place set to Uruguay.

I’m pleased to see Nahuatl and Hawaiian on the first list as representatives of western hemisphere indigenous languages. I think Mexico’s cultural diversity is often overlooked by outsiders, especially people like me who grew up with burritos and fajitas in a certain border state.

Why do the top 100 and “smaller lexicons” have repeats?
Btw Slovencina is Slovak.

Oops, “Smaller” shouldn’t be there. I’ve edited the original post, thanks!

Zulu 28, Xhosa 42 and Setswana 33 from Southern Africa. Oh and Afrikaans 16 (a young language)

In my ignorance I cannot pick out other African languages; most of the lexicons are from Europe (and cosmopolitan English, maybe French and Spanish too, is everywhere: the ‘Esperanto’ of iNat).

I would guess some of the New lexicons are African?

I have a copypasta - but seldom get a chance to use it

Please add your local common name to iNat
https://forum.inaturalist.org/t/how-to-add-a-common-name-to-a-taxon/9792

(so as not to derail, but for anyone interested in linguistic diversity here: https://www.inali.gob.mx/)

(edit to add: a good place to begin, includes videos of some of Mexico’s 68 indigenous languages, if you want to hear what is spoken here scroll down to maya: https://site.inali.gob.mx/Micrositios/Guardavoces_Mexico_multilingue/)

Why not just use the last download? There is a date_added column in the Darwin Core archives; I would use 1-1-2023 through 31-1-2023 to calculate the activity in January 2023…
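A rough sketch of that single-download approach, in Python. The field name `created`, the ISO 8601 timestamp format, and the sample rows are all assumptions made for illustration; the real archive may name or format the date column differently.

```python
from collections import Counter
from datetime import date, datetime

# Hypothetical rows; the `created` field name and timestamp format are assumptions.
rows = [
    {"lexicon": "Dutch", "created": "2023-01-05T10:15:00Z"},
    {"lexicon": "Dutch", "created": "2022-12-31T23:59:00Z"},
    {"lexicon": "Polish", "created": "2023-01-31T08:00:00Z"},
    {"lexicon": "Polish", "created": "2023-02-01T00:00:00Z"},
]

def added_between(rows, start, end):
    """Count rows per lexicon whose creation date falls in [start, end]."""
    counts = Counter()
    for row in rows:
        # Replace a trailing "Z" so fromisoformat accepts it on older Pythons.
        stamp = datetime.fromisoformat(row["created"].replace("Z", "+00:00"))
        if start <= stamp.date() <= end:
            counts[row["lexicon"]] += 1
    return counts

january = added_between(rows, date(2023, 1, 1), date(2023, 1, 31))
print(dict(january))
```

A count like this only sees names that still exist in the download, so it cannot show a lexicon that lost names during the month.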

There still doesn’t need to be both “Aou 4 Letter Codes” and “AOU 4-Letter Codes.” These need to be combined and the duplicates removed.

In that case, Slovenčina should be Slovak.

Likewise, I’m pleased to see Zulu, Setswana, and Xhosa on the first list as representatives of southern hemisphere indigenous languages.

All 11 of the official languages in South Africa have lexicons on iNat, though not quite as well represented as those 4. A few of the unofficial ones have lexicons too.

The majority are from the Horn of Africa; I’ve added Wikipedia links for the curious.

The ‘created’ column is useful for some queries. I mainly wanted to see total change per lexicon, and also which (if any) lexicons were losing names.
Names ‘transferred’ out of a ‘bad’ lexicon into a ‘good’ one will still have the original creation date. New copies of existing common names are re-created on the output taxa as part of taxon changes; about 1000 of these names were created during January.
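To illustrate the snapshot comparison, here is a small Python sketch that subtracts one month's totals from the next. The numbers are copied from the two Top Lexicons tables in this thread; the function and variable names are made up for the example.

```python
# Totals copied from the two Top Lexicons tables in this thread.
earlier = {
    "English": 264269,
    "French": 47528,
    "Polish": 13316,
    "Maori": 1368,
    "AOU 4-Letter Codes": 1143,
}
later = {
    "English": 265248,
    "French": 49627,
    "Polish": 14282,
    "Maori": 1366,
    "AOU 4-Letter Codes": 1139,
}

def net_change(earlier, later):
    """Net change per lexicon; a lexicon missing from a snapshot counts as 0."""
    lexicons = set(earlier) | set(later)
    return {lex: later.get(lex, 0) - earlier.get(lex, 0) for lex in lexicons}

changes = net_change(earlier, later)
losing = sorted(lex for lex, delta in changes.items() if delta < 0)

# Largest gains first, losses last.
for lex, delta in sorted(changes.items(), key=lambda item: item[1], reverse=True):
    print(f"{lex}: {delta:+d}")
print("Losing names:", losing)
```

On these figures Polish gains 966 and English 979, while Maori and AOU 4-Letter Codes come out slightly negative, which is the kind of loss a filter on creation date alone would not reveal.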

Messy, yes! Don’t think there are any duplicates, should be a straightforward combination.

And Maori from Aotearoa / New Zealand

Saw weird things happening, this might be a good explanation.

I do not think so; check this table, some occur 4 times: https://forum.inaturalist.org/t/finding-out-common-names-numbers-per-language/17902/34

Ndebele and Sotho are (I guess) two more, in the language und(efined?).
https://forum.inaturalist.org/t/finding-out-common-names-numbers-per-language/17902/32

Download the file in that thread if you want to see all >1000 lexicons.

Top Lexicons

Lexicons with 1000 or more names
# Lexicon Names | # Lexicon Names
1 English 265,248 | 28 Zulu 3,446
2 Chinese (Simplified) 106,455 | 29 Catalan 3,325
3 Czech 79,212 | 30 Turkish 3,323
4 Japanese 68,145 | 31 Mandarin Chinese 3,115
5 Spanish 60,749 | 32 Indonesian 3,050
6 Russian 59,695 | 33 Setswana 2,706
7 Chinese (Traditional) 56,006 | 34 Croatian 2,445
8 French 49,627 | 35 Slovenian 2,376
9 German 45,411 | 36 Vermont Flora Codes 2,273
10 Dutch 37,283 | 37 Bulgarian 1,957
11 Finnish 36,316 | 38 Slovak 1,638
12 Portuguese 31,553 | 39 Latvian 1,548
13 Swedish 28,615 | 40 Nahuatl 1,511
14 Norwegian 27,220 | 41 Armenian 1,431
15 Danish 21,584 | 42 Xhosa 1,401
16 Afrikaans 19,045 | 43 Maori 1,366
17 Korean 18,639 | 44 Tagalog 1,354
18 Arabic 15,250 | 45 Malay (Individual Language) 1,309
19 Polish 14,282 | 46 Belarusian 1,291
20 Thai 13,647 | 47 Malayalam 1,255
21 Italian 13,204 | 48 Aou 4 Letter Codes 1,232
22 Lithuanian 12,978 | 49 Slovene 1,229
23 Estonian 12,149 | 50 AOU 4-Letter Codes 1,139
24 Ukrainian 9,427 | 51 Ojibwe 1,023
25 Hebrew 7,542 | 52 Hodges Number 1,012
26 Hungarian 7,378 | 53 Visayan 1,006
27 Greek 4,760 | 54 Hawaiian 1,005

Polish moves up one place to overtake Thai.
Venda only needs 5 more names to join the 1k club.

Active Lexicons

Lexicons with 30 or more new names
# Lexicon Change | # Lexicon Change
1 Chinese (Simplified) 2765 | 16 Norwegian 195
2 French 2099 | 17 Czech 171
3 English 979 | 18 Dutch 149
4 Polish 966 | 19 Albanian 147
5 German 957 | 20 Thai 146
6 Russian 912 | 21 Lithuanian 98
7 Danish 730 | 22 Aotearoa (New Zealand) Bilingual Maori And English. 76
8 Kazakh 516 | 23 Italian 75
9 Hebrew 327 | 24 Croatian 67
10 Portuguese 310 | 25 Korean 64
11 Japanese 302 | 26 Finnish 59
12 Spanish 284 | 27 Arabic 38
13 Turkish 278 | 28 Luxembourgish 36
14 Hungarian 222 | 29 Indonesian 32
15 Chinese (Traditional) 201 | 30 Swedish 31

New Lexicons

Lexicons created during the past month
Lexicon Names
Aotearoa (New Zealand) Bilingual Maori And English. 76
Fwe 1
Interslavic 1