The state of common names

I was curious to see how many common names were being added to iNat, and how regularly. Here are the results for anyone else interested :)

Common names Jan-Feb 2023

Lexicons and name counts from the Darwin Core archives exported on the first of each month.

Top Lexicons

Lexicons with 1000 or more names
# Lexicon Names | # Lexicon Names
1 English 264269 | 28 Zulu 3444
2 Chinese (Simplified) 103690 | 29 Catalan 3322
3 Czech 79041 | 30 Mandarin Chinese 3119
4 Japanese 67843 | 31 Turkish 3045
5 Spanish 60465 | 32 Indonesian 3018
6 Russian 58783 | 33 Setswana 2705
7 Chinese (Traditional) 55805 | 34 Croatian 2378
8 French 47528 | 35 Slovenian 2353
9 German 44454 | 36 Vermont Flora Codes 2273
10 Dutch 37134 | 37 Bulgarian 1955
11 Finnish 36257 | 38 Slovak 1612
12 Portuguese 31243 | 39 Latvian 1540
13 Swedish 28584 | 40 Nahuatl 1511
14 Norwegian 27025 | 41 Armenian 1430
15 Danish 20854 | 42 Xhosa 1401
16 Afrikaans 19031 | 43 Maori 1368
17 Korean 18575 | 44 Tagalog 1354
18 Arabic 15212 | 45 Malay (Individual Language) 1307
19 Thai 13501 | 46 Belarusian 1290
20 Polish 13316 | 47 Malayalam 1254
21 Italian 13129 | 48 Aou 4 Letter Codes 1230
22 Lithuanian 12880 | 49 Slovene 1230
23 Estonian 12143 | 50 AOU 4-Letter Codes 1143
24 Ukrainian 9400 | 51 Ojibwe 1013
25 Hebrew 7215 | 52 Hodges Number 1012
26 Hungarian 7156 | 53 Visayan 1006
27 Greek 4747 | 54 Hawaiian 1003

Slovene is a ‘bad’ lexicon; these names should be in the Slovenian lexicon. While a common name in it would still be searchable, it won’t show at the top of the species/taxon page. Adding the two together, ignoring any duplicates, puts Slovenian in 28th position.
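As a rough check of that merged position, the sketch below adds the two counts from the table above and counts how many lexicons would still have more names. It simply sums the counts (no de-duplication). The `counts` dictionary is a hand-copied excerpt of the table and `merged_position` is a hypothetical helper, not anything from iNat itself.

```python
# Counts copied from the "Top Lexicons" table above (positions 27-35 plus Slovene).
counts = {
    "Greek": 4747,
    "Zulu": 3444,
    "Catalan": 3322,
    "Mandarin Chinese": 3119,
    "Turkish": 3045,
    "Indonesian": 3018,
    "Setswana": 2705,
    "Croatian": 2378,
    "Slovenian": 2353,
    "Slovene": 1230,
}

# The 26 lexicons listed above Greek all have more names than Greek,
# so they stay ahead of the merged lexicon and only need to be counted.
LEXICONS_ABOVE_EXCERPT = 26


def merged_position(counts, good, bad, lexicons_above):
    """Return (merged_total, position) after folding `bad` into `good`."""
    merged = dict(counts)
    merged[good] += merged.pop(bad)
    total = merged[good]
    ahead = sum(1 for name, n in merged.items() if name != good and n > total)
    return total, lexicons_above + ahead + 1


total, position = merged_position(counts, "Slovenian", "Slovene", LEXICONS_ABOVE_EXCERPT)
print(total, position)  # 3583 28
```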

Active Lexicons

Lexicons with 30 or more new names
# Lexicon Change | # Lexicon Change
1 Chinese (Simplified) 3472 | 14 Dutch 289
2 Portuguese 1985 | 15 Arabic 252
3 French 1751 | 16 Thai 248
4 English 1381 | 17 Italian 191
5 German 1075 | 18 Scientific Name 115
6 Russian 781 | 19 Korean 110
7 Danish 587 | 20 Hebrew 78
8 Japanese 562 | 21 Swedish 53
9 Lithuanian 530 | 22 Kannada 49
10 Chinese (Traditional) 490 | 23 Ukrainian 47
11 Hungarian 451 | 24 Quechua 46
12 Polish 440 | 25 Slovak 37
13 Spanish 290 |

Another ‘bad’ lexicon: ‘Scientific Name’ should be ‘Scientific Names’.

New Lexicons

Lexicons created during the past month
Lexicon Names | Lexicon Names | Lexicon Names
Afar 1 | Hidareb 1 | Ontario Plant Codes 29
Bilen 2 | Kunama 1 | Saho 1
Español (Uruguay) 1 | Lamba 1 | Slovenčina 1
Ganda 1 | Mzimba 1 | Tigre 2

‘Bad’ lexicons strike again: ‘Español (Uruguay)’ should be Spanish with the place set to Uruguay.

I’m pleased to see Nahuatl and Hawaiian on the first list as representatives of western hemisphere indigenous languages. I think Mexico’s cultural diversity is often overlooked by outsiders, especially people like me who grew up with burritos and fajitas in a certain border state.

Why do the top 100 and “smaller lexicons” have repeats?
Btw, Slovenčina is Slovak.

Oops, “Smaller” shouldn’t be there. I’ve edited the original post, thanks!

Zulu (28), Xhosa (42) and Setswana (33) are from Southern Africa. Oh, and Afrikaans (16), a young language.

In my ignorance I cannot pick out any other African languages; most are from Europe (and cosmopolitan English (French and Spanish too?) is everywhere, the ‘Esperanto’ of iNat).

I would guess some of the New lexicons are African?

I have a copypasta - but seldom get a chance to use it

Please add your local common name to iNat
https://forum.inaturalist.org/t/how-to-add-a-common-name-to-a-taxon/9792

(so as not to derail, but for anyone interested in linguistic diversity here: https://www.inali.gob.mx/)

(edit to add: a good place to begin, includes videos of some of Mexico’s 68 indigenous languages; if you want to hear what is spoken here, scroll down to Maya: https://site.inali.gob.mx/Micrositios/Guardavoces_Mexico_multilingue/)

Why not just use the last download? There is a date_added column in the Darwin Core archives; I would use 1-1-2023 through 31-1-2023 to calculate the activity in January 2023.
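For anyone who wants to try that approach, here is a minimal sketch. It assumes the names have been read as CSV rows with `lexicon` and `date_added` columns holding ISO-style dates. Those column names and the sample rows are assumptions for illustration only, so check them against the actual archive before relying on this.

```python
import csv
import io
from collections import Counter
from datetime import date

# Made-up sample rows; the column names are assumptions, not the real archive schema.
SAMPLE = """lexicon,date_added
English,2023-01-03
Dutch,2023-01-31
Dutch,2022-12-30
Zulu,2023-02-01
English,2023-01-15
"""


def names_added(rows, start, end):
    """Count names per lexicon whose date_added falls within [start, end]."""
    added = Counter()
    for row in rows:
        # Keep only the date part in case the value carries a time as well.
        day = date.fromisoformat(row["date_added"][:10])
        if start <= day <= end:
            added[row["lexicon"]] += 1
    return added


rows = csv.DictReader(io.StringIO(SAMPLE))
january = names_added(rows, date(2023, 1, 1), date(2023, 1, 31))
print(january.most_common())
```

Note that a count like this only sees names by their creation date, so on its own it cannot show a lexicon that is losing names.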


There is still no need for both “Aou 4 Letter Codes” and “AOU 4-Letter Codes.” These need to be combined and the duplicates removed.

In that case, Slovenčina should be Slovak.

Likewise, I’m pleased to see Zulu, Setswana, and Xhosa on the first list as representatives of southern hemisphere indigenous languages.

All 11 of the official languages in South Africa have lexicons on iNat, though the others are not quite as well represented as those 4. There are also a few of the unofficial ones.

The majority are from the Horn of Africa. I’ve added Wikipedia links for the curious.

The ‘created’ column is useful for some queries. I mainly wanted to see total change per lexicon, and also which (if any) lexicons were losing names.
Names ‘transferred’ out of a ‘bad’ lexicon into a ‘good’ one will still have the original creation date. New copies of existing common names are re-created on the output taxa as part of taxon changes; about 1000 of these names were created during January.
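The month-to-month comparison can be sketched roughly as follows. This is not the actual script, just an illustration of diffing two snapshots. The function name is hypothetical, and the two dictionaries hold a handful of counts copied from consecutive tables in this thread.

```python
def lexicon_changes(previous, current):
    """Return {lexicon: change in name count} between two snapshots."""
    changes = {}
    for lexicon in previous.keys() | current.keys():
        # A lexicon missing from one snapshot counts as zero names there.
        delta = current.get(lexicon, 0) - previous.get(lexicon, 0)
        if delta != 0:
            changes[lexicon] = delta
    return changes


# A few counts copied from two consecutive "Top Lexicons" tables in this thread.
previous = {"English": 264269, "Polish": 13316, "Maori": 1368, "Mandarin Chinese": 3119}
current = {"English": 265248, "Polish": 14282, "Maori": 1366, "Mandarin Chinese": 3115, "Fwe": 1}

changes = lexicon_changes(previous, current)
growing = sorted((l for l in changes if changes[l] > 0), key=lambda l: -changes[l])
losing = sorted(l for l in changes if changes[l] < 0)
new = sorted(current.keys() - previous.keys())

print(growing)  # ['English', 'Polish', 'Fwe']
print(losing)   # ['Mandarin Chinese', 'Maori']
print(new)      # ['Fwe']
```

Unlike filtering on a creation date, a snapshot diff also picks up names that were deleted or moved to another lexicon.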

Messy, yes! Don’t think there are any duplicates, should be a straightforward combination.

And Maori from Aotearoa / New Zealand

Saw weird things happening, this might be a good explanation.

I do not think so. Check this table; some occur 4 times: https://forum.inaturalist.org/t/finding-out-common-names-numbers-per-language/17902/34

Ndebele and Sotho are (I guess) other ones, in the language und(efined?).
https://forum.inaturalist.org/t/finding-out-common-names-numbers-per-language/17902/32

Download the file in that thread if you want to see all >1000 lexicons.

Top Lexicons

Lexicons with 1000 or more names
# Lexicon Names | # Lexicon Names
1 English 265,248 | 28 Zulu 3,446
2 Chinese (Simplified) 106,455 | 29 Catalan 3,325
3 Czech 79,212 | 30 Turkish 3,323
4 Japanese 68,145 | 31 Mandarin Chinese 3,115
5 Spanish 60,749 | 32 Indonesian 3,050
6 Russian 59,695 | 33 Setswana 2,706
7 Chinese (Traditional) 56,006 | 34 Croatian 2,445
8 French 49,627 | 35 Slovenian 2,376
9 German 45,411 | 36 Vermont Flora Codes 2,273
10 Dutch 37,283 | 37 Bulgarian 1,957
11 Finnish 36,316 | 38 Slovak 1,638
12 Portuguese 31,553 | 39 Latvian 1,548
13 Swedish 28,615 | 40 Nahuatl 1,511
14 Norwegian 27,220 | 41 Armenian 1,431
15 Danish 21,584 | 42 Xhosa 1,401
16 Afrikaans 19,045 | 43 Maori 1,366
17 Korean 18,639 | 44 Tagalog 1,354
18 Arabic 15,250 | 45 Malay (Individual Language) 1,309
19 Polish 14,282 | 46 Belarusian 1,291
20 Thai 13,647 | 47 Malayalam 1,255
21 Italian 13,204 | 48 Aou 4 Letter Codes 1,232
22 Lithuanian 12,978 | 49 Slovene 1,229
23 Estonian 12,149 | 50 AOU 4-Letter Codes 1,139
24 Ukrainian 9,427 | 51 Ojibwe 1,023
25 Hebrew 7,542 | 52 Hodges Number 1,012
26 Hungarian 7,378 | 53 Visayan 1,006
27 Greek 4,760 | 54 Hawaiian 1,005

Polish moves up one place to overtake Thai.
Venda only needs 5 more names to join the 1k club.

Active Lexicons

Lexicons with 30 or more new names
# Lexicon Change | # Lexicon Change
1 Chinese (Simplified) 2765 | 16 Norwegian 195
2 French 2099 | 17 Czech 171
3 English 979 | 18 Dutch 149
4 Polish 966 | 19 Albanian 147
5 German 957 | 20 Thai 146
6 Russian 912 | 21 Lithuanian 98
7 Danish 730 | 22 Aotearoa (New Zealand) Bilingual Maori And English. 76
8 Kazakh 516 | 23 Italian 75
9 Hebrew 327 | 24 Croatian 67
10 Portuguese 310 | 25 Korean 64
11 Japanese 302 | 26 Finnish 59
12 Spanish 284 | 27 Arabic 38
13 Turkish 278 | 28 Luxembourgish 36
14 Hungarian 222 | 29 Indonesian 32
15 Chinese (Traditional) 201 | 30 Swedish 31

New Lexicons

Lexicons created during the past month
Lexicon Names
Aotearoa (New Zealand) Bilingual Maori And English. 76
Fwe 1
Interslavic 1

Top Lexicons

Lexicons with 1000 or more names
# Lexicon Names | # Lexicon Names
1 English 267,713 | 28 (no lexicon) 4,604
2 Chinese (Simplified) 109,758 | 29 Turkish 3,493
3 Czech 79,263 | 30 Zulu 3,447
4 Japanese 68,510 | 31 Catalan 3,413
5 Spanish 62,174 | 32 Slovenian 3,370
6 Russian 60,289 | 33 Indonesian 3,161
7 Chinese (Traditional) 56,836 | 34 Mandarin Chinese 2,962
8 French 51,705 | 35 Setswana 2,733
9 German 46,199 | 36 Croatian 2,453
10 Dutch 37,526 | 37 Vermont Flora Codes 2,273
11 Finnish 36,329 | 38 Bulgarian 1,958
12 Portuguese 32,127 | 39 Slovak 1,659
13 Swedish 28,725 | 40 Latvian 1,595
14 Norwegian 27,251 | 41 Nahuatl 1,515
15 Danish 21,725 | 42 Armenian 1,433
16 Afrikaans 19,085 | 43 Xhosa 1,402
17 Korean 18,855 | 44 Malay 1,400
18 Hungarian 16,259 | 45 Maori 1,366
19 Arabic 15,280 | 46 Tagalog 1,355
20 Polish 14,482 | 47 Belarusian 1,294
21 Thai 13,885 | 48 Malayalam 1,256
22 Lithuanian 13,526 | 49 Aou 4 Letter Codes 1,234
23 Italian 13,287 | 50 AOU 4-Letter Codes 1,138
24 Estonian 12,152 | 51 Ojibwe 1,023
25 Ukrainian 9,441 | 52 Hodges Number 1,012
26 Hebrew 7,572 | 53 Hawaiian 1,006
27 Greek 4,899 | 54 Visayan 1,006

Hungarian up 8 places
Lithuanian up 1
Turkish up 2
Slovenian up 2
Indonesian up 1
Malay up 2
The unnamed ‘lexicon’ (#28 above, #2 below) is the count of names without a lexicon

Active Lexicons

Lexicons with 30 or more new names
# Lexicon Change | # Lexicon Change
1 Hungarian 8881 | 21 Korean 216
2 (no lexicon) 4604 | 22 Polish 200
3 Chinese (Simplified) 3303 | 23 Turkish 170
4 English 2465 | 24 Cherokee 149
5 French 2078 | 25 Swahili 148
6 Spanish 1425 | 26 Danish 141
7 Malay 1318 | 27 Greek 139
8 Slovenian 994 | 28 Lezghian 134
9 Chinese (Traditional) 830 | 29 Aotearoa (New Zealand) Bilingual Maori And English. 121
10 German 788 | 30 Indonesian 111
11 Russian 594 | 31 Swedish 110
12 Portuguese 574 | 32 Catalan 88
13 Lithuanian 548 | 33 Italian 83
14 Japanese 365 | 34 Czech 51
15 Waray-Waray 315 | 35 Latvian 47
16 Tamil 255 | 36 Afrikaans 40
17 Bunun 252 | 37 Norwegian 31
18 Dutch 243 | 38 Arabic 30
19 Thai 238 | 39 Hebrew 30
20 Kazakh 237 |

New Lexicons

Lexicons created during the past month
Lexicon Names | Lexicon Names | Lexicon Names
(no lexicon) 4604 | Hokkien 3 | Mvskoke 2

Thanks.

About those four-letter codes: maybe we should ask the contributors whether these are two different lexicons?

About Zulu and isiZulu:

https://groups.google.com/g/inaturalist/c/3foWuuC-n2k/m/Do-3ye1oEAAJ

They’re the same thing. I thought there was an AOU line in the cleanup script which was run about a week ago, so was surprised to see they hadn’t been merged. I’ll need to have another closer look.

Is there a way to see a list of all lexicons?

In another topic started by Marina there is a link to a file with all lexicons as of 1-1-2023, but you can also download the DwC-A or check the list of lexicons when adding a common name.

The 4-letter codes have disappeared from the lexicon combobox shown when adding a common name.
https://github.com/inaturalist/inaturalist/blob/main/tools/clean_lexicons.rb

Top Lexicons

Lexicons with 1000 or more names
# Lexicon Names | # Lexicon Names
1 English 269,656 | 28 (no lexicon) 4,423
2 Chinese (Simplified) 113,091 | 29 Turkish 3,909
3 Czech 79,278 | 30 Zulu 3,458
4 Japanese 68,685 | 31 Catalan 3,418
5 Spanish 62,500 | 32 Slovenian 3,380
6 Russian 61,670 | 33 Indonesian 3,166
7 Chinese (Traditional) 57,488 | 34 Mandarin Chinese 2,962
8 French 52,679 | 35 Tswana 2,740
9 German 46,355 | 36 Croatian 2,459
10 Dutch 38,164 | 37 Aou 4 Letter Codes 2,374
11 Finnish 36,338 | 38 Vermont Flora Codes 2,273
12 Portuguese 33,300 | 39 Bulgarian 1,961
13 Swedish 28,851 | 40 Slovak 1,689
14 Norwegian 27,259 | 41 Latvian 1,614
15 Danish 22,204 | 42 Nahuatl 1,516
16 Afrikaans 19,100 | 43 Armenian 1,433
17 Korean 18,931 | 44 Xhosa 1,407
18 Hungarian 16,697 | 45 Malay 1,402
19 Arabic 15,284 | 46 Sotho 1,385
20 Polish 14,598 | 47 Maori 1,368
21 Thai 14,035 | 48 Tagalog 1,355
22 Lithuanian 13,743 | 49 Belarusian 1,294
23 Italian 13,344 | 50 Malayalam 1,258
24 Estonian 12,154 | 51 Ojibwe 1,023
25 Ukrainian 9,481 | 52 Hodges Number 1,012
26 Hebrew 7,778 | 53 Hawaiian 1,006
27 Greek 4,901 | 54 Visayan 1,006

‘Aou 4 Letter Codes’ and ‘AOU 4-Letter Codes’ merged
Sotho enters the 1000+ list following the merge of Sesotho, Sotho and Sotho (Southern)

Active Lexicons

Lexicons with 30 or more new names
# Lexicon Change | # Lexicon Change
1 Chinese (Simplified) 3333 | 17 Lithuanian 217
2 Tswana 2740 | 18 Hebrew 206
3 English 1943 | 19 Japanese 175
4 Russian 1381 | 20 German 156
5 Portuguese 1173 | 21 Thai 150
6 Aou 4 Letter Codes 1140 | 22 Swedish 126
7 French 974 | 23 Polish 116
8 Sotho 973 | 24 Cherokee 105
9 Chinese (Traditional) 652 | 25 Korean 76
10 Dutch 638 | 26 Aou 6 Letter Codes 61
11 Hill Mari 533 | 27 Italian 57
12 Danish 479 | 28 Kannada 42
13 Hungarian 438 | 29 Ukrainian 40
14 Lozi 421 | 30 Sepedi 39
15 Turkish 416 | 31 Slovak 30
16 Spanish 326 |

New Lexicons

Lexicons created during the past month
Lexicon Names | Lexicon Names | Lexicon Names
Chifwe 1 | Otjihimba 2 | Chitotela 1
Mbalangwe 1 | Sesfontein Damara 2 | Cisubiya 1
Olkola 1 | Sifwe 1 |

Good to see our African languages bubbling up!

Yes, it is. Although I hadn’t known that Otjihimba was a separate language from Otjiherero.

Top Lexicons

Lexicons with 1000 or more names
# Lexicon Names | # Lexicon Names
1 English 271,242 | 29 (no lexicon) 4,347
2 Chinese (Simplified) 118,036 | 30 Catalan 4,133
3 Czech 79,299 | 31 Zulu 3,458
4 Japanese 69,017 | 32 Slovenian 3,410
5 Spanish 62,679 | 33 Indonesian 3,171
6 Russian 62,078 | 34 Mandarin Chinese 2,962
7 Chinese (Traditional) 57,867 | 35 Tswana 2,744
8 French 53,004 | 36 Croatian 2,461
9 German 46,479 | 37 Aou 4 Letter Codes 2,374
10 Dutch 38,383 | 38 Vermont Flora Codes 2,273
11 Finnish 36,370 | 39 Bulgarian 1,991
12 Portuguese 36,152 | 40 Slovak 1,764
13 Swedish 28,911 | 41 Latvian 1,640
14 Norwegian 27,285 | 42 Nahuatl 1,516
15 Danish 22,661 | 43 Armenian 1,434
16 Korean 20,639 | 44 Xhosa 1,410
17 Afrikaans 19,209 | 45 Malay 1,404
18 Hungarian 17,189 | 46 Sotho 1,386
19 Arabic 15,310 | 47 Maori 1,370
20 Polish 14,820 | 48 Tagalog 1,355
21 Thai 14,200 | 49 Belarusian 1,296
22 Lithuanian 13,857 | 50 Malayalam 1,260
23 Italian 13,437 | 51 Ojibwe 1,022
24 Estonian 12,158 | 52 Hodges Number 1,012
25 Ukrainian 9,743 | 53 Hawaiian 1,006
26 Hebrew 7,802 | 54 Visayan 1,006
27 Greek 4,903 | 55 Venda 1,000
28 Turkish 4,433 |

Korean up one place
‘Undefined’ (#29) down one as curators chip away at it
Catalan up one place
Venda reaches 1000 names and joins this list!

Active Lexicons

Lexicons with 30 or more new names
# Lexicon Change | # Lexicon Change
1 Chinese (Simplified) 4945 | 15 Dutch 219
2 Portuguese 2852 | 16 Irish 193
3 Korean 1708 | 17 Spanish 179
4 English 1586 | 18 Thai 165
5 Catalan 715 | 19 German 124
6 Turkish 524 | 20 Lithuanian 114
7 Hungarian 492 | 21 Afrikaans 109
8 Danish 457 | 22 Italian 93
9 Russian 408 | 23 Slovak 75
10 Chinese (Traditional) 379 | 24 Swedish 60
11 Japanese 332 | 25 Cherokee 41
12 French 325 | 26 Finnish 32
13 Ukrainian 262 | 27 Bulgarian 30
14 Polish 222 | 28 Slovenian 30

New Lexicons

Lexicons created during the past month
Lexicon Names | Lexicon Names | Lexicon Names
Aotearoa (New Zealand) Bilingual English And Maori 1 | Enan 1 | Peridea Moorei Ochreipennis 1
Biballa 1 | Kaxinawá 1 | Quimbundo 1
Cassange 1 | Manganja 2 | Quimbundu 1
Cazumbo 1 | Muanha 2 | Quissange 1
Cewa 1 | Nyungwe 2 |