The state of common names

There still doesn’t need to be both “Aou 4 Letter Codes” and “AOU 4-Letter Codes.” These need to be combined and the duplicates removed.

In that Slovenčina should be Slovak.

Likewise, I’m pleased to see Zulu, Setswana, and Xhosa on the first list as representatives of southern hemisphere indigenous languages.

All 11 of the official languages in South Africa have lexicons on iNat, though not quite as well represented as those 4. There are also a few of the unofficial ones.

The majority are from the Horn of Africa; I’ve added Wikipedia links for the curious.

The ‘created’ column is useful for some queries. I mainly wanted to see the total change per lexicon, and also which lexicons (if any) were losing names.
Names ‘transferred’ out of a ‘bad’ lexicon into a ‘good’ one will still have the original creation date. New copies of existing common names are re-created on the output taxa as part of taxon changes; about 1000 of these names were created during January.
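
For anyone who wants to reproduce the "total change per lexicon" idea, here is a minimal sketch in Python. It assumes you have two snapshots of names-per-lexicon as plain dictionaries; the function name `lexicon_changes` is made up for illustration, and the three counts are copied from two consecutive "Top Lexicons" tables in this thread. It is not how the tables in this topic were actually produced.

```python
# Counts copied from two consecutive "Top Lexicons" tables in this thread.
previous = {"Hungarian": 7378, "Polish": 14282, "Mandarin Chinese": 3115}
current = {"Hungarian": 16259, "Polish": 14482, "Mandarin Chinese": 2962}


def lexicon_changes(previous, current):
    """Return {lexicon: net change}, treating a missing lexicon as 0 names."""
    names = set(previous) | set(current)
    return {name: current.get(name, 0) - previous.get(name, 0) for name in names}


changes = lexicon_changes(previous, current)

# Lexicons whose net change is negative are the ones "losing names".
losing = sorted(name for name, delta in changes.items() if delta < 0)

for name, delta in sorted(changes.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {delta:+d}")
print("Losing names:", losing)
```

Because this is a net difference between snapshots, names moved from one lexicon to another show up as a loss in one and a gain in the other, which is different from counting rows by their creation date.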

Messy, yes! I don’t think there are any duplicates, so it should be a straightforward combination.

And Maori from Aotearoa / New Zealand

I saw weird things happening; this might be a good explanation.

I don’t think so. Check this table; some occur 4 times: https://forum.inaturalist.org/t/finding-out-common-names-numbers-per-language/17902/34

Ndebele and Sotho are (I guess) further examples, in the language und(efined?).
https://forum.inaturalist.org/t/finding-out-common-names-numbers-per-language/17902/32

Download the file in that thread if you want to see all >1000 lexicons.

Top Lexicons

Lexicons with 1000 or more names
# Lexicon Names | # Lexicon Names
1 English 265,248 | 28 Zulu 3,446
2 Chinese (Simplified) 106,455 | 29 Catalan 3,325
3 Czech 79,212 | 30 Turkish 3,323
4 Japanese 68,145 | 31 Mandarin Chinese 3,115
5 Spanish 60,749 | 32 Indonesian 3,050
6 Russian 59,695 | 33 Setswana 2,706
7 Chinese (Traditional) 56,006 | 34 Croatian 2,445
8 French 49,627 | 35 Slovenian 2,376
9 German 45,411 | 36 Vermont Flora Codes 2,273
10 Dutch 37,283 | 37 Bulgarian 1,957
11 Finnish 36,316 | 38 Slovak 1,638
12 Portuguese 31,553 | 39 Latvian 1,548
13 Swedish 28,615 | 40 Nahuatl 1,511
14 Norwegian 27,220 | 41 Armenian 1,431
15 Danish 21,584 | 42 Xhosa 1,401
16 Afrikaans 19,045 | 43 Maori 1,366
17 Korean 18,639 | 44 Tagalog 1,354
18 Arabic 15,250 | 45 Malay (Individual Language) 1,309
19 Polish 14,282 | 46 Belarusian 1,291
20 Thai 13,647 | 47 Malayalam 1,255
21 Italian 13,204 | 48 Aou 4 Letter Codes 1,232
22 Lithuanian 12,978 | 49 Slovene 1,229
23 Estonian 12,149 | 50 AOU 4-Letter Codes 1,139
24 Ukrainian 9,427 | 51 Ojibwe 1,023
25 Hebrew 7,542 | 52 Hodges Number 1,012
26 Hungarian 7,378 | 53 Visayan 1,006
27 Greek 4,760 | 54 Hawaiian 1,005

Polish moves up one place to overtake Thai.
Venda only needs 5 more names to join the 1k club.

Active Lexicons

Lexicons with 30 or more new names
# Lexicon Change | # Lexicon Change
1 Chinese (Simplified) 2765 | 16 Norwegian 195
2 French 2099 | 17 Czech 171
3 English 979 | 18 Dutch 149
4 Polish 966 | 19 Albanian 147
5 German 957 | 20 Thai 146
6 Russian 912 | 21 Lithuanian 98
7 Danish 730 | 22 Aotearoa (New Zealand) Bilingual Maori And English. 76
8 Kazakh 516 | 23 Italian 75
9 Hebrew 327 | 24 Croatian 67
10 Portuguese 310 | 25 Korean 64
11 Japanese 302 | 26 Finnish 59
12 Spanish 284 | 27 Arabic 38
13 Turkish 278 | 28 Luxembourgish 36
14 Hungarian 222 | 29 Indonesian 32
15 Chinese (Traditional) 201 | 30 Swedish 31

New Lexicons

Lexicons created during the past month
Lexicon Names
Aotearoa (New Zealand) Bilingual Maori And English. 76
Fwe 1
Interslavic 1

Top Lexicons

Lexicons with 1000 or more names
# Lexicon Names | # Lexicon Names
1 English 267,713 | 28 4,604
2 Chinese (Simplified) 109,758 | 29 Turkish 3,493
3 Czech 79,263 | 30 Zulu 3,447
4 Japanese 68,510 | 31 Catalan 3,413
5 Spanish 62,174 | 32 Slovenian 3,370
6 Russian 60,289 | 33 Indonesian 3,161
7 Chinese (Traditional) 56,836 | 34 Mandarin Chinese 2,962
8 French 51,705 | 35 Setswana 2,733
9 German 46,199 | 36 Croatian 2,453
10 Dutch 37,526 | 37 Vermont Flora Codes 2,273
11 Finnish 36,329 | 38 Bulgarian 1,958
12 Portuguese 32,127 | 39 Slovak 1,659
13 Swedish 28,725 | 40 Latvian 1,595
14 Norwegian 27,251 | 41 Nahuatl 1,515
15 Danish 21,725 | 42 Armenian 1,433
16 Afrikaans 19,085 | 43 Xhosa 1,402
17 Korean 18,855 | 44 Malay 1,400
18 Hungarian 16,259 | 45 Maori 1,366
19 Arabic 15,280 | 46 Tagalog 1,355
20 Polish 14,482 | 47 Belarusian 1,294
21 Thai 13,885 | 48 Malayalam 1,256
22 Lithuanian 13,526 | 49 Aou 4 Letter Codes 1,234
23 Italian 13,287 | 50 AOU 4-Letter Codes 1,138
24 Estonian 12,152 | 51 Ojibwe 1,023
25 Ukrainian 9,441 | 52 Hodges Number 1,012
26 Hebrew 7,572 | 53 Hawaiian 1,006
27 Greek 4,899 | 54 Visayan 1,006

Hungarian up 8 places
Lithuanian up 1
Turkish up 2
Slovenian up 2
Indonesian up 1
Malay up 2
The unnamed ‘lexicon’ (#28 above, #2 below) is the count of names without a lexicon

Active Lexicons

Lexicons with 30 or more new names
# Lexicon Change | # Lexicon Change
1 Hungarian 8881 | 21 Korean 216
2 4604 | 22 Polish 200
3 Chinese (Simplified) 3303 | 23 Turkish 170
4 English 2465 | 24 Cherokee 149
5 French 2078 | 25 Swahili 148
6 Spanish 1425 | 26 Danish 141
7 Malay 1318 | 27 Greek 139
8 Slovenian 994 | 28 Lezghian 134
9 Chinese (Traditional) 830 | 29 Aotearoa (New Zealand) Bilingual Maori And English. 121
10 German 788 | 30 Indonesian 111
11 Russian 594 | 31 Swedish 110
12 Portuguese 574 | 32 Catalan 88
13 Lithuanian 548 | 33 Italian 83
14 Japanese 365 | 34 Czech 51
15 Waray-Waray 315 | 35 Latvian 47
16 Tamil 255 | 36 Afrikaans 40
17 Bunun 252 | 37 Norwegian 31
18 Dutch 243 | 38 Arabic 30
19 Thai 238 | 39 Hebrew 30
20 Kazakh 237 |

New Lexicons

Lexicons created during the past month
Lexicon Names | Lexicon Names | Lexicon Names
4604 | Hokkien 3 | Mvskoke 2

Thanks.

Those four-letter codes: maybe we should ask the contributors whether these are two different lexicons?

About Zulu and isiZulu:

https://groups.google.com/g/inaturalist/c/3foWuuC-n2k/m/Do-3ye1oEAAJ

They’re the same thing. I thought there was an AOU line in the cleanup script which was run about a week ago, so was surprised to see they hadn’t been merged. I’ll need to have another closer look.

Is there a way to see a list of all lexicons?

In another topic started by Marina there is a link to a file with all lexicons as of 1-1-2023, but you can also download the DWCA or check the list of lexicons when adding a common name.
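
If you do download a names file, counting names per lexicon takes only a few lines. This is a rough sketch that assumes a CSV with a column called `lexicon`; that column name, the helper names, and the sample rows are all made up for illustration, so check the header of the real file before relying on it.

```python
import csv
import io
from collections import Counter

# Made-up sample rows; a real export will have its own columns and values.
SAMPLE = """name,lexicon
robin,English
roodborst,Dutch
blackbird,English
mystery name,
"""


def count_by_lexicon(rows, column="lexicon"):
    """Count rows per lexicon; an empty value is kept under the key ''."""
    counts = Counter()
    for row in rows:
        counts[(row.get(column) or "").strip()] += 1
    return counts


def at_least(counts, minimum):
    """Return (lexicon, total) pairs with `minimum` or more names, largest first."""
    return [(lex, n) for lex, n in counts.most_common() if n >= minimum]


counts = count_by_lexicon(csv.DictReader(io.StringIO(SAMPLE)))
for lexicon, total in at_least(counts, 1):
    print(lexicon or "(no lexicon)", total)
```

With a real file you would pass `csv.DictReader(open(path, newline="", encoding="utf-8"))` instead of the in-memory sample, and call `at_least(counts, 1000)` to get a "1000 or more names" list.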

The 4-letter codes have disappeared from the lexicon combobox on the add-common-name form.
https://github.com/inaturalist/inaturalist/blob/main/tools/clean_lexicons.rb

Top Lexicons

Lexicons with 1000 or more names
# Lexicon Names | # Lexicon Names
1 English 269,656 | 28 4,423
2 Chinese (Simplified) 113,091 | 29 Turkish 3,909
3 Czech 79,278 | 30 Zulu 3,458
4 Japanese 68,685 | 31 Catalan 3,418
5 Spanish 62,500 | 32 Slovenian 3,380
6 Russian 61,670 | 33 Indonesian 3,166
7 Chinese (Traditional) 57,488 | 34 Mandarin Chinese 2,962
8 French 52,679 | 35 Tswana 2,740
9 German 46,355 | 36 Croatian 2,459
10 Dutch 38,164 | 37 Aou 4 Letter Codes 2,374
11 Finnish 36,338 | 38 Vermont Flora Codes 2,273
12 Portuguese 33,300 | 39 Bulgarian 1,961
13 Swedish 28,851 | 40 Slovak 1,689
14 Norwegian 27,259 | 41 Latvian 1,614
15 Danish 22,204 | 42 Nahuatl 1,516
16 Afrikaans 19,100 | 43 Armenian 1,433
17 Korean 18,931 | 44 Xhosa 1,407
18 Hungarian 16,697 | 45 Malay 1,402
19 Arabic 15,284 | 46 Sotho 1,385
20 Polish 14,598 | 47 Maori 1,368
21 Thai 14,035 | 48 Tagalog 1,355
22 Lithuanian 13,743 | 49 Belarusian 1,294
23 Italian 13,344 | 50 Malayalam 1,258
24 Estonian 12,154 | 51 Ojibwe 1,023
25 Ukrainian 9,481 | 52 Hodges Number 1,012
26 Hebrew 7,778 | 53 Hawaiian 1,006
27 Greek 4,901 | 54 Visayan 1,006

‘Aou 4 Letter Codes’ and ‘AOU 4-Letter Codes’ merged
Sotho enters the 1000+ list following the merge of Sesotho, Sotho and Sotho (Southern)
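
To illustrate what a merge like this does to the counts, here is a small Python sketch. The alias table and the function `merge_lexicons` are hypothetical and are not taken from the actual cleanup script; the two AOU counts are from the previous month's table in this thread.

```python
from collections import Counter

# Hypothetical alias table: variant spelling -> canonical lexicon name.
ALIASES = {
    "AOU 4-Letter Codes": "Aou 4 Letter Codes",
    "Sesotho": "Sotho",
    "Sotho (Southern)": "Sotho",
}


def merge_lexicons(counts, aliases):
    """Add each variant's count to its canonical lexicon."""
    merged = Counter()
    for lexicon, total in counts.items():
        merged[aliases.get(lexicon, lexicon)] += total
    return merged


# Counts from the previous month's "Top Lexicons" table.
before = {"Aou 4 Letter Codes": 1234, "AOU 4-Letter Codes": 1138}
after = merge_lexicons(before, ALIASES)
print(after["Aou 4 Letter Codes"])  # 2372
```

The merged lexicon shows 2,374 in the table above rather than 2,372, which is consistent with a couple of new names having been added during the month.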

Active Lexicons

Lexicons with 30 or more new names
# Lexicon Change | # Lexicon Change
1 Chinese (Simplified) 3333 | 17 Lithuanian 217
2 Tswana 2740 | 18 Hebrew 206
3 English 1943 | 19 Japanese 175
4 Russian 1381 | 20 German 156
5 Portuguese 1173 | 21 Thai 150
6 Aou 4 Letter Codes 1140 | 22 Swedish 126
7 French 974 | 23 Polish 116
8 Sotho 973 | 24 Cherokee 105
9 Chinese (Traditional) 652 | 25 Korean 76
10 Dutch 638 | 26 Aou 6 Letter Codes 61
11 Hill Mari 533 | 27 Italian 57
12 Danish 479 | 28 Kannada 42
13 Hungarian 438 | 29 Ukrainian 40
14 Lozi 421 | 30 Sepedi 39
15 Turkish 416 | 31 Slovak 30
16 Spanish 326 |

New Lexicons

Lexicons created during the past month
Lexicon Names | Lexicon Names | Lexicon Names
Chifwe 1 | Otjihimba 2 | Chitotela 1
Mbalangwe 1 | Sesfontein Damara 2 | Cisubiya 1
Olkola 1 | Sifwe 1 |

Good to see our African languages bubbling up!

Yes, it is. Although I hadn’t known that Otjihimba was a separate language from Otjiherero.

Top Lexicons

Lexicons with 1000 or more names
# Lexicon Names | # Lexicon Names
1 English 271,242 | 29 4,347
2 Chinese (Simplified) 118,036 | 30 Catalan 4,133
3 Czech 79,299 | 31 Zulu 3,458
4 Japanese 69,017 | 32 Slovenian 3,410
5 Spanish 62,679 | 33 Indonesian 3,171
6 Russian 62,078 | 34 Mandarin Chinese 2,962
7 Chinese (Traditional) 57,867 | 35 Tswana 2,744
8 French 53,004 | 36 Croatian 2,461
9 German 46,479 | 37 Aou 4 Letter Codes 2,374
10 Dutch 38,383 | 38 Vermont Flora Codes 2,273
11 Finnish 36,370 | 39 Bulgarian 1,991
12 Portuguese 36,152 | 40 Slovak 1,764
13 Swedish 28,911 | 41 Latvian 1,640
14 Norwegian 27,285 | 42 Nahuatl 1,516
15 Danish 22,661 | 43 Armenian 1,434
16 Korean 20,639 | 44 Xhosa 1,410
17 Afrikaans 19,209 | 45 Malay 1,404
18 Hungarian 17,189 | 46 Sotho 1,386
19 Arabic 15,310 | 47 Maori 1,370
20 Polish 14,820 | 48 Tagalog 1,355
21 Thai 14,200 | 49 Belarusian 1,296
22 Lithuanian 13,857 | 50 Malayalam 1,260
23 Italian 13,437 | 51 Ojibwe 1,022
24 Estonian 12,158 | 52 Hodges Number 1,012
25 Ukrainian 9,743 | 53 Hawaiian 1,006
26 Hebrew 7,802 | 54 Visayan 1,006
27 Greek 4,903 | 55 Venda 1,000
28 Turkish 4,433 |

Korean up one place
‘Undefined’ (#29) down one as curators chip away at it
Catalan up one place
Venda reaches 1000 names and joins this list!

Active Lexicons

Lexicons with 30 or more new names
# Lexicon Change | # Lexicon Change
1 Chinese (Simplified) 4945 | 15 Dutch 219
2 Portuguese 2852 | 16 Irish 193
3 Korean 1708 | 17 Spanish 179
4 English 1586 | 18 Thai 165
5 Catalan 715 | 19 German 124
6 Turkish 524 | 20 Lithuanian 114
7 Hungarian 492 | 21 Afrikaans 109
8 Danish 457 | 22 Italian 93
9 Russian 408 | 23 Slovak 75
10 Chinese (Traditional) 379 | 24 Swedish 60
11 Japanese 332 | 25 Cherokee 41
12 French 325 | 26 Finnish 32
13 Ukrainian 262 | 27 Bulgarian 30
14 Polish 222 | 28 Slovenian 30

New Lexicons

Lexicons created during the past month
Lexicon Names | Lexicon Names | Lexicon Names
Aotearoa (New Zealand) Bilingual English And Maori 1 | Enan 1 | Peridea Moorei Ochreipennis 1
Biballa 1 | Kaxinawá 1 | Quimbundo 1
Cassange 1 | Manganja 2 | Quimbundu 1
Cazumbo 1 | Muanha 2 | Quissange 1
Cewa 1 | Nyungwe 2 |

Incidentally, when a new lexicon is added, how long does it take to show up in the list of lexicons? I have been adding names in Guaymí (a Chibchan-family language spoken in the Chiriquí region of Panama), and I have to use the “add new lexicon” option with each word because adding the first name did not put it on the list.

Definitely a bit weird. Here’s one of those names: https://www.inaturalist.org/taxon_names/3359924/edit

It shows as “unknown” on the edit page:

image

But appears correct on the taxon page:

This is a known issue; the lexicon will be added if you wait long enough. It is solved in the end:
https://www.inaturalist.org/taxa/155941/taxon_names/new
https://forum.inaturalist.org/t/it-seems-khanty-language-is-missing-from-possible-languages/37887/2

Top Lexicons

Lexicons with 1000 or more names
# Lexicon Names | # Lexicon Names
1 English 272,712 | 29 Turkish 4,617
2 Chinese (Simplified) 120,837 | 30 4,293
3 Czech 79,326 | 31 Zulu 3,471
4 Japanese 69,396 | 32 Slovenian 3,434
5 Spanish 63,233 | 33 Indonesian 3,189
6 Russian 62,223 | 34 Mandarin Chinese 2,962
7 Chinese (Traditional) 58,185 | 35 Tswana 2,750
8 French 53,810 | 36 Croatian 2,467
9 German 47,328 | 37 Aou 4 Letter Codes 2,374
10 Dutch 38,573 | 38 Vermont Flora Codes 2,273
11 Portuguese 36,553 | 39 Bulgarian 1,994
12 Finnish 36,387 | 40 Slovak 1,771
13 Swedish 28,926 | 41 Latvian 1,696
14 Norwegian 27,307 | 42 Nahuatl 1,516
15 Danish 22,807 | 43 Armenian 1,434
16 Korean 20,855 | 44 Xhosa 1,410
17 Afrikaans 19,242 | 45 Malay 1,407
18 Hungarian 17,593 | 46 Sotho 1,386
19 Arabic 15,349 | 47 Maori 1,371
20 Polish 15,158 | 48 Tagalog 1,355
21 Thai 14,339 | 49 Belarusian 1,296
22 Lithuanian 14,164 | 50 Malayalam 1,261
23 Italian 13,541 | 51 Ojibwe 1,022
24 Estonian 12,165 | 52 Hodges Number 1,012
25 Ukrainian 9,775 | 53 Hawaiian 1,010
26 Hebrew 7,824 | 54 Visayan 1,006
27 Catalan 5,480 | 55 Venda 1,000
28 Greek 4,906 |

Portuguese up one place
Catalan up two places
‘Undefined’ (#30) down one as curators continue to work on it

Active Lexicons

Lexicons with 30 or more new names
# Lexicon Change | # Lexicon Change
1 Chinese (Simplified) 2801 | 14 Dutch 190
2 English 1470 | 15 Turkish 184
3 Catalan 1347 | 16 Danish 146
4 German 849 | 17 Russian 145
5 French 806 | 18 Thai 139
6 Spanish 554 | 19 Guaymí 117
7 Hungarian 404 | 20 Italian 104
8 Portuguese 401 | 21 Icelandic 82
9 Japanese 379 | 22 Latvian 56
10 Polish 338 | 23 Luxembourgish 48
11 Chinese (Traditional) 318 | 24 Arabic 39
12 Lithuanian 307 | 25 Afrikaans 33
13 Korean 216 | 26 Ukrainian 32

New Lexicons

Lexicons created during the past month
Lexicon Names | Lexicon Names | Lexicon Names
Alyawarr 1 | Guaymí 117 | Kumeyaay 1
Chochenyo 1 | Jersey French 4 | Spa [1] 1

[1] Bad import, should presumably be Spanish.

Top Lexicons

Lexicons with 1000 or more names
# Lexicon Names | # Lexicon Names
1 English 274,202 | 29 Turkish 4,682
2 Chinese (Simplified) 123,728 | 30 4,278
3 Czech 79,403 | 31 Zulu 3,476
4 Japanese 69,648 | 32 Slovenian 3,438
5 Spanish 63,826 | 33 Indonesian 3,191
6 Russian 62,414 | 34 Mandarin Chinese 2,964
7 Chinese (Traditional) 58,456 | 35 Tswana 2,907
8 French 54,862 | 36 Croatian 2,470
9 German 47,529 | 37 Aou 4 Letter Codes 2,376
10 Dutch 38,636 | 38 Vermont Flora Codes 2,273
11 Portuguese 38,390 | 39 Bulgarian 1,994
12 Finnish 36,443 | 40 Slovak 1,776
13 Swedish 28,963 | 41 Latvian 1,705
14 Norwegian 27,345 | 42 Nahuatl 1,516
15 Danish 22,913 | 43 Armenian 1,434
16 Korean 21,068 | 44 Xhosa 1,411
17 Afrikaans 19,258 | 45 Malay 1,407
18 Hungarian 17,723 | 46 Sotho 1,393
19 Polish 15,787 | 47 Maori 1,371
20 Arabic 15,402 | 48 Tagalog 1,355
21 Thai 14,380 | 49 Belarusian 1,296
22 Lithuanian 14,225 | 50 Malayalam 1,261
23 Italian 13,729 | 51 Ojibwe 1,022
24 Estonian 12,173 | 52 Hawaiian 1,014
25 Ukrainian 9,908 | 53 Hodges Number 1,012
26 Hebrew 7,882 | 54 Romanian 1,011
27 Catalan 5,793 | 55 Visayan 1,006
28 Greek 4,908 | 56 Venda 1,003

Arabic up one place
‘Undefined’ (#30), names without a lexicon
Hawaiian up one
Romanian, new entry on the list!

Active Lexicons

Lexicons with 30 or more new names
# Lexicon Change | # Lexicon Change
1 Chinese (Simplified) 2891 | 16 Hungarian 130
2 Portuguese 1837 | 17 Danish 106
3 English 1490 | 18 Czech 77
4 French 1052 | 19 Turkish 65
5 Polish 629 | 20 Dutch 63
6 Spanish 593 | 21 Gumbaynggirr 63
7 Catalan 313 | 22 Lithuanian 61
8 Chinese (Traditional) 271 | 23 Hebrew 58
9 Japanese 252 | 24 Finnish 56
10 Korean 213 | 25 Arabic 53
11 German 201 | 26 Thai 41
12 Russian 191 | 27 Norwegian 38
13 Italian 188 | 28 Swedish 37
14 Tswana 157 | 29 Serbian 30
15 Ukrainian 133 |

New Lexicons

Lexicons created during the past month
Lexicon Names | Lexicon Names | Lexicon Names
Cadigal 1 | Gumbaynggirr 63 | Kunda 4
Goro 1 | Ju|'Hoan [1] 2 |
[1] See also Ju Hoan, Ju|'hoan, Juǀ’Hoan & Juǀ’hoan
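
Spelling variants like these can be flagged automatically before anyone merges them by hand. The sketch below is only one possible approach, with made-up helper names (`loose_key`, `group_variants`): it lowercases each lexicon name, drops everything except ASCII letters and digits, and groups names that end up with the same key.

```python
import re
from collections import defaultdict


def loose_key(name):
    """Casefold and keep only ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.casefold())


def group_variants(names):
    """Group names sharing a loose key; return only groups with 2+ members."""
    groups = defaultdict(list)
    for name in names:
        groups[loose_key(name)].append(name)
    return {key: found for key, found in groups.items() if len(found) > 1}


names = ["Ju|'Hoan", "Ju Hoan", "Ju|'hoan", "Juǀ’Hoan", "Juǀ’hoan", "Kunda"]
for key, found in group_variants(names).items():
    print(key, found)
```

The ASCII-only key is deliberately crude: it would collapse lexicon names written in non-Latin scripts to an empty string, so the output should be treated as a list of candidates for a curator to review, not as an automatic merge list.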

https://forum.inaturalist.org/t/finding-out-common-names-numbers-per-language/17902/33