Finding out common names' numbers per language

If a lexicon has not yet been added, is there a way to create it?

1 Like

Yes, anyone can create a lexicon, which has led to duplicates like Norwegian Bokmål and Norwegian Bokmal; and Nonbre Comun, Nombres Científicos, Nomes Científicos, Nomi Scientifici, Noms Scientifiques, Научные названия and 學名. There are about 670 lexicons at this moment!

2 Likes

Thank you a lot!

1 Like

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.

Does anybody know if it’s possible to do some kind of search of name statistics through the UI or some other way?

If you read between the lines, you can find an answer in the Tutorial section or even in this thread…
inaturalist-taxonomy.dwca\VernacularNames_Languages_count

https://drive.google.com/file/d/1Nbqi-qRmJQijoXTL_gsQBUATdHg3y8WE/view?usp=sharing (per year)

Data as of 1-1-2023, with a threshold of 10,000 names. There are over 1015 lexicons.

language lexicon COUNT
en English 262888
zh-CN Chinese (Simplified) 100218
cs Czech 79038
ja Japanese 67281
es Spanish 60175
ru Russian 58002
zh Chinese (Traditional) 55315
fr French 45777
de German 43379
nl Dutch 36845
fi Finnish 36241
pt Portuguese 29258
sv Swedish 28531
nb Norwegian 26996
da Danish 20267
af Afrikaans 19030
ko Korean 18465
ar Arabic 14960
th Thai 13253
it Italian 12938
pl Polish 12876
lt Lithuanian 12350
et Estonian 12132
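
For anyone who wants to reproduce a table like this, here is a minimal sketch in Python. It assumes the vernacular-name files in the export are CSVs with `language` and `lexicon` columns; I have not checked these column names against the actual archive, and the function name `count_by_lexicon` and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_by_lexicon(csv_text, threshold=0):
    """Count rows per (language, lexicon) pair, keeping pairs with at least `threshold` rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Column names "language" and "lexicon" are assumptions about the export.
    counts = Counter((row["language"], row["lexicon"]) for row in reader)
    # most_common() sorts by count, highest first.
    return [(lang, lex, n) for (lang, lex), n in counts.most_common() if n >= threshold]


# Invented sample rows, not real export data.
sample = """id,vernacularName,language,lexicon
1,Blackbird,en,English
2,Robin,en,English
3,Merel,nl,Dutch
"""

for lang, lex, n in count_by_lexicon(sample, threshold=2):
    print(lang, lex, n)
```

With the real files you would read each CSV from disk instead of a string and set the threshold to 10000.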

Interesting idea and maybe also valid for ‘common names’. But it has the same issue as a taxon name: if the correct name is not available, the observation will get an incorrect name or a taxon at a higher level.

More than 12,000 names in Estonian. Estonian is rather a niche language, with a bit over 1 million speakers according to Omniglot. This is similar to Fang and only half as many as Fante. Then there is Yoruba, with 42 million speakers.

I note that of the top 23 in your embedded image, the majority are European languages, and the one that could be thought of as African is of European origin (Afrikaans, genetically related to Dutch). Ethnologue estimates that there are 448 Indo-European languages as compared with 1,553 Niger-Congo languages. Niger-Congo is the largest language family, followed by Austronesian with 1,257 languages and Trans-New Guinea with 481. Indo-European is fourth.

What if the three largest language families had proportionately as many common names in iNat’s database as the fourth-place family?

2 Likes

You need to find people to add those. I’m saddened to see how few names many languages have; among them, Kazakh is at the top with 230. That’s unacceptable, and it’s no wonder there are so few observers.

2 Likes

I agree that there is a relation between the number of common names in iNaturalist and the use of the platform. But most people are too busy to add them… But does my reaction answer your question? Which statistics do you mean?

names=common names?
names=contributor? I do not get it.


https://forum.inaturalist.org/t/run-job-to-associate-translated-common-names-to-places/3964
About common names for lexicons without a language and without a place.

By names I mean common names; I wouldn’t refer to people as names.
Your doc gives the current statistics, but I don’t know what is needed to find them a year later or on any random day. By statistics I mean just that: the number of names in each language (added on iNat).

If Kazakh is your favorite language, I will add an extra table with Kazakh in it:

language lexicon COUNT
und Zulu 3440
und Mandarin Chinese 3119
und Setswana 2698
und Vermont Flora Codes 2273
und Nahuatl 1509
und Armenian 1428
und Xhosa 1400
und Tagalog 1353
und Malay (Individual Language) 1307
und Slovene 1257
und Malayalam 1253
und Aou 4 Letter Codes 1226
und AOU 4-Letter Codes 1150
und Hodges Number 1012
und Visayan 1006
und Venda 993
und Faroese 846
und Tsonga 809
und Icelandic 792
und Shona 732
und Sotho (Southern) 712
und Irish 697
und Swati 657
und Palauan 648
und Mongolian 612
und Español (Argentina) 566
und Western Mari 532
und U.S.D.A. Symbol 523
und Kwéyòl 481
und Bikol 476
und Other 462
und Ndebele 445
und Cebuano 429
und Si Lozi 421
und Sotho 410
und Sotho (Northern) 362
und Fijian 341
und Ilokano 339
und Navajo 317
und Chuvash 313
und Herero 310
und Waray (Philippines) 307
und Sesotho 300
und Modern Greek (1453 ) 275
und Alabama 270
und Totonaco 270
und Carolinian 264
und Zapoteco 259
und Asturian 255
und Bunun (Taiwan) 252
und Malagasy 248
und Tamil 245
und Oshiwambo 241
und Kazakh 230
und Hindi 218
und Romaji 213
und West Flemish 212

1 Like

Exactly what language is “other”? And is it really necessary to have two lexicons of Aou 4 Letter Codes and AOU 4-Letter Codes?
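
Spelling variants like these could probably be flagged automatically. A rough sketch in Python, assuming all you have is a list of lexicon names: strip accents, punctuation, spaces and case, then group names that collapse to the same key. The function names are hypothetical, and this will not catch translations of the same lexicon name (such as the various "scientific names" lexicons) or genuinely different spellings.

```python
import unicodedata
from collections import defaultdict


def normalize(name):
    """Reduce a lexicon name to a comparison key: no accents, punctuation, spaces or case."""
    # NFKD splits "å" into "a" plus a combining mark; the mark is not alphanumeric.
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if ch.isalnum()).casefold()


def likely_duplicates(lexicons):
    """Group lexicon names that share the same normalized key."""
    groups = defaultdict(list)
    for name in lexicons:
        groups[normalize(name)].append(name)
    return [names for names in groups.values() if len(names) > 1]


# Lexicon names taken from this thread.
names = [
    "Aou 4 Letter Codes",
    "AOU 4-Letter Codes",
    "Norwegian Bokmål",
    "Norwegian Bokmal",
    "Kazakh",
]
for group in likely_duplicates(names):
    print(group)
```

A human would still need to review each group before merging anything, since two different lexicons could in principle normalize to the same key.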

‘Other’ contains these VernacularNames. If 47 species are called Lampalampa, I guess the lexicon ‘Other’ might be a trash can for common names.

https://drive.google.com/file/d/1Nbqi-qRmJQijoXTL_gsQBUATdHg3y8WE/view?usp=sharing

‘Other’ seems to contain lexicons that are WIP:
{ lexicon: "Other", comment: "We banned this, but not sure what to do with existing ones" },

VernacularName Count
Banog 47
Danlugan 47
Lampalampa 47
Lubay-lubay 47
Tausay 47
Bagis 19
Tambago 12
Palagi 8
Pugpugot 8
Tapel 8
Ngege 6
Taysiw 6
Alibang-bang 5
Katonde 5
Mul-mul 5
Kapalagi 4
Mukunga 4
Tatifi 4
Kodo 3
Mongit 3
Tandu 3
Tapua 3
etc.
https://www.inaturalist.org/taxa/424291-Siganus-punctatissimus

Common names per Year (Covid=2020,2021)

created_YEAR COUNT
2023 508
2022 197976
2021 238635
2020 228220
2019 206377
2018 101252
2017 51147
2016 26715
2015 34608
2014 18598
2013 30383
2012 11421
2011 29920
2010 2879
2009 1397
2008 7384
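
A per-year count like this can be sketched in a few lines of Python. This assumes each name record has a `created` field holding a timestamp that starts with a four-digit year (for example an ISO 8601 string); that field name and format are assumptions, and the sample rows are invented.

```python
from collections import Counter


def names_per_year(rows):
    """Count records per creation year, newest year first."""
    # Assumes "created" starts with the year, e.g. "2022-03-01T10:00:00Z".
    # Rows with an empty or missing "created" value are skipped.
    years = Counter(row["created"][:4] for row in rows if row.get("created"))
    return sorted(years.items(), reverse=True)


# Invented sample rows, not real export data.
rows = [
    {"created": "2022-03-01T10:00:00Z"},
    {"created": "2022-11-15T08:30:00Z"},
    {"created": "2019-06-20T12:00:00Z"},
    {"created": ""},
]
for year, n in names_per_year(rows):
    print(year, n)
```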

These 4-letter codes are not unique:

vernacularName COUNT
RSFL 4
YSFL 3
BCBE 3
TRPE 2
STTS 2
ROPI 2
MOPA 2
GREJ 2
GBHA 2
COSH 2
CORS 2
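
A check like this is easy to sketch in Python: count how often each name occurs within one lexicon and keep those that occur more than once. The field names `vernacularName` and `lexicon` are assumptions about the export, the function name is hypothetical, and the sample rows use placeholder codes rather than real data. The same approach would work for listing the repeated names in any other lexicon.

```python
from collections import Counter


def repeated_names(rows, lexicon):
    """Return (name, count) pairs for names used more than once within one lexicon."""
    counts = Counter(
        row["vernacularName"] for row in rows if row["lexicon"] == lexicon
    )
    # most_common() sorts by count, highest first.
    return [(name, n) for name, n in counts.most_common() if n > 1]


# Placeholder rows for illustration only.
rows = [
    {"vernacularName": "AAAA", "lexicon": "AOU 4-Letter Codes"},
    {"vernacularName": "AAAA", "lexicon": "AOU 4-Letter Codes"},
    {"vernacularName": "BBBB", "lexicon": "AOU 4-Letter Codes"},
    {"vernacularName": "AAAA", "lexicon": "English"},
]
print(repeated_names(rows, "AOU 4-Letter Codes"))
```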

Please could Si Lozi and Silozi be combined? Si Lozi is wrong and Silozi is correct.

1 Like

And is Si Swazi a correct lexicon? I would expect it to be siSwati, and likewise siLozi.

https://en.wikipedia.org/wiki/Swazi_language
https://en.wikipedia.org/wiki/Lozi_language

If you wish you can check if the Cleaning Script https://forum.inaturalist.org/t/clean-up-currently-available-lexicons/12453/7 fixed this.

Is what you want to know the ‘number of common names per language’ for the languages where you have added common names?

1 Like

That is why I asked in the other thread where to find the list of lexicons.

I’m rather impressed with myself. Unless there is someone besides me interested in Láadan, as far as I know I added all names in that lexicon; 98 of them, according to the VernacularNamesCount.zip. I had no idea the number was so high. It didn’t feel like that many when I was adding them.

I will say, it is rather tedious to scroll through that list, as it does not appear to be searchable. If you don’t know how many names are in a given lexicon, you literally have to scroll through the whole thing until you see the name.
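
If the list can be loaded as (lexicon, count) pairs, a small search helper would avoid the scrolling. This is only a sketch in Python: I am not assuming anything about the actual format of the file, the function name is made up, and the three sample entries are counts quoted in this thread.

```python
def find_lexicon(counts, query):
    """Return (lexicon, count) pairs whose lexicon name contains `query`, ignoring case."""
    needle = query.casefold()
    return [(lexicon, n) for lexicon, n in counts if needle in lexicon.casefold()]


# Counts quoted in this thread.
counts = [("Kazakh", 230), ("West Flemish", 212), ("Láadan", 98)]
print(find_lexicon(counts, "láadan"))  # [('Láadan', 98)]
```

The match is a plain substring test, so searching for "flem" would also find West Flemish, but a query without the accent ("laadan") would not match.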

2 Likes
