How complete is Arthropoda on iNaturalist?

So, there have been posts celebrating the milestone of 1/3 of vertebrates being on iNaturalist, and then another post celebrating the milestone of 1/2. But then I wondered, what percent of arthropods have been observed on iNaturalist? And so I carried out my investigation. I took total species counts for 126 arthropod taxa from a variety of sources. I used taxon-specific catalogues and databases whenever possible; otherwise I used recent papers. Most of the taxa are order level, but there are some subclasses too. I will simply refer to them all as orders henceforth for the sake of simplicity. I am only counting known species, not theoretical estimates of how many species are really out there. Then I looked up how many species of each order are on iNaturalist, and how many observations of each order there are. I chose not to filter by research grade, since doing so would end up excluding thousands of correctly ID’d observations. However, inevitably there are a number of incorrectly identified observations as well. This, combined with the fact that the number of arthropod species is always being revised due to both new discoveries and taxonomic reshuffling, means that all the numbers you see henceforth should be taken with a grain of salt.

Of course, orders with more species overall will have more species observed on iNaturalist.

And orders with more species observed on iNaturalist also have more total observations.

But what does this all actually amount to?

In total, 225,430 species of arthropods have been observed on iNaturalist, compared to a known total of 1,238,277. That is around 18.2% completeness, compared to the roughly 50% for vertebrates. But if you break it down by taxa, the picture gets more complicated.
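For anyone who wants to check the headline figure, here is a minimal Python sketch of the arithmetic. The two counts come from the post; the function name `completeness` is just an illustrative choice, and "completeness" is assumed to mean observed species divided by known species.

```python
# Counts quoted in the post.
observed_species = 225_430
known_species = 1_238_277


def completeness(observed: int, known: int) -> float:
    """Percent of known species that have at least one observation."""
    if known <= 0:
        raise ValueError("known species count must be positive")
    return 100 * observed / known


arthropod_pct = completeness(observed_species, known_species)
print(f"{arthropod_pct:.1f}%")  # 18.2%
```

The same function can be applied to each order's pair of counts to get the per-order percentages discussed below.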

Here is the breakdown by higher taxa. As you can see, the percent completeness for hexapods (insects) is slightly higher than for arthropods as a whole, and every other taxon is below the arthropod average. People talk a lot about how insects are overlooked, which they are, but non-insect arthropods are clearly worse off!

Breaking it down by order makes things even clearer. 42 out of 126 orders have a higher completeness percentage than arthropods as a whole, while the rest fall below it.

So, what makes some orders more or less sampled than others? Is it perhaps the number of species?

There isn’t really a clear pattern here. Orders with fewer than 100 species have the highest average completeness at just over 20%, but orders with over 10,000 species have the second highest average completeness at 14.8%. Orders with between 100 and 1,000 species have the third highest average completeness at 14.15%, and orders with between 1,000 and 10,000 species have the lowest average completeness at only 13.47%. I guess orders with intermediate species counts have too many species to sample quickly and easily, but not so many that they make up a massive part of the fauna and generate a lot of public interest? Perhaps.
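The grouping described above can be sketched roughly as follows. This is only an illustration: the exact bin boundaries (for example, whether an order with exactly 1,000 species counts as "100 to 1,000") are an assumption, the averages are assumed to be unweighted means of each order's percentage, and `placeholder_rows` holds made-up numbers, not the real dataset.

```python
from collections import defaultdict
from statistics import mean


def size_class(known_species: int) -> str:
    """Assign an order to a species-count bin (boundaries are an assumption)."""
    if known_species < 100:
        return "<100"
    if known_species < 1_000:
        return "100-999"
    if known_species < 10_000:
        return "1,000-9,999"
    return "10,000+"


def mean_completeness_by_class(rows):
    """rows: iterable of (known_species, observed_species), one pair per order."""
    groups = defaultdict(list)
    for known, observed in rows:
        groups[size_class(known)].append(100 * observed / known)
    # Unweighted mean: every order counts equally, whatever its size.
    return {label: mean(values) for label, values in groups.items()}


# Hypothetical placeholder orders, NOT the figures used in the post.
placeholder_rows = [(50, 10), (80, 24), (500, 50), (5_000, 1_000), (20_000, 3_000)]
averages = mean_completeness_by_class(placeholder_rows)
for label, value in averages.items():
    print(f"{label}: {value:.2f}%")
```

Note that an unweighted mean of per-order percentages is not the same as pooling all species in a bin, which is why the bin averages need not bracket the 18.2% overall figure.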

Perhaps habitat might be the deciding factor?


I used the habitat of adults, since arthropods are primarily collected and identified by their adult stage. So this means stuff like dragonflies and mayflies are under terrestrial. Terrestrial orders do have a higher average completeness than aquatic orders. This makes sense, as terrestrial habitats are generally more accessible to humans than aquatic ones. Aquatic orders have an average completeness of 14.75% and terrestrial orders have an average completeness of 17.94%. Keep in mind this is not a perfect division since there are aquatic species in terrestrial orders and vice versa.

But there is one more factor to consider. Body size. People pay more attention to giant lobsters, tarantulas and butterflies than to tiny zooplankton, mites, and springtails.


And what do you know, it’s a very large difference! Large bodied orders have an average completeness of 20.94% and small bodied orders 7.3%. My assignment of orders as “large” or “small” was admittedly somewhat arbitrary, but generally if you need a microscope to see most species of a given order as anything more than a moving dot (or to see them at all), they are under the small bodied category. This includes most mite orders, all copepod, ostracod, and cladoceran orders, pseudoscorpions, palpigrades, pauropods, springtails, psocids, and several obscure crustacean orders. Almost everything else was lumped as “large bodied”. This is far from a perfect division, as “large bodied” taxa contain many very tiny species, and “small bodied” taxa also contain some fairly large species (compare fairy wasps vs giant velvet mites). If you could sort this out by family, the division would grow even sharper.

So to lay out all the possible differences, let’s do a case study of two orders in particular, the Odonata and Thermosbaenacea. Odonata, the dragonflies and damselflies, is the most completely sampled arthropod order with more than 10 species, at 67.55%. Thermosbaenacea is a group of troglobitic (cave dwelling) crustaceans with no observations on iNaturalist, and thus no observed species.

  1. Size. Odonates are generally pretty large and conspicuous insects, and thus easy to see. Thermosbaenaceans are scarcely more than a few mm long.
  2. Rarity. Odonates are reasonably common in many areas of the world, and at times form large swarms. Thermosbaenaceans on the other hand are quite rare.
  3. Accessibility. Odonates can be seen flying around in basically every kind of terrestrial habitat, and their young can be found in all manner of shallow freshwater bodies like ponds and creeks. Thermosbaenaceans live in caves and thermal springs, places that only people with specialized gear can get into.
  4. Popularity. Everyone knows what a dragonfly is, and they are generally among the most well-liked insects. Ask anybody if they know what a thermosbaenacean is and they will look at you like you came from outer space.
  5. Ease of identification. Odonates are generally among the easier insects to identify, and from what I hear many of them can even be reliably identified from photos, which is atypical for arthropods. I don’t know how easy it is to identify thermosbaenaceans to species level, but if it’s anything like other microarthropods it’s very hard and you need a microscope.
  6. Number of species. At around 6,000 species, Odonata is certainly a large order, but it is not incomprehensibly large. Several other taxa such as Lepidoptera, Coleoptera, and Araneae share many of the advantages listed above with odonates, but their extremely high number of species makes sampling them to anything even remotely close to completion very difficult. Odonates have few enough species that all their other advantages can operate at full force. Thermosbaenacea actually has very few species, only 34, so in theory a truly dedicated team of people could put every species on iNaturalist in a short amount of time. But in spite of the low species count, they have pretty much everything else working against them, which keeps them from reaching any satisfactory level of sampling.

Allow me to make one last proposition. Once iNaturalist reaches 20% of all arthropods observed, we should make it a holiday!

This was honestly a really interesting way to look at how we are reaching different taxonomic groups through iNaturalist monitoring. I would also be very curious to see how geographic bias plays into things, considering many of the most biodiverse places in the world are rather underobserved.

Thanks again for all of your work!

Amazing work! Enjoyed reading this.

Great write up! Funnily enough I actually published a paper last year on something very similar: completeness of Australian terrestrial invertebrates on iNat. Open access paper here: https://doi.org/10.1002/fee.2604

Here’s our main figure looking at completeness as of 2021, and how it changed from 2020:


Figure 1. Recognition and completeness of Australian terrestrial invertebrates from photographs uploaded to iNaturalist across 39 iconic taxa (for 2020, n = 527,313; for 2021, n = 1,013,171). The size of each point is scaled to the log of the number of observations for that taxon. The colored rectangles were manually drawn to group what we perceived as taxa with similar recognition and completeness on iNaturalist, based on the 2021 dataset. Capital letters A–D denote each group.

Odonata was also top of the charts for Australia!

Wow, I will surely look into this study in depth. I didn’t know someone had pursued a very similar investigation to mine! Of course the scopes were a bit different, as I did the whole phylum Arthropoda regardless of habitat or geography and you did land inverts of Australia, but nonetheless the aim is quite similar overall. And glad to see we came to similar results as well!

How do you think we should celebrate reaching 20% of all Arthropoda?

Clearly, we should go outside and find yet more arthropods!

Very interesting post, thank you for sharing these statistics!
One comment:

I think habitat is a far larger factor than shown here, but “aquatic” vs. “terrestrial” is too broad to show that.
Apart from obvious ones (like deep sea species being basically impossible to observe for the average naturalist unless they get washed ashore), there could also be an interesting difference between marine and freshwater habitats, between pelagic and benthic, or for terrestrial organisms between organisms living in the soil, under logs/rocks/leaf-litter, on plants, or free-flying ones.
Also, differences between ecosystems would be interesting to see (forests vs swamps vs meadows, etc.).

This is a very good point! However, working out which of these specific habitat preferences is most common for each order would have been a lot more difficult, and somewhat beyond the scope of my capabilities for the time being. It does deserve a more in-depth investigation in the future, though.

If I was peer reviewing this, I would ask you to show the error bars and to run a statistical test for significance. My visual impression is that the first three bars are likely not significantly different, while the fourth bar might be.

What are those exactly?

When dealing with averages, error bars usually show how much the individual data points differed from the average. So, taking your bar graph as an example, there were presumably several orders with <100 species, and you showed the average, just over 20% complete. But within that set, not every order was exactly the same percent complete; some were more complete, others less complete. Error bars would give us an idea of how much variation there was within that category. Often standard deviation or standard error would be used. Error bars are explained more fully in this article: Error Bars in Graphs: What They Tell Us About Data
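As a rough sketch of what goes into an error bar, the snippet below computes the mean, the sample standard deviation, and the standard error for one category using only the Python standard library. The values in `placeholder_pcts` are invented for illustration and are not the real per-order percentages; `summarize` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(values):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate spread")
    avg = mean(values)
    sd = stdev(values)   # spread of the individual orders around the mean
    se = sd / sqrt(n)    # uncertainty of the mean itself
    return avg, sd, se


# Hypothetical completeness percentages for the orders in one category.
placeholder_pcts = [5.0, 10.0, 15.0, 20.0, 50.0]
avg, sd, se = summarize(placeholder_pcts)
print(f"mean = {avg:.1f}%, SD = {sd:.1f}, SE = {se:.1f}")
```

A bar drawn at `avg` with whiskers of plus or minus `sd` shows how much the orders vary; whiskers of plus or minus `se` show how precisely the average is pinned down.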

Statistical tests include procedures such as Student’s t-test and the various types of analysis of variance (ANOVA). Essentially, they calculate how likely it is that a difference as large as the one observed would arise by chance alone, which helps us judge whether it represents a meaningful difference in real life. So, for example, say you have five coins, and you flip each coin 100 times. Coin 1 comes up heads 56 times, coin 2 comes up heads 48 times, coin 3 comes up heads 51 times, coin 4 comes up heads 50 times, and coin 5 comes up heads 43 times. Does this mean that there is a real difference in the coins, and that coin 1 is biased toward heads or that coin 5 is biased toward tails? Not necessarily. These numbers may have happened randomly by chance.

So when you see a p-value in a scientific paper, this is what is being tested. P<0.05 means that, if there were no real difference between the samples, differences at least as large as those observed would turn up by random chance less than 5% of the time. That is conventionally taken as good evidence of a real difference. Similarly, P<0.01 means that such differences would turn up by chance less than 1% of the time, which is stronger evidence still.
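The five-coin example can be checked with an exact two-sided binomial test, sketched here with only the standard library. The choice of this particular test is an assumption made for illustration (a t-test or ANOVA would be the analogue for the completeness data), `two_sided_binomial_p` is a hypothetical helper, and each coin is tested on its own with no correction for testing five coins at once.

```python
from math import comb


def two_sided_binomial_p(heads: int, flips: int) -> float:
    """Exact two-sided p-value for the hypothesis that the coin is fair."""
    # A fair coin gives a symmetric distribution, so count every outcome
    # at least as far from flips / 2 as the observed number of heads.
    distance = abs(heads - flips / 2)
    extreme = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - flips / 2) >= distance
    )
    return extreme / 2 ** flips


# Heads out of 100 flips for each coin, as given in the example above.
heads_per_coin = {1: 56, 2: 48, 3: 51, 4: 50, 5: 43}
p_values = {coin: two_sided_binomial_p(h, 100) for coin, h in heads_per_coin.items()}
for coin, p in p_values.items():
    print(f"coin {coin}: p = {p:.3f}")
```

None of the five p-values falls below 0.05, so results like these are entirely consistent with fair coins.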

[…geographic bias…]

Since I spent my last 4 years with worldwide checklists of Noctuoidea and Geometroidea (both Lepidoptera), here are some data related to this, given as the % of species observed on iNat vs. those on the country/region checklist.

As you can see, there are major differences.



I decided to look further into this, and it seems there is not any significant difference between aquatic and terrestrial orders in terms of taxonomic comprehensiveness. However…

In terms of the sheer raw number of observations, there is a clear separation between primarily aquatic and primarily terrestrial taxa.

Come and help to ID where there are obs, but not enough identifiers. Imagine what treasure waits, patiently, to be found by skilled eyes!

See geographic bias

Very interesting! :D
I’d have thought that fewer observations would correlate with completeness. I notice also that aquatic orders appear to have fewer species than terrestrial ones, so maybe that has something to do with this?

I think the difference is that not all observations are identified to species.
