Using Inat data for relative abundance?

jbbohan · October 26, 2024, 6:30am

How would you get an idea of how a population in a given area has changed over the years using iNat observations?
Some statistical standards must exist, right?
Do you standardize the number of observations of a species by the number of observers (which increases exponentially with time)? Or by the number of observations of all related species (a kind of measure of sampling effort)?

ahospers · October 26, 2024, 11:36am

There should be topics on that; maybe the list-length method?
https://kevintshoemaker.github.io/NRES-746/Occupancy.html
Occupancy models are used to help estimate true occupancy of a species, also known as the latent state (z). They can assist with accounting for imperfect detection of organisms in a study, and help us determine the true occupancy and the detection probabilities of the species at the site. Because detecting wildlife species is not done with 100% accuracy, we can use occupancy models to help us determine the proportion of times that we don’t detect the species and either the species is truly not there or the species is there, and we just did not detect it. Therefore, occupancy models can help us determine the uncertainty in our detection.
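To make the description above concrete, here is a minimal sketch of the simplest single-season occupancy likelihood, assuming one constant occupancy probability (`psi`) and one constant detection probability (`p`) for all sites and visits. The function names, the crude grid search and the detection histories are all made up for illustration; a real analysis would use covariates and a proper optimiser.

```python
import math

def site_likelihood(history, psi, p):
    """Likelihood of one site's detection history (list of 0/1, one per visit)."""
    detections = sum(history)
    visits = len(history)
    # Probability of this exact history if the site is occupied.
    occupied = psi * (p ** detections) * ((1 - p) ** (visits - detections))
    if detections > 0:
        return occupied
    # Never detected: either occupied but missed on every visit, or truly absent.
    return occupied + (1 - psi)

def neg_log_likelihood(histories, psi, p):
    return -sum(math.log(site_likelihood(h, psi, p)) for h in histories)

def fit_grid(histories, steps=99):
    """Crude maximum-likelihood fit by grid search over psi and p in (0, 1)."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return min(((psi, p) for psi in grid for p in grid),
               key=lambda pair: neg_log_likelihood(histories, *pair))

# Hypothetical data: 6 sites, 3 visits each (1 = species detected on that visit).
histories = [[1, 0, 1], [0, 0, 0], [0, 1, 0], [0, 0, 0], [1, 1, 0], [0, 0, 0]]

# Naive occupancy: share of sites with at least one detection.
naive = sum(1 for h in histories if any(h)) / len(histories)
psi_hat, p_hat = fit_grid(histories)
print(f"naive occupancy {naive:.2f}, estimated psi {psi_hat:.2f}, estimated p {p_hat:.2f}")
```

The estimated `psi` comes out above the naive share of sites with a detection, because the model allows that some of the "empty" sites were occupied but missed.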
https://forum.inaturalist.org/t/spatial-data-completeness/41390/4
Libellengemeenschappen van Vlaamse poelen, vijvers en meren [Dragonfly communities of Flemish pools, ponds and lakes] (Maxime Fajgenblat)
https://www.kuleuven.be/wieiswie/nl/person/00134599
https://orcid.org/0000-0002-2233-1527
https://pureportal.inbo.be/nl/publications/libellengemeenschappen-van-poelen-vijvers-en-meren-in-vlaanderen-
https://forum.inaturalist.org/t/how-to-distinguish-increased-observations-of-a-species-from-overall-increased-observations/35740
https://forum.inaturalist.org/t/biases-in-inat-data/23943/84
https://forum.inaturalist.org/t/published-papers-that-use-inaturalist-data-wiki-1-up-to-2019/2859
Problems with old observations/complete lists vs. current occasional observations with incomplete lists
https://natuurtijdschriften.nl/pub/1023675/VLIN2023038010011.pdf
https://natuurtijdschriften.nl/pub/1000339/SchubbenEnSlijm2021013001002.pdf
https://www.vlinderstichting.nl/actueel/nieuws/nieuwsbericht/25jaarnem-een-kijkje-in-de-keuken-bij-het-meetnet-vlinders
Just for completeness (I hope these links work; otherwise I will try another way):

Article about occupancy models: https://www.researchgate.net/publication/262830758_Opportunistic_citizen_science_data_of_animal_species_produce_reliable_estimates_of_distribution_trends_if_analysed_with_occupancy_models?ev=prf_pub
Article about butterfly flight periods: https://www.researchgate.net/publication/5616084_Bias_in_phenology_assessments_based_on_first_appearance_data_of_butterflies?ev=prf_pub

Occupancy-detection Modeling

https://www.youtube.com/watch?v=uhCT4L6VxzI State of Dragonflies in Britain and Ireland 2021 report summary by Dave Smallshire. Dave Smallshire is part of the Editorial team on the State of Dragonflies in Britain and Ireland 2021 report. In this video, Dave delves into some of the findings from the report and discusses potential causes and future research needs. Read the full report on our website here:
https://www.youtube.com/watch?v=zD7kPTBkvbE&list=PLh65JUJK7GolKCSiKQFLAI23HZlvIJ1fi Dave Smallshire - Europe's Dragonflies. Talk one from our Autumn Meeting, which took place online on Saturday 14th November 2020.
https://forum.waarneming.nl/index.php/topic,488826.msg2495675.html#msg2495675
Melchior 2013 CBS [Hello Hisko (and others),](https://forum.waarneming.nl/index.php/topic,67838.msg1356531.html#msg1356531) Today I attended a symposium in The Hague organised by CBS (Statistics Netherlands) together with the Netwerk Ecologische Monitoring. The theme was "Order out of Chaos: occupancy models are changing nature monitoring". More than 130 people heard how it turns out to be possible to derive species trends from casual (not systematically collected) observations (such as those on [waarneming.nl](https://waarneming.nl/)) with the help of new statistical models (occupancy models). Unfortunately I did not see anyone from Stichting Natuurinformatie there, because [waarneming.nl](https://waarneming.nl/) came up several times. I found it a very interesting day that could be important for the future of [waarneming.nl](https://waarneming.nl/). Until now, species trends have mainly been calculated from observations collected in the monitoring schemes of the Netwerk Ecologische Monitoring (including breeding bird surveys, butterfly transects, etc.). Because this monitoring is structured, it records not only the presence of species but also the absence of the other species (zero counts). **It is precisely those zero counts that are decisive when determining trends. Collecting these observations through the monitoring schemes is, however, very labour-intensive, and it has turned out that the trends from these data are sometimes hard to interpret. Collecting casual, incidental observations, as on [waarneming.nl](https://waarneming.nl/), has really taken off in recent years. But these include almost no zero observations.** Moreover, there are large differences in method, intensity and observer knowledge, so these observations cannot simply be compared for trend analyses.
Occupancy models nevertheless turn out to make it possible to include these casual observations in trend analyses. It would go too far to explain here exactly how occupancy models work. I am also not a statistician. The **theory behind the model is to use data from an area that has been visited several times in a season. Detection probabilities can be calculated from these data. These detection probabilities play a key role in calculating species trends.** The symposium also made clear that although **occupancy models are powerful tools, trends can be calculated by far the best if more zero observations are collected**. I think [waarneming.nl](https://waarneming.nl/) could play a (bigger) role in that. At the moment zero observations can only be entered at the level of an area (for example http://waarneming.nl/waarneming/view/74212849). I would like to argue for making this possible at other scales as well (the standard levels of [waarneming.nl](https://waarneming.nl/), i.e. 1000, 100 and 10 metres). Of course the intention is not that, when you observe a Blackbird somewhere, you enter the 349 other bird species that you did not observe. Everyone will see that this adds nothing (and it is unlikely to happen, since entering all of that is a lot of work). It is more about situations such as a Fen Orchid (Groenknolorchis) site that you return to after two years to find the species suddenly gone, or that small dragonfly population in that little pool which, after long and thorough searching, suddenly seems to have disappeared completely. Those are the zero observations that can help nature management and policy. I have argued before for the ability to enter zero observations. With this post I make a final appeal to make this possible.
P.S. Incidentally, I noticed that current zero observations do not show up in any overview. That is logical, given the current setup. If the options for zero observations are expanded, it would be good if such observations also became visible somewhere (it could be a challenge for some people to go back and look for that species after all!).
https://forum.waarneming.nl/index.php/topic,67838.msg1356531.html#msg1356531
https://forum.waarneming.nl/index.php/topic,248778.msg1483898.html#msg1483898 Centraal Bureau voor de Statistiek en De Vlinderstichting in het Journal of Applied Ecology, 11 september 2013 online. http://onlinelibrary.wiley.com/doi/10.1111/1365-2664.12158/abstract. https://forum.waarneming.nl/index.php/topic,248778.msg1485171.html#msg1485171
https://www.natuurpunt.be/sites/default/files/documents/publication/herremans_-_2012_-_fenologie_goden_uit_oosten_laat_in_2011.pdf
https://forum.inaturalist.org/t/estimating-species-populations-from-number-of-users/29567/10
https://jamesepaterson.github.io/jamespatersonblog/2020-11-09_occupancy_part2.html
https://nsojournals.onlinelibrary.wiley.com/doi/10.1111/oik.08215 p: [Deriving indicators of biodiversity change from unstructured community-contributed data](https://onlinelibrary.wiley.com/doi/10.1111/oik.08215). It’s pretty much about how to overcome the biases and “messiness” of iNat data to find patterns and see change through time. I apologize that it’s not open-access - Wiley’s OA fees are prohibitive, and since this was grant-funded work we couldn’t afford it. https://x.com/giorapac/status/1404827801841061888?s=20
Here is the paper. It should be open access, so anyone can read it without a paywall. https://esajournals.onlinelibrary.wiley.com/doi/10.1002/fee.2783
They compare butterflies reported on iNaturalist with those reported on eButterfly. They are only able to compare the coasts of the mainland US + Canada because there are too few lists posted on eButterfly from the interior of the US and Canada.

Here is the figure showing the over- and under-represented butterflies on iNaturalist:

jasonhernandez74 · October 26, 2024, 7:41pm

That wouldn’t necessarily work. You could have several observers who each post one observation of a given species in a given area, and then one observer who posts an observation of every individual of that species that they encounter.

This relates to a disagreement in another thread:

A problem with it (besides being really tedious for an identifier who wants to see some variety while identifying) is that it doesn’t work with what you are proposing.

paul_dennehy · October 26, 2024, 9:48pm

I don’t think iNat is particularly useful for abundance data. The goal is to encourage people to interact with nature, so if someone is really excited to see a cool species for the first time, they may post a bunch of photos of that species from the same area. Good for them, I’m glad they’re excited. iNat is a record of human-organism interactions, not a measure of organism presence/absence. No matter how many statistical methods are used, the fact remains that the abundance/lack of data points for each species is linked mostly to the idiosyncrasies of the humans posting their interactions. I think bunnies are cuter and more fun to look at than squirrels, so my yard has a massive pile of rabbit observations and almost no squirrel observations, entirely because of my personal preferences of what animals I like to look at. Every human observer has their own personal preferences like this, and disentangling the personal biases of over 3 million observers from their 200 million + observations seems like a statistically impossible task. iNat is great for documenting new and interesting records, and for getting broad range data about well-observed species, but the more one “zooms in” on the data, the more it becomes clear that abundances and small-scale range information from iNat are the very definition of sampling bias. Which is fine, because iNat isn’t based on any sort of robust data-gathering protocol; it’s a site to encourage interaction with nature. “Send 3 million people outside to take any number of pictures of whatever they want to for a decade” would be the worst study design ever for detecting a particular species’ rise or decline in population, and the best statistics in the world can’t fix data from a poorly designed study.

natev · October 26, 2024, 11:11pm

I’ve found this to be impossible on a local scale, for reasons already elaborated here. I find it to be possible, with a ton of caveats, on regional and national levels: individual idiosyncrasies tend to level out, to a degree. Even in that case, iNaturalist is best used for identifying interesting hypotheses to test (e.g., “hm, looks like this insect is most abundant in these ecoregions; let’s design a study to test that hypothesis”).

duncanross · October 27, 2024, 12:38am

I personally post every single instance of seedling sycamores (or even seeds) I find in areas outside of (locally) known woodlands, in one particular area, as it’s a highly invasive species and it’s useful to have a good record of where they’re spreading to and how significantly. Some days I have posted 50 observations of sycamore saplings ;)

jbbohan · October 27, 2024, 4:33am

Worst design but very exciting field survey!

jbbohan · October 27, 2024, 4:43am

I think I should be able to detect a ‘really’ significant trend by simply standardizing over several variables to go beyond bias:

Say my county has that species of snail, with N its number of observations per year. I could plot:
N / total number of observations in the county
N / total number of observers
N / total number of species observed
N / total number of mollusks observed

If all 4 plots showed an increase in relative abundance, that would hint at the population increasing. Of course, with that level of bias, if not all plots ‘agree’ then there is not much to say.
Comments?
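For anyone who wants to try the four ratios above, here is a rough sketch. The yearly totals are invented, the field names are hypothetical, and I read "mollusks observed" as the number of mollusk observations. Comparing only the first and last year is a simplification, not a statistical test.

```python
# Denominators for the four proposed ratios (field names are hypothetical).
DENOMINATORS = ["all_observations", "observers", "species", "mollusk_observations"]

def effort_ratios(yearly):
    """Return {year: {denominator: N / denominator}} for the target species."""
    return {
        year: {name: row["target"] / row[name] for name in DENOMINATORS}
        for year, row in yearly.items()
    }

def all_ratios_increased(ratios):
    """True only if every ratio is higher in the last year than in the first."""
    first, last = min(ratios), max(ratios)
    return all(ratios[last][name] > ratios[first][name] for name in DENOMINATORS)

# Invented county totals; "target" is N, the snail's observations that year.
yearly = {
    2020: {"target": 10, "all_observations": 1000, "observers": 50,
           "species": 200, "mollusk_observations": 100},
    2023: {"target": 40, "all_observations": 2000, "observers": 80,
           "species": 250, "mollusk_observations": 160},
}

ratios = effort_ratios(yearly)
agree = all_ratios_increased(ratios)
print(ratios[2020], ratios[2023], agree)
```

With real data one would plot each ratio for every year instead of comparing two endpoints, and the caveats raised elsewhere in this thread still apply to all four denominators.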

eyekosaeder · October 27, 2024, 6:34am

[I just realised a significant portion of my post has been already said by @natev far more concisely and to the point than I could have, so I have deleted that part…]

It would be a hint, yes, but I’d question whether this would be the most effective way to get that data (I can think of a lot of things which may cause a decrease in one of the plots, so I assume cases where all 4 agree would be relatively rare).
Additionally, there are far too many different explanations for an increase, so I wouldn’t be comfortable relying on these plots too much (i.e. I wouldn’t use them in a scientific paper).

I think the best (safest) way to do this would be gathering the data yourself (or with a group of like-minded people). For that you can use iNat: you could make a project and upload every instance of the species of choice you come across within a given area. However, the same thing can be achieved with just pen and paper (or an Excel spreadsheet), so in this case I would agree with Jason that it may be annoying to many IDers, and I wouldn’t recommend using iNat for this.

ahospers · October 27, 2024, 7:08pm

That is exactly what Maxime Fajgenblat did. He created a profile for every observer and related the number of species recorded during each visit to the total number of species seen in that area (in that year). The calculations took 12 days.

Problems with old observations/complete lists vs. current occasional observations with incomplete lists
https://natuurtijdschriften.nl/pub/1023675/VLIN2023038010011.pdf
https://natuurtijdschriften.nl/pub/1000339/SchubbenEnSlijm2021013001002.pdf
https://www.vlinderstichting.nl/actueel/nieuws/nieuwsbericht/25jaarnem-een-kijkje-in-de-keuken-bij-het-meetnet-vlinders
(https://assets.vlinderstichting.nl/docs/b3ffb8ee-9e6b-48f9-b467-981ef3ea6fb5.pdf https://www.vlinderstichting.nl/service-en-vragen/publicaties/vlinderbalans)

This kind of graphic was in his presentation when he was analysing the data. The webinar should be available somewhere (NVL), but I am afraid the data have not been published, so they are not available.

The Decade column indicates the 10-day period in which the highest number is reached. That is the period in which the species was observed most often.
The Aanw column gives the percentage of lists on which the species occurs within the same species group. A higher percentage means that the species is observed more often.
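As a rough illustration of those two columns, the sketch below computes a list-based reporting rate (the share of lists that include a species) and the 10-day period with the most records. The checklists and dates are invented, the function names are hypothetical, and binning by day of year in blocks of ten is an assumption; the original table may define its 10-day periods differently.

```python
from collections import Counter

def reporting_rate(checklists, species):
    """Percentage of checklists (sets of species names) that include the species."""
    if not checklists:
        return None
    hits = sum(1 for checklist in checklists if species in checklist)
    return 100.0 * hits / len(checklists)

def peak_ten_day_period(days_of_year):
    """Return (first_day, last_day) of the 10-day block with the most records."""
    blocks = Counter((day - 1) // 10 for day in days_of_year)
    block = blocks.most_common(1)[0][0]
    return block * 10 + 1, block * 10 + 10

# Invented example: four dragonfly lists and four record dates (day of year).
checklists = [
    {"Anax imperator", "Ischnura elegans"},
    {"Ischnura elegans"},
    {"Libellula depressa", "Ischnura elegans"},
    {"Anax imperator"},
]
rate = reporting_rate(checklists, "Anax imperator")
peak = peak_ten_day_period([150, 152, 155, 171])
print(rate, peak)
```

The reporting rate only makes sense if each list is a reasonably complete record of one visit, which is exactly the weak point with casual observations.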

spiphany · October 27, 2024, 8:11pm

Was this calculated per outing or across all of a person’s activities? It seems like the latter would require that observers’ habits remain reasonably constant over time. This certainly isn’t the case for me, and I doubt I am an exception in this. My interests have become more focused over time, and camera equipment also plays a role in what I document and choose to look for. I know I’m also more likely to record certain taxa if it is the first time I am visiting a particular location, or if I’m not having luck finding other organisms of interest (maybe it is the wrong time of day or the wrong time of year).

duncanross · October 28, 2024, 8:48am

One counter to this could be that people may become more aware of a particular species for some particular reason, aside from increasing abundance. Maybe a meme made some species of snail popular, so people took more notice of it.

jhbratton · October 28, 2024, 9:56am

If it becomes known that you are interested in a particular snail species in your county, you may find iNaturalists in that county are helpfully uploading pictures of every specimen they see, which would skew your results. So I suggest you keep the details of your study secret.

david99 · October 28, 2024, 10:31pm

I have found that I upload observations of rare organisms more often than observations of common ones. In fact, I specifically avoid uploading photos of many common species, except for the City Nature Challenge, while I upload every single instance of certain rare or rarely encountered species. For common species this will probably not be much of an issue, but it will definitely skew the results for less common species, especially if I lose interest in them and stop reporting them.

In fact, that’s exactly what happened with a specific species of bee. If you look at the iNat data for that species, it will appear that there was a huge population boom in 2022, followed by a drop back to the original level. In reality I took hundreds of photos of them one summer, then learned what they were and stopped taking more photos of them.
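One simple way to spot this kind of artifact is to count distinct observers per year next to raw observations per year. This is only a sketch with invented records and hypothetical names: a jump in observations without a jump in observers suggests one person's habits rather than a population change.

```python
from collections import defaultdict

def yearly_counts(records):
    """records: iterable of (year, observer). Returns {year: (observations, observers)}."""
    observations = defaultdict(int)
    observers = defaultdict(set)
    for year, observer in records:
        observations[year] += 1
        observers[year].add(observer)
    return {year: (observations[year], len(observers[year]))
            for year in sorted(observations)}

# Invented records for one bee species: user "a" posts five times in 2022.
records = (
    [(2021, "a"), (2021, "b")]
    + [(2022, "a")] * 5 + [(2022, "b")]
    + [(2023, "a"), (2023, "b")]
)
counts = yearly_counts(records)
print(counts)
```

Here the raw count triples in 2022 while the number of observers stays at two in every year.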

sedgequeen · October 29, 2024, 1:28am

iNaturalist is good for evidence of presence. It’s poor for evidence of absence (was the organism there or not found or of no interest to the observers?). It’s terrible for abundance.

In theory, one could convert the observations to abundance if one adjusted to number of observers, time spent in appropriate habitat, interest in & ability to find the species, style of observing (post a species once ever or on every visit or?), and number of individuals per post. Actually, none except the last are known or knowable. I’m afraid that any attempt to measure abundance would be a garbage in / garbage out situation.

ahospers · October 29, 2024, 10:56pm

I think per location and for each year, but better to take a look at the article (or webinar). But I think people know the best observers. I saw a webinar and several people could guess who it was just from seeing the statistics of the profile, without knowing the age, the areas visited or the seasons; only the profile statistics.

I thought the whole calculation took 10 days (240 hours), but I never found the article. He gave two talks, one in Belgium and one in the Netherlands, I think.

https://www.researchgate.net/publication/236655213_Estimation_of_vascular_plant_occupancy_and_its_change_using_kriging
https://forum.inaturalist.org/t/spatial-data-completeness/41390/11 The approach involves sophisticated statistics on a 10 km grid of plant observations.
https://forum.inaturalist.org/t/a-tool-to-help-you-fill-local-data-gaps-easily-missed/37575
[...] species I should upload to iNat, so I came up with a small web application that does that. The idea is that it helps us record the species that are ‘easily missed’ because they’re common and you presume someone has probably already recorded them locally. You can find the tool here: https://simonrolph.github.io/easily_missed/ It then presents this data with a little map, some headline numbers for how many species are in the local area versus the region, and a “doughnut score” which is just a % of the previously mentioned. The idea is that if you record the suggested species you’d boost the area’s doughnut score and build a more complete picture of the wildlife in your local area.
https://forum.inaturalist.org/t/spatial-data-completeness/41390/12
https://forum.inaturalist.org/t/number-of-inaturalist-observations-gridded-data/16572

rupertclayton · October 30, 2024, 10:35pm

One more caveat in addition to the many mentioned by other commenters: iNat data is likely to have a fine-scale abundance bias favoring locations closer to trailheads (or at least my iNat is probably biased in that way).

Typically, I start out on a trail, take photos of every identifiable type of organism and pepper the first km or two with observation points. As I get farther along, much of what I see are the same species, and I’ll only rarely add second or third observations for the same species. Also, my hiking companions may tire of the slow pace and I’ll pack away the phone for a while.

If someone really needed to correct for that type of bias, I guess it might be possible (e.g. add a weighting factor that adjusts the raw abundance figure based on the distance from a driveable road). But it might be better just to realize that the abundance signal in iNat data is too compromised by the noise of factors like this for us to derive many valid conclusions.
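A sketch of what such a correction might look like, under a strong assumption: the number of observations of all taxa in a distance band is used as a stand-in for observer effort in that band. The distances, the 500 m band width and the function names are invented for illustration, and this is one possible reading of the weighting idea, not an established method.

```python
from collections import Counter

def band(distance_m, width_m=500):
    """Index of the distance band (0 = closest to the road or trailhead)."""
    return int(distance_m // width_m)

def effort_adjusted_rate(target_distances, all_distances, width_m=500):
    """Target-species observations per all-taxa observation, per distance band."""
    effort = Counter(band(d, width_m) for d in all_distances)
    target = Counter(band(d, width_m) for d in target_distances)
    return {b: target.get(b, 0) / n for b, n in sorted(effort.items())}

# Invented distances (metres from the nearest driveable road).
all_distances = [50, 100, 200, 300, 600, 700, 1200]   # every observation, any taxon
target_distances = [100, 650, 1200]                    # the species of interest
rates = effort_adjusted_rate(target_distances, all_distances)
print(rates)
```

In this toy data the raw counts are one per band, but the rate per unit of effort rises with distance, because far fewer observations of anything are made away from the road.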

ahospers · October 31, 2024, 12:28pm

I thought some people using iNat data are in this discussion https://forum.inaturalist.org/t/species-accumulation-curves-for-inat-data/38222/30

https://forum.inaturalist.org/t/species-accumulation-curves-for-inat-data/38222/29

https://onlinelibrary.wiley.com/share/ZHQTBHHSYHITEZZNYMGV?target=10.1111/j.1654-1103.2010.01247.x

https://bsapubs.onlinelibrary.wiley.com/share/YMDXUWKTUG5C5D8JMVFA?target=10.3732/ajb.1000215
