iNaturalist Data Quality Webinar

TLDR: Upcoming webinar which may be of interest to people here: https://www.eventbrite.co.uk/e/falling-through-the-cracks-inaturalist-invertebrate-records-in-the-uk-tickets-1461082720749

Hi all!

Later this month I’m doing a webinar all about my MSc research project, which investigated how well the current digital infrastructure for recording biodiversity in the UK deals with iNaturalist records. The short answer is that currently lots of iNaturalist data is ‘falling through the cracks’. By that I mean lots of data is being ‘lost’ and not contributing to initiatives for which it could be really informative, e.g. conservation planning. The main reason for this loss is that in the UK iNaturalist data is perceived by many as being low quality in terms of data quality attributes like identification accuracy, location accuracy, recorder name acceptability, and provision of supplementary information (e.g. life stage, sex). In the UK, iNaturalist records are only used once they have fed through from iNaturalist into another platform (iRecord) and gone through a round of additional verification done by volunteer taxonomic specialists. The perception of low quality is putting many such verifiers off looking at iNaturalist records, which means many records are not used. And yet, whilst this perception is widespread, no one has actually quantified whether UK iNaturalist records are indeed lower quality in these regards compared to records submitted through other channels.

My research addressed this gap. I used UK non-marine invertebrates as a case study to look at how perceived data quality concerns with iNaturalist records compared with real data quality issues. There are a lot of interesting findings, nuance and detail in my results, but it would spoil my talk to go over them now. If you’re interested, I suggest you sign up to hear my talk, which will be on the 16th September 2025 (1–2pm UK time). The link is above. The talk will also be recorded, so even if you can’t make it on the day do still sign up to be emailed access to the recording afterwards.

Best wishes,

Joss Carr

24 Likes

Awesome, looking forward to it! I’m in California, USA, so won’t make the live showing - but will watch the recording. Good luck and thanks for sharing!

1 Like

Signed up straight away. Should be really interesting, thanks for letting us know and… see you there.

1 Like

This looks amazing! I registered, though on the west coast of Canada I can’t promise I’ll be up at 5 am, haha.

As someone who runs free virtual events often (and someone who has no idea of how often you run them - you seem quite professional so I assume you may already know this) here is a bit of encouragement because your seminar is so interesting to me: Don’t get discouraged if you get a lot of tickets sold and then people don’t show up for the live. Free events sometimes have a lot of registrations and then the turnout for the live is small because of the flexibility of having a recording. I think this is incredibly valuable information that will get many eyes on it.

Thank you so much for your hard work!

6 Likes

10pm here, so I should be able to watch live. Sounds very interesting.

2 Likes

The recording of my talk is now available alongside a write-up of the Q&A here: https://biologicalrecording.co.uk/2025/09/16/inaturalist-invertebrates-2/

4 Likes

Thank you for the written version. And for encouraging identifiers.

2 Likes

Thanks for sharing all this Joss!

Great that somebody has logged and quantified the issues further to move on from the anecdotal evidence and the endless circular debates on UK Facebook groups haha.

My thoughts (lengthy …sorry! … ) …but fwiw :


Recorder names

Interesting that the recorder names were the biggest issue.
This confuses me; I still see no real logic or solid argument for it.
It makes sense historically: without photos, an observer’s name might carry weight.
But given that iNaturalist records come with photos and most identification is done by a third party, this seems of little relevance, especially given the privacy concerns attached to sharing real-time location data associated with one’s name.

I would be curious to see the data on the verifiers and whether there was a bias here in age and gender. I would imagine there is a dominance of verifiers on iRecord who are older and/or male. Whilst I imagine those with the greatest privacy concerns would be younger and female. I know of at least four female friends (all late 20s / early 30s ) who have been subject to stalking and personally, I would not encourage friends or family to use their real names. I see this as a perception that simply needs dismantling on the iRecord side. It just seems like bad practice to me to use real names or have this as a prerequisite for records to be verified.


Identification accuracy

Great to put this argument a bit more to bed. It’s a shame you can’t find a way to obtain data on the accuracy of incoming records on iRecord. When I started biological recording, I would often log species at a higher level or incorrectly. Without the community to verify or guide, I imagine there are doubtless more incoming records on iRecord either inaccurately logged or simply not logged at species level, in comparison to iNaturalist records, which are at least seconded, and none of which even come through if not at genus/species. My guess would be that any discrepancy between the two platforms is even less significant than verifiers claim. If they were higher overall, this would only be down to ID support on Facebook groups, I expect - which is simply a less automated and less practical platform for the purpose.


Life stage and annotation

I think a big issue here is the perception of records as a static datapoint. Again, historically that was more true… but on iNaturalist records are continually being added to, and annotations may be added years down the line. Personally, I often don’t add annotations on upload for one reason or another. I think I am more likely to add annotations as an identifier than as an observer.

I would be curious to know whether you took the records from the iRecord side post-import, or if not, how you defined what age of record to subject to analysis. It must vary greatly by age of record.

As I’ve said on Facebook threads, iRecord could choose to only import data after a certain point in time to increase the level of granularity in this respect if it were really a huge issue. Regardless, any verifier can presumably at least filter by age of record and choose to check older records if they so desire.

iRecord is such a black box from the outside - are records even updated after import? If it reaches RG and goes to iRecord in August, and an annotation is added in September, is the iRecord data updated if not already verified?


Licensing

I wonder if part of the issue here is that datapoints and photos need to have distinct licensing if possible - perhaps something one could feature request on the iNat side if not already requested. Though maybe it simply isn’t realistic in terms of broader dataflow to GBIF and iRecord.

Personally, I use CC-BY-NC. I have no issue with my data being openly licensed with regard to the logging of the species and location, etc. But when it comes to the photos I do wish to have control over commercial usage of extreme macro work as the photos are higher quality and equipment more costly, etc.

I wish it were easier to choose license on upload on iNaturalist though. I would always have any phone uploads on open license by default if that was an option. And I would choose to use a more open license by default if I could easily alter select photos to retain CC-BY-NC when I so desire. Again, a feature request around this may help.


Spatial precision / location names

Here again, the exact naming of a location seems to me to be more of a historical tradition than of relevance now given GPS, I think (?)… Nevertheless, I accept location names can add weight to accuracy. I think there are a few issues here which could be addressed in feature requests on the iNat side, if not already logged, e.g. I’ve noticed the location name defaults to a coarse name even if you make minor adjustments of 50 metres or so. Anyhow, cool to see that this issue wasn’t as significant as verifiers presumed.


Species coverage

This just seems to me to be a trade off between an inclusive platform like iNaturalist with newer users / more photo-based vs a more exclusive platform with older users who collect specimens.
I see this as less likely to change as it seems integral to user-base.

If we do want to attempt to shift iNaturalist UK users in general, I wonder what the percentage of records are per top 50 / 100 / 500 users. Given the data has a long tail of distribution, focus on addressing any of these issues with the top users would at least see the biggest result.
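
For anyone wanting to check this, the share of records held by the top N observers is quick to compute from any export that has one observer ID per record. A minimal sketch in Python, assuming a plain list of observer IDs; the function name and the toy data are made up for illustration and are not taken from the study:

```python
from collections import Counter


def top_user_share(observer_ids, top_n):
    """Return the fraction of records contributed by the top_n most prolific observers."""
    counts = Counter(observer_ids)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # most_common(top_n) yields (observer, record_count) pairs, largest first
    top_total = sum(n for _, n in counts.most_common(top_n))
    return top_total / total


# Hypothetical toy data: one observer ID per record.
records = ["a"] * 50 + ["b"] * 30 + ["c"] * 10 + ["d"] * 5 + ["e"] * 5

for n in (1, 2, 3):
    print(f"top {n} observers: {top_user_share(records, n):.0%} of records")
```

Running the same function with N = 50, 100 and 500 on a real export would give the percentages asked about here.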


I am also curious how many (if any) of these issues resonate with data users outside of the UK.

6 Likes

Yes, I analysed records within iRecord from both sources. I was never entirely sure whether records get ‘fixed’ with the current annotations at the time of verification or whether they can be updated in iRecord if updated on iNaturalist…

Exactly… I don’t know either.

I’m glad you note this too. The skewed distribution of records per user does mean that changing the quality of most of the data is largely a matter of changing the behaviour of the top 1,000 observers (or so).

Also curious about this. From what little I have heard the licensing issue (i.e. that the UK organisations don’t use CC-BY-NC iNaturalist observations) is unique to our islands.

1 Like

I feel like either @bazwal or @matthewvosper might have looked into this and commented on it on another thread already…

———-

Also curious, with many of these issues, whether they affect actual real-world usage of data in papers / projects? I imagine the majority of on-the-ground conservation work revolves around either coarser or more bespoke sampling, for which many of these issues are actually of less concern.

My understanding is that changes to records in iNaturalist continue to feed through to iRecord until the record is verified/rejected on iRecord. I’m not 100% sure which actions on iRecord lock it down, as I know there are many end points for a record.
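
The rule as I understand it can be written down as a toy model. This is only a sketch of the behaviour described above and not how iRecord is actually implemented; the class, the status names and the function are all hypothetical:

```python
from dataclasses import dataclass, field


@dataclass
class ImportedRecord:
    # Hypothetical states; the real set of end points on iRecord is larger.
    status: str = "pending"  # "pending", "verified" or "rejected"
    annotations: dict = field(default_factory=dict)


def apply_upstream_update(record, new_annotations):
    """Copy annotations across only while the record is still pending.

    Returns True if the update was applied, False if the record was locked.
    """
    if record.status != "pending":
        return False
    record.annotations.update(new_annotations)
    return True


rec = ImportedRecord()
apply_upstream_update(rec, {"life_stage": "adult"})  # applied, record still pending
rec.status = "verified"
apply_upstream_update(rec, {"sex": "female"})  # ignored, record already verified
print(rec.annotations)
```

Under this model an annotation added on iNaturalist in September would reach a record imported in August only if nobody had verified or rejected it in between.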

1 Like

Presumably at least for UK hoverflies we can say x observations over y months old should be z% annotated now, given your mapping of the data. Do you have a rough sense of what the numbers are for this sort of thing?

And is there a sweet spot where you see a drop-off in significant additional granularity?
(e.g. 90% are annotated within 6 months, therefore this is the optimal age of observation at which to begin verification on iRecord)
I guess it varies so much by taxon the numbers are a bit meaningless though for addressing platform-wide issues. Do you feed this stuff back to the Hoverfly Recording Scheme atm tho?
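
To make the question concrete, the sort of summary I have in mind is the share of observations carrying an annotation, grouped by observation age. A rough Python sketch, assuming each observation is a dict with an observation date and an annotation flag; the field names and the toy data are hypothetical:

```python
from datetime import date


def annotated_share_by_age(observations, today, bucket_days=30):
    """Return {bucket_start_in_days: share of observations annotated} per age bucket."""
    totals, annotated = {}, {}
    for obs in observations:
        age_days = (today - obs["observed_on"]).days
        # Bucket 0 covers ages 0-29 days, bucket 30 covers 30-59 days, and so on.
        bucket = (age_days // bucket_days) * bucket_days
        totals[bucket] = totals.get(bucket, 0) + 1
        if obs["has_sex_annotation"]:
            annotated[bucket] = annotated.get(bucket, 0) + 1
    return {b: annotated.get(b, 0) / totals[b] for b in sorted(totals)}


# Hypothetical toy data, not real hoverfly figures.
today = date(2025, 9, 30)
observations = [
    {"observed_on": date(2025, 9, 20), "has_sex_annotation": False},
    {"observed_on": date(2025, 9, 10), "has_sex_annotation": True},
    {"observed_on": date(2025, 8, 15), "has_sex_annotation": True},
    {"observed_on": date(2025, 8, 5), "has_sex_annotation": True},
]

shares = annotated_share_by_age(observations, today)
print(shares)
```

The point where the share stops rising from one bucket to the next would be the candidate age at which to begin verification.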

I would prefer iNat to keep the location name which I carefully added. Instead of defaulting to Google’s place name Silver Mine or Erf … - which is often wrong, which is WHY I wrote in the correct name Silvermine or Elsie’s Peak.

2 Likes

I record data at the end of each month. Typically at the end of each month close to 100% of UK hoverfly observations posted that month have life stage, and between 70 and 80% have sex. Obviously some of those observations are nearly a month old while some are less than 1 day old!

2 Likes

How are obscured observations processed into iRecord? If it works the way I’m guessing it works, at least some of the complaints about pin placement and coarse names can be attributed to location obscuring from iNat. E.g. I assume this observation has a precise private pin and a precise private location description, but publicly it’s just “Oxfordshire, England, GB” and the pin may be kilometers away from the true location.

2 Likes

I still haven’t watched it yet, but definitely plan to!

Just anecdotally as the person who reads basically all iNat support tickets: no other country seems to be as concerned about including/recording the name of the observer as those in the UK are. Folks in the UK are the only ones who consistently bring this up as an issue when it comes to data use and ingestion into the data recording system. This isn’t a judgement either way, just my experience.

I believe I’ve heard that some state natural heritage agencies in the US skip over observations that lack location accuracy data, but I don’t remember specifics, unfortunately.

5 Likes

Yes, I understood this to be part of the issue. Wasn’t it something to do with there being no formal bridge?… so iRecord just pulls via the API, and hence iNat can’t initiate something for users to trust iRecord with coordinates as a singular project would?

But yes, I see all the other obs from the area have the same location name.
I didn’t realise location name was obscured actually.
Good to know!

A somewhat ironic example as these fritillaries in Oxford are pretty obvious and well known haha
( ubiquitous on the college grounds as I recall )

Interesting.
As a British person I feel bad you have to deal with support tickets about this! 😅
It really makes no logical sense to me…

I guess in part this is just due to the historical precedence(?) - at least in the UK, people seem to claim the traditions of biological recording and the associated networks of county recording schemes, etc etc …all go back a relatively long way.

1 Like

It’s not common at all, and it’s not a problem.

Yes, I believe it’s due to long-established recording schemes, although I’m not very familiar with them.

Yes… though I tend to be very suspicious of British claims like these tbh haha…

I have read mention of the history of biological recording in Victorian times… and according to the BRC organised recording does go back to late 17th to mid 18th century :

Natural history societies began to form as early as the mid 18th and throughout the 19th century. … The identification of species and documenting their distribution became important to many societies, often using the Watsonian vice-counties for such records. By the early 20th century most local and national societies had a healthy mix of both self-educated and academically qualified members, which continues to the present day.

But that would just seem to be in line with Linnaeus, give or take (1707-1778).
So I would have imagined there must have been similar recording schemes starting across Europe.
I remain confused why the UK is so particularly steeped in tradition here.

Fwiw, according to ChatGPT :

Sweden: gave taxonomy.
Germany: gave theory and ecology.
France: gave comparative anatomy and colonial collections.
Britain: combined these influences with a massive, decentralised volunteer base and cheap publishing.

( then paraphrasing here, but apparently )… there were some unique infrastructural elements in play in UK, with the “parson-naturalist” overseeing the local parish, leading to the use of amateur recording nationally in a way which didn’t come into play until later in other countries for various reasons

——-

Are there regional recording schemes in the US or elsewhere in the same way we have county recording schemes in the UK?

Are there national recording schemes elsewhere in the way we have in the UK?
e.g. in Diptera, we even have national schemes specifically for different families: Heleomyzidae, Stratiomyidae, etc.

Pretty much all of my knowledge of older British natural history comes from the Aubrey/Maturin books, FWIW, and those are mostly about recording nature outside of Britain. Currently re-reading them, they’re so good.

1 Like