Rampant guessing of IDs

I think that right from the start, one of the purposes of iNaturalist has been to make scientifically useful data available from the wide variety of observations people are out there making for a diversity of reasons. I think that iNaturalist is succeeding reasonably well, in the context of unsystematic data collection by people differing in purposes, interests, skills, equipment, and location. At least there are a lot of us.

One of the problems with accepting these data for use is a matter of expectations. Any data set from iNaturalist must be viewed as raw material, not a finished product. The good news is, the researcher can get a huge amount of data for free. The bad news is, he or she should skim quickly through every observation (or at least all outliers) before analysis. (Yes, I do know how long that takes – it’s the same process I go through when checking identifications of some taxon of interest to me.)

We should all do what we can to improve data accuracy, but we shouldn’t despair about the inevitable variation in this huge citizen science project.


Point taken, though I have encountered many new users who are the opposite: hesitant to contribute any IDs at any level, for fear of somehow appearing arrogant or “above their station.” That is also not a good thing in the context of iNaturalist. But granting the point, how else could we proceed without ultimately running up against the same issues, and still needing some education to help guide an appropriate choice?


I think a choice in weighting could help these people too.
I wouldn’t stray outside my chosen taxa either for this reason, and I am extra wary when identifying outside my geography. If I could choose the weight, though, I would feel a lot freer to pitch in elsewhere and learn as I go. It would certainly feel more like a “field notebook” – potentially messier, but more freeing.

I could imagine various ways to implement it.
Possibly the simplest would be visual: instead of just “agree”, two choices like :white_check_mark: and :+1:, with tooltips to explain the difference.
Or a scale with some options greyed out, so you can toggle your level of confidence: :white_check_mark: :heavy_check_mark: :heavy_check_mark:

Alternatively, a button to “confirm” could be less visible than the button to “accept” – like withdraw and delete are at the moment.


This will be anecdotal, like other comments in this thread. When going through Identify I’ve found lots of observations from users with 1–10 observations, many of which are Unknowns or have a coarse ID in the notes or as a placeholder (mistakes, or not knowing how things work). What I think is that the majority of new users come to iNat not because you can use it as your ‘field notebook’ or to help science, but because with iNat you can “easily” ID something and satisfy your curiosity. If you don’t receive instant feedback (whatever the reason), you probably won’t explore more of the iNat community or its features (even more so if you are using the app).

If there’s still space for a comment about why one would prefer the app: I mainly use the iOS app to add observations, as it was how I discovered iNat. It’s easy for me to choose the pics from the camera roll and add them to an observation (with the metadata the iPhone captures); if I wanted to upload with the website, I would have to copy all the pics I want from the Photos app to my laptop and then to the website (things can get lost – I use this method only when I have a great number of photos). I don’t think I’ll set the app aside until I get a camera just for observations (sidenote: I don’t have experience with the Android app).


I agree with that. Do you think that’s the initial goal of the platform, or a result, or “false” advertising as an ID app?


I don’t think it’s the intended goal for iNat, but it has been used just for the sake of IDs without exploring more about it or its impact (e.g. a teacher who is a regular user but makes projects for his class that only involve adding observations and IDing a certain number of organisms, with almost no real engagement with the platform; maybe Seek is a way to avoid these cases).

I’m not really sure if iNat advertises itself as an ID app, but if you look up something like ‘how to identify plants/animals’ in a search engine or wherever you get apps, iNat is high up in the results.


I too almost entirely rely on the app for making observations, and I think the app is great for that. It’s less useful for monitoring one’s own observations for comments and ID disagreements, and for helping others with IDs and then following those observations for disagreements.


To add: I use a USB cable and copy the photos into folders by day. I really feel much better doing it this way – adding accuracy values, correcting wrong GPS spots, and uploading hundreds of observations many times faster than doing them one by one via the app, without getting tired somewhere in the middle of one day’s photos while checking where exactly each was taken. But I have no experience with iPhone at all.
I just thought of those users who like making observations with the app – somehow I forgot about that version of the workflow initially!


Sorry, but I look at tens of thousands of butterfly observations per year, and the CV works remarkably well when I bother to check what it suggests. I started looking at suggestions as a kind of “shortcut” when I have to correct a mis-identification, because it allows me to enter a new ID without taking my hand off the mouse (this shaves a few seconds off the time required to apply a correction, which adds up over hundreds of observations).
Does CV work 100% of the time? No, it doesn’t. In some cases, I can see why it suggests the wrong species (marked sexual dimorphism, or seasonal variants), but in other cases, I have no idea why it suggests the wrong species. I’ve rarely seen cases where CV doesn’t recognize that there’s a butterfly in the photo. Even with butterflies that closely resemble dried leaves, CV correctly picks out that it’s a butterfly, and even when the top suggestion isn’t correct, it’s usually in the ballpark.
But maybe this is because in our region (Ontario, Canada), we have a LOT of butterfly observations (circa 40,000 per year), and there is a small group of us who look at all the observations and make sure the IDs are as accurate as possible. So the CV is well trained for butterflies in Ontario.
You get what you pay for. That’s what I tell critics of iNat. I’ve called out so-called experts who criticize the quality of iNat observations for not contributing to making them better.


I’ve been meaning to make a feature request for that. It seems like it would be useful to many people for many reasons.


Regarding CV: I’m impressed that it gets things right as often as it does, but I think the CV is, very unsurprisingly, most accurate at identifying the species that are most often observed. This can produce pretty good accuracy metrics in internal testing of the CV against iNaturalist data: you’re testing the CV against a data set consisting mostly of large numbers of observations of the small set of taxa that the CV is best at.

Here’s an alternate way of testing CV’s accuracy that I think more closely resembles what outside experts are doing when they get a negative opinion of CV: take a set of observations that are of interest to you and whose identity you know, and see how often the CV will give a “pretty sure” ID that is correct at the species level. Out of curiosity, I ran this test on 30 of my own recent observations.

Correct species: 0
Correct genus: 15
Neither: 15

If someone new to iNaturalist ran that test, I don’t think you could blame them for concluding that the CV is a pretty bad source of identifications.
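For anyone wanting to repeat this kind of informal test on their own observations, the tally is easy to script. This is just an illustrative sketch, not anything official: the per-observation outcomes below are hard-coded to match the counts reported above, whereas in practice you would record each outcome by hand while comparing the CV’s “pretty sure” suggestion against an ID you already trust.

```python
# Sketch: tallying the outcomes of a manual CV accuracy test.
# The results list is hard-coded to mirror the 0/15/15 split above;
# replace it with your own recorded outcomes.
from collections import Counter

results = ["species"] * 0 + ["genus"] * 15 + ["neither"] * 15

counts = Counter(results)
total = len(results)

for level in ("species", "genus", "neither"):
    n = counts.get(level, 0)
    print(f"{level:8s}: {n:2d} / {total}  ({100 * n / total:.0f}%)")
```

Run over the 30 observations above, this prints 0% correct at species level and 50% each for genus-level and neither, matching the tallies in the post.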


Does it even show species as “pretty sure”? It seems that even when only one species is possible for it, it says “pretty sure” for the genus and lists that single species. I checked 20 observations that are now at species level and whose IDs I’m sure of; the CV got confused by a distant shot of multiple bird species and by a leaf mine, while it correctly guessed another mine and a super common leaf gall.

Yes, it is often not understood that the vast majority of species-level taxa in iNaturalist are not part of the CV model. It will guess wrong 100% of the time for an observation of one of those species, unless the species happens to get suggested as part of the new “seen nearby” algorithm.

Fortunately, as you note, a greater proportion of observations are of taxa included in the model, so it performs very well a fair amount of the time in actual use. A specialist in rarely observed taxa may indeed be disappointed, but maybe less so once they understand what the CV model is and is not.


Given that neither way of looking at the accuracy of CV is inherently more correct, though, very different opinions are perfectly reasonable.

For instance, if I just called every Verbascum in North America Verbascum thapsus, I bet my accuracy would be pretty high… but you wouldn’t say that I’m any good at telling species of Verbascum apart.
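To make that baseline point concrete, here is a toy sketch. The 80/15/5 species mix is invented purely for illustration: a “classifier” that always guesses the most common species scores high on raw accuracy while having zero ability to tell the species apart.

```python
# Toy illustration of the majority-class baseline described above.
# The species proportions are invented; real proportions will differ.
observations = (["Verbascum thapsus"] * 80
                + ["Verbascum blattaria"] * 15
                + ["Verbascum phlomoides"] * 5)

# A "classifier" that always guesses the most common species:
guesses = ["Verbascum thapsus"] * len(observations)

correct = sum(g == o for g, o in zip(guesses, observations))
print(f"accuracy: {correct / len(observations):.0%}")  # 80%, with zero discrimination
```

This is why a single accuracy number can be misleading: it rewards guessing the common species and says nothing about performance on the rarer ones.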


Again, I think the OP’s issues here really aren’t about CV accuracy in any case; they are more about user interface and navigation, so perhaps this should be a separate thread. But it’s great to explore other metrics for this – I’ve wondered about doing similar ad hoc tests. It would be interesting to test a bunch of different taxa and geographies and compare results.


There is always the “human factor.” For example, here: https://www.inaturalist.org/observations/87716056
The species chosen by the observer was not in first place on the list the CV provided. The CV was right to propose some kind of beetle (at least it did for me).