Rampant guessing of IDs

I’m guilty of this sometimes. Usually, if I don’t know what I’m looking at or only have a general idea, I’ll take a computer suggestion. If I’m wrong and someone submits a different ID, I’m grateful, because it points me in the right direction and I can go and read more about the species I saw. After that, if I think the person is right, I’ll agree with their ID, because they really helped me out and I want the info to be correct.

4 Likes

Totally fine, you are not guilty at all. The problematic ones are the people who do the same thing ‘without that’ step instead of ‘after that’.

It seems like the CV can get confused by similar “patches” on dissimilar objects, like the white streak on this feather seems to have been mistaken for the white streak above this owl’s eye.

Filtering the CV suggestions by taxon helps if there are bad suggestions. Overall I think the CV is pretty good! I definitely use it (with some checking) to add coarse IDs to organisms I’m not personally familiar with.

You can just withdraw your first ID and make the second IDer happy. Every expert would love it if users stopped agreeing reflexively; nobody is perfect and everyone makes mistakes. Yes, RG doesn’t mean much, but let’s not create more baseless RG. Not only do the guidelines say to add only IDs you could make yourself, but blind agreement also erases the work an expert put into making that ID.
Isn’t it funny when a user IDs something as an insect, then someone IDs the order and the user agrees, then someone IDs the family and the user agrees, then a species and the user agrees again? If you knew what it was, why did you wait so long? Or maybe just stop hitting Agree…

4 Likes

No, more likely it was fed enough feathers with certain IDs; I’ve noticed the same with other bird parts, etc.

As discussed in other threads, the stats from this article can be very misleading.
I wish they weren’t rolled out so regularly without qualification.
The CV is incredible, but it’s not without flaws, and I think it helps no one to gloss over them.
So you’ll have to forgive me if I jump down the rabbit hole once again, and if this is a bit of an essay! …but… it’s important, I think, that these percentages are tempered with a bit more detail.

From the same article, talking generally about the accuracy of RG observations:

“accuracy varies considerably by taxon, from 91% accurate in birds to 65% accurate in insects”.

[chart from the article: RG accuracy by taxonomic group]

Note there are only about 10,000 species of birds globally but potentially over 1,000,000 species of insects, so the 91% top end seems fairly meaningless in this context. The other taxa fall between 77% and 89%, but I feel cautious here too, as there is still a massive bias toward the more charismatic taxa in the content of the dataset (1,200 expert bird IDs and 800 reptile IDs… but fewer than 200 expert insect IDs). Sample sizes at the lower end, around 200, strike me as too small to draw clear conclusions from. There are also hints that the observations are geographically biased toward North America, but this is unclear as far as I can see. Doubtless, in any case, these stats vary wildly by both location and taxon.
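
For a rough sense of how noisy those sample sizes are, here is a back-of-envelope sketch (my own illustration in Python, using a simple normal approximation to the binomial, not anything from the article; the 91%/1,200 and 65%/~200 figures are the ones quoted above):

```python
import math

# Approximate 95% confidence interval for an accuracy estimate p from n checked IDs,
# using the normal approximation to the binomial.
def ci95(p, n):
    half_width = 1.96 * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

for taxon, p, n in [("birds", 0.91, 1200), ("insects", 0.65, 200)]:
    lo, hi = ci95(p, n)
    print(f"{taxon}: {p:.0%} accurate, n={n} -> 95% CI roughly {lo:.0%}-{hi:.0%}")
# birds: 91% accurate, n=1200 -> 95% CI roughly 89%-93%
# insects: 65% accurate, n=200 -> 95% CI roughly 58%-72%
```

So the insect estimate alone carries an uncertainty of roughly ±7 percentage points, before even considering how representative those ~200 observations are.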

Anyway, of course, we are talking about CV accuracy here, not RG accuracy.
But the accuracy range that you refer to is based only on comparison against RG data!
From the comments in the same article, @kueda states:

Observations in the test set for vision and the test set for the whole system are drawn from Research Grade observations

So, as I understand it, for insects for example, we are talking about 60-80% model accuracy measured against a dataset which is itself only 65% correct; that works out to roughly 40-50% average autosuggest accuracy, and around 60% for the “pretty sure” suggestions. So maybe I’m way off the mark, but to me it seems the stats from the article don’t prove the opposite of @dan_johnson’s statement. If anything, they support the idea that there is indeed a significant percentage of incorrect autosuggests (or was, at the time of the analysis, in the taxa explored).
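
To make that arithmetic explicit, here is a minimal sketch (plain Python, my own illustration; it assumes the model’s errors are independent of the benchmark’s errors, which is a big simplification):

```python
# If the benchmark (RG) labels are themselves only ~65% correct, a model scoring
# 60-80% *against that benchmark* implies a lower accuracy against ground truth.
rg_correct = 0.65            # estimated fraction of RG insect IDs that are right
model_vs_rg = (0.60, 0.80)   # reported CV top-1 accuracy range vs. RG labels

for m in model_vs_rg:
    print(f"model {m:.0%} vs RG -> roughly {m * rg_correct:.0%} vs ground truth")
# model 60% vs RG -> roughly 39% vs ground truth
# model 80% vs RG -> roughly 52% vs ground truth
```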

However! I don’t want to muddy these murky waters further by adding more misleading stats into the mix, because even that isn’t the full picture, I think, when it comes to this “60-80%” stat, which appears to be based on this graph:

Here we see the opposite picture. Insects are the highest scoring in the accuracy range for CV suggestions? Higher than birds?! Now this really doesn’t ring true…

Seems to me like this is probably indicative of overfitting in the model. Potentially (and counter-intuitively), the higher end of this scale, 80%, might in this context be a signature of the weaker part of the training model if it is out-performing birds, even though birds have a lower median accuracy value here.

I’m still just starting out with machine learning myself, so these observations might well be off-base too, but it seems to me that the concept of “accuracy” for ML models simply doesn’t map neatly onto standard notions of accuracy in the context in which it is being applied here.

Regardless of all this, I agree it’s getting better! It’s doubtless incorrect less often now than it was at the time of the analysis. Hats off to the developers; it’s amazing what it can do already. But it varies wildly by taxon and location, so one person’s experience will likely be very different from another’s. As such, I just think it’s important these stats aren’t pitted directly against valid frustrations with the existing system. @kueda didn’t claim it to be a rigorous piece of data analysis in the first place.

9 Likes

I think it is actually feasible that the insect CV accuracy being higher than the bird accuracy in that figure is legitimate. Consider this scenario:

I take photos of 10 birds that are all very common, super easy to ID myself. Because of this I don’t bother to check the CV suggestions (why would I when I already know the exact ID?), and I just manually type in the names*. That figure for birds is therefore ‘losing’ an easy 10/10 to add to its value on the y axis.

Conversely, if I take a photo of an obscure brown bird that I struggle to ID, or a dud/blurry photo that doesn’t show diagnostic features well, I’m more likely to pick a CV option. Of course these are the types of photos the CV struggles with, so the accuracy takes a hit.

From an insect perspective, I think (and this is purely anecdotal) that people in general are far less capable of recognising even some of the most common insect species that are actually quite easy to ID (relative to their ability to recognise birds), and that people generally use the CV disproportionately more for insects than for birds. I suspect, therefore, that the relatively high accuracy value for insects is being driven by a huge flood of common, easy-to-ID insects that are simple for the CV to get right. For example, in the USA alone, up until October 2019 when that journal post was made, there were 45,000 observations of Apis mellifera. I just clicked through 20-30 of them randomly (a super small sample size, I know, but just a coarse example), and 50%-60% were CV suggestions, so that’s 10-15 easy ones the CV is slurping up.

I’m not saying this is definitely the case, and by all means your explanation could well be correct instead. I’m just saying I don’t think it’s impossible that the value for insects was higher than for birds, and trying to offer a possible explanation.

(hopefully this makes sense logically, it’s quite late in my time zone and I’m half asleep)

*Obviously there is the disclaimer here that a number of users use the CV options even when they know what the species is, as a time-saving method (I do this all the time). But I suspect this probably has quite a small effect.

4 Likes

This is not connected to the way in which the accuracy is being measured in the model described by @kueda’s article.

The article states:

There are a lot of ways to test this, but here we’re using photos of taxa the model trained on exported at the time of training but not included in that training as inputs, and “accuracy” is how often the model recommends the right taxon for those photos as the top result.

So the training pipeline takes a bunch of photos from RG observations and splits them into something like 80% training data and 20% test data. The algorithm is run on the training data (which includes the IDs for each observation) to generate a model. The generated model is then run against the test data, without the IDs, to see how many it gets correct.
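
To illustrate the shape of that evaluation, here is a generic held-out-split sketch in Python (my own toy example with made-up names, not iNat’s actual pipeline, and with a deliberately dumb stand-in for the real neural network):

```python
import random
from collections import Counter

# Toy labelled dataset standing in for RG observations: (photo, taxon) pairs.
data = [(f"photo_{i}.jpg", f"taxon_{i % 5}") for i in range(1000)]
random.shuffle(data)

# Hold out 20% of the labelled data; the model never sees these during training.
split = int(0.8 * len(data))
train, test = data[:split], data[split:]

# Stand-in "model": always predicts the most common taxon in the training set.
# The real CV model is a neural network, but the accuracy bookkeeping is the same.
most_common_taxon = Counter(label for _, label in train).most_common(1)[0][0]

def predict(photo):
    return most_common_taxon

# Top-1 accuracy: how often the top suggestion matches the held-out RG label.
correct = sum(1 for photo, label in test if predict(photo) == label)
print(f"top-1 accuracy on held-out test set: {correct / len(test):.0%}")
```

The point for the argument above is that the held-out labels come from RG observations regardless of how the original observer arrived at their ID.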

Whether or not users used CV suggestions in getting the observation to RG is irrelevant to this metric.

7 Likes

Ah, very fair then, thanks for clarifying. That’s what I get for not reading things properly: a lesson to always check the numbers and background info!

2 Likes

For many of us, the identification process is difficult, and we are all subject to rampant guessing in some aspects of the natural world (we might be great at birds but useless with mollusks). I’ve seen many experts have discussions about what an observation is. This seems to me a strength of iNaturalist, not a weakness. We are all here to learn. There may be some users who insist on their identifications, but there’s no use in changing the entire platform for a few stubborn people. It’s fine to let bad identifications be; they just won’t reach research grade. I can understand why it might be frustrating, but I think a shrug and a sigh are the right antidote.

5 Likes

The problem is that the existing user interface design encourages the mistakes the OP talks about. The UI is amazing and the whole setup of the platform is unparalleled in my opinion… but nevertheless it has flaws, and they lead to the issues the OP describes.

Sure: shrug, sigh, don’t take it too seriously… but it’s also healthy to discuss change and to debate constructively how to make it even better.

I agree. The issue here isn’t the CV; it’s the UI that’s the problem.

Why hide the withdraw button?
Why suggest species-level IDs in complex taxa when the CV isn’t “pretty sure” of anything?
Decisions around the design of the UI are the root of the bigger problems @dan_johnson describes, not the CV model.

6 Likes

Generally, yes, that isn’t a great practice. Your ID should represent your experience and expertise as an independent observer and identifier. If the expert provides characters you can then use to make the ID, or their ID spurs you to research a species you hadn’t before and you can make your own (re)identification, that’s great; it’s fair to change your ID to agree.

But agreeing solely on the basis of the other person’s credentials or perceived expertise isn’t a great practice, even if they have a PhD, etc. Using the leaderboards as an indicator of expertise is itself problematic, as for some taxa the leaders/top IDers are not experts, just people who’ve clicked “Agree” a lot! So in most situations, best practice is to ID based only on what you can verify or interpret yourself.

6 Likes

I agree completely. This is a very active site. I don’t see any problem with initial guesses being wrong. Accepting the CV identification, even if you don’t know enough to be sure it’s correct, labels your observation in a way that lets the appropriate experts find it.

There will always be a trade-off between making things easy for novices to use, and ensuring IDs are more authoritative/correct. Given the objectives of the site, I’m happy to have novices encouraged to participate. I’d rather have more wrong IDs that I can easily correct, than fewer IDs in total, even if they tend to be more accurately identified.

I think it’s worth trying to shift your perspective. Rather than: “look at all these incorrect IDs”, how about “look at how many observations I improved!”

7 Likes

I don’t see anything to feel guilty about in your practice. It conforms with iNat’s objectives (to engage and learn) and it follows the FAQ guidelines for usage. There are any number of users who wish iNat’s purpose, goals, and accepted usage were different, though; that much is very evident.

What do you mean by wanting “the info to be correct”?

If you mean you want the community taxon to be correct but your initial ID is in conflict, why use the agree button and not the withdraw button?

( this isn’t criticism, I’m just curious to understand different user experiences of the site )

2 Likes

But it’s a trade-off too: with fewer wrong IDs, we would have more expert time free to make right ones. Maybe it wouldn’t help new users, but anyone who observes small-sized groups knows how hard it is without a single expert on the site. With all the millions of observations we have now, it seems every expert would need to be at the 600k-ID level of the current leaders, or most observations will never be IDed correctly; when observations start with correct IDs, it takes less human time to get to the final ID.

You asked me why not use the Withdraw button.

I believe this often happens because the iOS app does not even have a Withdraw button. It is not very obvious on the website either, nor is its usage explained with a tooltip. I don’t know about the Android implementation.

5 Likes

Ah sorry, that question was really meant for @dcvmnaturalist.
But yes, the complete absence of a withdraw button on iOS is doubtless another sizeable contributor to the OP’s concerns… and one would think a relatively easy thing to fix.

I put in a feature request for it, but it was rejected because “feature parity with Android” is already planned for future iOS development.

I flagged it on this thread, though, at least… it should be a priority to fix, in my opinion.

3 Likes

This is exactly right, but I will also say that over the last few years things have shifted and the bizarre and frustrating hostility towards iNat has decreased, at least in my area. Of course, in areas where it gets less use, that will vary too. As you say, it is a joint field notebook, not a herbarium or someone’s PhD dissertation. I also believe some experts are just afraid to state their own IDs because they too are often wrong about IDs. Things are changing, but it is too slow a process, and it annoys the heck out of me too.

9 Likes

Or (for taxa above species) people who know perfectly well how to identify a few very common species within that taxon, but are as clueless as you about the other 99% of them.

4 Likes