Change computer vision suggestions to only above species level

tony_wills · March 17, 2019, 9:37pm

Change the actions of the computer vision AI to never suggest an identity to species or below. eg to only sub-genus level or above.

In scientific terms I am suggesting that the auto identification’s precision (ie to species level) exceeds its accuracy in many cases. The problem this causes is that it promotes observations to ‘research’ grade with only one ‘real’ subsequent ID.

An alternative would be to not count ‘automatic’ IDs towards ‘research grade’. (And I don’t accept the suggestion that all IDs are chosen by the person clicking on the auto suggestion, and therefore they are not AI identifications :-).

The AI/computer vision is a fantastic tool, but instead of helping in getting rid of poor initial IDs or lots of ‘unknowns’ and helping the community, it appears that it often results in more work in correcting wayward IDs, or perhaps worse posting of incorrect ‘research’ grade identifications. (eg see https://forum.inaturalist.org/t/create-separate-accounts-for-students-assigned-to-use-inaturalist). AI suggestions to genus level would still help decrease the number of ‘unknowns’ without the existing side effects.

–Tony

charlie · March 17, 2019, 9:40pm

I think that really depends on taxa. For plants where there are huge and diverse genera it would make the algorithm mostly useless. And genus might actually be harder than species since it’s often defined by obscure flower parts or whatnot. As for not making the suggestions research grade that’s an interesting idea. I’ve often wondered about maybe running the algorithm automatically on everything but displaying it differently and not having it influence community ID at all. It could still be searchable and experts on taxa could scan those too. Just a thought. One could also just make it more conservative, equivalent to how it works on Seek.

But yeah if it was genus only I don’t think I’d personally ever use it.

tony_wills · March 17, 2019, 9:51pm

I was thinking it would just do its identification process as usual, but then only suggest the genus of the species it would otherwise suggest.

Maybe only do the genus level ID for initial uploads, then offer more suggestions to people adding IDs subsequently.
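
A rough sketch of the "identify as usual, then only show the genus" idea, in Python. Everything here is hypothetical: the `Suggestion` type, the scores, and the choice to sum scores per genus are illustration only and not how iNaturalist's computer vision actually works. It also assumes the first word of a binomial name is the genus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    species: str  # binomial name, e.g. "Danaus plexippus"
    score: float  # hypothetical model confidence in [0, 1]


def roll_up_to_genus(suggestions):
    """Collapse species-level suggestions to genus, summing scores per genus."""
    totals = {}
    for s in suggestions:
        genus = s.species.split()[0]  # assumption: first word is the genus
        totals[genus] = totals.get(genus, 0.0) + s.score
    # Highest combined score first
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Made-up scores, chosen only to show the roll-up
suggestions = [
    Suggestion("Danaus plexippus", 0.5),
    Suggestion("Danaus gilippus", 0.25),
    Suggestion("Limenitis archippus", 0.125),
]
genus_suggestions = roll_up_to_genus(suggestions)
```

Summing per genus means two look-alike species in the same genus reinforce each other instead of competing, which is one possible reason a genus-level suggestion could be safer than either species on its own.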

jciv · March 17, 2019, 11:13pm

As a new user, the AI was the best part of iNaturalist. You can often get answers in seconds in the field if you want. I had heard of iNaturalist and used it once before, but I was not interested in it till I found out about that feature.

If computer vision isn’t allowed to do species level identification, I think it will discourage a lot of new users. I agree in many cases this would eliminate mistakes. But for really well documented species, by now the AI should be pretty well trained. Many common birds and butterflies when narrowed down by location seem like they should be pretty reliable.

I disagree with the idea that IDs chosen by the person clicking on the auto suggestion shouldn’t count. I can’t remember the names or spellings of all the species I might post. I click the Auto Suggest box, if it gives me a good result, I click the species or genus. Otherwise, I type in the Class. It is a time saver. Doesn’t mean I am relying only on the AI. Many of my observations would never reach Research Grade if my AI assisted IDs are not counted. There aren’t many people IDing spiders in South Texas. I am lucky when I can get any agreement even to genus.

There has to be a balance between ease of use and accuracy to get new users to stick with iNaturalist. I don’t think genus level IDs are going to do that. I see so many people try iNat and never get IDs (because they leave their stuff as Unknown) so they don’t stick with it. People like quick answers.

mtank · March 17, 2019, 11:33pm

I think the system is pretty good as it is. The main problem is that people will accept the auto-suggested ID when even with a cursory look at the species page, it’s obviously a different species.

Maybe the best compromise is that you should need to click through the information page (with its larger image), or a custom confirmation/comparison page before you can accept the auto-ID at species level (or maybe at any level).

tony_wills · March 17, 2019, 11:48pm

Some good points there

Yes, it is nice to have prompt suggestions as to species. This perhaps suggests the second suggestion of just not making them count towards a community ID would be better.

But:

The trouble is how do we differentiate between that level of interaction, where we are using the tool for assistance, and the automatic acceptance of whatever the AI suggests?

A “solution” is to weigh IDs depending upon the user’s “reputation” (ie whether they’ve demonstrated from previous IDs that they know what they are talking about), but that is a bigger more complex change to iNat. I am looking for an interim “fix”.

–Tony

pfau_tarleton · March 17, 2019, 11:56pm

For many taxa (e.g. beetles) restricting suggestions to Genus and above wouldn’t help. But for some taxa, I’m surprised at how well it works (and it’s improving constantly). I’d hate to see any restrictions placed on it–except maybe a warning message that the user should not accept the suggestion unless they have verified it using other sources. I use it a lot because I can’t remember names of things very well, so I’d not want it to prevent my ID from counting just because I use it.

jdmore · March 18, 2019, 12:00am

If this suggestion did get implemented, I would want to see an accompanying button that I could click labeled “Species Nearby” so that I can still use AI for species level suggestions if I choose.

I also think that implementing a more geography-aware CV/AI system would go a long way toward mitigating some of its issues, as being discussed here and here.

tony_wills · March 18, 2019, 12:08am

Perhaps this feature change should be rephrased as asking that the primary auto suggestion should always be to above species level, so that the user has to actively do a little more decision making.

jdmore · March 18, 2019, 12:14am

Yes, I think that could help. On the other hand, I also wouldn’t want too many hurdles in the way of “I know exactly what this thing is (and so does CV/AI), I just want to see the choice, click on it, and move on.” Which is how I personally use CV/AI a lot of the time.

We need more identifiers as it is, and I wouldn’t want to do anything to slow them down!

tony_wills · March 18, 2019, 2:01am

Well that’s actually where this proposed feature is coming from - the extra work created by having to overcome bad IDs. If the initial ID is just to a high level taxon in the right branch, then no problem. But if you have IDs in the wrong direction you need more than one ‘correct’ ID to pull the community ID back into line.
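
To put a number on "more than one correct ID", here is a simplified Python sketch. It assumes the community ID is whichever taxon strictly more than two-thirds of identifiers agree on, and it only looks at species-level IDs. The real community taxon calculation also walks up the taxonomic tree and handles explicit disagreements, so treat the threshold and both function names as assumptions for illustration.

```python
from collections import Counter
from fractions import Fraction


def community_species(ids, threshold=Fraction(2, 3)):
    """Return the species that strictly more than `threshold` of IDs agree on, else None."""
    if not ids:
        return None
    taxon, votes = Counter(ids).most_common(1)[0]
    return taxon if Fraction(votes, len(ids)) > threshold else None


def correct_ids_needed(wrong_ids, threshold=Fraction(2, 3)):
    """Smallest number of agreeing correct IDs that outvotes `wrong_ids` identical wrong IDs.

    Only terminates for thresholds below 1.
    """
    n = 1
    while community_species(["wrong"] * wrong_ids + ["right"] * n, threshold) != "right":
        n += 1
    return n
```

Under that assumed rule, one wrong species-level ID needs three agreeing correct IDs before the correct species becomes the community ID (3 of 4 is above two-thirds, 2 of 3 is not), and two wrong IDs need five.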

Actually I’m waiting for somebody to come back and say something like: “we’ve analysed the stats and the number of correct auto-generated IDs has lowered the number of wrong initial IDs by X amount, and reduced the number of ‘unknown’ observations by Y amount. And that any apparent increase is due to the increased number of observations, but numbers of ‘bad’ IDs is increasing slower than the number of ‘good’ IDs.”

jdmore · March 18, 2019, 2:08am

I would love to be convinced that a feature that slows down my ID rate will be more than compensated for by having fewer things to ID. So yes, bring on the stats!

charlie · March 18, 2019, 2:29am

for the plants i look at… i feel like it’s more or less a wash. there are random things mis-id’ed but about half the totally random and weird ones don’t even come from the algorithm. And i verify plenty of correct IDs that are from the algorithm (though of course others may also be using it as an auto-correct type shortcut, i do that too sometimes). The iNat staff definitely has stats on how ID time and such have changed, some are visible here but that only goes back a year. It looks like average time to get an ID is drifting downward but i would also guess some of that is just due to northern hemisphere winter. I wish that graph went back further!

I do think limiting the algorithm geographically would make it much more an asset than it is now.

carrieseltzer · March 18, 2019, 2:42am

Without going into a lot of detail that is not in my area of expertise, I can share that the iNat team is actively working on improvements to computer vision that will perform much better for suggesting taxa coarser than species (but will still suggest species when we have enough data).

@charlie, there’s a url hack to the stats page to see more stats history, e.g. https://www.inaturalist.org/stats?start_date=2008-03-20

tony_wills · March 18, 2019, 2:42am

I suppose they didn’t want it getting it wrong for vagrant specimens etc, but yes if it’s not known from the area, then don’t suggest it would work for me.

charlie · March 18, 2019, 2:44am

omg awesome!

seems like vagrants and outliers are rare and important enough that they warrant the extra human review imho though

carrieseltzer · March 18, 2019, 2:45am

Should have added that the team is also exploring the impact of removing species from CV suggestions if they have not been seen nearby.
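
As a purely illustrative Python sketch of that kind of filter (not the team's implementation): given a ranked list of species suggestions and a set of species already recorded near the observation, drop anything not in the set. The function name, the data shapes, and the example scores are all hypothetical.

```python
def filter_to_nearby(suggestions, seen_nearby):
    """Keep only (species, score) suggestions whose species has been recorded nearby.

    The original ranking order is preserved.
    """
    return [(species, score) for species, score in suggestions if species in seen_nearby]


# Made-up ranked suggestions and a made-up local species list
ranked = [
    ("Vanessa cardui", 0.5),
    ("Vanessa kershawi", 0.25),
    ("Vanessa atalanta", 0.125),
]
seen_nearby = {"Vanessa cardui", "Vanessa atalanta"}
nearby_only = filter_to_nearby(ranked, seen_nearby)
```

A filter like this would also hide a genuine vagrant, which is the trade-off raised earlier in the thread.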

tony_wills · March 18, 2019, 3:49am

Ok, enough said, the team are on to it, we can pack up and go back to the work of observing things :-)

Just as an aside: I was just doing some back-of-the-envelope type validity checks using that stats page, looking at the percentages in each year of species level community identifications - and I saw a decline from about 90% in 2008-2009 to 73% in 2017-2018 … then I realised it wasn’t 90% of the 2008-9 obs had been IDd to species in 2008-9, but rather that over the past 10 years 90% have finally got there!

jdmore · March 18, 2019, 3:54am

Hmmm, I hope it is a little more nuanced than that. There are still many many species that just haven’t yet been observed on iNaturalist. Hopefully it will also take into account all the atlases, species ranges, and place checklists that have already been added to iNaturalist – and will as a side benefit motivate further development of those geographic data sources!

EDIT - though thinking more clearly, of course species never observed in iNaturalist would not be showing up as suggestions anyway. But still, for species that have been…

andrewgillespie · March 18, 2019, 11:05am

I would not like to see this. The species level id is getting quite good for the things I submit. And if it does make a mistake then it is an opportunity to train it. Not going to species level would mean it does not get better at the species level.
