Add a "visually identifiable" scale parameter to set user expectations on confidence of computer vision IDs

DianaStuder · July 1, 2020, 3:32pm

AI suggests this is XY

We need to let new people know not to auto-agree
(do you know this is XY? Then click agree)

iNat is pretty sure - sounds definite and unambiguous. This IS what it is, no doubt about it!

naturesarchive · July 2, 2020, 11:55pm

Right - I’m not saying to prevent selecting it. I was really thinking along the lines of “setting expectations with the observer” to help them understand when a visual ID is actually quite difficult and unlikely unless they are an expert. I’m not a UI/UX person, but I’m confident appropriate steering/wording could be done.

naturesarchive · July 3, 2020, 12:10am

I agree that this would be a challenge. One example I read about recently: leaf miners are often easier to identify by the host plant and mining pattern than by looking at the insect directly.

But, interestingly, it seems like the AI determines if it thinks an insect is a nymph or adult when making the suggestion. It would be good to surface those assumptions, like “iNat thinks this is a nymph Milkweed Beetle”. That would give users more insight into the suggested ID and hopefully give them pause when it is clear that the observation is not a nymph, for example.

A possible solution would be to seed a tuple data structure, i.e. {taxonomic_level, life_stage}, leaving unseeded tuples working as they do today and providing the extra insight only on seeded tuples. This gets a little more complex from a database perspective, so another idea would be to seed only in cases where the ID challenge is consistent across all life stages, i.e. take a minimally viable starting point that can then be improved with future augmentations to the feature.
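A minimal sketch of that seeding idea, assuming a plain in-memory lookup rather than anything iNaturalist actually uses. The names `SEEDED_DIFFICULTY` and `expectation_note`, the 1 to 5 scale, and the sample entries are all hypothetical and only there for illustration:

```python
from typing import Optional

# Illustrative values only: (taxon, life_stage) -> difficulty, 1 (easy) to 5 (hard).
# A life_stage of None means the entry was seeded for every life stage.
SEEDED_DIFFICULTY = {
    ("Sarcophaga", None): 5,
    ("Example taxon", "nymph"): 4,
}


def expectation_note(taxon: str, life_stage: Optional[str] = None) -> Optional[str]:
    """Return an extra note for seeded tuples, or None to leave behaviour unchanged."""
    difficulty = SEEDED_DIFFICULTY.get((taxon, life_stage))
    if difficulty is None:
        # Fall back to an entry seeded for all life stages, if there is one.
        difficulty = SEEDED_DIFFICULTY.get((taxon, None))
    if difficulty is None:
        # Unseeded tuple: nothing extra is shown, exactly as today.
        return None
    return f"{taxon}: visual ID difficulty {difficulty}/5"
```

Unseeded tuples return `None`, so the extra wording would only show up where someone has deliberately seeded it.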

cmcheatle · July 3, 2020, 12:14am

It doesn’t do that. It compares the photo to a set of photos which include larva etc and determines this looks like x.

It doesn’t say this is a larva, let’s go compare against all the larva photos I have access to and limit my analysis to those.

naturesarchive · July 3, 2020, 12:19am

I agree that it would be hard to do with everything all at once. A good approach is to start simple. Maybe pick a few families, run an experiment, see if it can scale and be maintained. My view is maybe it never has to be completed. If it can simply be improved in some useful way, say, for some problematic taxa, that might still be a win.

My secondary thought is that with the idea of seeding, it ties in with the entire concept of machine learning. ML needs humans to seed the knowledge. By seeding, over time, ML would learn and could inform the AI which scenarios are not identifiable, filling in the gaps. It might take years, but that’s pretty much how it works.

sbushes · July 24, 2020, 10:27pm

I like this idea insofar as I think it’s really helpful just to forewarn newer users, or those unfamiliar with a taxon, that there might be an issue. We have this in place in the UK platform iRecord. It automatically alerts you when you ID a species which is out of typical range or not usually identifiable from photos. I think it works well. I think it would also save identifiers from having to state all the time…“not identifiable from photos”. Formalising this issue helps it to remain impersonal as well, which could also save time.

I don’t feel this needs to be connected to the CV suggestions though? Why not simply initiate the notification when anyone IDs the taxa in question?

As this is already implemented in iRecord, I can’t imagine it’s totally unworkable. But as a way of not adding extra effort to create, perhaps it could use some sort of consensus voting mechanism to initiate… so the development side is just something simple, generic and embedded in all taxa. Then the rest could be crowdsourced. This sort of range of input would inevitably create a sliding scale of opinions, which could represent how tough it is to ID from photos alone. I’m no expert, but I know a whole bunch of Diptera not to even try to go to species on. I think it wouldn’t hurt to pass on that information in the form of a vote. Plus, I think the output doesn’t have to be anything rigid, complex or formal in comparison with taxonomic structure. It can be just a “might be a tough one FYI” kinda thing like @Star3 suggests:

I also like the symbols options suggested by @lm77. My worry though would be that this might get lost in the existing interface. Maybe in addition to a notification somewhere?

As a general rule of thumb, isn’t it just always harder to ID juveniles, nymphs, bare trees, etc? Wouldn’t it still be helpful to flag the taxa for users, even if only applied to those which are well known to be difficult as adults, or when in leaf?
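The consensus-voting idea in this post could be sketched roughly as follows. This is only an illustration under assumptions: the 1 to 5 vote scale, the `min_votes` cutoff, and the function name `difficulty_score` are all made up here and are not part of iNaturalist or iRecord:

```python
from statistics import mean


def difficulty_score(votes, min_votes=3):
    """Average 1-5 difficulty votes into a sliding-scale score.

    Returns None until at least min_votes people have voted, so a single
    opinion does not flag a taxon on its own.
    """
    if len(votes) < min_votes:
        return None
    return round(mean(votes), 1)
```

The score is a soft signal rather than a rule, which fits the “might be a tough one FYI” style of output described above.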

Megachile · July 24, 2020, 11:49pm

Tangential to the Feature Request discussion here, but I’ve often wished iNat supported a taxon info along the lines of Bugguide’s. Or a public-facing general-info flag system. So when I go to a species info page, I could see notes like “hey this whole family requires microscopy” or “here’s a useful Guide or key” or info on distinguishing commonly mistaken species.

upupa-epops · July 25, 2020, 2:07am

There was this topic: https://forum.inaturalist.org/t/add-comments-or-wiki-like-functionality-on-taxa-pages-to-discuss-identification-and-other-relevant-issues/91

fffffffff · July 25, 2020, 8:14am

Is iRecord for the UK only or for all territories? Creating such a thing for one country is already a lot of work, but doing it for all species of the world sounds undoable in the short term.

sbushes · July 25, 2020, 9:27am

UK I think.

Short term, even if you could just warn people going to species on things like Sarcophaga and Psychodidae, that could be helpful. The most common ones are the ones which take the most energy.

Perhaps the flags could just be raised on these repeat offenders, where there is the most need… rather than even trying to conquer all species in all locations.

bazwal · July 25, 2020, 4:52pm

This is really only partially true at best. The out of range alerts cause a lot of false positives because they’re solely based on existing records that have been accepted by the NBN Atlas. This is usually out of date, and has a large number of gaps and omissions - so the information it provides is often quite misleading. The ID difficulty ratings are more useful, but they only cover a limited subset of UK species. The ratings are provided by the various taxon-specific recording schemes in the UK and are quite narrow in scope. They only offer brief advisory messages about what kinds of evidence the schemes require, and don’t usually offer any details about why the identification might be difficult. Here’s an example:

[screenshot: example iRecord advisory message]

These messages can be useful if you already have some recording experience, but it is very common for new users to be confused by them. There are many posts on the iRecord forum from people thinking the messages mean their records have been invalidated in some way. Ironically, the usual advice is to simply ignore them.

I don’t wish to appear too negative about iRecord (which I also use myself), but I’m afraid your assessment here may be somewhat inaccurate.

sbushes · July 25, 2020, 8:18pm

Ok nice! Good to have some finer grain to my limited awareness of the iRecord system

The negative time costs resulting from any automation are a good point.
Any negative time costs to this action should be weighed against positive time gains though.

At present, on Sarcophaga photos ID’d to species, most identifiers might write a personalised note explaining the difficulty. Each new note that each identifier writes adds a considerable net cost in identifier time to the overall process.

Let’s say this is semi-automated and there is a button added for identifiers which, when pressed, gives the user a more formal note explaining the difficulty. Now identifiers only have to press a button instead of writing it out…but it’s doing pretty much the same thing. Would this be a worthwhile change in terms of time cost/gain?

This would also enable the logging of the button presses. So then let’s say that, after a period of time, the buttons for genera which are being pressed over and over again are simply pressed in advance, forewarning the user automatically as @star3 suggested… or making a little symbol show up on the taxa in question, as @lm77 suggested. More time saved! Would this second change be worthwhile in terms of time cost/gain?
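As a rough sketch of that two-step idea (log the presses, then forewarn automatically once a taxon keeps getting flagged), assuming a simple in-memory counter. The class name `DifficultyFlagLog` and the threshold of 50 are hypothetical choices for illustration, not anything that exists on iNaturalist:

```python
from collections import Counter


class DifficultyFlagLog:
    """Counts identifier button presses per taxon (hypothetical helper)."""

    def __init__(self, auto_warn_threshold=50):
        self.presses = Counter()
        # Assumed cutoff: how many presses before the warning is shown in advance.
        self.auto_warn_threshold = auto_warn_threshold

    def record_press(self, taxon):
        """Log one press of the 'hard to ID from photos' button."""
        self.presses[taxon] += 1

    def should_forewarn(self, taxon):
        """True once a taxon has been flagged often enough to warn up front."""
        return self.presses[taxon] >= self.auto_warn_threshold
```

The same count could just as easily drive a small symbol on the taxon instead of a notification.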

I would imagine the queries from users who receive an automated suggestion won’t be more significant than the existing queries identifiers have to grapple with person to person. In addition, the response to any negative reaction to the automation could itself be generic, formalised and minimised… with FAQs, “what’s this?” buttons, etc…

The cost for the developers to implement though … is a whole other negative cost to factor in…

fffffffff · July 25, 2020, 9:43pm

IDers should save those messages, I’d die if I type a polite message each time someone didn’t mark a cultivated plant or forgot to add an id.

sbushes · July 25, 2020, 10:00pm

Sure. I know some do. I need to get round to it! I’m less organised :)
What’s the tool for it again?

In any case though …if some are using an automated message of their own and some not… this is just more evidence to build it in to the system in my book. Especially if iNaturalist wants to make the on-ramp for identifiers as smooth as possible…

sbushes · July 25, 2020, 10:21pm

…but maybe I’m the one suffering from confirmation bias here…

fffffffff · July 25, 2020, 10:26pm

I saved it on another platform and copy it when I start IDing; as new people make common mistakes, this copied message goes out many times in a row. I would like to have an option to have saved messages, but not if they are pre-made by someone else.

sbushes · July 25, 2020, 10:33pm

For me, I would rather it was just generic.
Less personal means less chance of it being taken personally.
Plus, I’m not diplomatic enough to get the tone right so would rather it was written by a pro
I get that that makes it more impersonal though…which is another “cost” in the equation I guess…

Potentially, saved messages could still be logged in some way though…like button presses. This segues into other ideas floated on other threads too about saving the ID features folks mention - which is already in the pipeline by the look of it :

naturesarchive · July 26, 2020, 12:47am

In the online world, it isn’t too hard to take a step-by-step approach. Features can be rolled out by taxa, by region, etc. Perhaps boiling the ocean with all regions/taxa is undoable, but starting small and learning what works and what doesn’t work seems like a good approach. The other way to look at it is “does it improve data accuracy”? That helps reduce the pressure for perfection.

pisum · July 26, 2020, 1:11am

it’s actually better to anticipate many of the potential issues upfront so that you can create a foundation that will at least be flexible enough to handle things that will inevitably come up. just slapping something together for one place or taxon at a time and then adding as you go seems like a good way to create a lot of major rework, which is not a good thing.

Star3 · July 26, 2020, 2:12am

Unless I feel like personalizing it, I just copy & paste from the frequent responses page (that’s what it’s there for, after all):
https://www.inaturalist.org/pages/responses

I have personalized ones saved to an unpublished journal entry, so I have access to them on any device as long as I am logged into iNat.
