Automatic iNat suggestion for "unknown" observations that reach a certain age

Hi - I’m with you on recruiting more identifiers. & I do like the idea above of distinguishing a high level CV id from a user id, so that it doesn’t carry the weight of a human opinion but does come up in filters for plants.

For India, lots of observations HAD been languishing for years in Unknown - the reason that Unknowns in India only stretch back 9 months is that I and others have spent the hours classifying them (starting at the oldest) because we’re motivated by a particular project effort. These included some highly identifiable observations from a particular user that had just been ignored due to age. We are limited by the amount of available identifier effort - and there are many other ways to improve the data resource (in this case on wild flowering plants), like marking up pot-plants and street trees and finding / drawing attention to good observations with missing metadata (usually an upload problem with the app)…


My perception is that in most cases identifiers just don’t realize or remember that the Unknowns are there to be gone through, and that there are easy ways to include them.

A concerted education campaign would be one way to address that, but if the technology can help too, then why not? :wink:


I’m a non-expert at everything, and I spend a lot of time going through unknowns. I learn a lot from this! But I wonder if I wouldn’t learn more having cv automatically applied, and then going through local guides / examples and trying to one-up cv in more restricted domains.

I don’t know a good way to implement this, but I can imagine I’d be more useful going through a series of unknowns where “computer vision result was marked very wrong”. We all know this happens, and that sometimes humans can quickly figure out where things went wrong and get the observation back on the rails.


"By “surfacing old unknown observations,” what I mean is, when someone filters their identification stream for all needs-ID vascular plants from Nevada (as an example), I want them to see two kinds of observations in that stream:

all needs-ID Nevada observations identified by a human as Tracheophyta or a descendant
all old Unknown Nevada observations identified by CV as Tracheophyta or descendant"

This seems like the right solution. The problem is not that observations do not have an ID, the problem is that experts in a particular taxon cannot find the observations which no one has yet identified to that level. For example, I have reviewed all the aphid observations, but I will never see the observations of aphids which are currently stuck at an identification of insects, arthropods, or unknown.

The computer vision should be good enough to add at least a few hundred obvious aphids to the search. And even things which are not aphids, but look similar, are most likely to be correctly identified if added to that search.
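
For illustration, the two-part stream quoted above could be sketched roughly as follows. This is only a sketch under stated assumptions: the `Observation` record, its field names, and the 180-day cutoff for "old" are all hypothetical and are not taken from iNat's actual data model or API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Observation:
    # Hypothetical record; field names are illustrative only.
    obs_id: int
    place: str
    age_days: int
    # Lineage of the human ID, coarsest taxon first; empty means Unknown.
    human_lineage: Tuple[str, ...] = ()
    # Lineage of a stored CV suggestion, coarsest first; empty if none.
    cv_lineage: Tuple[str, ...] = ()


def in_stream(obs: Observation, place: str, taxon: str,
              min_unknown_age_days: int = 180) -> bool:
    """Return True if obs belongs in an identifier's stream for taxon/place."""
    if obs.place != place:
        return False
    if obs.human_lineage:
        # Kind 1: a human has identified it as the taxon or a descendant.
        return taxon in obs.human_lineage
    # Kind 2: still Unknown, old enough, and CV places it under the taxon.
    return obs.age_days >= min_unknown_age_days and taxon in obs.cv_lineage


observations = [
    Observation(1, "Nevada", 10, human_lineage=("Plantae", "Tracheophyta")),
    Observation(2, "Nevada", 400, cv_lineage=("Plantae", "Tracheophyta", "Asteraceae")),
    Observation(3, "Nevada", 5, cv_lineage=("Plantae", "Tracheophyta")),
    Observation(4, "Nevada", 400, cv_lineage=("Animalia", "Arthropoda")),
]
stream_ids = [o.obs_id for o in observations if in_stream(o, "Nevada", "Tracheophyta")]
```

Here observation 3 is left out because it is a recent Unknown, and observation 4 because CV places it outside Tracheophyta. The CV lineage is only used for filtering, so it never counts as an identification.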


Wow, thank you for tackling those Indian plant observations stuck in Unknown! I found them too exhausting and frustrating, and had to step away a while ago. There were indeed many fantastic observations there just waiting for some attention.


I have been following this discussion with interest. When I created this feature request, I was mostly spending my time IDing “unknowns.” I was concerned that the number of “unknowns” was increasing, but I was really interested in increasing the efficiency of putting the correct label on the observation.

I think both of those premises are somewhat faulty:

  1. It turns out that the number of unknown observations is not increasing. Bouteloua posted a nice graph that shows that the number of unknowns spikes in April in the Northern Hemisphere, but then the backlog gets worked off during the winter months. It has been roughly stable for several years.

  2. Choess pointed out that the “unknown” observations are not the “rate-limiting step,” in IDing observations, which I think is a good way of putting it. There are way more observations stuck at higher-level taxonomic classifications (Kingdom, phylum, subphylum, etc.) in plants, for example, than unknowns.

So from an efficiency standpoint, the proposed feature request might not really help that much, especially if it is just used to get an unknown to a kingdom or phylum-level classification.

From the above, I’m less convinced now that this feature should be implemented.

If we could somehow just get the observations that are suitable for CV to be the ones that CV is applied to, and these observations could be IDed to genus, family or order with an acceptable error rate, I would be all for that. But there are a bunch of unknowns that have blurry images, or the focus of the observation is unclear, or have any number of other issues that can confuse the CV; if these kinds of observations can’t be screened out somehow, I imagine that the error rate would be unacceptable.

If I were proposing a new feature request, I think I’d suggest that it be a sort of experiment, tested on a limited geographic region or maybe even a single taxonomic group (or maybe on a “low risk” group like observations marked “captive/cultivated”), to see how well it works. Obviously it will only work well in areas that have lots of observations already, and I can imagine it working better on some taxa than others. I wouldn’t limit it to unknowns, either; maybe go down to phylum and see if it could help address some of the backlog at higher classifications.

That’s my 2 cents, thanks for all the thoughtful comments.


i suggested above that the cv taxon should be captured in a separate field. i’m thinking if it could be expanded to capture confidence of the cv suggestion (see, then i think that gets you what you want, and it could also get what i was describing here:


We’ve discussed this idea as a team, and while it’s certainly interesting, it’s unlikely to be something we implement for at least a year, if we do implement it.


I always appreciate it when you let us know. Thank you.


There is a discussion thread about this:

Although there is no definite solution defined, I think a new DQA entry would not hurt and would be useful, whatever is done later with these observations:

There used to be iconic taxa buttons?!
I’ve wondered about proposing this before.
Why & when were they cut out? I’d imagine these to be helpful in encouraging users not to use species-level autosuggests blindly so much.


They were removed when computer vision was incorporated into iNat. If we were to add them in again, I think we’d have to figure out a flow that wasn’t confusing. Currently many people don’t know they can search for taxa and aren’t just limited to the CV suggestions. Anyway, this is getting a bit far afield from the requested feature and maybe something to think about for the app redesign.


Simple experimental solution:

Create a bot user that IDs old, neglected observations. :mechanical_arm::robot::mag_right::card_file_box::spider_web:
Instruct it to never suggest species or subspecies, only genus or subgenus. That way, it won’t move anything towards Research Grade.
You could even have it automatically retract its suggestions when enough human users provide theirs.

And if said bot turns out to be more annoying than helpful, just ban it for trolling! :skull_and_crossbones:
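
A rough sketch of the two bot rules above, purely for illustration. The function names, the `(rank, name)` lineage format, and the default of two human IDs counting as "enough" are all assumptions; nothing here reflects how iNat's computer vision or identification system actually works.

```python
from typing import List, Optional, Tuple

# The bot may only suggest these ranks, so it never proposes a species.
ALLOWED_RANKS = {"genus", "subgenus"}


def bot_suggestion(cv_lineage: List[Tuple[str, str]]) -> Optional[str]:
    """Pick the finest allowed taxon from a CV lineage.

    cv_lineage holds (rank, name) pairs ordered coarsest first.
    Returns None when the lineage has no genus or subgenus.
    """
    for rank, name in reversed(cv_lineage):
        if rank in ALLOWED_RANKS:
            return name
    return None


def should_retract(human_id_count: int, enough: int = 2) -> bool:
    """Withdraw the bot's ID once enough humans have added theirs."""
    return human_id_count >= enough


lineage = [
    ("family", "Aphididae"),
    ("genus", "Aphis"),
    ("species", "Aphis nerii"),
]
suggested = bot_suggestion(lineage)  # species is skipped, genus is returned
```

Returning `None` when CV cannot reach genus means the bot stays silent on those observations instead of adding another kingdom-level or phylum-level ID.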


i like this idea, i feel like others might not, but i like it.


My perspective on this is similar to that of @matthias55 (edit: in his 2nd post) above. When looking at observations older than say 18 months, I doubt there are that many observations in the unknown pile that can be IDed to genus easily (be it via CV or manually).

Personally I’m more concerned about blurry images, or nice images showing a landscape with probably 200 species visible, but no focus on any of them and none IDable in any reasonable sense. What I find frustrating about this is that if I leave it unIDed generations of IDers to come will look at this image again and waste their time too. I know, this has been discussed on another thread (or multiple).


i wonder how hard it is to track how many people click ‘reviewed’ on each observation. And maybe after a certain number, an observation could get ‘demoted’ maybe not quite to casual but to a lower priority, or something. Though i will sometimes click reviewed on something i’d have no chance at identifying so maybe it’s a bad idea. Maybe separate them out as ‘extra challenges’ and create a way to mark them more easily as needing casual grade. Something like that


The number of Reviewed clicks is definitely info available to iNat.
Set a barrier - 6 clicks is Unidentifiable.
Then the suckers for punishment can still try and work out What Are We Looking At. If they want to …

On second thoughts: if that was visible to us (“This obs has already been reviewed six times and is still UNKNOWN”), it would be a kind warning to scroll on by.


Visibility would be good. Being able to filter on this would be perfect.
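
To make the idea concrete, here is a small sketch of both the visible warning and the filter. The `ReviewedObs` record and its fields are hypothetical; the barrier of 6 comes from the post above, and whether iNat exposes a per-observation reviewed count like this is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

REVIEW_BARRIER = 6  # the "6 clicks" barrier suggested above


@dataclass
class ReviewedObs:
    # Hypothetical record; not iNat's real schema.
    obs_id: int
    reviewed_count: int
    has_identification: bool


def review_warning(obs: ReviewedObs, barrier: int = REVIEW_BARRIER) -> Optional[str]:
    """Return a warning for much-reviewed Unknowns, otherwise None."""
    if not obs.has_identification and obs.reviewed_count >= barrier:
        return (f"This obs has already been reviewed {obs.reviewed_count} "
                "times and is still UNKNOWN")
    return None


def filter_by_reviews(observations: List[ReviewedObs],
                      max_reviews: int) -> List[ReviewedObs]:
    """Keep only observations reviewed at most max_reviews times."""
    return [o for o in observations if o.reviewed_count <= max_reviews]


pile = [
    ReviewedObs(1, 0, False),
    ReviewedObs(2, 7, False),
    ReviewedObs(3, 9, True),
]
fresh_ids = [o.obs_id for o in filter_by_reviews(pile, max_reviews=5)]
```

Nothing is demoted here: the count is only shown and filtered on, so anyone who wants the hard cases can still find them.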


Yes, you can!

Observations of Aphididae currently without identification at all:

Similarly, observations of Lepidoptera currently without identification at all:


For plants, most useful is trees:

I detest identifications to Plants, Vascular Plants, Flowering Plants, most especially for easy to ID groups like daisies, peas and trees.
Especially when even a novice can easily identify a tree, and trees are such a well-known subset of the flora of any region, often with multiple field guides and abundant enthusiasts. If novices interested in helping with identification could just be informed about these projects for lower groups it would really help.

But I don’t understand why butterflies and moths are not just identified as Lepidoptera, and aphids as Aphididae. Why do they need a special project?
It is polyphyletic groups such as Trees, Succulents, Bulbs, Seaweeds, Aquatic Plants, etc. that need projects to assist with IDs.
Now if there could be icons for these polyphyletic groups for novices to easily post unidentified observations, that would be supercool!