Add filter for observations with only original ID

Platform(s), such as mobile, website, API, other: Mobile, web

URLs (aka web addresses) of any pages, if relevant: NA

Description of need:
I would like to focus on identifying “ignored” observations that don’t have any identifications beyond the initial ID. I think this would help keep users engaged so they know someone else has at least looked at their observation, rather than just never getting a response of any kind on the platform.

Feature request details:

Add an “only one ID” filter or something similar. I would love to focus on identifying observations that don’t have at least one additional identification beyond the observer’s identification. I know there is a filter for “disagreements”, but what about observations that no one has looked at - or at least for which no one has added an additional identification?

I approved this, but after talking with an engineer, this would probably be a heavy lift. He says right now we don’t store these data in our index, so if we added this we would have to reindex every single observation on iNat.

3 Likes

Isn’t this just the basic “Identify” feature? My interest is snakes, so I just push the Identify button, then I add the county that I’m familiar with. These observations almost all have just the observer’s ID. Sorry if I’m misinterpreting your request though.

1 Like

Identify (as a basic thing) will find anything in Needs ID, whether it has zero IDs or 100. So no, it doesn’t answer the question at all.

1 Like

Forgive my ignorance, but isn’t a re-index eventually going to be necessary anyway? Couldn’t this be tacked on to whatever the next re-indexing is?

4 Likes

I was looking around in the codebase just to investigate this, and I was wondering if this request could be mostly implemented by implementing a filter that utilizes this indexed field (num_identifications) on observations, and checking if it equals zero. I believe this would return observations that do not have any identifications by anyone except potentially the observer (the observation could have 0 identifications or 1 or more identification(s) by the observer). I checked my local environment to test this out, and it seems that field is indexed in my local ES cluster, and it seems to fit my assumption about the field… but I also know that my local data is limited :)

It would not exactly provide what @swisschick is asking for, but it would be closer! Although perhaps OP can chime in and see if this filter would be good enough.

edit: It seems a filter on num_identifications = 0 can also include observations where all of the identifications from all non-observer users are withdrawn… so that is also something to consider!
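For discussion, here is a minimal sketch of how such a filter might be expressed as an Elasticsearch-style query body, built as a plain Python dict. Only the field name `num_identifications` comes from the post above; the function name and the `extra_filters` parameter are hypothetical, and this is not the actual iNaturalist implementation.

```python
import json


def only_observer_ids_query(extra_filters=None):
    """Build an Elasticsearch-style query body as a plain dict.

    Assumption: `num_identifications` counts current identifications by
    users other than the observer, as described in the post above.
    """
    # Exact-match filter on the indexed count field.
    filters = [{"term": {"num_identifications": 0}}]
    if extra_filters:
        # Hypothetical hook for combining with other filters (place, taxon, ...).
        filters.extend(extra_filters)
    return {"query": {"bool": {"filter": filters}}}


if __name__ == "__main__":
    print(json.dumps(only_observer_ids_query(), indent=2))
```

If the field behaves as described, this would also match observations where every non-observer identification has been withdrawn, so it approximates the request rather than matching it exactly.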

8 Likes

95% of the observations I see under “needs ID” only have the observer’s ID. That’s practically what the OP is looking for.

It isn’t. I don’t know what taxa you generally look at, but in taxa that often can’t be ID’d to species from photos, you end up with a situation where there are lots of observations that have gotten a second ID from a specialist but are not RG because they were ID’d at genus or subgenus or species group. In many cases IDers do not regularly use the “ID cannot be improved” for all sorts of reasons, so these observations remain “needs ID” indefinitely.

In addition, taxa that are difficult to identify from photos are often also taxa where there is a shortage of IDers, which means that unless IDers are able to find a way to coordinate their activities, you end up with some observations getting looked at by multiple people and some not getting looked at by anyone at all, with the selection of which observations fall into which group being fairly idiosyncratic.

So for identifiers of such taxa, it would be helpful to be able to distinguish between observations that have been looked at by someone and observations that have not.

3 Likes

I look at plant observations at a range of levels, and I find lots of Needs ID with more than one ID - certainly more than 5%. Even for plants that can easily be identified to species, the observer doesn’t always start with a species-level ID, so it sometimes takes a few iterations to get to a species-level ID, even without disagreements (speaking of disagreements, I recently noticed a notification for an observation which had just got its 14th ID, at which point it finally reached RG). Of course, that’s probably not the sort of thing you’d want such a filter for, but that’s not my point.

1 Like

I like the idea of giving the observer some feedback. One thing to notice is if you are on the iNat website and look at observations in grid view, you can spot the “only original” observations by the little number tag below them, and the observer by icon. So you could filter, say, for plants in a county with not a lot of observations and you might see an observer with a lot of “only originals” and give them some attention. It’s not fast but it can be interesting and rewarding.

I looked through the most recent 1,500 plant observations in my state that “needs ID” (50 pages using the identify function). There were 42 that had two or more IDs so only 2.8%. I used the same parameters for one year ago and there were 149 that had two or more IDs which is 9.9%. My quick guess was 5% and the actual average I found was 6.35%.
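For anyone who wants to check the figures, here is a small sketch of the arithmetic. The counts are the ones quoted above; the variable names are just for illustration. It also shows the pooled rate over both samples, which differs slightly from the average of the two rounded percentages.

```python
def percent(part, total):
    """Return part/total as a percentage."""
    return 100 * part / total


# Counts quoted in the post: observations with two or more IDs, out of 1,500.
recent = percent(42, 1500)
year_ago = percent(149, 1500)

# Average of the two percentages after rounding each to one decimal place.
average_of_rounded = (round(recent, 1) + round(year_ago, 1)) / 2

# Pooled rate: all multi-ID observations over all observations sampled.
pooled = percent(42 + 149, 1500 + 1500)

print(f"recent: {recent:.1f}%  year ago: {year_ago:.1f}%")
print(f"average of rounded: {average_of_rounded:.2f}%  pooled: {pooled:.2f}%")
```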

More recent observations will have been looked at by fewer people, so it is more likely that “needs ID” observations will only have a single ID.

If you look at just mosses or just grasses instead of all plants, I suspect you will find that a much higher percentage of “needs ID” observations have multiple IDs. Ditto with many arthropod groups (say, Diptera, Hymenoptera) and probably fungi, gastropods, etc.

1 Like

For comparison, I looked at two pages of “needs ID” arthropod observations sorted by random, and 28/60 = ~47% of observations had more than one ID. You can try it here:

https://www.inaturalist.org/observations/identify?order_by=random&verifiable=true&taxon_id=47120

1 Like

I decided to try a sample of plants on Random in my state (Victoria) and came up with 33 out of 100 having 2+ IDs. Admittedly that’s not as big a sample as 1500 but does suggest a big difference between what you’re seeing and what I’m seeing - which is fascinating. I wonder what the difference is.

1 Like

The difference is “most recent” versus “random”, as @spiphany and @yongestation identified.

You only quoted part of the figures:

Yes, my random sample will have gone back well over a year, but there’s still a big difference between the overall average (recent + 1 year) of 6.35% and the 33% I found. I could try sampling each year and see how much the percentage changes over time, but I really don’t care that much. :-)

The plant identifier coverage of your respective areas may be different, or the percentage of observations that are uploaded with an initial species-level ID (meaning that they become RG with few intermediate steps), or IDer habits (whether people add refining IDs even if they can’t suggest a species), or the identifiability of most plants, etc. Lots of factors that might affect the percentage of observations that are identified and confirmed at species level and how quickly this happens.

I suppose this request wouldn’t help IDers find observations that were initially entered with a very broad ID and then refined but haven’t gotten confirmation for the refining ID, but it would at least help find many of those observations that haven’t been looked at by anyone with expertise.

If number of IDs per observation is not currently stored accessibly by the system, I wonder whether community taxon vs. observation taxon is – i.e., if the observation taxon is more specific than the community taxon or there is no community taxon because there is only one ID and there is a way to filter for such observations, maybe this would be an alternative way to find observations that need attention rather than ones that already have a consensus above species level.
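To make that idea concrete, here is a rough sketch of the check as a Python predicate. The record shape and both field names are hypothetical, and the convention that a lower rank level means a more specific taxon is an assumption for illustration; nothing here is taken from the real index.

```python
from typing import Optional, TypedDict


class Obs(TypedDict):
    # Hypothetical record shape, for illustration only.
    taxon_rank_level: int  # assumption: lower number = more specific rank
    community_rank_level: Optional[int]  # None when there is no community taxon


def needs_attention(obs: Obs) -> bool:
    """True if there is no community taxon, or the observation taxon
    is more specific than the community taxon."""
    community = obs["community_rank_level"]
    if community is None:
        return True
    return obs["taxon_rank_level"] < community


if __name__ == "__main__":
    sample: Obs = {"taxon_rank_level": 10, "community_rank_level": None}
    print(needs_attention(sample))
```

An observation where both levels match (a consensus above species level, say) would not be flagged, which is the distinction described above.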

1 Like

I imagine it depends on the number and interests of the identifiers in each of our areas. Your area might have more identifiers looking at plants. Or maybe the types of plants in each of our areas have more dedicated identifiers? Like wildflowers versus trees versus mosses, etc. It’s really fascinating to think of all the different types and combinations of variables.

1 Like