Provide a way to filter observations by disputed IDs

I would like the ability to filter observations by IDs that have been disputed. (Apologies if this has already been proposed; I searched for it but didn’t find anything. There was a discussion about finding your own “maverick” IDs by typing into the URL, but this is a different proposal. If there is a way to achieve what I’m proposing here by typing directly into the URL, please let me know, as it would solve this problem.)

Often I am working through higher level taxonomic groups and find disputed IDs that have been kicked up to a high level and need additional IDs to reach genus or species level.

For example, suppose that an observation is (incorrectly) labelled winterberry. It is corrected to honeysuckle. Now if the original observer doesn’t take any action, the observation will sit at dicots (class) until two additional identifiers find it and agree with the correct ID. Sometimes it takes months or even years to get the 3rd identifier.
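To make the arithmetic in that example concrete, here is a minimal sketch. It assumes a simplified rule in which a taxon becomes the community ID only when more than two thirds of the IDs support it; the function name is made up for illustration, and the real algorithm also weighs ancestor disagreements, which this ignores.

```python
from fractions import Fraction

# Assumed simplified rule: a taxon wins only with MORE than 2/3 support.
THRESHOLD = Fraction(2, 3)


def reaches_consensus(agreeing: int, total: int) -> bool:
    """Return True when `agreeing` out of `total` IDs exceeds the 2/3 threshold."""
    return Fraction(agreeing, total) > THRESHOLD


# Start with one winterberry ID and one honeysuckle ID, then add
# further honeysuckle IDs one at a time and re-check the threshold.
for extra in range(3):
    agreeing, total = 1 + extra, 2 + extra
    print(f"{agreeing} of {total} IDs agree:", reaches_consensus(agreeing, total))
```

Under that assumption, one extra agreeing ID gives exactly 2 of 3, which does not exceed two thirds, so a second extra ID is needed.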

I think what is happening is that the observation gets buried in large numbers of higher level IDs, some or maybe many of which are just low quality observations (blurry photo, picture of a tree trunk, or a single leaf) or hard to identify past, say, dicots. The key point is that a lot of the disputed IDs would not be that hard to confirm if they could be found easily. Having the ability to filter for these observations would decrease the time that the disputed observations spend in taxonomic limbo.

@matthias55 I changed this to a general topic instead of a feature request, because I’m pretty sure the following will do what you are asking. But if not, we can change it back to a more targeted feature request.

[EDIT: turns out that the following may not be a reliable long-term solution to this request, so converted it back to a Feature Request.]

This will work in Identify mode (not in explore Observations mode), but that is probably where it is best used anyway. If you follow this link:

https://www.inaturalist.org/observations/identify?order_by=updated_at&order=asc&lrank=phylum

You will see the Filters set for

  • lowest rank of observation ID = Phylum (change to whatever rank you like)
  • sorted by date the observation was last updated, oldest first

Then if you manually add

  • &identifications=most_disagree

to the search URL, it will restrict to a much smaller set (I currently see 858 at Phylum) where most of the IDs added were disagreeing IDs.
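If you would rather not edit the URL by hand, the same link can be assembled in a few lines of Python. This is only a sketch: the helper name `identify_url` is made up, and the parameters are just the ones from the link above.

```python
from urllib.parse import urlencode

IDENTIFY_URL = "https://www.inaturalist.org/observations/identify"


def identify_url(lrank, identifications=None, **extra):
    """Build an Identify-mode search URL, oldest-updated observations first."""
    params = {"order_by": "updated_at", "order": "asc", "lrank": lrank}
    if identifications is not None:
        # e.g. "most_disagree", "some_disagree" or "most_agree"
        params["identifications"] = identifications
    params.update(extra)  # any further filters, e.g. place_id=42
    return IDENTIFY_URL + "?" + urlencode(params)


print(identify_url("phylum", identifications="most_disagree"))
```

Passing extra keyword arguments such as `place_id=42` appends them to the query string in the same way.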

I think that is the kind of search you are looking for, but again, if not, we can discuss a more targeted feature request. I know the filter interfaces are being reviewed for possible improvements, so maybe this is one that will become a regular filter option.

The other available options for this parameter are

  • &identifications=some_disagree
  • &identifications=most_agree

Those didn’t seem to have much effect for this particular use case.

I would give this two hearts if I could. Can you add it to the search wiki? I would do it but I don’t know what the difference between “some disagree” and “most disagree” is. Testing it out on plants, “some disagree” seems just to give me items where there is only one ID.

We can add it, but I have to confess I am not sure about the differences either, or whether this functionality will even be continued, after reading this old topic. I just know it seemed to work for this particular use case.

thank you very much!

hmm, I think I replied thank you too soon.
I tried this and it doesn’t seem to capture most of the disputed IDs.
e.g. with this link:
https://www.inaturalist.org/observations/identify?iconic_taxa=Plantae&order_by=updated_at&order=asc&lrank=class&place_id=42&identifications=most_disagree

I only get 9 results.

If I take off the “&identifications=most_disagree” there are 7729 results.
I would guess that all the “disputed” IDs would yield at least hundreds of results.

Even if I go down to family level (rank high = family), there are only 15 hits in all of Plantae. There should be 1000s here.

@matthias55 Yeah, as mentioned in the old topic I linked to above, it turns out that the most_disagree functionality is based on an outdated model of identifications in iNaturalist, and may soon be discontinued altogether. So I will go ahead and convert this topic back to a Feature Request.

A couple of questions:

I notice that your link restricts the search to just Pennsylvania. I assume that you are expecting to see higher numbers just within Pennsylvania?

Can you post a sampling of links to observations that you think the search should be capturing but is not?

Thanks!

Hmm, I’ve worked PA, OH, IN pretty heavily. I would expect all these states to have similar levels of disputed IDs because they have enough folks going through the flora observations and catching mistakes.
I would posit that it would hold across the US and Canada and really I would expect that you’ll find this level of disputed IDs in any taxonomic group where someone is working to catch errors.

Some examples of what I would like the filter to catch:
https://www.inaturalist.org/observations/7055140
https://www.inaturalist.org/observations/7066901
https://www.inaturalist.org/observations/7045167
https://www.inaturalist.org/observations/7078137
https://www.inaturalist.org/observations/7230415

These are all situations where an initial ID is provided and then disputed by 1 or more identifiers, keeping the community ID higher until enough additional identifiers find it.

Here’s one that’s slightly different:
https://www.inaturalist.org/observations/7399915

The identifier disputed the initial ID but didn’t propose a lower-level ID, e.g. “it’s not what you said, but I don’t know what it is.”

I would like the filter to catch this as well.

Here’s one that got stuck at family level (and has been stuck for 3 years, though it’s easily IDed as glossy buckthorn)
https://www.inaturalist.org/observations/762205

Thank you!

Sorry, I wasn’t very clear with my first question. Is the number of disputed ID observations returned by the search a lot lower than you were expecting to see within Pennsylvania?

I think you are saying yes – I just wanted to make sure it wasn’t a result of having inadvertently limited the search to just that state.

Yes, typing “&identifications=most_disagree” yields far fewer results than a query that would find every time someone disagreed with an ID. It’s also not just PA:
E.g. I did the same search for California plants at the phylum level.
32K results without “&identifications=most_disagree”, 50 with that extra search term.

If I go to class, it is 70K without, 118 with.

We’re talking maybe two orders of magnitude difference between expected and yielded.

I have briefly looked for a pattern in the 50 results that were returned and I can’t find anything obvious. The dates range from 5 years ago to as recent as a month, so it’s not like just the older observations are getting returned.

I was also looking for a tool like this. I played around with it and I agree with @matthias55 - adding most_disagree to the search returns far fewer results than expected. (Some_disagree, by the way, doesn’t seem to narrow it down at all.)

I was looking at Asteraceae at family level within Australia - the initial search has 26 pages of results: https://www.inaturalist.org/observations/identify?iconic_taxa=Plantae&order_by=updated_at&order=asc&lrank=family&taxon_id=47604&place_id=6744

But with most_disagree you get only 2 results: https://www.inaturalist.org/observations/identify?iconic_taxa=Plantae&order_by=updated_at&order=asc&lrank=family&identifications=most_disagree&taxon_id=47604&place_id=6744

These are the most_disagree results:
https://www.inaturalist.org/observations/115108 - this one appears as Asteraceae because that was the original observer’s opinion; someone else disagrees, but they’ve opted out of community ID.
https://www.inaturalist.org/observations/17210269 - this one had the same observer mistakenly add an incorrect ID and then changed their mind, so not really a “disagreement”. (looks like their new ID was still wrong, but that’s beside the point)

I guess this search returns observations where there’s one ID of Asteraceae and another ID outside that family? What I’d really like to find is observations where more than 1 user agrees that it’s within that family (or class, order, whatever) but they disagree on the lower level ID.

Ideally, that search would return results like this:
https://www.inaturalist.org/observations/22131024
…where refining the ID would often be pretty straightforward.

Is there any way to do this currently? I think it would be quite useful!
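For anyone who has already pulled the IDs for a set of observations, the check I have in mind could be sketched like this. It assumes each ID is available as a lineage, i.e. a list of taxon ID numbers from the root down to the identified taxon; that data shape, the function names, and the small taxon numbers in the example are all made up for illustration.

```python
from itertools import combinations


def diverges(lineage_a, lineage_b):
    """True when neither lineage is a prefix of the other (the IDs branch apart)."""
    shared = min(len(lineage_a), len(lineage_b))
    return lineage_a[:shared] != lineage_b[:shared]


def disagrees_below(lineages, taxon_id):
    """True when at least two IDs fall inside `taxon_id` but branch apart below it."""
    inside = [lineage for lineage in lineages if taxon_id in lineage]
    if len(inside) < 2:
        return False
    return any(diverges(a, b) for a, b in combinations(inside, 2))


# Hypothetical taxon numbers: 1 = the family, 10 and 11 = two genera in it,
# 100 = a species of genus 10, 2 = some other family.
print(disagrees_below([[1, 10, 100], [1, 11]], 1))  # two genera in the same family
print(disagrees_below([[1, 10], [1, 10, 100]], 1))  # one ID just refines the other
```

A refinement (genus, then a species in that genus) is not counted as a disagreement here, and an explicit "it is not that, but I don't know what it is" ID would need separate handling.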

From the staff regarding the some/most disagree parameters:

The only way to get results like that that I’m aware of is to specify one of the IDs, e.g. Asteraceae observations where at least one ID is Cirsium vulgare: https://www.inaturalist.org/observations/identify?taxon_id=47604&ident_taxon_id=52989&rank=family

But that’s just a partial solution.

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.

i made something that may help to address the problem described in the feature request: https://jumear.github.io/stirpy/lab?path=iNat_obs_with_disagreeing_ids.ipynb.

it’s a notebook that contains Python code, in a framework that should allow you to run and adapt the code from your browser (without having to set up a Python environment on your machine). hopefully there are enough comments in the script that you’ll be able to understand what it’s doing and be able to run it without too much effort. if anyone has any questions, please don’t hesitate to ask.

Ancestor Disagreement within the CID algorithm ^^

Pre-Maverick project filtered for Pennsylvania still has 5K obs - then you can filter to taxon or finer location (plants just over 1K)
https://www.inaturalist.org/observations/identify?project_id=156949&place_id=42

And about 1K which have already been resolved from the pre-maverick filter https://www.inaturalist.org/observations/identify?place_id=42&project_id=178331

Disagreement held at Plantae for Pennsylvania only 200
https://www.inaturalist.org/observations/identify?per_page=10&lrank=kingdom&taxon_id=47126&ident_taxon_id=56327%2C311249%2C311313%2C50863%2C311312%2C64615%2C57774%2C211194&place_id=42

Made a github issue here: https://github.com/inaturalist/inaturalist/issues/4188

if you guys are doing new development for this, can a new disagreeing_id accept a value latest, which would return a subset of yes where the latest id in the identification chain is the disagreement?

i think this could help to find cases that are identified incorrectly even if they are still research grade.

I can’t wait!

You can use various permutations of three search filters: &taxon_id=, &lrank=, and &ident_taxon_id=.

For example, this search should give you all observations at Angiospermae with ID disagreement:

https://www.inaturalist.org/observations?place_id=any&taxon_id=47125&lrank=subphylum&ident_taxon_id=47124,47163

Here &taxon_id is the taxon ID # for Angiospermae, &lrank is the taxonomic rank of Angiospermae (i.e. subphylum), and &ident_taxon_id is a list of the taxon ID #s of all taxa exactly one rank lower than your search taxon (in this case, just Magnoliopsida and Liliopsida).

This works (well, I think it works) because &ident_taxon_id=, which the search url wiki says filters for “observations to which a particular ID has been added (even if that is not the consensus ID),” actually filters for that particular ID taxon and its descendants. Plus, it can take multiple (comma separated) taxa!
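For those who prefer to generate these links rather than type them, here is a minimal sketch of the same three-filter search. The function name is made up; the parameters and taxon ID #s are the ones from the Angiospermae example above.

```python
from urllib.parse import urlencode

EXPLORE_URL = "https://www.inaturalist.org/observations"


def disagreement_search_url(taxon_id, lrank, ident_taxon_ids):
    """Build a search for observations sitting at `taxon_id` (lowest rank `lrank`)
    that carry at least one ID in any of `ident_taxon_ids` or their descendants."""
    params = {
        "place_id": "any",
        "taxon_id": taxon_id,
        "lrank": lrank,
        "ident_taxon_id": ",".join(str(t) for t in ident_taxon_ids),
    }
    # safe="," keeps the commas readable instead of encoding them as %2C
    return EXPLORE_URL + "?" + urlencode(params, safe=",")


# Angiospermae (47125) with the two taxa one rank below it
print(disagreement_search_url(47125, "subphylum", [47124, 47163]))
```

This should print the same link as the Angiospermae example, so swapping in another taxon, rank, and list of lower taxa is just a matter of changing the three arguments.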

If you wanted all Tracheophyta with ID disagreements (another taxon things get stuck at), you’d put that taxon ID # in &taxon_id=, phylum in &lrank=, and the seven classes under Tracheophyta into &ident_taxon_id=. Same process for Magnoliopsida (dicots), except you’d have to put in the taxon ID #s of all 53 orders under it to get all Magnoliopsida with ID disagreement (I’m sure it’s pretty trivial for those with some coding knowledge to get a comma-separated string of that, but that is not me).
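As a rough sketch of how that comma-separated string could be produced: the code below assumes the public iNaturalist API has a `/v1/taxa` endpoint that accepts `taxon_id` (an ancestor), `rank`, and `per_page`, and returns JSON with a `results` list whose items have an `id`. I have not verified it against every taxon, and the function names are made up.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_TAXA = "https://api.inaturalist.org/v1/taxa"  # assumed endpoint


def taxa_query_url(ancestor_id, rank, per_page=200):
    """URL asking for all taxa of `rank` under `ancestor_id` (assumed parameters)."""
    query = {"taxon_id": ancestor_id, "rank": rank, "per_page": per_page}
    return API_TAXA + "?" + urlencode(query)


def ids_from_response(payload):
    """Join the `id` of every returned taxon into a comma-separated string."""
    return ",".join(str(taxon["id"]) for taxon in payload["results"])


def fetch_ident_taxon_ids(ancestor_id, rank):
    """Download one page of results and return the string for &ident_taxon_id=."""
    with urlopen(taxa_query_url(ancestor_id, rank)) as response:
        return ids_from_response(json.load(response))


# Example (needs network access): orders under Magnoliopsida (47124)
# print(fetch_ident_taxon_ids(47124, "order"))
```

This reads only a single page of results, which is enough for a few dozen orders but would need paging for larger lists.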

If you don’t want to see all records with every type of ID disagreement in a particular high-level taxon, you can just put the taxon ID # of a taxon you’re proficient in IDing into &ident_taxon_id=, set your lowest rank filter one rank higher than that taxon, and that’s it! It should pick up observations all the way up to ‘State of matter Life’ in which at least one IDer has suggested an ID of your taxon (or one of its descendants). This (hopefully) shows all disagreements (and only records with disagreements) because if there were no disagreement, it wouldn’t be at a higher taxonomic level than what you’ve put into &ident_taxon_id=.

Here, for example, are all(?) verifiable records with disagreement (above genus) where at least one person has suggested the genus Brassica, or one of its species: https://www.inaturalist.org/observations?place_id=any&ident_taxon_id=53113&lrank=subtribe

Note that I’m not 100% certain this returns all records with disagreements at a particular taxon, since I haven’t figured out how to return records at a particular taxon with no disagreement (e.g. all records at Angiospermae that are as such not because of disagreement, but because no one has refined it further). If that number plus the number of records at Angiospermae with ID disagreement returned from the above method equals the total number of records at Angiospermae, then this method does indeed return all such records.