Why are two agreeing IDs (and no disagreeing IDs) needed to achieve Research Grade?

I do not interpret the term Research Grade as meaning the CID constituted research. Rather, I interpret it as “the observation has passed one filtering step that leads to its inclusion in the dataset used by researchers.” Researchers ought to also review the data quality before its use. I appreciate that it is easy to find and address inaccuracies on iNaturalist due to the CID system.

7 Likes

Bryce,

I was just looking at your profile, and I have to say, I appreciate you writing a thorough bio!

And from one prairie person to another, I love that you’re repping the prairie crocus, both as your profile picture, and as your banner!

The first prairie crocus that I see, around Easter, is one of my favourite moments of the year in Calgary.

2 Likes

One big problem is that not all species require the extra level of attention, and arguably most of the species that get observed are large, common, and easily identifiable. You don’t really need multiple reviews of House Sparrows and Mallards.

On the other hand, you have very rare species that are not commonly reported, and the number of observations on iNat is low enough that the current identifiers can easily take care of them, so requiring more IDs on these isn’t really necessary. See Bombus ashtoni.

While there are certainly species that maybe should have more review before going to GBIF, how do you determine which ones? We’ve currently got half a million species, some of which should have more reviewers and some of which don’t need them.

To repeat what others have said though, it doesn’t matter if something is Research Grade or not. Every single observation is up for review all the time. If you see one that’s wrong, Disagree with the ID.

6 Likes

I think the thought process is that requiring 3 IDs would leave an ungodly number of observations in “Needs ID” perpetually because there aren’t enough identifiers for many taxa to make 3 IDs a reasonable goal. Even as things are now, for the taxa that I identify, there are tens of thousands of 2+ year old observations that are very identifiable and yet haven’t received any ID attention since they were posted. Doubling the number of additional identifiers (beyond the observer’s initial ID) needed to pull things out of “Needs ID” would be a horrible mistake.
As for the meaning of “research grade”, my understanding is that it just means these are the observations a researcher should prioritize looking at, as they’ve had multiple eyes on them already. No researcher would ever uncritically accept these IDs, any more than they uncritically accept the IDs of museum specimens, GBIF records, etc. Every dataset has errors; the “RG iNat observations” dataset has a smaller proportion of errors than the “All iNat observations” dataset, and so is often preferable to focus on for research purposes. On the other hand, as someone doing identifications, the dataset of “Needs ID iNat observations” is where it’s most efficient to focus my attention, because that’s where I’ll find most of the errors.
So I’d push back against the assertion that RG means “nothing”. It certainly doesn’t mean “100% accurate”, but no one is under the impression that it does, so that seems like a moot point.

6 Likes

No, just 2.

2 Likes

Sorry, when I say ID I mean anything added not by the observer.

2 Likes

It’s obvious to everyone here on the Forum that RG means that the observation has been checked at least once but it doesn’t mean that the identification is correct. However, there seem to be a lot of people, including researchers or potential researchers, who think it means the observation is correctly identified. So, pedantic person that I am, I disagree with your statement. I agree, however, with the thought that (I assume) is behind it.

It seems this is another case where the human tendency to simplify gets in the way. Either RG observations are terrible, useless, trash, or they are all correct. Sigh.

My experience and the results of the recent tests of ID accuracy suggest that the majority of RG observations are correct. However, the rate of correct IDs varies by species. It’s 95% or more for many (most?) species (both common, easily ID’d species and species so little known that only really experienced people apply the name). For some other species, the rate of correct ID is dismal. Responsible researchers should check the IDs of the observations they want to use, at least enough to know whether the error rate is low (acceptable) or high. Sadly, it never occurs to some of these people that they need to check, that Research Grade doesn’t mean “all ready to suck into your analysis.”
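
For anyone wondering what "check enough to know whether the error rate is low or high" could look like in practice, here is a rough sketch. It assumes you have manually reviewed a random sample of RG observations for one species and counted how many were wrong. The function name and the sample numbers are made up for illustration, and the Wilson score interval is just one common way to put bounds on a proportion, not something iNat itself provides.

```python
import math

def wilson_interval(errors, n, z=1.96):
    """Approximate 95% Wilson score interval for the share of wrong IDs.

    errors: number of reviewed observations found to be misidentified
    n:      number of observations reviewed (a random sample is assumed)
    """
    if n <= 0:
        raise ValueError("need at least one reviewed observation")
    p = errors / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

# Hypothetical review: 3 misidentifications found in a sample of 60
low, high = wilson_interval(3, 60)
print(f"plausible error rate: {low:.1%} to {high:.1%}")
```

If the upper bound is still small, the species is probably fine to use as-is; if the interval is wide or high, that is a sign to review more observations before analysis.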

8 Likes

Yes - sorry - I see iNat actually calls it - Community Taxon.

join the rest of us (from that woman who whines!)

Please reconsider your Ancestor Disagreement to convince the CID (and now prompted to correct this whinge to Community Taxon!) algorithm.
Your ID is helping to hold this obs back at …
https://www.inaturalist.org/posts/25514-clarifying-ancestor-disagreements
iNat’s own explanation. With 5 years of comments.

2 Likes

As others have argued, increasing the number of IDs required for a RG observation is probably not a solution. Messing with the DQA is probably not an effective solution either. To find errors, I’m afraid we (identifiers) are stuck with reviewing RG observations. Anyone who’s done this knows it can be an extremely tedious process.

Everyone would be better served if we switched our focus to the leading ID (which usually comes from the observer). Observers that lead with an overspecified species-level ID are the main source of the problem. We all do it—some more than others—partially because the system over-incentivizes RG observations. We need counter-incentives that balance out the system.

Suppose the observer is unsure. Using the algorithm as a tool, the observer leads with a genus-level ID. Now suppose two identifiers (not the observer) eventually follow with two species-level IDs. This pattern of behavior avoids the problem altogether. Yes, it requires the input of two identifiers but only in those situations where two IDs are actually required for a correct RG observation.
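
To make that scenario concrete, here is a toy model of how a community ID could be worked out. It is a simplification under stated assumptions, not iNat's actual code: a taxon needs at least two IDs and more than two-thirds agreement, a coarser ID on the same lineage is not counted as a disagreement (no explicit ancestor disagreement), and the taxon names are placeholders.

```python
from fractions import Fraction

# Each ID is a lineage tuple from coarse to fine, e.g. ("Genus",) for a
# genus-level ID or ("Genus", "Genus speciesA") for a species-level ID.

def community_taxon(ids, threshold=Fraction(2, 3), min_ids=2):
    """Return the finest lineage that passes the threshold, or None."""
    candidates = {l[:d] for l in ids for d in range(1, len(l) + 1)}
    best = None
    for cand in candidates:
        # IDs at this taxon or anywhere below it count as agreeing.
        agree = sum(1 for l in ids if l[:len(cand)] == cand)
        # IDs on a different branch count as disagreeing; coarser IDs on
        # the same lineage are ignored (assumed non-disagreeing).
        disagree = sum(
            1 for l in ids
            if l[:len(cand)] != cand and cand[:len(l)] != l
        )
        if agree >= min_ids and Fraction(agree, agree + disagree) > threshold:
            if best is None or len(cand) > len(best):
                best = cand
    return best

# Observer leads with genus, two identifiers follow with the same species.
cautious = [("Genus",), ("Genus", "Genus speciesA"), ("Genus", "Genus speciesA")]
print(community_taxon(cautious))

# Observer guesses species A, one identifier says species B instead.
guessed = [("Genus", "Genus speciesA"), ("Genus", "Genus speciesB")]
print(community_taxon(guessed))
```

In this model the cautious pattern lands on the species with two independent identifiers behind it, while the guessed-then-disputed pattern falls back to the genus until more people weigh in.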

So how do we incentivize observers to be more judicious in terms of guessing a leading ID and agreeing with the leading ID of someone else?

4 Likes

I guess two possibilities could be (1) stop the CV from suggesting species-level IDs, either altogether or when an observation is first uploaded, so people would need more specific knowledge to give a precise ID; and (2) pop up a dialogue box when the observer wants to agree with a new/different ID on their observation, something along the lines of ‘you should never blindly agree with any ID - do you actually have any reason for thinking this is correct?’

The first sounds very annoying for those (like me) who upload plenty of things they’re familiar with but use the CV options whenever possible to save typing (and possible typos!). At the same time, I’d love the option of genus-level and up IDs only in animals, which I typically know little or nothing about.

As for the second, it could never stop blind agreement, but maybe something along those lines might discourage it?

[edit: I see that someone made a feature request similar to my second suggestion a few hours ago. Believe it or not, I didn’t see that until after I wrote this!]

3 Likes

The new Next shows the taxonomy levels?
Then we could choose the level that fits our knowledge - without first having to go to the taxon tab, to find the levels, to choose the level.
If there are already 2 IDs I can use the CID popup to show the levels.
But newbie identifiers have to know, or care, to find the level that fits their knowledge.

I mean when people are initially uploading, whereas you sound like you’re talking about subsequent identifying? I’d love to be able to easily see/choose taxonomy levels when uploading - is that only the new app, or the website too?

Sorry, not buying that excuse. Communication is a basic skill required for interacting with other people. Sometimes, inevitably, we will do so poorly or our efforts will be misunderstood. But the occasional miscommunication or the additional few seconds required to write a comment instead of clicking a button is no reason not to use our words. iNat is not an impersonal collection of data that needs to be sorted out as quickly and efficiently as possible. We are interacting with other people with feelings and opinions.

Except that this button does not communicate anything. It does not tell users why you checked it – all people will see is that for some reason the observation is not RG in spite of the fact that it has two or more non-conflicting species IDs.

If the reason you checked “ID can be improved” is because you are skeptical about the previous IDs, the box does not fulfill the intended purpose – namely, giving the people who entered the IDs a chance to defend or reconsider their IDs, because they will get no notification whatsoever. They will not even know that the observation is no longer RG. The only way they are likely to learn that something has happened (not why it has happened) is if other people subsequently start adding IDs (which depending on the taxon and how recent the observation is, may take a long time) and they notice at that point that the observation is no longer RG.

It is a “waste of time” if the IDers would not be looking at the observation if it were already RG (some IDers do prioritize “needs ID” over “RG”, particularly for taxa that are usually not problematic). It also means that an observation which has multiple confirming IDs may be prevented from being shared with other databases for years simply because someone has checked a box keeping it from becoming RG.

3 Likes

But if the reason that box was ticked was because the person felt significant uncertainty of the ID’s correctness, that may be a good thing, preventing wrong data from being shared. I agree that there are probably better ways to handle it, but would I prefer wrong data to no data? No.

2 Likes

Sorry, but I’ve wasted too much [of my] time over the years doing the explaining-communicating-teaching stuff --sometimes at great length, even considered writing blog posts lol-- all to no avail (observers gone, observers unresponsive, observers offended, etc.).
Gave up on playing the talkative nice guy doing all the guesswork (what language does the observer speak? will they even notice a question or comment in their app?). Drastically reduced my identifying too (largely due to broken notif & leaderboard systems & too much tricky diplomacy to enact & understandable reluctance from iNat to ‘fix’ things).

Hence my personal choices as an identifier (not as a social network user), in order to retain some peace of mind in the face of iNat’s shortcomings and iNatters’ behaviour. :) Didn’t realize the ‘sociability’ part of iNat was such an important prerequisite to some. Sorry too for that, as an asocial creature I may apparently have broken some site rules and/or cultural guidelines I didn’t know of – have not been flagged for that (yet!). If my use of the DQA in order to trigger more serious review of dubious ‘RG obs after 2 agreeing IDs’ (while sparing me fruitless attempts at interacting with observers) irks anyone - feel free to report it.

If the greater concern is about ‘databaseability’ of iNat stuff … IMHO the more something gets eyeballed the better it gets, and if it takes 5 years until some expert reverts my DQA then so be it; and IMHO again, better have dubious data kept out of databases for too many years, than having poor data entering databases in a snap (as is already the case in my area, largely due to pl@ntnet + iNat’s RG-with-2-IDs!).
Let’s agree to disagree, I imagine.

2 Likes

Gerald must be pretty good then! ;-)

4 Likes

I don’t use the app so am parroting what I read on the Forum threads.
I would like to know what improvements we can hope for on the website.
So far - vanishing placeholders will be dead - so that is an improvement (you can use Notes or a comment)

The “Yes, it can be improved” button is valuable at times, but because of its drastic and long-lasting effects*, I think it should be used rarely.

I use it only if there are several IDs for species A and I know it’s not species A.

* The observation marked “Yes, it can be improved” remains “Needs ID” even after enough people have corrected their ID that it would normally go to RG. Since few people understand what’s wrong, the observation often languishes for years.
2 Likes

Well, thank goodness for that. It was “vanishing placeholders” that first brought me to the forum. I still don’t know what they are (or were), but I was guilty of committing one and it’s good to know that I will not inadvertently do so again.

1 Like

They still are on the website.
We have caught 29K
https://www.inaturalist.org/projects/placeholder-backup
if anyone would like to pick a location to help ID.
For those “is this for the beetle or the flower?” obs, the answer might be (or might have been, thanks, iNat) in the placeholder: “Beetle please?”

1 Like