“from the groups we’ve looked at” is what it says.
I’m not seeing that quote. I’ll just copy Scott’s post over so that it’s here for easier reference.
We’ve been doing a lot of analyses of the proportion of incorrectly ID’d Research Grade obs. From the experiments we’ve done, it’s actually pretty low, like around 2.5% for most groups we’ve looked at.
You could argue that this is too high (i.e. we’re being too liberal with the ‘Research Grade’ threshold) or too low (i.e. we’re being too conservative), and we’ve had different asks to move the threshold one way or the other, so I imagine changing it would be kind of a zero-sum game.
One thing we have noticed from our experiments, though, is that with our current Research Grade system (which is quite simplistic) we could do a better job of sorting high-risk (i.e. potentially incorrectly ID’d) and low-risk (i.e. likely correctly ID’d) observations into the Research and Needs ID categories. As you can see from the figures on the left below, there’s some overlap between high risk and Research Grade, and between low risk and Needs ID. We’ve been exploring more sophisticated systems that do a better job of discriminating these (figures on the right).
We (by which I mean Grant Van Horn, who was also heavily involved in our Computer Vision model) actually just presented one approach at this conference a few weeks ago: http://cvpr2018.thecvf.com/ It’s a kind of ‘earned reputation’ approach where we simultaneously estimate the ‘skill’ of identifiers and the risk of observations.
You can read the paper, ‘Lean Multiclass Crowdsourcing’, here:
Still more work to be done, but it’s appealing to us that a more sophisticated approach like this could improve the sorting of high-risk and low-risk obs into the Needs ID and Research Grade categories, rather than just moving the threshold in a more or less conservative direction without really improving things.
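To make the “earned reputation” idea concrete, here is a toy sketch of a skill-weighted consensus. This is NOT the actual Lean Multiclass Crowdsourcing model from the paper, just a simplified illustration of the core loop Scott describes: alternate between (a) choosing each observation’s label by skill-weighted vote and (b) re-estimating each identifier’s skill from their agreement with the current consensus. All names and the 0.75 prior are made-up assumptions for illustration.

```python
# Toy sketch of an "earned reputation" consensus, loosely inspired by the
# approach described above (NOT the paper's actual algorithm): identifier
# skill and observation labels are estimated jointly by iterating.

from collections import defaultdict

def weighted_consensus(votes, n_iters=5):
    """votes: list of (observation_id, identifier_id, taxon) tuples."""
    skill = defaultdict(lambda: 0.75)  # hypothetical prior: 75% accurate
    labels = {}
    for _ in range(n_iters):
        # Step 1: pick each observation's label by skill-weighted vote.
        for obs in {v[0] for v in votes}:
            tally = defaultdict(float)
            for o, ident, taxon in votes:
                if o == obs:
                    tally[taxon] += skill[ident]
            labels[obs] = max(tally, key=tally.get)
        # Step 2: re-estimate each identifier's skill as their agreement
        # rate with the current consensus (smoothed to avoid 0/1 extremes).
        agree = defaultdict(lambda: [1, 2])  # [agreements + 1, total + 2]
        for o, ident, taxon in votes:
            agree[ident][1] += 1
            if labels[o] == taxon:
                agree[ident][0] += 1
        for ident, (a, t) in agree.items():
            skill[ident] = a / t
    return labels, dict(skill)
```

An identifier who consistently agrees with the eventual consensus earns a higher weight, so their future votes count for more; disagreements on one observation are resolved in favour of the more reliable identifier rather than by a raw head count.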
Thank you @bouteloua, I mis-typed it, sorry. Here is what it says:
This is becoming tangential to the topic of renaming “Research Grade” as it is more about actual data accuracy but my point was for us to not try and make generalizations out of disparate datasets.
Yeah, I was responding to “just conjecture” since they’ve definitely looked into levels of accuracy.
To loop the discussion back around, Scott mentioned potentially weighting IDs differently and/or that the >2/3 agreement threshold may not be the same standard to reach “research grade” (i.e. remove from “needs ID”) in different risk/accuracy scenarios. So community/majority consensus or even “community” at all may be irrelevant.
If, say 5 years down the road, the computer vision IDs a certain species correctly 99.9% of the time, could an observation IDed by CV be “research grade” without a 2nd confirming human ID? Why put those in the default “Needs ID” pool at all? :)
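The hypothetical rule in the two posts above can be sketched as a simple gate: an observation leaves “Needs ID” either via the current >2/3 human-agreement path, or, speculatively, via a very high computer-vision confidence. Every threshold here is an invented illustration, not iNaturalist’s actual logic, and the function name is made up.

```python
# Hypothetical promotion rule sketched from the discussion above.
# Thresholds are illustrative only, not iNaturalist's real behaviour.

def reaches_research_grade(agreeing, total, cv_confidence=0.0,
                           agreement_threshold=2 / 3, cv_threshold=0.999):
    # Current-style path: at least two IDs and more than 2/3 agreement.
    if total >= 2 and agreeing / total > agreement_threshold:
        return True
    # Speculative path: computer vision alone, if it is near-certain.
    return cv_confidence >= cv_threshold
```

Under this sketch, a lone CV suggestion at 99.95% confidence would skip the “Needs ID” pool entirely, while two humans splitting 2-vs-1 with a dissenter would not.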
More like an immense proportion. What prompted me to join here is that I noticed ~80–90% of RG IDs in the taxa I work with on here were wrong.
It might be helpful for you to tell us what this mystery taxon is.
The result of extrapolating the “data quality” from specific taxa to all organisms is conjecture.
Aquatic insects in general, and especially Coleoptera and Hemiptera. Many of these cannot be identified from photos, especially a single photo, and most have many very similar species.
It seems obvious that an organism that can’t be identified from photos won’t be identified via photos. Those are a small minority of the species posted on iNat. It’s been discussed before, and yes, at some point it might make sense to build a mechanism to keep these from becoming Research Grade at a taxon-specific level, but that isn’t set up yet.
I agree with this. I don’t like that I can agree with somebody who gives ID tips when I don’t know what the ID is otherwise, or that somebody can agree with me without knowing what the ID is, even if I [think I] know it, and it becomes Research Grade. One thing is that it’s used for research, but the more important thing to me is that it no longer comes up in Needs ID, and I’m afraid it won’t be found and rectified if it needs to be. There are things I think I know, and I would like to ID them in case somebody can confirm, but I also expect the ID to be duplicated by certain users, so usually I make a comment, or only add the ID if I feel it’s OK for my ID to settle it. That’s probably good etiquette anyway, but it’s still a little creepy how easy it is to get RG from a duplication. (Three people would mean at least somebody besides the poster and the first person to ID has to agree, which would feel safer.)
And in reading the forums I realize this is true and is probably very true among people who know their taxa and so I worry less about it now.
It is difficult to separate the discussion of the label from the discussion on how the label is allocated and, therefore, what it actually means. I believe that the word “Community” attached to any label suggests a broad consensus. Assuming a change to the label will be made before a change to the way the status is derived, it would still be assigned when only two people agree, one of whom can be the observer, which does not represent anywhere near a community consensus.
For newbies looking for a Like or <3 or Thank You option - clicking agree tips their obs to Research Grade … which is not a good way for iNat to function.
And some downvote the Not Wild, because they WANT an ID, thank you.
Which makes for a layer of confusion, and skews the distribution maps.
I propose “Community Reviewed”.
Similar to peer reviewed. It does not imply that the ID is correct, only that the members of the community think it’s correct. All of the above options suggest a level of quality that is not necessarily present. “Community Reviewed” is not a quality assessment but a descriptive one. It can imply higher quality, but is flexible enough to acknowledge that there are times when the review fails to produce the correct ID.
I like this. I’ve seen a couple endorsements in other places as well in the past couple days.
This conversation seems to have been ongoing for a long time and there seems to be a general consensus that it would be of benefit to change it somehow. Are there any feature requests stemming from this or development behind the scenes?
Also @JeremyHussell - the existing poll 2 doesn’t show the currently correct options.
I would also vote for community consensus, if I could, out of those two.
However…if this hasn’t been dynamic across the time period, then it seems somewhat meaningless in any case.
It also doesn’t appear to show some of the ideas within the thread, e.g. @ellen5 's …which is one of the best IMHO.
I think, given that the number of IDs per observation is massively dependent on geography and taxa, something unnamed, just stating the metric or fraction of agreed IDs, would make the most sense. Or a connected colour spectrum, e.g. from red to green.
I also wonder why a broader range of these levels couldn’t go to GBIF, with the connected colour or metric. Aiming to achieve RG is connected, for me, to the desire to become an external datapoint. I’d be less inclined to see it as an either/or goalpost if a broader spectrum of ID quality were being passed through.
Also, with regard to
This comment is internal to iNat. I think the spectrum should be visible outside of iNat.
Finally, I think the terms are currently quite lengthy, which can be off-putting, but a spectrum of colours, numbers, or acronyms would be simpler, and would force those external to iNaturalist to investigate what they mean in order to understand them. This has an acceptable precedent for me: the use of Creative Commons acronyms, for example.
The issue here is that, whether it is implicit or explicit, there will always be some kind of status, regardless of what you call it or how it is attained, because at some point you have to decide that a record no longer needs to be considered in the Needs ID pool, when to share the data with external partners, etc.
sure - I get that. I just mean recognising the spectrum of the quality of the data beyond the “needs ID” point