Should an observation require 3 IDs to reach "research grade" when the observer is just agreeing with a suggestion?

#1 can be done programmatically with 3 caveats:

  • you can tell whether the observer made an identification that agreed with the previous observation taxon, but not whether they clicked the Agree button to make that identification.
  • it’s harder to determine exactly when an observation became research grade. you could infer it based on various assumptions, but i don’t think it’s really necessary to incorporate whether the observation actually became research grade or not. i think for the purposes of this kind of discussion, it’s enough to assume that an agreeing identification by the observer would push the observation closer to RG, if not to RG.
  • this would be done by taking a random sample of the observations
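
as a rough sketch of #1, assuming you’ve already pulled observations into simple records: the field names here (`observer_id`, `identifications`, `user_id`, `taxon_id`, `previous_observation_taxon_id`) are assumptions for illustration, so check them against whatever export or API response you actually use. it only detects that the observer’s ID matched the taxon the observation already had, not that they clicked Agree.

```python
import random


def observer_agreed(obs):
    """True if the observer added an ID matching the taxon the observation already had."""
    for ident in obs["identifications"]:
        prev = ident.get("previous_observation_taxon_id")
        if (
            ident["user_id"] == obs["observer_id"]
            and prev is not None  # no previous taxon means nothing to agree with
            and ident["taxon_id"] == prev
        ):
            return True
    return False


def agreement_rate(observations, sample_size, seed=0):
    """Share of a random sample in which the observer made an agreeing ID."""
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    sample = rng.sample(observations, min(sample_size, len(observations)))
    if not sample:
        return 0.0
    return sum(observer_agreed(o) for o in sample) / len(sample)
```

you’d call something like `agreement_rate(records, 1000)` on your own list of records. it says nothing about whether the observation actually reached research grade, per the second caveat.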

#2 could be harder to do, depending on how you approach it. you could simply take the numbers from iNat’s latest Computer Vision accuracy study and just say that IDs are likely correct a significant portion of the time. or you could try to look for how often a disagreeing identification is made after an observer’s agreeing identification (assuming the full identification history is not destroyed by folks deleting their identifications rather than withdrawing them).
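
the second approach could look something like this. it’s only a sketch under assumptions: each record’s `identifications` list is taken to be in chronological order, and the field names (`observer_id`, `user_id`, `taxon_id`, `previous_observation_taxon_id`, `disagreement`) are hypothetical stand-ins for whatever your data actually contains. deleted identifications simply won’t be in the list, so the result is only as good as the surviving history.

```python
def first_observer_agreement(obs):
    """Index of the observer's first agreeing ID, or None if there isn't one."""
    for i, ident in enumerate(obs["identifications"]):
        prev = ident.get("previous_observation_taxon_id")
        if (
            ident["user_id"] == obs["observer_id"]
            and prev is not None
            and ident["taxon_id"] == prev
        ):
            return i
    return None


def later_disagreement_rate(observations):
    """Among observations with an observer agreement, the share later disputed by someone else."""
    agreed = 0
    disputed = 0
    for obs in observations:
        idx = first_observer_agreement(obs)
        if idx is None:
            continue  # observer never agreed, so not part of this measure
        agreed += 1
        later = obs["identifications"][idx + 1:]
        if any(
            i["user_id"] != obs["observer_id"] and i.get("disagreement")
            for i in later
        ):
            disputed += 1
    # None signals that there was nothing to measure
    return disputed / agreed if agreed else None
```

a low rate here would only suggest that agreed-upon IDs are rarely challenged, which is not quite the same thing as being correct.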

i’m always surprised that folks assume that staff should be responsible for this kind of data collection, or that they would even bother with this data collection just because folks are talking about it on the forum. i can’t speak for how staff think about these kinds of things, but the way i thought about this thread is:

the solution being debated is a change to the community ID algorithm. that’s a major change to the system, and that’s a dealbreaker right off the bat, unless someone has made a really strong case for change. has anyone made an actual case for why the benefit of this kind of change is big enough to be worth doing (especially considering all the other things that could be done)? no? well, if it’s not enough of a priority for someone to attempt to make the case, then why are we even discussing this?

it’s not clear to me why it matters that observations occasionally reach research grade with the wrong ID. i think the assumption with this kind of community ID approach is that these will be discovered and corrected over time. and as others have noted, if you’re really going to use the data for research, it’s the responsibility of the researcher to either review the underlying data themselves for accuracy or to otherwise correct for / factor in potential errors in the data.

i think you did your best to help folks get to the right approach, assuming the end goal is to spur action / change, but, sometimes, i think threads like this aren’t really intending to reach any specific action in the end. so if folks want to just talk to talk, then so be it.
