As I said, we agree that the kind of analysis you describe is desirable. However, I was reacting to the sentence you’ve repeated here: “Before proposing a fix, you need to quantify the problem.” You’re not really in a position to impose demands on other people, and I consider this particular demand to be obviously unreasonable.
You realize that the average commenter doesn’t have the same resources available to them that the iNaturalist staff does, right?
I myself didn’t say it was; I said it may take a long time (which it at least can, depending on sample size, etc.). I also said that any level of quantification (to the extent possible) would be informative. Replying to your comments about the experiment: there was also a similar forum topic that people commented on in the last 1-2 years. I’m unsure whether it was the same 2017 experiment or a different one. As I remember people saying in that later discussion, there are many complexities in interpreting the results literally. In short, I (and I think many others) expect the true accuracy may be at least slightly lower than those quantified estimates. Plus, those estimates assume experts are always correct, which can’t be assumed if you want a precise figure. Anyway, I wasn’t against the suggestion of further quantifying.
I think I’ll drop out of this part of the discussion after this, but since both you and I apparently enjoy the literature on cognitive biases, I have a suggestion that has been helpful to me and might be helpful to you as well. Put some effort into watching various cognitive biases play out in your own mind. How often you notice these biases in yourself is a kind of scorecard for how well you understand the phenomena.
How’s that a question? Everyone who has done some IDing can see that most new users jump on each new ID, and there’s a string of their agreements from e.g. order to species. It also happens with quite a few big/old users, and many new users leave the site, so they never learn what is what, and some don’t care. It is a frequent problem, and the OP wanted a solution that works only in those cases where the observer agrees with an ID, not for all IDs.
you can tell if the observer made an identification that agreed with the previous observation taxon, not that they clicked on the Agree button to make that identification.
it’s harder to determine exactly when an observation became research grade. you could infer it based on various assumptions, but i don’t think it’s really necessary to incorporate whether the observation actually became research grade or not. i think for the purposes of this kind of discussion, it’s enough to assume that an agreeing identification by the observer would push the observation closer to RG, if not to RG.
#2 could be harder to do, depending on how you approach it. you could simply take the numbers from iNat’s latest Computer Vision accuracy study and just say that it’s likely that a significant portion of the time, IDs are correct. or you could try to look for how often a disagreeing identification is made after an observer’s agreeing identification (assuming the full identification history is not destroyed by folks deleting their identifications rather than withdrawing them).
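for what it’s worth, both checks described above can be sketched from an observation’s identification history. a minimal sketch in Python — the `user_id`/`taxon_id`/`identifications` field names follow the shape of iNaturalist API records, but both helper functions are hypothetical, and (as noted above) this can only show that the observer agreed in effect, not that they clicked the Agree button, and deleted identifications would be invisible to it:

```python
# Hypothetical helpers: given an observation with a chronological list of
# identifications (each a dict with "user_id" and "taxon_id"), check
# (1) whether the observer agreed with a taxon someone else suggested, and
# (2) whether that agreement was later contradicted by another identifier.

def observer_agreed(observation):
    """True if the observer added an ID matching a taxon that someone
    else had already suggested on this observation."""
    observer = observation["user_id"]
    seen_taxa = set()  # taxa suggested by other identifiers so far
    for ident in observation["identifications"]:
        if ident["user_id"] == observer:
            if ident["taxon_id"] in seen_taxa:
                return True
        else:
            seen_taxa.add(ident["taxon_id"])
    return False

def disagreement_after_observer_agreement(observation):
    """True if, after the observer's agreeing ID, someone else added an
    ID of a different taxon (a rough proxy for 'the agreement was later
    challenged')."""
    observer = observation["user_id"]
    agreed_taxon = None
    seen_taxa = set()
    for ident in observation["identifications"]:
        if ident["user_id"] == observer and ident["taxon_id"] in seen_taxa:
            agreed_taxon = ident["taxon_id"]
        elif ident["user_id"] != observer:
            if agreed_taxon is not None and ident["taxon_id"] != agreed_taxon:
                return True
            seen_taxa.add(ident["taxon_id"])
    return False

# Illustrative record (made-up IDs): another user suggests a taxon, the
# observer agrees, then a third user disagrees.
obs = {
    "user_id": 1,
    "identifications": [
        {"user_id": 2, "taxon_id": 47219},  # someone suggests a taxon
        {"user_id": 1, "taxon_id": 47219},  # observer agrees
        {"user_id": 3, "taxon_id": 52775},  # later disagreement
    ],
}
```

run over a large sample of observations, counting how often each function returns True would give a first-pass estimate of the rates being debated here.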
i’m always surprised that folks assume that staff should be responsible for this kind of data collection, or that they would even bother with this data collection just because folks are talking about it on the forum. i can’t speak for how staff think about these kinds of things, but the way i thought about this thread is:
the solution being debated is a change to the community ID algorithm. that’s a major change to the system, and that’s a dealbreaker right off the bat, unless someone has made a really strong case for change. has anyone made an actual case for why the benefit of this kind of change is big enough to be worth doing (especially considering all the other things that could be done)? no? well, if it’s not enough of a priority for someone to attempt to make the case, then why are we even discussing this?
in my mind, it’s not clear why it matters that observations occasionally reach research grade with the wrong ID. i think the assumption with this kind of community ID approach is that these will be discovered and corrected over time. and as others have noted, if you’re really going to use the data for research, it’s the responsibility of the researcher to either review the underlying data themselves for accuracy or to otherwise correct for / factor in potential errors in the data.
i think you did your best to help folks get to the right approach, assuming the end goal is to spur action / change. but sometimes, i think threads like this aren’t really intended to reach any specific action in the end. so if folks just want to talk for the sake of talking, then so be it.
I agree that sometimes people just want to talk and get feedback, which is fine.
I also agree with @murphyslab in the sense that formally making a decision on this issue would require more analysis to proceed. I have no idea what the scale of the possible ‘problem’ is. In my personal experience it seems to be small. But that is only my perception. I have also voiced my opinion on this issue on several posts (see my comment above).
In principle, yes. Without privileged access to the back end, though, I don’t know how you would get all of the identifications associated with a particular observation other than by pulling up the observation on a web browser and manually entering the data you see on the screen. Do you have a way of doing this?
of course. iNaturalist’s API is relatively good at providing a lot of useful information. just make the necessary GET /v1/observations requests using whatever tool or programming language you like. the staff would probably do the exact same thing to get this kind of data.
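for anyone who wants to try it, here’s a minimal standard-library sketch of that kind of request. the endpoint is the public one at api.inaturalist.org; the specific query parameters and taxon ID shown (47126 for Plantae) are illustrative, and you should check the API docs for current parameter names and rate limits:

```python
# Minimal sketch of pulling observations (with their identification
# histories) from the iNaturalist API, using only the standard library.
import json
import urllib.parse
import urllib.request

API = "https://api.inaturalist.org/v1/observations"

def build_url(**params):
    """Build a GET /v1/observations URL for the given query parameters."""
    return API + "?" + urllib.parse.urlencode(params)

def fetch_page(**params):
    """Fetch one page of observation search results as a list of dicts.
    Each result should include its 'identifications' list, so no
    per-observation scraping is needed."""
    with urllib.request.urlopen(build_url(**params)) as resp:
        return json.load(resp)["results"]

# Example (requires network access):
#   for obs in fetch_page(taxon_id=47126, quality_grade="research", per_page=30):
#       print(obs["id"], len(obs.get("identifications", [])))
```

paging through results with the `page` parameter (and sleeping between requests to be polite to the server) would let you build up the sample discussed above.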
Thanks, I guess I hadn’t gotten that far the last time I was trying to figure out anything with the iNaturalist API. I’m certainly glad I’m not trying to actually do this analysis through the API, though. With access to a pile of related tables this would be easy…
Indeed! And, if the Withdraw function even existed in the iOS app.
When I exclusively used the iOS app, I had people commenting to me to use the Withdraw. At first, I thought they were asking me to withdraw from using iNat at all, and I was a bit disconcerted. Someone finally cleared it up, saying I could only withdraw on the website. Huh? There’s a website with more features than the phone app?
No horse in this race whatsoever, but as someone who spends time identifying, I do see instances where it appears a user who couldn’t have come to that conclusion independently simply agrees with someone else’s ID.
My reason for posting here, though, is more around the discussion of quantifying the occurrence of this issue. I would argue this would be very difficult if not impossible to do, given you really can’t be certain when it actually happens. For example, I am relatively new to botany, and I will change my initial ID (say genus or family) and provide a species-level identification that agrees with the previous user’s identification once I am able to independently come to that conclusion. This would be virtually indistinguishable from an instance where a user is just clicking Agree without any knowledge of the actual identification. All said and done, the inaccuracy of quantifying the issue may be higher than the rate of occurrence of the issue itself.
An additional consideration is: how would requiring 3 IDs to reach RG affect situations where an observation’s community taxon is stuck at a coarse rank due to a relatively evenly matched number of opposing incorrect and correct IDs? Would it hold things up even more from becoming RG for the correct ID (by requiring more correct IDs), speed it up, or change nothing? If it would hold things up more, that would be another drawback. I recently IDed many wasp observations stuck at coarse ranks (order, suborder, infraorder, superfamily), and was surprised by another often-described problem: how held up they get despite having multiple correct IDs. That said, tweaking the ID algorithm (depending on the specifics of a given change) can potentially create more problems than it solves. In other words, there will be certain problems with each alternative, so we should consider the best algorithm (whether the present one or a modified one) to be the one which is overall better, resulting in fewer problems (though not none).
Maybe I missed it but I don’t remember anyone assuming staff should do that. I thought it was mostly murphyslab making the suggestion and it was for the topic’s author.
In general, I do predict that researchers (iNat users or external researchers) will at some future time continue to test various aspects of iNat accuracy/stats, although not necessarily this specific question. For example, at least one (and I think multiple) experiments/tests have already been done, and staff have also calculated some related stats and have a natural interest in others doing them. No one’s obligated to do them, but it is very helpful if anyone does. Doing experiments has the potential to improve the research quality of iNat and may be persuasive to some external scientists who mistakenly assume iNat data is entirely unreliable.
YES! I completely agree with this. I’ve come across some observations where 4-5 people have blindly followed a clearly mis-identified post by the OP or by 2nd parties. I think many novice posters are just trying to support their friends, but this presents a real problem for what is definable as “research grade.”
A bit aggro with the tone here, dude. It’s not a cognitive bias, I made no claims about how widespread the issue was, just that it was a thing that I have observed over a long period of time being an identifier and user of iNat data.
Sure, if I was making a feature request doing a formal analysis of the scale of the problem may be an appropriate first step, but for starting a general conversation about what other people think about this issue I don’t think that’s a requisite first step. And chiding me for not doing a quantitative analysis first is ridiculous. You can just say you don’t think it’s a big problem and roll on by.
That is frustrating, but this proposal would do nothing to change 3+ wrong IDs. All we can do is disagree and maybe recruit other users, if there are any familiar with the species in question. In your case, you can add that you are a published authority on Castilleja and explain why they are wrong about ‘x’.
Kind of arrogant calling for a massive change in the functionality of a process with zero evidence.
The thing is that you raised two issues, not just one. And the 2nd issue was framed as contingent on the first. And you offered zero evidence for the first. If it were just the 1st issue (“Hey I’m seeing a trend here”), I wouldn’t criticize that. If it were the 2nd issue, separately, I would disagree given the evidence available. But you chose to make that flimsy argument, Paul. And I merely pointed it out.
Yes, I almost always do just that (add a clarifying comment). Because at least 95% of my 40,000+ identifications are of Castilleja species and almost all the rest are of closely related genera, I can provide at least a somewhat numerically-informed comment, if not a “rigorously quantified” one. I do run across this issue on a regular basis, perhaps 1-2 per 100 observations. I’d also observe that these cases occur with much greater frequency (ca. 9 out of 10) with species that have closely-related species with similar morphologies that occur in the same area.

A perfect example is Castilleja exserta and C. densiflora, which are similar in color, growth form, and phenology when not examined closely, and they occur in the same areas and even in the same general location (e.g. Edgewood Park, in the south Bay area). These two species are frequently (ca. 1 out of 4) mixed up, with errors in both directions. Fortunately, unless the photos are really poor quality, a zoom-in allows for definitive ID, when the best characters are looked for. Some posters have responded to me that they just accepted the machine-suggested ID, and then others agreed, for whatever reason.

I’d also note that there are other drivers of this problem that pop up only occasionally, such as people who were on the same field trip and were told by a leader or a “checklist” of the site that the plant they found was species x, and then they post their observations and agree with each other.
Anyway, as egordan88 observed, a 3-agree rule would help but likely not solve the problem.
I’m not sure I think 3 minimum identifications is a good strategy, but I did want to point out that this proposed change would reduce identifier time in some areas.
In my experience, finding and correcting incorrect RG observations is much more time consuming on a per observation basis than confirming/correcting ‘Needs IDs’. So every observation that would be saved from incorrectly reaching RG by upping the bar might potentially save someone time in the long run.
I do think it would ultimately increase effort required overall, just wanted to point out a potential plus that wasn’t just related to better data quality.