Rampant guessing of IDs

That explains a lot! I really hope it all gets improved; I don’t know why it hasn’t been for so long.

3 Likes

60-80% accurate means 20-40% inaccurate, which is significant. I didn’t say “most.” But for the taxa I’m involved with, cicadas and crayfish, it’s much worse. For crayfish, there are about 400 species in North America, a large percentage of which are observed on iNaturalist, but only about 10 or 25 (depending on new or old criteria for CV training) have enough research-grade observations to be included in the CV, which means the CV is going to be wrong for roughly 97% or 94% of crayfish species (390 or 375 of the 400). And often people seem to disagree with the top CV suggestions and pick one that they think looks more like their specimen, usually making the guess even less accurate. For cicadas, there isn’t the diversity seen in crayfish, but many specimens are difficult or impossible to identify even by an expert, especially tenerals and nymphs. The large amount of guessing of these IDs just places more burden on identifiers to clean them up.

8 Likes

I can sympathize, @dan_johnson, having spent a lot of effort following up with Magicicada observations to explain why tenerals, nymphs, and exuviae can’t/shouldn’t be identified to species. Sometimes this takes more than one interaction, and a lot of time. The CV still often goes right to Magicicada septendecim, adding to the problem.

IMO, the initial guesses aren’t the biggest problem - they can often help experts find the observations more quickly to add a correct ID. The thing that really gets me is when two or three top identifiers/experts have already added a correct, research-grade ID, and then a new and unknowledgeable user comes along and adds an incorrect ID, requiring another follow-up. I sometimes wish there was a way to “seal” an observation after it’s been correctly identified by several experts, but that has the potential to be abused and could open a potentially disastrous can of worms.

Adding to the problem is that many app users don’t know how to read comments that are added to observations (or that those comments even exist), so they may never see attempts to communicate. Maybe if those comments appeared on the main screen for each observation, communication would be more effective?

3 Likes

Hi @sbushes,

In software development, when a feature request or a bug report is rejected as a duplicate, that means “we agree this is worth doing, but we’re already tracking it somewhere else, and we don’t want to track this problem in two places.” The word rejection sounds a little confrontational and dismissive, but it isn’t meant that way.

I’ve already got a large list of bugs and features to do for the next release (many of which are android feature parity issues), but I’ll keep your feedback in mind when we come up with a list of todos for upcoming iOS releases.

Thanks,
alex

11 Likes

I have found this to be a problem sometimes. Too many inexperienced people make a stab in the dark, treating iNat like a social media site such as Facebook, sometimes downgrading the ID. IDs should be made with cogitation and are better left to knowledgeable people; wild guesses belong in a chat group.
I have tried corresponding with one such person without success and will probably have to delete the observation and upload it as new to get rid of the downgrade to ‘Life’!

1 Like

Well, let’s not blame the users when the onboarding and ID training is left to happenstance and discovery. I don’t know about Android, but iOS users have a limited feature set and only a limited orientation video to learn from… unless they just happen to discover the website and the FAQ (there’s no obvious link to the web app from the iOS app).

5 Likes

so i’ve got some bad news about scientists…

20 Likes

I disagree, I think you can, and I think that you can do it both through the design and features of the website and app, and through the culture and practice of commenting.

You yourself propose a solution in terms of tutorials.

I think there could be more done, though. One thing I would really like to see is some sort of metric that measures the accuracy of a user’s IDs. However, I would want it to reflect not the initial guess but whether the person is actually responsive to the contributions of other users.

If a user repeatedly makes incorrect IDs and then never updates them, they are contributing to lower-quality data on the site. As such, I think it would make sense to track these things and identify the users who do this. Then, if it is deemed that they have either abandoned the site (i.e. a user who made a lot of IDs early on and then never comes back to even check messages) or that they are continuing to use the site/app but are not updating or engaging with these discussions, it would make sense to disable the weight of their votes on IDs. Doing so would improve the integrity of the data.

When I look through obviously misidentified observations, a huge portion of them are from these types of users who don’t respond to the IDs, and you can see on their profiles that many of them aren’t active any more.

There is no way to force people to change their behavior, and it’s inevitable that some users, especially more casual ones, will abandon an app or site. So it makes sense for us to update or refine our policies to protect the integrity of the data, by taking this into account and using algorithms to measure and identify these sorts of things.

I also think that if we were to do something like this, it would actually provide a strong, intrinsic motivation for people to respond to disagreements about IDs. If people had an accuracy percentage or something displayed on their profile, or even if it were just displayed privately to them, it could help people strive to become better at IDs.

And if we designed the calculation the right way, it could be done in such a way that it did not in any way penalize guessing or restraint, i.e. it would only kick in if a person was not actually resolving disagreement, and a person could always rectify the situation by either withdrawing their ID or moving it up to an agreed-upon taxon.
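To make the shape of that calculation concrete, here is a minimal sketch in Python, assuming a hypothetical per-ID record with just two flags; none of these names come from iNaturalist’s actual data model:

```python
from dataclasses import dataclass

@dataclass
class IdentificationRecord:
    disputed: bool  # another user added a conflicting ID
    resolved: bool  # the identifier later withdrew or moved to the agreed-upon taxon

def responsiveness_score(records: list[IdentificationRecord]) -> float:
    """Fraction of disputed IDs the user eventually resolved.

    Undisputed IDs are ignored entirely, so guessing (or restraint) is
    never penalized; only leaving a disagreement unaddressed lowers the
    score, and withdrawing or updating an ID restores it.
    """
    disputed = [r for r in records if r.disputed]
    if not disputed:
        return 1.0  # nothing was ever disputed, so nothing to penalize
    return sum(r.resolved for r in disputed) / len(disputed)
```

The key design choice is that undisputed IDs never enter the denominator, so the score can only be hurt by ignoring disagreement, never by making honest attempts.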

2 Likes

So, how do you disagree and in the end say the same thing? Right now we can’t do that; it’s impossible, though maybe some day in the future. I doubt any onboarding will be enough for all users, and some people will refuse to change anyway, but we all here support new tutorials and are waiting for them.

1 Like

this sounds super complex and like it would use a lot of server power to do, hmm.

1 Like

That’s why they call it REsearch, and not just search, right?

3 Likes

I don’t think it would take much in the way of server power. There are some computationally simple ways I can imagine doing this.

When it comes to databases, counting things is usually easy, and in the few cases where it isn’t, you can cache the count as a field; adding a few integer fields usually adds a negligible burden to a database.

Even a measure as simple as the proportion of times a user has replied to disputed IDs on their observations in a specific time frame would probably provide a pretty good estimate, one that I think would achieve much of what I want here and would address the issues the OP raised.

The way I see it, there are users whose IDs are disputed, but they are simply not replying to these disputes.

Another simple way to look at it would be to have a flag set when there is a disagreement, and then see if the user ever looks at the observation or the discussion with the dispute. I would imagine that an overwhelming majority of the problematic user accounts are ones where the user does not even look at the discussion, and that by counting these records we would probably be able to identify the bulk of these accounts.

All that said, I would not want to rule out the possibility of much more complex yet computationally non-intensive solutions. You can do some really sophisticated things with low computational loads if you are clever: the right index on a database table, or maybe a script that runs every once in a while, performs a more complex computation, and then stores the value, perhaps as something as simple as a boolean somewhere.
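For what it’s worth, here is a rough sketch of that flag-plus-batch-job idea; the table and column names (disputed_ids, viewed_dispute, users.unresponsive) are invented for illustration and are not the real iNaturalist schema:

```python
import sqlite3

def mark_unresponsive_users(conn: sqlite3.Connection, threshold: int = 5) -> None:
    """Set a cached boolean on users with many unviewed disputes.

    The expensive counting happens only when this runs (say, nightly);
    everything else just reads the cached flag.
    """
    conn.execute(
        """
        UPDATE users
        SET unresponsive = 1
        WHERE id IN (
            SELECT user_id
            FROM disputed_ids
            WHERE viewed_dispute = 0
            GROUP BY user_id
            HAVING COUNT(*) >= ?
        )
        """,
        (threshold,),
    )
    conn.commit()
```

Running something like this on a schedule keeps the per-request cost at a single boolean read.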

4 Likes

Unfortunately, the iNat leadership itself takes this perspective. They’ve been quite clear that their first priority is always engagement (i.e., getting people to post about interacting with nature) and the integrity of the data (i.e., cleaning up wrong / duplicate / useless IDs or observations) will always be second fiddle to that. It’s a root issue behind countless threads like this. Given this, I’m frankly surprised GBIF accepts their data and NSF helps fund it. At its core, iNat aspires to be a social media site, not a virtual museum. If I try to curate my favorite taxon like I’m in the latter, I’ll go nuts; it’s only worth my time if I recognize that interacting here is fundamentally just playtime or practice. I think making peace with that hard truth is one reason why many experts I know tire of the Sisyphean endeavor of curating their expertise here.

5 Likes

The only ‘solution’ I see to this is something the site is not going to implement: adding a third category of identification. Thus having (and just to be clear, such a change is not something I support implementing):

  • observer’s identification
  • uneducated amateurs’ identification
  • validated experts’ identification

The reality is, if there is a desire to have biodiversity documented with the assistance of citizen scientists, field naturalists, or ‘morons with cameras’, whatever you choose to call them, then there is the simple fact that no one knows how to accurately and conclusively identify everything they encounter.

I have 30k-plus records and just yesterday misidentified a fly as a wasp, it being something new to me that I was unfamiliar with. I am sure to many users, including in this thread, that’s just further evidence of why I should not be allowed to do IDs or even to participate on the site, but I don’t see that as somewhere the site is going.

Otherwise, you need a way for the folks with the knowledge to find the records to apply that knowledge to. Correcting misidentifications, while annoying, is certainly manyfold more efficient than scanning through roughly a million submissions a week, or going on 90 million archived ones, to find the niche you are interested in.

5 Likes

while i personally don’t fully love the ‘connecting people with nature’ narrative for iNat, i think it’s important to be fair and clarify: the goal is not to get people to post about connecting with nature; the goal is connecting people with nature. I find these two to be very different.

Here’s what has been most helpful for me to navigate: iNat is like a giant, freeform, open-source field notebook. I am able to look at only what I have added, and, well, i am certainly not perfect, but i have a decent understanding of my own limitations. I can also choose to look only at a select list of people’s observations. I have two projects for this. Also, i can dive into the firehose of everyone’s observations, knowing the limitations of the data.

I find all three of these really useful, but in different ways. While I still personally lean more towards the importance of the data than the iNat admins, i’ve actually softened my stance a bit because the site has been really successful and I have still found ways to use it for a whole lot of things, including ‘real’ ecological monitoring (real meaning i am being paid to do it, have a masters degree, and work in the system), and also a bunch more. That isn’t to say it can’t be better, but i think it’s incorrect to say the site is solely being managed as clickbait.

11 Likes

I know this has been mulled over and fought over for literally years, but the TL;DR:

-if you choose ‘validation’ based on external metrics like who has a PhD in what, you cut out some of the best de facto experts and include a bunch of people with no real connection to or understanding of the community. Anyway, we already have a system that uses this: it’s mainstream science. It already exists, though the data isn’t centralized very well, if at all.

-if you try to validate based on internal metrics, you create a huuuuuge algorithm that takes tons of processing time that isn’t available (people have already rightfully pointed out that the site can be slow) and a bunch of staff time that isn’t available. And people will still fight over it.

it’s a field notebook… just keep telling yourself that :)

7 Likes

to me, it seems like trying to curate taxa as an individual would eventually become a Sisyphean task anyway, just as a result of the growth of observations over time.

i think that overwhelming growth in observations – not that others might come in and override existing community IDs – is really the reason it’s a mistake to try to curate as you might a museum collection.

to me, it seems like the paradigm shift that’s really needed is to prioritize helping others develop the skills to be able to help identify. that way, the burden of watching over a taxon can be distributed among many.

this doesn’t strike me as a huge problem. in my experience, most such identifications are by the observer, and if an observer wants to stick with an obviously wrong ID, that’s their own right. it would be no different than them opting out of community ID.

that said, i’m not against trying to teach people how to identify better, but i think it would be a mistake to overload folks with too many rules to learn if they’re not ready for it. also, i think a “hey, i appreciate all the IDs you’ve been doing. have you seen these pro tips for identifying?” approach would be better than a “you’ve been messing up a lot of IDs. please learn these rules before continuing to ID” approach.

one last thought… for folks who are really sticklers about “correct” identification, i think the way i would create a technical solution for that is to create a new kind of “curation” project, where an ID added by any project member would cause the observation to be added to the project if that taxon was set up in the project criteria. for each observation in the project, there would be a project taxon ID determined by only the IDs from members of the project, with additional tools to show when the project taxon ID <> community ID. i think this kind of mechanism would help teams that were working on curating taxa in a distributed way.
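here’s a toy sketch of how that project taxon ID could be computed and compared; all the names below are made up, and real community IDs aren’t a simple plurality vote:

```python
from collections import Counter

def project_taxon_id(ids_by_user: dict[str, str], members: set[str]) -> str | None:
    """Most common taxon among project members' IDs, or None if no
    member has weighed in yet."""
    member_ids = [taxon for user, taxon in ids_by_user.items() if user in members]
    if not member_ids:
        return None
    return Counter(member_ids).most_common(1)[0][0]

# usage: flag observations where the curating members and the
# wider community disagree
ids = {
    "alice": "Magicicada septendecim",
    "bob": "Magicicada cassini",
    "carol": "Magicicada cassini",
}
community_id = project_taxon_id(ids, members=set(ids))  # everyone votes
project_id = project_taxon_id(ids, members={"alice"})   # members only
if project_id is not None and project_id != community_id:
    print(f"project taxon ID ({project_id}) <> community ID ({community_id})")
```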

11 Likes

People who would respond to a measurement of their identification accuracy would, by definition, be people who work on the site repeatedly. I would suggest that most people who make bad ID’s and don’t get better are people who don’t use iNaturalist long enough to benefit from any measurement of their ID accuracy. Many of them are “duress” users, people who use iNaturalist because they have to for classes.

8 Likes

If anything, it sounds like a recipe for more blind agreeing with IDs on records that are already done, just to improve your score. It seems like it would promote the exact opposite of the intention.

This system won’t work anyway, IMO; those users that come and go won’t care about their “scores”, and if the apps stay the same as now, they probably won’t even see that there is a score. (And how would we score all of this?)