Overzealous Identification

Just out of curiosity, how do you message them?

I’ve used both comments within observations and direct messages. I usually try direct messaging first. Occasionally that works and turns into a productive conversation (sources, hints & tips, etc.), but often I never get a response, and the person just continues to do the same things.

I don’t do either very much now, as I’d rather just fix what I can. Usually, with my correction and tagging an expert or two, it gets back to at least “Needs ID”, if not the correct RG species ID.

As for the leaderboards, I don’t usually rely on them - at least not without checking the person’s bio. Also, the longer you use the site, the more you build up your own list of experts for certain taxa and/or locations. However, as much as I can just ignore them, I don’t see the point in having a feature that is at best worthless and at worst counterproductive.

3 Likes

I agree that smaller changes to the system may add up to make a difference. I also like the proposed “Thank You” button; such a feature was recently activated on the English Wikipedia, with great success.

In addition, what about providing users with a list of “Own IDs needing review”, listing all observations where the current community ID differs from the ID suggested by the user? Such a list, functioning as a kind of “to do” list, would encourage the user to correct themselves (or to discuss their reasoning) in order to keep the number of entries small. It would also make it much easier to revisit observations where necessary. As a result, the overzealous identifier would learn from their mistakes, which is all we want.
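
For what it’s worth, something close to that list can already be approximated with the public API, since an identification that disagrees with the community taxon is categorised as a “maverick”. Below is a minimal sketch in Python; the endpoint and parameter names reflect the v1 API as I understand it, so treat the exact fields as assumptions and adjust if they differ.

```python
# Sketch: approximate an "Own IDs needing review" list via the iNaturalist API.
# Assumption: /v1/identifications supports filtering by user_login, current,
# and category, and "maverick" means "disagrees with the community taxon".
import requests

API = "https://api.inaturalist.org/v1/identifications"

def ids_needing_review(user_login, per_page=50):
    """Return (observation URL, your taxon, observation's current taxon) for
    your active identifications that conflict with the community taxon."""
    params = {
        "user_login": user_login,  # whose identifications to fetch
        "current": "true",         # skip withdrawn/replaced identifications
        "category": "maverick",    # IDs that disagree with the community taxon
        "per_page": per_page,
    }
    results = requests.get(API, params=params).json().get("results", [])
    review_list = []
    for ident in results:
        obs = ident.get("observation") or {}
        review_list.append((
            f"https://www.inaturalist.org/observations/{obs.get('id')}",
            (ident.get("taxon") or {}).get("name"),
            (obs.get("taxon") or {}).get("name"),  # observation's display taxon
        ))
    return review_list

if __name__ == "__main__":
    # "your_login_here" is a placeholder; substitute your own iNat login.
    for url, mine, current in ids_needing_review("your_login_here"):
        print(f"{url}: you said {mine}, observation is now {current}")
```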

Another idea: the “Based on the evidence, can the Community Taxon still be confirmed or improved?” option appears to be underused, and was quite obscure to me for a long time (I did not understand that it would move an observation to “Research Grade” in certain situations). This could be made more accessible. What about setting observations to “Research Grade” automatically after at least two or three IDs point to the same broader taxon, even when not at species level? This might work against the common impression that identifications above species level are worthless.
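
Just to make the second idea concrete, here is a toy sketch of the proposed rule (purely illustrative Python; the threshold is an assumption taken from the “two or three IDs” wording above, and this is not how iNaturalist currently behaves):

```python
# Toy sketch of the proposal: grant "Research Grade" once enough active IDs
# agree on the same taxon, even when that taxon is above species level.
# Simplification: only exact taxon matches are counted; a real implementation
# would also walk taxonomic ancestry, as the community ID algorithm does.
from collections import Counter

AGREEMENT_THRESHOLD = 3  # "at least two or three IDs" from the proposal above

def proposed_grade(identification_taxa):
    """identification_taxa: list of taxon names, one per active identification."""
    if not identification_taxa:
        return "Needs ID"
    _, count = Counter(identification_taxa).most_common(1)[0]
    return "Research Grade" if count >= AGREEMENT_THRESHOLD else "Needs ID"

# Three identifiers agree at genus level only -> Research Grade under the proposal.
print(proposed_grade(["Bombus", "Bombus", "Bombus"]))  # Research Grade
print(proposed_grade(["Bombus", "Apis"]))              # Needs ID
```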

8 Likes

I have never understood what this feature does or how to interpret it, or what its consequences are. Where is that explained? Thanks!

2 Likes

Not sure where it is explained, but if the Community ID is at genus level or finer, it becomes “Research Grade”; at coarser levels, it becomes “Casual”. As with all of the DQA votes, no notification is sent to the OP, so a comment about it is courteous.

1 Like

My understanding is that it is for when you are sure that the observation cannot be identified any finer than its current level based on the information included in the observation.

E.g. 1: there are some flies that can only be ID’d to species visually by their genitalia (and then, sometimes only in the males), and that genitalia cannot be viewed without a microscope. In the absence of those close shots, a genus-level ID might be the best it could get, so the answer to the question “Based on the evidence, can the Community Taxon still be confirmed or improved?” could be no.

E.g. 2: I’ve heard there are some animals (cicadas, if memory serves, but it could have been grasshoppers/frogs/birds too) that can only be distinguished to species by their call. If the observation does not include audio, then once it has settled at a higher-level taxon, the answer to “Based on the evidence, can the Community Taxon still be confirmed or improved?” could be no.

E.g. 3: I think there are some organisms that, I’ve heard, can only be positively ID’d to species by DNA (some fungi, I think). Minus that evidence, they may never get to species, so the answer to the question “Based on the evidence, can the Community Taxon still be confirmed or improved?” could be no.

As Tony Robledo has opined (and I kind of agree), it’s not best used when you think the pic or audio is “too poor” to get an ID, because you never know what an expert may be able to discern from even the blurriest/noisiest photo or most garbled audio recording. I’ve seen iNat Identifiers give some amazing IDs (with explanations!) for features they noticed in some very low quality evidence.

10 Likes

As comments above have already stated, I believe the iNat “Identify” tool is a great way to learn, and I use it for that purpose often (if you take a look at my identifications, you’ll see a ton of IDs of birds from the South Pacific). I believe a lot of “overzealous” identifiers (myself included) do it for the purpose of learning! However, doing that can still affect the data if you don’t do research. When I’m trying to learn using the “Identify” tool, I spend a couple of hours beforehand studying the birds/butterflies that I’ll be IDing, and while I’m IDing I always have two or three guides open to make sure my IDs are good. Now, if everyone did this, that would be great, but I have seen many times people just going through and clicking “Agree” without putting much thought into their IDs. I’m not sure how to correct this, but I think just reminding them that their IDs contribute to science, that iNat has a very high standard for data, and that IDs like theirs can mess up the system would go a long way.

6 Likes

Thanks, big help!

2 Likes

How about a way to flag an observation as needing further review, so that when we personally do not have expertise in that area, we can still indicate to others that there may be a problem?

1 Like

You could check “yes” for “Based on the evidence, can the Community Taxon still be confirmed or improved?” in the DQA, which would mark it as “Needs ID”, then add comments explaining your concerns about the ID. I tested it on one of my observations with four agreeing species-level IDs and one at Order level, and it switched to “Needs ID”.

Not sure how many would agree that it is an appropriate use of the flag, but I feel it would be applicable.

6 Likes

I think that is exactly what it is for. It is a vote situation too, so others can “overturn” the flag if they feel it has got to a stable enough ID, and vice versa: if you have flagged it as “as good as it can be”, they can vote the other way and say it still needs attention, with the majority of votes winning out.

2 Likes

I don’t know the answer, but am also frustrated by inexperienced users blindly agreeing with the first ID they get, especially new users in a group. For that reason, if I’m not totally certain of an ID, I usually give only the genus (or whatever the next taxon up is) in the ID and add a comment saying, maybe this is G. whateveri or maybe it’s an Asclepias.

I do sometimes run across things that are completely misidentified, and I slap an ID on them that’s as low as I can honestly go. iNat asks a question about whether you’re just not sure about the ID or are sure it’s wrong. If you pick a taxon too far away from the current ID, the wording of the choices doesn’t quite fit the situation because it looks too far down, so I pick the one that means “I know this is wrong.”

I like the “best guess” approach. I would find it useful.

I am curious about error rates, and know there has been some work attempting to quantify them. I think it’s probably safe to assume that overzealous identifiers, almost by definition, will have higher error rates than other identifiers, but I see that as one end of a spectrum of identifier quality, and maybe it’s worth considering how to encourage a good balance between identifying more things, and maintaining a reasonably high level of quality/accuracy in the identifications.

I think @charlie has mentioned that for plants he isn’t convinced the error rate on iNaturalist is much higher than that for herbarium collections (forgive me if I’m misremembering, and please correct me if so), and there’s no way of knowing with data that is recorded without vouchers (in my personal experience, I’ve seen enough to suggest that accuracy of plant data is going to be highly dependent on the observer).

Personally, I know that I feel kind of bad about making mistakes in identifications, and sometimes feel like maybe I should be more conservative, to avoid mistakes like that. However, based on this measure at least, my overall error rate is pretty low, and I do use the corrections to help calibrate which things I should feel reasonably confident in (acknowledging that my confidence is based in large part on knowing what species are expected to occur in the area where I focus).

When I do learn about a species I wasn’t aware of previously, I become more conservative unless/until I feel confident I know how to distinguish it from others that occur in the region I look at. As an example, when @markegger identified some observations as C. chrymactis, a species I had not previously been aware of, I began limiting my identifications to genus for observations in the (relatively limited) area where that species occurs, since I didn’t know how to differentiate it from the other species expected in the region.

In contrast to my approach, a friend of mine only wants to identify something based on seeing all the key characters, even if there’s not really anything else expected from the region that is likely to fit what’s shown in the observation.

While I would certainly acknowledge that the quality of identifications is likely to be higher in that case, it’s unclear whether that (possibly small) gain in accuracy is worth the cost of many more observations going unidentified (when most of those identifications would have been correct).

I suppose there will always be a tension between putting names on things (or not), and views will differ depending on the person and how they may want to use the observations/identifications.

5 Likes

That’s a compromise I could get behind.

But this discussion does raise the question: what counts as a new user? Is it based on the age of the account, or on activity in each clade? When I started, I was IDing birds, non-rodent mammals, and a few common reptiles. The past couple of years, I’ve been delving into plants and insects. Oftentimes, I can only ID to family for insects, and to Dicots, Monocots, or a non-angiosperm class for plants (a lot of people in my area upload without an ID, so I give it something so it shows up in the filtered Identify searches people do). I’ve mostly known what I’m doing with the verts, but have had to do corrective sweeps of my own IDs multiple times in insects and plants when someone who’s an actual expert told me about a similar species that wasn’t showing up on any of the area taxon lists. If there had been a weightless period, I think my contributions would have benefited from having a fresh one in each major clade until I’d had some practice and a chance to encounter those experts. Though the number of contributions needed would probably have to be lowered for clades that don’t get many observations overall.

3 Likes

Short answer… any new user who is unfamiliar with how the site and community operate, i.e. those who think the “Agree” button is simply a way to accept the given ID, and anyone who doesn’t grasp how CID works.

In the context of overzealous identification, it is not so clear. That normally involves more active users, and notwithstanding the lack of response to questions, they are actually a good thing!

The probation periods and weightless IDs etc are ideas to mitigate the problematic IDing from users that join for school projects or bioblitzes, where they are either duress users or very short term users, and don’t respond to questions or ID challenges outside of the short project/period they joined for. This is not a huge problem, because we can usually tip the CID with weight from tagging in other active users to help with confirming IDs. Typically, these problematic IDs come in pairs, one making the errant ID, and a classmate or other well-meaning iNatter "Agree"ing to it in the Needs ID pages. The advantage of the weightless probationary IDs is that they would still appear, but only two other active IDers would be needed to confirm them to RG, vs the 3 that are needed to overturn a single errant ID (or 5 others if an errant pair). Of course, if either of the errant IDers does respond and/or change their ID, then there is no problem. It is only the absentee IDers that create the problem here.
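
For anyone who wants to check those numbers: assuming the community taxon needs a strict greater-than-two-thirds share of the identifications (and ignoring the ancestor/descendant subtleties of the real CID algorithm), the counts above fall out of a small calculation like this:

```python
# Sketch of the arithmetic behind "3 IDs to overturn one errant ID, or 5 for a
# pair", assuming a strict > 2/3 share is required for the community taxon.
from fractions import Fraction

def ids_needed_to_overturn(errant_ids):
    """Smallest n such that n / (n + errant_ids) > 2/3."""
    n = 1
    while Fraction(n, n + errant_ids) <= Fraction(2, 3):
        n += 1
    return n

for errant in (1, 2):
    print(f"{errant} errant ID(s): {ids_needed_to_overturn(errant)} correct IDs needed")
# 1 errant ID(s): 3 correct IDs needed
# 2 errant ID(s): 5 correct IDs needed
```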

For me, the key to whether it would be effective as a solution is whether the probationary period can be waived in situations where we “recruit” expert identifiers. Any such system should deter those new skills as little as possible.

The other matter that is important to me is how it affects new “novice/amateur” iNatters who perhaps join as a duress user, but then “catch the bug” and go on to become regular iNatters. This is of course what iNat is about, so whatever is implemented must prove to be not too great a hurdle for those new users.

Some have argued that restrictions such as the probationary period would put people off from becoming more active. I think that would be the case if they had the ability and it was taken away, but for a new iNatter starting under the probationary system, it would be a situation of gaining ability, rather than losing it. With the possibility of gaining it very quickly, I might add, should they request and/or be given the release from the probation!

5 Likes

This issue isn’t unique to iNaturalist; it’s quite common for crowdsourced projects.
I know quite a bit about it from my experience with OpenStreetMap. (Spoiler: there, nobody has really come up with a solid, simple, universal solution, for many reasons.)

There are several aspects that are important to understand in order to deal with it properly.

  • The goal of the project needs to be defined clearly, and that definition should be accepted by the majority of users. There are always those who take a practical, result-oriented view of it: they think it’s important to create a product - a database of observations (iNaturalist), a freely accessible map (OSM), a collection of freely available images (Wikimedia Commons), an open encyclopedia (Wikipedia). With such a goal, it is always possible to start by defining at least some quality standards, and to define what counts as an unwanted contribution and how harmful it is. But there are always also those who believe these projects play a completely different role - to motivate, encourage, teach, and so on. Their focus is on a mythical “blank slate newbie” who, according to their belief, can easily be shaped into almost anything, has almost no agency of their own (goals, interests), and is easily spooked by being told they did something wrong. This idea is quite unrealistic, over-generalized, and idealistic. Based on this view, no quality standard should be defined, since it would supposedly discourage valuable newbies. In reality, however, those who have already put a lot of hard work, knowledge, and time into their contributions often get strongly, and reasonably, demotivated by low-quality contributions that damage the project’s reputation.
    The latter case is usually reinforced by a “professional bubble” effect: a situation where a project was started by a relatively small group of professionals sharing similar values, so it gets no exposure to different kinds of contributors for quite a long time.
  • If there is agreement that contributions can, in fact, be deemed of insufficient quality or harmful, regulation mechanisms need to be established. For example, in OpenStreetMap there are no moderators or contributor levels (while Wikipedia has such things). However, any edit can be undone by any user. Sure, there are conflicts sometimes, but there’s a (small) Data Working Group that resolves these issues on an individual basis.
  • The source of contributions of insufficient quality can be studied to the project’s benefit. But it must not be proclaimed without an actual study, because it is easy to imagine a pattern and suggest a universal measure that, in reality, will be neither universal nor effective. For example, attributing insufficient-quality contributions to “new users” is just as incorrect as thinking that all new users can be turned into valuable, productive contributors; a new user can easily be a professional who finally decided to start contributing online. The only thing that can be done effectively, based purely on the user’s experience within a project, is to somehow highlight their contributions so they draw more attention from experienced users (or moderators, if any). Only once a pattern in the user’s activities has been established is it possible to take corrective action, if that pattern is negative. For example, in the OSM project, kids playing PokemonGO cheat the game by adding fake features to the map (the map is used in the game). Their edits get reversed and, if they persist, bans are issued by the Data Working Group.
    So the general approach there is: everyone is equal at the beginning, new users are watched with more attention, any bad contribution can be undone (deleted), and if unwanted actions continue, the user can be banned temporarily or permanently.

Speaking of understanding the reasons for unwanted contributions: it might be important to recognize that a lot of people nowadays are raised on videogames, and their need for gratification trumps almost everything. So the discovery of a “top identifiers” section automatically triggers a pursuit of the highest “score”, which is easy once you don’t care about rules or quality. It’s obviously impossible to detect such intent in advance, but it’s also not really necessary: the effect of such actions can be mitigated as long as there is a tool to reverse them and a way to prevent the user from repeating them. Sure, an attempt to reason with them can be made somewhere before a ban, to give them the benefit of the doubt.

Even though there is no really simple and universal solution, reinventing the wheel from scratch makes no sense, since this isn’t a unique issue.

14 Likes

I believe I found the user you reference. Unfortunately, I did not notice this over the summer, as I spent very little time on iNaturalist. I sent this person a long message regarding the issue, which is broader than your observations alone, and hope to resolve it productively and amicably.

5 Likes

I do a lot of identifications. I’ll pick a genus I know - or have learned to know, because I’m not an “expert” - and go through several pages of observations. My object is not to advance on the leaderboard, but to get two-year-old observations (etc.) into a database. I’m at the top of the leaderboard for Canadian Noctuids (and some species), but I don’t really care. Currently there are over 300 pages of unconfirmed identifications in this group, and they are more beneficial if they are confirmed. When confirming I do two things: I look at 95% of the observations, and I usually check them against at least two sources. I also give an explanation if I disagree, something I have noticed that many folks do not do.
I don’t know how to deal with observations that are not properly identified and then confirmed, but please don’t lump all “confirmers” together.

12 Likes

By the way, this is a slow and rather tedious process, but I like doing it.

10 Likes