Are occasional qualified IDs with slight uncertainty justified?

This has been partially discussed in past topics, but I’d like to reframe it. First, no one recommends guessing or making too many or obvious misIDs. To frame this, consider only identifiers who try to maintain an ID accuracy rate between 90 and 100% (or a lower threshold, if anyone prefers). Also assume they check all disagreeing IDs and revise whenever applicable. For those who maintain what they consider good or good-enough accuracy, the question arises of whether it’s ever justified to make slightly uncertain IDs, where they indicate their uncertainty to others by commenting “I think,” “tentative,” “uncertain,” or “hypothesis,” so that others don’t agree unless they can verify. This has particular relevance when IDing species which are already described but are new or only rarely observed so far on iNat, which often requires checking external photos, ID keys, revisions, checklists, biodiversity databases, specimens, etc. For those species, identifiers often have less than total certainty in knowing the entire location-species checklist (ID options), since even checklists are often incomplete. There’s also by definition less existing iNat reference material or existing IDs to compare to.

On the one hand, it is possible to correctly ID a majority of such obs., which enables significant progress in IDing all species of a group for a location or globally. On the other, a maximally cautious approach would allow far fewer IDs and slower progress, but would also better prevent occasional misIDs. Some specialists or experts imply that the 100% certainty approach is the one rule or an unofficial expectation for serious identifiers, and sometimes disregard the concept of near certainty and equate it with complete guessing. But many other experts actually use the “exploratory approach” to good effect. Correspondingly, some consider a small percentage of misIDs egregious (including when made by others), while others see it as acceptable or inevitable. Re: certainty, iNat seems to mostly just recommend making IDs when you think they’re correct.

Personally, do you think it’s best (even if multiple options were allowed) for all identifiers’ initial ID certainty to always be 100% (regardless of whether the IDs end up proving true)? Or is sparingly making slightly uncertain IDs also justified for those who choose to? I currently lean toward the second (used sparingly with qualifications, while mostly IDing with full certainty), although I see pros and cons for each approach. Separately, as a factual matter, does iNat’s guidance to make IDs you think are correct require 100% certainty, or would something less, like 95+% or 99+%, also be a fair interpretation? If it is allowed, it would seem ideal to find ways to inform the identifier community of this more clearly, given common beliefs.

7 Likes

I think 100% certainty is much less possible than we’d like, especially just from photos. There have been many times where I was certain an observation was one species, but upon seeing a different photo I realized I was completely wrong. Fixed angles in photos are in cahoots with our imaginations to deceive us. So I guess that means I would also take your second option.

20 Likes

100% certainty doesn’t make sense conceptually, because you can’t be 100% certain the current taxonomy is correct. In fact in some cases ‘correct’ may not even be philosophically well-defined: see, e.g. the recent discussions on the forum about dandelions. Even a full genome DNA sequence wouldn’t give you 100% certainty.

The very fact that it’s a consensus process that requires multiple votes and has no reputation system means we accept that no individual IDer is necessarily 100% reliable at reaching 100% confidence.

12 Likes

I agree that 100% is not achievable - it’s an unrealistic goal, and potentially even something of a straw man when discussing ID accuracy. It’s always good to have a healthy respect for the possibility that any ID we make might turn out to be incorrect.

However, I don’t think that means it’s also advisable to make “uncertain” IDs. I think it helps to consider the costs and benefits of different situations.

For species with lots of observations, the impact of a correct or incorrect ID is low - it just doesn’t matter much. There’s no strong reason to take a chance on being wrong.

Where I mainly see people making arguments to ID with less certainty is in cases of rarer species with fewer observations. However, while a correct ID here does have a lot of value, an incorrect ID can have an even greater negative effect. It could lead to the CV being trained incorrectly or to other users subsequently making incorrect identifications based on what they think is a correctly-IDed observation. If the incorrectly IDed observation goes to GBIF, it could cause negative consequences if used to make decisions about, for instance, issues to do with an endangered species (among others). In this situation, I think the costs generally outweigh the benefits. For almost all scientific analyses, one incorrect datapoint (i.e., observation) will have more of a negative impact than the positive effect of adding another correct one.

If someone were to do tentative guesses for identifications, I think the impact would be least if two conditions were applied:

  1. Being sure to keep up with any observations they make IDs on with less certainty. If the ID is contradicted by other users, they should remove it so they don’t prevent a correct ID (one of the costs of guessing incorrectly). Also, they should be sure to follow up with any agreeing IDs. Inexperienced users, or users who want their observations to be RG, agree all the time without the necessary expertise. I’ve made a few tentative IDs in my day, and I’ve gone cold turkey after seeing people agree to them even when I specified in comments that I wasn’t certain.

  2. Tentative IDs are less of an issue with IDs above genus level I think. Since they can’t go to RG, the costs are lower. I also think the potential benefits are higher - making an ID to family for instance can get the observation more visible to a family-level expert who may then be able to ID. That might be worth the chance of an incorrect family ID. So if there’s a case for making tentative IDs, I think it is strongest at those middle levels around family in the taxonomy.

Final point: It’s always totally ok to not ID any observation. When I go through my chosen taxa, I definitely want to add an id to all of them, but sometimes I should just hold off and be humble!

13 Likes

By 100% I simply meant that you think you’re certain/sure of an ID when making it (regardless of whether it really is correct). 100% is worth mentioning because anything less than it (e.g. 98 or 99%) is often construed as slightly uncertain. I did also carefully caveat that I’m referring to slightly uncertain (vs. “uncertain”) IDs, i.e. IDs that are actually close to being certain without being 100%. Regarding new species, that’s just a common example. I really mean any ID that’s, say, 90 to 100% certain. Bear in mind those are mostly correct, and have reason for being so. I agree with some of your general cautioning, but again, my text at the top carefully frames this question in a different, specific context that isn’t about beginner identifiers either, and said to use it only sparingly (mostly not to use it), etc.

3 Likes

I love that iNaturalist empowers all of its users to add identifications. I use “I think” as a qualifier fairly often, because I am a complete amateur, who happens to be the top identifier for some taxa. I don’t want my suggestions to be taken as gospel/on reputation alone, even if I’m 90% confident.

I think the latter is more of a concern, yet still minor. Most rare species are not close to the threshold for inclusion in the model. The problem is that rare species can be “lost” at genus level or higher if no one is ever bold enough to suggest the name. It took some dedicated searching, going county by county through all records, to find species like Penstemon metcalfei (sorry, I love that example).

4 Likes

Ideally, an identification should only be made when you are confident that the organism is taxon X. I would much rather be safely correct at genus than potentially boldly incorrect at species and then negatively impact the data outcomes. The less headache for future identifiers, the better.

6 Likes

Okay, although the questions were 1) do you personally think that should mean 100% certain, or would anything less, like 98%, also qualify?; and 2) what’s the actual stance of iNat in defining that (certain vs. slightly uncertain)? A recurring thing I seem to see in some of these comments (which I did address in the prompt), and would again correct, is that I’m not referring to a process which results in many misIDs; I said few misIDs and a high accuracy rate, and to use it only sparingly (mostly using certain IDs).

2 Likes

The other problem with this is that if the ‘90%’ guesses are right 80-90% of the time, but are instead left at a higher taxon level in order to be 100% certain, you end up with a much, much larger pile of IDs that need review, compared with the 10-15% of corrections otherwise required.

I’d rather people get it wrong some of the time, leaving corrections necessary, than be correct all the time with a lack of lower-order IDs.

7 Likes

Reasonable certainty (above 90%) is fine by me. I often state “I think…” or “Most likely…” in the comments if I am wavering slightly (quite easy to do with some NZ spiders!)

Also given iNat doesn’t have an objective to be a perfect data set, I don’t think people should worry too much (which is not to say they shouldn’t be ‘reasonably sure’ before ID-ing) because if we go too far to try and achieve perfection everyone will be too scared to contribute.

Also bear in mind that no ID is final - the community of motivated, knowledgeable people reviewing and revisiting past IDs is a natural corrective to egregiously inaccurate IDs or serial mis-identifiers, so accuracy should tend to improve over time.

16 Likes

Agreed, every ID is a hypothesis of what the organism is. If later evidence from the community overturns it, that is just the process at work.

I’d much rather add an uncertain ID than leave something stuck at genus or family, because if it’s stuck at that level, nobody will ever see it.

6 Likes

@russellclarke’s response is very nice.

Like Russell, I often say “tentatively identifying” in my identification comments to let everyone else know that they really shouldn’t agree with me unless they can prove my identification is ‘correct’ by themselves.

If someone agrees with me on a tentative ID for a taxon that doesn’t have many observations, I almost always (when I notice) tell them exactly what I said above; that they shouldn’t agree unless they can actually identify the taxon. Sometimes I will come back to an observation where someone has agreed with me and I will disagree and go back to genus if I don’t feel confident enough, and I’m not sure if they will be super receptive.

My belief is that it is an appropriate practice, in moderation, as others can always disagree with my ID (and I always ask follow-up questions), or ask me how I came to that conclusion or for a resource. The only consistent issue is with species-level IDs, as @cthawley said, but then I think about the benefits vs. detriments of making such an ID, and usually come down on the side of going forward with the identification.

I really like the way that iSpot had IDs, where with every identification, you had to input how certain you were. That way, people could add tentative identifications, officially.

Final thoughts: It definitely depends on who is making the ID. I don’t make IDs like this unless I have probable cause to believe it is said taxon. I don’t know how others think when they say “tentative.”

There is no official stance, but I expect if iNat staff are to respond on this thread it will be taken as the ‘official stance’.

10 Likes

I agree that every ID is a hypothesis - that’s a good way to put it!

However, I don’t agree that if an observation is at genus or family, no one will ever see it. To my knowledge, most identifiers search at order, family, and genus levels much of the time - those observations definitely do get seen. I also wouldn’t even say that an observation at family or genus is “stuck” there necessarily - an ID to family or genus shouldn’t be seen as stuck or a failure. Sometimes that’s just as far as an observation can be IDed.

5 Likes

I agree with this part. One other clarification from your first comment is that not only genus can become RG but also tribe, subfamily, and (I forget if family) can, if using “Can taxon be improved.”

1 Like

But what if species groupings themselves are uncertain hypotheses? I mean, I can tell a robin from a rabbit and have no problem asserting that “robins” and “rabbits” are separable. But often the boundaries between nominally different groups of organisms or different-looking groups of organisms are unclear, and what a species “is” and how that relates to complex populations are a matter of ongoing (and fascinating) uncertainty.

5 Likes

I think Alfred Korzybski’s famous saying, “the map is not the territory,” applies quite nicely to taxonomy, as well as to other such “thinking tools” like maths, physics, etc. They’re all artificial models, and I assume some are less faithful to reality (whatever that turns out to be) than others.

8 Likes

Nope, just subfamily down to genus. Should be noted that this should only be used when it’s certain that the community ID cannot be improved with the evidence provided (as the responses to your other recent topic explain in more detail).

3 Likes

Triage? I ID what I am certain of (but still, follow notifications to keep learning)

I skip what is planty but I cannot take it further.

To quite a few I will add a tentative ID. Or, even more tentatively, leave a comment with a “could be this?” link (rewarding when 2 taxon experts agree and I can then tag along). My notifications let me withdraw if it is clear that I was wrong. But I have still moved it from Unknown to a slightly better ID eventually.

4 Likes

Perfectionism is rarely worth it.

Especially on iNaturalist, which has the mission of connecting people to nature and allowing them to LEARN something about it.

iNat was never intended to support academic research. If researchers can glean something from this citizen science, well and good. But, really, such researchers have no right to expect perfection or criticize any perceived lack of accuracy here.

I believe it’s okay to make mistakes. My opinion (and I believe certain forum users may disagree wholeheartedly): I think there is far too much discussion and censure about making a wrong ID on iNat.

For my part, I feel disappointment when IDers suggest an ID in the Comments because they have lost the confidence to make an actual ID. I believe more people here would learn something from making an occasional ID mistake, if there wasn’t a faction spreading ill-will over mistaken IDs. If there’s a mistaken ID, don’t blow up, please. Just fix it, and teach. Provide a little explanation, if you can do it kindly.

(Disclosure: I have not read the whole thread. I am reacting to the OP and my perceptions about the trend of ID censure on the forum.)

14 Likes

I am one of the people who disappoints you. I sometimes suggest an id in the comments box when I am not sure, the reasoning being that I don’t want an apparently definite id against my name which others will recognise as being unjustified from the photographic evidence. It isn’t loss of confidence; it is knowing my limits and the limits of the evidence. I hope that such comments are helpful, maybe prompting the photographer to take some more pictures of the specimen, or at least find out what photos will allow a definite id next time.

On the other hand, I sometimes stick in an identification when the balance of probabilities say it is correct but I’m not certain, e.g. if the other possibility is far rarer than the species I am suggesting. I tend to stray into this behaviour when I have scanned several pages of observations without seeing any to identify. So I am inconsistent.

10 Likes