As suggested in the ‘Feature Requests’ forum category, I thought I would post here in General to see if anyone has ever asked about adding some sort of identification-difficulty indicator to candidate groups during the process of adding an observation.
The first thing that pops to mind is how restaurants use chili peppers as spiciness indicators on menus. Maybe certain groups of taxa could prompt a similar display to caution submitters to slow down and be extra careful.
(I would tell you how I came up with this idea, but I bet you can all guess.)
Also – if such a system were adopted, what would make a good indicator for cautioning a target group? Text colour? An icon? (What would make a good symbol? Ideas?)
If this has already been discussed here, I’d appreciate anyone sharing that link. I could not find anything (yet).
The question would be how to define the categories so that identification difficulty isn’t totally subjective. Would it be identification difficulty for a complete novice? A beginner? Or so difficult that even the experts struggle to place a name on it? Would it factor in potentially large numbers of undescribed species in certain groups?
It would also have to be geography sensitive. There are plenty of species/groups of species which have lookalikes in certain regions, but only one species in other regions. Displaying a “difficult to ID” marker on a taxon would be incredibly misleading in the areas where it has no lookalikes.
It could take into account the number of lookalikes in a region and the general difficulty of ID (from easy for a beginner to difficult for an expert). It would also be useful if there were tips on how to submit useful observations in certain taxa (for example, how to take an identifiable picture of a mushroom).
Would be near-impossible to quantify but I love the idea. Unfortunately, one must consider that different people have different levels of ID skill – to a non-birder, a field sparrow and a swamp sparrow may blend together, whereas to a birder they couldn’t be more different. Similarly, I can barely tell insects of the same family apart, but obviously specialized insect ID’ers would have little trouble. Maybe a flag for organisms which can’t be ID’ed from photos could be nice, though?
Thank you for those links! It clearly shows that you are a kind, sharing person.
Wait a second… um… where did that last piece of my chocolate bar go?
How about by track record?
Could the number of ID corrections on record for any given choice be somehow automatically worked into a ‘take extra caution’ flag in the ID system? Maybe only for stuff that has a statistically robust base of number of observations in an area.
Or is that just asking for more trouble?
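A rough sketch of how such a track-record flag could work, just to make the idea concrete. Everything here is hypothetical: the `TaxonRegionStats` record, the minimum of 100 observations and the 20% correction rate are made-up assumptions, not anything iNaturalist actually computes. A plain minimum sample size stands in for "statistically robust".

```python
from dataclasses import dataclass


@dataclass
class TaxonRegionStats:
    """Hypothetical tallies for one taxon in one region (not an iNaturalist object)."""
    observations: int  # observations first identified as this taxon
    corrected: int     # of those, how many later had the ID corrected


def needs_caution(stats, min_observations=100, max_correction_rate=0.2):
    """Return True or False for the flag, or None when there is too little data."""
    if stats.observations < min_observations:
        return None  # not enough observations in this area to judge
    return stats.corrected / stats.observations > max_correction_rate
```

With these made-up thresholds, a taxon with 150 corrections out of 500 observations would get the flag, one with 20 out of 500 would not, and one with only 30 observations would get no verdict at all.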
That’s why I suggested basing it on ease for a beginner and difficulty for an expert. For example, it’s easy to identify a red-winged blackbird, but for a beginning identifier sparrows and wrens can be confusing.
Another aspect to maybe consider - what percentage of that taxon is included in the computer vision? That makes a big difference in the possibility of misidentification when novices upload photos and use the iNat suggestions.
I study bees, and when a good chunk of the species in a genus don’t even have photos on iNat, let alone a place in the AI, it can lead to many people uploading bees that superficially look identical but could be any number of different things, with an ID of the most common choice. It leaves me wondering if that user is an expert, which means I sometimes go in and read their profiles etc., which costs time.
So a warning in that sense, when uploading/selecting IDs from taxa that are grossly under-represented in the computer vision, would be helpful and might make folks hesitate before wantonly going with the top iNat suggestion, and maybe leave it at genus instead.
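As a sketch of what such a warning could be based on: compare the species known in a genus with the species the vision model was trained on, and fall back to genus when coverage is low. The function names, the 50% cutoff and the placeholder species names are all assumptions for illustration, not part of any iNaturalist API.

```python
def cv_coverage(species_in_genus, species_in_model):
    """Fraction of a genus's known species that the vision model includes."""
    known = set(species_in_genus)
    if not known:
        return 0.0
    return len(known & set(species_in_model)) / len(known)


def suggested_rank(species_in_genus, species_in_model, min_coverage=0.5):
    """Suggest stopping at genus when too few species are in the model."""
    if cv_coverage(species_in_genus, species_in_model) >= min_coverage:
        return "species"
    return "genus"


# Placeholder names: a genus with four known species, one of them in the model.
genus_species = ["sp_a", "sp_b", "sp_c", "sp_d"]
model_species = ["sp_a"]
rank = suggested_rank(genus_species, model_species)
```

Here only a quarter of the genus is covered, so the sketch would suggest leaving the ID at genus.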
I agree! Aside from the general difficulties in generalizing ID difficulty (though I think some common ground could maybe be found somehow), the biggest issue is that this information is just not global… For example, in large parts of Europe Pisaura mirabilis is one of the most unmistakably identifiable spider species, even for absolute beginners… it is actually probably one of the very first spider species I myself was able to identify, even before I started my arachnological interest. But beware if you observe in the Mediterranean or south-eastern areas, where a bunch of different species cannot be identified even by expert IDers…
Lately I’ve been spending so much time correcting wrong CV suggestions for bees that I am inclined to propose that it should not be offering suggestions at species level at all for this group, just genus. Or maybe it should stay at Anthophila, since a lot of the wrong CV suggestions are not even the correct genus (e.g. where it has learned the flower preferences of oligolectic bees rather than the appearance of the bee itself).
In practice, though, of course it’s more complicated and I don’t see any easy way to automate such a mechanism.
One indicator might be the percentage of observations in the genus that have multiple IDs and are not at species level – i.e., if this is regionally higher than, say, 30% or 50% it seems likely that there are at least a few species in the genus that can’t be easily distinguished from photos.
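To make that indicator concrete, here is a minimal sketch. It assumes each observation is a plain dict with a `num_ids` count and the `rank` of its current ID (both field names are hypothetical), and it uses the 30% figure from above as the default cutoff.

```python
def stuck_above_species_share(observations):
    """Share of all observations that have 2+ IDs but are not at species level."""
    if not observations:
        return None
    stuck = sum(
        1 for obs in observations
        if obs["num_ids"] >= 2 and obs["rank"] != "species"
    )
    return stuck / len(observations)


def looks_hard_from_photos(observations, threshold=0.3):
    """True when the share exceeds the cutoff (0.3 or 0.5 as suggested above)."""
    share = stuck_above_species_share(observations)
    return share is not None and share > threshold
```

The input would need to be limited to one genus in one region beforehand, since the whole point is that the percentage varies regionally.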
A high percentage of observations that are “research grade” at a level above species might work for some taxa, but my experience is that people don’t use the “ID cannot be improved” button very often for bees, for a variety of reasons.
Another criterion that seems likely to be relevant would be the percentage of incorrect/disagreeing IDs – including withdrawn IDs – for observations with a community ID of a particular species/genus. However, from what I understand, withdrawn IDs aren’t currently taken into account either in the “similar taxa” tab on taxon pages or when training the CV, so I suspect this would require some basic changes to the way suggestions are calculated.
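A sketch of that second measure, showing how much counting withdrawn IDs can change the result. The ID records are hypothetical dicts with a `taxon` name and a `withdrawn` flag. As a simplification, any ID that differs from the community taxon counts as disagreeing, which ignores the case where a coarser ID (e.g. genus) is merely less precise rather than wrong.

```python
def disagreement_rate(ids, community_taxon, include_withdrawn=True):
    """Share of IDs that name a taxon other than the community taxon."""
    considered = [i for i in ids if include_withdrawn or not i["withdrawn"]]
    if not considered:
        return None
    disagreeing = sum(1 for i in considered if i["taxon"] != community_taxon)
    return disagreeing / len(considered)


# Placeholder data: two withdrawn wrong IDs and one active wrong ID.
ids = [
    {"taxon": "taxon_a", "withdrawn": False},
    {"taxon": "taxon_a", "withdrawn": False},
    {"taxon": "taxon_b", "withdrawn": True},
    {"taxon": "taxon_b", "withdrawn": True},
    {"taxon": "taxon_c", "withdrawn": False},
]
with_withdrawn = disagreement_rate(ids, "taxon_a")
without_withdrawn = disagreement_rate(ids, "taxon_a", include_withdrawn=False)
```

In this made-up case the rate is 3 out of 5 when withdrawn IDs count and 1 out of 3 when they don’t, so leaving them out hides most of the confusion.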
Maybe the Pre-Maverick project could be a start here
8K bees in limbo
No, I wasn’t thinking of cases of disagreements – just the many observations where multiple users have ID’d to genus (or subgenus/complex) because a lot of the time a species ID for bees requires seeing characteristics that typically aren’t visible in the average photo. So the CV ends up skewed because it is based on a small subset of species that are readily IDable from photos, rather than the many more observations that have a community ID at genus level (but are not research grade).
To give a concrete example of the sort of issue that happens with some of the more difficult taxa:
The genus Lasioglossum consists mostly of small, inconspicuous, sparsely haired bees, often with a metallic sheen. In Europe there are a handful of species – usually the less typical members of the genus – that can be easily ID’d from photographs, and a few additional groups that can be fairly reliably assigned to a subgenus. The rest … are difficult. They generally require an IDer with local expertise and photos of better-than-average quality. Or photos that aren’t typical field photos (bees caught in-hand to photograph, or pinned specimens).
As a result, I would estimate that only about 1% of observations are “research grade”, and a slightly larger percentage have an ID at species level. This has had palpable consequences for the CV model, which apparently doesn’t recognize the genus at all.
I have been spending an inordinate amount of time correcting CV suggestions for dark, skinny bees. It seems particularly fond of subgenus Micrandrena. For bees on certain flowers, e.g. Campanula or Geranium or Ranunculus, it likes to suggest Chelostoma (pollen specialists). Sometimes it will suggest the closely related sister genus Halictus, which is perhaps the most understandable of these wrong suggestions. But it never seems to suggest Lasioglossum, even though most of the time this is what it is.
The training of the model and the suggestions it provides seem to be based on the assumption that most observations should be IDable to species level, which is simply not the case for some taxa.
There’s definitely a bunch of fungi genera where I really, really, really wish that CV would prioritize IDing to genus, or even section/subsection - but I don’t think a lot of people even realize mushroom sections exist. Most of the time when I want to put something at that level but can’t remember the spelling, I have to go searching for it.
I don’t think CV even acknowledges that mushroom sections exist? (Has anyone ever done a feature request for that? Would that even be possible?)
We do include sections in the model, I double-checked. For example, this taxon is in the current model: https://www.inaturalist.org/taxa/1135503-Microsporae I’m pretty sure, however, that we only train on higher taxa if none of its children are in the model, which is the case here.
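For anyone trying to picture that rule, here is a toy sketch of "a higher taxon is only included if none of its descendants are". It is only an illustration of the rule as described in the post above, not the actual training code; the tree, the `qualifies` set (taxa with enough data to be trained on) and all names are made up.

```python
def select_model_taxa(children, qualifies, root):
    """Pick taxa for the model: a taxon is kept only if nothing below it is kept.

    children:  dict mapping a taxon to a list of its child taxa
    qualifies: set of taxa that have enough data to be trained on (assumption)
    """
    selected = set()

    def visit(taxon):
        # Returns True if this taxon or any descendant ended up in the model.
        covered_below = False
        for child in children.get(taxon, []):
            if visit(child):
                covered_below = True
        if covered_below:
            return True
        if taxon in qualifies:
            selected.add(taxon)
            return True
        return False

    visit(root)
    return selected


tree = {
    "genus": ["section_a", "section_b"],
    "section_a": ["sp_1", "sp_2"],
    "section_b": ["sp_3"],
}
model = select_model_taxa(tree, {"genus", "section_a", "section_b", "sp_1"}, "genus")
```

In this toy tree `section_b` gets in because none of its species qualify, while `section_a` is left out as soon as a single one of its species (`sp_1`) is in the model.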
Gotcha, thank you so much for checking! That’s at least something, though unfortunately the sections I see that could really benefit from more section suggestions generally don’t pop up (probably because they have children that do show in the CV, like you said).
Here’s one example
CV doesn’t suggest the section here, only the species, Coprinellus micaceus, but there are several different species in this section that are macroscopically identical. So generally what happens is folks end up having to go through and kick them back up to section - which, of course, isn’t the end of the world, but I wouldn’t be surprised if it’s ending up with a lot of false research grades at species level, simply because your average IDer sees what seems like an obvious-to-ID mushroom without even realizing there are lookalikes.
Of course the problem is made a little worse for this species because field guides generally just call them Mica Caps without even mentioning that it’s really a complex of several different species, so even if you have a field guide you may not realize.
(Of course, I have no idea how to make CV suggest sections for problematic mushrooms without making it incredibly tedious for the site curators, so I don’t even know what my point is here - just spitballing.)