Is the iNatForum-General the appropriate place to note suggestions?
Case example:
Alpine Jelly Cone (Guepiniopsis alpina) is what the algorithm suggests when uploading yellow, conical, blob-like fungi, and the Compare menu then displays only closely related species.
In reality, this is a widely misidentified fungus. (speculative)
Recommended solution: Yellow Hat Jelly (Dacrymyces capitatus) should also be displayed as an alternative whenever the algorithm suggests Alpine Jelly Cone (Guepiniopsis alpina).
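To be concrete about what I'm proposing, it would be a simple post-processing rule on the suggestion list, something like this rough sketch (all names here are hypothetical illustrations, not iNat's actual code):

```python
# Hypothetical post-processing step: force known look-alikes into the
# suggestion list whenever their counterpart appears.
CONFUSABLE_PAIRS = {
    "Guepiniopsis alpina": ["Dacrymyces capitatus"],
}

def add_lookalikes(suggestions):
    """suggestions: list of taxon names ranked by the CV model."""
    result = list(suggestions)
    for taxon in suggestions:
        for lookalike in CONFUSABLE_PAIRS.get(taxon, []):
            if lookalike not in result:
                result.append(lookalike)
    return result

print(add_lookalikes(["Guepiniopsis alpina", "Dacrymyces chrysospermus"]))
# ['Guepiniopsis alpina', 'Dacrymyces chrysospermus', 'Dacrymyces capitatus']
```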
I have made numerous corrections, but apparently not enough for the AI to take notice.
I don’t think there’s any way to manually “tell” the CV to change its suggestions. It’s just a learning model that matches unknown images against a dataset of known images. As far as I know, no one can change what the CV spits out on a case-by-case basis like this. The only solution is to make corrections to the photos that have been misidentified thus far and wait until the model is retrained on the corrected data. It can be a long, tedious process, but it’s doable.
Even if it were possible, I’m not sure I’d agree with this suggestion. The model takes into account not only appearance but also physical location. So while always presenting both species may improve the rate of correct IDs in locations where both species are present, it may also lead to misidentifications where only one species is present.
For example, it looks like alpine jelly cone is present all along the coast of British Columbia and into Alaska, but yellow hat jelly only gets up to Washington state (based on research-grade obs). To always suggest yellow hat jelly alongside alpine jelly cone would likely lead to many misidentified records of yellow hat jelly in British Columbia and Alaska, where it does not seem to actually occur.
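For what it’s worth, the way a location prior interacts with visual scores can be sketched in a few lines (illustrative numbers only, not iNat’s actual model):

```python
# Illustrative only: combine a visual score with a location prior.
# If a species is effectively absent from a region, its prior drives
# the combined score toward zero no matter how good the visual match.
visual_score = {"Guepiniopsis alpina": 0.55, "Dacrymyces capitatus": 0.45}
geo_prior_bc = {"Guepiniopsis alpina": 0.90, "Dacrymyces capitatus": 0.02}  # e.g., British Columbia

combined = {sp: visual_score[sp] * geo_prior_bc[sp] for sp in visual_score}
total = sum(combined.values())
combined = {sp: v / total for sp, v in combined.items()}
print(combined)
# Guepiniopsis dominates; always forcing Dacrymyces into the list
# would override exactly this safeguard.
```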
Good point, I hear you. My thought differs because:
DNA sequencing is being performed as we speak, and time will tell what the real truth is. My current knowledge/speculation is that Dacrymyces is widely misidentified as Guepiniopsis, and therefore the current distribution data is off track. Regarding location, I think altitude will be a critical parameter: low altitude pointing to Dacrymyces and high altitude to Guepiniopsis.
New observations are being misidentified, which is likely to delay a wide scale correction.
There are now at least four of us making corrections around WA, so perhaps things will gradually improve.
So I think the real question is: can we, should we, and how do we “re-train” the iNat algorithm when a Homo sapiens recognizes that the computer model is on the wrong track? (Noting that humans make errors too, so the machine should have a way to correct itself back when it was right in the first place.)
My first Neural Network project was in 1991. I “trained” a stack of wires and transistors to self-learn and make decisions on a substantial task. The result impacted a lot of human lives, so the failure mode was catastrophic. The rig was remarkably accurate, but only eventually: initially I had to tweak many parameters to force it to learn properly, and when it mis-learned, I used my “designer hammer”.
Since then I have used various technologies to create Neural Networks, but that “hammer” is always in my toolbox.
So for the iNat CV, the answer may be that certain members get supervotes on observations. For example, if someone does DNA sequencing on 10 samples and tags the corresponding observations accordingly, the algorithm should take that as the current ground truth and start/restart learning from it.
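In training terms, the “supervote” idea amounts to weighting DNA-confirmed records more heavily than ordinary community IDs. A rough sketch of what that could look like as a weighted loss (a hypothetical PyTorch-style setup, not iNat’s pipeline):

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch: give DNA-confirmed labels extra weight so the
# model treats them as closer to ground truth during (re)training.
logits = torch.randn(4, 10, requires_grad=True)  # batch of 4 images, 10 taxa
labels = torch.tensor([3, 3, 7, 7])              # taxon indices
dna_confirmed = torch.tensor([1., 0., 1., 0.])   # 1 = sequenced specimen

per_sample = F.cross_entropy(logits, labels, reduction="none")
weights = 1.0 + 9.0 * dna_confirmed              # 10x weight when DNA-confirmed
loss = (weights * per_sample).mean()
loss.backward()  # in a real setup, logits would come from the model
```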
This already happens, in a way – not with supervotes, but by these folks completely opting out of the voting system (opting out of community ID).
What DOESN’T happen, AFAIK, is the CV analyzing WHY they opted out, and looking at their unusually detailed and specific observation comments about DNA, microscopy, etc., and then using that information to improve its own accuracy.
When there are multiple pictures in an observation, use Bayesian reasoning to combine the evidence from, for example, the tree, the leaves, and the cones, which could narrow IDs substantially.
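A sketch of what that could look like: independent photos combined in log space, which is the naive-Bayes version of “the bark narrows it, the cones narrow it further” (illustrative numbers and a hypothetical classifier, not iNat’s code):

```python
import math

# Per-photo probabilities from a hypothetical classifier for three
# candidate taxa; photo 1 shows bark, photo 2 leaves, photo 3 cones.
photos = [
    {"A": 0.5, "B": 0.3, "C": 0.2},
    {"A": 0.6, "B": 0.1, "C": 0.3},
    {"A": 0.7, "B": 0.2, "C": 0.1},
]

# Naive Bayes: multiply per-photo likelihoods (add logs), then normalize.
log_post = {t: sum(math.log(p[t]) for p in photos) for t in photos[0]}
m = max(log_post.values())
unnorm = {t: math.exp(v - m) for t, v in log_post.items()}
z = sum(unnorm.values())
posterior = {t: v / z for t, v in unnorm.items()}
print(posterior)
# Taxon A ends up ~95% probable -- far more certain than any single
# photo (max 0.7) would justify on its own.
```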
Use the “Focal Distance” metadata along with “Focal Length” and sensor size to determine the real scale, in mm/pixel, of the in-focus part of the observation. Knowing the height, length, or overall size of an organism adds critical identifying information…
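The geometry here is simple thin-lens math. Roughly (assuming the subject distance is much larger than the focal length, and that the EXIF values are trustworthy):

```python
# Thin-lens approximation: physical scale at the in-focus plane.
# mm_per_pixel = (sensor_width_mm / image_width_px) * (subject_distance_mm / focal_length_mm)
def mm_per_pixel(subject_distance_mm, focal_length_mm, sensor_width_mm, image_width_px):
    pixel_pitch_mm = sensor_width_mm / image_width_px      # pixel size on the sensor
    magnification = focal_length_mm / subject_distance_mm  # image size / object size
    return pixel_pitch_mm / magnification

# Example: phone-like camera, subject 300 mm away, 6 mm lens,
# 7 mm wide sensor, 4000 px wide image.
scale = mm_per_pixel(300, 6, 7.0, 4000)
print(f"{scale:.4f} mm per pixel")  # ~0.0875 mm/px
# A blob spanning 200 px would then be ~17.5 mm across.
```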
I don’t understand your reasoning here. Users opt out for lots of reasons; a major one is that they disagree with the community ID. Opted-out observations are no more likely than any other observations to include detailed notes or be of particularly difficult species. The community ID has no inherent connection to the CV suggestions. Some users make liberal use of the CV when IDing their own observations, but users are always free to enter something other than the CV suggestion – as presumably your knowledgeable observer who has carried out a DNA analysis/microscopy etc. would do. It is therefore unlikely that this ID would be overridden by other users (prompting an opt-out by the observer) just because the CV suggests something else – most users do not meddle with other people’s observations if they don’t know anything about the taxon in question.
I agree that it would be useful if the CV training could directly take into account corrections (i.e., people saying “not that taxon”) and frequently confused species rather than simply being retrained on a corrected set of images, but this has nothing to do with opted-out observations.
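Using “not that taxon” votes directly is at least straightforward to sketch: treat rejected taxa as negative labels and penalize any probability the model puts on them (again a hypothetical sketch under that assumption, not iNat’s actual training code):

```python
import torch
import torch.nn.functional as F

# Hypothetical: one image, 5 candidate taxa. Identifiers agreed it is
# NOT taxon 2 ("not that taxon" votes) but reached no positive ID.
logits = torch.randn(1, 5, requires_grad=True)
rejected = [2]

probs = F.softmax(logits, dim=1)
# Penalize probability mass on rejected taxa; ordinary cross-entropy
# on positively ID'd images would be added alongside this term.
neg_loss = -torch.log(1.0 - probs[0, rejected]).sum()
neg_loss.backward()
```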
My experience IDing a difficult arthropod group (bees) suggests that one of the biggest sources of problems with the CV is the fact that it is not trained on higher-level taxa if a lower-level taxon within them is included. So if there is a genus where the more typical species generally cannot be distinguished from field photos, then only the less characteristic species will be included in the training set – resulting in the model not recognizing the typical members of the genus at all. The same is true if certain life stages are not identifiable to species – for example, it may only learn the adult form and not recognize the larva, because the larval stage is not included in its training set.
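The effect can be reproduced with a toy version of the training-set rule as I understand it (my paraphrase of the described behavior, not the actual export code; the bee genus is just an example):

```python
# Toy version of the rule being described: photos labeled only at
# genus level are dropped once any species in that genus qualifies,
# so the "typical but field-unidentifiable" members of the genus
# never enter the training set.
records = [
    {"taxon": "Lasioglossum", "rank": "genus"},              # typical, unidentifiable from photos
    {"taxon": "Lasioglossum", "rank": "genus"},
    {"taxon": "Lasioglossum sisymbrii", "rank": "species"},  # distinctive species
]

genera_with_species = {r["taxon"].split()[0] for r in records if r["rank"] == "species"}
training_set = [r for r in records
                if r["rank"] == "species" or r["taxon"] not in genera_with_species]
print(training_set)
# Only the distinctive species remains; the genus-level photos are
# excluded, so the model never learns what a typical member looks like.
```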
The OP has been tagged “Solved” by someone or a machine. As the initiator of the post, do I have to accept this resolution? (Bear with me as I am still new to iNat and learning)
Honestly, I do not consider it solved, and as a former programmer and life-long user of various neural networks, I know various plausible solutions exist.
To elaborate on my logic, let me make up a boundary example. Suppose we all wake up one morning to find that the algorithm has begun suggesting slime mold species for all fungi and vice versa. What do we do? Do we sit and wait, hoping it auto-corrects, or is there something that can be done within the current architecture?
No, you don’t. Try unchecking it yourself (in the post where it was marked, not in the OP where it is displayed). If that doesn’t work let me know and I’ll do it.