I’m not certain how the AI is differentiating between Elodea canadensis and Elodea nuttallii here, but reliably separating the two requires minute detail of tiny reproductive features that observers’ images rarely capture. In their Aquatic and Wetland Plants of Northeastern North America (Second ed.), Crow and Hellquist note, “Differentiating our two most common species, E. canadensis from E. nuttallii, by vegetative specimens alone is very unreliable; accurate identifications require flowering/fruiting material. D. Les (pers. comm., 2020) found that fertile populations were needed to corroborate DNA information with reproductive morphology to confirm species; that DNA studies do confirm that there are two genetic lineages, but that hybrids detected by DNA analysis are not uncommon.”
This is a fairly common problem. The CV suggests “leaf taxa” that are in its training data. In cases where two species are difficult or impossible to distinguish from photos, the CV may suggest a genus level ID at first, until there are enough RG observations of one of the species. At that point, the CV is much more likely to suggest the species within its training data that is the closest visual match to the observation in question. The threshold for inclusion in the CV’s training data is around 100 RG images I believe (not necessarily 100 RG observations, but it’s usually around that mark). So, basically, if you have enough RG observations of either of the species, even if those observations are IDed based on genetic analysis or other characteristics not visible in the photos, the CV will “learn” the visual characteristics of those photos and apply them to other photos. The only way to counter this is to push all the ambiguous observations back to genus, but it will be a perennial problem. Several feature requests have been made in an attempt to deal with this. The approach that seems most reasonable to me is that there should be some mechanism to “prune” the CV so that it does not include particular taxa as leaf taxa within its suggestion pool.
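To make the threshold point concrete, here is a minimal sketch of a count-based inclusion rule. It is not iNaturalist's actual pipeline: the function name, the photo counts, and the cutoff of 100 (the rough figure guessed above) are all assumptions for illustration.

```python
from collections import Counter

# Assumed cutoff: "around 100 RG images" is the estimate given in this thread,
# not an official figure.
MIN_RG_IMAGES = 100


def taxa_eligible_for_training(rg_photo_taxa, min_images=MIN_RG_IMAGES):
    """Return the taxa that have at least `min_images` Research Grade photos.

    `rg_photo_taxa` holds one taxon name per RG photo (hypothetical input).
    """
    counts = Counter(rg_photo_taxa)
    return {taxon for taxon, n in counts.items() if n >= min_images}


# Made-up counts, purely illustrative.
photos = ["Elodea canadensis"] * 120 + ["Elodea nuttallii"] * 40
print(taxa_eligible_for_training(photos))
```

A rule like this only counts photos. It never checks whether the diagnostic features are visible in them, which is why IDs based on genetic analysis or microscopy can still end up teaching the CV a species.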
https://forum.inaturalist.org/t/allow-some-non-leaf-taxa-to-be-added-to-the-cv-model/63937
There is this. But a request to prune taxa from the CV was apparently denied in 2019. Still, perhaps a new feature request specific to pruning certain taxa should be made. It’s been 6 years and this is an important issue. It seems that every week or two a new forum post is made that relates in some way to the issues of the CV.
I don’t understand how it makes sense to allow the CV to “learn” any living thing from photography. Various species across the tree of life are identified by all sorts of things that photos cannot capture: DNA, sound, smell, chemistry, measurements, etc. If you get even more in the weeds, some viruses and bacteria can be identified by symptoms. How do you represent something like a cough with imagery?
The CV is just an image matcher with an estimated range map to help filter what should be nearby, and there is very little limit on which species it can learn.
So, as an identifier, I guess there are three options: 1.) Back everything out to the genus level; 2.) Leave the ID as is, but put an “or it could be …” comment on the observation; or 3.) Just scroll on past. My only other suggestion would be to somehow allow identifiers to have an “or” function where we can create small quasi-taxonomic groupings. Another aquatic plant example of this would be Potamogeton richardsonii vs P. perfoliatus. Maybe only 1% of observations include the very small detail necessary to definitively identify the species, but narrowing it down to two choices would seemingly be more useful than backing all the way out to the genus level.
There are about 9000 observations between the two species, most of them RG. Even if identifiable photos are very rare, you need only about 70-90 RG observations to train those species. It also depends on location. If one of these is easily identifiable in a certain location because only one species occurs there (a made-up example: the Canary Islands), then it’s possible it could get trained from just the observations in the Canary Islands, even if everywhere else in the world you can’t really ID it.
Leaving 9000 comments isn’t really feasible.
Taxonomy can’t be made up and needs to be sourced.
Unfortunately, there’s likely not much one can do in this situation. All it would take is 80ish high-quality observations showing the identifying features out of 8000 Elodea canadensis to get it trained. I would reach out to the top identifiers of these species, though.
I went back and recommended backing out to the genus level for the past 100 or so observations of either E. nuttallii or E. canadensis, and backed observations which were really obscure to the family level (to account for Hydrilla and Egeria). I left comments on all of those observations, which include most folks currently observing aquatics in Northeastern North America. It seems to have worked - the default now appears to be the genus and not species.
The default? Could you elaborate, do you mean the Computer Vision suggestions?
Yes; for the most part, the CV suggests Elodea sp. now as first choice.
The CV trains on a month to two month time frame. The last update was Aug 15, which was trained on data from June 22.
https://www.inaturalist.org/blog/115962-new-computer-vision-model-with-over-2-500-new-taxa
The CV is slow like that. Basically any changes since June 22 have not had any impact on the CV yet as it needs to be trained. Anything done today will likely not be implemented into the model for 3-4 months, since just recently an update was released.
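For a sense of the lag between the data cutoff and the release quoted above, here is a quick date calculation. The year is an assumption (the post only gives the month and day); the day count is the same for any year.

```python
from datetime import date

YEAR = 2025  # assumption: the post gives only month and day

data_cutoff = date(YEAR, 6, 22)    # training data exported
model_release = date(YEAR, 8, 15)  # model released

lag_days = (model_release - data_cutoff).days
print(f"Lag between data cutoff and release: {lag_days} days")
```

So even the newest model is nearly two months behind the site on the day it ships, and IDs changed after the cutoff have to wait for a later training run.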
What you describe is normal, though. When the CV suggests leaf taxa, the common ancestor of those taxa is shown above them, as the “We’re pretty sure it’s in X” line. Sometimes no ancestor is listed, because the website seems limited in how high a taxon rank it can show as the ancestor.
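The common-ancestor idea can be sketched as below. This is only an illustration, not how iNaturalist actually computes it (the real system also uses scores): the function name and the abbreviated lineages, which start at family, are assumptions.

```python
def common_ancestor(lineages):
    """Return the deepest taxon shared by every lineage, or None if none is shared.

    Each lineage is a list of taxon names ordered from the highest rank
    down to the suggested leaf taxon.
    """
    shared = None
    for names_at_rank in zip(*lineages):
        if all(name == names_at_rank[0] for name in names_at_rank):
            shared = names_at_rank[0]
        else:
            break
    return shared


# Abbreviated, illustrative lineages (family > genus > species).
E_CANADENSIS = ["Hydrocharitaceae", "Elodea", "Elodea canadensis"]
E_NUTTALLII = ["Hydrocharitaceae", "Elodea", "Elodea nuttallii"]
HYDRILLA = ["Hydrocharitaceae", "Hydrilla", "Hydrilla verticillata"]

print(common_ancestor([E_CANADENSIS, E_NUTTALLII]))            # Elodea
print(common_ancestor([E_CANADENSIS, E_NUTTALLII, HYDRILLA]))  # Hydrocharitaceae
```

If the species suggestions are all Elodea, the shared ancestor is the genus. Once something like Hydrilla is in the mix, it moves up to the family.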
Have you also clicked the “No, it’s as good as it can be” under whether or not the ID can be further improved? If not, and the observation is still in the “Needs ID” state, someone could come along and decide to move the ID to species level even if they have no knowledge of the taxon in question.
If there are many members of the genus, and it is possible to narrow the ID down to a relatively small number of species within it, you could also look into having a “complex” created that includes that limited number of species. Then you can ID the observations as part of the complex instead of the genus. Aside from the added precision, an advantage of this approach is that, should there be a breakthrough in the future that allows the species to be reliably identified from photos, it will be easier to find the observations that should be revisited.
Well, someone could still come along and decide to move the ID to species even if the DQA has been checked and the observation is RG at species. (Yes, this happens. An observation being RG is not protection from people adding random IDs; such IDs tend to have little relationship with whether the observation is “needs ID” or not.)
Unfortunately the DQA does not count as a vote against more specific IDs. However, if pushing the ID back to genus has involved disagreeing with a species-level ID and this species-level ID has not been withdrawn, the disagree will continue to count against the species-level ID.
Thanks. I have seen the “complex” flag used in other complex genera, such as Carex spp. and Sagittaria spp. I’ll look into that option.
I realize that, but I think moving it to RG will significantly reduce the chances of a random person with little to no expertise coming along and putting a species-level ID on the observation (based on the AI suggestion, for example) simply to get it out of the Needs ID queue. Perhaps it doesn’t happen much with a group like aquatic plants.
I don’t know how it works - you may have to enlist the help of a taxon curator.
Would you be able to confirm whether or not the hybrids retain the one-celled marginal serrations of E. canadensis? My guide material uses this trait to distinguish the two main species. It mentions the hybrids being a possibility, but it stops short of explaining which traits the hybrids actually express (as do most mentions of hybrids, for that matter).
The curliness and length of the leaves do seem like unreliable traits to use, but I was under the impression that the microscopic cellular traits were still fairly reliable.
Complexes are often used for this situation. Ideally they should be sister species and referred to as complexes in literature. If they’re nearly identical and hybridize regularly then that sounds appropriate, but with a quick search I didn’t find any instances of them being called a complex.
In the meantime, there are only 8 species in Elodea, so pushing back to genus isn’t as bad in terms of reducing the meaningfulness of the ID as it would be in some large genera.