The computer vision system already seems to have gotten a lot better since it came out.
My understanding is that the basic problem is that the computer vision system was designed without two abilities that will likely be necessary to reach practical functionality:
1. It doesn’t consider location at all.
2. It only trains on species that have a certain number of research-grade observations from different users; species that don’t yet have enough research-grade observations, and higher taxa, don’t “exist” as far as the computer vision is concerned.
There was word back in the old version of the forum that fixes were underway for these problems. It’d be nice to have a sense of when these changes might come out, if at all.
Is there any way to fix this other than uploading photos for these ‘non-existing’ taxa?
Even that won’t fix it. In order to become eligible for the computer vision training, a species has to have a minimum number of observations on iNaturalist (I think it is 20, but don’t remember for sure), and those observations must come from a diversity of users. So one user adding a bunch of photos will not push a taxon into the queue for the next time the algorithm is run.
The only real option is to encourage a diverse base of people to upload their records.
Is there any documentation regarding the second item, about training only on species that have a certain number of research-grade observations? My impression is that that is definitely not the way the algorithm currently behaves. I frequently get computer vision suggestions, on liverworts in particular (but occasionally vascular plants), for taxa that have no records, research-grade or otherwise, on iNaturalist. Just photos from NatureServe or a similar source. And to bring things back to the original topic: yes, they are often described as specific to an entirely different continent.
@erikksen do you have a place defined in your settings? I am on the .nz domain, and have a default “New Zealand” filter picked up from that, so when I see results I am only seeing what is present in New Zealand. To see worldwide results I have to deselect the place filter.
There are 13,730 species that have at least 20 research grade observations. We chose this as the data threshold necessary to include a species in our model. Technically, this number is closer to 10,000 species since we took steps to ensure that each species had at least 20 distinct observers to control for observer effects. We are moving a new species across this data threshold every 1.7 hours as new observations and identifications are added to iNaturalist. This means every observation you post or identification you make works to improve the model!
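To make the two thresholds quoted above concrete, here is a minimal sketch in Python. It is not iNaturalist’s actual pipeline: the function name, the dictionary keys (`species`, `user`, `quality_grade`) and the idea of passing observations as a plain list are all assumptions made for illustration. Only the two numbers (20 research-grade observations, 20 distinct observers) come from the quote.

```python
from collections import defaultdict

MIN_OBSERVATIONS = 20  # research-grade observations, per the quoted threshold
MIN_OBSERVERS = 20     # distinct observers, per the quoted threshold


def eligible_species(observations, min_obs=MIN_OBSERVATIONS,
                     min_observers=MIN_OBSERVERS):
    """Return the set of species that pass both thresholds.

    `observations` is a list of dicts with hypothetical keys
    'species', 'user' and 'quality_grade'.
    """
    counts = defaultdict(int)
    observers = defaultdict(set)
    for obs in observations:
        # Only research-grade observations count toward the thresholds.
        if obs["quality_grade"] != "research":
            continue
        counts[obs["species"]] += 1
        observers[obs["species"]].add(obs["user"])
    return {
        species
        for species in counts
        if counts[species] >= min_obs
        and len(observers[species]) >= min_observers
    }
```

Under this reading, one user uploading 30 research-grade observations of the same species would still leave it out, because the observer count stays at 1.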
I have no idea if this is the right place to comment, so if it is not, just remove this reply. But I am always surprised that they (in general) never give a warning that the chosen species is not found in this country, has not been found in the state in the last 10 years, or is from another continent.
I know the ObsMap app shows bold red text if the species is rare, and if it is very rare you have to confirm it by pressing an additional YES. That last option reduced rare bird records by 90%, which was important because SMS and email notifications are sent out (mainly) for rare bird alerts.
I was very happy that the iNaturalist app shows a very basic “the species is seen nearby”, which, in my opinion, should be extended to warnings like ‘not from this continent’.
I have no idea how difficult it is, but I used to have a database (from 30 years ago) with plant names and their presence in the different European states. I am only afraid that for several species the data is outdated.
But bold red text (can colour-blind people see red?) would prevent some wrong entries.
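As a rough illustration of the kind of warning I mean, here is a small Python sketch. Everything in it is hypothetical: the function name, the idea that each species comes with a set of countries and continents where it has been recorded, and the rule that an off-continent pick needs an extra confirmation (loosely modelled on the extra YES described above). It says nothing about how iNaturalist or ObsMap actually work.

```python
def range_warning(known_countries, known_continents,
                  obs_country, obs_continent):
    """Return (message, needs_confirmation) for a chosen species.

    `known_countries` and `known_continents` are hypothetical sets of
    places where the species has been recorded before.
    """
    if obs_continent not in known_continents:
        # Strongest case: ask the user to confirm explicitly.
        return ("Not recorded on this continent", True)
    if obs_country not in known_countries:
        # Softer case: show a warning but do not block the choice.
        return ("Not recorded in this country", False)
    return (None, False)


# Example with made-up range data for a North American species
# chosen for an observation in the Netherlands.
print(range_warning({"US", "CA"}, {"North America"}, "NL", "Europe"))
```

A text message rather than colour alone would also sidestep the colour-blindness question.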
Nope. To add a little more information to this scenario (again, most common with liverworts and mosses in my experience):
In the upload process, the list of suggestions that drops down has maybe four suggestions I’d like to check out, so I right-click and open each in a new tab. One or two might turn out to be as I described before, with photos from a NatureServe article or something to that effect, but no existing iNaturalist records. In line with this topic’s origins, most of the others do show iNaturalist records, but the existing observations may be only in Asia or similarly not geographically sensible (I understand that if I did have a location filter on, I would not see those observations, right?).
Really my only interest in pointing this out is to suggest that the computer vision is not trained solely on research-grade iNat observations. It may be that it is trained on a limited number of images from existing authoritative sources as well (otherwise how would the suggestions I’m referring to end up being suggested to me?). That may indicate that condition 2 in edanko’s post that I was responding to has changed at some point.
There is a different problem with the suggestions too: sometimes, e.g., many beans are suggested, from different genera, but the software does not suggest ‘bean’ itself as an option. As bean is consistent with all of the suggestions, that is what I want to pick, and I can, as long as I know they are all beans. But if I don’t know what the lowest taxon the suggestions have in common is, I cannot decide on a lowest-level safe or probable ID, except, of course, by clicking on many of the suggestions and checking their taxonomy. I check taxonomy often enough that I would like there to be a summary ID in the case where the probable taxa are similar at some level which is not too high. Even if the taxa don’t meet until, say, dicots, I think providing that is not a problem, because dicots is then a very safe ID, being so high-level. The trouble will be if the algorithm often picks multiple taxa of the same low-level taxonomic group while ignoring candidates from elsewhere. Even then, the error is no greater than when one of the given IDs is chosen. At least it will be obvious how far up the disagreement is.
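The “summary ID” idea is basically the lowest common ancestor of the suggested taxa. A minimal Python sketch, assuming each suggestion comes with its lineage as a list running from the highest rank down to the species (the function name and the simplified lineages below are for illustration only, not how iNaturalist stores taxonomy):

```python
def lowest_common_taxon(lineages):
    """Return the deepest taxon shared by all lineages, or None.

    Each lineage is a list of taxon names ordered from the highest
    rank down to the species.
    """
    if not lineages:
        return None
    common = None
    # Walk down the ranks in parallel until the lineages diverge.
    for names_at_rank in zip(*lineages):
        if all(name == names_at_rank[0] for name in names_at_rank):
            common = names_at_rank[0]
        else:
            break
    return common


# Simplified lineages (intermediate ranks omitted) for three "beans".
beans = [
    ["Plantae", "Magnoliopsida", "Fabales", "Fabaceae", "Phaseolus", "Phaseolus vulgaris"],
    ["Plantae", "Magnoliopsida", "Fabales", "Fabaceae", "Vicia", "Vicia faba"],
    ["Plantae", "Magnoliopsida", "Fabales", "Fabaceae", "Glycine", "Glycine max"],
]
print(lowest_common_taxon(beans))  # Fabaceae
```

If the suggestions only meet at a very high level, the same function simply returns that high-level taxon, which makes it obvious how far up the disagreement is.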
I am not sure whether I missed something from this very long thread but I wanted to add an extra example.
This problem occurs widely, even with VERY common species of mammals. Not only do you get suggested a list of species which don’t occur anywhere on the continent, you also get a “we are pretty sure this is in the family:” suggestion which links you to the wrong family. The family suggestion is obviously linked to the wrongly suggested top species, so even the cautious newbie who wants to be on the safe side fails in assigning a family ID.
Here is a very fresh observation in which the problem occurs:
A very common European rodent (Apodemus, family Muridae) is being wrongly assigned to American species (in the genus Peromyscus, which does not occur in Europe) and of course also assigned by the algorithm to the correspondingly wrong family (Arvicolidae).
I correct several of these easily avoidable mistakes and this leads me to think that geography is absolutely not taken into account.
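To show what taking geography into account could look like in the simplest form, here is a hedged Python sketch. It is not the actual algorithm: the function name, the `score` and `continents` fields, and the scores themselves are made up for illustration. It just drops suggestions never recorded on the observation’s continent before ranking, and falls back to the full list if nothing is left.

```python
def rerank_by_continent(suggestions, obs_continent):
    """Rank suggestions by score, preferring taxa known from the continent.

    Each suggestion is a dict with hypothetical keys 'name', 'score'
    and 'continents' (a set of continents with prior records).
    """
    local = [s for s in suggestions if obs_continent in s["continents"]]
    # Fall back to the unfiltered list rather than returning nothing.
    pool = local if local else suggestions
    return sorted(pool, key=lambda s: s["score"], reverse=True)


# Made-up visual scores for a mouse photographed in Europe.
suggestions = [
    {"name": "Peromyscus maniculatus", "score": 0.6, "continents": {"North America"}},
    {"name": "Apodemus sylvaticus", "score": 0.3, "continents": {"Europe", "Africa"}},
]
print(rerank_by_continent(suggestions, "Europe")[0]["name"])  # Apodemus sylvaticus
```

If the family suggestion were then derived from the filtered list instead of the raw top species, the wrong-family problem described above should also become less likely.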