Species Suggestions for the Wrong Continent

The computer vision system already seems to have gotten a lot better since it came out.

My understanding is that the basic problem is that the computer vision system was designed without two abilities that will likely be necessary to reach practical functionality:

  1. it doesn’t consider location at all
  2. it only trains off of species that have a certain number of research-grade observations from different users; species that don’t yet have enough research-grade observations and higher taxa don’t “exist” as far as the computer vision is concerned

There was word back in the old version of the forum that fixes were underway for these problems. It’d be nice to have a sense of when these changes might come out, if at all.

Is there any way to fix this other than uploading photos for these ‘non-existing’ taxa?

Even that won’t fix it. To become eligible for the computer vision training, a species has to have a minimum number of observations on iNaturalist (I think it is 20, but don’t remember for sure), and those observations must come from a diversity of users. So one user adding a bunch of photos will not push a taxon into the queue for the next time the algorithm is run.

The option is to encourage a diverse base of people to upload their records.

And to help with species ID when you can, so that things become research grade and go into the algorithm.

@charlie @cmcheatle Ok, thanks guys!

Is there any documentation regarding the second item, about training only on species that have a certain number of research-grade observations? My impression is that this is definitely not how the algorithm currently behaves. I frequently get computer vision suggestions, on liverworts in particular (but occasionally vascular plants), for species that have no records, research grade or otherwise, on iNaturalist. Just photos from NatureServe or a similar source. And to bring things back to the original topic: yes, they are often described as specific to an entirely different continent.

@erikksen do you have a place defined in your settings? I am on the .nz domain and have a default “New Zealand” filter picked up from that, so when I see results I only see what is present in New Zealand. To see worldwide results I have to deselect the place filter.

From the iNat website:

There are 13,730 species that have at least 20 research grade observations. We chose this as the data threshold necessary to include a species in our model. Technically, this number is closer to 10,000 species since we took steps to ensure that each species had at least 20 distinct observers to control for observer effects. We are moving a new species across this data threshold every 1.7 hours as new observations and identifications are added to iNaturalist. This means every observation you post or identification you make works to improve the model!

I don’t know if it has changed, but there is this not-easy-to-find page on the site describing the computer vision system: https://www.inaturalist.org/pages/computer_vision_demo

It clearly states that to go into the training for the system, there must be 20 research grade records.
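
A rough sketch of that eligibility rule, going only by the numbers quoted above (20 research-grade observations, 20 distinct observers). The record layout and the function name are made up for illustration; this is not how iNat actually implements it.

```python
from collections import defaultdict

# Thresholds taken from the quoted iNat page; everything else is an assumption.
MIN_OBSERVATIONS = 20
MIN_OBSERVERS = 20


def eligible_species(observations):
    """Return the species that would pass the quoted data threshold.

    observations: iterable of (species, observer, is_research_grade) tuples
    (a hypothetical record layout, not iNat's real schema).
    """
    counts = defaultdict(int)
    observers = defaultdict(set)
    for species, observer, is_research_grade in observations:
        if not is_research_grade:
            continue  # only research-grade records count toward the threshold
        counts[species] += 1
        observers[species].add(observer)
    return {
        species
        for species in counts
        if counts[species] >= MIN_OBSERVATIONS
        and len(observers[species]) >= MIN_OBSERVERS
    }
```

Under this rule, one user uploading 25 photos of a species does not make it eligible, which matches what was said earlier in the thread.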

If you are seeing something else, it is either a filter issue as described above, or something has changed since the rollout documentation.

I have no idea if this is the right place to comment (if it is not, just remove this reply), but I am always surprised that they (in general) never give a warning that the chosen species is not found in this country, has not been found in the state in the last 10 years, or is from another continent.
I know the ObsMap app shows bold red text if the species is rare, and if it is very rare you have to confirm it by pressing an additional YES. That last option reduced rare bird reports by 90%, which was important because SMS and email notifications are sent out (mainly) for rare bird alerts.
I was very happy that the iNaturalist app shows a very basic “the species is seen nearby”, which, in my opinion, should be extended to warnings like ‘not from this continent’.
I have no idea how difficult it is, but I used to have a database of plant names (from 30 years ago) and their presence in the different European states. I am only afraid that for several species the data is outdated.
But a bold red text (can a colour blind man see red?) will prevent some wrong submissions.
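
To make the idea concrete, here is a minimal sketch of such a warning check. It assumes a lookup table of which continents each taxon is known from; the table, the taxon name and the function are all hypothetical, and I have no idea what data iNat or ObsMap actually use.

```python
def range_warning(taxon, continent, known_ranges):
    """Return a warning string if `taxon` is not known from `continent`.

    known_ranges: dict mapping taxon name -> set of continent names
    (a hypothetical table, e.g. built from an old checklist database).
    """
    continents = known_ranges.get(taxon)
    if continents is None:
        return None  # no range data at all: stay silent rather than guess
    if continent not in continents:
        return f"WARNING: {taxon} is not known from {continent}"
    return None


# Made-up example data, for illustration only.
ranges = {"Taxon A": {"North America"}}
print(range_warning("Taxon A", "Europe", ranges))
```

Returning nothing when there is no range data is deliberate here, since outdated or missing data should not trigger a scary message.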

Nope. To add a little more information to this scenario (again, most common with liverworts and mosses in my experience):

In the upload process, the list of suggestions that drops down has maybe four suggestions I’d like to check out, so I right-click and open each in a new tab. One or two might turn out to be as I described before: with photos from a NatureServe article or something to that effect, but no existing iNaturalist records. In line with this topic’s origins, most of the others do show iNaturalist records, but the existing observations may be only in Asia or similarly not geographically sensible (I understand that if I did have a location filter on, I would not see those observations, right?).

Really my only interest in pointing this out is to suggest that the computer vision is not trained solely on research-grade iNat observations. It may be that it is trained on a limited number of images from existing authoritative sources as well (otherwise how would the suggestions I’m referring to end up being suggested to me?). That may indicate that condition 2 in edanko’s post that I was responding to has changed at some point.

I’d be interested in seeing some examples. Post a new topic in General when you come across this again?

Will do, can probably reproduce it with dummy uploads.

Turns out I can’t reproduce it. Maybe I had location filters coming on selectively for individual tabs? Who knows.

There is a different problem with the suggestions too. Sometimes, for example, many beans are suggested, from different genera, but the software does not suggest ‘bean’ itself as an option. Since ‘bean’ is consistent with all of the suggestions, that is what I want to pick, and I can, as long as I know they are all beans. But if I don’t know the lowest taxon the suggestions have in common, I cannot decide on a safe or probable lowest-level ID except, of course, by clicking on many of the suggestions and checking their taxonomy. I check taxonomy often enough that I would like there to be a summary ID for cases where the probable taxa agree at some level that is not too high. Even if the taxa don’t meet until, say, dicots, I don’t think providing that is a problem, because an ID as high-level as dicots is very safe. The trouble would come if the algorithm often picks multiple taxa from the same low-level taxonomic group while ignoring candidates from elsewhere, but then the error is no greater than when one of the given IDs is chosen. At least it will be obvious how far up the disagreement is.
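
The “summary ID” I have in mind is just the lowest taxon that all suggestions share. A minimal sketch, assuming each suggestion comes with its lineage as a list from the root down (the lineages below are heavily simplified and only for illustration; I don’t know how iNat stores ancestry):

```python
def common_ancestor(lineages):
    """Return the lowest taxon shared by all lineages, or None.

    lineages: list of lists, each ordered from the root down to the
    suggested taxon, e.g. ["Plantae", "Fabaceae", "Phaseolus"].
    """
    if not lineages:
        return None
    shared = None
    # Walk down all lineages in parallel until they disagree.
    for level in zip(*lineages):
        if all(name == level[0] for name in level):
            shared = level[0]
        else:
            break
    return shared


# Simplified, illustrative lineages for three bean genera.
beans = [
    ["Plantae", "Fabaceae", "Phaseolus"],
    ["Plantae", "Fabaceae", "Vigna"],
    ["Plantae", "Fabaceae", "Glycine"],
]
print(common_ancestor(beans))  # prints "Fabaceae"
```

If an unrelated suggestion sneaks in, the result simply moves further up the tree, so it is obvious how far up the disagreement is.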

Thanks for connecting me to this thread, @charlie!

(Can color blind man see red?)

Not necessarily, but if the text is also bold then the color is not the only way to recognize that type of message.

I am not sure whether I missed something from this very long thread but I wanted to add an extra example.
This problem occurs widely, even with VERY common species of mammals. Not only do you get a list of suggested species that don’t occur anywhere on the continent, you also get a “we are pretty sure this is in the family:” suggestion that links you to the wrong family. The family suggestion is obviously derived from the wrongly suggested top species, so even the cautious newbie who wants to be on the safe side fails to assign a correct family ID.
Here is a very fresh observation in which the problem occurs:

https://www.inaturalist.org/observations/24287951

A very common European rodent (Apodemus, family Muridae) is being wrongly assigned to American species (in the genus Peromyscus, which does not occur in Europe) and of course also assigned by the algorithm to the correspondingly wrong family (Arvicolidae).
I correct several of these easily avoidable mistakes and this leads me to think that geography is absolutely not taken into account.
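
For what it’s worth, taking geography into account would not have to be complicated. A minimal sketch, assuming the vision model returns (taxon, score) pairs and that there is some set of taxa already observed nearby; the scores, the penalty factor and the function are all invented for illustration, not anything iNat does as far as I know:

```python
def rerank_by_locality(suggestions, seen_nearby, penalty=0.1):
    """Down-weight suggestions that have never been observed nearby.

    suggestions: list of (taxon, score) pairs from the vision model.
    seen_nearby: set of taxa with existing observations in the area.
    penalty: hypothetical multiplier applied to taxa not seen nearby.
    """
    adjusted = [
        (taxon, score if taxon in seen_nearby else score * penalty)
        for taxon, score in suggestions
    ]
    # Highest adjusted score first.
    return sorted(adjusted, key=lambda pair: pair[1], reverse=True)


# Invented scores for the mouse example above.
raw = [("Peromyscus", 0.6), ("Apodemus", 0.3)]
print(rerank_by_locality(raw, seen_nearby={"Apodemus"}))
```

With these made-up numbers Apodemus ends up on top, even though the image score alone favoured Peromyscus.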

There are quite a lot of observations of Viola sororia in Europe. I guess 99% are not Viola sororia but native European species. Can someone check these observations in order to find the true Viola sororia?
https://www.inaturalist.org/observations?nelat=65&nelng=55&place_id=any&swlat=34&swlng=-11&taxon_id=82816
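
For anyone wanting to check what that search covers, the link can be taken apart with the Python standard library. This only reads the parameters already in the URL; that they describe a bounding box by its south-west and north-east corners is my assumption from their names.

```python
from urllib.parse import parse_qs, urlsplit

url = (
    "https://www.inaturalist.org/observations"
    "?nelat=65&nelng=55&place_id=any&swlat=34&swlng=-11&taxon_id=82816"
)

# parse_qs returns a list per key; each key appears once here, so take the first.
params = {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}

# Assumed meaning: corners of the search rectangle in decimal degrees.
bbox = {key: float(params[key]) for key in ("swlat", "swlng", "nelat", "nelng")}

print(params["taxon_id"])
print(bbox)
```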

Probably most are not V. sororia, and fortunately most are not Research Grade, but it seems that this species has also been introduced in parts of Europe.