Learning more about iNat's computer vision


I would like to read the release notes for the computer vision. What has changed and improved in the CV model since June 2017?
I did not know that only the first photo is used for CV instead of the best photo.

And what about the performance cost? Is it bad for iNaturalist to run the CV by default, or would it be better to run it only on request? I heard that 10 graphics cards in a server were enough to get a good response time, and I was wondering whether that changes over time.

No, see this related feature request:

Draw bounding box on photo for computer vision

I didn’t know that either. I seem to recall that the original iNat CV demo page allowed multiple photos, so I assumed it would evaluate all of them and combine the results.

(Just checked https://www.inaturalist.org/computer_vision_demo, and I remembered wrong - it only allowed one photo)


That explains the occasional bad results. The other CV tools all use multiple photos as far as I know (Naturalis and the French one, Pl@ntNet).




To me it feels like once the computer vision knows a species well, it is really good at recognizing that species, but really bad at recognizing what is not that species.

So for example, if you post a photo of Toxomerus marginatus (a small, common, colourful pollinator in North America), it will get it right almost every time, but if you post a completely different-looking small fly, bee, or wasp, it might also call it Toxomerus marginatus just because it’s also a small pollinating insect. This is a problem because most small insects are too difficult to identify, so they never collect enough species-level identifications for the computer vision to learn them (20 RG observations by 20 different users, I believe).
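
To make that threshold concrete, here is a rough sketch of how such a cutoff could be checked. It only illustrates the rule as I remember it (20 research-grade observations from 20 different users), not how iNaturalist actually selects taxa. The function name, the tuple layout, and the "research" grade label are all made up for this example.

```python
from collections import defaultdict

# Assumed thresholds, taken from the rule of thumb quoted above.
MIN_OBSERVATIONS = 20
MIN_OBSERVERS = 20


def taxa_meeting_threshold(observations,
                           min_obs=MIN_OBSERVATIONS,
                           min_users=MIN_OBSERVERS):
    """Return the set of taxa with enough research-grade observations.

    `observations` is an iterable of (taxon, user, quality_grade) tuples.
    This layout is hypothetical, chosen only to keep the example small.
    """
    counts = defaultdict(int)   # research-grade observations per taxon
    users = defaultdict(set)    # distinct observers per taxon
    for taxon, user, grade in observations:
        if grade != "research":
            continue  # only research-grade observations count
        counts[taxon] += 1
        users[taxon].add(user)
    return {
        taxon for taxon in counts
        if counts[taxon] >= min_obs and len(users[taxon]) >= min_users
    }
```

Under this sketch a taxon with plenty of observations from only a handful of people still would not qualify, and that is the situation many hard-to-identify small insects end up in.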
