What Image(s) Are Used for "Training" Computer Vision?

Nice explanation @marykrieger, spot on. Bottom line: the more images of a taxon, the better.

However, it’s important to remember that our computer vision model is not trained to recognize a certain taxon - it is trained to recognize iNaturalist photos of a certain taxon. Most photos on iNaturalist are taken by amateurs using smartphones or other consumer-grade equipment, and the organisms are almost always in situ. If you try to use computer vision on pinned insects, for example, it probably won’t work that well because most of the images it’s been trained on are in situ. It won’t have many reference images of insects on a white background.

So while uploading close-ups of small diagnostic areas can’t hurt, and is great for IDers to evaluate, I’m not sure how much of a help they’ll be for identifying most iNat photos with computer vision.

Each image that is used for training is randomly cropped and also rotated and/or flipped, and the model is trained on those variants as well, so we're able to get a few more training images out of each original. But as I said above, more is better, so the more angles you can provide, the better. Remember, though, that iNat isn't primarily an image recognition training community, so there's no need to overload here.
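
To make the crop/rotate/flip idea concrete, here's a rough sketch in plain Python. To be clear, this is not our actual training code: the function names, the toy 4x4 "image", the 90-degree rotation steps, and the 50% flip chance are all assumptions I picked just for illustration.

```python
import random

def random_crop(image, crop_h, crop_w, rng):
    """Return a crop_h x crop_w window taken at a random position."""
    h, w = len(image), len(image[0])
    top = rng.randint(0, h - crop_h)    # randint is inclusive on both ends
    left = rng.randint(0, w - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]

def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def augment(image, crop_size, rng):
    """Produce one variant: random square crop, random rotation, maybe a flip.

    Assumption: rotations are multiples of 90 degrees and the flip happens
    half the time. A real pipeline may use different choices.
    """
    out = random_crop(image, crop_size, crop_size, rng)
    for _ in range(rng.randint(0, 3)):
        out = rotate90(out)
    if rng.random() < 0.5:
        out = flip_horizontal(out)
    return out

# Toy 4x4 "photo" where each pixel is just a unique number.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
rng = random.Random(0)  # seeded so the run is repeatable

# Several training variants derived from the one original image.
variants = [augment(image, 3, rng) for _ in range(5)]
for v in variants:
    print(v)
```

Each variant is a slightly different view of the same photo, which is how one upload can turn into several training examples. Note that none of this invents new detail: every pixel in a variant still comes from the original image.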

I’m not exactly sure what you mean here. Aren’t those mainly identified by sound?
