Psst - New Vision Model Released!

Not training!! But suggestions, analysing… after training, on the observation. Currently I thought it was using the first photo of an observation for the suggestions and skipping all other photos.
I could not find it, but it seemed I had missed several posts on CV/AI training.

Are there more tips on the way the CV/AI works? (Cropping can definitely improve results.)

===
https://www.inaturalist.org/pages/help#cv-taxa
https://www.inaturalist.org/pages/help#computer-vision
https://www.inaturalist.org/pages/help#cv-select
https://www.inaturalist.org/blog/31806-a-new-vision-model
FWIW, there’s also discussion and some additional charts at
https://forum.inaturalist.org/t/psst-new-vision-model-released/10854/11
https://www.inaturalist.org/pages/identification_quality_experiment
https://www.inaturalist.org/journal/loarie/10016-identification-quality-experiment-update
https://www.inaturalist.org/journal/loarie/9260-identification-quality-experiment-update
about a rare species, but the system might still recommend one based on nearby observations
https://forum.inaturalist.org/t/identification-quality-on-inaturalist/7507
https://github.com/kueda/inaturalist-identification-quality-experiment/blob/master/identification-quality-experiment.ipynb
“nearby” means near in space and time
The model improved on sedges and grasses
the vision model does not itself incorporate non-image data other than taxon IDs
https://www.inaturalist.org/blog/25510-vision-model-updates (“taxon and region comparisons” 20190614)
https://distill.pub/2020/circuits/zoom-in/ (“connections between neurons”)
https://www.inaturalist.org/projects/flora-of-russia/journal/31726
https://forum.inaturalist.org/t/provide-relevant-geographic-data-confidence-level-accuracy-scores-with-ai-suggestions/9226/2
https://forum.inaturalist.org/t/range-covered-by-the-seen-nearby-feature/2849/5


https://www.inaturalist.org/computer_vision_demo
http://www.vision.caltech.edu/publications/publications.html
http://www.vision.caltech.edu/archive.html
https://vision.cornell.edu/se3/
https://vision.cornell.edu/se3/publications/
https://merlin.allaboutbirds.org/
https://sites.google.com/visipedia.org/index/publications

https://forum.inaturalist.org/t/use-computer-vision-to-annotate-observations/3331
https://forum.inaturalist.org/t/what-image-s-are-used-for-training-computer-vision/3307/6

4 Likes

Sorry, my previous suggestion probably went to the wrong place (it's about improving the AI suggestions).
One idea that might be more appropriate here is to allow a smaller number of photos for selected taxa in the AI training. The reasoning is that some taxa are much more distinctive than others, and this might allow some of the rarer taxa to get in. Just an idea - I don't know how much trouble that would be, or if it can even work in your process.

3 Likes

There is a thread on a related topic, using the CV to populate annotations:
https://forum.inaturalist.org/t/use-computer-vision-to-annotate-observations/3331

3 Likes

Hi! I had a couple questions related to the new computer vision model and thought I’d float them here (I’m happy to relocate if there’s a better place).

First, is there any particular reasoning behind only running CV (edit: CV prediction) on the first image in a set? I've started polling the CV on each photo in the uploader prior to merging, but it'd be nice to be able to access this info after the fact to guide identification of my own and others' photos.

Second, I’m revisiting my old non-research-grade observations in light of the new CV, and ran into a quandary. If a species I couldn’t previously identify now shows up with a reasonably strong CV suggestion, I’m tempted to add it. If a user previously suggested that species, this would promote the observation to research grade. I think this matches intent: the two identifications should be independent, because sub-RG observations don’t feed into the model (except those that are sub-RG only because they’re marked cultivated). Still, it feels a bit weird.

2 Likes

On the first question: the CV is trained on all observation photos, but when you get a suggestion on an existing observation it only looks at the first photo. In the uploader, before you merge a group of photos into one card, you can ask for CV suggestions on each card independently, so that’s a way to see whether any of the photos might offer something different.

4 Likes

Suggestions welcome

Figure out a way to train with the location and date

People keep bringing up modifying the input images, but you could also just provide location and date as secondary inputs to the model architecture. For example, if the architecture is currently:

  • [image] → [convolutional layers] → [output layer]

You could make it:

  • [image] → [convolutional layers] → [concatenated layer] → [output layer]
  • [location] → [concatenated layer]

And location could simply be represented by the real value of the latitude and something like the cosine of the longitude after scaling from -pi to pi. (This enables it to wrap around.) You could do similar things to enable dates to wrap, giving a seasonality input value.

edit to add: The wrapping method I gave makes it so that 90W is exactly as far from 0 as 90E, but doesn’t capture that those are opposite from one another. If you use both the sine and the cosine you get both pieces of information. Intuitively, the sine and cosine denote a specific angle that points to a specific location on a circle (where the circle is a latitude line). Likewise, the sine of a date is about 0 at solstice and the cosine is about 0 at equinox, and the pair of them together tells you exactly where in the annual cycle you are.
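For concreteness, here’s a minimal sketch of that encoding in Python. The function name, the latitude scaling, and the 365.25-day year length are my own choices for illustration, not anything from iNaturalist’s actual code:

```python
import math

def location_date_features(lat_deg, lon_deg, day_of_year):
    """Hypothetical encoder: latitude stays a plain scaled value,
    while longitude and date become (sin, cos) pairs so they wrap."""
    lon = math.radians(lon_deg)                      # degrees -> [-pi, pi]
    year_angle = 2 * math.pi * day_of_year / 365.25  # date -> angle on the annual cycle
    return [
        lat_deg / 90.0,                              # latitude, scaled to [-1, 1]
        math.sin(lon), math.cos(lon),                # longitude as a point on a circle
        math.sin(year_angle), math.cos(year_angle),  # date as a point on a circle
    ]

# Two points on opposite sides of the date line come out as near-identical
# feature vectors, which is exactly the wrap-around behavior we want:
print(location_date_features(52.0, 179.9, 172))
print(location_date_features(52.0, -179.9, 172))
```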

I don’t know how useful this is / don’t have citations for its use, but it should be easy enough to try and test on a small dataset to see what happens.
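If anyone wants to play with it, here’s a rough PyTorch sketch of the concatenation idea, with a toy convolutional trunk and made-up layer sizes standing in for a real vision model (none of this reflects iNaturalist’s actual architecture):

```python
import torch
import torch.nn as nn

class ImagePlusLocationNet(nn.Module):
    """[image] -> conv layers -> concat with [location/date] -> output."""
    def __init__(self, num_taxa: int, num_aux_features: int = 5):
        super().__init__()
        # Toy stand-in for the convolutional trunk of a real vision model.
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # Image features and the auxiliary inputs meet in one concatenated layer.
        self.head = nn.Linear(16 + num_aux_features, num_taxa)

    def forward(self, image, aux):
        x = torch.cat([self.conv(image), aux], dim=1)  # the "concatenated layer"
        return self.head(x)

# Usage: a batch of 2 images plus 5 location/date features each
# (e.g. the output of location_date_features above).
model = ImagePlusLocationNet(num_taxa=1000)
logits = model(torch.randn(2, 3, 224, 224), torch.randn(2, 5))
print(logits.shape)  # torch.Size([2, 1000])
```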

4 Likes

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.