What is the current computer vision model, really?

galenseilis · September 8, 2022, 5:29am

As someone who trains machine learning models, I am curious about what model architecture is currently being used for computer vision in iNaturalist.

Because images are a classic example of a feature space that can have translational symmetry, it would be genuinely surprising to me if the model did not make use of convolutional layers. Mind you, the convolution operator is not invariant to translations; rather, it is translation equivariant (see https://arxiv.org/abs/2104.13478). Is it something straightforward like a CNN classifier?
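To make the invariant/equivariant distinction concrete, here is a minimal sketch in plain Python. It assumes a 1D signal with wrap-around (circular) boundaries, and the signal, kernel, and shift values are made up for illustration. Real CNN layers use padding and strides, so the property only holds approximately near image borders.

```python
def circular_shift(signal, k):
    """Shift a list k positions to the right, wrapping around the ends."""
    n = len(signal)
    return [signal[(i - k) % n] for i in range(n)]


def circular_conv(signal, kernel):
    """Circular cross-correlation, the sliding-window operation CNN layers use."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


# Illustrative values only (not taken from any iNaturalist code).
signal = [0, 1, 3, 2, 0, 0, 5, 1]
kernel = [1, -2, 1]
shift = 3

conv_then_shift = circular_shift(circular_conv(signal, kernel), shift)
shift_then_conv = circular_conv(circular_shift(signal, shift), kernel)

# Equivariance: shifting the input shifts the output by the same amount.
equivariant = conv_then_shift == shift_then_conv

# Invariance would mean the output does not change at all, which is not the case.
invariant = circular_conv(signal, kernel) == shift_then_conv

# A global pooling step on top of the convolution does give invariance.
pooled_equal = max(circular_conv(signal, kernel)) == max(shift_then_conv)

print(equivariant, invariant, pooled_equal)
```

Here `equivariant` and `pooled_equal` hold while `invariant` does not, which is why convolution followed by some form of global pooling is a common route to (approximately) translation-invariant classifiers.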

Or is it a single model? I could also imagine a collection of models specialized to different levels or clades of the tree of life. I would also be delighted to learn of a single model structure that nicely handles the hierarchy (i.e. partial order) that is the tree of life.

What is the actual computer vision model beyond a generic name of “the AI”?

anon93074988 · September 8, 2022, 7:09am

More information here.

https://www.inaturalist.org/blog/69193-new-computer-vision-model/

thebeachcomber · September 8, 2022, 7:20am

also previous (but possibly now outdated) info at:

https://forum.inaturalist.org/t/computer-vision-update-july-2021/24728

https://www.inaturalist.org/blog/63931-the-latest-computer-vision-model-updates
https://forum.inaturalist.org/t/new-computer-vision-model-released/31030

https://forum.inaturalist.org/t/new-vision-model-training-started/27378
https://www.inaturalist.org/blog/59122-new-vision-model-training-started

cthawley · September 8, 2022, 12:22pm

This thread/request also has relevant info: https://forum.inaturalist.org/t/computer-vision-should-tell-us-how-sure-it-is-of-its-suggestions/1230

@alex may be able to answer best.

Side note that iNat staff generally prefer “computer vision” or CV over “the AI”.

galenseilis · September 8, 2022, 3:10pm

Thanks to the links shared by others, I eventually made my way to the iNaturalist GitHub. The inatVisionTraining repository appears relevant, although I am unsure whether it reflects the current state of the model used in iNaturalist.

The file https://github.com/inaturalist/inatVisionTraining/blob/main/nets/nets.py appears to have the relevant code for instantiating models. The main chunk of the model is Xception, which involves something called “depthwise separable convolutions” (I have not read the paper yet). The output of Xception is then put through a global average pooling layer, then a dropout layer, then a dense layer (i.e. like you would find in a perceptron model), and then a softmax layer.
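As a rough illustration of that pooling, dropout, dense, softmax sequence, here is a minimal sketch in plain Python. It is not the code from nets.py: the random `features` array merely stands in for the Xception output, and the sizes, dropout rate, and weights are hypothetical values chosen to keep the example small.

```python
import math
import random


def global_average_pool(feature_map):
    """Average each channel over all spatial positions: (H, W, C) -> (C,)."""
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    return [
        sum(feature_map[y][x][c] for y in range(height) for x in range(width))
        / (height * width)
        for c in range(channels)
    ]


def dropout(vector, rate, training, rng):
    """Inverted dropout: randomly zero units in training, identity at inference."""
    if not training:
        return list(vector)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in vector]


def dense(vector, weights, biases):
    """Fully connected layer: one weighted sum (logit) per output class."""
    return [
        sum(w * v for w, v in zip(row, vector)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(logits):
    """Turn logits into probabilities that sum to 1 (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical sizes; the real backbone output and class count are far larger.
rng = random.Random(0)
H, W, C, NUM_CLASSES = 4, 4, 8, 5
features = [[[rng.random() for _ in range(C)] for _ in range(W)] for _ in range(H)]
weights = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(NUM_CLASSES)]
biases = [0.0] * NUM_CLASSES

pooled = global_average_pool(features)
kept = dropout(pooled, rate=0.5, training=False, rng=rng)
probs = softmax(dense(kept, weights, biases))
print(probs)
```

Dropout only does anything during training; at prediction time it passes values through unchanged, so the suggestions come from pooling, one dense layer, and softmax applied to the backbone features.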

Once I have read the paper, I think the GitHub repo will give me a much clearer picture of what the computer vision model actually is.

system · November 7, 2022, 3:11pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
