Is the CV model performing multi-label classification for shared media?

A given photograph (or other media) can participate in multiple observations when it evidences multiple distinct organisms.

And with the current computer vision model being a type of convolutional neural network that is clearly performing multi-class classification, I’ve wondered whether its predicted vectors are encoded as single-label or multi-label. From a programming perspective, it is feasible to adapt such a model from single-label to multi-label. Note that in machine learning, multi-class classification is not the same thing as multi-label classification: multi-class means choosing among more than two classes, while multi-label means a single input can carry several labels at once.
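To make the distinction concrete, here is a minimal sketch of how the same output scores could be read either way. It is not iNaturalist's actual model code; the taxon list, the logits, the function names, and the 0.5 threshold are all made up for illustration. A single-label head typically applies a softmax and takes the top class, while a multi-label head applies an independent sigmoid to each class and keeps everything above a threshold.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 (single-label reading)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Squash one raw score into an independent probability (multi-label reading)."""
    return 1.0 / (1.0 + math.exp(-z))

def single_label_prediction(logits, taxa):
    """Pick exactly one taxon: the one with the highest softmax probability."""
    probs = softmax(logits)
    best = max(range(len(taxa)), key=lambda i: probs[i])
    return taxa[best]

def multi_label_prediction(logits, taxa, threshold=0.5):
    """Pick every taxon whose independent probability clears the threshold."""
    return [t for t, z in zip(taxa, logits) if sigmoid(z) >= threshold]

# Hypothetical taxa and raw model outputs for one photo of a bee on a thistle.
taxa = ["Apis mellifera", "Cirsium vulgare", "Danaus plexippus"]
logits = [2.0, 1.5, -3.0]

print(single_label_prediction(logits, taxa))
print(multi_label_prediction(logits, taxa))
```

With these made-up scores the single-label reading returns only the bee, while the multi-label reading returns both the bee and the thistle, because the sigmoid scores do not compete with each other the way softmax probabilities do.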

The multi-label aspect would allow for predicting nested taxa (which is perhaps already being done), but would also allow a single photo to be accurately classified as containing multiple distinct taxa. It is the latter that I am particularly curious about.

Regardless of whether multi-label is already being done, the majority of photos have only a single label. Identifying each decent-quality observation is an enormous human effort on its own, so I am not asking for more labels. Rather, I am curious whether the machine learning side makes use of that minority of photos shared by multiple observations, via multi-label classification.
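As a rough sketch of what "making use of" shared photos could look like on the data side: group the taxa of all observations that point at the same photo, then encode each photo's label set as a multi-hot vector. The `(photo_id, taxon)` records and the function names below are hypothetical and do not reflect iNaturalist's export format or training pipeline.

```python
from collections import defaultdict

# Hypothetical (photo_id, taxon) pairs; photo-1 is shared by two observations.
observation_photos = [
    ("photo-1", "Apis mellifera"),
    ("photo-1", "Cirsium vulgare"),
    ("photo-2", "Danaus plexippus"),
]

def labels_by_photo(pairs):
    """Collect the set of taxa attached to each photo across all observations."""
    grouped = defaultdict(set)
    for photo_id, taxon in pairs:
        grouped[photo_id].add(taxon)
    return dict(grouped)

def multi_hot(labels, taxa):
    """Encode a label set as a 0/1 vector aligned with the ordered taxon list."""
    return [1 if t in labels else 0 for t in taxa]

taxa = sorted({t for _, t in observation_photos})
grouped = labels_by_photo(observation_photos)
targets = {pid: multi_hot(lbls, taxa) for pid, lbls in grouped.items()}
shared = [pid for pid, lbls in grouped.items() if len(lbls) > 1]

print(taxa)
print(targets)
print(shared)
```

A photo used by only one observation ends up with a one-hot vector, so single-label data is just a special case of this encoding.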

There seems to be a lot of folklore on iNat about what “confuses the AI”. I don’t want to debate whether multi-label classification will do this, or even what such a phrase means when you get down to the mathematics, in this thread.


I don’t have any comment on the technical side of this, but I think this is an exciting idea. I think a good place to try multi-label classification would be for insects on flowers. There are a large number of observations with both insect taxa and flower taxa labeled.

I’d be excited to see this developed or perhaps it’s out there already.


I think the number of photos that are used by more than one observation currently is vanishingly small. Also, when there have been discussions in the forum about identifying multiple species within a photo or observation, the consistent response has been that the intention of an iNat observation is to describe the interaction with one specific organism at a particular place and time.

I can see benefits from adopting a data structure and CV model that supports multiple organisms per image. I’m almost certain this isn’t happening, and I would guess that iNat staff might put this quite a long way down their priority list.


Each iNaturalist observation is a record of an encounter with an individual organism at a particular time and location. (from the FAQ)

Our vision model attempts to assist the user in this task of documenting an encounter with an individual organism, and is trained on iNaturalist user photos which overwhelmingly provide only a single taxonomic label.

Since neither our primary use case nor our dataset is well optimized for it, we do not take the extra effort to train a multi-label classification head for our models.


I agree, there are exciting aspects to this from both an iNat perspective and a machine learning perspective. I often find that a given photo provides evidence of multiple organisms, despite the FAQ definition of an observation. It is difficult to photograph one-and-only-one organism, although controlling depth of field and composition where possible can help emphasize the main subject. Your example of insects on flowers nicely illustrates this difficulty.

I also appreciate the enormous effort that would be required to supply multiple identifications per photograph. It is multiplicatively more difficult than the already enormously difficult task of community sourcing a single ID per observation. And as some have mentioned in this thread, including myself in the original question, most of the data doesn’t have these additional labels at present.

I have a lot on my plate in the near future, but I have added constructing a demo of this to my list of potential side projects (which is admittedly already too long, but it is good to have a diversity of project ideas on hand). This would be a great hobby project for someone interested in getting some experience with CNNs in particular or deep learning more generally. It may also be an opportunity to use semi-supervised learning to combat the low number of multi-labelled photos. And if this has already been demoed, I’d love to see links to such projects posted before this thread closes!
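One simple semi-supervised idea that could fit here is pseudo-labelling: keep the human-supplied label for each photo, and add any extra taxon that a first-pass model scores very confidently. The sketch below only shows that label-merging step; the function name, the scores, and the 0.95 threshold are assumptions for illustration, not results from any real model.

```python
def pseudo_label(existing, scores, threshold=0.95):
    """Keep the human labels and add taxa the model scores at or above threshold.

    existing: set of human-supplied taxon names for one photo
    scores:   dict mapping taxon name to an independent model probability
    """
    added = {t for t, s in scores.items() if s >= threshold and t not in existing}
    return set(existing) | added

# Hypothetical photo labelled only as a bee, with made-up model scores.
human_labels = {"Apis mellifera"}
model_scores = {
    "Apis mellifera": 0.99,
    "Cirsium vulgare": 0.97,
    "Danaus plexippus": 0.10,
}

print(sorted(pseudo_label(human_labels, model_scores)))
```

A high threshold trades coverage for precision, which seems like the safer direction when the added labels have never been checked by a person.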

@andy71 that is a cool Field! I didn’t know about that…I’m going to start using it!

Is it possible to get the list of all of the Nectar / Pollen delivering plants for a given taxon? The HTML you provided just shows organisms that have that field, not what the field's value is (without clicking through to each observation).

Also, would it be possible to do the reverse search, i.e. starting with the plant and searching for all of the organisms that list it as a Nectar / Pollen delivering plant?

Probably should start a new topic for the API questions. I think the things you mentioned should be possible. For instance getting all observations with Cirsium vulgare as a nectar plant:

These new species interaction fields on observations are very exciting. I think some ecologists are beginning to explore these as a data set.

