How are photos selected for CV training?

If you want the model to actually learn this taxon, then your best bet is to be patient.

I see that you’re already responsible for 50% of the observations of that taxon. Based on that, the vision model may learn things about how you take photographs (your camera model, lighting conditions, preferred focal distance, etc.) instead of learning the visual features of the organism. If this kind of situation becomes common enough and begins causing problems for the models that we can detect during evaluation, then I think we may have to further complicate the export criteria to require a certain number of photographers or identifiers.

I fear this may already be a problem. For example, see https://www.inaturalist.org/taxa/387943-Costelytra-brunnea which has 100 observations, but only 3 observers and 3 identifiers. 98% of the observations were made by one person, and the vast majority of additional identifications were made by a student colleague of the observer. Almost all of the observations were made in a two-month period of 2021. This is in no way meant to disparage the expertise of the observer or the identifier, but it is a very narrow band of observer and identifier expertise on which to train a computer vision system that offers suggestions to other people.
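
To make the concern concrete, here is a rough Python sketch of the kind of check a "certain number of photographers" requirement might involve. The function names and the thresholds (`min_observers=5`, `max_top_share=0.5`) are made up for illustration and are not iNaturalist's actual export criteria. The sample data is synthetic, shaped only to match the counts quoted above.

```python
from collections import Counter


def observer_concentration(observer_ids):
    """Return (distinct_observers, top_observer_share) for one taxon.

    observer_ids holds one entry per observation, naming who made it.
    """
    counts = Counter(observer_ids)
    if not counts:
        return 0, 0.0
    top_count = counts.most_common(1)[0][1]
    return len(counts), top_count / len(observer_ids)


def passes_diversity_check(observer_ids, min_observers=5, max_top_share=0.5):
    """Hypothetical rule: enough distinct observers, and no single one dominating."""
    distinct, top_share = observer_concentration(observer_ids)
    return distinct >= min_observers and top_share <= max_top_share


# Synthetic data: 100 observations, 3 observers, one of whom made 98 of them.
example = ["observer_a"] * 98 + ["observer_b"] + ["observer_c"]

print(observer_concentration(example))   # (3, 0.98)
print(passes_diversity_check(example))   # False
```

The same counting could be applied to identifiers instead of observers. Where to set the thresholds is a judgment call that would depend on what shows up during model evaluation.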
