Casual observations in the CV training set

I understand from reading older forum topics (e.g. this) that captive/cultivated observations were, at least in the past, included in the training set for the computer vision algorithm. Are they still part of it? And if so, what is the criterion for their inclusion? (E.g. must they have a community ID?) Are other casual observations included as well, or just the captive/cultivated subset?

I’m asking because of the seemingly growing habit of making observations casual as a way to remove them from the Needs ID pool without making any attempt to correct wrong IDs on them. It seems a lot of identifiers treat them as out-of-sight = out-of-mind, and I can sense a certain “this should be casual so it doesn’t deserve my time to ID it” attitude. Some will go so far as to criticize those of us who do occasionally add IDs to things like potted plants for ‘wasting our time’ on these.

However, if casual/captive observations are still used for CV training, that would be a pretty strong argument for why they shouldn’t be totally ignored in the ID flow, and for how correcting wrong IDs on them can help improve iNat for everyone.

I think the better solution would be not to use casual or captive observations for CV training. In many cases, the captive or cultivated forms of a taxon look quite different from the wild form.

Also, there’s been some discussion about including captive observations in the Needs ID pool, and I think the general outcome was that this should be an option, i.e. people interested in IDing captive observations should be able to click something to include them. Under current iNaturalist policies, however, they should not be in Needs ID. Marking observations as captive in order to kick them out of the pool is the correct action. Objecting to someone else IDing captive observations seems pretty unhelpful, though.

In the iNat app for Android, there’s an option to include non-nearby taxa which helps a lot for common non-native, captive/cultivated types. I think the issue with completely excluding captive/cultivated is that they’re wild somewhere, so there would need to be some kind of geo-fencing for each taxon.

Granted, there are cultivars that are so far removed from their natural origins that this argument breaks down. For example, if enough observers added GloFish, genetically modified aquarium fish that fluoresce under black lights, it could mess up CV suggestions.

Deviating a little, would GloFishes need to be marked as Homo sapiens?

I understand the frustration; I feel similarly about people who mark unknowns captive without adding any ID. However, for observations with an incorrect ID, it’s worth noting that some people can recognize a plant as captive without having the foggiest idea what kind of plant it is, and therefore no idea that the ID is bad. (Just assuming you meant plants here and not dogs and goldfish.)

Marking plants as cultivated often does not require knowing what species they are! Clues include the pot they’re in, the straight edge of the sidewalk garden interface, irrigation lines, and mulch.

I understand that some folks are less familiar with plants although I would think they at least recognize it’s a plant. This is not limited to plants though. I see a lot of ‘unknowns’ with very obvious IDs that were marked captive by someone other than the observer, including cats, dogs etc. Those unknowns do not feed into CV training though. They are probably just frustrating for the person who posted them, but that’s a different issue.

The fact that captive/cultivated observations get a lot less ID attention to correct misidentifications may be more of an issue if they are included in the CV training set. These seem to persist even longer than misidentifications that somehow made it to RG. I found mistakes that were obvious to me (and probably anyone with minimal gardening knowledge) with community IDs that were left uncorrected for 10+ years. Especially for plants, there seems to be a lot of blindly trusting the CV suggestions and many non-native garden/house plants receiving IDs for the closest ‘nearby’ look-alike.

Would you say these usually have a community ID or just a single ID?

I feel it depends somewhat on location (e.g. city vs. rural) and the prevalence of school projects etc. A lot of captive observations never get confirmed and stay at a single ID. But if, for example, students confirm each other’s IDs using CV suggestions, there may be a bunch of them with community IDs, and a high likelihood that someone just clicked them all casual without a second look, presumably because of personal bias against such projects.

Complicated by the fact that, if the obs is not marked as Cultivated first, iNat bravely tries its best to offer a ‘seen nearby’ wild taxon if at all ‘possible’.

Yes, exactly! And most of these are ‘committed’ by new users, who may, or more likely may not, know that they can toggle the “seen nearby” setting on and off. I suspect the cycle goes:

  1. newbie observer of plant A checks CV suggestions and picks most similar (plant B seen nearby)
  2. newbie identifier checks CV suggestions and confirms plant B ID for plant A (now with community ID)
  3. someone ‘cleaning up’ RG obs sees it’s cultivated and marks it captive without taking a look at ID (makes it less likely to ever be corrected now)
  4. plant A pictures with plant B community ID get pulled into the next training set, shifting the CV profile for plant B to look closer to plant A
  5. observer posting another plant A now gets more confident CV suggestions for it being plant B
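
To make the loop above a bit more concrete, here is a toy sketch in Python. It is not how iNat’s training pipeline actually works. The function name, every parameter, and all the numbers are made up for illustration, and the idea that confusion rises linearly with the share of wrong photos in plant B’s training set is purely my assumption.

```python
def simulate_feedback(rounds, new_a_per_round, true_b_photos,
                      initial_confusion, correction_rate):
    """Toy model of the plant A / plant B loop (all names hypothetical).

    Returns, for each training round, the share of plant-A photos
    sitting in plant B's training set.
    """
    wrong_in_b = 0.0               # plant-A photos currently labeled as B
    confusion = initial_confusion  # chance a new plant A gets IDed as B
    history = []
    for _ in range(rounds):
        # Steps 1-2: some new plant-A observations are IDed and confirmed as B.
        mislabeled = new_a_per_round * confusion
        # Step 3: only a fraction of those ever get corrected.
        wrong_in_b += mislabeled * (1.0 - correction_rate)
        # Step 4: the uncorrected ones end up in B's training photos.
        contamination = wrong_in_b / (wrong_in_b + true_b_photos)
        # Step 5 (assumption): confusion grows with contamination.
        confusion = initial_confusion + (1.0 - initial_confusion) * contamination
        history.append(contamination)
    return history


# Same starting point, but different amounts of ID attention.
ignored = simulate_feedback(5, 100, 1000, 0.2, correction_rate=0.1)
attended = simulate_feedback(5, 100, 1000, 0.2, correction_rate=0.9)
print("mostly ignored:  ", [round(x, 3) for x in ignored])
print("mostly corrected:", [round(x, 3) for x in attended])
```

In this toy model the only knob that differs between the two runs is how many wrong IDs get fixed, which is the part identifiers control when they decide whether to look at casual observations at all.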

I asked because: https://www.inaturalist.org/pages/help#cv-taxa
However, just this week I heard staff say they changed it from one hundred observations to one hundred photos, so the help page is clearly outdated.

I think that they could be used for CV training, and in fact may be a great way of doing so. I take casual observations of my garden plants just to compare how they grow and what they look like at different times. I cannot give that level of detail to a wild plant that I just happen to be passing on a hike.

But I think the greater reason for not ignoring captive observations is because the first and foremost point of iNaturalist is to get people who would not ordinarily be interested in the natural world interested. And that means that if your average Joe or Jane downloads the app and heads outside and takes a picture of the first thing they see, it may be very likely to be captive or cultivated, and they may not realize the nuance of such a thing. And you can either mark as casual without helping at all, and they will think it’s a lame app and never return, and then you’ve lost that opportunity. Or you could attempt to identify, or at least explain how iNat works, and perhaps - just perhaps - you might hook another person into making a few more crummy observations, which might eventually grow into better observations, which might actually turn into someone who cares about what’s happening in the little spaces around our daily lives.

And we have no way to see how far a ‘pending’ taxon is from reaching 100 photos.
