I was surprised when I first learned (in October 2020) which observations are eligible for training, and I still am. People often assume that only Research Grade observations are eligible (this surfaced in a recent forum thread), but unless something has recently changed, that is not the case. An observation is eligible to be in the training set if it has an accurate location, an accurate observed date, and a photo. For details, see this thread: How can I search for observations eligible to be in the training set?
The purpose of the current thread is twofold: first, to remind people of this fact; and second, to make a further plea for a straightforward way to search for observations that are eligible to be in the training set. Unless I’m missing something, the Identify tool should default to this set of observations.
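To make the rule concrete, here is a minimal Python sketch of the eligibility check as described above (accurate location, accurate observed date, and a photo). The `Observation` class and all of its field names are hypothetical, made up for illustration; they are not iNaturalist's data model or API, and the real selection process may apply further criteria.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Observation:
    # Hypothetical record: these field names are illustrative assumptions,
    # not iNaturalist's actual schema.
    obs_id: int
    has_photo: bool
    location_accurate: bool
    date_accurate: bool
    research_grade: bool = False


def is_training_eligible(obs: Observation) -> bool:
    # Criteria as stated in the post: a photo, an accurate location,
    # and an accurate observed date. Research Grade is not consulted.
    return obs.has_photo and obs.location_accurate and obs.date_accurate


def eligible_ids(observations: Iterable[Observation]) -> List[int]:
    # Keep only the IDs of observations that pass the check, in input order.
    return [o.obs_id for o in observations if is_training_eligible(o)]


sample = [
    Observation(1, has_photo=True, location_accurate=True, date_accurate=True, research_grade=True),
    Observation(2, has_photo=True, location_accurate=True, date_accurate=True),
    Observation(3, has_photo=False, location_accurate=True, date_accurate=True),
    Observation(4, has_photo=True, location_accurate=False, date_accurate=True),
]

print(eligible_ids(sample))  # [1, 2]
```

Observation 2 is the case people tend to overlook: it is not Research Grade, yet under the rule as stated it still counts as eligible.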
Related question: as taxa accumulate more Research Grade observations, are the observations in those taxa that are not Research Grade removed from the training set?
That’s a good question. As far as I know, the training set for any given run is selected at random from all eligible observations. So, for better or for worse, the answer to your question is no.
This is incorrect as far as I know. I don’t know what the exact selection criteria are, but staff have said that certain types of observations are prioritized over others when many pictures are available. See some info in this post and throughout the thread: https://forum.inaturalist.org/t/how-are-photos-selected-for-cv-training/42403/21