Why not empower recognised experts?

Thanks for this @tiwane.
I’ll think about possible feature requests leading from all this, as @bouteloua suggested …

I’ve really appreciated hearing everyone’s thoughts…and could probably continue debating aspects of this for a long time yet :)

I’m really just trying to reflect back what I hear from the UK community mainly…it frustrates me when I hear iNaturalist being denigrated or ignored, when to me it seems a far superior platform to the ones the recording schemes currently use. I just wish there were more integration of the UK expertise into the community here. But…perhaps I just need to be patient, also.

While I’m at it… the point you were referring to was perhaps the response by @upupa-epops

This comment from kueda on the blog post helps clarify this:

Training data gets divided into three sets:

Training: these are the labeled (i.e. identified) photos the model trains on, and include photos from observations that

- have an observation taxon or a community taxon
- are not flagged
- pass all quality metrics except wild / naturalized (i.e. we include photos from captive obs; note that “quality metrics” are the things you can vote on in the DQA, not aspects of the quality grade like whether or not there’s a date or whether the obs is of a human)

Validation: these photos are used to evaluate the model while it is being trained. These have the same requirements as the Training set except they represent only about 5% of the total

Test: these photos are used to evaluate the model after training, and only include observations with a Community Taxon, i.e. observations that have a higher chance of being accurate b/c more than one person has added an identification based on media evidence

You’ll note that we’re potentially training on dubiously-identified stuff, but we are testing the results against less-dubious stuff (you can see what these results look like in the “Model Accuracy” section of https://forum.inaturalist.org/t/identification-quality-on-inaturalist/7507). The results are, strangely, not so bad. Ways we might train on less-dubious stuff (say, CID’d obs only, ignore all vision-based IDs, ignore IDs by new users, ignore IDs by users with X maverick IDs) all come with tradeoffs and all, ultimately, limit the amount of training data, which I’m guessing would be a bad thing at this point for the bulk of taxa for which we have limited photos.
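For anyone trying to picture how those criteria fit together, here's a rough sketch of the split logic as I read it. All field names, metric keys, and the test fraction are my own guesses for illustration; the real pipeline isn't public, and this is just the filtering rules from kueda's comment written out:

```python
import random

def split_observations(observations, test_fraction=0.2, validation_fraction=0.05):
    """Partition observation photos into train / validation / test sets
    following the criteria described above. Field names ("observation_taxon",
    "community_taxon", "flagged", "quality_metrics") and test_fraction are
    hypothetical; only the ~5% validation share comes from the comment."""
    train, validation, test = [], [], []
    for obs in observations:
        # Eligible at all: labeled (observation taxon OR community taxon),
        # not flagged, and passing every DQA quality metric except
        # wild/naturalized (so captive obs are allowed in).
        labeled = obs.get("observation_taxon") or obs.get("community_taxon")
        metrics_ok = all(passed for metric, passed in
                         obs.get("quality_metrics", {}).items()
                         if metric != "wild")
        if not labeled or obs.get("flagged") or not metrics_ok:
            continue
        # Test set: only observations with a Community Taxon, i.e. more
        # than one person agreed on the ID.
        if obs.get("community_taxon") and random.random() < test_fraction:
            test.append(obs)
        # Validation has the same requirements as training but is only
        # ~5% of the remainder.
        elif random.random() < validation_fraction:
            validation.append(obs)
        else:
            train.append(obs)
    return train, validation, test
```

The key point the sketch tries to capture is the asymmetry: training and validation can draw on single-identifier (potentially dubious) observations, while the held-out test set is restricted to community-confirmed ones.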

I’m not sure if I’ve read this detail before… I certainly seem to constantly forget aspects of it at least! Some more things to stick in an FAQ somewhere perhaps?
