Why not empower recognised experts?

Thanks for this @tiwane.
I’ll think about possible feature requests leading from all this, as @bouteloua suggested …

I’ve really appreciated hearing everyone’s thoughts…and could probably continue debating aspects of this for a long time yet :)

I’m really just trying to reflect back what I hear from the UK community mainly…it frustrates me when I hear iNaturalist being denigrated or ignored, when to me it seems a far superior platform to the ones the recording schemes currently use. I just wish there were more integration of the UK expertise into the community here. But…perhaps I just need to be patient, also.

While I’m at it… the point you were referring to was perhaps the response by @upupa-epops

This comment from kueda on the blog post helps clarify this:

Training data gets divided into three sets:

Training: these are the labeled (i.e. identified) photos the model trains on, and include photos from observations that

- have an observation taxon or a community taxon
- are not flagged
- pass all quality metrics except wild / naturalized (i.e. we include photos from captive obs; note that “quality metrics” are the things you can vote on in the DQA, not aspects of the quality grade like whether or not there’s a date or whether the obs is of a human)

Validation: these photos are used to evaluate the model while it is being trained. These have the same requirements as the Training set except they represent only about 5% of the total

Test: these photos are used to evaluate the model after training, and only include observations with a Community Taxon, i.e. observations that have a higher chance of being accurate b/c more than one person has added an identification based on media evidence

You’ll note that we’re potentially training on dubiously-identified stuff, but we are testing the results against less-dubious stuff (you can see what these results look like in the “Model Accuracy” section of https://forum.inaturalist.org/t/identification-quality-on-inaturalist/7507). The results are, strangely, not so bad. Ways we might train on less-dubious stuff (say, CID’d obs only, ignore all vision-based IDs, ignore IDs by new users, ignore IDs by users with X maverick IDs) all come with tradeoffs and all, ultimately, limit the amount of training data, which I’m guessing would be a bad thing at this point for the bulk of taxa for which we have limited photos.
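For anyone trying to picture how those criteria fit together, here's a rough sketch of the split logic as I read it. All field names, metric keys, and the test fraction are my own guesses for illustration; the real pipeline isn't public, and this is just the filtering rules from kueda's comment written out:

```python
import random

def split_observations(observations, test_fraction=0.2, validation_fraction=0.05):
    """Partition observation photos into train / validation / test sets
    following the criteria described above. Field names ("observation_taxon",
    "community_taxon", "flagged", "quality_metrics") and test_fraction are
    hypothetical; only the ~5% validation share comes from the comment."""
    train, validation, test = [], [], []
    for obs in observations:
        # Eligible at all: labeled (observation taxon OR community taxon),
        # not flagged, and passing every DQA quality metric except
        # wild/naturalized (so captive obs are allowed in).
        labeled = obs.get("observation_taxon") or obs.get("community_taxon")
        metrics_ok = all(passed for metric, passed in
                         obs.get("quality_metrics", {}).items()
                         if metric != "wild")
        if not labeled or obs.get("flagged") or not metrics_ok:
            continue
        # Test set: only observations with a Community Taxon, i.e. more
        # than one person agreed on the ID.
        if obs.get("community_taxon") and random.random() < test_fraction:
            test.append(obs)
        # Validation has the same requirements as training but is only
        # ~5% of the remainder.
        elif random.random() < validation_fraction:
            validation.append(obs)
        else:
            train.append(obs)
    return train, validation, test
```

The key point the sketch tries to capture is the asymmetry: training and validation can draw on single-identifier (potentially dubious) observations, while the held-out test set is restricted to community-confirmed ones.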

I’m not sure if I’ve read this detail before… I certainly seem to constantly forget aspects of it at least! Some more things to stick in an FAQ somewhere perhaps?
