There are guides of various types in the Guides link (found under More on the header). They are not comprehensive but they are helpful.
The response to posts on iNat can be variable. I actually joined iNaturalist because I decided to make myself a Covid project to learn about bumblebees. I posted a few species and figured at least one of them would get a response. In the meantime I started posting other photographs (old and new) and rediscovered interests that had been dormant for years. No regrets on the bees, which alas remain unidentified. I'm sure somebody will weigh in eventually.
This forum is also a nice bonus. Some wonderful folks and some interesting stuff.
Thanks! Yup, that's exactly it. There are a zillion wiki implementations for data repositories, and I would love it if iNat implemented its own from which to pull useful taxonomic information in place of the Wikipedia blurb. Thanks for finding those discussion pages for me! I tried a couple of different searches and wasn't finding the right things.
When the CV is "pretty sure", the odds of being wrong are lower; when it is not sure, it will probably learn the taxon in the next training iteration. I think that is a good balance, otherwise it becomes the problem the OP refers to. In my experience of almost 50 observations where I tried the CV prompt and there was no "we're pretty sure" suggestion, the other suggestions were simply wrong.
With the published books, there's some incentive (fame and fortune!) for a person, or persons, to spend years putting together those descriptions. And the descriptions in those books are usually only useful regionally. We can't copy copyrighted content from books and put it on iNat. The content that you do find on iNat comes from the users of iNat (like yourself) and others who contribute to Wikipedia articles. There's no staff for adding content to iNat itself. Hope that helps! It would be nice to have a giant Swiss army knife, but the more tools added to the knife, the more unwieldy it becomes, even if it's possible to add the tools and financially feasible to build and maintain it.
For training the model, if you have more than 1000 photos for a species/taxon, are only photos from RG observations used?
And a different question, is there any way for an iNat user to know which species are on the bubble, e.g. close to having 100 photos and thus able to be used to train the model?
If there was a way for me to know which species were close and occur in my area, I could try to take more photos of those species so they could be included for the new model.
They're chosen pretty randomly; RG/not RG is not a factor.
Maybe using the API? Either way, I personally don't think training the computer vision model should be a priority for users. Have fun exploring and observing what you want to observe. But to each their own. :-)
For me, this is one of the primary incentives to use iNaturalist:
FILLING IN THE BLANKS
Because it's like a kind of puzzle with missing pieces… the CV can see some genera but has no data about others, so the blanks need filling in. If I find a local species it doesn't recognise to genus or even family, I have been actively aiming to achieve 50 observations of it to try to get it recognised.
FUTURE FOCUSSED
It feels like a long-term goal. Unlike helping to identify organisms on other sites, where identifications might languish or never be entered into a dataset, here it feels like we're all chipping away at something bigger which (if it ever became accurate enough) could have serious impact down the line… in opening up and supporting society's awareness and ability to perceive the natural world, and in turn the larger ecological implications.
HELPING FIX THE BROKEN BITS
Similarly to the first point, incorrect CV suggestions that propagate feedback loops of misidentification feel like holes that need fixing, and something we can actively contribute to as users through correct identification.
This response might be slightly misleading, unless I'm misunderstanding the other posts around this. RG is not a factor in training, but it is in testing. (Many might not know how ML works, so the word "training" might encompass testing for some readers.) That might sound pedantic! But for me, as mentioned, it's a core incentive, so I was happy to learn more about how it works this week (and I will be happy if someone corrects this with further info).
My current understanding:
HELPING FIX MISSING SPECIES with fewer than 100 photos
If there are fewer than 100 photos of a species… then, like @matthias55, we can try to help train it.
We do not need to reach RG, just accumulate the 100. This should be roughly visible through exploring observations, though, so no need to use the API(?)…
e.g. a blank I think I've nearly filled: https://www.inaturalist.org/observations?place_id=any&subview=table&taxon_id=451684
30 observations with 1-6 photos each should be approaching the necessary 100 total.
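As a rough sketch of the API route mentioned earlier, the tally could be automated against the public iNaturalist v1 `observations` endpoint (each result carries a `photos` array). The helper names here are mine, and a real tally would need to page through all results rather than fetching a single page:

```python
import json
from urllib.request import urlopen

API = "https://api.inaturalist.org/v1/observations"

def count_photos(observations):
    """Sum photo counts across observation records shaped like
    the iNat API's JSON results (each may have a 'photos' list)."""
    return sum(len(obs.get("photos", [])) for obs in observations)

def fetch_photo_total(taxon_id, per_page=200):
    """Fetch one page of observations for a taxon and tally photos.
    Caveat: only covers the first page; paginate for a full count."""
    url = f"{API}?taxon_id={taxon_id}&per_page={per_page}"
    with urlopen(url) as resp:
        data = json.load(resp)
    return count_photos(data["results"])

# Example (requires network), using the taxon from the link above:
# print(fetch_photo_total(451684))
```

Comparing that total against the 100-photo threshold would show how close a "blank" species is to being trainable.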
Currently, the CV suggests some sort of ant for this species, so the wrong taxonomic order entirely… a nice sense of achievement to fix, I think!
HELPING FIX INCORRECT SPECIES with over 1000 photos
If, however, over 1000 photos already exist and a user instead wishes to help fix a recurrent error in the CV, adding more photos won't necessarily help… this is more about ensuring cleanliness of the existing test dataset. In that case, helping with quality control as an identifier might contribute to resolving the issue more directly and prevent more incorrect observations being placed in the dataset.
A core problem at present, as visible in the computer vision clean-up wiki, seems to be errors created by this feedback loop: misidentification >> wrong auto-suggest >> further misidentification.
HELPING FIX INCORRECT SPECIES with 100-1000 photos
If there are between 100 and 1000 photos for a species, helping with identification quality control and increasing the amount of training data both seem like valid ways to help. Both should contribute to overall accuracy, if I understand correctly.