Training AI bot on all taxonomic descriptions

personally, i think the best way to answer this particular query would be a generative image model coupled with a computer vision model. the generative model would give you several novel images of “beetles that are red in front and black in back, found in Thailand”, and you could click one to have it generate more like the one you selected. once most of the generated images look like the beetle you have in mind, you could run computer vision on those images to see what kind of beetle it might be.

that said, a lot of the existing image search engines are already pretty good at returning results for “beetles that are red in front and black in back, found in Thailand”, assuming that kind of information exists on the web.

if you’re going to train an AI strictly on text in iNat, then i think it would be best to focus on identification notes as a means for observation / identification suggestions, as described here: https://forum.inaturalist.org/t/ai-for-identification-notes-and-comments/37989/8
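
to make the identification-notes idea a bit more concrete, here is a rough sketch of what "suggestions from notes" could look like in its simplest form: plain keyword overlap between a free-text query and notes grouped by taxon. this is not a trained model and not anything iNat actually does. the `NOTES` data, the taxon names, and the `suggest` function are all made up for illustration.

```python
import re
from collections import Counter

# hypothetical identification notes keyed by taxon name;
# invented for illustration, not real iNaturalist data
NOTES = {
    "Taxon A": ["red pronotum, black elytra", "front half red, rear half black"],
    "Taxon B": ["entirely black with long antennae"],
    "Taxon C": ["green metallic elytra, red legs"],
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def suggest(query, notes, top_n=3):
    """Rank taxa by how often the query's words appear in their notes."""
    query_tokens = set(tokenize(query))
    scores = {}
    for taxon, texts in notes.items():
        # count every word across all notes written for this taxon
        counts = Counter(t for text in texts for t in tokenize(text))
        # a missing word counts as 0, so unrelated notes score nothing
        scores[taxon] = sum(counts[t] for t in query_tokens)
    # highest score first, ties broken alphabetically by taxon name
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(taxon, score) for taxon, score in ranked[:top_n] if score > 0]


if __name__ == "__main__":
    for taxon, score in suggest("red in front and black in back", NOTES):
        print(taxon, score)
```

a real version would need something much better than word overlap (it has no idea that "front" and "pronotum" are related, for example), but it shows the basic shape: notes written by identifiers go in, a ranked list of candidate taxa comes out.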