Thanks for the response, Alex. I guess the difference between the model and the community (at least experts in Psychodidae) is that the model usually doesn’t know what it doesn’t know - in other words, when it comes across a species, it will always try to offer suggestions, instead of sometimes recognising that this is not a species it’s been trained on yet. It would be pretty neat if the AI could learn to say “this seems to be something I haven’t seen before”.
I think this depends a lot on location. In the tropics, so few of the species have been documented that I suspect there are many diverse genera and families represented by only one or a few species in the training dataset. Given how many species there are on Earth, and that the majority have not even been described by science, this will be an ongoing problem. Sure, the AI is pretty good with “weedy” tropical species that are common and have a wide distribution (and are most often photographed by iNat contributors), but pretty much any rainforest plant from where I live in Brazil, even with good photos of flowers, receives a sequence of improbable suggestions. This is not a huge problem for those who use the AI discerningly, but it does become a problem when “observations” of New Zealand or South African endemic plants, for example, start appearing all over the world as a result of less experienced users accepting the suggestions.
Given the nature of biodiversity - that most species are rare and have restricted distributions - we’ll likely never get to the point where the AI can identify everything. So it seems crucial for it to learn to be aware of what it doesn’t know.