This is so true, and these “false friends” (as linguists say) leave many new users confused.
In places with few observations it gets particularly annoying, since the suggested species, as far as I can tell, are based on the number of observations: the more observations a species has, the more frequently it shows up.
I presume it can be done more easily with Animalia, by separating on geography (at least by continent). But with plants it can get a little tricky. For instance, here in Almaty, Kazakhstan, there are many acclimatized species from North America thanks to the Botanical Garden: I’ve encountered several Nootka cypresses and Pseudotsuga sp. throughout the city.
Good points about the number of observations, Kastani.
I think “rare” species with numerous “good” observations are subsequently vulnerable to a flood of incorrect observations under the current AI approach. I give an example from my experience below.
Take Ulmus thomasii as an example. I have made a serious effort over the years to document and upload this rare species to iNaturalist, as I think it is likely threatened/endangered, and historically good data has been lacking to draw quantitative conclusions about the population. It appears to me that because this species has received “attention” and been uploaded to iNat much more than one would expect given its relatively small/declining wild population (i.e., it is “over-represented” as an uploaded tree species for Ontario/Quebec), it is now frequently suggested by the AI for anything that looks somewhat similar (e.g. a photo of what is actually Celtis occidentalis may be AI-suggested as Ulmus thomasii). Even ignoring the obviously incorrect tropical Asia uploads, many of the problematic uploads for this species occur in or near its natural range, so the location may be plausible or vaguely realistic for a newly discovered population, yet to a trained eye the photo evidence may show something completely different from the AI suggestion. I fear the general problem will be that “rare” species with a critical mass of good uploads to iNat subsequently get swamped by a large number of incorrect observations based on AI suggestions, unless this and many, many other species receive constant, dedicated curation.
Has anyone suggested adding a “frequently misidentified” message for cases like Ulmus thomasii—that is, where the ID has been changed to another species more frequently than not? Or perhaps more frequently than some threshold, such as 20 percent of the time? The message could be displayed with the suggestion, as is “found nearby,” or it could be displayed next to the species name in the observation, as is “introduced.” Or both.
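To make the threshold idea concrete, here is a rough sketch of how such a flag might be computed. It is only an illustration: the `TaxonStats` record, the `min_sample` cutoff, and the counts in the usage note are all made up for the example and do not reflect how iNat stores or exposes identification data.

```python
from dataclasses import dataclass


@dataclass
class TaxonStats:
    """Hypothetical per-taxon tally; not an iNat data structure."""
    name: str
    initial_ids: int    # observations first identified as this taxon
    later_changed: int  # of those, how many were later moved to another taxon


def frequently_misidentified(stats: TaxonStats,
                             threshold: float = 0.20,
                             min_sample: int = 10) -> bool:
    """Return True if the share of changed IDs exceeds the threshold.

    min_sample is an added assumption: with very few observations,
    a single correction would otherwise trigger the warning.
    """
    if stats.initial_ids < min_sample:
        return False
    return stats.later_changed / stats.initial_ids > threshold
```

With invented numbers, a taxon with 200 initial IDs of which 90 were later changed would be flagged at the 20 percent threshold, while one with 20 changes out of 200 would not. The threshold could just as easily be set to 0.5 for the "more frequently than not" variant.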
Will iNat deal with this in the future? It’s mostly brand-new users contributing to this: North America gets weekly reports of Gray Herons, Little Egrets, Brown Thornbills, Old World Buntings, Pacific Black Duck x Mallard, and more. It’s all well and good that other users can manually fix the issue, but they aren’t the ones failing to check ranges, so it solves nothing.
It is a very difficult task to do programmatically. Doing it requires detailed geographic information at both a macro and a micro level, and it requires tracking and storing that data in some way. Currently iNat has at least 5 different sources of data about which species are located where: its own observations, GBIF records, atlases, range maps, and checklists. They are not consolidated or all linked together.
To use your example, Grey Heron is on the Canadian checklist (and the US one as well). It should be: the species has been seen in both nations, and a true checklist of the avifauna of both nations should include it. Should the species come up as a suggestion in Newfoundland or the Aleutians, for example, or still not be suggested? Well, you can make the answer no, but that requires some way to mark that, despite it being on the checklist and possibly even having records (I’m pretty sure, without checking, that some of the Newfoundland records are documented as iNat records), you still don’t want it suggested. Well, that then needs to be built and populated.
It is also really critical to properly deal with areas outside iNat’s core user communities. In Asia, Africa, and to a lesser extent South America in particular, the distribution data is still far from complete, and a visual-based suggestion tool may right now be the best tool.
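As a rough illustration of what "built and populated" would involve, the sketch below adds a separate suppression list on top of the checklists. Everything in it is hypothetical: the `suppressed` set, the `parent` place mapping, and the `should_suggest` function are invented for the example and are not existing iNat features.

```python
def should_suggest(taxon, place, checklists, suppressed, parent):
    """Decide whether a taxon may be suggested for a place.

    checklists: dict mapping place -> set of taxa on its checklist
    suppressed: set of (taxon, place) pairs someone has marked
                "on the checklist, but do not suggest" (hypothetical)
    parent:     dict mapping place -> enclosing place
    """
    listed = False
    current = place
    while current is not None:
        # A suppression entry at this level or any enclosing level wins.
        if (taxon, current) in suppressed:
            return False
        if taxon in checklists.get(current, set()):
            listed = True
        current = parent.get(current)
    return listed


checklists = {"Canada": {"Ardea cinerea", "Ardea herodias"}}
parent = {"Newfoundland": "Canada"}
suppressed = {("Ardea cinerea", "Canada")}
```

Here `should_suggest("Ardea cinerea", "Newfoundland", checklists, suppressed, parent)` is false even though Grey Heron is on the Canadian checklist, while Great Blue Heron (`Ardea herodias`) is still suggested. The code is the easy part; the hard part is that someone has to populate and maintain the suppression entries for every taxon and place.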
I was running the identotron for here in the FSM and noticed that only Plumeria obtusa was listed. If the computer vision had checked the national list for Plumeria, only Plumeria obtusa might have been suggested. Yet both Plumeria rubra and Plumeria pudica are extant here. While I was able to update the national checklist to include these two species, I was reminded that internal sources such as the checklists might not always be complete. I have not yet updated the checklists for Pohnpei island. In other countries, propagating a species down to the many locations in a nation appears daunting at best. Thus eliminating possibilities by checking against the checklist could be problematic, as the list may not be complete. As Chris noted above, expanding to include other sources would be difficult programmatically.
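A tiny sketch of the Plumeria case shows why a hard checklist filter is risky. The `filter_by_checklist` function is a made-up stand-in for whatever filtering step might be added, not how the computer vision actually works.

```python
def filter_by_checklist(candidates, checklist):
    """Keep only the candidate taxa that appear on the place's checklist."""
    return [taxon for taxon in candidates if taxon in checklist]


# Candidates a vision model might plausibly offer (illustrative only).
vision_candidates = ["Plumeria obtusa", "Plumeria rubra", "Plumeria pudica"]

# The national checklist as I found it, and after my update.
checklist_before = {"Plumeria obtusa"}
checklist_after = checklist_before | {"Plumeria rubra", "Plumeria pudica"}

kept_before = filter_by_checklist(vision_candidates, checklist_before)
kept_after = filter_by_checklist(vision_candidates, checklist_after)
```

With the incomplete list, `kept_before` contains only Plumeria obtusa, so two species that really are present could never be suggested. After the update, `kept_after` contains all three.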
You should never assume that a checklist is accurate or complete, but along with atlases (which bring their own set of issues) they remain the only way to document the presence of a species in a location where there are currently no iNat records.
Speaking as someone who added the distribution data for the more than 50,000 species in the taxon group I curate into the relevant national checklists (I can’t even imagine doing it for subnational levels), they are extremely time-consuming to update. It is easier if you have a list that is geographically oriented, but doing it for an individual species that needs to be added to multiple checklists is very slow work. Somewhere I have an active feature request to try to make it easier, but it got little traction, and I would have to find it.