Pasimachus elongatus range does not include Eastern US

Referring to this taxon: Blue-bordered Pedunculate Ground Beetle Pasimachus elongatus
This species has a number of records from the Eastern/Southeastern US here on iNaturalist. I believe that is a systematic error based on over-specific IDs originating in computer vision suggestions.
Species ID in this genus is quite difficult. Regional checklists and references are useful to narrow down possibilities. References I have do not show P. elongatus as occurring in the Eastern US. BugGuide shows records from the Midwest and Northern Great Plains up into Canada.
Bousquet gives the distribution of P. elongatus: “This species ranges from the southern part of the Prairie Provinces south to northern Sonora, western and northern Texas, and southeastern Louisiana, east to Indiana and southwestern Michigan.”
Ciegler, for instance, lists several species of Pasimachus for South Carolina, but not P. elongatus.
There are several similar species in the East/Southeast, and they are difficult to differentiate without very detailed images and/or examination under a microscope. The iNaturalist computer vision/artificial intelligence system is not able to make distinctions among species in this group, which are all rather similar in habitus and somewhat variable. I think the eastern P. elongatus records listed here on iNaturalist are simply over-specific IDs based on the computer vision suggestions. It happens with these difficult groups of species!
I have corrected/commented on research grade observations from North Carolina. (Edit) Others might want to go and make suggestions on other observations that are clearly out of range here. (End edit.)
References:

Unfortunately, this is a common problem with iNaturalist’s CV. Similar problems occur with taxa like Phyllophaga anxia and Heterocerus fenestratus, which get selected because the CV has been trained on only a small number of species in a genus (or only one species). But that’s another can of worms, I suppose. I’ll try to help clean some of these up if I can.
Edit: There are so many…

Thanks. I should add that this sort of problem is not unique to computer-assisted ID. People are quite capable of repeating the same sort of mistake.
This is one reason I am a big fan of marking observations “as good as it can get” at higher taxon levels when appropriate. It keeps things realistic, and helps guard against runaway over-specific identifications. If additional information emerges, such general IDs can be made more specific.

I did NC and SC. There are a few obs. where several people agreed to the wrong ID, so those are still showing up. At least there are now no research grade observations from either state.
Edit. Also did Georgia.
Edit. New Jersey.
Edit. Not clear what to do about KY, TN, AL, MS. Reported range extends east to Indiana, and parts of Louisiana, I think. It is conceivable that some of the adjacent states could be in-range.
A broader issue is that many species-level IDs in this genus are likely unjustified. Most images just do not have the detail to run through a key properly.

I would say on a case-by-case basis. I don’t think it will occur in MS or AL. Western TN and KY might be possible, but certainly not anything in the Appalachian area.
From the range given in Bousquet, its range seems to be generally delimited by the Mississippi and Ohio Rivers, and it is unlikely to be established east or south of there, although it could certainly stray over those rivers occasionally. The Appalachians would form another natural boundary to the east. However, I do not yet know the extent of its distribution through Indiana and Ohio.
Lindroth states it is a prairie species, which is repeated by Purrington & Drake, so it likely requires open grassland.

This is very true, and not just in this genus. With arthropods especially, vouchering observed organisms is a valuable option, though I respect that many would not want to do so.

Good ideas. I see some observations from the Blue Ridge of TN/KY that should be addressed. East of the Mississippi River and south of the Ohio River seem like reasonable boundaries for clearly out-of-range records.
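As a rough illustration only, the state-level triage discussed in this thread could be sketched like this. The function name, the labels, and the grouping of states are assumptions drawn from the comments above (NC, SC, GA and NJ already cleaned up; MS and AL considered unlikely; TN and KY case by case), not an authoritative range map and not anything iNaturalist itself provides.

```python
# Hypothetical triage of Pasimachus elongatus records by US state.
# The state groupings are assumptions taken from this thread's discussion,
# not a published range map.

# States where records were treated as clearly out of range in this thread.
CLEARLY_OUT_OF_RANGE = {"NC", "SC", "GA", "NJ", "MS", "AL"}

# States near the proposed Mississippi/Ohio River boundary; review each record.
CASE_BY_CASE = {"TN", "KY"}


def triage(state_code: str) -> str:
    """Return a review label for a two-letter US state code."""
    code = state_code.strip().upper()
    if code in CLEARLY_OUT_OF_RANGE:
        return "out of range: suggest genus-level ID"
    if code in CASE_BY_CASE:
        return "case by case: check locality"
    return "not flagged by this rule"
```

A state-level rule like this is deliberately coarse: it only picks out records worth a second look, and every flagged observation still needs a human to check the images and locality.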
More generally, perhaps in the future there will be a way for the cv system to “know” which species are unlikely to be identified at species level from images. There could be some sort of warning, such as “computer vision suggestions often not reliable in this genus”. Then the identifier could click one of two buttons: “OK, leave ID at genus” or “No, I’m pretty sure of the species ID”. Something like that might be worth a feature request. It likely has been discussed before.

Based on another discussion, I think one approach to this issue would be to remove individual Pasimachus species from the computer vision (cv) model. That way the system will not suggest those species. I am unclear on both the mechanics of how this might work and how one could request that such a change be made to the model. I guess it might be a curation request? There is some more discussion here:

https://forum.inaturalist.org/t/north-american-sinea-id-and-the-sorcerers-apprentice-problem/68337/31

That sounds like a snowball waiting to happen. If we do this with Sinea, then with Pasimachus, then every identifier of difficult taxa will want it for their taxon of choice, too.

that seems like a good thing to me
better to have correct higher-level IDs than incorrect species IDs

Let us please not restart another back-and-forth discussion between the “everything deserves a species ID” and “not everything is identifiable to species” camps; it’s already been discussed to death in countless threads.

As I understand it, that sort of behavior or anything similar that accomplishes the end goal of “have the CV stop suggesting unjustifiable species-level IDs” would require actual changes to the code of the CV model to implement, which means it’s beyond the realm of curator assistance. That has to go straight to staff as a feature request, presumably in the feature request category on this forum.

Phyllophaga anxia will almost certainly get out of control if not removed from the CV. Because of the lack of other Phyllophaga species in the CV, it will be the top suggestion for many different Phyllophaga. Really, any Phyllophaga species need to be learned in groups so that no single species is overwhelmingly suggested.

Heterocerus fenestratus is in a better state. There wasn’t an explosion of observations after it was learned by the CV.

My bigger worry is that if the precedent is set, we’ll have a thread for every taxon that someone wants this to apply to. Hundreds, perhaps thousands, of threads whose subject lines are all variations on “Please exclude ____ from the CV.” As it is, we have more than enough instances of threads by people who do not know how to flag a taxon.

Agreed, the CV suggesting Phyllophaga anxia is definitely a problem.

I’ve spent a lot of time curating the observations of Heterocerus from North America to remove many of the erroneous H. fenestratus. That graph would look a lot different otherwise. :)

That was really part of my line of questioning. Should this sort of request be made as a taxon flag? I was not aware of that. Can anyone point me to a taxon flag where this sort of request is made? That would be helpful as a model.

Taxa aren’t/won’t be individually manually removed from the computer vision model: https://forum.inaturalist.org/t/add-a-flag-for-frequent-computer-vision-identification-errors/1905

There are links to flags on a lot of taxa which had CV issues here which could theoretically have become requests for this if that were a thing. But note that most of them have been resolved through a mix of ID effort and maybe updates to the geomodel, which I think gives hope that other taxa can eventually be resolved as well.

Oh, that’s a great discussion, and I particularly like the link to Ken-Ichi’s presentation.

I feel like this issue should be revisited. In many groups, especially invertebrates, the interaction of the cv model and less-experienced observers produces many over-specified observations. Manual correction of the observations doesn’t stick–the problem comes back and the cycle repeats. This cycle wears down knowledgeable identifiers and they may give up over time. For these difficult groups the ongoing result is that the model is getting worse, not better. That should concern everyone.
