North American Sinea ID and the Sorcerer's Apprentice problem

Many identifications of North American Sinea species here are, I feel, unjustified at species level, especially nymphs identified as Sinea spinipes. AFAIK, all the members of that genus are really spiny as nymphs, and pretty spiny as adults. S. spinipes and S. diadema are both widespread in the East/Midwest and very similar. You have to get a close look at some spines/tubercles on the pronotum to tell them apart; see:

https://bugguide.net/node/view/7900#id

Ironically, it is S. diadema that has spines on the anterior pronotal lobe while S. spinipes has blunt tubercles. The difference is subtle and apparent only with a very good macro image, if not a dissecting microscope.

Pretty much all members of this genus are spiny on the crest of their head as well as the front femora. Again, all the nymphs are very spiny and cannot be identified to species, for the most part.

Going back to the first IDs in this genus, I think people here on iNaturalist saw a Sinea nymph or adult, and identified it as spinipes because it was spiny. Likely nobody realized how similar the species in this genus were, and how much they overlap in range. This ID was repeated, and pretty soon the computer vision (cv) system was calling every Sinea nymph “spinipes” with high confidence. Pretty much all of these IDs are unjustified.

I call this the “Sorcerer’s Apprentice Problem”. The cv is fed some wrong, or at least over-specific, identifications, starts using them as suggestions, and pretty soon everyone is repeating them. This leads the model to be even more confident. The whole chain of identifications is fundamentally flawed, but it looks good unless examined with some understanding of the diversity of the group and actual ID characters. This happens mostly with groups, especially genera, that do not have big differences in color pattern. The cv is good at picking up differences in pattern, but it often cannot differentiate subtle characters, such as spines versus tubercles in one specific area. (The cv system uses scaled-down images of just 400X400 pixels or so, I understand. This is surely going to lose most such characters, even if present on the original full-resolution image.)
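
A toy sketch of that loop, not a model of how the real cv works. It assumes the cv always suggests whichever species label is currently most common in the genus, and that a fixed number of new observers accept that suggestion each round. The function name, the seed counts, and the acceptance numbers are all made up for illustration.

```python
def simulate_feedback(rounds=5, accepted_per_round=80, seed_labels=None):
    """Toy feedback loop: the most common label gets suggested and re-used.

    seed_labels: hypothetical starting counts of species-level IDs.
    accepted_per_round: hypothetical number of new observations per round
    whose observers simply accept the top suggestion.
    """
    labels = dict(seed_labels or {"spinipes": 6, "diadema": 4})
    history = []  # share of species-level IDs held by the leading label
    for _ in range(rounds):
        top = max(labels, key=labels.get)   # what the toy "cv" suggests
        labels[top] += accepted_per_round   # observers repeat the suggestion
        history.append(round(labels[top] / sum(labels.values()), 3))
    return labels, history


labels, history = simulate_feedback()
print(labels)   # {'spinipes': 406, 'diadema': 4}
print(history)  # the leading label's share rises every round
```

A 6-to-4 head start is enough: the second label never gets suggested again, which is the broom that keeps carrying water.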

The two widespread species in North America:
https://www.inaturalist.org/taxa/119693-Sinea-spinipes (East/Midwest)
https://www.inaturalist.org/taxa/133977-Sinea-diadema (transcontinental)

The cv system is really a great tool, but this sort of repeated over-specificity is an issue in many groups, especially below the tribe or genus level in invertebrates. I look forward to hearing comments on how to make the system work better for such groups.

20 Likes

Yeah, the CV falls flat on its face when it comes to cryptic species that can’t be distinguished from a typical iNat observation, particularly when one species is more common than the others, or if there are specific areas where one species can be easily IDed due to that area legitimately only having the one species (which will absolutely create a positive feedback loop in other regions without janitorial work from identifiers, because I see it happening in my taxa all the goddamn time). My surface-level understanding of the design decisions behind the CV is that the way it generates suggestions just doesn’t really account for the fact that cryptic species exist.

I can conceive of some ways to modify the algorithm to be less terrible in this regard, some requiring more manual intervention from users than others:

  1. ID feedback - if a suggested ID keeps getting disagreed with back to a higher rank (genus, species complex, subfamily, etc), then the algorithm takes the hint and stops suggesting that species at the species level
  2. Manual flags - put in taxon-by-taxon instructions that say things like “hey, don’t suggest to species level in this region”
  3. If a ton of observations are stuck at a higher taxon in general (especially if RGed at those higher taxa) while there’s a proportionally smaller amount of species IDs in the species within that taxon, have the CV take that into account and start suggesting genus/species complex/subfamily more often.
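
A minimal sketch of how those three ideas might combine into one rule. This is purely hypothetical: the field names, the thresholds, and the function are my own illustration, not anything in the actual iNat code.

```python
from dataclasses import dataclass


@dataclass
class TaxonStats:
    """Hypothetical per-genus counts; none of these are real iNat fields."""
    species_level_obs: int           # observations currently sitting at species
    genus_level_obs: int             # observations left (or pushed back) at genus
    cv_species_suggestions: int      # species suggestions that observers accepted
    pulled_back: int                 # of those, later disagreed back to genus
    manual_genus_only: bool = False  # idea 2: a curator-set flag


def suggestion_rank(stats, max_pullback_rate=0.3, min_species_fraction=0.5):
    """Return 'genus' or 'species' as the finest rank to suggest."""
    # Idea 2: a manual flag overrides everything else.
    if stats.manual_genus_only:
        return "genus"
    # Idea 1: too many accepted suggestions were later pulled back.
    if stats.cv_species_suggestions:
        if stats.pulled_back / stats.cv_species_suggestions > max_pullback_rate:
            return "genus"
    # Idea 3: most observations in the genus are stuck above species.
    total = stats.species_level_obs + stats.genus_level_obs
    if total and stats.species_level_obs / total < min_species_fraction:
        return "genus"
    return "species"


print(suggestion_rank(TaxonStats(10, 90, 0, 0)))  # genus
```

The thresholds (0.3 and 0.5) are arbitrary placeholders; picking sensible values per taxon is the hard part.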

The problems with the CV get easier to contextualize and visualize if you consider that it probably works best (and is at least in some way optimized for) charismatic vertebrate fauna & charismatic flora that are easy for even an amateur to ID.

Edit: And also, for taxa that are cryptic at some points of their development but not others, it would be wise to implement a seasonality-based variation in how specific the iNat devs let the CV get for those taxa.

12 Likes

The only real solution that identifiers can contribute to is aggressively marking all the over-specific observations back to genus level. That’s understandably not very enjoyable, because people (myself included) always want as specific of an ID as possible.

For some reason the CV seems to treat insects differently than vertebrates. It almost always suggests a genus instead of a species ID as the best guess for birds, but for insects and plants it often suggests a random species, even when there are multiple very similar species present that it “knows”.

16 Likes

When we get a list, which doesn’t include the genus - what is the quickest and easiest way to pull up the Genus?

If I want Asparagus, then Asparagales is usually on the list and I bash -gales back to -gus.

I wish iNat’s list would at least include the Genus.

4 Likes

Thank you for that name. Now every time I see this happening I’ll get the brain worm music!

3 Likes

I sheepishly checked, and sure enough, I had an incorrect, CV-derived S. spinipes observation. Now for many taxa, e.g. moths, I’m a big fan of using CV as a starting point. It’s like flipping through a field guide until you get to the set of pages that look like they might include what you’re looking for. But this example is exactly the kind of thing for which we need an iNat homegrown wiki. The Wikipedia page is nice, but useless for ID. A brief statement like the one on BugGuide would avoid a lot of incorrect IDs, and would make the CV more useful and less likely to lead people astray.

11 Likes

Have you noticed an increase in species-level Sinea identifications? The new mobile app (in contrast to the website) now often gives species-level top recommendations when it is very confident (see this forum post).

When I use the website CV on my recent observation of a later instar Sinea nymph, it says it’s “pretty sure” of the genus and then gives the two species as suggestions. However, using the iOS app, we get this:

This seems like overconfidence if, as you say, nymphs of these two species are equally spiny and can’t be identified to species.

9 Likes

Using the new app, I’ve definitely noticed that it pushes species ID hard when a genus would be more appropriate—you have to go search and type in the genus name instead, which seems like it could lead to an increase in inappropriately specific initial IDs over time in some groups.

10 Likes

Replace “seems like it could” with “absolutely does”, my man.

7 Likes

Part of my point was that even the adult identifications are not justified. The only difference between the two widespread species, spinipes and diadema, is some tiny spines vs. tubercles on one lobe of the pronotum. I’ve visualized these a few times on detailed macro images, but there is no way a 400X400 pixel image (used in the cv) scaled down from a much larger one could be showing that. AFAIK the cv does not work that way. I spot-checked a few of the top-rated images IDed to species and they looked correct, but most images (spinipes vs. diadema) should not be assigned to species at all. The whole chain of IDs at species level is wrong. One human ID to species might be correct, but the cv is taking that overall habitus as a model and then matching it with other images where that is not how the ID is done; it is not looking at those spines.

Again, AFAIK, zero images of nymphs should be assigned to species in the geographic regions where more than one species occurs (most of the continental US); they should just be left at genus. It does not matter whether humans or the cv system are doing it.

4 Likes

And one could also say “adults of these two species are equally spiny”. The difference in the two is very subtle, based on near-microscopic characters. Adult “spinipes” is no more spiny than adult “diadema”. It is clear to me that the whole series of ID’s is based on people reasoning “spiny” implies species = spinipes.

1 Like

Problem is, within this genus, as in many invertebrate groups, overall appearance (habitus) is basically the same for multiple species. Neither humans nor cv can ID based on habitus; it takes macro or microscopic observation. After a campaign of pulling back IDs to genus level, the whole thing will start again as soon as one person makes a species-level ID, even if correct based on the microscopic characters. The cv will grab that habitus image and start matching other observations. (It’s designed to do that, a good thing where habitus correlates well with species ID, but that is often untrue with invertebrates.) Thus, the Sorcerer’s Apprentice. The problem will not stay fixed.

I feel there should be another level of caution built into the cv for such groups. It should not propagate, say, species-level suggestions when these are known to be unreliable in that genus. This is true for many invertebrate groups.

4 Likes

As discussed above, the easiest solution is just to have the CV suggest higher levels like genus or family by default even when it thinks it confidently knows a species. This has some not ideal consequences (e.g. on the website the top suggestions for Blue Jay and Northern Cardinal are only genus, which doesn’t really make sense) but at the scale of every observable species on Earth is the better solution.
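
A rough sketch of what "suggest the genus by default" could look like, assuming the model hands back a score per species. The scores, the placeholder taxon "Othergenus sp", and the function are invented for illustration; I don't know how the real CV combines scores.

```python
from collections import defaultdict


def rollup_to_genus(species_scores):
    """Sum per-species scores by genus and return the best genus.

    species_scores: dict of 'Genus species' -> score (hypothetical values).
    The genus is taken as the first word of the binomial.
    """
    genus_scores = defaultdict(float)
    for name, score in species_scores.items():
        genus_scores[name.split()[0]] += score
    best = max(genus_scores, key=genus_scores.get)
    return best, genus_scores[best]


# Made-up scores: two look-alike species split the vote, the genus does not.
scores = {"Sinea spinipes": 0.55, "Sinea diadema": 0.35, "Othergenus sp": 0.10}
genus, total = rollup_to_genus(scores)
print(genus, round(total, 2))  # Sinea 0.9
```

The same roll-up is what makes Blue Jay come back as only a genus, which is the trade-off mentioned above.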

Trying to refine the CV algorithm more than that to account for situations like this gets really complicated because of all the edge cases and the fact that it often legitimately doesn’t know about the existence of the other species, and then how do you teach it about them?

Some more discussion here:
Computer Vision should take into account fraction identified to species
Allow some non-leaf taxa to be added to the CV model

And recent discussion about the new app’s CV here: What is the “appropriate” use of ID With AI?

6 Likes

And therein lies the problem. The CV as it’s currently coded just isn’t well-suited for the complexities of invertebrate ID, much to my chagrin (and constant suffering).

I envy the identifiers in taxa that are much better-shaped pegs to fit in the square holes of the CV algorithms (like, I dunno, birds and mammals or something)

4 Likes

This will never be fixed without getting a change to the coding of the CV from the developers. I’m not sure what else to do other than rally those aware of the issue and vocal about it to try and push for changes.

This applies even to species that need DNA to ID. Once 80ish confirmed observations are uploaded, the CV will magically start suggesting that species, sometimes with confidence.

I made a long post here about some issues.

https://forum.inaturalist.org/t/cv-suggestions-have-gotten-much-much-worse-at-san-jacinto-mountain/68136/30

Didn’t get a reply from staff.

2 Likes

I did not realize this happens. Does the CV really scale down the entire photo, or does it pick a 400x400-pixel square to focus on?

I wonder if these concerns are only going to become more acute as more species get added to CV. Because inclusion in the CV model is based on a fixed amount of RG observations (and not on proportion of RG observations to all observations), more and more difficult-to-ID species are going to be added as iNat continues to grow. Personally, I hope they revert the species-focused suggestions of the mobile apps and only suggest a lowest level of genus (which, if someone is really just picking the CV suggestion blindly, leads to a lot fewer issues than choosing species).

I always find it interesting (and surprising, actually) that whenever iNat writes on the blog about CV accuracy when a new model is released, it’s Arachnids, Insects and Plants that have the highest CV accuracy, while Amphibians, Birds, Mammals and Reptiles are significantly lower. So I don’t know if the CV is really better or worse at telling gulls or crows apart compared to midges and assassin bugs, but it is the case that vertebrate taxa both are less speciose and have more identifiers relative to the number of observations, so this issue of CV error or over-reliance is less of a thing.

3 Likes

To clarify on this, the observations don’t need to be RG. From the help page:

There must be at least 100 photos of the species and 60 observations of the species, and we don’t choose more than 5 photos from an observation to train the model. Observations do not need to be Research Grade in order to be used in training, but observations with a matching Community ID will be prioritized.
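
Read literally, that rule can be sketched as a small check. The thresholds (100 photos, 60 observations, at most 5 photos per observation) come from the quote. The function itself, and the assumption that the 100-photo minimum is counted after the 5-photo cap, are my own guesses.

```python
def eligible_for_training(photos_per_observation,
                          min_photos=100,
                          min_observations=60,
                          max_photos_per_obs=5):
    """Check the quoted thresholds for one species.

    photos_per_observation: list with one photo count per observation.
    Assumption: photos beyond the per-observation cap do not count
    toward the photo minimum.
    """
    usable_photos = sum(min(n, max_photos_per_obs) for n in photos_per_observation)
    enough_obs = len(photos_per_observation) >= min_observations
    return enough_obs and usable_photos >= min_photos


print(eligible_for_training([2] * 60))   # True: 60 observations, 120 photos
print(eligible_for_training([1] * 60))   # False: only 60 photos
print(eligible_for_training([10] * 20))  # False: only 20 observations
```

Note that nothing in this check looks at who made the ID or whether anyone agreed with it.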

3 Likes

Taking this literally, it seems like 60 observations with only the observers’ own CV-guided ID could get a species included, which doesn’t seem ideal to me. Before 2019 (judging from blog updates) the criteria were that the species had RG obs from at least 20 different observers. Between 2019 and 2024 it was at least 100 observations with at least 50 with a community ID (so there’s an ID from someone other than the observer, but it might not necessarily be RG). They’ve all had issues with feedback loops but it seems more likely if no community ID is necessary.

I think there might be a solution, or at least a strategy, that would not involve recoding per se. It seems that curators could remove some taxa from the computer vision model. In this case, I think there is a strong case that Sinea spinipes, Sinea diadema, and perhaps other North American species, should not be part of the cv model. The differences among the species are far too subtle to generate reliable IDs based on overall shape/pattern/color (habitus), which is what the cv does. People could still choose a species-level ID manually, but they would not be presented with those species as choices by the cv, just the genus.

1 Like

AFAIK the cv bases ID’s on circa 400X400 downscaled versions of the entire image. I saw a thread on this issue somewhere, but I cannot find it. Please somebody chime in and correct me if I’m wrong on that. (Uploaded photos are down-sampled initially to no larger than 2048X2048 pixels, but I don’t think the cv model uses anything that large for computation. Again, I may be wrong on this.)
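
Back-of-envelope arithmetic on what that downscaling would do, if the 400-pixel figure is right. The 10-pixel feature width is a made-up example, and the helper is only an illustration of the scaling, not of the real pipeline.

```python
def feature_size_after_downscale(feature_px, original_long_side_px,
                                 target_long_side_px=400):
    """Width in pixels of a feature after uniformly downscaling an image.

    target_long_side_px=400 is the figure mentioned in this thread,
    not a confirmed value.
    """
    return feature_px * target_long_side_px / original_long_side_px


# A tubercle 10 px wide in a 2048 px upload ends up under 2 px wide.
print(feature_size_after_downscale(10, 2048))  # 1.953125
```

At that size there is no way to tell a spine from a blunt tubercle, whatever the model is.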