All IDs bumped in split despite atlases

I just spent several hours adding IDs and updating atlases for this split:
https://www.inaturalist.org/taxon_changes/139471

Apparently every genus ID was bumped to family despite many observations only being in one atlas. Here is one example observation:
https://www.inaturalist.org/observations/199069174

The only explanation I can think of is that because one of the outputs (Canaridiscus) didn’t have an atlas, all other atlases were ignored. That seems very counterintuitive to me, and is never stated here:
https://www.inaturalist.org/blog/40417-using-a-taxon-split-input-as-an-output

“Only identifications belonging to observations within presence places in both atlases (e.g. Northern Territory, NT) or outside of both atlases (e.g. New South Wales) would be updated with the common ancestor.”

If what I suspect is true, this statement doesn’t seem entirely correct to me. There were non-overlapping atlases in the above split, and still everything was updated with the common ancestor.

This isn’t a huge issue, but if this isn’t a bug, maybe it would be easier to undo this and atlas the one genus. I didn’t atlas it because it doesn’t have observations and was there more for taxonomic correctness. I didn’t understand why Analyze IDs was saying everything would be bumped to family, but maybe I should have been more careful…


WAIT. Now there is no atlas for Discus at all, which is a problem I’ve already reported:
https://forum.inaturalist.org/t/places-removed-from-atlas-after-split/30202

Maybe that’s related. So an even bigger mess because undoing and redoing would mean I’d have to redo the atlas…

yeah, not a bug - all outputs have to be atlased or it goes to the common ancestor


I think this limitation of atlases should be pointed out in the help information for atlases and taxon splits. In practical terms, it means that atlases can’t be used to assist with splits of taxa that have many children.

For example, last year I created taxon swaps to move 14 North American taxa from Campanula to new genera based on a recent study. These represented about 2,000 observations out of about 190,000 globally for Campanula, a genus that has around 450 species.

Many of the 2,000 observations had one or more genus-level IDs of Campanula and I very nearly created a taxon split for the genus so that these genus IDs would be adjusted to either Subfamily Campanuloideae or the correct new genus where only one occurred for the state or county. Eventually though I realized that the split would apply to every Campanula ID globally, and would only work correctly if I added an accurate atlas for each of the 450 species. Without that, all these IDs would get bumped up to subfamily level.

I realized that a better approach would be to leave the Campanula genus IDs untouched and try to encourage identifiers to review any that conflicted with the post-swap taxa. It would have been helpful to be alerted to this in the taxon change guidelines by some wording such as “Atlases will only be considered by the taxon split logic when every child of the taxon being split has an active atlas. If just one child is missing an atlas, the split will ignore the atlases and set all IDs to the common parent.”

Even better would be an automatic check that would catch missing atlases and ask the curator if they want to fix the issue before proceeding with the split.
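To make the idea concrete, here is a minimal Python sketch of such a check. It is purely hypothetical: the function names and the dictionary layout (output taxon name mapped to a set of atlas places, or `None` when no atlas exists) are my own assumptions and not anything in iNaturalist's code. The place name is made up too.

```python
def unatlased_outputs(output_atlases):
    """Return the names of split outputs that have no atlas.

    output_atlases maps each output taxon name to a set of place names,
    or to None when the taxon has no atlas (hypothetical layout).
    """
    return sorted(name for name, places in output_atlases.items() if places is None)


def split_warning(output_atlases):
    """Build the message a curator might see before committing a split."""
    missing = unatlased_outputs(output_atlases)
    if missing:
        return ("No atlas for: " + ", ".join(missing)
                + ". All IDs would be moved to the common ancestor.")
    return "All outputs are atlased."


# Illustrative data only: the place name is invented.
outputs = {"Discus": {"Some Place"}, "Canaridiscus": None}
print(split_warning(outputs))
```

A check like this would have flagged the unatlased output in the split at the top of this thread before it was committed.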


Atlased splits only require atlases of the specific output taxa (not their children as well), and only affect IDs of the specific output taxa (not children of the outputs)—so in the case you describe (splitting Campanula into multiple genera), you’d just need to create atlases for the genus Campanula and the additional split genera (not every species in any of those genera), and the split would only affect genus-level Campanula IDs (any species-level IDs of anything remaining in Campanula would be unchanged).


That’s useful to know. So, with atlases for the eight new genera, plus an atlas for Campanula I could have committed the split. If I’m following this right, the split should then have handled existing Campanula IDs as follows:

  1. North American IDs with only a single genus atlas for the state/province/county: Change ID to that genus, e.g. Campanula in San Benito County, California, becomes Ravenella; in Newfoundland, it remains Campanula.
  2. North American IDs with multiple atlases for the state/province/county: Change ID to the common parent, Subfamily Campanuloideae.
  3. IDs elsewhere but within the boundaries of the Campanula atlas should remain unchanged as Campanula because that’s the only output genus covered by an atlas for these areas.
  4. IDs elsewhere that are outside the boundaries of the Campanula atlas… I think these would get bumped up to Subfamily Campanuloideae as the algorithm can’t determine which output to choose. I can envisage this happening for cultivated observations and for a few others growing beyond the accepted distribution of Campanula.
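The four cases above can be sketched as a small Python function. This is only an illustration of the behaviour as described in this thread, not iNaturalist's actual code. The function name, the data layout, and the place names "Shared Region" and "Elsewhere" are made up for the example.

```python
def resolve_split_id(obs_places, output_atlases, common_ancestor):
    """Pick the taxon a genus-level ID would move to after an atlased split.

    obs_places: set of place names that contain the observation.
    output_atlases: output taxon name -> set of atlas places, or None if
        that output has no atlas (hypothetical layout).
    common_ancestor: name of the outputs' common ancestor.
    """
    # If any output lacks an atlas, atlases are ignored altogether.
    if any(places is None for places in output_atlases.values()):
        return common_ancestor

    # Outputs whose atlas includes at least one of the observation's places.
    matches = [name for name, places in output_atlases.items()
               if places & obs_places]

    # Exactly one match: use it. Zero or several: fall back to the ancestor.
    return matches[0] if len(matches) == 1 else common_ancestor


atlases = {
    "Campanula": {"Newfoundland", "Shared Region"},
    "Ravenella": {"San Benito County", "Shared Region"},
}
for place in ["San Benito County", "Newfoundland", "Shared Region", "Elsewhere"]:
    print(place, "->", resolve_split_id({place}, atlases, "Campanuloideae"))
```

Cases 1 and 3 both fall under the "exactly one match" branch, case 2 is the "several matches" branch, and case 4 is the "zero matches" branch.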

Does that seem correct?


Yep, that looks correct :slightly_smiling_face:
