Yet another reason for more bird curators: https://forum.inaturalist.org/t/problems-understanding-a-community-taxon/7898
There are plenty who'd like to assist, including myself. And why so many deviations? Is that just the convenient way of dealing with the backlog?
I'm not sure.
Cross-posting this from a couple weeks ago:
Hybrids and varieties account for quite a few as well, as they are not in Clements.
I believe this proves that we need not just one new avian curator but more than two. I suggest four curators for each vertebrate class. For example (and I know this is quite a jump from plans for Clements), there are only 2 fish curators, loarie being one of them. And yet I've been waiting for almost 4 months for the addition of a hybrid I've been catching all summer while fishing. So to me, there's too much on every curator's plate, and perhaps more than just 2 curators will help.
Hi folks,
Sorry for the delay on this. I'm aware that I've bottlenecked the bird taxonomic updates, and I apologize for how that has slowed things down. But I've been struggling with the fact that for groups like birds, with lots of observations and lots of associated content, it's just too difficult for curators to do all the steps needed under our current system to properly curate these taxa without introducing a lot of new issues.
Oftentimes just doing one simple operation to the tree, like splitting a genus in two, actually requires many, many taxon changes and moves in order to tie up all the loose ends. It's just too hard to do all these steps manually and too prone to error.
Over the past month, I've spent a lot of time using the Clements 2019 update as a test case for trying to figure out a more efficient way for us to properly carry out tricky taxonomic updates like those involved here.
Where I'm at is that I've come up with a higher-level structure for describing a taxonomic alteration, which may involve many individual taxon changes etc. to carry out. I've also come up with a new kind of figure for visualizing these changes. For example, here's a figure that tries to summarize the steps involved in splitting Alethe castanea off from Alethe diademata and the associated taxon changes (one split and 3 swaps).
And for each of these, I've created a group of all the taxon changes necessary to bring our taxonomy in line with Clements 2019. For example, here are the changes necessary for the figure above:
https://www.inaturalist.org/taxon_changes?change_group=Alethe+diademata+swing-split
I've tried to capture all of this in detail in this very long document I just posted:
https://www.inaturalist.org/pages/how+taxon+changes+work
Buried in that doc are links to all 176 alterations like the one in the figure needed to bring our taxonomy in line with Clements 2019. Most are pretty simple things like elevating a subspecies to species status. But a few are super gnarly and involve many many taxon changes like this one involving Oenanthe and a few other genera. And I wanted to make sure the system could handle super complex changes like these:
https://www.inaturalist.org/taxon_changes?change_group=Myrmecocichla+reshuffle
There are two next steps:
- It would be nice to commit these changes to finally get iNat's bird taxonomy in line with Clements 2019. All of the changes should be ready to go (e.g. the split outputs have atlases and ranges in most cases). Please take a look, though, and raise any concerns you may have here before I commit them. Also, we've realized that committing taxon changes with many, many observations, as many of these have, is creating performance problems, so I'll consult with the team, but when everyone is OK with these we'll probably have to find a time/way to commit them that doesn't bog down the site. Once we've gotten the taxonomy up to date with Clements, I have another set of processes to update the IUCN conservation statuses and range maps, which have also gotten quite out of date over the years.
- Please let me know what you think about the structures these figures are trying to convey. My vision is that we build functionality where, instead of manually creating a bunch of individual draft taxon swaps, individually committing them, and moving a bunch of associated taxa manually, you have a little tool for drawing a figure like these by specifying the input and output taxa and wiring up all the swaps and splits. Doing so would create the whole structure automatically, including all the new inactive taxa and all the taxon changes. Then, when you want to commit the structure, it would commit all relevant taxon changes and move all the relevant taxa automatically. That's what I've done here to create these 176 figures / groups of taxon changes, and what will happen when we're ready to commit them through the back end. But we don't yet have a front end tool for doing these.
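To make the "group of taxon changes" idea concrete, here is a minimal Python sketch of one way such a structure could be modelled. Everything in it is an assumption for illustration: `TaxonChange`, `ChangeGroup`, and their methods are hypothetical names, not iNaturalist's actual code; the split outputs and the placeholder swap names are invented, since the thread only says the Alethe example involves one split and 3 swaps.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaxonChange:
    """One individual change (hypothetical model, not iNaturalist's schema)."""
    kind: str              # "swap" or "split"
    inputs: List[str]
    outputs: List[str]
    committed: bool = False


@dataclass
class ChangeGroup:
    """A named bundle of taxon changes that is committed as one unit."""
    name: str
    changes: List[TaxonChange] = field(default_factory=list)

    def add(self, kind, inputs, outputs):
        # A swap is one-to-one; a split is one-to-many.
        if kind == "swap":
            ok = len(inputs) == 1 and len(outputs) == 1
        elif kind == "split":
            ok = len(inputs) == 1 and len(outputs) >= 2
        else:
            ok = False
        if not ok:
            raise ValueError(f"invalid {kind!r}: {inputs} -> {outputs}")
        self.changes.append(TaxonChange(kind, list(inputs), list(outputs)))

    def new_inactive_taxa(self):
        # Every distinct output would first be created as an inactive taxon.
        names = []
        for change in self.changes:
            for name in change.outputs:
                if name not in names:
                    names.append(name)
        return names

    def commit(self):
        # Stand-in for the real work (moving taxa, updating identifications):
        # here we only mark each change as committed and return the count.
        for change in self.changes:
            change.committed = True
        return len(self.changes)


group = ChangeGroup("Alethe diademata swing-split")
# Assumed outputs: a narrower Alethe diademata plus Alethe castanea.
group.add("split", ["Alethe diademata"],
          ["Alethe diademata (narrower sense)", "Alethe castanea"])
for i in range(1, 4):  # the 3 swaps; names are placeholders, not real taxa
    group.add("swap", [f"placeholder input {i}"], [f"placeholder output {i}"])

committed_count = group.commit()
```

The point of the sketch is only that the curator describes the inputs and outputs once, and the group derives both the inactive taxa to create and the set of changes to commit together.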
The ultimate goal, as several of you have written here, is to get back to having more curators involved in doing this work, but we need to make it at least an order of magnitude simpler to make these changes and an order of magnitude less error prone in terms of handling identifications and other associated content like distribution data. I'm hoping this is a step towards that, but I'm curious what you all think.
Whoa!!!
In the general implementation, do you also envision a more automated process to update Taxon Framework Relationships associated with taxon changes? Those also seem prone to error or neglect in the daily flow of taxon curation.
Will take some time to digest! But one immediate question: there are a good number of 2018 changes still to be committed, which presumably haven't been through this process. Can these changes be committed now?
@loarie - can you clarify the technical reasons for this: "For example, to change the name of the species Estrilda caerulescens to Estrilda coerulescens, a curator would first create a new inactive taxon with the name Estrilda coerulescens and rank species grafted to genus Estrilda"
What is the reason for adding the new one initially as inactive? When I do changes like this, and as a taxa curator I do a fair number, I would just create the new one as active. Is that likely to cause any issues?
Where is that quoted from? Ah I see it now, in that new page.
Adding active output taxa, but not committing the change immediately, can be/has been a huge problem with disagreeing IDs.
In a one-to-one change like that, I can't see any downside to adding the new output taxon as active and then immediately committing the taxon change.
That's directly from the document Scott added near the top. If you are not implementing the change immediately, that's a different matter, but if you are, is doing it initially as active causing any technical problems?
I guess adding it as initially inactive allows the "tree figure" icon to indicate its state change (from circle to square), in order to show it was activated due to a taxon change?
yes I agree they're error prone and create more work, so they're still not working right for curation involving a lot of people. But I find them essential for trying to check whether iNat is in line with an external reference and where it is not.
This system I'm piloting here properly handles TaxonFrameworkRelationships. For example, if we committed the structure involving these changes https://www.inaturalist.org/taxon_changes?change_group=Alethe+diademata+swing-split it would resolve that single deviation TFR into several "matches". So yes, I think they can ultimately be combined into a higher-level, easier-to-use interface.
It gets complicated, though, when people want to intentionally deviate from the reference. We still don't have a good way of separating when we're just lagging the reference (as in Clements 2019) from when we're trying to intentionally deviate.
in terms of the structure of the tree, iNat currently exactly matches Clements 2018 (ignoring all the hybrids etc. that aren't included in Clements). Do you mean situations where a new taxon was added when some other taxon should have been split?
If so, then yes, if there are situations when many observations were misspecified as a result we should retroactively split some taxa. Which taxa are you referring to specifically?
I'm again hoping this work here will help reduce that from happening so much in the future, but it remains hard to figure out when a new taxon was added "wholly new" vs split from some other taxon (especially for groups like Reptiles), so I suspect we'll have ongoing retroactive splits like this.
One reason is that iNat has stricter rules about active taxa than inactive (e.g. must have active parents, can't have siblings of the same name etc.), so sometimes it avoids problems to add them as inactive and let the taxon changes activate them.
The other reason is if you want to create the draft taxon changes and then let them sit for a while to get feedback from the community, it would be problematic to have the input taxa active alongside active output taxa.
My preference is to:
- create the inactive output taxa
- create the draft taxon changes
and then, when I'm ready to commit:
- move taxa
- commit the swaps
But I guess if you were doing everything at once then it would be fine to create active taxa from the get go.
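As a rough illustration of that protocol, here is a small Python sketch that separates setting up a swap from committing it, using the Estrilda example quoted earlier in the thread. The classes and function names (`Taxon`, `DraftSwap`, `activation_problems`) are hypothetical, and the two checks only mirror the rules mentioned above (active parent, no active sibling with the same name); this is not how iNaturalist actually implements it.

```python
class Taxon:
    """Hypothetical stand-in for a taxon record; not iNaturalist's real model."""

    def __init__(self, name, rank, parent=None, active=False):
        self.name = name
        self.rank = rank
        self.parent = parent
        self.active = active
        self.children = []
        if parent is not None:
            parent.children.append(self)


def activation_problems(taxon):
    """Return the reasons (per the rules mentioned above) a taxon cannot be activated."""
    problems = []
    if taxon.parent is not None:
        if not taxon.parent.active:
            problems.append("parent is not active")
        for sibling in taxon.parent.children:
            if sibling is not taxon and sibling.active and sibling.name == taxon.name:
                problems.append("an active sibling already has this name")
    return problems


class DraftSwap:
    """A one-to-one taxon change that stays a draft until commit() is called."""

    def __init__(self, input_taxon, output_taxon):
        self.input_taxon = input_taxon
        self.output_taxon = output_taxon
        self.committed = False

    def commit(self):
        problems = activation_problems(self.output_taxon)
        if problems:
            raise ValueError("; ".join(problems))
        # Committing activates the output and deactivates the input.
        self.output_taxon.active = True
        self.input_taxon.active = False
        self.committed = True


# Set-up phase: inactive output taxon plus a draft change; nothing is live yet.
genus = Taxon("Estrilda", "genus", active=True)
old = Taxon("Estrilda caerulescens", "species", parent=genus, active=True)
new = Taxon("Estrilda coerulescens", "species", parent=genus)  # inactive by default
draft = DraftSwap(old, new)

# Commit phase: can happen later, e.g. after community feedback.
draft.commit()
```

Because the output starts out inactive, the draft can sit for feedback without two competing active taxa, and the stricter active-taxon checks only run at commit time.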
Because of the performance problems we're experiencing with committing taxon changes, it's probably not long before we have to move to a system of people "setting up changes" (e.g. creating inactive taxa and draft changes) and then us committing them later on in a way that doesn't bog down the site, so this protocol of separating "setting up the change" from "committing the change" helps with that. In that spirit, these 176 Clements 2019 update "structures" are all set up and ready to go, but they haven't been committed yet.
For 2018 splits, in many cases the new taxa were created, and the subspecies swapped into them, but the original taxa were never split. So the tree structure may match, but there are observations that have not been updated. Examples include Eurasian Magpie (draft taxon change 57605), Red-eyed Vireo (57606) and Iberian Grey Shrike (59577). There are several others - all the outstanding 2018 changes have been drafted and should be ready to go, except for 57748, which has a problem. These splits do need to be committed to update observations and identifications (though many of these have been changed manually).
Can you give more details on this? Are the performance issues specific to cases when there are associated observations / identifications etc., or are they broader in all cases?
Obviously things like checklists etc. could be an issue, but I guess only a small percentage of taxa that don't have observations are present on checklists.
Can you add a symbol / legend / glossary to that document? I think it would help me understand it all to have a quick reference for each "thing" / symbol / color.