Should New Disruptive Technologies be Used for Classification in Ancient Linnaean Rankings



Would an image of the POWO dataset, from say 2016… be a simplified dataset? I mean, I don’t think anyone would create all the boxes from scratch. They’d probably start with something.

I mean personally I don’t know plant taxonomy well enough to know what has changed since 2016, but I would expect not that much. The taxonomy of Taraxacum has been the way it is now for decades for example; I think taxonomists have been aware of the issues with them and similar ambiguous reproduction/hybridization issues since long before genome comparisons were possible. Personally I’ve seen phylogenetics cause more shifting and splitting at higher levels than at species level, with birds and insects that I follow a bit more closely.


This is such a cool example that really captures the ecological relevance too - if the cryptic species that hosts the butterfly were to disappear, it would have ecological implications for the butterfly as well. Declines in the host plant species may be less noticeable (and less actionable) with several plant species lumped together.


Sorry, @stockslager – I got very very busy and just did light work here. And also, “What I perceive to be my vertical” does not communicate. What in the world is a vertical in this context?

cf. I think it’s a business/software engineering piece of jargon, in the same way engineers tend to borrow “orthogonal” from linear algebra to mean something like “unrelated” or more precisely “can be adjusted independently of each other”.


Sorry… I’m just saying that I have no formal training in botany, ecology, biology, etc. I do have formal training and professional experience in software development and a passionate interest in the natural world. My vertical… the lens through which I see things… is from an IT perspective. Specifically, applying software in a way that achieves some end result. This is where my resistance to splitting at the species level comes from. Modern software seems like it might require one of two things…

  1. the species rank will need to remain more static -or-
  2. the software will need to support bundling at different taxonomic levels and dynamic display based on user role.
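To make option 2 a little more concrete, here is a minimal sketch of role-based display. Everything in it is an assumption for illustration: the role names, the rank list, the placeholder taxon names, and the `display_taxon` helper are hypothetical and are not part of iNat or any existing software.

```python
# Ranks from coarsest to finest. "section" stands in for any level
# between genus and species (hypothetical choice for this sketch).
RANK_ORDER = ["genus", "section", "species"]

# Hypothetical mapping: the finest rank each user role gets to see.
FINEST_RANK_FOR_ROLE = {"casual": "section", "specialist": "species"}


def display_taxon(lineage, role):
    """Return (rank, name) at the finest rank this role is allowed to see.

    lineage maps rank -> taxon name. Unknown roles fall back to genus.
    If the lineage lacks the allowed rank, the next coarser rank is used.
    """
    finest = FINEST_RANK_FOR_ROLE.get(role, "genus")
    allowed = RANK_ORDER[: RANK_ORDER.index(finest) + 1]
    for rank in reversed(allowed):
        if rank in lineage:
            return rank, lineage[rank]
    raise ValueError("lineage has no displayable rank")


# Placeholder names; the real split species are not named here.
lineage = {"genus": "Lomatium", "section": "Section X", "species": "Lomatium sp. 1"}
print(display_taxon(lineage, "casual"))
print(display_taxon(lineage, "specialist"))
```

The same stored record is shown as a bundle to one role and as a split species to another, so the underlying taxonomy can change without changing what every user sees.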

In your carrot family example… it would be a shame if the difference between the four new species were so subtle that the CV module flickers between four options with the slightest fluttering of the leaves in a light breeze.

What I like about the example is that the decision to split the species was made because of a concrete use case… to preserve a butterfly. However… I’m not sure the butterfly is best preserved if the CV module, flickering between the four new species with their subtle differences, ends up identifying the plant only to genus. If it had id’d successfully to the old single species, a wise user might have known to look more closely using a key.

Even in this example (which is a favorable one wrt splitting)… I’m not sure you’ll get the outcome you’re hoping for by splitting the species, because the subtle differences that cause the Algo Id to flicker will result in an answer with less clarity, not more. It’ll just say it’s sure of genus, not species. At least that’s the way I see it.

Silo is an unnecessary pejorative. I’d rather use “vertical”, especially when referring to myself. :)

If I start from the assumption that the best definition for the rank of species is something like “the minimum grouping of organisms that is reliably distinguished by machine learning algorithms from all other such groupings based on visible surface features of the organisms in that group,” then this perspective would make sense to me.

But I don’t, and it doesn’t, and I suspect the same is true for most users of biological taxonomy. I want biological classification to make some kind of biological sense, even if the choice of taxonomic rank is arguably arbitrary for different biological phenomena (reproductive isolation, independent evolutionary lineages, etc.) or amounts of biodiversity.


Thank you for the expression of understanding for my concern. The reason I’m not starting with an assumption is that I also offer a software solution that allows biologists and others to split species as much as they’d like…

2 - the software will need to support bundling at different taxonomic levels and dynamic display based on user role.

What makes this option troublesome is that it implies “data normalization” across a hierarchy of nested analogues. Suppose the four new species of Lomatium were stored in a section or sub-genus or some level between genus and species, so that the algo could say… we’re not sure about species but we’re pretty sure it’s in this section (or sub-genus). This would allow the algo to have as much clarity as before the species was split. It would also allow the data to be modeled differently based on user role. I don’t think it would be modeled differently in this particular case… but perhaps in others. But I think this solution might require that section / sub-genus be used in a standard way across the taxonomic hierarchy (and also within iNat). Otherwise, how would the CV module know when to present section/sub-genus as an ID? Also, it’s possible I have a poor understanding of section / sub-genus. But it seems like splitting of species based on increasingly slight variations would require more use of levels between genus and species.
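Here is a rough sketch of the "not sure about species, but pretty sure about section" idea. It is only an illustration under stated assumptions: the lineage table, the placeholder species and section names, the 0.8 threshold, and the `roll_up` function are all hypothetical, and this is not a description of how the iNat CV module actually works.

```python
from collections import defaultdict

# Hypothetical lineage table with placeholder names: four split species
# share one section, and a fifth species sits in another section.
LINEAGE = {
    "Lomatium sp. 1": {"section": "Section X", "genus": "Lomatium"},
    "Lomatium sp. 2": {"section": "Section X", "genus": "Lomatium"},
    "Lomatium sp. 3": {"section": "Section X", "genus": "Lomatium"},
    "Lomatium sp. 4": {"section": "Section X", "genus": "Lomatium"},
    "Lomatium sp. 5": {"section": "Section Y", "genus": "Lomatium"},
}

RANKS = ["species", "section", "genus"]  # finest to coarsest


def roll_up(species_scores, threshold=0.8):
    """Return (rank, taxon, score) at the finest rank that clears threshold.

    species_scores maps species name -> classifier score. At each rank the
    scores of all species under the same taxon are summed, and the best
    taxon is accepted if its total reaches the (assumed) threshold.
    Returns None if no rank clears the threshold.
    """
    for rank in RANKS:
        totals = defaultdict(float)
        for species, score in species_scores.items():
            taxon = species if rank == "species" else LINEAGE[species][rank]
            totals[taxon] += score
        best = max(totals, key=totals.get)
        if totals[best] >= threshold:
            return rank, best, totals[best]
    return None


# Scores split almost evenly among the four look-alike species.
scores = {
    "Lomatium sp. 1": 0.30,
    "Lomatium sp. 2": 0.25,
    "Lomatium sp. 3": 0.20,
    "Lomatium sp. 4": 0.15,
    "Lomatium sp. 5": 0.10,
}
print(roll_up(scores))
```

With these made-up scores no single species clears the threshold, but the four look-alikes together do at section level, so the answer stops at the section instead of falling all the way back to genus. That only works if the intermediate rank is filled in consistently for every species in the table, which is the standardization concern raised above.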