Relational Architecture Above and Below Species Rank

stockslager · April 12, 2024, 9:31am

Someone on another thread shared a link with me about taxonomy. I’m not particularly interested in taxonomy but it was a kind gesture so I read it. The part that interested me was about Pantherine. Basically, the bit about morphologically and behaviorally distinct species breeding only in captivity but producing viable offspring.

The question I have is… when this happens in the wild, and the offspring can themselves reproduce, is there ever pressure to create a new genus above the new species that share traits from the original two species (especially if/when, unlike this scenario the original two species are from different genuses)? Basically, is taxonomy limited to a hierarchical rather than relational architecture? Software supports both hierarchical and relational architectures elegantly, but the traditional taxonomic structure is ancient.

Unsurprisingly, hierarchical db structure came before relational. Hierarchical in the 60s, and relational in the 70s. In practice, many large companies in the early 90s still had mostly hierarchical db structures. It wouldn’t be shocking to me if taxonomists are still limited to hierarchical (because of how long it’s taken for programmers to focus on them). Part of the problem would be that switching might put enormous pressure on existing code (and taxonomic paradigm). But would taxonomists see the switch as worthwhile? Or has it already happened?

Relational architecture above and below species would mean that a “saugeye” (a fish that’s a sauger / walleye hybrid) might be a new micro-species that points to both parent species (not just in terms of nomenclature but also in terms of data presentation). It gets more complicated if the Saugeye is eventually able to reproduce. It’s my understanding that it can’t right now. But if it started to? I imagine it would be looked upon as a new species within the Sander genus. In this case, it’s easy because both walleye and sauger are in that same genus. But what if they weren’t? Would there be pressure to create a new genus to force the new cross out of the two parent genuses? And would this only be because of the underlying taxonomic and technological data structures (or because taxonomists would do it this way anyway)?
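A minimal sketch of what “points to both parent species” could look like in code, assuming a hypothetical `Taxon` record that keeps its single hierarchical `parent` and adds an optional list of `hybrid_parents`. The class and field names are made up for illustration and are not iNaturalist’s actual schema; the only real-world data are the scientific names for sauger and walleye.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Taxon:
    """Hypothetical taxon record: one hierarchical parent plus optional hybrid links."""
    name: str
    rank: str
    parent: Optional["Taxon"] = None                               # hierarchical link
    hybrid_parents: List["Taxon"] = field(default_factory=list)    # relational links

    def lineage(self) -> List[str]:
        """Walk the hierarchy upward, ignoring hybrid links."""
        names = []
        node: Optional["Taxon"] = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names

    def hybrid_formula(self) -> Optional[str]:
        """Return 'parent 1 × parent 2' for a cross, or None for a non-hybrid."""
        if not self.hybrid_parents:
            return None
        return " × ".join(p.name for p in self.hybrid_parents)

    def parent_genera(self) -> List[str]:
        """Genera reached by climbing from each hybrid parent (one or several)."""
        genera = set()
        for p in self.hybrid_parents:
            node: Optional["Taxon"] = p
            while node is not None and node.rank != "genus":
                node = node.parent
            if node is not None:
                genera.add(node.name)
        return sorted(genera)


sander = Taxon("Sander", "genus")
sauger = Taxon("Sander canadensis", "species", parent=sander)
walleye = Taxon("Sander vitreus", "species", parent=sander)
saugeye = Taxon("saugeye", "hybrid", parent=sander, hybrid_parents=[sauger, walleye])

print(saugeye.lineage())
print(saugeye.hybrid_formula())
print(saugeye.parent_genera())
```

Because `parent_genera()` returns a list, the same record shape would also cover a cross whose parents sit in two different genera, without forcing a new genus into existence just to give the cross a single place to hang.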

It’s quite possible that I am, as I sit here typing, infuriating my tribe. The programmers. This is because they’ve done nothing wrong. The easiest way to apply binary programming logic to an existing paradigm is to take a snapshot of that paradigm and code to that snapshot. Software development is every bit as complicated as taxonomy. To try to debate the underlying taxonomic architecture while also trying to mimic it with binary logic would be impossible. As systems mature, there comes a time to have these conversations… which is why I bring it up.

The reason for relational architecture above and below species is its special distinction to non-scientists. New offspring from two different species wouldn’t have special significance. And a species derived from two different genuses wouldn’t have special significance. The special significance would still mostly be about the ability to reproduce with something.

It’s impossible to know if the underlying data for these displays has the relationships available and the UI just doesn’t use them, or if the relationships aren’t stored in the underlying technological and taxonomic data structures. Are the relationships only revealed via nomenclature (after “x” on the Latin name for saugeye)? The saugeye shows up as a full-blown species even though it’s a cross that can’t reproduce. Now, maybe it’s a full-blown species because you can kinda tell them apart (even a recreational fisherman can sorta tell them apart). Still, if it can’t reproduce, should it really be directly under genus…

If it’s missing from the architecture, I’d suggest allowing for every species to be displayed like an X (even if there is no use case for it yet). In other words, pretend the walleye and sauger that created the saugeye each came from a different genus. And the saugeye, upon developing the ability to reproduce, could reference both genuses when elevated to species.
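One way to picture this, as a sketch only: store parent links in their own table so any taxon can have any number of parents. The table names, column names, and the placeholder taxa (“Genus A”, “Genus B”, and so on) are all assumptions made up for this example; nothing here describes how iNat actually stores its taxonomy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxon (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    taxon_rank TEXT NOT NULL
);
-- One row per (child, parent) link, so a taxon may have several parents.
CREATE TABLE taxon_parent (
    child_id  INTEGER NOT NULL REFERENCES taxon(id),
    parent_id INTEGER NOT NULL REFERENCES taxon(id),
    PRIMARY KEY (child_id, parent_id)
);
""")

# Placeholder data: two species in different genera and a fertile cross of them.
conn.executemany(
    "INSERT INTO taxon (id, name, taxon_rank) VALUES (?, ?, ?)",
    [
        (1, "Genus A", "genus"),
        (2, "Genus B", "genus"),
        (3, "species a", "species"),
        (4, "species b", "species"),
        (5, "cross a x b", "species"),
    ],
)
conn.executemany(
    "INSERT INTO taxon_parent (child_id, parent_id) VALUES (?, ?)",
    [(3, 1), (4, 2), (5, 3), (5, 4)],
)

# Recursive query: every genus reachable by following parent links upward.
GENERA_OF = """
WITH RECURSIVE ancestor(id) AS (
    SELECT parent_id FROM taxon_parent WHERE child_id = ?
    UNION
    SELECT tp.parent_id
    FROM taxon_parent AS tp
    JOIN ancestor AS a ON tp.child_id = a.id
)
SELECT t.name
FROM taxon AS t
JOIN ancestor AS a ON t.id = a.id
WHERE t.taxon_rank = 'genus'
ORDER BY t.name
"""

genera = [row[0] for row in conn.execute(GENERA_OF, (5,))]
print(genera)
```

A purely hierarchical table would instead put a single `parent_id` column on `taxon`, which is exactly what forces a cross to pick one lineage.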

I believe this might provide the most flexibility to scientists (data and taxonomic) and provide the most clarity to iNat users.

But the wise scientist (data and taxonomic) must also ask…

What does all of this work, modeling data that has already been explored, discovered, and described, actually do to enable positive outcomes? If the only outcome is 1,000 exotics breeders becoming interested in Pantherines to create Ligers, then just stick with the ancient architecture that’s less revealing about where crosses are possible.

On the other hand, if it lessens architecturally driven species splitting, it might be worth doing. Two species from different genuses having offspring that would be a sub-species (like the Saugeye) but instead is made a species due to architectural constraints coupled with a newly found ability to reproduce. I have no idea how taxonomists would see a situation like this (and have no idea if it ever even happens). But if it did, I could see how a hierarchical architecture might influence the creation of a new species below a new genus.

stockslager · April 12, 2024, 9:32am

In my former career I had the enormous good fortune of being shown the ancient scrolls. Don’t believe me? What do you think things like taxonomic data were written on in the 1700s?

This wasn’t uncommon when I began my career in the early 90s. To be shown the ancient scrolls. They were still lying around somewhere. We were shown the scrolls so that we young kids (at the time) could fully appreciate the wisdom of the ancients. Not in terms of exploring, discovering, and describing things… but in terms of human factors.

Human factors to the IT professional is “an applied field of study that examines human abilities, limitations, behaviors, and processes in order to inform human-centered designs.” It’s inherently more MIS than Computer Science (I was MIS).

In any case, the ancient scrolls are probably why the ancient taxonomy data was stored in a hierarchical structure. Imagine being one of the ancients and recording information in the taxonomy scroll. How big would this scroll have been exactly? To record all the world’s known life at that time? It must have been a really big scroll. Like, really big.

Now imagine you found a new species and had to record that new species under the correct genus. How would you find the correct genus? You’d be paging through a giant scroll. Need to move or split a species… paging through the scrolls. You’d have spent as much time paging as discovering.

This also affects cross references. If there was a cross, how would you reference the two lineages? Would you pick a dominant lineage and refer to the other lineage from the last node in the dominant? Because if you did, whoever saw the reference would need to page through a mountainous scroll to reach the data about the cross.

But there are other things to consider. What may have happened to you in the 1700s if you documented a cross? Scientists, as I understand it, were sometimes confused with sorcerers in the 1700s. Was it worth causing yourself the extra paging for the cross-referencing just to document something that would harm your standing among the villagers?

Luckily we live in a different age. We can choose a different architecture. We exist here together. In good faith.

stockslager · April 12, 2024, 9:34am

This stuff is in nature talk. I don’t know if it has any merit. If somehow it does…

we should proceed slowly and cautiously, but deliberately.

we are trying to preserve and protect the work, not destroy and disrupt it.

we are pushing the truths of the past 300 years into the architecture of tomorrow.

we can’t do this without those who understand and remember why the data is the way it is.

the planning and execution of this task would require a community.

swampster · April 12, 2024, 12:31pm

iNat displays these as “genushybrids”:

In my understanding of your suggestions, you’re making things way more complicated than they need be, which would only increase confusion, not mitigate it.

spiphany · April 12, 2024, 1:15pm

What is “this task”?

Generally, before one suggests that an existing system is useless and should be completely redone, it is a good idea to first attempt to understand it and the reasons why it is the way that it is.

charlie · April 12, 2024, 1:26pm

I’m very familiar with taxonomy and think it should be redone. I definitely wouldn’t call it useless though. Just has some issues to work out…

omg kinda like microspecies splitting at the species level, isn’t it

swampster · April 12, 2024, 1:59pm

I’ve never argued in favor of this. The only thing I’ve been in favor of is not reinventing the wheel to solve largely-imagined (or at least way overblown) problems.

charlie · April 12, 2024, 2:26pm

Oh sorry, I wasn’t trying to target you directly with that comment, more just that people seem to always notice when changing the system is complicated, but not when the existing system is too complicated. Or something. But it was a badly placed post, sorry.

swampster · April 12, 2024, 2:33pm

Ah, I see. Thanks for clarifying and fair point.

stockslager · April 12, 2024, 6:03pm

I think I have an answer for you. working on it.

stockslager · April 12, 2024, 6:17pm

I never said anything was useless. Taxonomy is useful… it just doesn’t need to be illustrated at the deepest level on every application and for every role within applications. Nor is it necessary to have an expert’s understanding of taxonomy to be informed by it. If I can positively identify Lonicera maackii I can be more confident removing it. This is exactly how I started my native restoration 17 years ago. By knowing one single species. I remain largely ignorant of Taraxacum because they aren’t a battle I choose. They pop up occasionally in my woods, but there are much bigger battles at hand. Alliaria petiolata being another.

Thank you for providing me with this opportunity to respond to this perception.

spiphany · April 12, 2024, 6:32pm

You completely missed my point. My point was that you are presenting your thoughts as a sort of call to action (it is not clear to me exactly what it is you think needs to happen), even while your posts indicate you do not seem to very clearly understand the thing that you are criticizing (“not interested in taxonomy” – yet you are calling for it to be radically reconfigured).

Generally one is able to provide more constructive suggestions if one has a solid grasp of the existing system, has worked with it extensively, and can diagnose what works and what doesn’t and – most importantly – why.

stockslager · April 12, 2024, 7:03pm

Ah, I see. You and @swampster are concerned about the same thing. There is no problem, and even if there is, it’s not big enough to warrant a solution that is complex.

This is a perfectly legitimate concern. Does it mean I shouldn’t present a solution?

jnstuart · April 12, 2024, 7:13pm

One solution, if you can call it that, to current taxonomy is the PhyloCode. You can google it. Hasn’t really caught on but it supposedly addresses some of the problems in the Linnaean system while still incorporating many aspects of it.

stockslager · April 12, 2024, 7:26pm

Why do they need to exist?

swampster · April 12, 2024, 7:56pm

To be frank, a lot of what you’ve written seems to be coming from way out of left field, to the point that I’ve found myself wondering if you are trolling or not. If you do present a “solution”, I really think you need to take the time to make sure you both fully understand things and are clearly and effectively communicating them (and probably go over it with some friends/family before taking it to a public forum).

wildskyflower · April 12, 2024, 8:53pm

All taxonomic relationships represent a hypothesis. The hypothesis embedded in the tree of life is that these relationships are mostly hierarchical.

On some level, it must be true that there is some hierarchy; unless life evolved independently multiple times, any two organisms do have a most recent common ancestor, the most recent common ancestor of those organisms has a most recent common ancestor with a third organism, etc. I.e., for any set of n organisms there must have existed some organism that is the most recent common ancestor of all n, even if it goes all the way back to the very first organism to ever exist.
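As a rough illustration of the strictly hierarchical case, here is a sketch that finds a most recent common ancestor by walking single-parent links. The `parent` map is a deliberately oversimplified toy tree whose labels are assumptions for the example, not a real classification.

```python
from typing import Dict, List, Optional


def ancestors(node: str, parent: Dict[str, str]) -> List[str]:
    """Return the node followed by each ancestor, nearest first."""
    chain = [node]
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def mrca(a: str, b: str, parent: Dict[str, str]) -> Optional[str]:
    """First node on b's ancestor chain that also appears on a's chain."""
    seen = set(ancestors(a, parent))
    for node in ancestors(b, parent):
        if node in seen:
            return node
    return None


# Toy tree: every node has exactly one parent, so it is strictly hierarchical.
parent = {
    "human": "primates",
    "chimp": "primates",
    "primates": "animals",
    "animals": "eukaryotes",
    "oak": "plants",
    "plants": "eukaryotes",
}

print(mrca("human", "chimp", parent))
print(mrca("human", "oak", parent))
```

With one parent per node there is exactly one chain of ancestors to compare, which is what makes the lookup this simple.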

However, it is equally impossible for the tree of life to be strictly hierarchical. If the two of us strictly diverged from a single common ancestor, then the point where our family trees meet would also be our Identical Ancestors Point. This is certainly not true; there are probably trillions of different paths you and I could take through our respective family trees to find a common ancestor. But there have not been trillions of humans so these paths must be some kind of complex web. The fact that relationships cannot be strictly hierarchical in detail results in Pedigree Collapse.
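The arithmetic behind that can be sketched quickly. Each generation back doubles the number of ancestor slots in a family tree, so the sketch below counts how many generations it takes for the slots to outnumber a given population. The figure of one trillion is only a generous ceiling borrowed from the “there have not been trillions of humans” remark, not an estimate of how many people have actually lived.

```python
def generations_until_slots_exceed(population: int) -> int:
    """Smallest number of generations g such that 2**g ancestor slots > population."""
    generations = 0
    while 2 ** generations <= population:
        generations += 1
    return generations


ONE_TRILLION = 10 ** 12  # deliberately generous upper bound, an assumption

g = generations_until_slots_exceed(ONE_TRILLION)
print(g, 2 ** g)
```

Past that generation count, the same individuals must fill more than one slot, which is pedigree collapse.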

However, just because our relationship is highly non-hierarchical does not automatically mean that all relationships we have with other organisms are also highly non-hierarchical. It is really possible that there is basically only one path through our ancestry to get from us to an oak tree. You could say that the fact that the relationship between us is highly non-hierarchical is the defining aspect of the fact that we are the same species, in a mathematical sense. I think following your database analogy, a species is a non-hierarchical subgraph embedded in a much larger graph that is fairly hierarchical.
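A small sketch of that distinction, assuming ancestry is stored as a map from each individual to its parents: counting the distinct upward paths between two nodes gives 1 in a strictly hierarchical chain and more than 1 in a web. Both graphs and all node names below are invented for the example.

```python
from functools import lru_cache
from typing import Dict, Sequence


def count_paths(parents: Dict[str, Sequence[str]], start: str, target: str) -> int:
    """Count distinct upward paths from start to target in an acyclic parent graph."""

    @lru_cache(maxsize=None)
    def walk(node: str) -> int:
        if node == target:
            return 1
        # Sum the paths through every parent; a node with no parents contributes 0.
        return sum(walk(p) for p in parents.get(node, ()))

    return walk(start)


# Strictly hierarchical: one parent per node, so exactly one path.
chain = {"x": ["y"], "y": ["z"]}

# Web: both parents descend from the same two grandparents.
web = {
    "child": ["mom", "dad"],
    "mom": ["gm", "gf"],
    "dad": ["gm", "gf"],
    "gm": ["root"],
    "gf": ["root"],
}

print(count_paths(chain, "x", "z"))
print(count_paths(web, "child", "root"))
```

On this view, the path count between two members of one species would be enormous, while the count between a human and an oak would stay close to 1.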

The issue that hybrids raise is that there isn’t an instant transition where the graph goes from non-hierarchical to hierarchical in a single vertex. There will usually, perhaps always, be some degree of non-hierarchicalness within a genus, just less so than there would be within individual species. Then there should be less and less the higher the rank you move to. You could say that the goal of Cladistics and Systematics is to build a hierarchical version of the tree that is the best statistical approximation of the graph of what actually happened. As you go higher up the tree of life, you should find that the best possible approximation gets closer and closer to what really happened (note: not the same as saying the best possible approximation will always get easier to find in practice).

Genetics is a powerful tool for this, but not perfect. Things like viruses and bacteria create possibilities for horizontal gene flow, which could break some of the strict relationship of genes to the divergence of the tree of life. And there is entropy; because your DNA only has so many base pairs, and your tree has much more history than there are base pairs, it must be the case that many past events in your family tree must have no present influence on your DNA whatsoever.

All that is to say: hybrids between species are important, a key mechanism of evolution, should be given more consideration and study than they are now, and should not be treated as an aberrancy or inconvenience. However, that is not the same as saying that hierarchical models are not conceptually useful tools for approximating reality, or that we need to completely throw out those models because they are imperfect.

stockslager · April 12, 2024, 8:59pm

I completely agree. But my broader question is… is there any value to re-evaluating whether or not the underlying architecture should be more relational? If the answer is “no”, I’m fine with it. But who to ask the question to?

stockslager · April 12, 2024, 9:02pm

The other piece of it that I can’t stop wrestling with in my mind is…

What if the taxonomic structure could be viewed as a complex hierarchy of nested analogs, each structure analogous to the others but possessing differences defined by subject matter experts over the centuries? If this is observably true, what happens when complex software is built on top of these analogs? Will the structure be trapped in time because no one is willing to revisit it along with the software?

The reason I wrestle with this is because if we need to do data-mapping informed by experts to address the analogs… we may as well look at relational architecture at the same time.

wildskyflower · April 12, 2024, 11:12pm

I guess I don’t understand what you are asking. For hybrids we can write the name parent species 1 × parent species 2 of course. Most hybrids on the site do have a name like that even if it isn’t the default, unless their parents aren’t known (often for a true hybrid it is much easier to determine that an individual is a hybrid than to determine the parents). If you are asking whether we can think of it as grandparent species 1 × grandparent species 2 × grandparent species 3 × grandparent species 4, at some point it is not practical to determine that kind of information from a specimen; you can get that kind of thing in horticulture where you have a well-documented pedigree due to manual human cross-pollination (and usually it is crossing varieties and not multiple completely distinct species), but it is not really possible or at least practical with a wild specimen from nature.
