What is a "taxon concept"?

Discussion of its definition here:
https://github.com/tdwg/tnc/issues/1

Agreed. I’m not criticizing the lag time at all. It’s especially challenging when the paper that revises the taxonomy doesn’t do a good job of defining the geographic boundaries of its revised taxa. I seem to run into more and more cases of that. For some new taxa in my region, I really don’t know where they should occur or whether their ranges overlap, and no one can tell them apart morphologically, certainly not from a photo.

Thanks, I’ll read that paper. It came up in my poking around yesterday, too.

I’m most interested in what a taxon concept is in terms of the actual data structure and data management in, e.g., iNaturalist.

In that sense, I might describe an iNaturalist taxon concept as: a scientific name, a unique identifier, and a reference to external data. The circumscription of a taxon does not appear in an iNaturalist taxon concept. The hope, presumably, is that it appears in the external data. What we expect that data structure to look like and what we might do with the data, though, is not clear.
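To make that concrete, here is a minimal sketch of the record I have in mind. The class and field names are my own assumptions for illustration, not iNaturalist's actual schema, and the URL is a placeholder.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class TaxonConcept:
    """Hypothetical record; NOT the real iNaturalist schema."""
    taxon_id: int                      # unique identifier
    scientific_name: str               # the name attached to the identifier
    external_reference: Optional[str]  # pointer to external data, if any


concept = TaxonConcept(
    taxon_id=1,
    scientific_name="Alpha beta",
    external_reference="https://example.org/taxa/alpha-beta",  # placeholder URL
)

# The record carries a name, a key, and a pointer; nothing in it
# describes the circumscription of the taxon.
field_names = [f.name for f in fields(TaxonConcept)]
has_circumscription = "circumscription" in field_names
```

Under these assumptions, anything we want to know about the circumscription has to be recovered by following `external_reference`.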

For instance, suppose External Data includes a distribution map indicating Alpha beta is present in counties A, B, C, D, E. I record an observation of Alpha beta from county F. Does this mean I am using a different concept of Alpha beta than External Data?

Thank you!! That helped a lot. Didn’t realize there was a specific term for that. Best of luck in your studies.

I would say no. Your identification of that observation as Alpha beta is first of all a judgment on a morphological basis: “Alpha beta is the best fit”. If it is confirmed, and lies significantly outside the known distribution range in a relevant way, this may result in an adjustment of the taxon concept.

I’ll have to see if the Franz et al. paper, or others that have come up, provide explicit criteria on questions like this. My suspicion is that taxon concepts are probably not well enough defined for different people working in the field to have consistent intuitions.

In the Alpha beta case, assuming the observation falls within External Data’s morphological circumscription of that taxon, my intuition is to say that either there’s an error in their map or it’s a new county record, but in neither case would I say that the taxon concept had changed, or was different between me and External Data.

iNat has active names and an accumulating pile of linked inactive names. The active names might be linked to a name in an external taxonomic authority. Most of those authorities don’t explicitly provide unique identifiers for any kind of definition of a taxon concept, although they often provide name-based unique identifiers. Most of those authorities will encapsulate concepts in terms of preferred name + synonyms. Usually they don’t explicitly say where that synonymy comes from. Usually they don’t explicitly encode misapplications or narrowing/broadening of taxon concepts over time. Most digital taxonomic authorities really are ‘moving targets’ in that sense.

In iNat the taxon concept applied to an identification is the one in the head of the user. It is condensed down to an enforced ‘active name’ in use at that time. Somebody else may then perform a ‘taxon’ swap and inactivate the use of that name.

Sometimes these user-based concepts might be reliably linked to a specific published concept, perhaps a flora the user worked through to arrive at a name. However, I think most seasoned naturalists, professional or not, would say their concepts have developed over time, from reading multiple sources and looking at multiple collections. They are tacit taxon concepts. There’s no way such tacit concepts could be encoded.

Names linked to unique identifiers and grouped by synonymy are still the most pragmatic way of capturing identifications in a system like iNat. The key issue then becomes how these identifications, using such a name-based infrastructure, can be used to convey what concepts are being used, or what alternative concepts might be possible. I don’t think the iNat architecture is currently appropriate for that purpose, but I’m not sure there’s a magic bullet to solve it, at least not one that users who are not concept-savvy would understand, or that could be employed without significant computational overhead.

That’s consistent with my understanding.

I believe that is the case, more or less (further comments on this point below). Given the “taxon frameworks” that were introduced a few years ago and some fairly explicit guidance in the help documentation, though, the iNaturalist staff apparently do not want this to be the case. Instead, users should be making identifications based on the taxon concepts in the external resources linked to each taxon id. Presumably, downstream data users should interpret the identifications according to those taxon concepts, as well.

This has the effect of constraining the ability of users to apply their mental taxon concepts. For those who follow the iNaturalist policy and make identifications based on the iNaturalist taxonomy, when that taxonomy and their own mental taxon concepts are incompatible, I think the resulting identifications will often be consistent with neither the user’s own mental taxonomy nor the taxon concepts of the linked external source.

Agreed. I was thinking about this topic a bit when I went through Guy Nesom’s revision of Heterotheca sect. Chrysanthe earlier this year—that was probably the most concerted effort I’ve made in quite a while to thoroughly understand and apply the taxon concepts of a particular taxonomic work, but given how difficult it is to understand the variation in these plants I am certain my taxon concepts diverge from Nesom’s in many ways, some of which I am aware of and probably many more I am not.

I think the current iNaturalist system is intended to explicitly link identifications to defined taxon concepts. I don’t think it does so, though, nor that there is likely to be a feasible system that has this outcome.

As discussed in my last thread on the topic, I think there are ways that iNaturalist could be substantially improved with relatively minor modification, and that the limiting factor is neither user-friendliness nor computational load / coding effort. Rather, it is conceptual unfamiliarity. The core of this is: store the name the identifier actually used, and do the grouping by synonymy afterward rather than at the moment of identification.
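A minimal sketch of what I mean by grouping after the fact, assuming a hypothetical synonymy table kept separately from the identifications. None of these names correspond to anything in the real iNaturalist code base.

```python
from dataclasses import dataclass

# Hypothetical synonymy table: name as entered -> currently accepted name.
# It can be revised at any time without touching stored identifications.
SYNONYMY = {
    "Alpha beta": "Delta beta",
    "Delta beta": "Delta beta",
}


@dataclass(frozen=True)
class Identification:
    observation_id: int
    user: str
    name_as_entered: str  # stored verbatim, never overwritten by a swap


def accepted_name(ident, synonymy):
    """Resolve at query time; fall back to the entered name if unknown."""
    return synonymy.get(ident.name_as_entered, ident.name_as_entered)


def group_by_accepted_name(idents, synonymy):
    """Group identifications under whatever name is accepted right now."""
    groups = {}
    for ident in idents:
        groups.setdefault(accepted_name(ident, synonymy), []).append(ident)
    return groups


idents = [
    Identification(101, "user_a", "Alpha beta"),
    Identification(102, "user_b", "Delta beta"),
]
groups = group_by_accepted_name(idents, SYNONYMY)
```

Both identifications land in the same group, but each still records which name its identifier chose. If the synonymy changes, only `SYNONYMY` needs editing.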

I won’t go all the way down that rabbit hole again, but there are a couple of other related things that may be worth mentioning.

First, there is useful information in the particular name used by the identifier. For instance, suppose in 2000 Alpha beta is transferred to the genus Delta. In 2010, two species are segregated from Delta beta: Delta epsilon and Delta zeta. If an identifier uses the name Alpha beta, that implies a circumscription that includes Delta epsilon and Delta zeta, even with no further information.
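Here is a rough sketch of that inference, using only the hypothetical history from the example. The event format and function name are my own invention for illustration.

```python
# Hypothetical nomenclatural history from the example above.
# Each event: (year, kind, old name, resulting names)
HISTORY = [
    (2000, "transfer", "Alpha beta", ["Delta beta"]),
    (2010, "split", "Delta beta", ["Delta beta", "Delta epsilon", "Delta zeta"]),
]


def implied_circumscription(name, history):
    """Current names potentially covered by `name`, taken in its broadest sense."""
    covered = {name}
    # Replay events in chronological order, replacing each affected
    # name with the names that resulted from the event.
    for _year, _kind, old, new in sorted(history, key=lambda event: event[0]):
        if old in covered:
            covered.discard(old)
            covered.update(new)
    return covered


broad = implied_circumscription("Alpha beta", HISTORY)
narrow = implied_circumscription("Delta epsilon", HISTORY)
```

Here `broad` covers Delta beta, Delta epsilon, and Delta zeta, while `narrow` covers only Delta epsilon. Note that the sketch cannot tell whether someone entering "Delta beta" means the pre-2010 or post-2010 sense; it simply assumes the broadest one.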

Second, there is a very direct and simple way for iNaturalist to encode the tacit taxon concepts used by individual identifiers: the set of all observations identified as Alpha beta by User is User’s circumscription of Alpha beta—or, at least, is the extensional aspect of User’s circumscription. To record this circumscription, one merely needs to allow User to enter identifications with the names that correspond to User’s mental taxon concepts. What one might do with that circumscription in any kind of “big data” sense is not immediately obvious. For taxonomists, it would have obvious utility for the kind of direct human interpretation of taxon concepts that taxonomists have been doing forever. For instance, if I want to understand Guy Nesom’s concept of Heterotheca polothrix, looking at the set of herbarium specimens he identified as Heterotheca polothrix is very helpful. Of course, future taxonomists will not have any particular interest in understanding the taxon concepts of most iNaturalists. Some of our current iNaturalists, though, are precisely the people whose concepts I want to review in order to understand the taxonomy of a particular genus.
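As a sketch of how little is needed to record that extensional circumscription, assume identifications are stored as simple (observation, user, name) records. The data and names below are made up for illustration.

```python
from collections import defaultdict

# Hypothetical identification records: (observation id, user, name as entered)
IDENTIFICATIONS = [
    (1, "user_a", "Alpha beta"),
    (2, "user_a", "Alpha beta"),
    (1, "user_b", "Alpha beta"),
    (2, "user_b", "Delta epsilon"),
    (3, "user_b", "Alpha beta"),
]


def extensional_circumscriptions(records):
    """Map (user, name) to the set of observations that user placed under that name."""
    ext = defaultdict(set)
    for obs_id, user, name in records:
        ext[(user, name)].add(obs_id)
    return dict(ext)


ext = extensional_circumscriptions(IDENTIFICATIONS)
a = ext[("user_a", "Alpha beta")]
b = ext[("user_b", "Alpha beta")]

shared = a & b   # observations both users call Alpha beta
only_a = a - b   # user_a calls these Alpha beta, user_b does not (or has not seen them)
```

In this toy data the two users agree on observation 1, and observation 2 is where their concepts of Alpha beta visibly diverge. This only works if each user's name is stored as entered.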

One other question—I’m not sure what exactly is accomplished by unique identifiers. If iNaturalist has two records for Alpha beta, one with taxon ID 001 and the other with taxon ID 002, what is it that I, as a downstream data user, would want to do with those taxon ID numbers?

My impression is that it’s like having a key field, but no related table with information matched to the key field values.

That pretty well sums it up, actually. You have grasped what it means.

That turned out to be a useful lead, thanks! I contacted Rich Pyle. He was kind enough to share the PowerPoint, and we’ve started some useful discussion by email. His viewpoint is a lot closer to mine than I would have guessed, although in many ways we’re approaching the problem from different directions.

Regards,
Patrick
