Improving iNaturalist's nomenclature & taxonomy

Problem is, there would be no way to know whether the taxonomic concept of Yucca torreyi (as the hypothetical example here) being used for the new split is the same, or broader, or narrower, than what the identifier had in mind at the time they added their ID. The names would look the same, so most users wouldn’t even think to check whether the concepts were also the same.

2 Likes

That’s the case for every identification on iNaturalist as it is.

1 Like

This is kind of a whole different rabbit hole, but I don’t think anyone really knows exactly what a taxon concept is or how to tell if two taxon concepts are the same. For instance, we might consider a distribution map to be a component of a taxon concept. Now suppose someone names a new species that is known from Sandoval, Cibola, and Valencia counties, New Mexico. We do some field work to explore similar habitats beyond the known range, and we find this species in northern Socorro County. More field work yields populations in southern Socorro County, then Sierra County, then western Lincoln County. By the end of this, perhaps the known range is three times as large as when the species was published. So, big change in the known distribution of the species—is it the same taxon concept? If not, when did it change?

In any reasonably difficult genus, I think it is also inevitable that the set of plants that I would call a certain name will differ at least a little from the set of plants the authoritative monographer of that genus will call by that name. This is true even if I put a fair amount of effort into understanding and following the monographer’s work. Some of the difference will surely be errors on my part; some, errors on the monographer’s part. I may also arrive at some insights into where exactly the morphological boundary between two species lies that the monographer did not, and surely I will fail to understand some of the monographer’s insights. This means that within the class of plants where the monographer and I differ, some subset of them will be attributable to misidentification, some subset to correct application of different understandings. Are we using the same taxon concepts? How different do our misunderstandings have to be? How can we distinguish misidentification from a difference in concepts?

In practical terms in the context of a platform like iNaturalist, the problems are much more basic. I think there is no doubt, at least within the parts of the botanical world where I hang out, that most observers and identifiers don’t know what taxon concept iNaturalist is using in the first place. When I observe Astragalus oöcalycis, I type the name into the ID box, and if iNaturalist has the name in its taxonomy I just click on it and move on. I don’t know what taxon concept iNaturalist is using. I don’t even know if there are meaningfully different taxon concepts to choose between in this case. Similarly, when I have occasionally seen that iNaturalist has multiple taxon records for the same taxon name (i.e., the taxon_id numbers differ), I don’t have any idea what taxon concept difference might exist between them and I don’t have any particular reason to try to figure it out. Suppose I did go through the list of taxon concepts and figure out which one corresponds best to my understanding of the taxon. It doesn’t really matter—I can only use the one that’s marked “active” (assuming one of them is), so that’s the one I’ll use even if it isn’t a good fit.

I have some sympathy for the intent behind taxon concepts. A taxon going by the same name can indeed differ substantially in its circumscription between taxonomic works, floras, and so on. It would be useful to distinguish between these different circumscriptions and record which one is being used in a particular identification. But that’s not what taxon concepts achieve. They’re not really coherent enough conceptually for me to tell how to define them or how we’re supposed to work with them. In practice, even if taxon concepts worked perfectly, they would still be attempting to capture a level of precision and detail that simply does not exist in the identification. If a platform like iNaturalist attaches a particular taxon concept to a particular identification, we have no idea whether or not the identifier is even aware of it, so we can’t expect that the attached taxon concept conveys information about the intent of the identifier. At best, it says something about what iNaturalist wishes the intent of the identifier were.

The same then applies to the taxon frameworks and deviations that iNaturalist attaches to taxon concepts. An additional problem here is that when iNaturalist links its taxon concept to an external data source like POWO, the taxon concept on POWO is not a static entity. For instance, Heterotheca villosa 77399 on iNaturalist says it matches the taxon concept for Heterotheca villosa on POWO. The taxon concept of Heterotheca villosa on POWO has changed dramatically over the past couple of years, so that this is certainly not one taxon concept but several. If we want to point to a single taxon concept on POWO, we need to include a particular time point, e.g. “Heterotheca villosa on POWO as of 15 Mar 2021”. I’m not sure POWO maintains the kind of historical data that would let us figure out what the taxon concept was on a particular date, so probably iNaturalist would have to store that data itself. Meanwhile, for users on iNaturalist: suppose you go over to POWO in 2020, you do your best to understand POWO’s taxon concept for Heterotheca villosa, then you apply that understanding in your identifications on iNaturalist. Unless you’re in the habit of going back over to POWO to check, you’re not going to know that in late 2022 the taxon concept you’re applying in your identifications is now radically different from the taxon concept on POWO. And on iNaturalist, Heterotheca villosa 77399 just says it matches the POWO taxon concept the whole time.
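
To make the time-point issue concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `ConceptRef` class, its field names, and the `needs_recheck` helper are invented for illustration and do not reflect how iNaturalist or POWO actually store anything. The "late 2022" date is likewise just a stand-in.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ConceptRef:
    """Hypothetical pointer to a taxon concept held by an external source."""
    source: str   # external authority, e.g. "POWO"
    name: str     # the name as it appears at that source
    as_of: date   # the date on which the source was consulted


def needs_recheck(mine: ConceptRef, current: ConceptRef) -> bool:
    """True when source and name agree but the snapshots were taken on
    different dates, so the two concepts cannot be assumed identical."""
    same_label = (mine.source, mine.name) == (current.source, current.name)
    return same_label and mine.as_of != current.as_of


# Assumed dates, for illustration only.
learned = ConceptRef("POWO", "Heterotheca villosa", date(2021, 3, 15))
today = ConceptRef("POWO", "Heterotheca villosa", date(2022, 12, 1))

print(needs_recheck(learned, today))
```

Without the `as_of` field the two references would compare as equal, which is the situation described above: the label says "matches POWO" the whole time, and nothing records which version of POWO was meant.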

Long story short: Apart from additional text commentary in observation notes and comments, which we’re not likely to be able to incorporate in any automated processes, the only reliable information we have, or are likely to have, for understanding the intent of a particular identification is the name used by the identifier. The name alone is often ambiguous. This is simply a constraint we have to deal with.

2 Likes

Just to clarify, in the context of my posts here, I have been referring to taxonomic concepts in the narrow, nomenclatural sense. Does the usage of a name include the type specimens of more than one named entity, and if so which ones? Is someone else using a different name for some of the types that I include under my name? Those questions are tractable using the data already available in (or capable of being in) iNat and (for example) POWO.
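
In this narrow sense, each usage of a name can be treated as the set of types it includes, and comparing two usages reduces to comparing sets. A rough Python sketch follows; the function name and the placeholder type labels are made up for illustration and are not drawn from iNat or POWO data.

```python
from __future__ import annotations


def compare_circumscriptions(mine: frozenset[str], theirs: frozenset[str]) -> str:
    """Describe how my usage of a name relates to someone else's,
    based only on which types each usage includes."""
    if mine == theirs:
        return "same"
    if mine < theirs:       # proper subset: I include fewer types
        return "narrower"
    if mine > theirs:       # proper superset: I include more types
        return "broader"
    if mine & theirs:       # some shared types, some not
        return "overlapping"
    return "disjoint"


# Placeholder labels standing in for real type specimens.
before_split = frozenset({"type of name A", "type of name B"})
after_split = frozenset({"type of name A"})

print(compare_circumscriptions(after_split, before_split))  # narrower
```

This only answers the nomenclatural question. It says nothing about which individual plants a given identifier would actually assign to the name.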

I agree, beyond that narrow sense, we get into the realm of pure taxonomic opinion, which at best can only be nailed down by reference to published treatments, annotations on museum specimens (and maybe on iNat observations??), etc.

1 Like

Agreed. Your comment just prompted my mind to go down that path. :-)

1 Like

For what it’s worth, although it’s still more of a pipe dream at this point… supposing one had the taxonomies of a couple dozen floras digitized for the western U.S., it would then be trivial to take the name applied to a particular observation and pull out the set of taxonomies in which that is an accepted name and the nomenclatural circumscriptions from each. Supposing, further, that you had a big set of identifications from a particular identifier, you could probably aggregate the names used across that set of identifications to get a reasonably good idea what taxonomy they’re using.
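
As a rough illustration of both steps, here is a Python sketch. It assumes the digitized floras are available as a simple mapping from flora label to accepted names, each with the set of names it includes. That structure, the function names, and the placeholder floras and names are all hypothetical.

```python
from collections import Counter

# Hypothetical digitized taxonomies: flora -> accepted name -> included names.
FLORAS = {
    "Flora A": {
        "Genus alpha": {"Genus alpha", "Genus gamma"},  # lumps gamma into alpha
        "Genus beta": {"Genus beta"},
    },
    "Flora B": {
        "Genus alpha": {"Genus alpha"},
        "Genus gamma": {"Genus gamma"},
    },
}


def circumscriptions_for(name, floras):
    """For each flora that accepts `name`, return its circumscription."""
    return {label: accepted[name]
            for label, accepted in floras.items() if name in accepted}


def rank_floras(names_used, floras):
    """Rank floras by how many of an identifier's IDs use a name they accept."""
    scores = Counter({label: 0 for label in floras})
    for name in names_used:
        for label in circumscriptions_for(name, floras):
            scores[label] += 1
    return scores.most_common()


ids_by_one_identifier = ["Genus alpha", "Genus beta", "Genus beta"]
print(circumscriptions_for("Genus alpha", FLORAS))
print(rank_floras(ids_by_one_identifier, FLORAS))
```

A simple count like this is only a first guess. Names accepted by every flora carry no signal, so a real version would probably weight the names that distinguish one taxonomy from another.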

Using iNaturalist observations as a basis for understanding the taxon concepts of particularly important identifiers on the platform is also on my list of useful things we could do with the data provided we’re pretty sure that the names applied actually reflect the intent of the identifier. For instance, Dave Ferguson knows a lot more about Opuntia than I do, and recognizes a lot more species than I do or can… for him and me to use the same taxonomy isn’t really pragmatically feasible… but his taxonomic concepts are valuable whether I can (currently, at least) apply them or not, and he doesn’t tend to publish much.

2 Likes

Could this be solved by simply adding another field called “inactive_taxon_ID_by_user”? Specific identifications by users are downloadable. If the inactive identifications were also downloadable, that would at least make the data available. And, if these could somehow be made searchable, this would allow a backdoor to display one’s specific taxon concept of preference.

EDIT: well, inactivated taxon IDs due to taxon changes. Inactive taxon IDs generally might pose problems since there can be more than one per observation. If there were the ability to add an identification of an inactivated taxon as an inactivated identification, that seems like it would properly allow the use of an alternative taxon concept while still representing the preferred taxonomy on iNaturalist. Perhaps that would get around the problem of changing taxa in iNaturalist directly?

3 Likes

For what it’s worth, I think this is an important topic. Even if the only way we can figure out how to use it is awkward and difficult to utilize, implementing a change here would be valuable. Otherwise, I feel like we’re throwing away data that could be utilized by future researchers. It may not be particularly obvious to most people why these data are important, but they represent a history of species circumscriptions (like annotations on a preserved specimen). Those circumscriptions can be referenced to avoid future confusion with identification, documented to show where problems in taxonomy still exist, and tested as species hypotheses in species delimitation studies (I’m currently working on refining the last one).

Identifications of people who write taxonomic resources are present on iNaturalist. Taxonomic decisions (at least my own) have been influenced (strongly in my case) by the observations housed on iNaturalist. To me and probably others, iNaturalist ends up functioning as a virtual natural history museum. This means that the identifications function as annotations representing taxonomic beliefs based on the best available evidence at the time of identification. In the specimen digitization community, data of identifications representing outdated taxonomy is digitized alongside current taxonomy. I know many don’t see iNaturalist as a natural history museum or even a scientific tool but it is being utilized in this way successfully. Finding a way to access these alternative taxon concepts on iNaturalist would be a step towards meeting data standards of the broader community of natural history museum curators and aiding in future research.

8 Likes

Yup, that’s where I ended up in my suggestions here. There are other things that would make the data a lot more human-readable and less prone to misinterpretation, but the basic idea is: allow IDs using inactive names; have the UI treat those IDs the same way it already deals with IDs using inactive names; provide some basic but user-friendly way to interact with those IDs (where my definition of “user-friendly” includes “download a related table” but not “try to wade through the API’s xml hierarchy”).

I continue to believe that this is not technically difficult. I think the idea of multiple taxonomies just kind of freaks people out, to the extent that there’s guilt by association. iNaturalist could be multiple-taxonomy-friendly without implementing anything “new” and with little enough change to the UI that I bet 99.9% of users wouldn’t notice.

2 Likes

Yes, this is a big part of my thinking, as well. The best way to learn the taxonomy of any difficult genus is to review specimens annotated by the best prior worker on that genus. Access to physical specimens is a limiting factor. Even if you’re at an institution that has an herbarium and you can do loans, it’s a lot of logistical hassle. If you’re not at that kind of institution, you’ve just got to schlep yourself to wherever the specimens are. Being able to do this digitally is incredibly valuable. It doesn’t replace working with specimens—with very rare exceptions you don’t get the same level of morphological detail—but the ease of access completely changes the landscape of what you can do and where you can do it from.

Many of the taxonomists whose IDs you would want to review are already on iNaturalist, and iNaturalist is really close to being an excellent tool for this purpose. It’s just not quite there because:

1. It’s hard to work with IDs by user (yes, you can do it, but it’s not great).

2. Users can only input the “active” name on an ID. Also, IDs are presented in a somewhat confusing fashion; it’s not immediately obvious whether an ID was created by a person vs. an automated process (yes, you can figure it out, but it’s more friction in the process). As a result, it can be hard to know if the ID your focal taxonomist put on a plant on iNaturalist actually matches the name that person would use in taxonomic publications or apply to herbarium specimens.

I think people on iNaturalist can get too focused on getting the right answer. That’s certainly a goal, and in many contexts an entirely reasonable one, but it isn’t the goal. Research is about comparing hypotheses. The set of plants that a particular taxonomist calls by a particular name is a hypothesis.

(I think this could be a useful direction for the whole citizen science / public outreach shindig, as well. Science as a process is all about finding and exploring uncertainty. Science in public perception is more about certainty and authoritative answers. But that’s a whole different rabbit hole.)

6 Likes

13 posts were split to a new topic: Forum Moderation

The discussion of names in different languages connects with this discussion. In some highly biodiverse areas where traditional cultures still survive, there are species which have common names in the language of the local culture, and a body of natural history knowledge, but no scientific names. Suppose someone wanted to add those names in the lexicon of that language? It could be argued that iNat’s practice of only adding taxa that have been scientifically described is excluding indigenous knowledge.

1 Like

Perhaps list those common names in the Wikipedia About for a higher taxon. Ready to be added when the species are formally described.

Check the PlantNet dataset in GBIF and compare it with iNat. I agree with you on how iNat handles uploaders’ mistakes, but I also know that GBIF is far from being as clear as we would like it to be.

GBIF can do it because it’s a data aggregator. I think iNat can’t afford it because it would create a mess that would need more curators to keep working on it, linking all the synonyms, etc.
PlantNet is as if all cultivated plants on iNat went to GBIF with no marking and often ridiculously wrong IDs, sometimes with no means for anyone on GBIF to even check what was uploaded.

I’m confused. What has this to do with improving iNaturalist’s nomenclature and taxonomy?

You might want to quote the post(s) you are referring to here, since a bunch of the moderation-related posts just got moved to a separate topic at about the same time that you posted. Or are those the posts you were referring to?

Yes. Not all had been moved when I started writing; I actually began it as a reply to a specific user, and was confused when the username disappeared when I posted it.

Ok, no worries and sorry for the confusion.
