Greetings,
This is formatted as a feature request, because I originally submitted it as one. However, the iNaturalist admins are reluctant to allow new feature requests related to flexibility & interoperability of iNaturalist data in the context of alternative taxonomies. So, I post it here instead.
Knowing that multiple taxonomies within iNaturalist is not a viable option for the foreseeable future, I have attempted to come up with the minimal set of modifications that would make iNaturalist compatible with multiple taxonomies. The most important of these would be allowing observations to be identified as inactive taxa, which could potentially be done without any other modifications to iNaturalist. The remaining components of my proposal are intended to make it easier to correctly understand and work with the iNaturalist identification data, both within the iNaturalist UI and in the context of data exported to some other workspace.
Platform(s): All.
URLs: Any pages where identifications show up.
Description of need:
The flexibility, transparency, and cross-platform interoperability of identifications are limiting factors in some use cases.
Suppose we have two taxonomic viewpoints. I'll use a Yucca example. Taxonomy 1: Some people on iNaturalist, and let's assume for the sake of argument I'm one of them, consider Yucca torreyi and Yucca treculiana to be separate species. Taxonomy 2: Some, and let's say person 2 is one of them, consider Yucca torreyi to be a synonym of Yucca treculiana. If both I and person 2 are IDing plants to the best of our ability, we're going to end up with a lot of unproductive non-disagreements. I call a plant Yucca torreyi, they call it Yucca treculiana, but we don't actually disagree about what the plant is. How do we resolve this? One approach is to decide on the "official" taxonomy and try to get everyone to follow it. This has a few problems, three of which seem most significant to me.
If we pick one side and declare it correct, we're also telling the other side they're wrong. And we're asking them to make IDs that they believe to be incorrect. This tends to be discouraging and alienating. Sure, we can tell people to get over it, that an emotional reaction here is silly. Inconveniently, people have emotional reactions whether they're silly or not. Trying to tell them what emotional reactions they should have is, again, discouraging and alienating. Do you want to spend time on a platform that devalues your viewpoint?
Related to the above, but with less focus on emotion: Asking people to use a taxonomy that they think is incorrect is likely to degrade their ability to provide accurate IDs. It imposes an additional cognitive load (you have to do constant "correct taxonomy" to "iNaturalist taxonomy" translations) and you may be asking people to use a taxonomy that they do not understand as well. Also, in practice, many people just aren't going to do it, often without even realizing it. Suppose iNaturalist accepts taxonomy 1. I'm happy with that, but person 2 might just identify everything as Yucca treculiana, leading to unproductive non-disagreements as mentioned above. If person 2 is a casual iNatter just entering the name they believe to be correct in the app, there's a good chance person 2 won't even know that the iNaturalist taxonomy differs from their own. Suppose iNaturalist accepts taxonomy 2. Person 2 is happy, but most of the time I can't enter the identification I believe is correct. Do I ID Yucca torreyi as Yucca, because that's the most precise, correct ID I'm allowed to provide? Do I ID Yucca torreyi as Yucca treculiana, going along with the iNaturalist taxonomy whether I think it's correct or not? Do I just steer clear of providing IDs on any of the things? (As someone who encounters this situation, personally I have done all three in different contexts, but for taxa I particularly care about I usually opt for genus-level ID.) How well will anyone be able to infer my intent from the IDs I provide?
The third big problem I see is coordination between groups in and outside of iNaturalist. For just about any research in which it can be useful to trade less complete per-observation documentation (compared to physical specimens) for the greater number and wider accessibility offered by digital observations, iNaturalist is the best tool out there. OK, I haven't systematically tested every platform out there, but so far as I can tell none of them are even close. I think iNaturalist was intended to be a citizen science tool and has, perhaps somewhat accidentally, ended up being one of the best sciencey science tools out there. But! Suppose you're working on a big ecological monitoring project. You'd like to have as good documentation of field crews' plant IDs as you can, but you're collecting plant observation data at a scale that physical specimens can't remotely cope with. iNaturalist is the obvious solution. Collecting digital observation data without iNaturalist would require you to reinvent the wheel, and your organization has IT constraints that would make it very difficult to ever achieve the kind of accessibility, feedback, and collaboration outside your organization that iNaturalist offers. (This is the situation I'm in for my day job.) How well can you take advantage of all the capabilities that iNaturalist offers without committing your organization to use iNaturalist's taxonomy? When the taxonomies differ you start running into all kinds of problems. It's confusing for the field crews to translate between "our name" and the "iNaturalist name". The basic goal of documenting what plants were called what name by the field crews gets a lot harder when iNaturalist doesn't allow you to enter the name used by the field crews. And when the name changes afterward due to subsequent IDs, the user-friendly ways of searching observations stop working (unless the crews all opt out of community ID!).
Suppose you've got a situation like the yuccas above: if the field crews record both Yucca torreyi and Yucca treculiana on a plot in southern New Mexico, I'm going to want to check those IDs. If iNaturalist only recognizes Yucca treculiana, it's hard to keep track of what's going on. More generally, getting my organization to commit to the iNaturalist taxonomy increases internal resistance to iNaturalist as a solution. It also makes it more important to me / my organization that the iNaturalist taxonomy be "correct", and as the specific details of that taxonomy become more important to more people, the likelihood of zero-sum conflicts over taxonomy goes up.
I know there are potential workarounds for many of these issues. I also know that they're convoluted and cumbersome enough that you start to lose the advantages that make iNaturalist great. Within iNaturalist as it is now, the best solution I've come up with for addressing these issues in a systematic fashion is to shoehorn a parallel taxonomy into a new observation field and ignore the iNaturalist taxonomy entirely. This is not a good solution for many reasons, but it's doable and gets a lot of birds with one stone.
When I've tried to have this discussion in the past, I've mostly gotten two kinds of responses: people telling me that these aren't actually problems, I'm just doing taxonomy or iNaturalist (or both) wrong; people thinking I am proposing something that would be very confusing to iNaturalist users and very difficult or just not feasible to implement technically. With regard to the first category of response: Please, don't. Just accept that other people have different experiences and use cases than you. With regard to the second category, I think I can identify a pretty small and minor set of changes to iNaturalist that should not adversely affect the user experience for most iNaturalist users or involve anything radical on the technical side.
Previous responses are also why I'm going into a level of detail here that probably seems wildly excessive. I've been led to believe that the baseline level of skepticism is very high when it comes to believing that there is any real concern here that could be taken seriously, and that there is a way to make progress that wouldn't be wildly destabilizing to iNaturalist as a whole.
Feature request details:
There are two parts:
1. At present, when there is a change in the iNaturalist taxonomy, this is handled by creating new identification records. Instead of creating new identification records, have a change in the taxonomy be reflected by a change in the content of a new field or fields for existing identification records.
2. Allow any name, including "inactive" names, to be entered as an identification. Use the new field(s) from "1" to store the corresponding accepted name in the current iNaturalist taxonomy.
For both parts, have user settings where the default is going to be simpler and close to the current iNaturalist experience. For "1", a default like "just show me iNaturalist's current accepted name, and only use that name when searching by taxon", with the alternative being "show me both the original name and iNaturalist's current accepted name, and give me the option to search by either one". For "2", a default like "only let me enter iNaturalist's current accepted name as an ID" with the alternative being "allow me to enter any name in iNaturalist as an ID". Under the default settings, everything looks the same except that the greyed out original IDs wouldn't be there. We could go from this:
To this:
Or maybe even drop the little taxon swap flag (I'm not sure how useful it is for most users) and get this:
I think the current greyed out original ID / separate new taxon swap ID system is a compromise that is not great for anyone. I'm guessing it's confusing and unnecessary for most users who aren't interested in the taxonomic details without really meeting the needs of those who are. All I can say for certain, though, is that as a user who wants the details, I find it counter-intuitive and unhelpful.
I'm also thinking that, viewed under default settings, if I entered an inactive name that iNaturalist maps onto Tomostima cuneifolia, my ID would show up just like the second or third images above.
For users selecting the "more detail" option for "2", I'm imagining the UI would look something like this:
In terms of what the underlying change in the data would look like, let's take the two identifications in the first image above and make a table with a minimal set of fields needed to convey the situation. I end up with the table below, where "id" is an ID number for each identification record, "taxon_id" is an ID number for each taxon, "is_change" is "false" for an identification entered by the user and "true" for an identification created automatically from a taxon change, and "taxon_active" is "true" when a taxon is active and "false" when a taxon is inactive:
If we reformat the same data as suggested in "1", we create a new field, "users_taxon_id", to store the ID number for the taxon entered by the user, we drop the "is_change" field, and we change the name of "taxon_active" to "users_taxon_active" to indicate that it describes the thing in "users_taxon_id", not the thing in "taxon_id":
The current code to implement taxon swaps would basically need to be changed to update taxon_id to the "new" taxon_id, rather than creating a new identification record.
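A minimal sketch of that in-place update, in Python purely for illustration. The field names follow the ones described above; the `Identification` class, the `apply_taxon_swap` function, and the taxon ID numbers are hypothetical placeholders, not iNaturalist's actual code or real IDs.

```python
from dataclasses import dataclass


@dataclass
class Identification:
    id: int                   # ID number of the identification record
    taxon_id: int             # taxon under the current iNaturalist taxonomy
    users_taxon_id: int       # taxon the user actually entered
    users_taxon_active: bool  # is the user's taxon currently active?


def apply_taxon_swap(identifications, old_taxon_id, new_taxon_id):
    """Point existing records at the new taxon instead of creating new records.

    Returns the number of records that were updated.
    """
    changed = 0
    for ident in identifications:
        if ident.taxon_id == old_taxon_id:
            ident.taxon_id = new_taxon_id
            # The name the user entered is kept, but it is now inactive.
            if ident.users_taxon_id == old_taxon_id:
                ident.users_taxon_active = False
            changed += 1
    return changed


# Placeholder ID numbers, not real iNaturalist taxon IDs.
DRABA_CUNEIFOLIA = 101
TOMOSTIMA_CUNEIFOLIA = 202

idents = [
    Identification(id=1, taxon_id=DRABA_CUNEIFOLIA,
                   users_taxon_id=DRABA_CUNEIFOLIA, users_taxon_active=True),
]
updated = apply_taxon_swap(idents, DRABA_CUNEIFOLIA, TOMOSTIMA_CUNEIFOLIA)
```

After the swap there is still a single identification record: its taxon_id points at Tomostima cuneifolia, while users_taxon_id still records that the user called the plant Draba cuneifolia.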
Let's suppose I want to create an identification as Draba cuneifolia after that taxon swap was implemented and Draba cuneifolia became inactive on iNaturalist. At present, when we type text in the taxon box, the system searches for that text in the set of active taxa. Instead, search all taxa, put the matching taxon_id in "users_taxon_id", put the taxon_active value in "users_taxon_active", and if "users_taxon_active" is "false" do a quick lookup to find the current accepted taxon_id and put that in "taxon_id". We end up with:
If I want to create an identification under the current active name, Tomostima cuneifolia, users_taxon_active = TRUE should just mean the value from users_taxon_id is copied to taxon_id.
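The entry logic for both cases could be sketched roughly as follows, again in Python for illustration only. The `TAXA` table, the `accepted_taxon_id` mapping, the function names, and the ID numbers are all assumptions made for this example; chains of several successive swaps are not handled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Taxon:
    taxon_id: int
    name: str
    taxon_active: bool
    accepted_taxon_id: Optional[int] = None  # only set for inactive taxa


# Hypothetical lookup table with placeholder ID numbers.
TAXA = {
    101: Taxon(101, "Draba cuneifolia", False, accepted_taxon_id=202),
    202: Taxon(202, "Tomostima cuneifolia", True),
}


def find_taxon(name):
    """Search all taxa, active or inactive, for an exact name match."""
    wanted = name.strip().lower()
    for taxon in TAXA.values():
        if taxon.name.lower() == wanted:
            return taxon
    return None


def create_identification(record_id, name):
    """Build an identification record in the proposed format."""
    taxon = find_taxon(name)
    if taxon is None:
        raise LookupError(f"no taxon named {name!r}")
    if taxon.taxon_active:
        # Active name: the user's taxon is simply copied to taxon_id.
        taxon_id = taxon.taxon_id
    else:
        # Inactive name: look up the currently accepted taxon.
        taxon_id = taxon.accepted_taxon_id
    return {
        "id": record_id,
        "taxon_id": taxon_id,
        "users_taxon_id": taxon.taxon_id,
        "users_taxon_active": taxon.taxon_active,
    }


inactive_entry = create_identification(2, "Draba cuneifolia")
active_entry = create_identification(3, "Tomostima cuneifolia")
```

Both records end up with the same taxon_id, so anything that only looks at taxon_id treats them as agreeing, while users_taxon_id preserves which name each user entered.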
The real data has more fields, of course, but that's the basic idea. Restructure the data to give a straightforward and explicit relationship between what the user called the plant and what the iNaturalist taxonomy is calling the plant, and treat a user entering an inactive taxon after it's inactive exactly the same as we treat a user entering an inactive taxon before it's inactive.
What if we want to use a name that just isn't in iNaturalist? We already have the "taxon_active" field; all we need is a little checkbox or something in the "new taxon" UI so that when we're creating a new taxon we can say it's inactive. Then we need one of those little taxon lookup doohickies from the taxon swap page so that we can tell it what the currently accepted taxon is. The parts exist, they're just not attached to each other in the UI at the moment. One could also simply use the new taxon interface as-is, then create and commit a new taxon swap for it. Functional enough, but inefficient.
The various places where you can search observations by taxon would need a little checkbox or something for "search 'taxon_id'" vs. "search 'users_taxon_id'". Same functionality, just a switch to point at one field vs. the other.
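As a rough illustration of that switch, here is a Python sketch in which the same filter is pointed at one field or the other. The `search_identifications` function, the sample records, and the ID numbers are hypothetical; the example assumes iNaturalist follows taxonomy 2, so Yucca torreyi is inactive and maps onto Yucca treculiana.

```python
SEARCHABLE_FIELDS = ("taxon_id", "users_taxon_id")


def search_identifications(identifications, taxon_id, field="taxon_id"):
    """Return the identifications whose chosen field matches taxon_id.

    The default field mirrors the current behaviour (search by the
    accepted iNaturalist taxon); "users_taxon_id" searches by the name
    the user actually entered.
    """
    if field not in SEARCHABLE_FIELDS:
        raise ValueError(f"field must be one of {SEARCHABLE_FIELDS}")
    return [ident for ident in identifications if ident[field] == taxon_id]


# Placeholder ID numbers, not real iNaturalist taxon IDs.
YUCCA_TRECULIANA = 301
YUCCA_TORREYI = 302

identifications = [
    # I entered Yucca torreyi; iNaturalist maps it onto Yucca treculiana.
    {"id": 1, "taxon_id": YUCCA_TRECULIANA,
     "users_taxon_id": YUCCA_TORREYI, "users_taxon_active": False},
    # Person 2 entered Yucca treculiana directly.
    {"id": 2, "taxon_id": YUCCA_TRECULIANA,
     "users_taxon_id": YUCCA_TRECULIANA, "users_taxon_active": True},
]

by_accepted = search_identifications(identifications, YUCCA_TRECULIANA)
by_entered = search_identifications(
    identifications, YUCCA_TORREYI, field="users_taxon_id")
```

The default search returns both records, while the users_taxon_id search returns only the record I entered as Yucca torreyi.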
How feasible this is to implement on iNaturalist, I don't really know. When it comes to coding, I use R and I have delusions of adequacy. If I were working with tabular data in R and a shiny web app, I know that the changes I'm describing here would not be a big deal, though they would take longer than I expect because everything does. The scale of iNaturalist would create a lot of difficulties I don't know anything about. Luckily, people at iNaturalist, unlike me, know about those difficulties. So I'm hoping the ineptness differential might cancel out the scaling difficulty differential. :-)
Does this solve the problem?:
It gets a lot of the way there and would lay groundwork for further improvements.
Let's return to the Yucca example. If iNaturalist uses taxonomy 1, there's still potential for unproductive non-disagreements. If iNaturalist uses taxonomy 2, both I and person 2 can enter the IDs we believe to be correct. For users using the default settings, our non-disagreements get hidden and don't affect the community ID or any of the other functionality they're interacting with. We can also both search for "the plants I called Yucca treculiana" easily. For users who set their preferences to more taxonomic / identification detail, they can look at our non-disagreements, search for the plants each of us identified as one taxon or the other, and so on. So, big improvement in one context, little change in another context.
With regard to the "you're wrong" issue, personally I would consider this major progress, though not perfect. If iNaturalist adopted taxonomy 2, I might still be irritable about it on occasion. To me, though, data integrity related to accurately capturing my ID and having it be findable in the data is much more important. If I think it's Yucca torreyi and I can ID it as Yucca torreyi, and I and others who may care can easily and reliably tell that I IDed it as Yucca torreyi, that gets rid of something like 90% of my heartburn. Having the community ID also be Yucca torreyi would be nice, but the community ID is not something over which I feel like I should have control, while my ID is. I don't know how idiosyncratic I am in this regard.
With regard to putting people in a position where they're less able to do good ID work: I'm inclined to think this problem would be basically solved in a structural sense. People would be able to use the names they believe to be correct and leave translating to the iNaturalist taxonomy to iNaturalist. Names that just aren't in iNaturalist would continue to be an issue to some extent.
I think coordination would basically be in the same boat, solved in a structural sense but with some ongoing content limitations arising from names that just aren't in iNaturalist. Crews could just enter the name as used in our organization without worrying about whether iNaturalist uses the same name. Searching for "the plant the crew called by this name" would become easy and reliable whether our taxonomy is the same as iNaturalist's or not: just select the "search users_taxon_id" filter. And so on.
There are quite a number of details that would remain to be worked out, but I think these are basically details that aren't handled well now, so immediately solving all of them doesn't seem like a reasonable expectation. I'm ignoring these details for the moment because, believe it or not, this is me trying to be concise.
Also: yes, Yucca treculiana is the correct spelling. :-)