Add New Export Taxon To Solve All (?) of the Problems With Research Grade

Platform(s): all

URLs: n/a

Description of need:

A number of changes have been introduced in the past year or so in an attempt to resolve complexities and issues surrounding infraspecies / subspecies in relation to the data quality assessment, observation taxon behavior, and when and how things become research grade.

My impression is that these changes have been attempts to alleviate (but not necessarily solve) the underlying issues and complexities by means of small and minimally disruptive (from a software engineering standpoint) software changes. However, because these changes “double down” on a system which conflates two separate concerns, they’ve (in my opinion) failed to satisfy every perspective and use case. This failure is evidenced by yet another (and ongoing) discussion to make another such change – see https://www.inaturalist.org/blog/122781-proposed-change-to-subspecies-labels-try-the-demo-and-vote .

In my opinion, most (if not all) perspectives and use cases will continue to be poorly served until we separate the concern of whether something needs further identification from the concern of whether something should be exported to GBIF, a separation that many have requested in various forms over the years. This feature request proposes and details a separation of those two concerns via a new concept called an “Export Taxon”.

Feature request details:

Observation Taxon: The logic to determine what the observation taxon is set to should be the same across all taxonomic ranks. Any previously implemented behavior within this logic which results in a different output when the existing or potential new observation taxon is at or below the rank of species should be removed.

I may be missing a few details here, but if I understand the current logic correctly, the new observation taxon determination logic should be something like the following:

  • Initial value of Observation Taxon, prior to any identifications, is Unknown.
  • If active IDs from the observer are present and the observer has opted out of community ID, then the observation taxon is whatever the observer identified the observation as.
  • If active IDs are present and the observer has not opted out of community ID, then the observation taxon is “led along” by an evaluation of community taxon consensus and leading identifications.

Community Taxon: I’m not aware of any behavior within the community taxon setting logic that changes when we’re dealing with a rank of species or below. To the extent my understanding aligns with reality, I don’t believe any changes to how community taxon is set are needed. It should continue to be set in line with the details provided here: https://help.inaturalist.org/en/support/solutions/articles/151000173076-what-are-the-community-taxon-and-the-observation-taxon- .

Export Taxon: Implement a new entity called the Export Taxon. The export taxon should be set to the lowest ranking taxon which both the observation taxon and the community taxon agree upon, but only once/while that agreement exists at (?) or below the rank of family. The export taxon should take the place of the observation taxon in terms of what gets sent to GBIF, and could be made visible on observations somehow.

JAN 20, 2026 EDIT: I originally intended, but neglected, to clarify that the community taxon should not just agree with the observation taxon at some rank, but the community taxon should itself have more than 2/3rds agreement before it is considered to “agree” with the observation taxon at any rank. I spoke to this oversight here – https://forum.inaturalist.org/t/add-new-export-taxon-to-solve-all-of-the-problems-with-research-grade/74888/5?u=regnierda – but I’m making an edit here too because this detail has a very material impact on when and how things get exported to GBIF.
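
For illustration only, the export taxon rule (including the 2/3rds clarification above) could be sketched in Python roughly like this. Nothing here is actual iNaturalist code: the `RANKS` list, the lineage representation, and the `export_taxon` function with its `agreeing`/`total` counts are all hypothetical, and treating family itself as an eligible rank is an assumption, since the proposal leaves that threshold open.

```python
from fractions import Fraction
from typing import List, Optional, Tuple

# Hypothetical, simplified rank order from highest to lowest.
# Real taxonomies have many more ranks; lineages passed in below
# are assumed to use only ranks from this list.
RANKS = ["kingdom", "phylum", "class", "order",
         "family", "genus", "species", "subspecies"]

# A lineage is a list of (rank, name) pairs from the root down to the taxon.
Lineage = List[Tuple[str, str]]


def export_taxon(observation: Lineage,
                 community: Optional[Lineage],
                 agreeing: int,
                 total: int,
                 highest_rank: str = "family") -> Optional[Tuple[str, str]]:
    """Return the lowest taxon shared by both lineages, or None.

    agreeing/total are assumed counts of identifiers backing the
    community taxon; the rule requires strictly more than 2/3rds.
    """
    if community is None or total == 0:
        return None
    if Fraction(agreeing, total) <= Fraction(2, 3):
        return None

    # Walk both lineages from the root and remember the last shared node.
    shared = None
    for obs_node, com_node in zip(observation, community):
        if obs_node != com_node:
            break
        shared = obs_node

    if shared is None:
        return None
    # Assumption: the agreement must sit at or below highest_rank.
    if RANKS.index(shared[0]) < RANKS.index(highest_rank):
        return None
    return shared
```

Under these assumptions, an observation taxon of Bombus impatiens and a community taxon of genus Bombus backed by 3 of 4 identifiers would yield genus Bombus as the export taxon, while exactly 2 of 3 would yield nothing, because the rule asks for more than 2/3rds.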

Needs ID

The Needs ID status/indicator should be reworked to convey whether the community taxon can be further refined. The factors that should dictate this are as follows:

  • Community Taxon
    • is one present
    • does the taxon lack any lower ranked taxa
    • is there more than 2/3rds consensus
  • DQA criterion: Based on the evidence, can the Community Taxon be improved?
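
As a rough sketch of how the factors above might combine, here is one possible reading in Python. The proposal lists the factors but not exactly how they interact, so the ordering of the checks, the `CommunityTaxonState` type, its field names, and the simple-majority handling of the DQA votes are all assumptions made for illustration, not existing iNaturalist behavior.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Optional


@dataclass
class CommunityTaxonState:
    """Hypothetical snapshot of the inputs to a Needs ID decision."""
    name: Optional[str]            # None when no community taxon exists yet
    has_lower_taxa: bool           # does the taxon have any lower ranked taxa?
    agreeing: int                  # identifiers agreeing with the community taxon
    total: int                     # identifiers counted toward the community taxon
    votes_can_improve: int = 0     # DQA: "yes, it can be improved"
    votes_cannot_improve: int = 0  # DQA: "no, it is as good as it can be"


def needs_id(state: CommunityTaxonState) -> bool:
    """Return True when the community taxon could still be refined."""
    # No community taxon yet: identification is clearly still needed.
    if state.name is None or state.total == 0:
        return True
    # Consensus must be strictly more than 2/3rds.
    if Fraction(state.agreeing, state.total) <= Fraction(2, 3):
        return True
    # A taxon with nothing below it cannot be refined any further.
    if not state.has_lower_taxa:
        return False
    # Assumption: a simple majority of "cannot be improved" votes settles it.
    return state.votes_cannot_improve <= state.votes_can_improve
```

A stricter margin for the DQA votes (such as exceeding by 2, or by a ratio) would only change the final comparison.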

One could argue to also add “has photos or sounds”, “evidence of organism”, and other DQA criteria to the above, if so desired. One could also argue that the number of “can’t be improved” votes should exceed the number of “can be improved” votes by 2 (or exceed a certain ratio) in order to kick something out of Needs ID. I say “one” could because I won’t here argue those points but do think they’d be worth exploring as a later, “Phase Two”, iteration upon this idea.

The Needs ID status/indicator should be separated from the Casual and Research Grade statuses/indicators. Either an observation Needs ID or it doesn’t. An observation being Casual or Research Grade doesn’t mean that it no longer Needs ID.

Research Grade & Casual

The Research Grade status/indicator should be reworked to separate it from the Needs ID status/indicator. The factors that should dictate whether an observation is Research Grade are as follows:

Research Grade Qualification

  • Date specified
  • Location specified
  • Has Photos or Sounds
  • (REMOVE) Has ID supported by two or more
  • Date is accurate
  • Location is accurate
  • Organism is wild
  • Evidence of organism
  • Recent evidence of an organism
  • Evidence related to a single subject
  • Evidence accurately depicts organism or scene
  • (REMOVE) Community Taxon at species level or lower
  • (REMOVE) Community Taxon matches Observation Taxon
  • (REMOVE) Based on the evidence, can the Community Taxon be improved?
  • (NEW) Export taxon exists (could reword to be less abstract if so desired)
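
To make the reworked checklist concrete, here is a minimal Python sketch of the qualification with the (REMOVE) items dropped and the (NEW) item added. The `Observation` type and its field names are hypothetical, and each DQA item is reduced to a single boolean for simplicity, whereas real DQA items are decided by votes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """Hypothetical flattened view of one observation's quality inputs."""
    date_specified: bool
    location_specified: bool
    has_photos_or_sounds: bool
    date_accurate: bool
    location_accurate: bool
    organism_wild: bool
    evidence_of_organism: bool
    recent_evidence: bool
    single_subject: bool
    evidence_accurate: bool
    export_taxon: Optional[str]  # None until an export taxon exists


def is_research_grade(obs: Observation) -> bool:
    """Apply the proposed checklist; Needs ID plays no part here."""
    return all([
        obs.date_specified,
        obs.location_specified,
        obs.has_photos_or_sounds,
        obs.date_accurate,
        obs.location_accurate,
        obs.organism_wild,
        obs.evidence_of_organism,
        obs.recent_evidence,
        obs.single_subject,
        obs.evidence_accurate,
        obs.export_taxon is not None,  # (NEW) Export taxon exists
    ])
```

Because the function never looks at Needs ID, an observation could be Research Grade and still Need ID in this sketch, which is the separation the proposal is after.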

Further Comment

It’d also be worth updating search filters to allow one to easily surface observations based on observation taxon, community taxon, and export taxon. Minimal configuration could be done upfront so that things under this scheme are at least usable, and then additional configuration could be done later to really polish things up.

I understand that this idea involves reaching deep into the engine and making non-trivial changes to how things are working today, and would require work in several of the iNat repos. While this design proposal does involve a good amount of work and does add some overall conceptual complexity, it is ultimately cleaner in that it separates unrelated concerns and simplifies what certain concepts represent and how they behave.

Some more practical benefits of this design/proposal:

  • Reduces contention surrounding and increases value of Community ID Opt-out observations by allowing them to go to GBIF under more circumstances while still keeping the observer in control of their own observations.
  • Data goes to GBIF sooner instead of sitting in limbo for as long as it does under the current system. If this proposal as worded would result in data going to GBIF too soon, I suspect there are ways that could be managed. The research grade qualifications could be tightened up a bit, or additional behavior could be added: for example, logic saying that Needs ID status excludes something from becoming Research Grade until the observation has reached a certain age.
  • Reduces some current incentive to abuse data quality assessment questions, provide guess IDs, etc. for the sake of getting something to research grade.
  • Those that want to ID non-wild organisms can do so more easily (an often and long voiced need), while those that only want to ID wild organisms can still do so just as easily – and the needs of the two no longer conflict as much as they do today.
  • Reduces some of the conflict between the needs and workflows of those who care about / believe in / use subspecies / infraspecies and the needs and workflows of those who don’t.

I like the idea behind your proposal but have a fundamental disagreement with these two among others…
Removing these from the RG qualification will lead to thousands of incorrect GBIF records, which create more “work” when they inevitably get corrected and need updating.
This is coming from someone who works through hard-to-ID taxa such as lichenized fungi and bryophytes and often mass-corrects CV misidentifications by amateur/unaware observers.
With iNat’s current direction there is no way to avoid this, and every day there will be hundreds of misidentified observations that erroneously reach RG because of CV agreement at species level.

Solutions to this would be all things that have previously been requested:

  • Providing better support on Seek for casual users
  • Forcing the CV to back down from recommending certain taxa
  • Making the CV recommend higher level taxa more often
  • Showing the confidence percentage of the CV
  • Giving certain identifiers’ IDs more weight
  • and more…

Why do you view this as a problem of great importance? The current timeline is already pretty fast, on the order of a few months.

What is the distinction between Export Taxon and RG?
Can an observation be at an export taxon but not RG and vice versa?


this doesn’t make sense. your proposal adds a lot of complexity without addressing the original issue(s). if / when they finally (correctly) fix the original issues, then your proposal would not be needed. so then what that tells me is they should just fix the original issues rather than spending time on your workaround.


Wait, what am I missing? How is this different from the community taxon except that it doesn’t exist at higher ranks? Wouldn’t it make more sense to just start exporting the CID instead of the observation taxon?

People already get confused enough about the difference between observation and community taxon, I hate to think what adding a third would do.


I suggested removing these criteria from the Research Grade Qualification because they become redundant with the addition of “Export taxon exists” to the RG Qualification. As proposed, an observation’s export taxon is null unless/until a community taxon exists (which itself requires more than one ID) and it agrees (at least at some rank) with the observation taxon.

One thing I had floating around my head but failed to articulate about the export taxon is that the community taxon should have more than 2/3rds agreement in order to inform the value of the export. This would ensure that observations continue to not go to GBIF until they “Ha[ve] ID supported by two or more”.

I don’t consider the timeliness of data going to GBIF particularly problematic. It just came to mind as a knock-on effect of this proposed design. In most cases, I think things are fine. However, there are lots of times I’ve noticed something sitting at genus or complex for years (with multiple IDs at that same rank) and nobody wants to add (or keep) a species-level ID or mark it as not able to be ID’d further. Again though, as I mentioned in my original post, if we feel like this would “open the floodgates”, a narrower threshold could be defined. The key point here is the export taxon concept and the things it allows us to do.

With this proposal, the export taxon merely records the taxon that we’d send to GBIF if the observation is otherwise Research Grade (has photos, location, is of an organism, etc.). If that’s a little confusing, the two could be aligned a little further and “otherwise meets Research Grade Qualifications” could be added to the evaluation of whether to populate the export taxon.

I get that you’re not buying into the idea, but what I don’t understand fully is why. It does indeed add some complexity (an additional operation whenever an ID is saved, some additional conceptual overhead for users potentially, potential re-indexing), but I’m not sure how it doesn’t address “the original issue(s)”.

In your mind, what are the original issue(s) that my proposal was intended to address? In my mind the core issue is that the current design conflates two distinct questions – “should this go to GBIF” and “does this need further identification” – and the practical side effects that conflation has – such as conflicts surrounding sub/infra-species identification, incentives to not mark something as cultivated, to engage in certain behaviors merely to get something to achieve research grade, etc.

This isn’t a work-around, this is the fix. If the issue is a conflation of two distinct things, the fix to that should be a (clearer) separation of those two distinct things, no? The things we’ve seen and continue to see discussed (making things behave differently once they reach species rank, etc.) are the work-arounds.

I think the idea of just exporting the CID instead is a fair one. However, a separate export taxon entity respects the Community ID Opt-Out choice and continues to provide the observer with a measure of control of how their data gets contributed to GBIF.

I agree that this has the potential to be very confusing. I think a well-thought-out implementation within the UI could go a long way to mitigate the risk of confusion. I purposely refrained from going into detail on that though, as I didn’t want people to get hung up on details that, while important, are secondary to the core proposal here.


if someone’s opted out ID (the observation ID) doesn’t agree with the community ID, it would not be sent to GBIF. if you’re saying that you want such observations to be sent to GBIF, then that’s a whole new thing.

if you want to get things to research grade as fast as possible, you need to simply modify the rule about community taxon being the same as observation taxon, and say that this needs to apply at the species (or species equivalent) level.

then for GBIF export purposes, send over the community ID, as grampianshiker noted.

there are other issues related to how “can be improved?” works, but those are better discussed in other threads.

Yeah, I guess you’re right. I kind of got excited with the idea and ran with some of the additional possibilities of it. I should have reined that in a little bit, haha.

I agree that the extent to which the observation taxon and community taxon should/must agree is a separate concern. However, the core proposal still works whether we say they must continue to agree exactly (in order to inform the export taxon), or whether we say that (as I originally proposed) they must only agree at some rank below family.

Personally, in a world where an export taxon concept loosely like what I propose here exists, I’d prefer the agreement not have to be on the exact taxon that is the observation taxon, but I’d be happy either way – and again concede that it’s a separate issue and could be discussed and dealt with entirely separately.

Another idea is to just allow “research grade” and “needs ID” to have more than two options. Right now, each seems to be just “yes” or “no”, or maybe they are all combined into one field and that is the problem. Each could have three or more options/levels.

Research grade:

  • Casual
  • No
  • Higher Level Research Grade (limited use for certain taxa)
  • Species Research Grade
  • Subspecies/variety Research Grade

Needs ID:

  • Needs ID to Species
  • Needs ID to Subspecies/variety
  • No

I think this suggestion, “Make captive/cultivated not automatically ‘no ID needed’” (in Feature Requests on the iNaturalist Community Forum), holds some weight here, and would help with some underlying issues.
