Eliminate the observation-level "Can the Community Taxon be Improved?" DQA item in favor of an ID-level "Cannot be Improved" flag

pisum · October 5, 2021, 5:37am

Current State

Currently for identifications, there’s the concept of a branch disagreement. This can occur when you identify an observation as a taxon that is an ancestor of the observation’s current taxon and then click a box to indicate that there’s not enough evidence to identify at the lower observation taxon. Your branch disagreement identification would effectively disagree with every descendant of your identification’s taxon.

For example, suppose you came across an observation with just one identification for species Rudbeckia amplexicaulis, and you made a branch disagreement identification for genus Rudbeckia. Then the observation’s taxon would change from species to genus.
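As a rough illustration of the Rudbeckia example, here is a toy sketch in Python. It is not iNaturalist's actual algorithm: the `LINEAGE` table, the `Identification` class, and the simplified "more than 2/3 support" rule are all assumptions made up for this example, and the function expects at least one identification.

```python
from dataclasses import dataclass
from fractions import Fraction

# Hypothetical toy taxonomy: each taxon maps to its lineage, root first.
LINEAGE = {
    "Asteraceae": ["Asteraceae"],
    "Rudbeckia": ["Asteraceae", "Rudbeckia"],
    "Rudbeckia amplexicaulis": [
        "Asteraceae", "Rudbeckia", "Rudbeckia amplexicaulis",
    ],
}


@dataclass
class Identification:
    taxon: str
    branch_disagreement: bool = False  # the "not enough evidence" box


def support(taxon, ids):
    """Share of relevant IDs that are compatible with `taxon` (simplified)."""
    # An ID agrees with `taxon` if `taxon` is the ID's taxon or an ancestor of it.
    agree = sum(1 for i in ids if taxon in LINEAGE[i.taxon])
    # A branch disagreement at an ancestor counts against every descendant.
    disagree = sum(
        1 for i in ids
        if i.branch_disagreement
        and i.taxon != taxon
        and i.taxon in LINEAGE[taxon]
    )
    return Fraction(agree, agree + disagree)


def observation_taxon(ids):
    """Lowest taxon with more than 2/3 support (assumed rule, no tie handling)."""
    if len(ids) == 1:
        return ids[0].taxon
    candidates = {t for i in ids for t in LINEAGE[i.taxon]}
    passing = [t for t in candidates if support(t, ids) > Fraction(2, 3)]
    return max(passing, key=lambda t: len(LINEAGE[t]))


species_id = Identification("Rudbeckia amplexicaulis")
genus_disagreement = Identification("Rudbeckia", branch_disagreement=True)
print(observation_taxon([species_id]))
print(observation_taxon([species_id, genus_disagreement]))
```

In this sketch, the species ID alone leaves the observation at species, and adding the genus-level branch disagreement drops the species' support to 1/2, so the observation falls back to the genus.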

There has been occasional intense discussion about how branch disagreements are / should be initiated (ex. https://forum.inaturalist.org/t/change-wording-used-by-the-system-when-downgrading-an-observation-to-an-higher-level-taxa/3862). Part of the discussion led to talk of changes to clarify branch disagreements and to provide a new type of “leading disagreement”. But more than 2 years since that talk, no changes have been deployed to the masses, and I assume that means that there were issues discovered when alpha testing this functionality that may have halted the progress of that effort.

…

There’s also the “Can the Community Taxon be Improved?” DQA item in the Data Quality Assessment section of an observation:

I think the “Can be Improved?” item is intended to provide a mechanism to prevent hard-to-identify-to-species observations from lingering indefinitely in the Needs ID pool. For example, clicking “No, it’s as good as it can be” on the earlier Rudbeckia observation example would go ahead and make the observation Research Grade at the genus level. Alternatively, it could take an observation identified to family and put it in the Casual pool if it’s unlikely that anyone will be able to identify it to a lower level.

Unfortunately, because the “Can be Improved?” DQA item operates at the observation level and works on a yes / no voting model, it can sometimes lead to unexpected / undesired behavior:

Proposed Change

I think the way to improve the situation for both branch disagreements and “Can be improved?” is to eliminate the observation-level “Can be Improved?” DQA item in favor of an identification-level “Cannot be Improved” flag.

When a user goes to make an identification, there could be an extra little checkbox (or something like that) to the right of the taxon selection that the identifier could check if they want to indicate that they don’t think it’s possible (for anyone) to identify to a lower level:

If the flag is checked, the identification could display an extra little “Cannot be Improved” indicator. (If it triggers a branch disagreement at that point, then the disagreement would also be indicated separately, as it currently is.)

This way, identifiers would be able to record that they don’t think the observation can be identified to a lower level without having to wait for a lower-level taxon to explicitly disagree with. (Checking the flag would provide an implicit disagreement with descendant taxa, regardless of whether the other IDs were recorded before or after.)

Then you could also piggyback on the main identification voting system to determine whether a higher-than-species community ID can be improved (should remain Needs ID or not). If >2/3 of identifiers believe that the identification cannot be improved beyond a particular higher-than-species taxon and rank, then that will trigger genus-level observations to go to Research Grade and higher-level observations to go to Casual. Otherwise, the observation remains at Needs ID.

(I think this simplifies things conceptually, since you don’t have to consider a separate yes / no vote on the DQA item, but some might see requiring >1 person to trigger the cannot-be-improved action as a bad thing.)
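The >2/3 rule described above could be sketched roughly as follows. This is only a minimal illustration of the proposal, not existing iNaturalist behavior: the function name, the `RANK_LEVEL` table, the grade strings, and the `min_ids=2` default (standing in for "requires more than one person") are assumptions for the example.

```python
from fractions import Fraction

# Illustrative rank ordering; lower numbers are finer ranks (assumption).
RANK_LEVEL = {"species": 10, "genus": 20, "family": 30, "order": 40}
THRESHOLD = Fraction(2, 3)


def proposed_quality_grade(community_rank, cannot_improve_flags, min_ids=2):
    """Grade an observation from its community taxon rank and the per-ID flags.

    `cannot_improve_flags` holds one boolean per identifier: True if that
    identifier checked the hypothetical "Cannot be Improved" flag.
    """
    n = len(cannot_improve_flags)
    if n < min_ids:
        # Assumed: a single identifier cannot trigger anything alone.
        return "needs_id"
    if RANK_LEVEL[community_rank] <= RANK_LEVEL["species"]:
        # Species-level community IDs go to Research Grade as they do today.
        return "research"
    share = Fraction(sum(cannot_improve_flags), n)
    if share <= THRESHOLD:
        # Not more than 2/3 say "cannot be improved": keep it in Needs ID.
        return "needs_id"
    # More than 2/3 agree the ID cannot go lower.
    return "research" if community_rank == "genus" else "casual"


print(proposed_quality_grade("genus", [True, True]))
print(proposed_quality_grade("genus", [True, True, False]))
print(proposed_quality_grade("family", [True, True, True]))
```

Note that with three identifiers, two flags give exactly 2/3, which does not exceed the threshold, so that observation stays in Needs ID under this reading of ">2/3".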

Data Conversion

Besides the obvious required data model changes to support this proposed change, there would be a big question of how to handle data conversion. I think at the very least, you would treat existing identifications that are branch disagreements as having the new ID-level “Cannot be Improved” flag. Then, if it makes sense (I’m not sure it does), you might consider also giving the “Cannot be Improved” flag to the IDs of folks who have indicated “Can be Improved?” = No. Also, if it makes sense (I’m not sure it does), you might consider giving the flag to any subsequent identifications at the same level as a branch disagreement identification, too.

Since there’s not a perfect way to accomplish an exact data conversion, you would end up with many higher-than-species-level observations going back to the Needs ID queue.
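To make the conversion options above concrete, here is a hedged sketch of how the one definite rule and the two "if it makes sense" rules might be applied to one observation's identifications. The `LegacyId` record, the `convert` function, and its parameter names are hypothetical; the real data model is not described in this thread. "Same level" is read here as "same taxon as an earlier branch disagreement", which is also an assumption.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LegacyId:
    user: str
    taxon: str
    branch_disagreement: bool = False
    cannot_improve: bool = False  # the new, hypothetical ID-level flag


def convert(ids, no_voters=frozenset(),
            include_no_voters=False, include_followers=False):
    """Return copies of `ids` (assumed chronological) with the new flag set."""
    converted = []
    disagreed_taxa = set()  # taxa of earlier branch disagreements
    for ident in ids:
        # Rule 1 (always): branch disagreements get the flag.
        flag = ident.branch_disagreement
        # Rule 2 (optional): users who voted "Can be Improved?" = No.
        if include_no_voters and ident.user in no_voters:
            flag = True
        # Rule 3 (optional): later IDs at the same taxon as a disagreement.
        if include_followers and ident.taxon in disagreed_taxa:
            flag = True
        if ident.branch_disagreement:
            disagreed_taxa.add(ident.taxon)
        converted.append(replace(ident, cannot_improve=flag))
    return converted


history = [
    LegacyId("a", "Rudbeckia amplexicaulis"),
    LegacyId("b", "Rudbeckia", branch_disagreement=True),
    LegacyId("c", "Rudbeckia"),
]
print([i.cannot_improve for i in convert(history)])
print([i.cannot_improve for i in convert(history, include_followers=True)])
```

With only the first rule, user c's plain genus ID stays unflagged, so an observation like this would not reach the >2/3 bar and would return to Needs ID, which is the effect described in the paragraph above.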

Other Notes

What I’m describing here does not address the previously proposed “leading disagreements” functionality explicitly, but it might lay the groundwork for alternative (more elegant) ways to implement leading disagreements.

cthawley · October 5, 2021, 1:08pm

I think that this is an interesting idea. My two immediate reactions are:

Making this new “Cannot be Improved” flag very visible might be undesirable. The current DQA settings are not hidden per se, but tucked out of the way. In some ways, this isn’t desirable, but for the “cannot be improved” I generally think it’s a good thing: you shouldn’t be using this unless you know what you’re doing. Making it visible for every ID might lead to a lot of undesired behavior (ie, a lot more people choosing it inappropriately).
It’s also a pretty big change to essentially require 2 separate “votes” (selections of “Cannot Be Improved”) for the effects (ie moving to casual, going to RG at genus) to be “approved” by the system. Or at least that seems to be how the proposed system would work in practice. In my experience, there aren’t many users who can make or are comfortable making accurate determinations for “Cannot be Improved” in the current DQA system - you really need to know that it isn’t possible to ID further, which is a much higher bar to clear than making a positive ID. Getting two accurate votes in a lot of cases might be very difficult.

Those caveats aside, I think that conceptually this approach makes a lot of sense, and I would prefer it for my own personal use over the current system. I would just worry that, on the whole, there might be issues with its use that would lead to it overall not being a net positive.

trscavo · October 5, 2021, 1:47pm

With two checkboxes, the “Can the Community Taxon still be confirmed or improved?” item is the most complicated DQA item. The current proposal does not completely address the functionality of this DQA item. It only addresses the “no” checkbox while ignoring the “yes” checkbox.

If an identifier clicks the “yes” checkbox on a Research Grade observation, the observation reverts to Casual. This is desirable if the observation reached RG prematurely, which often happens when the observer “blindly” agrees with a leading ID. In this case, an identifier can click “yes” to signal that additional confirmation is needed.

(As a side note, this is a suboptimal use of this DQA item. Instead there should be a way for an identifier to fire-and-forget a signal that additional confirmation is needed.)

So the basic problem is that the “Can the Community Taxon still be confirmed or improved?” DQA item is too complicated. It needs to be broken down into simpler items that preserve all possible use cases, intended or otherwise.

pisum · October 5, 2021, 1:56pm

i guess i didn’t explicitly say it, but in my proposed change, if a user makes an identification without selecting the “Cannot be Improved” (No) option, then it will be considered an implicit “Can be Improved” (Yes).

yes. I’ve been thinking about it, and maybe for input purposes, you would get just a hamburger menu kind of button that gives you just one selection (for now). having to open the menu and then select the option (2 steps) might fix the problem of someone just accidentally checking a flag (1 step). on the display end of it though, i do think it has to be fairly visible so that it’s clear to people that this identification is a little different than a typical identification. for now, i’m not sure exactly what the best way to display this is… i’ll have to think on it.

i didn’t realize this was possible. this sort of seems undemocratic though, and i would be happy to see this eliminated, i think. are there specific cases where this functionality was actually used for a positive effect?

cthawley · October 5, 2021, 4:34pm

This behavior of the DQA is pretty annoying in my opinion, as plenty of otherwise RG stuff gets stuck there because someone ticked the DQA and then didn’t follow up with the observation and remove it once the observation was IDed correctly.

However, as I understand it, ticking the “yes” on this DQA is not supposed to be used in the way described (to reinforce a disagreement - this isn’t the type of “improvement” referenced). If there’s a disagreeing ID, it should be given by the IDer and made explicit, not done via the DQA. That is sort of “voting twice” (and also confusing to most users). So I would suggest that people not use it this way.

Also, when I’ve seen this occur (ticking the “Yes” box), it doesn’t move an RG observation to “Casual” but to “needs ID” - it basically implies that if it is at genus level, it can at least be refined to species, or if currently at species, can be refined to subspecies.

arboretum_amy · October 5, 2021, 5:30pm

As others have said, that line of the DQA is one of the most complicated. My two biggest uses of the “no” box are for bumping multi-species observations to casual, and for bumping observations to casual because the observer has opted out of community ID and then ignored several (three or four at least) perfectly good IDs for a long period of time (a year or more). (Most times I leave a comment as well, especially if the observer seems active and might respond to a tag.) So while I think it’s fine to require a 2/3 vote on how bad the photo is, or how cryptic the species is, is that really needed for the two use cases above? I suppose in an ideal world we’d have separate DQA lines for such situations.

mamestraconfigurata · October 6, 2021, 12:26am

I use the DQA box for moths in a complex that cannot be identified without dissection (e.g. the Xestia c-nigrum/dolosa complex). Also, in the examples above, if someone does add a correct ID, they can uncheck the box. All this seems unnecessarily complicated.

pisum · October 6, 2021, 12:51am

yes, i think this is a common enough problem that it does deserve its own flag. has anyone made a request for this yet?

i have mixed feelings about the application of “Can be Improved?” in this way. on the one hand, i personally wouldn’t do this, but on the other hand, i can see how it might be annoying to have such observations be stuck in perpetual Needs ID limbo. but on my third hand, i can see how if someone made one of my observations casual this way – even though i obviously had strong conviction that i was right about my id – that could make me even more unhappy. i would be mad at the person who ticked cannot be improved, and i would be mad at the system for allowing people to vandalize my observation even though i clearly had opted out. why did they pick on my observation when there are so many other observations to pick on?.. so i tend to lean slightly towards this being a misapplication of “Can be Improved?”.

i think really the system needs to handle all these cases one way or another. either they just need to linger in limbo, probably with a way to filter for opt-out disagreements (has anyone made such a feature request?) or they all need to become casual after the community taxon gets to what otherwise would have been RG and disagrees with the opt-out taxon.

i think this is one case where my proposed change could lay the groundwork for some future automation. maybe certain taxa could be flagged as not being able to be improved by default (which could be overridden, of course), and this would automatically nudge these kinds of observations to go to higher-than-species RG faster than they currently do.

arboretum_amy · October 6, 2021, 10:57pm

Almost always the observer has put a high ID and then several people have provided the species, but the observer has ignored them. So I’m not so much calling the observer wrong as truly saying “the community ID is as good as it can be.”

treichard · October 7, 2021, 12:09am

@pisum This feature request and motivation are very similar to an FR from four years ago that gained little interest. That FR came much sooner after the new ID grade system was implemented, but long enough after it for the issues with the too-often-unanswerable “Can the Community Taxon still be confirmed or improved?” question to have become apparent. The issues remain, and it would be useful to have them finally solved.

The earlier FR:
https://groups.google.com/g/inaturalist/c/GLwn9iAyai8/m/m0XWwv0oAAAJ

pisum · October 9, 2021, 7:17pm

there’s not an error in your logic. i just view the opt out as sort of a “leave my observation alone” flag. so i just think it’s not great to go marking someone’s observation as casual if they’ve opted out in this way. the exceptions are things like if there’s missing or obviously wrong information (coordinates, dates), if it’s obviously cultivated, or if the identification is abusive / trolling. it’s just my opinion, but i don’t think a maverick opt-out ID rises to the level of needing to push the observation into the casual pool (unless that’s the rule that the system applies to all such observations).

(it may be worth noting that my proposal would effectively remove this way of pushing observations into the casual pool.)

that’s interesting. your proposal is slightly different from mine in that mine will attach either an implicit “Can be Improved” or an explicit “Cannot be Improved”, whereas your proposal allows an implicit/explicit “No Answer” or an explicit “Yes” or “No”. it’s not totally clear to me – what benefit would the explicit “Yes” offer?

treichard · October 9, 2021, 9:05pm

I think the main use of Yes = Can be Improved would be a leading observer entering a species rank ID, knowing it can be taken to a lower rank, such as subspecies, but they are unwilling/unable to offer a lower-rank ID themselves. This would keep the observation in Needs ID if an agreeing ID follows, since improvement is still expected. It would force either further IDs to reach community agreement at the lower rank (lower rank achieved) or to include a species-rank ID with No = Cannot Be Improved to follow (disagrees there is further ID potential).

And of course an identifier who isn’t sure if their ID realized all the ID potential can just leave the question blank (No Answer), neither suggesting their ID is as low as can be nor that it can definitely go lower.

That’s what I think the main differences may be. I don’t know which way – yours, mine, or another – is best.

pisum · October 18, 2021, 6:09pm

i wouldn’t want observations that would otherwise have reached research grade to ever be held back simply in the hopes that some identifier might come across them in the needs ID pool. so then i still don’t see a good use case for retaining an explicit Yes option, and i’ll stick with my original proposal.

bouteloua · September 18, 2024, 6:51pm

5 posts were split to a new topic: No longer able to preemptively mark community taxon as unable to be improved
