Community taxon algorithm tweaks

I already logged a version of this as a feature request, but unfortunately it was declined by moderators (before I started this thread).

I will try and return to this thread later to comment in full on your points @pisum.

But one thing I would say for now is that I don’t believe this is a niche issue.
I guess if you weight all observations equally perhaps so, as this is doubtless a very limited problem in birds for example where we have ample expertise. But in inverts trapped at higher levels it’s certainly an issue.

For a visible example of impact, look at Opiliones observations trapped at order in Europe. We have two active experts, but not three, so the issue with the algorithm is quite well displayed in this particular instance. Basically any obs which have 6 IDs are affected here - which is 14 out of 30 observations on page 1 :

Just to do a little more number crunching – on page one of https://www.inaturalist.org/observations/identify?order=asc&verifiable=true&place_id=97391&taxon_id=47367&lrank=order, there are 12, not 14 observations with 6 or more IDs. Of those 12, only 3 have a species-level expert ID (i.e. in most cases the expert identifies above the species level). So 3 out of 30 on page 1, and the search returns a total of 348 observations. For comparison, Europe has 31357 NID/RG Opiliones observations. Assuming the 3/30 ratio holds true across all 348 observations stuck at Order, that’s around 35 observations which have a species-level ID by an expert but are stuck at Order, out of 31357 total Opiliones observations (~.1%). Even using 12/30 as your metric, that’s ~140 out of 31357 Opiliones observations that you think have unfairly weighted IDs (~.4%). Even if all 348 observations stuck at Order have unfairly weighted IDs, that’s 1% of all Opiliones observations in Europe.
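The back-of-envelope numbers can be reproduced with a short Python sketch. The counts are the ones quoted in this post; `estimate` is a hypothetical helper, and the extrapolation assumes the page-one ratio holds across all 348 observations.

```python
STUCK_AT_ORDER = 348     # observations returned by the search
ALL_OPILIONES = 31357    # NID/RG Opiliones observations in Europe


def estimate(affected_on_page, page_size=30):
    """Scale a page-one count to all stuck observations.

    Returns (estimated count, percentage of all Opiliones observations).
    """
    count = STUCK_AT_ORDER * affected_on_page / page_size
    return count, 100 * count / ALL_OPILIONES


for n in (3, 12, 30):
    count, pct = estimate(n)
    print(f"{n}/30 -> ~{count:.0f} observations, ~{pct:.2f}% of all Opiliones")
```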

The expert IDs aren’t usually going to species, but I think the point @sbushes is trying to make is that those experts are providing a more refined ID (e.g., to family or superfamily) and it’s not paying off because it gets hung up at just Opiliones, because of the maverick spider ID. So it’s a bigger fraction than 3/30. I count 16/30 that are affected currently (i.e., have an expert ID below Opiliones but also have a maverick spider ID that is keeping it stuck at Opiliones no matter how many people agree it’s Opiliones).

Is this “niche” compared to the firehose of observations that iNat gets? Of course. But that doesn’t mean the algorithms can’t or shouldn’t be improved.

Here’s some iNat history from 2013: https://github.com/inaturalist/inaturalist/issues/88#issue-16489792

I’m not sure why one would only factor in those affected which have species-level IDs?
But as @tristanmcknight says, this was not my intention. There are 16 trapped at incorrect rank due to this issue…so about 50% at that rank for that taxon, so likely ~174/348 observations impacted here.

The issue with this algorithm doesn’t only affect IDs trapped at order, it’s just a single particular rank and taxon where it’s visible, so I’m not sure why one would take 174 or 348 as a portion of 31357 either?

With an inverse but parallel logic I could assume that the issue with the algorithm has impacted 50% of all 31357 obs… so 15678 obs… just some have been since resolved to RG… and others have had support to a finer rank than the level of order. I do not believe this to be the case in Opiliones. I also do not assume that the distribution of the issue is in any way constant across ranks / taxa / Needs ID / RG. But assuming the problem doesn’t exist at all in other ranks/in those which have since been resolved to RG is clearly not correct either.
e.g. https://www.inaturalist.org/observations/3720578

I am not sure how one can easily define how “relatively” niche or not this issue is due to this lack of constancy. But say it was indeed 1% on average across all obs…that sounds low… but given there are 90,000,000 or so observations on iNat, this would still approach 1,000,000 total impacted. And crucially, we are talking about the time of expertise here - a scarce resource, particularly in complex taxa where this is doubtless more prevalent.


If you were planning this algorithm from scratch, as they do in @muir’s link, would you choose to retain this aspect or to fix it? Because regardless of how niche this is or is not, fundamentally this aspect of the algorithm makes zero sense. From the original conversation on GitHub it seems likely this is just an oversight more than anything else.

Say you are outside at a bioblitz with a group of people and you spot something moving.
Bob says, it’s a spider!
Three people come along and say… no this is a harvestman actually.
Then two harvestmen experts come along and say yes, this is not only a harvestman but it is Mitopus morio. But nevertheless, Bob insists!.. it is definitely a spider!

If you want to create an algorithm to represent this group of people’s identifications, do you believe it to be more logical to :
A. take everyone’s opinion into account and weight all 6 IDs equally
B. take just Bob’s spider ID and the 2 x expert IDs and weight them against each other equally, ignoring the other 3 people also stating that Bob’s ID is incorrect
?

I don’t see a justifiable logic in B.
When you say it “depends on priorities”, what priority or situation would justify choosing option B over option A?

just to clarify, the current algorithm is doing both A and B in your example. it’s doing A at Order, and it’s doing B at Species. the logic is effectively taking a poll at each rank. at Order, it’s 5 Harvestmen vs 1 disagreement. at species, it’s 2 M. morio vs. 1 disagreement (3 abstain).
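To make the per-rank "poll" concrete, here is a minimal Python sketch of the bioblitz example. It is not iNat's actual code: the lineage tuples, `tally`, and `passes` are made-up names for illustration, and it assumes a strict more-than-2/3 cutoff while ignoring explicit ancestor disagreements.

```python
# Hypothetical, simplified lineages (coarse -> fine); not real iNat data structures.
SPIDER = ("Arachnida", "Araneae")
HARVESTMAN = ("Arachnida", "Opiliones")
M_MORIO = ("Arachnida", "Opiliones", "Mitopus morio")

# Bob's spider ID, three harvestman IDs, two expert species IDs.
ids = [SPIDER] + [HARVESTMAN] * 3 + [M_MORIO] * 2


def tally(taxon, ids):
    """Return (agree, disagree, abstain) counts for one candidate taxon."""
    agree = disagree = abstain = 0
    for lineage in ids:
        if lineage[:len(taxon)] == taxon:
            agree += 1      # the ID is this taxon or something inside it
        elif taxon[:len(lineage)] == lineage:
            abstain += 1    # the ID is coarser, so it takes no side here
        else:
            disagree += 1   # the ID sits on a different branch
    return agree, disagree, abstain


def passes(agree, disagree):
    """True if agreement is strictly more than 2/3 of the counted votes."""
    return 3 * agree > 2 * (agree + disagree)


for taxon in (HARVESTMAN, M_MORIO):
    agree, disagree, abstain = tally(taxon, ids)
    print(taxon[-1], agree, disagree, abstain, passes(agree, disagree))
```

Under these assumptions Opiliones gets 5 v 1 and passes, while Mitopus morio gets 2 v 1 with 3 abstaining and falls just short, because exactly 2/3 is not strictly more than 2/3.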

your proposed approach isn’t A vs B. you’re really asking for some sort of funneled or adjusted approach. for example, in a funneled approach, you would say that since the 5 Harvestmen win at Order, you will consider votes only among the 5 Harvestmen at lower ranks.

again, i’m not saying that yours is a conceptually bad approach. it’s just a more complex approach which is both less technically efficient and harder to represent to someone who wants to understand how the algorithm arrived at its result. it gets especially tricky when you have many branches / types of disagreement. (in my example, there are a maximum of 4 branches / types of disagreement, and there would have to be some thought into how best to handle that, i think.)

the question at the end of the day is whether the added complexity provides a net benefit?

This whole thread is about the impact at the finer level.
My question was only intended to refer to the finer level, not the coarser one.
At the finer level, the only “poll” being taken is a bizarrely selective one.

Abstain is a strange term to use. This implies there is a decision on the part of the identifier, which there clearly isn’t. The algorithm arbitrarily discounts the opinion of three of the identifiers.

Hell… we could even have 1000 people add IDs at level of order attesting this is a harvestman not a spider, but with the current algorithm, the opinions of all 1000 would be discounted at the finer rank in favour of weighting the single spider ID against the finer species level Harvestmen IDs.

How does that make any sense?
How is this representing any real world situation or sense of democratic decision-making?

If you agree that arbitrarily discounting a portion of the poll makes no logical sense, then it’s a question of how to fairly represent all the IDs in the algorithm at all ranks. I would say that the funneling option you mention (discounting a single outvoted ID ) would make more sense than discounting a random portion of the vote. Though doubtless there are other approaches too which would also make more sense than the existing algorithm.

Externally to iNaturalist, I disagree that this is the more complex option. Weighting all IDs equally as best one can, and adhering to a real world logic as closely as possible is to my mind the simpler, fairer option. But I presume when you talk of complexity you are just referring to implementation within iNaturalist, not in terms of real world logic, is that correct?

In the iNat implementation you illustrated it certainly adds complexity, but I’m not at all convinced this is the only approach possible.

My experience from grappling with these sorts of IDs, is that the vast majority of people don’t look at the “whats this” section or even take the algorithm itself into account. They just intuit what they consider to be sensible (comparing to real world scenarios). So imo the nuts and bolts of the representation within the algorithm is much less important than the intuitiveness of it for users on the page itself.

try to lay out the algorithm in a process flow diagram, and you’ll see what i mean by more complexity. the way it works now is much simpler than a funneled approach or an approach that adjusts the impact of disagreements.

i don’t know how else to explain it. (i’m not going to draw out the diagrams myself because of the complexity, but try it, and you’ll see… and make sure you try to handle the multiple branches / types of disagreements.)

there’s not really a distinction between the “real world logic” and the logic needed to implement an algorithm properly, except that i suppose in the “real world”, you could define an incomplete process upfront and then just arbitrarily fill in missing information or logic on the fly. (if you code an incomplete process though, you’ll just end up with issues when you run into all the unhandled cases.)

i understand what you’re trying to accomplish, and i can see how you’re wanting to handle a very specific circumstance, but i don’t see a clear way of generalizing the process without adding a lot of additional logic.

you liked the funneling approach, and it’s easy to describe if there’s just one maverick vote, but what do you do if it’s not clear which among many votes is an outlier? what if there are 3 votes for Harvestmen, 2 votes for spiders, 1 vote for scorpion, 1 vote for human, and 3 votes for cabbage?

I like @pisum’s more complex example. As I’m understanding the discussion above, the current system would say that it’s 7 v 3 in favour of Animalia at Kingdom (so the cabbages are maverick), then it’s only 6 vs 4 in favour of Arthropods so the CID is Animals.

In the system proposed by @sbushes (appropriately called ‘funneled’ by @pisum) it would say
7 v 3 in favour of Animalia at Kingdom so cabbages are maverick and discounted,
then within Animalia it’s 6 v 1 in favour of Phylum Arthropoda, so ‘Human’ is maverick and discounted.
then within Arthropoda there’s unanimity (6 v 0) all the way to Class Arachnida
but at Order it is split 3 v 2 v 1, so no single option has >2/3s support and the CID is Arachnida.

I do think the latter is much better (just imagine how many expert species IDs you’d need in Harvestmen to overturn that lot!). I can see that it’s a little more complicated, but only because at each level you need to reference previous levels to know if any IDs need to be ignored in the calculation. I can’t comment on the computational aspects, or the effect on the speed of the site etc., but if it’s practicable I’d be in favour.
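Here is a rough Python sketch of the two tallies above, assuming a strict >2/3 cutoff. The lineages and the function names `current_cid` and `funneled_cid` are invented for illustration, and this is only one reading of the funneled idea, not how iNat implements anything (it also ignores explicit disagreements and any minimum number of IDs).

```python
from collections import Counter

ARACHNIDA = ("Animalia", "Arthropoda", "Arachnida")
votes = (
    [ARACHNIDA + ("Opiliones",)] * 3     # harvestmen
    + [ARACHNIDA + ("Araneae",)] * 2     # spiders
    + [ARACHNIDA + ("Scorpiones",)]      # scorpion
    + [("Animalia", "Chordata")]         # human (lineage cut short)
    + [("Plantae",)] * 3                 # cabbage (lineage cut short)
)


def passes(agree, total):
    """True if agree is strictly more than 2/3 of total."""
    return 3 * agree > 2 * total


def current_cid(ids):
    """Deepest taxon that passes when every off-branch ID counts against it."""
    best = ()
    candidates = {l[:d] for l in ids for d in range(1, len(l) + 1)}
    for taxon in candidates:
        agree = sum(l[:len(taxon)] == taxon for l in ids)
        # Off-branch: neither inside the taxon nor one of its ancestors.
        disagree = sum(
            l[:len(taxon)] != taxon and taxon[:len(l)] != l for l in ids
        )
        if passes(agree, agree + disagree) and len(taxon) > len(best):
            best = taxon
    return best


def funneled_cid(ids):
    """Walk down rank by rank, dropping outvoted branches before going deeper."""
    remaining, taxon = list(ids), ()
    while True:
        depth = len(taxon) + 1
        counts = Counter(l[:depth] for l in remaining if len(l) >= depth)
        if not counts:
            return taxon
        winner, agree = counts.most_common(1)[0]
        if not passes(agree, sum(counts.values())):
            return taxon
        # Keep the winning branch plus coarser IDs; discard the outvoted ones.
        remaining = [l for l in remaining if len(l) < depth or l[:depth] == winner]
        taxon = winner


print(current_cid(votes)[-1], funneled_cid(votes)[-1])
```

With these ten votes the first function stops at Animalia and the second at Arachnida, matching the tallies worked through by hand.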

The funnelling system is basically just disregarding maverick IDs as disagreements entirely, as @alloyant suggests at the beginning of the thread, right?

If so, in terms of what this looks like for users, why not just swap the “disagreement count” column for a “non-maverick disagreement count” column?

Or for simplicity… retain existing column “Disagreement Count”, but change definition.

At present it is described as :

" ‘disagreements’ - the number of IDs that are completely different (i.e. IDs of taxa that do not contain the taxon being scored)"

so change it to

" ‘disagreements’ - the number of IDs that are completely different (i.e. IDs of taxa that do not contain the taxon being scored) but not maverick ( i.e. not outvoted at a 3 to 1 ratio by other IDs ) "

(or however maverick is defined)


Computationally this doesn’t seem complex to me.
I don’t see how this is adding a lot of additional logic.
We already have a maverick ID state triggered… when that happens, it should just automatically be discounted from the disagreements column.
How would this be problematic?

At most, it would seem to me that behind the scenes you would need to retain the original disagreements count to calculate mavericks (in addition to the new non-maverick disagreement count). But in terms of presentation to the user, it can be the same, just redefined.
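As a rough check on how much extra logic this needs, here is a hedged Python sketch of one possible reading of "discount existing mavericks": compute the community taxon as now, drop IDs that are neither an ancestor nor a descendant of it, and recompute once. `related`, `community_taxon` and `discount_mavericks` are hypothetical names, the lineages are simplified, a strict >2/3 cutoff is assumed, and none of this is taken from iNat's code.

```python
def related(a, b):
    """True if lineage a is an ancestor of, equal to, or a descendant of b."""
    n = min(len(a), len(b))
    return a[:n] == b[:n]


def community_taxon(ids):
    """Deepest taxon with strictly more than 2/3 support among counted IDs."""
    best = ()
    for taxon in {l[:d] for l in ids for d in range(1, len(l) + 1)}:
        agree = sum(l[:len(taxon)] == taxon for l in ids)
        disagree = sum(not related(l, taxon) for l in ids)
        if 3 * agree > 2 * (agree + disagree) and len(taxon) > len(best):
            best = taxon
    return best


def discount_mavericks(ids):
    """Single pass: remove IDs unrelated to the current result, then recompute."""
    cid = community_taxon(ids)
    kept = [l for l in ids if related(l, cid)]
    return community_taxon(kept)


# The bioblitz example: Bob's spider, three harvestman IDs, two species IDs.
SPIDER = ("Arachnida", "Araneae")
HARVESTMAN = ("Arachnida", "Opiliones")
M_MORIO = ("Arachnida", "Opiliones", "Mitopus morio")
ids = [SPIDER] + [HARVESTMAN] * 3 + [M_MORIO] * 2

print(community_taxon(ids)[-1], "->", discount_mavericks(ids)[-1])
```

With the bioblitz IDs this moves the result from Opiliones to Mitopus morio under the stated assumptions. The original disagreement count is still needed to decide which IDs are maverick in the first place, which is why the sketch computes the community taxon twice.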

“mavericks” has a very specific meaning in the system, and in a funneling approach, i don’t think it would be true that you’re just throwing out mavericks, even if you were to redefine your set of identifications / disagreements and recalculate “mavericks” (for just the redefined set) as you go down each rank in each branch.

ok. we’re making progress on defining an algorithm a little better. so the rule is that no single option has >2/3s support. great.

now let’s introduce a few variations on this:

  1. suppose we have genus G which has 3 species X, Y, and Z. Z is grafted directly to G, but X and Y are grafted to section S, which in turn is grafted to G. the votes are X=5, Y=2, and Z=3. what should you end up with as the community ID?
  2. suppose we have genus G which has 3 species X, Y, and Z. these species are each tied directly to G. the votes are X=4, Y=2, Z=1 and G=1, where the vote on G is a branch disagreement (which therefore disagrees with X, Y, and Z). what should you end up with as the community ID?
  3. suppose we have genus G which has 2 species X and Y. these species are tied directly to G. the votes are X=3, Y=1, and G=1, where the vote on G is a branch disagreement. what should you end up with as the community ID?

If a funnelling approach would not throw out mavericks, then we have crossed wires perhaps.

Ignoring whatever “funnelling” refers to for you, what is the problem with just discounting mavericks as I have above?

look at the 3 “variations” i’ve described above, and tell me what you would expect the system to do in each case (and why). also, do you agree with matthewvosper’s (initial) assessment of how the funneling should work?

My understanding based on definition of Maverick from help section as:

Taxon is not a descendant or ancestor of the community taxon.

is as follows :

=======================================================================

Pisum use case 1: Discounting existing mavericks shifts CID from section to species
Screenshot 2022-02-21 at 11.23.18


Pisum use case 2: No change
Screenshot 2022-02-21 at 11.23.29


Pisum use case 3: No change
Screenshot 2022-02-21 at 11.24.08



For comparison …



My use-case 1: Shifts CID from superfamily to genus

Screenshot 2022-02-21 at 11.39.48


My use-case 2 : Shifts CID from order to species

Screenshot 2022-02-21 at 11.25.02

I remain unsure what you are referring to by funnelling. In Matthew’s description there is reference to the recalculation and creation of new mavericks at each stage. Is this what you mean?

e.g.

A kind of maverick rollover.
If this is what you and Matthew are referring to by funnelling, I am not sure why this is necessary?
Anyhow, in the use-cases mentioned, it does not have any impact on outcome as far as I can see:

Screenshot 2022-02-21 at 11.24.23

Screenshot 2022-02-21 at 11.24.34

I see, so this is simpler than I had imagined because there is no recalculation of mavericks at each level - but still most of the benefit is preserved.

Note that in the examples immediately above you have - I think - treated scorpions as if they were not arachnids, so I think the last lines should be:

  1. CURRENT:
    Arachnida 6 v 4
    {Harvestmen 3 v 7, Spiders 2 v 8, Scorpions 1 v 9}

  2. DISCOUNT EXISTING MAVERICKS
    Arachnida 6 v 1
    {Harvestmen 3 v 4, Spiders 2 v 5, Scorpions 1 v 6}

  3. DISCOUNT EXISTING AND ROLLOVER MAVERICKS
    Arachnida 6 v 0
    {Harvestmen 3 v 3, Spiders 2 v 4, Scorpions 1 v 5}

I suppose if you mavericked everything with <1/3, rather than waiting until one option had >2/3 support, you would have:

  4. ALTERNATIVE ROLLOVER
    Arachnida 6 v 0
    {Harvestmen 3 v 3, Spiders 2 v 4, Scorpions 1 v 5 - Maverick} =>
    {Harvestmen 3 v 2, Spiders 2 v 3}

But that’s even more complicated because you need an iterative calculation at each level.

In these scenarios the number of Harvestmen IDs needed to shift the CID to Harvestmen is

  1. +12,
  2. +6
  3. +4
  4. +2
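These four figures can be checked with a few lines of Python, assuming the strict >2/3 cutoff and that every new ID is a Harvestmen ID. `extra_ids_needed` is a made-up helper name; the (agree, disagree) pairs are the Harvestmen tallies from the four scenarios above.

```python
def extra_ids_needed(agree, disagree):
    """Smallest number of extra agreeing IDs that pushes support above 2/3."""
    extra = 0
    while 3 * (agree + extra) <= 2 * (agree + extra + disagree):
        extra += 1
    return extra


scenarios = {
    "current": (3, 7),
    "discount existing mavericks": (3, 4),
    "discount existing and rollover mavericks": (3, 3),
    "alternative rollover": (3, 2),
}
for name, (agree, disagree) in scenarios.items():
    print(f"{name}: +{extra_ids_needed(agree, disagree)}")
```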

@sbushes’ suggestion (now I understand it) makes the biggest difference for the smallest change - it’s a question of where the Law of Diminishing Returns kicks in.

(I can’t work through the other cases at the moment!)

it looks like your reasoning for what the community ID should be for my 3 variations is just “that’s the result that my algorithm produces” – in which case, i sort of question why we’re even going through this exercise…

so let me try to get your “real world” reasoning for a couple of these variations:

  • in variation 1, given species-level votes for 3 species, where none of the species would carry a >2/3 vote vs the other species, why would you want any algorithm to be able to make a species-level (and therefore research-grade) determination for community ID?
  • in variation 3, since your goal is to reach research grade as soon as possible by excluding outlying disagreements, why would you not treat the genus-level branch disagreement as an outlying disagreement, given 4 votes at species level vs 1 disagreement?

ok. so your preferred “discounting existing mavericks” approach is basically a single-pass, high-level adjustment of disagreements and recalculation of the community ID. this is technically more efficient than an iterative funneling approach, and maybe a little more efficient than other types of adjustment approaches. but if the goal was to get things to research grade as fast as possible, i can envision cases where i would wonder why we went with such a crude adjustment? just for example, if the vote was 3 harvestmen, 1 spider, 1 human, and 1 cabbage, the “discounting existing mavericks” approach could take this only to arachnid. but a funneling approach could take it to harvestmen.

you could argue that, well, it gets us closer to the goal with as little technical complexity as possible. but i see that kind of reasoning as sort of a veiled way to justify that the approach would solve the case du jour to your satisfaction. but if the justification is simply that it’s the least amount of work to solve for your specific case, then why should your particular use case be prioritized above any other cases?

just for example, in my area, female red-winged blackbirds are often mistaken for sparrows. so suppose i come across a research-grade observation where the votes are 2 Savannah sparrow and 1 sparrow. i look at it, and say, wait a second, that’s a blackbird, and i vote 1 red-winged blackbird. with the current algorithm, the system would kick the observation out of research grade, which is what i would want to happen. but with a “discounting existing mavericks” approach, my blackbird vote is completely ignored. so what makes my use case any less of a priority than your use case?

in other words, why are we going to the trouble of replacing the existing algorithm if all we’re doing is trading one arbitrary algorithm for a more complex arbitrary algorithm?

Any algorithm could be described as ‘arbitrary’, but I think the post addresses a genuine problem with the current algorithm: it is not only arbitrary (which is fine) but counterintuitive, or arguably not self-consistent, in that it labels certain IDs as ‘Maverick’ but continues to give them the same weight as any other ID. This sometimes results in observations getting stuck at high taxonomic levels because of clearly wrong IDs, or at least IDs that have a considerable weight of expert opinion against them at a much more specific level.

In terms of suggesting alternatives, any stopping off point could be considered arbitrary, but all of the above I would regard as ‘improvements’, and there is a trade-off between the level of improvement and the computational practicalities which I will not pretend to grasp.

if i’m the one correcting for red-winged blackbirds misidentified as sparrows, this may not be an improvement for me, right?

my point isn’t that one algorithm is arbitrary and another isn’t. i agree they are all arbitrary to an extent. my point is that the justification for the proposal here becomes that it simply solves for a particular use case – in which case, how do we weigh the priority some use cases get vs. another use case? (it would seem arbitrary to weigh harvestmen over blackbirds.)