Change how id's work when there are disagreements

try to think of it from the perspective of the person who explicitly chooses not to withdraw their ID. sure, a lot of folks just put in a bad ID and never revisit it. but how do you systematically separate that kind of case from a person who legitimately believes in their ID and purposely doesn’t withdraw because they believe it’s correct? because you can’t separate those cases without asking the identifier what they intended, isn’t it better to acknowledge the disagreement and put the observation taxon at a higher level?

if you’re the person who chooses not to withdraw your ID and all of a sudden your ID seems to be effectively erased, how do you feel?

Doesn’t it feel bad also if people chime in with extra id’s that are against the one you put, and so the observation tips over? That’s what happens anyways in this case: your id vs many opposing id’s, and the only way yours will ‘win’, i.e. ‘appear to count for something’ is if there is a majority, either at species level or higher, for your taxon. It’s not as if, in the original case, it will be put to your ID, or even something commensurate with it, if it is outvoted at a higher level. This is the same as that, in that respect, except it actually ‘moves on’ with the observation ID. If you got enough supporting ID’s for your taxa after that, it would tip back. Does it really feel better to be the one holdout when a mob has to come to put the observation into another category - i.e. there will always be a large number of votes against yours when it advances - than to have only a few votes conflicting yours but the leading taxa ID advances with the consensus? I don’t feel it’s any better to have large numbers of people disagreeing with me. This system, if anything, encourages that.
And since none of the ID’s are erased, it’s possible to put something back, and even easier I would say than if five or more identifiers had been tagged to put it ‘forward’ in the first place, and now even more are needed to counteract them if it must go to a taxon commensurate with what the holdout is holding out for.
At any rate, it just doesn’t seem to make logical sense from the point of view of ‘tip’ agreement.

I ask again: can you not weight things so that a holdout has a chance? It’s fine to hold out, per iNat laws, and it’s also fine to get people to outvote someone, apparently, and also fine to be outvoted.

Perhaps if the leading id is not without conflict, it may take more votes before it can be placed as the leading ID, but that should work at all levels, not just the species one: should work at community ID level too, and at higher taxa levels too.

I’m not specifically for the weighting like this, but it must be consistent at all levels. None of them must be more special, or treated differently. The model right now, and the slant of the discussion, doesn’t seem to be focused on this but I think it’s paramount: to get the behaviour you want it shouldn’t mean resorting to something inconsistent and/or ‘illogical’ about the model you are using, aka treating taxon levels differently wrt disagreements / not using the idea that ‘one taxon is inside of another (higher-level) one’.

This is how I’m thinking of it, and that’s all I think I can say on the subject. I don’t see how significantly fewer feelings are hurt - and importantly I don’t see how it’s more mercenary than what I would call ‘iNat’s typical behaviour’, and if you want that to change we should change that, but as it won’t change how taxa nest inside of each other it won’t necessarily change this.

suppose you have these IDs:

  1. New World Sparrow
  2. New World Sparrow
  3. Savannah Sparrow
  4. Red-winged Blackbird

If the resulting ID is New World Sparrow because it’s 3 New World Sparrow (75%) to 1 Red-winged Blackbird (25%), that doesn’t feel bad to me.

If the resulting ID is Savannah Sparrow because the Red-winged Blackbird ID is effectively erased, i feel bad for the Red-winged Blackbird identifier.
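
To make the 75% / 25% arithmetic concrete, here is a minimal Python sketch of how a community ID could be scored. It rests on assumptions that are not spelled out in this thread: a toy taxonomy, a cutoff of more than two thirds with at least two supporting IDs, and ancestor-level IDs treated as neutral. The names `PARENT`, `lineage`, and `community_id` are made up for illustration and are not iNaturalist's code.

```python
# Toy taxonomy (child -> parent); a simplification, not iNaturalist's real tree.
PARENT = {
    "Birds": None,
    "New World Sparrow": "Birds",
    "Savannah Sparrow": "New World Sparrow",
    "Blackbirds": "Birds",
    "Red-winged Blackbird": "Blackbirds",
}

def lineage(taxon):
    """Return the taxon followed by all of its ancestors."""
    out = []
    while taxon is not None:
        out.append(taxon)
        taxon = PARENT[taxon]
    return out

def community_id(ids):
    """Deepest taxon with at least 2 supporting IDs and more than 2/3 support.

    Assumed rule: IDs of a taxon or its descendants support it, IDs on a
    different branch count against it, and ancestor IDs are neutral.
    """
    best = None
    for taxon in PARENT:
        agree = sum(1 for i in ids if taxon in lineage(i))
        against = sum(1 for i in ids
                      if taxon not in lineage(i) and i not in lineage(taxon))
        # Integer comparison avoids float rounding exactly at the 2/3 boundary.
        if agree >= 2 and 3 * agree > 2 * (agree + against):
            if best is None or len(lineage(taxon)) > len(lineage(best)):
                best = taxon
    return best

ids = ["New World Sparrow", "New World Sparrow",
       "Savannah Sparrow", "Red-winged Blackbird"]
print(community_id(ids))  # New World Sparrow (3 of 4 IDs support it)
```

Under these assumptions the order of the IDs does not matter, so the same four IDs entered in any other order give the same result.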

suppose you have these IDs:

  1. Red-winged Blackbird
  2. Savannah Sparrow
  3. New World Sparrow
  4. New World Sparrow

If the resulting ID is New World Sparrow because it’s 3 New World Sparrow (75%) to 1 Red-winged Blackbird (25%), that doesn’t feel bad to me. The fact that identifiers 3 and 4 have chosen to identify to a higher level doesn’t feel like extra / lost effort to me because those identifiers chose to make that effort. they could have perhaps taken an extra effort to identify down to the species level, but they didn’t. they could have ignored the observation altogether, but they didn’t. they could have mentioned a user that they know could identify to the species level, but they didn’t.

If the resulting ID is Savannah Sparrow because the Red-winged Blackbird ID is effectively erased, i feel bad for the Red-winged Blackbird identifier.

you’re trying to solve the problem of getting identifiers to the right observations by complicating the algorithm so that your goal is achieved some of the time, but you’re discounting the side effects. as others have said, there are other ways to more directly address the actual problem. the existing algorithm is fine.

how about bat,bird,bird,bird,sparrow,sparrow?
is there a consensus there?
without the bat ID this would be RG

if not, how about bat,bird,bird,bird,bird,bird,bird,bird,bird,bird,bird,bird,sparrow, sparrow?
is there a consensus there?
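
For anyone who wants to check the arithmetic on these two cases, here is a rough Python sketch. It assumes a more-than-two-thirds cutoff with at least two supporting IDs, and that coarser IDs on the same branch (bird, relative to sparrow) are neutral while IDs on another branch (bat) count against. The taxonomy, the root name `animal`, and the helpers `support` and `sparrow_wins` are all hypothetical.

```python
# Toy taxonomy (child -> parent); "animal" is a made-up root for illustration.
PARENT = {"animal": None, "bat": "animal", "bird": "animal", "sparrow": "bird"}

def lineage(taxon):
    """Return the taxon followed by all of its ancestors."""
    out = []
    while taxon is not None:
        out.append(taxon)
        taxon = PARENT[taxon]
    return out

def support(taxon, ids):
    """Return (agreeing, opposing) ID counts for a taxon.

    Assumed rule: ancestor IDs are neutral, other-branch IDs oppose.
    """
    agree = sum(1 for i in ids if taxon in lineage(i))
    against = sum(1 for i in ids
                  if taxon not in lineage(i) and i not in lineage(taxon))
    return agree, against

def sparrow_wins(birds, with_bat=True):
    """Does 'sparrow' clear the assumed >2/3 cutoff with this many bird IDs?"""
    ids = (["bat"] if with_bat else []) + ["bird"] * birds + ["sparrow"] * 2
    agree, against = support("sparrow", ids)
    return agree >= 2 and 3 * agree > 2 * (agree + against)

for birds in (3, 11, 100):
    # Columns: number of bird IDs, result with the bat ID, result without it.
    print(birds, sparrow_wins(birds), sparrow_wins(birds, with_bat=False))
```

Under these assumptions the number of bird IDs makes no difference: the two sparrow IDs are weighed only against the single bat ID, which is exactly two thirds and therefore not more than two thirds. Without the bat ID, sparrow clears the cutoff.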

…there has to be a point for you (and @pisum / @paul_dennehy / @dianastuder ) where it becomes illogical for the anomalous ID to bear weight in the equation

Pisum, there are two different ids in question. Only one of them has ever behaved like this.

IMO, yes.

This whole thread is being jammed because people are conflating the community and ‘observation’ ID’s.

Check the top of an observation which is id’d to species but not at research grade, then check the side of it for the community ID.

Notice the two will say different things:
The one at the side reflects consensus: Community ID
The one at the top is the finest possible: ‘observation ID’

This isn’t the best discussion but it mentions the difference: https://forum.inaturalist.org/t/community-versus-observation-taxon/4426/3

i don’t understand your last statement. please clarify. what are the two different ids? what is “only one of them”? what is the behavior you’re referencing?

The behaviour desired is for the observation taxon to update as described. However it is implemented - a ‘reasonable’ algorithm (according to me) or not - the CID behaves as people expect.

Best algorithm (for both Id’s) so far: https://forum.inaturalist.org/t/community-taxon-algorithm-tweaks/28583/32?u=tchakamaura

CID will not go to species, in these cases, but the decisions at least make sense at every step.

To no-one’s surprise who’s read my comments in the other thread, I agree with @tchakamaura here… simply put, I think once an ID has been outvoted to the point of becoming maverick it should contribute no further to the algorithm either for the CID or the observation ID. The previous thread as I recall only referred to the CID, whereas (as I understand it) the new point simply extends the logic to the observation ID. I think that is beneficial all round. (It even makes it easier for the person whose ID has been mavericked to recover the situation if they can get support for their ID).

i don’t think that’s what tchakamaura is actually suggesting.

that “funneling” algorithm you’ve referenced erases maverick IDs

suppose:

  1. Red-winged Blackbird
  2. Savannah Sparrow
  3. New World Sparrow
  4. New World Sparrow
  5. Post Oak

based on that algorithm, the community ID would end up as New World Sparrow based on 3 out of 3 New World Sparrows because it would erase Post Oak and then Red-winged Blackbird.

that’s different than the existing algorithm, where the community Id would end up as Birds based on 4 Birds out of 5.
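
The contrast between the two outcomes can be sketched in Python. This is only my reading of the description above, not the algorithm from the linked post: the existing calculation is assumed to use a more-than-two-thirds cutoff with at least two supporting IDs, and "funneling" is modelled as repeatedly dropping IDs that sit on a different branch from the current community ID. The toy taxonomy and the names `community_id` and `funneled_community_id` are hypothetical.

```python
# Toy taxonomy (child -> parent); a simplification, not iNaturalist's real tree.
PARENT = {
    "Life": None,
    "Plants": "Life",
    "Post Oak": "Plants",
    "Birds": "Life",
    "Blackbirds": "Birds",
    "Red-winged Blackbird": "Blackbirds",
    "New World Sparrow": "Birds",
    "Savannah Sparrow": "New World Sparrow",
}

def lineage(taxon):
    """Return the taxon followed by all of its ancestors."""
    out = []
    while taxon is not None:
        out.append(taxon)
        taxon = PARENT[taxon]
    return out

def community_id(ids):
    """Deepest taxon with >= 2 supporting IDs and more than 2/3 support (assumed rule)."""
    best = None
    for taxon in PARENT:
        agree = sum(1 for i in ids if taxon in lineage(i))
        against = sum(1 for i in ids
                      if taxon not in lineage(i) and i not in lineage(taxon))
        if agree >= 2 and 3 * agree > 2 * (agree + against):
            if best is None or len(lineage(taxon)) > len(lineage(best)):
                best = taxon
    return best

def funneled_community_id(ids):
    """Hypothetical funneling: drop off-branch (maverick) IDs, then recompute."""
    ids = list(ids)
    while True:
        cid = community_id(ids)
        # Keep only IDs on the same branch as the current community ID.
        kept = [i for i in ids if cid in lineage(i) or i in lineage(cid)]
        if len(kept) == len(ids):
            return cid
        ids = kept

ids = ["Red-winged Blackbird", "Savannah Sparrow",
       "New World Sparrow", "New World Sparrow", "Post Oak"]
print(community_id(ids))           # Birds (4 of 5)
print(funneled_community_id(ids))  # New World Sparrow (3 of 3 after two drops)
```

In this sketch the first pass drops Post Oak and the second drops Red-winged Blackbird, which is the "erasing" effect being discussed.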

i think you’re actually imagining a different algorithm where the community ID is calculated exactly as it is now, but then if there are descendant IDs of the community ID, you want those IDs to potentially refine the observation ID (as long as any disagreements are maverick).

so using the example above, both the observation ID and the community ID would be Birds. note that there would be no funneling here to force observation ID to Savannah Sparrow.

but if:

  1. Red-winged Blackbird
  2. Savannah Sparrow
  3. New World Sparrow
  4. New World Sparrow
  5. Passerculus

… then i assume your algorithm (based on your descriptions) would make a community ID of New World Sparrow based on 4 New World Sparrow out of 5, with a refined observation ID of Savannah Sparrow.

and if:

  1. Savannah Sparrow
  2. Birds (ancestor disagreement)
  3. New World Sparrow
  4. New World Sparrow
  5. Passerculus

… then I assume you would want community ID to be New World Sparrow based on 4 New World Sparrow out of 5, and then have observation ID refined down to Savannah Sparrow.

is that how you’re proposing your algorithm to work?
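
The three scenarios above can be checked with a small Python sketch of that "refine" idea. Everything here is an assumption made for illustration: the toy taxonomy, the more-than-two-thirds cutoff with at least two supporting IDs, the treatment of an explicit ancestor disagreement as a vote against everything below it, and the refinement rule itself (refine only when all IDs under the community ID sit on one lineage). The names `community_id`, `observation_id`, and `plain` are hypothetical.

```python
# Toy taxonomy (child -> parent); a simplification, not iNaturalist's real tree.
PARENT = {
    "Life": None, "Plants": "Life", "Post Oak": "Plants", "Birds": "Life",
    "Blackbirds": "Birds", "Red-winged Blackbird": "Blackbirds",
    "New World Sparrow": "Birds", "Passerculus": "New World Sparrow",
    "Savannah Sparrow": "Passerculus",
}

def lineage(taxon):
    """Return the taxon followed by all of its ancestors."""
    out = []
    while taxon is not None:
        out.append(taxon)
        taxon = PARENT[taxon]
    return out

def community_id(ids):
    """ids: list of (taxon, explicit_disagreement) pairs; assumed >2/3 rule."""
    best = None
    for taxon in PARENT:
        agree = sum(1 for t, _ in ids if taxon in lineage(t))
        # Against: other-branch IDs, plus ancestor IDs flagged as disagreements.
        against = sum(1 for t, disagrees in ids
                      if taxon not in lineage(t)
                      and (t not in lineage(taxon) or disagrees))
        if agree >= 2 and 3 * agree > 2 * (agree + against):
            if best is None or len(lineage(taxon)) > len(lineage(best)):
                best = taxon
    return best

def observation_id(ids):
    """Hypothetical refinement: community ID first, then the finest uncontested ID under it."""
    cid = community_id(ids)
    if cid is None:
        return None
    under = [t for t, _ in ids if cid in lineage(t)]
    finest = max(under, key=lambda t: len(lineage(t)))
    # Refine only if every ID under the community ID lies on one lineage.
    if all(t in lineage(finest) for t in under):
        return finest
    return cid

def plain(*taxa):
    """Build IDs without explicit disagreement."""
    return [(t, False) for t in taxa]

a = plain("Red-winged Blackbird", "Savannah Sparrow",
          "New World Sparrow", "New World Sparrow", "Post Oak")
b = plain("Red-winged Blackbird", "Savannah Sparrow",
          "New World Sparrow", "New World Sparrow", "Passerculus")
c = [("Savannah Sparrow", False), ("Birds", True), ("New World Sparrow", False),
     ("New World Sparrow", False), ("Passerculus", False)]

for ids in (a, b, c):
    print(community_id(ids), "/", observation_id(ids))
```

Under these assumptions the first case stays at Birds for both IDs, while the second and third give a community ID of New World Sparrow with the observation ID refined to Savannah Sparrow.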

What iNat calls a Maverick ID - has been swept aside by the CID.
A Maverick ID has no effect - update - no effect on CID consensus.

It is the Pre-Maverick ID which holds back CID.
Pre-Maverick, followed by 2 agree, needs just one more.

The gnarly ones are a tug of war where the votes rack up on both sides.

The community ID and the observation heading ID

The observation ID responds differently than the community ID to the same combination of individual IDs.

There is no erasure of ‘maverick’ id’s - they are not treated differently than any other id’s, and every id always remains associated with the observation. If someone wants to add a bunch of id’s later to support what is now a maverick identification it will go towards the total count like always, and together they can tip the scale to some other part of the tree of life. No effort is expended to sequester identifications or otherwise tag them. Every new id on the observation triggers a recalculation from top-level down.

Yes, there is a difference w/ CID but it still isn’t going to the finest level. Here it would go to New World Sparrows. Observation ID is Savannah Sparrow, as within New World Sparrows that is the furthest uncontested one.

If there was only one ID of ‘New World Sparrow’ then the CID would be ‘birds’.

CID of New World Sparrow, Observation ID of Savannah Sparrow. If two more people put Passerculus (sp.) that will be the CID.

if your proposed algorithm produces a different community ID than the existing algorithm, then your algorithm effectively erases the maverick ID, since in your proposed algorithm, it would be removed from the final community ID calculation.

this effective erasure is something that i have a problem with because it’s complicating the algorithm, and it has unintended side effects.

for what it’s worth, the algorithm that i thought you were describing is less objectionable to me because the community ID there would be the same as the existing community ID. the only question there is whether the descendant IDs should refine the observation ID. i find that to be a less objectionable debate.

This is not correct.

In the example
bat, bird,bird,bird,sparrow,sparrow
bat is maverick but still weighed against sparrow

Here is an actual example and its algorithm summary:
https://www.inaturalist.org/observations/81586475

as you can see, years on…but the fly ID still remains in play as a disagreement against the genus level sawfly ID by one of the only European sawfly specialists we have…despite 6 people now agreeing this isn’t a fly :roll_eyes:

that’s the point…( of the thread I started before at least )

I feel use of the word erase here is a bit misleading.
In the summary posted above, if the maverick fly ID was discounted at present, it wouldn’t be erased exactly… that sounds absolute and permanent. If sufficient other people added fly IDs it would come back into play and take the ID back to Pterygota (as is the case at present).

ok. what’s a better word or phrase? the effect is still the same. any version of a funneling algorithm, even if there’s only one pass, removes the effect of the maverick ID during the community ID calculation right?

read what i thought tchakamaura was describing, and tell me if you have any objections to that approach vs. the funneling approach.