How do you compute Cumulative IDs from API response?

I must be dense, but I just cannot figure out from the API doc alone how to compute the "Cumulative IDs: # of #" figure from the fields returned by the /v1/observations API. I see identifications_count and num_identification_agreements, but those two numbers alone don’t account for whether the user themselves agreed with the community ID. Is that a simple true/false passed back on the record, and if so, where? Otherwise, what fields should I be looking at to decide whether to count their vote into the total # agreeing and the total # of identifications counted in the Cumulative IDs?

This is my best guess from what I can find in the record:

  • iterate over all the IDs & find each user’s last ID
  • compare that last taxon.id with the community taxon.id
  • is the user’s taxon.id in ident_taxon_ids and equal to or below the community taxon.id in the ancestry? if so, it counts towards the ID, so:
    • add 1 to the identifications_count, the right-hand number
    • add 1 to the num_identification_agreements, the left-hand number
  • otherwise it falls into one of two categories:
    • it did not conflict with the community taxon.id (i.e. it was one of the ident_taxon_ids) but doesn’t count towards the ID (because it was higher in rank than the community ID):
      • nothing is added or subtracted
    • it did conflict with the community taxon.id (i.e. it was not in ident_taxon_ids), so it must be counted as an ID (maverick):
      • add 1 to identifications_count, the right-hand number

I think everything that is needed for that is in the record, but it seems like an awful lot of extra work, so I just wanted to check if I had missed something simpler.
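For what it’s worth, the steps above can be sketched in Python roughly like this. The field names and shapes are my reading of the /v1/observations response, and doing the ancestry comparison via `taxon.ancestor_ids` (rather than ident_taxon_ids) is an assumption on my part, not something confirmed against the docs:

```python
def tally(obs):
    """Rough sketch of the bullet-list algorithm above (not authoritative).

    Assumes obs["taxon"] is the community taxon (with "id" and
    "ancestor_ids") and each identification has "user" and "taxon" dicts.
    """
    community = obs["taxon"]
    # keep only each user's last ID; assumes identifications are returned
    # in chronological order, so the last one seen per user wins
    last_by_user = {}
    for ident in obs["identifications"]:
        last_by_user[ident["user"]["id"]] = ident

    agreeing = 0  # left-hand number
    counted = 0   # right-hand number
    for ident in last_by_user.values():
        taxon = ident["taxon"]
        lineage = taxon.get("ancestor_ids", []) + [taxon["id"]]
        if community["id"] in lineage:
            # equal to or below the community taxon: counts as an agreement
            agreeing += 1
            counted += 1
        elif taxon["id"] in community.get("ancestor_ids", []):
            # above the community taxon in rank: neither number changes
            pass
        else:
            # conflicts with the community taxon (maverick): counted only
            counted += 1
    return agreeing, counted
```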

if you haven’t already clicked the About button in the Community ID section of an observation detail page, you should do that, and when the popup appears showing the Algorithm Summary, make sure to scroll all the way to the bottom to see the whole thing.

that summary divides things into identification count, cumulative count, disagreement count, and ancestor disagreement. i think what shows up in the “Cumulative IDs: X of Y” indicator in the Community ID box is X = “cumulative” and Y = “cumulative” + “disagreements” + “ancestor disagreements”. (i think you’re calling X and Y “num_identification_agreements” and “identifications_count”, respectively.)
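to make that mapping concrete, here it is as arithmetic with made-up counts (the category names come from the Algorithm Summary; the formula is my guess):

```python
# made-up counts for one hypothetical observation
cumulative = 2              # IDs of the community taxon or a descendant
disagreements = 1           # conflicting ("maverick") IDs
ancestor_disagreements = 0  # explicit disagreements at a coarser rank

x = cumulative
y = cumulative + disagreements + ancestor_disagreements
print(f"Cumulative IDs: {x} of {y}")  # Cumulative IDs: 2 of 3
```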

so:

instead of this, i think you should find IDs where the current flag (response.results[i].identifications[j].current) is true.

there is a third category for ancestor disagreements, which get added to the Y figure i noted in my second paragraph above (i think you’re calling it the “identifications_count”).
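a minimal sketch of that current-flag filter, assuming the response shape from /v1/observations (each result has an identifications list, each entry with a boolean current):

```python
def current_identifications(result):
    """Keep only identifications still in effect (the "current" ones)."""
    return [ident for ident in result["identifications"] if ident.get("current")]


# usage with a minimal made-up record:
result = {"identifications": [
    {"id": 1, "current": True},
    {"id": 2, "current": False},  # superseded by a later ID from the same user
]}
print([i["id"] for i in current_identifications(result)])  # [1]
```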


The numbers the API gives me are num_identification_agreements and identifications_count. Those are fields returned by the API, so it gets me tantalizingly close to being able to display the #/#, but it just doesn’t account for the user’s own ID. Is it clearer now what my problem is?

Btw, I now have working code based on my algorithm above, which is only a partial rendering of what I imagine I’ll find at the reference you gave because the API didn’t give me everything I needed to do the whole job. Thanks for that. I’ll definitely cross-check it against my solution to make sure it lines up.

This is messy and a bit in need of refactoring, but you can see what I did here:

https://github.com/synrg/quaggagriff/blob/240e66b17df84b2dcd01314e5cfd1504246a151f/inatcog/obs.py#L59-L86

ok… i see what you’re saying. i think what you’re describing is the best that can be done based on API results, as far as i can tell. just remember to account for the couple of things i mentioned above, i think. i don’t really know python(?), but your code looks like what you verbally described before.


Good. Yeah, I checked, and the disagreements & ancestor disagreements are lumped together into one case by my code. Every test I’ve given it so far passes. I’ll either have to mock up a disagreement by the original user by hand, or else find a real one (preferable, as it’s less error-prone, and then I can save and simplify it as my mock) to test that last case.


p.s. re. “don’t really know python” – I’m still learning, but somehow managed to write 1500 lines of it in the past few weeks. My prior exposure to it was a single small personal project in the late ’90s, and wow, has it ever evolved since then! I have to say, it’s quite a lovely break from ruby + javascript at work.


I was wrong about those fields I chose. I have found that identifications_count, num_identification_agreements & num_identification_disagreements can all be 0 even when the observation is RG with 2/2 supporting the taxon. To solve this, I need to implement the full algorithm & tally up everything in identifications myself. I’ll use current for that, following your tip.
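In case it helps anyone who lands here, the full tally ends up looking roughly like this. Field names are my reading of the API response, and in particular treating an identification’s disagreement flag as the marker for ancestor disagreements is an assumption:

```python
def cumulative_ids(obs):
    """Sketch of the full "Cumulative IDs: X of Y" tally (not authoritative).

    Assumes obs["taxon"] is the community taxon, with "id" and "ancestor_ids".
    """
    community = obs["taxon"]
    x = 0  # agreements (left-hand number)
    y = 0  # identifications counted (right-hand number)
    for ident in obs["identifications"]:
        if not ident.get("current"):
            continue  # only each user's latest ID counts
        taxon = ident["taxon"]
        lineage = taxon.get("ancestor_ids", []) + [taxon["id"]]
        if community["id"] in lineage:
            # community taxon or a descendant: "cumulative"
            x += 1
            y += 1
        elif taxon["id"] in community.get("ancestor_ids", []):
            # coarser rank: counts toward the total only if it was an
            # explicit (ancestor) disagreement
            if ident.get("disagreement"):
                y += 1
        else:
            # conflicts with the community taxon: "disagreement" (maverick)
            y += 1
    return x, y
```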
