Observation counts for two users combined less than each individually

One thing the Dronefly bot does is produce displays that include counts of each individual’s observations & species, plus a total line at the bottom which, in a single API call, totals up how much they have together. The individual observation counts should add up to the combined total, and the combined species count should be at least as large as any individual’s. That worked reliably until I saw it break when one particular user’s observations were added to the display. I can’t for the life of me figure out why. It’s easily reproducible on the web with the same queries the bot uses.

https://www.inaturalist.org/observations?taxon_id=47113&user_id=masonmaron&verifiable=any

4 observations, 2 species above for the 1st user.

https://www.inaturalist.org/observations?taxon_id=47113&user_id=dendrocygna&verifiable=any

16 observations, 4 species above for the 2nd user, the one that’s going missing in the total …

https://www.inaturalist.org/observations?taxon_id=47113&user_id=masonmaron,dendrocygna&verifiable=any

5 observations, 3 species here? What happened to the other 15 observations by dendrocygna? It seems it has only counted one, and the rest have been thrown into the bit bucket.
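
For anyone who wants to check this outside the bot, here is a minimal sketch of the consistency check, with the counts typed in by hand from the three pages above. The function name `check_combined` is made up for illustration and is not part of Dronefly.

```python
def check_combined(individual, combined):
    """Return a list of problems found in a combined total.

    individual: list of (observations, species) tuples, one per user.
    combined:   (observations, species) tuple from the multi-user query.
    """
    problems = []
    # Each observation belongs to exactly one user, so the counts must add up.
    obs_sum = sum(obs for obs, _ in individual)
    # Users may share species, so the combined count only has a lower bound.
    max_species = max(species for _, species in individual)
    if combined[0] != obs_sum:
        problems.append(f"observations: expected {obs_sum}, got {combined[0]}")
    if combined[1] < max_species:
        problems.append(f"species: expected at least {max_species}, got {combined[1]}")
    return problems


# Counts copied from the three queries above.
problems = check_combined([(4, 2), (16, 4)], (5, 3))
for problem in problems:
    print(problem)
```

With the numbers above, both checks fail: 15 observations are missing and the combined species count is lower than dendrocygna's alone.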

I tried going and saving each of those taxa, and it seemed to have made O. bilamellata show up. I saved each of the taxa from Aeolidia to Nudibranchia and then the observation count went up to 14 but the species didn’t change.

Then I changed the DQA on each of the “missing” observations and it looks like the rest showed up.

Strange. OK, so now I know a procedure to fix it if it happens again, but I’d sure like to know why that works at all. I guess some index was broken that isn’t used when searching for a single user_id, only when there are multiple?

this is probably similar to whatever is/was going on here: https://forum.inaturalist.org/t/the-map-does-not-show-all-observations/7838.

My first thought was that a taxon didn’t get updated properly, and resaving it pushes the update through. Changing the DQA does the same for individual observations, but I’ve never had to try both before. In other similar issues I’ve had, the observation(s) would have shown up or not regardless of how many users there were, so that’s a little strange. I also can’t rule out that just one of the two things fixed it and the results took a while to update.

I don’t really know how the underlying code works but that sounds plausible to me.

This has happened again with this pair:

https://www.inaturalist.org/observations?verifiable=any&taxon_id=47178&user_id=cinnamon325,zealouswizard

Together, only 3 observations and 3 species show. But individually, they have 6 observations (6 species) and 2 observations (2 species) respectively.

Oddly enough, if the user ids of both users are given instead of their login IDs, then the expected 8 observations and 8 species total is shown:

https://www.inaturalist.org/observations?verifiable=any&taxon_id=47178&user_id=1299792,3234999
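
Until this is fixed, the workaround could be sketched as below: build the multi-user query from numeric ids instead of logins. This is only an illustration, not what the bot does today. The helper names are made up, and the pairing of each login with a numeric id is my assumption based on the order they appear in the two URLs.

```python
from urllib.parse import urlencode

BASE = "https://www.inaturalist.org/observations"


def observations_url(taxon_id, users):
    """Build the web query for one or more users (logins or numeric ids)."""
    params = {
        "verifiable": "any",
        "taxon_id": taxon_id,
        "user_id": ",".join(str(user) for user in users),
    }
    # safe="," keeps the comma between users readable instead of %2C.
    return f"{BASE}?{urlencode(params, safe=',')}"


def with_numeric_ids(logins, id_by_login):
    """Swap each login for its numeric id using a caller-supplied mapping."""
    return [id_by_login[login] for login in logins]


# ASSUMPTION: which id belongs to which login is inferred from URL order only.
id_by_login = {"cinnamon325": 1299792, "zealouswizard": 3234999}

url = observations_url(
    47178, with_numeric_ids(["cinnamon325", "zealouswizard"], id_by_login)
)
print(url)
```

The printed URL is the same as the numeric one above, which is the form that currently gives the expected totals.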

Does this help us to understand what is going on?

Note: the numbers have changed since I left this comment yesterday, as one of the users added more observations, but the problem still exists for the 1st URL and can be worked around with the last one.

since looking at this last, i think this may actually be related to this: https://forum.inaturalist.org/t/majority-of-my-observations-do-not-appear-if-i-use-the-show-only-my-observations-feature/21746/5.

compare:

i’m just guessing here, but i suspect when you do a user_id=[single user login] query, it will look up the numeric id based on the login and then get results from the database using user_id=[single user id]. but when you do a user_id=[multiple user login] query, it will switch to user_login=[multiple user login].

originally, i thought that they must have been storing a user_login along with each observation in the observations table, but i don’t think that’s the case based on https://github.com/inaturalist/inaturalist/blob/main/db/structure.sql#L2831-L2884

so it must be making that tie some other place. i’m not certain how ElasticSearch indexes work, but i suspect the problem is there. if the tie between observations and old logins occurs in one of those, then i guess those indexes just need to be updated whenever a user changes their login.
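
to make the guess concrete, here's a toy model of it. none of this is iNaturalist code: the field names, the fake index, and the `search` function are all made up, and it only shows how a stale login copy could produce the symptom if the lookup really works the way i'm guessing.

```python
# toy model only; not iNaturalist code. all names here are hypothetical.

# current logins by numeric id (user 101 was renamed from "old_login")
users = {101: "new_login", 202: "other_user"}

# pretend search index: each document keeps a copy of the login at index time
index = [
    {"obs": 1, "user_id": 101, "user_login": "old_login"},  # indexed before the rename
    {"obs": 2, "user_id": 101, "user_login": "old_login"},  # indexed before the rename
    {"obs": 3, "user_id": 101, "user_login": "new_login"},  # indexed after the rename
    {"obs": 4, "user_id": 202, "user_login": "other_user"},
]


def search(user_param):
    """return matching observation ids for a comma-separated list of logins."""
    logins = user_param.split(",")
    if len(logins) == 1:
        # guessed single-user path: resolve the login to an id, match on the id
        ids = {uid for uid, login in users.items() if login == logins[0]}
        return [doc["obs"] for doc in index if doc["user_id"] in ids]
    # guessed multi-user path: match on the login copy stored in the index
    return [doc["obs"] for doc in index if doc["user_login"] in logins]


single = search("new_login")
both = search("new_login,other_user")
print(single)
print(both)
```

in this model the single-user query finds all three of the renamed user's observations, but the two-user query only finds the one indexed after the rename. reindexing an observation (resaving a taxon, changing the DQA) would refresh the stored login, which would fit what was seen earlier in this thread.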