Getting identification data for multiple user_ids through iNaturalist API

I’m trying to get identifications of multiple user_ids through the iNaturalist API (https://api.inaturalist.org/v1/docs/#!/Identifications/get_identifications_categories). When I enter multiple user_ids into the ‘GET/identifications/categories’ function, I keep getting 0 results, even though when I try for one of the user_ids, there clearly are results. I have tried entering these multiple user_ids both as a string separated by commas, and as multiple values in new lines as instructed in the text box for entering the user_ids. However, I get the same null results each time. Grateful for any advice on this.

what is the Request URL generated on that page when you click the Try It button?

i don’t see any issues:

not sure what you’re trying to do, but this page may or may not be useful: https://forum.inaturalist.org/t/many-ids-have-id-category-null/19061/4

The request URL generated is: https://api.inaturalist.org/v1/identifications/categories?user_id=1756341%2C1617933&current=true&order=desc&order_by=created_at

I entered the following user_id into the user_id text box (whether as same line or separate lines):
1756341,1617933

it looks like /v1/identifications/categories has a problem parsing multiple numeric user_ids for whatever reason. i would just use the string-based (login) user_ids instead of the numeric ids in this case. for example, use &user_id=pisum,tiwane instead of &user_id=779571,28.
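As a rough sketch of that workaround in Python, the request URL with comma-separated logins could be assembled like this. The helper name build_categories_url is made up for illustration; the endpoint and the user_id and current parameters are taken from the request URL posted earlier in the thread.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.inaturalist.org/v1/identifications/categories"


def build_categories_url(logins):
    """Hypothetical helper: build the request URL for one or more logins."""
    # Join the logins with commas, as in the &user_id=pisum,tiwane example
    params = {"user_id": ",".join(logins), "current": "true"}
    # urlencode percent-encodes the comma as %2C, like the Try It page does
    return f"{BASE_URL}?{urlencode(params)}"


url = build_categories_url(["pisum", "tiwane"])
print(url)
```

Nothing is sent over the network here; the URL is only printed so it can be compared with what the Try It button generates.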

i suppose this could be turned into a bug report so that someone could fix whatever the underlying problem is, but honestly, i’m not sure it’s worth the effort.

Thanks, this works. I actually would like the breakdown of results by both user_id and identification category. Do you happen to know if this is possible as well?

you’d have to just make one request per user. then you can either manually aggregate those, or issue another request for the aggregate.
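If you go the manual route, the aggregation step might look something like the sketch below. It assumes you have already collected the per-user counts into a dict shaped as {login: {category: count}}; the logins and numbers here are made up, and aggregate_categories is just an illustrative name.

```python
from collections import Counter

# Made-up per-user results, shaped as {login: {category: count}}
per_user = {
    "user_a": {"improving": 10, "leading": 4},
    "user_b": {"improving": 3, "supporting": 7, "maverick": 1},
}


def aggregate_categories(per_user_counts):
    """Sum the category counts over all users."""
    totals = Counter()
    for counts in per_user_counts.values():
        totals.update(counts)  # adds the counts key by key
    return dict(totals)


totals = aggregate_categories(per_user)
print(totals)
```

This keeps the per-user breakdown in per_user and the overall breakdown in totals, so both views are available from the same set of requests.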

I see. Do you know how I can programme a loop if, let’s say, I have thousands of user_ids?

Tried the following code in Python, but something seems wrong somewhere:

USER_IDS_SAMPLE_2 = ['pisum', 'tiwane']

responses_sample = list()

for user_id in USER_IDS_SAMPLE_2:
    response_sample = requests.get("https://api.inaturalist.org/v1/identifications/categories?user_id=user_id&current=true&order=desc&order_by=created_at").json()
    responses_sample.append(response_sample)

i’m not much of a py user (@jcook is usually the guy i would ask about py), but your request URL seems to be using a hardcoded =user_id string value rather than referencing the user_id variable from your loop. i bet if you fix that, things will work fine.
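For illustration, one way that fix could look is an f-string that substitutes the loop variable into the URL. This is only a sketch of the URL-building part: it does not send any requests, and the list name urls is made up.

```python
USER_IDS_SAMPLE_2 = ["pisum", "tiwane"]

urls = []
for user_id in USER_IDS_SAMPLE_2:
    # The f-string substitutes the loop variable; the original URL contained
    # the literal text "user_id=user_id" instead
    url = (
        "https://api.inaturalist.org/v1/identifications/categories"
        f"?user_id={user_id}&current=true&order=desc&order_by=created_at"
    )
    urls.append(url)

for url in urls:
    print(url)
```

Each URL could then be passed to requests.get(url).json() inside the loop, as in the original attempt.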

i would suggest that you incorporate a 1sec delay in the loop. iNat will throttle and potentially block you altogether if you send too many requests too quickly. here are more recommended practices for developers: https://www.inaturalist.org/pages/api+recommended+practices.

just curious, but what are you trying to do here that you would need to look up this information for this many users?

Thanks. This is actually part of a class research project in which I’m analysing various parameters of participation (e.g. number of observations, species) in a particular area, in addition to identifications.

Here’s an example:

import requests
from time import sleep

USER_IDS = ['pisum', 'tiwane']


def get_id_categories(user_id):
    """Get a summary of total identifications by category for a given user"""
    response = requests.get(
        'https://api.inaturalist.org/v1/identifications/categories',
        params={'user_id': user_id, 'current': 'true'},
    ).json()
    return {
        result['category']: result['count']
        for result in response.get('results', [])
    }


id_categories = {}
for user_id in USER_IDS:
    id_categories[user_id] = get_id_categories(user_id)
    sleep(1)

Which will give you a result like:

{
    'pisum': {
        'improving': 8546,
        'leading': 4395,
        'supporting': 3262,
        'maverick': 30,
    },
    'tiwane': {
        'supporting': 50213,
        'improving': 49764,
        'leading': 14290,
        'maverick': 67,
    },
}

Thanks @jcook. I’ll try it.

Hi @jcook,

When I use the above code, it runs successfully for about an hour before it finally fails with the message “KeyError: ‘results’”. I got the desired output only for the data processed up to that point. At other times, when I tried the code on different data samples, it returned the error “KeyError: ‘results’” immediately.

I thought you might have an idea for how to rectify this, given your expertise. Thanks very much in advance.

That just means one or more of the requests return 0 results. You can use dict.get() to handle that. I updated the example above.
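To show the difference in isolation, here is a small offline sketch of the dict.get() pattern. The two payloads are made up for illustration and are not real API responses; summarize is a hypothetical name for the parsing step.

```python
def summarize(payload):
    """Return {category: count}, or {} if the 'results' key is missing."""
    return {
        result["category"]: result["count"]
        # payload["results"] would raise KeyError when the key is absent;
        # payload.get("results", []) falls back to an empty list instead
        for result in payload.get("results", [])
    }


# Made-up payloads, not real API responses
with_results = {"results": [{"category": "leading", "count": 2}]}
without_results = {"total_results": 0}

print(summarize(with_results))
print(summarize(without_results))
```

With this pattern a user whose response lacks 'results' simply ends up with an empty dict, so the loop carries on with the remaining users.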

Thanks @jcook