API v2 question: How do you get the UUID for an observation?

In the new iNat API, observations are accessed by UUID instead of observation ID. So if you only know the observation ID, how do you get the UUID via the API (in order to retrieve the rest of the observation data)?

Not sure if this fully answers your question, but check this comment by @pisum:

the v2 API is still early in development. is there a particular reason you’re trying to use that right now?

Thanks! That response explains how to retrieve an observation via an observation ID using the website, but I’m interested in how to do it via the API.

I have some scripts written for API v1 and I was investigating whether or not they will be able to migrate to API v2. Since everything in API v2 is based on UUID, it seems pretty critical that there’s a way to get the UUID using the observation ID, but so far I haven’t found a way to do this. Is there a way, currently? If not, I’ll request it on GitHub.

One way is via GET /observations with id_above and id_below, using values one above and one below the required integer ID. This seems a little hacky, but I was able to get a valid response with the requested UUID via the API tester - so it at least shows it can be done. I only had a quick look at the options, though, so there may be a better way.
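For anyone who wants to try the bracketing trick from a script, a rough sketch in Python might look like this. The helper names `uuid_lookup_url` and `extract_uuids` are made up for illustration, the `fields=uuid` parameter is assumed to work here the same way it does in other v2 requests, and there is no guarantee that `id_above` and `id_below` will remain in v2.

```python
from urllib.parse import urlencode

# Base URL of the v2 observations search endpoint.
V2_OBSERVATIONS = "https://api.inaturalist.org/v2/observations"


def uuid_lookup_url(observation_id: int) -> str:
    """Build a v2 search URL that brackets one numeric observation ID.

    Hypothetical helper: it relies on id_above/id_below being exclusive
    bounds, so only the requested ID falls between them.
    """
    if observation_id < 1:
        raise ValueError("observation IDs are positive integers")
    params = {
        "id_above": observation_id - 1,
        "id_below": observation_id + 1,
        "fields": "uuid",  # assumption: ask for the uuid field only
    }
    return f"{V2_OBSERVATIONS}?{urlencode(params)}"


def extract_uuids(response: dict) -> list:
    """Collect the uuid of every record in a decoded JSON response."""
    return [record["uuid"] for record in response.get("results", [])]
```

The URL from `uuid_lookup_url` can be fetched with any HTTP client, and the decoded JSON passed to `extract_uuids`, which should return at most one UUID if the bounds behave as assumed.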

i assume the functionality will be added later, either by having id accept legacy numeric IDs when the input is a number and UUIDs when the input is a string, or by adding a legacy_id parameter. i wouldn’t assume that the id_above and id_below parameters will remain in the final iteration of the v2 API. i also wouldn’t assume that new observations will still be assigned a numeric ID once v2 is active. there are good reasons that you would want to move folks away from the existing IDs to UUIDs in the long run.

I guess you are talking about:
https://forum.inaturalist.org/t/api-v2-feedback/21215

Part of the point of API v2 is to remove the mapping between id and uuid, so a v2 method of doing that would defeat the purpose. For now, though, you can use API v1 for this. Here’s an example using curl and jq:

curl "https://api.inaturalist.org/v1/observations/2" | jq ".results[0].uuid" 

That’s assuming you’ve already got some local records you need to map. If you’re just working with data fresh from the API, uuid should be available in observation responses, e.g.

curl "https://api.inaturalist.org/v2/observations?user_id=kueda&fields=uuid" | jq ".results[].uuid" 
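For the first case (local records keyed by numeric ID), the v1 lookup could be wrapped in a script along these lines. This is only a sketch: the function names are made up, the observation URL shape handled by `parse_observation_id` is an assumption, and it makes one request per observation rather than batching.

```python
import json
import re
from urllib.request import urlopen

# v1 endpoint pattern taken from the curl example above.
V1_OBSERVATION = "https://api.inaturalist.org/v1/observations/{id}"


def parse_observation_id(ref: str) -> int:
    """Accept a bare numeric ID or a URL ending in /observations/<id>."""
    match = re.fullmatch(r"(?:https?://\S+/observations/)?(\d+)", ref.strip())
    if match is None:
        raise ValueError(f"not an observation ID or URL: {ref!r}")
    return int(match.group(1))


def uuid_from_response(payload: dict) -> str:
    """Follow the same path as the jq filter .results[0].uuid."""
    return payload["results"][0]["uuid"]


def fetch_uuid(ref: str) -> str:
    """Network call: look up one observation through v1, return its UUID."""
    url = V1_OBSERVATION.format(id=parse_observation_id(ref))
    with urlopen(url) as response:
        return uuid_from_response(json.load(response))
```

Calling `fetch_uuid("2")` would make the same request as the first curl command. A real script would also want rate limiting and error handling for missing or inaccessible observations.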

Part of the point of API v2 is to remove the mapping between id and uuid, so a v2 method of doing that would defeat the purpose.

Can you explain more about this? Is the plan to eventually switch over all the observation URLs to use uuids instead of ids? Currently, the only identifier that a naive user can find out for an observation is the ID, so it seems like that would need to change before API v2 would really be usable for 3rd party scripts. For example, I wrote a script that takes a list of observation IDs as input (or observation URLs) and outputs the observation data formatted for BOLD ingestion. If I wanted to migrate that script to API v2, how would I do that? Or would that just not make sense to do at this point? Is there some kind of roadmap for API v2 that could be shared with 3rd party developers using the iNat API?

Well, the original plan was to switch to UUID-based URLs for obscured records as another roadblock to coordinate interpolation, but it’s becoming increasingly clear that juggling two identifiers is kind of terrible, so now I’m not sure. Need to discuss more w/ the team.

I don’t think you could. You’d need to build another way for people to specify what observations they want, or we’d need to list UUIDs in a more obvious place for people to cut and paste, or we’d need to build some ID-to-UUID mapping service like you describe, but one that omitted obscured records.

Alas no. We’re trying to transition ourselves over to it… slowly. I think once we have the web and mobile apps using it exclusively we’d try to notify people that v1 is deprecated, but we don’t have a due date.

why wouldn’t you switch to UUIDs for all records?

We would break every single iNat observation URL on the Internet. As with every kind of security, we have to find a balance between security and utility.

there are ways to go to UUIDs for all records without breaking everything from the past, i think.

you could start by announcing the intention to switch to UUIDs as the primary identifier at some point, and let folks know that the numeric IDs would be deprecated for all observations and then shut down for new observations altogether sometime after. (observations that had been assigned a numeric ID would retain their numeric IDs and generally could still be referenced by them, except obscured and private observations for which you did not have access to see full details.)

some time after the announcement, you would switch all GUIs, exports, etc. to display UUID as the primary identifier. at the same time, the v1 API would be changed to allow input of UUIDs in the id parameter, based on whether the input was a number or a string. the v2 API would be changed to allow similar functionality.

some time after the above change, you would shut down the old old API and also stop the v1 and v2 APIs from being able to accept the numeric IDs as inputs to find obscured and private observations for which you did not have access to see the full details. this would serve as a warning to developers to switch over to UUIDs, if they hadn’t already.

some time after that, stop assigning numeric IDs to new observations.

some time after that, shut down v1.
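The dispatch rule in the second step (treat the id input as a legacy ID if it is a number, and as a UUID if it is a string) could be sketched like this in Python. This is purely illustrative: `classify_identifier` and the `"legacy_id"` / `"uuid"` labels are made-up names, not anything the iNat API exposes.

```python
import uuid


def classify_identifier(value: str):
    """Decide whether an id input is a legacy numeric ID or a UUID.

    Returns ("legacy_id", int) or ("uuid", canonical lowercase string).
    Hypothetical helper for illustration only.
    """
    text = value.strip()
    # Plain ASCII digits are treated as a legacy numeric ID.
    if text.isascii() and text.isdigit():
        return ("legacy_id", int(text))
    # Anything else has to parse as a UUID or it is rejected.
    try:
        parsed = uuid.UUID(text)
    except ValueError as exc:
        raise ValueError(f"neither a numeric ID nor a UUID: {value!r}") from exc
    return ("uuid", str(parsed))
```

One quirk of this rule is that an all-digit string is always read as a legacy ID, so UUIDs would need to be supplied in their usual hyphenated form to avoid ambiguity.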

now, if the issue is that it’s much harder for humans to remember and type UUIDs than the existing IDs, then that’s a reasonable concern. if this were to be the main concern that kept you from switching over to UUIDs, then that’s fine, but i would still switch over to displaying UUIDs as the primary identifier for all observations (but still display numeric IDs as a secondary identifier). once folks had a chance to operate with that kind of paradigm for a while, then ask them again whether they think they couldn’t live without the numeric IDs, and if they still say they couldn’t, then just keep assigning the numeric IDs to new observations.

My concern has been along these same lines, that it becomes much messier and more typo-prone to cite iNat observations in research publications.

My alternative suggestion would be numeric IDs similar to the present ones, but assigned randomly instead of sequentially from within a large “block” of numbers that would roll over to a new block when the current block was “used up.”

i’m not against this generally, but i don’t think it’s technically efficient to assign IDs this way.

I’m not familiar with how UUIDs are generated and assigned, so am unable to compare the two options.

But with ID numbers, it seems like something as primitive as pre-randomizing a numeric block and storing it in a background table could make it pretty efficient to pull and assign the next random ID number to a newly created observation. Then just automate it so that the next block gets pre-randomized and readied for use when an appropriate number of rows in the current table have been assigned. Technically unconventional and inelegant maybe, but I’d bet it could be done reliably and seamlessly with negligible performance impact.
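To make the idea concrete, here is a minimal in-memory sketch in Python. It is a stand-in under stated assumptions: `RandomBlockAllocator` is a made-up name, a plain list takes the place of the background table, and the next block is shuffled only when the current one runs out rather than being prepared ahead of time.

```python
import random


class RandomBlockAllocator:
    """Hand out IDs in random order from consecutive fixed-size blocks."""

    def __init__(self, block_size: int, start: int = 1, rng=None):
        if block_size < 1:
            raise ValueError("block_size must be at least 1")
        self.block_size = block_size
        self.next_block_start = start
        self.rng = rng or random.Random()
        self.pool = []  # shuffled IDs left in the current block

    def _refill(self) -> None:
        """Shuffle the next block of IDs and make it the current pool."""
        end = self.next_block_start + self.block_size
        block = list(range(self.next_block_start, end))
        self.rng.shuffle(block)
        self.pool = block
        self.next_block_start = end

    def next_id(self) -> int:
        """Return an unused ID, rolling over to a new block when needed."""
        if not self.pool:
            self._refill()
        return self.pool.pop()
```

Each ID is issued exactly once, and IDs from a later block are never issued before the current block is used up. A real implementation would need to keep the pool in the database and make the pop safe under concurrent inserts.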

I think you’re up against the law of diminishing returns when it comes to obscured observations. It’s already much easier to use social engineering to bypass the obscuring mechanisms than it is to use coordinate interpolation (AFAICT). Personally, I would be skeptical of any additional observation security measures that negatively impact utility. It’s just not worth the trade-off, IMO.

Back before the days of iNaturalist, I used to have a hobby of trying to photograph rare flower species. In every case, I was able to locate the species by simply asking people who had posted photos of them on the internet. In most cases, they knew the species were endangered and even knew that they shouldn’t disclose the location, but with a tiny bit of cajoling and reassuring them that I only wanted to take photos, 90% of people would either tell me the location or give me clues about the location. If I had been a poacher, it would have been a disaster. But now I understand why it’s so hard to protect endangered species from poachers! I’m not saying that observation security isn’t important, just that it’s only effective to a point.

if you’re going to go with this kind of approach, i suspect it would be better implemented as effectively a link shortening service, where you could create a short link or identifier only for selected observations.

I could live with that for citation purposes for sure. Not sure it would mitigate other use cases where short IDs are helpful (like querying a specific set of observations for example). Mainly I’m just glad that the development staff is fully aware of the downsides of going exclusively with UUIDs, and that they have some alternate ideas to chew on…