So I have been downloading observations as CSV files, using the website query options.
I was reading the API documentation, but it doesn’t seem like the API gives me the full set of filter features that I have in the website filters… not all the values I can provide (neither the query params nor the requested fields) seem to be available in the “GET /observations” API call.
Is this the case? If so, why doesn’t the API provide all the query features the website does? If it’s not the case, which API call would I use to mimic what I can do with the website Observations filters?
My goal is to be able to pull all the same data programmatically as I can in the website filters, but with fewer steps - I can just provide my script with the params that differ from run to run without having to step through all the web pages.
I recently switched from using the web page download to using the API (via Python). I haven’t done a detailed comparison, but as far as filtering parameters go, you may not have exactly the same ones available via the API; however, your code can always discard any observations that do not meet your filtering criteria.

As far as requested fields go, if you’re using “get observations”, it returns the entire observation. The onus is on you to figure out where the various bits of information are stored in the response. This is a bit complicated, but it’s something you only have to do once.

It also exposes a whole raft of fields that are NOT available to you via the web interface - once you figure out where to find them in the data structures. This not only gives you additional information, but it allows you to do more complex filtering of the observations. I found it possible to do more sophisticated filtering using the API than what I was able to do using the filtering parameters on the website (naturally, YMMV).
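To make that concrete, here’s a rough, untested sketch of the pattern: server-side filters go in the query params (these are real GET /observations parameters), and anything the API can’t express gets filtered out in your own code. The two-photo criterion is just an example of a client-side filter.

```python
import requests

API_URL = "https://api.inaturalist.org/v1/observations"

def fetch_observations(**params):
    """Yield observation records, following the paginated results."""
    # note: v1 stops serving results past 10,000 per query, so narrow
    # queries (e.g. by date range) are needed for big pulls
    page = 1
    while True:
        resp = requests.get(API_URL, params={**params, "page": page, "per_page": 200})
        resp.raise_for_status()
        results = resp.json()["results"]
        if not results:
            return
        yield from results
        page += 1

# server-side filters: real v1 query params (values are just examples)
obs = fetch_observations(taxon_id=48662, quality_grade="research",
                         d1="2024-01-01", d2="2024-12-31")

# client-side filter for anything the API can't express directly -
# here, keeping only observations with two or more photos
kept = [o for o in obs if len(o.get("photos", [])) >= 2]
print(len(kept))
```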
It took me a bit of time to write a program that would iterate over the different permutations of requests that I used to perform manually, but now I just set up the date parameters within my program and click “run” - it does everything, including putting the data into a “nicer” format than what the web download provides.

Recently, I was investigating a problem within iNat and wanted an additional field for the records to aid in the investigation: the app through which each observation was submitted (web, Android, iPhone, or Seek). This isn’t available via the web download, but it’s in the API response data structure. It took some digging to find it in the data structures, but then I just added that reference into the code and bingo, I now have that additional information.
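The lookup itself is trivial once you’ve located the field. A hedged sketch - the key name here is my reading of where the submitting app is recorded in the v1 JSON, so verify it against a real response before relying on it:

```python
def submitted_via(observation: dict):
    """Return the app id an observation was submitted through.

    "oauth_application_id" is where the submitting application appears
    to live in the v1 JSON - treat the key name as an assumption and
    confirm it against a real record.
    """
    return observation.get("oauth_application_id")

# e.g. on a record fetched as in the earlier sketch:
# submitted_via(kept[0])
```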
My only regret is that I didn’t migrate over to using the API sooner.
as far as i can tell, the web pages provide a subset of the filter parameters available in the API, unless you’re using the deprecated API (which provides far fewer filter parameters).
it would be unfortunate if you spent a lot of time developing something against the old API.
Not sure, but I think I’m using V1. Like I said, I didn’t check to see if every filter parameter that I was using on the web interface was duplicated in the API. I found I could do what I needed to do, and it works at present.
v1 is the current API. so you did not develop against the deprecated API. there is also a v2, but it’s still in beta as far as i know.
users of Python can use pyinaturalist to simplify retrieval of data, although it doesn’t necessarily help to lay the data out in a tabular format, which can be extra work for fields that can contain multiple values.
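for example, something like this could pull results and flatten them into rows (a rough sketch - the taxon_name value and column choices are just examples, and note how the multi-value photos field has to be collapsed by hand):

```python
import csv
from pyinaturalist import get_observations

# taxon_name and the chosen columns are just examples
response = get_observations(taxon_name="Danaus plexippus", per_page=50)

rows = []
for o in response["results"]:
    rows.append({
        "id": o["id"],
        "observed_on": o.get("observed_on"),
        "taxon": (o.get("taxon") or {}).get("name"),
        # a multi-value field: collapse the photo URLs into one cell
        "photo_urls": "; ".join(p["url"] for p in o.get("photos", [])),
    })

if rows:
    with open("observations.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)
```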
i made some Jupyter notebooks that work with JupyterLite implementations to allow folks to run stuff entirely in their browsers without having to explicitly set up a Python environment on their machines. one of them has the bones to get observation data and put it in a tabular format, get extra information like related standard places, and do some additional filtering that’s not possible when making the get requests.
Yes, I’m using pyinaturalist. I was already managing observation data using Python, so it wasn’t like I was setting up a whole environment from scratch. This was just an extension of what I was already doing - instead of downloading the raw observation data from the website and then processing it using Python code, I am now doing the download via Python as well.
I should add that I’m doing very basic stuff as far as the programming goes. My main priority is getting the data and cleaning it up with as little effort as possible. I’m not spending any time on making the code pretty or sharable.
All of the filters shown in the first section of the export page have a corresponding query parameter listed under GET /observations except for not_in_place. However, that parameter can certainly be used in both website urls and API requests, so I’m not sure why it’s missing from the API docs.
It’s probably possible to append any query created on the export page to https://api.inaturalist.org/v1/observations? and get the same result set. However, there’s no way to specify the requested fields via the API - you have to extract all that information yourself from the returned json data. (The only exception to this is only_id, which reduces the results to a list of record IDs).
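For example, something along these lines should work (untested sketch; the parameter values are placeholders):

```python
import requests

# query string copied straight from the export page (values are examples);
# not_in_place works here even though it's missing from the API docs
query = "taxon_id=48662&place_id=14&not_in_place=123&quality_grade=research"

r = requests.get(f"https://api.inaturalist.org/v1/observations?{query}")
r.raise_for_status()
print(r.json()["total_results"])

# only_id=true reduces each result to just the record id
r = requests.get(f"https://api.inaturalist.org/v1/observations?{query}&only_id=true")
print([o["id"] for o in r.json()["results"]])
```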