I know a lot of code snippets / tools / projects have been mentioned on the forum over time (as an aside, putting together a wiki that consolidates them all in one place would be very helpful), to the point where I can't remember what's out there.
Is there one out there that lets you specify a taxon (say, by taxon ID) and then download all taxa that are descendants of it?
When I wrote a script for retrieving mammal taxonomy, I did pagination the lazy way: I put the call into my browser, noted the number of results, then calculated how many pages I needed and ran a for loop exactly that many times in Python. But you could just keep track of the total results and the per-page amount and keep going until pages * per_page exceeds the total results.
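In code, that lazier loop might look something like this (just a sketch with made-up numbers; in the real script, total_results and per_page come out of the API's JSON response):

```python
import math

# Hypothetical example values: in practice these come from the API response,
# e.g. responsejson['total_results'] and responsejson['per_page']
total_results = 7300
per_page = 200

pages_needed = math.ceil(total_results / per_page)  # the "calculate up front" way

# The "keep going" way: loop while pages * per_page hasn't covered the total yet
page = 1
fetched = 0
while (page - 1) * per_page < total_results:
    # here you would call the API with '&page=' + str(page) and process the results
    fetched += min(per_page, total_results - fetched)
    page += 1

print(pages_needed, fetched)
```

Both approaches visit the same number of pages; the loop version just doesn't need the count up front.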
I can post some python if that would be useful?
Note that the API won't return more than 10,000 records for a given set of parameters, but you can work around that limit by setting id_above / id_below.
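A sketch of that workaround, with a fake fetch function standing in for the real API call (the parameter names id_above and per_page mirror the API's, but check the docs for the endpoint you're using):

```python
# Instead of increasing the page number (which hits the 10,000-record ceiling),
# sort by id and ask each time for records whose id is above the last one seen.

def fetch_all(fetch_page, per_page=200):
    """Collect every record by repeatedly passing the highest id seen so far."""
    records = []
    id_above = 0
    while True:
        batch = fetch_page(id_above=id_above, per_page=per_page)
        if not batch:
            break
        records.extend(batch)
        id_above = batch[-1]['id']  # ids come back ascending, so the last is the highest
    return records

# Fake "API" over 500 records, just to demonstrate the loop terminates correctly
data = [{'id': i} for i in range(1, 501)]

def fake_fetch(id_above, per_page):
    matching = [r for r in data if r['id'] > id_above]
    return matching[:per_page]

print(len(fetch_all(fake_fetch)))
```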
UPDATE: I added a little thing to the wrapper page to generate a CSV file. Like the rest of the page, it's quick and dirty, but it should work well enough (though I've limited it to 10,000 records for simplicity).
I cleaned it up a bit, let me know if you have questions or it doesn’t work. The output is a csv.
import csv
import json
import time
import urllib.error
import urllib.request

# see https://api.inaturalist.org/v1/docs/#!/Taxa/get_taxa for more details on parameters
# in particular, if there are more than 10,000 results, you'll need to pare it down via parameters to get everything
taxon = 45933  # specify the taxon number here
rank = 'species'  # use '' (empty quotes) if you don't want to specify a rank

# by default calls only for active taxa, doesn't return all the names for each taxon, and 200 results per page
apiurl = 'https://api.inaturalist.org/v1/taxa?is_active=true&all_names=false&per_page=200'

def call_api(sofar=0, page=1):
    """Call the API repeatedly until all pages have been processed."""
    try:
        response = urllib.request.urlopen(apiurl + '&page=' + str(page) + '&taxon_id=' + str(taxon) + '&rank=' + rank)
    except urllib.error.URLError as e:
        print('API call failed:', e)
        return
    responsejson = json.loads(response.read().decode())
    for species in responsejson['results']:
        # lots of possible data to keep, here it's name, taxon id, and observations count
        csvwriter.writerow([species['name'], species['id'], species['observations_count']])
    if sofar + 200 < responsejson['total_results']:  # keep calling the API until we've gotten all the results
        time.sleep(1)  # stay under the suggested API calls/min, not strictly necessary
        call_api(sofar + 200, page + 1)

with open(str(taxon) + '.csv', encoding='utf-8', mode='w+', newline='') as w:  # open a csv named for the taxon
    csvwriter = csv.writer(w, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    try:
        call_api()
    except Exception as e:
        print('Error:', e)
I would be interested to download the iNat taxonomy for all the Odonata around the world. I know that the Odonata taxonomy is based on the World Odonata List, which is publicly accessible, but I would like to download the vernacular names off iNat. Is this possible? And if so, how?
Typically you would use a programming language to request the data from the API, then filter/format/output it to your needs.
The script above is written in Python, so you will need Python installed on your computer to run it. Or use something like Google Colab (Google account required) to run it online.
Yes, taxon id 47792 will give you all Odonata. Empty quotes for rank will include all ranks from order down to species, about 7,300 results (including species['rank'] in the output may be helpful).
In the script above, species['name'] will output the Latin name; adding species['preferred_common_name'] will output the (i.e. one) vernacular name, if available. The language can be set with the 'locale' and/or 'preferred_place_id' parameters, or, I think, is determined by the default language on your computer.
Set all_names=true to get all the vernacular names, but you would then need a way to sort each language into the correct column in the csv file.
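One way to do that sorting, sketched below. This assumes each taxon result carries a 'names' list whose entries have 'name' and 'locale' keys, which is roughly the shape all_names=true returns, but check the raw JSON for the exact field names before relying on it:

```python
locales = ['en', 'de', 'fr']  # the language columns you want, in order

def names_by_locale(result, locales):
    """Return one common name per requested locale ('' if none found)."""
    found = {}
    for n in result.get('names', []):
        loc = n.get('locale', '')
        if loc in locales and loc not in found:  # keep the first name per language
            found[loc] = n['name']
    return [found.get(loc, '') for loc in locales]

# Hypothetical result illustrating the assumed shape of one taxon record
result = {'name': 'Anax imperator',
          'names': [{'name': 'Emperor Dragonfly', 'locale': 'en'},
                    {'name': 'Große Königslibelle', 'locale': 'de'}]}

row = [result['name']] + names_by_locale(result, locales)
# row can now go straight to csvwriter.writerow(row)
print(row)
```

Missing languages simply come out as empty cells, so every row has the same number of columns.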
[EDIT] This might not make much sense if you're just starting with programming, but hopefully it gives you some pointers for when you get the hang of it.
Thank you for the advice. I am playing around on Colab. It will be a while till I have it figured out.
How do you get it to output the .csv file? Or, when I run it, where do I find the outputs?
I have set all_names to true. What can I add to get the common names? I assume I will have to add something like csvwriter.writerow([species['name'], species['common_name']]) and then add several common-name columns so that I can recall all the common names?