iNaturalist's API, Python and ChatGPT

I have been saving an enormous amount of time by using ChatGPT to write Python code to access the API. For example, I wanted to look again at my older records to see whether on the basis of my more recent observations and using the updated AI tool I could agree or disagree with IDs suggested by Identifiers. Here is what I asked ChatGPT:

“I want to list in an HTML table all Inaturalist observations of any taxon where the grade is Needs_id, the Community rank is at least genus, where the observer is [me_id], there is an identifier who is not [me_id], and [me_id] has not agreed with the community taxon id. I want the HTML to have a link to each record. Using python. Please include debugging lines.”

It gave me the Python for this instantly. I was then able to ask it to refine the code to add additional columns to the table.

I have also used ChatGPT to write Python code, HTML and CSS for a responsive webpage that displays iNaturalist photos of several species listed in a CSV file, with links to the iNaturalist records (and displaying copyright attribution).

ChatGPT can be used for text queries that would not be possible using an ordinary search engine. For example:

  • “What species of very small white mushroom on a stalk could be growing on the trunk of a living deciduous tree in moss in London UK in December?”

  • “Display in two columns the visible differences and other differences between X species and Y species.”

i’d be a little surprised if GPT gave you code that actually gets what you’re looking for.

below is what i got from CoPilot, which i believe is built on some version of GPT4 (though it might revert to 3.5 when there’s higher demand). it’s not bad as a starting point, but it’s not the right answer.

```python
import requests
import pandas as pd

# Replace 'me_id' with your actual iNaturalist user ID
me_id = 'YOUR_USER_ID'
url = f"https://api.inaturalist.org/v1/observations?user_id={me_id}&quality_grade=needs_id&rank=genus"

# Fetch data from iNaturalist
response = requests.get(url)
data = response.json()
print("Data fetched successfully")  # Debugging line
print(f"Total observations retrieved: {len(data['results'])}")  # Debugging line

observations = []
for obs in data['results']:
    identifier_ids = [identifier['user']['id'] for identifier in obs['identifications']]
    if me_id not in identifier_ids:
        community_taxon_id = obs.get('community_taxon_id')
        if community_taxon_id and obs['community_taxon']['rank'] == 'genus' and obs['user']['id'] != me_id:
            observations.append({
                'ID': obs['id'],
                'Species': obs['taxon']['name'],
                'Link': f"https://www.inaturalist.org/observations/{obs['id']}"
            })

print(f"Filtered observations: {len(observations)}")  # Debugging line

# Create an HTML table
html_content = """
<table border="1">
    <tr>
        <th>ID</th>
        <th>Species</th>
        <th>Link</th>
    </tr>
"""
for obs in observations:
    html_content += f"""
    <tr>
        <td>{obs['ID']}</td>
        <td>{obs['Species']}</td>
        <td><a href="{obs['Link']}">{obs['Link']}</a></td>
    </tr>
    """
html_content += "</table>"

print("HTML table created successfully")  # Debugging line

# Save HTML content to a file
with open('observations.html', 'w') as file:
    file.write(html_content)

print("HTML file saved successfully")  # Debugging line
```

below is what i got from chatgpt.com, which looks like another version of GPT4. it’s different but still makes the same sort of mistakes as above.

```python
import requests
import json

# Replace with your actual user ID
me_id = "your_user_id"

# Define the base URL for the iNaturalist API
base_url = "https://api.inaturalist.org/v1/observations"

# Define the query parameters for the iNaturalist API
params = {
    "user_id": me_id,  # Filter by observer
    "quality_grade": "needs_id",  # Filter by quality grade
    "rank": "genus",  # Filter by community rank
    "has": "identifications",  # Filter observations that have identifiers
}

# Define a function to check if [me_id] has disagreed with the community taxon
def disagreed_with_community(observation):
    for identification in observation['identifications']:
        if identification['user']['id'] == int(me_id) and identification['taxon']['id'] != observation['taxon']['id']:
            return True
    return False

# Function to fetch observations from iNaturalist API
def fetch_observations():
    print("Fetching observations...")
    response = requests.get(base_url, params=params)
    print(f"Request URL: {response.url}")  # Debugging: Check the request URL
    if response.status_code != 200:
        print(f"Error: Failed to fetch data (Status Code: {response.status_code})")
        return []

    data = response.json()
    print(f"Fetched {len(data['results'])} observations.")  # Debugging: Check how many results we fetched
    return data['results']

# Function to generate the HTML table
def generate_html_table(observations):
    html_content = """
    <html>
    <head>
        <title>iNaturalist Observations</title>
    </head>
    <body>
        <h1>Observations Needing ID</h1>
        <table border="1">
            <thead>
                <tr>
                    <th>Observation</th>
                    <th>Taxon</th>
                    <th>Community Taxon</th>
                    <th>Identifier</th>
                    <th>Link</th>
                </tr>
            </thead>
            <tbody>
    """

    for obs in observations:
        if disagreed_with_community(obs):  # Only include observations where the user has disagreed
            observation_id = obs['id']
            taxon_name = obs['taxon']['name']
            community_taxon = obs['community_taxon'] if obs['community_taxon'] else 'None'
            identifiers = [identification['user']['login'] for identification in obs['identifications'] if identification['user']['id'] != int(me_id)]
            identifier_names = ', '.join(identifiers) if identifiers else 'None'
            observation_link = f"https://www.inaturalist.org/observations/{observation_id}"

            # Debugging: Print each observation's relevant details
            print(f"Observation ID: {observation_id}, Taxon: {taxon_name}, Identifiers: {identifier_names}, Link: {observation_link}")

            html_content += f"""
                <tr>
                    <td>{observation_id}</td>
                    <td>{taxon_name}</td>
                    <td>{community_taxon}</td>
                    <td>{identifier_names}</td>
                    <td><a href="{observation_link}" target="_blank">View</a></td>
                </tr>
            """

    html_content += """
            </tbody>
        </table>
    </body>
    </html>
    """

    return html_content

# Main execution block
def main():
    observations = fetch_observations()
    if observations:
        html_content = generate_html_table(observations)

        # Save the HTML to a file
        with open("observations_table.html", "w") as file:
            file.write(html_content)
        print("HTML table generated and saved as 'observations_table.html'")
    else:
        print("No observations fetched or no matching records found.")

# Run the script
if __name__ == "__main__":
    main()
```

OK, I made a small mistake in the Prompt which I used initially. My Prompt was actually this:

“I want to list in html all observations of any taxon where the grade is needs_id, the rank is genus or higher but not as high as species, where the observer is [me_id], there is an identifier who is not [me_id], and [me_id] has not agreed with the community taxon id. I want the html to have a link to each record. Using python.”

I was sent the python code below. It did not work first time because of the error which I made in my Prompt. I asked it to add some debugging lines, and then I found my error.

The only change which I had to make in order to get the expected results was to change these lines:
```python
"rank": "genus",  # Minimum rank is genus
"hrank": "species",  # Maximum rank is species (not species itself)
```
to this:
```python
"hrank": "genus",  # Minimum rank is genus
# "hrank": "species",  # Maximum rank is species (not species itself)
```
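
For anyone following along, here is a minimal sketch of how those rank parameters end up in the request URL. It assumes, going by my reading of the API documentation, that `hrank` is the highest rank allowed and `lrank` the lowest rank allowed, so it is worth checking against the docs before relying on it. `build_query` is a hypothetical helper name and `[me_id]` stays a placeholder. The sketch only builds the URL and sends nothing.

```python
from urllib.parse import urlencode

API_URL = "https://api.inaturalist.org/v1/observations"

def build_query(user_id, hrank=None, lrank=None, per_page=100, page=1):
    """Return the full request URL for a needs_id query (hypothetical helper)."""
    params = {
        "user_id": user_id,
        "quality_grade": "needs_id",
        "per_page": per_page,
        "page": page,
    }
    # Only send the rank limits that were actually requested.
    # Assumption: hrank = highest rank allowed, lrank = lowest rank allowed.
    if hrank:
        params["hrank"] = hrank
    if lrank:
        params["lrank"] = lrank
    return f"{API_URL}?{urlencode(params)}"

# Print the URL so it can be pasted into a browser and inspected by hand.
print(build_query("[me_id]", hrank="genus"))
```

Looking at the generated URL in a browser before running the full script is a quick way to see whether the rank filter returns the records you expect.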

I then asked it to give me the output in an HTML table and to add some additional columns. I have been using the code successfully with (so far as I recall) no other required bug fixes.
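
As a rough illustration of the "HTML table with additional columns" step, here is a minimal sketch. It is not the code ChatGPT produced; `render_table` and the column names (`id`, `taxon`, `url`) are hypothetical, and the sample row is made up. It escapes the values so that a taxon name containing `<` or `&` does not break the page.

```python
from html import escape

def render_table(rows, columns):
    """Build an HTML table from a list of dicts; `columns` sets the column order."""
    head = "".join(f"<th>{escape(col)}</th>" for col in columns)
    body = []
    for row in rows:
        cells = []
        for col in columns:
            value = escape(str(row.get(col, "")))
            # Turn the hypothetical "url" column into a clickable link.
            if col == "url":
                value = f'<a href="{value}">{value}</a>'
            cells.append(f"<td>{value}</td>")
        body.append("<tr>" + "".join(cells) + "</tr>")
    return f"<table><tr>{head}</tr>{''.join(body)}</table>"

# Made-up sample row, only to show the shape of the input.
rows = [{"id": 1, "taxon": "Quercus <sp.>", "url": "https://www.inaturalist.org/observations/1"}]
print(render_table(rows, ["id", "taxon", "url"]))
```

Adding a column then only means adding a key to each row and a name to the `columns` list.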

I have also used ChatGPT several times to write Python code for other fiddly things with the API, such as tasks which required combining results from “observations” with results from “taxa”.

I would be a disbeliever like you if I had not actually been successfully doing this for several months. It is a huge time-saver.

import requests
import json

iNaturalist API endpoint

API_URL = “https://api.inaturalist.org/v1/observations

Parameters for the API query

params = {
“user_id”: “[me_id]”, # Filter observations by the observer
“quality_grade”: “needs_id”, # Observations needing identification
“rank”: “genus”, # Minimum rank is genus
“hrank”: “species”, # Maximum rank is species (not species itself)
“per_page”: 100, # Number of observations per page
}

def fetch_observations():
“”“Fetch observations from the iNaturalist API based on the query parameters.”“”
observations =
page = 1
while True:
params[“page”] = page
response = requests.get(API_URL, params=params)
data = response.json()
observations.extend(data[“results”])
if not data[“results”] or page >= data[“total_pages”]:
break
page += 1
return observations

def filter_observations(observations):
“”“Filter observations based on the additional criteria.”“”
filtered =
for obs in observations:
# Check if [me_id] is the observer
if obs[“user”][“login”] != “[me_id]”:
continue

    # Check if `[me_id]` is among identifiers and has agreed with the community taxon
    has_[me_id]_identified = False
    has_other_identifier = False
    for idf in obs["identifications"]:
        if idf["user"]["login"] == "[me_id]":
            has_[me_id]_identified = True
            if idf["current"] and idf["taxon"]["id"] == obs.get("community_taxon", {}).get("id"):
                break
        else:
            has_other_identifier = True
    else:
        if has_[me_id]_identified and has_other_identifier:
            filtered.append(obs)
return filtered

def generate_html(observations):
“”“Generate an HTML file with links to the observations.”“”
html_content = “

Filtered Observations


with open(“observations.html”, “w”) as f:
f.write(html_content)
print(“HTML file generated: observations.html”)

Main process

observations = fetch_observations()
filtered_observations = filter_observations(observations)
generate_html(filtered_observations)

That came out a bit garbled. How should I put nicely formatted code in a box here?

I’m still a bit leery about using ChatGPT for this, since I remember asking it about basic bird ID (e.g. robin vs. bluebird) 2 years ago and it made up hilariously wrong answers. More recently I had similar experiences with obscure biology questions, although it is much better now. I trust Perplexity more since it gets info directly from sources rather than indirectly from its memory of training data (unless I’m misunderstanding how GPT works or there’s a pro version that can scan the internet live). I’m still not sure I’d trust it on fungi, since so much information about understudied taxa like that is just not on the internet at all.

When you’re making a reply you’ll see a </> button in the banner at the top, with the other text formatting options. It inserts a pair of triple backticks, like this, and you put the code between them:
```
type or paste code here
```

(added “iNaturalist’s” to the title)

Here is the code which I did not display correctly, with the error corrected:


```python
import requests
import json

# iNaturalist API endpoint
API_URL = "https://api.inaturalist.org/v1/observations"

# Parameters for the API query
params = {
    "user_id": "[me_id]",  # Filter observations by the observer
    "quality_grade": "needs_id",  # Observations needing identification
    "hrank": "genus",
    # "hrank": "species",  # Maximum rank is species (not species itself)
    "per_page": 100,  # Number of observations per page
}

def fetch_observations():
    """Fetch observations from the iNaturalist API based on the query parameters."""
    observations = []
    page = 1
    while True:
        params["page"] = page
        response = requests.get(API_URL, params=params)
        data = response.json()
        observations.extend(data["results"])
        if not data["results"] or page >= data["total_pages"]:
            break
        page += 1
    return observations

def filter_observations(observations):
    """Filter observations based on the additional criteria."""
    filtered = []
    for obs in observations:
        # Check if `[me_id]` is the observer
        if obs["user"]["login"] != "[me_id]":
            continue

        # Check if `[me_id]` is among identifiers and has agreed with the community taxon
        has_me_identified = False
        has_other_identifier = False
        for idf in obs["identifications"]:
            if idf["user"]["login"] == "[me_id]":
                has_me_identified = True
                if idf["current"] and idf["taxon"]["id"] == obs.get("community_taxon", {}).get("id"):
                    break
            else:
                has_other_identifier = True
        else:
            if has_me_identified and has_other_identifier:
                filtered.append(obs)
    return filtered

def generate_html(observations):
    """Generate an HTML file with links to the observations."""
    html_content = "<html><body><h1>Filtered Observations</h1><ul>"
    for obs in observations:
        obs_url = f"https://www.inaturalist.org/observations/{obs['id']}"
        html_content += f'<li><a href="{obs_url}" target="_blank">Observation {obs["id"]}</a></li>'
    html_content += "</ul></body></html>"
    with open("observations.html", "w") as f:
        f.write(html_content)
    print("HTML file generated: observations.html")

# Main process
observations = fetch_observations()
filtered_observations = filter_observations(observations)
generate_html(filtered_observations)
```

ChatGPT certainly makes mistakes on species identification or species description, but it does sometimes point me towards an answer which I can verify as being correct which I failed to find by other means, and it often saves me time.

the code you got still looks wrong to me. the filter_observations function doesn’t look to me like it’s doing something a human would actually do.

you wanted to filter for cases:

there’s a bit of ambiguity in your prompt (related to how to define “agree” and whether you actually want the “community taxon id”), but i would guess that if you ran some test cases through to see if you’re actually getting what you want, you would discover that you’re not getting what you think you should be getting.
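
One way to run such test cases without touching the API is sketched below. It is only an illustration under stated assumptions: `needs_my_review`, `ident` and `make_obs` are hypothetical names, the records are made up, and the field names (`user.login`, `identifications`, `current`, `taxon.id`, `community_taxon_id`) are assumptions about the shape of the API response. "Agree" is taken here to mean a current identification whose taxon is exactly the community taxon, which is just one possible reading of the prompt.

```python
def needs_my_review(obs, me):
    """True if `me` observed it, someone else has a current ID, and `me`
    has no current ID that matches the community taxon exactly."""
    community = obs.get("community_taxon_id")
    if obs["user"]["login"] != me or community is None:
        return False
    # Withdrawn identifications are ignored.
    current = [i for i in obs.get("identifications", []) if i.get("current")]
    others = any(i["user"]["login"] != me for i in current)
    agreed = any(
        i["user"]["login"] == me and i["taxon"]["id"] == community
        for i in current
    )
    return others and not agreed

def ident(login, taxon_id, current=True):
    """Build a made-up identification record for testing."""
    return {"user": {"login": login}, "taxon": {"id": taxon_id}, "current": current}

def make_obs(observer, community, idents):
    """Build a made-up observation record for testing."""
    return {"user": {"login": observer}, "community_taxon_id": community,
            "identifications": idents}

# Synthetic cases; the taxon ids are arbitrary numbers.
cases = {
    "other identifier, I have not agreed": make_obs("me", 10, [ident("me", 5), ident("bob", 10)]),
    "I already agree": make_obs("me", 10, [ident("me", 10), ident("bob", 10)]),
    "nobody else has identified": make_obs("me", 5, [ident("me", 5)]),
    "I never added an ID": make_obs("me", 10, [ident("bob", 10)]),
}
for name, obs in cases.items():
    print(f"{name}: {needs_my_review(obs, 'me')}")
```

The last case is where the definition matters: whether an observation the observer never identified should be listed is a choice the prompt leaves open, and a filter that requires the observer to have added an ID will answer it differently.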

so based on what i’m seeing, i think this still stands:

Well I am getting what I was aiming to get (and I agree that I did not define what I was aiming to get as precisely as I should have done). I cut and paste the output into Excel and have been using the output for a couple of days now.

I do agree with you that there is some odd-looking code in filter_observations which I would not have written, but with a few small enhancements (as opposed to bug fixes) it does work to give me what I wanted, and I have been able to modify it without it breaking.

i’m not saying that the code is written in a non-human style. i’m saying that what the code is doing is not what you’re actually asking for. maybe it’s close to what you’re asking for – close enough that you don’t or won’t notice any difference – and i guess if you don’t see any problems with it even after my prompting, then who am i to say it’s not right for you?

I did this for Lampyridae larvae and Lycidae larvae and the result is mostly useless