What would you like to learn about getting data from the system?

pisum · January 4, 2021, 11:31pm

it’s the concept that’s important, i think. it’s probably not a bad thing to talk about the capabilities of the old API, but honestly, i wouldn’t show people how to build stuff using it. (deprecated in my mind means that it could disappear with little notice, and i wouldn’t want people to get too attached to a deprecated API.)

as far as i can tell, pagination is not officially documented in that particular endpoint. so i never thought too much about it and figured it was intentional (but i never actually asked, since i never had any use for that data beyond top 500). in muir’s example use case, pages >1 shouldn’t matter, since you’d be hitting /observations/observers for per_page=0 (just to get the total_results value).
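For anyone who wants to try the per_page=0 trick, here is a minimal sketch in Python. It assumes the v1 API lives at https://api.inaturalist.org/v1 and that the response carries a total_results field as described above. The function names are made up for illustration, and the place_id value in the usage note is a placeholder, not a real place.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL for the v1 API.
API_BASE = "https://api.inaturalist.org/v1"


def observers_count_url(**filters):
    """Build an /observations/observers URL that asks for zero result rows."""
    params = dict(filters, per_page=0)
    return f"{API_BASE}/observations/observers?{urlencode(sorted(params.items()))}"


def total_results(payload):
    """Pull the total_results value out of a decoded API response."""
    return int(payload["total_results"])


def count_observers(**filters):
    """Fetch the observer count for a set of observation filters (network call)."""
    with urlopen(observers_count_url(**filters), timeout=30) as resp:
        return total_results(json.load(resp))
```

Calling count_observers(place_id=1234) would make one small request and return only the count, so pages beyond the first never come into play.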

yes. i was planning on mentioning that in the tutorial covering intro to the API in general, but it’s not a terrible thing to reiterate points like that often.

pisum · January 5, 2021, 12:07am

credit/cite/attribute and honor licensing as appropriate
- observations are licensed and have attribution info.
- media (URLs) are typically delivered with accompanying licensing and attribution notes, too. (note that media license may differ from observation license)
- map data are licensed and should be attributed too.
- ditto for code
- credit others for inspiration
- you can cite iNaturalist when appropriate, and you may benefit from GBIF’s DOI functionality.
don’t use the data for purposes that their creators / subjects might not appreciate

what are other specific best practices for being an ethical data user? (i can’t think of anything else right now, but i’m sure there’s more to it.)

pisum · January 5, 2021, 12:39am

would you mind sharing a couple of examples of such fields? and maybe sample output row or two that you might use to feed your MPG effort?

wcornwell · January 5, 2021, 1:45am

Hi @pisum

This looks like an amazing plan. Thanks for this–just wanted to reply and support the effort, as in my view there is so much more science that can be done with iNat data.

The only question we’ve come across recently is how to extract metadata from the photos, for example zoom information. When analyzing color data from iNat photos, it’s often useful to have the Exif/IPTC data to try to get an external estimate of photo quality, which is highly variable on iNat.

Great stuff and thanks!

pisum · January 5, 2021, 3:06am

are you already aware of the ability to download observations into a CSV file (https://www.inaturalist.org/observations/export), which can be opened by Excel (among other tools)? if you are already aware of this, is the CSV missing some functionality for your needs?
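If Excel is not a requirement, the exported CSV can also be read with a few lines of Python. This is only a sketch: the scientific_name column is an assumption about which columns were ticked in the export form, and the sample rows are invented.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_file, column="scientific_name"):
    """Count rows per value of one column, skipping rows where it is blank."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader if row.get(column))


# Invented sample standing in for a real export file.
sample = io.StringIO(
    "id,scientific_name\n"
    "1,Danaus plexippus\n"
    "2,Danaus plexippus\n"
    "3,\n"
)
counts = count_by_column(sample)
```

With a real export you would pass an open file instead, for example open("observations.csv", newline="", encoding="utf-8"), where the file name is whatever you saved the download as.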

my understanding is that the image files stored in the system have had the metadata stripped from them. so you can’t get that information from the files themselves. there is a page on the website (ex. https://www.inaturalist.org/photos/108914354) that will display the EXIF data captured (during upload?), but i don’t think that the system offers another way to get that information, other than that some API endpoints will provide the original height and width of the photo. if you wanted a way to get EXIF information efficiently, you might consider making a feature request (in the forum) for an endpoint to access that data.
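As a rough illustration of the height and width point, the sketch below pulls photo sizes out of an observation record that has already been decoded from JSON. The original_dimensions field name and its width/height keys are my assumption about the response layout, so check an actual response before relying on them.

```python
def photo_dimensions(observation):
    """Return (photo id, width, height) for each photo on one observation.

    Assumes each photo dict may carry an "original_dimensions" mapping;
    missing or null values come back as None.
    """
    sizes = []
    for photo in observation.get("photos", []):
        dims = photo.get("original_dimensions") or {}
        sizes.append((photo.get("id"), dims.get("width"), dims.get("height")))
    return sizes


# Invented record shaped like the assumed API response.
example = {
    "photos": [
        {"id": 1, "original_dimensions": {"width": 2048, "height": 1536}},
        {"id": 2, "original_dimensions": None},
    ]
}
sizes = photo_dimensions(example)
```

Pixel dimensions are a weak proxy for photo quality, but they are available without scraping anything.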

(welcome to the forum.)

jcook · January 5, 2021, 7:55pm

extracting metadata from the photos

This is something I’m interested in too. You can find some related discussion in this forum thread and this GitHub issue. Short version is that there are no current plans to add this to the API, but also sounds like it’s not out of the question.

Currently the only option would be scraping the photo info page.

krancmm · January 7, 2021, 11:38pm

Sorry for the late reply @pisum while I decided how best to respond. Whether you find the following useful is your call but please ask if you want more info (although I can’t imagine why).

Many of the MPG mapping fields are driven by requests from researchers and professionals. The “dots” that end up on the very old static range maps (using a paucity of the fields shown in the sample) are used primarily by amateurs seeking to ID a moth (e.g. http://mothphotographersgroup.msstate.edu/species.php?hodges=367.1). Website images, taxonomic info, references are totally separate parts of the DBMS.

I’m using the sample below to explain mapping. Columns A-I (not included) are administrative, although corrections to previously accepted records (which can include iNat updates) are included there. Column I is the foreign key that links to the DBMS, so mapping is the transaction table.

iNat Fields Used (obviously not exclusively by iNat): Columns J-S, V, X, AA. AB is annotations while AC-AF are data scraped from descriptions, observation fields, or personal communications with experts that get fed back into observations/records. AI-AK are the Identification fields.

Mikeknies · January 8, 2021, 10:55pm

I am most interested in getting access to a more detailed map. When you zoom in the detail fades out. I have seen clips of Hillshade maps that have incredible detail to assist in bushwhacking in search of species and natural features.

Please respond also to my email Knies06@att.net

wcornwell · January 12, 2021, 12:16am

Hi @pisum and @jcook,

Looked into this a bit–there is a problem before even getting to the API.

EXIF data is getting stored on iNat for some cameras and phones but not others. Here’s an example where it’s been lost: https://www.inaturalist.org/photos/109774620.

All the ones that I found that are losing the data are from iPhones.

Is this intentional or a bug? (Maybe iOS changed how they store EXIF data and that broke capture on upload?)

jcook · February 3, 2021, 8:12pm

Well, that’s unfortunate. I have no idea, and don’t have an iOS device to test with.
I did, however, make an ugly but functional Python script that will scrape metadata (if available) from the photo info page: https://github.com/niconoe/pyinaturalist/blob/dev/examples/observation_photo_metadata.py

jcook · February 3, 2021, 8:29pm

@pisum @muir
I put together a couple Jupyter notebooks that create visualizations in Altair that are similar to those in muir’s blog post on 2019 Alaska observations. They’re not fully polished, but it’s a start.

The search parameters in those examples are for Alaska, but that could be easily substituted with any other state or region (by changing PLACE_ID).

I’m interested in the stats in that blog post for the percentage of species observed on iNat relative to all known species in Alaska. I think that’s a really cool metric, but I can’t think of a good way to automate that for other regions, since the data sources listed are Alaska-specific checklists.
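The metric itself is simple once both lists are in hand. Here is a minimal sketch, assuming you already have the observed species names and a regional checklist as plain lists of scientific names. The function name is made up, and matching is by exact name after trimming and lowercasing, which will miss synonyms and taxonomic differences between sources.

```python
def checklist_coverage(observed_species, checklist_species):
    """Percentage of checklist species that also appear in the observed list."""
    observed = {name.strip().lower() for name in observed_species}
    checklist = {name.strip().lower() for name in checklist_species}
    if not checklist:
        raise ValueError("checklist is empty")
    seen = observed & checklist
    return 100.0 * len(seen) / len(checklist)


# Tiny invented lists, just to show the call.
observed = ["Corvus corax", "Pica hudsonia", "Not on the list"]
checklist = ["Corvus corax", "Pica hudsonia", "Bubo scandiacus", "Lagopus lagopus"]
coverage = checklist_coverage(observed, checklist)
```

The hard part is still the one described above: finding a checklist source that covers more than one region.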

Do any of you know of a data source that provides region-specific species lists for multiple regions? For example, bird species in US states?

fffffffff · February 3, 2021, 8:32pm

You can check here for Clements and eBird lists https://avibase.bsc-eoc.org/checklist.jsp?region=us&list=howardmoore

muir · February 5, 2021, 8:35pm

That’s neat, thank you @jcook for looking deeper into that journal post’s metrics.

I actually had a similar effort to the AK journal post that I never finished. It was global, and looked at a limited number of the iconic taxa. When I started, I looked around for authoritative species lists by country and, in the end, I don’t think I knew where to look. The best I could do as an amateur was find Rhett Butler’s attempt to crowdsource biodiversity by country for a 2016 Mongabay article: https://rainforests.mongabay.com/03highest_biodiversity.htm. It’s a noble attempt, but I did find some errors in that table. Very understandably, Mongabay doesn’t want to be responsible for correcting the source data, and states “If you believe data for a country is inaccurate, please update the data at the source.”

If someone does have more authoritative species checklists by country, please share! I would love to use it to finish that journal draft.

system · April 6, 2021, 8:35pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.