would it be possible to download all data (observation + audio/image) from a project we created in order to build a local guide and train specific CNN models?
it’s possible to do this, yes, but you really should make sure whatever you download is licensed in a way that permits your particular use cases. iNaturalist observations have a license, and their media assets (photos, sounds, etc.) have separate licenses.
i’m not sure how big your particular project is or how it’s defined, but if you’re dealing with a very large set of observations, i would probably turn first to the AWS Open Data set to get photos: https://registry.opendata.aws/inaturalist-open-data/.
for fewer than 10,000 observations, you could start with the iNaturalist API: https://api.inaturalist.org/.
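as a rough illustration of the API route, here is a minimal sketch that builds paged requests against the v1 observations endpoint and pulls photo URLs and licenses out of a response. the project slug, the license list, and the helper names (`build_page_url`, `fetch_page`, `extract_photos`) are placeholders i made up for the example, and the response fields used here (`results`, `photos`, `license_code`, `url`) are my understanding of the v1 response shape, so check them against the API docs before relying on this.

```python
import json
import urllib.request
from urllib.parse import urlencode

# v1 observations endpoint of the iNaturalist API
API_ROOT = "https://api.inaturalist.org/v1/observations"


def build_page_url(project_id, page, per_page=200,
                   photo_license="cc0,cc-by,cc-by-nc"):
    """Build the URL for one page of a project's photo observations.

    project_id and photo_license are illustrative values; pick the
    licenses that actually permit your use case.
    """
    params = {
        "project_id": project_id,
        "photos": "true",            # only observations that have photos
        "photo_license": photo_license,
        "per_page": per_page,
        "page": page,
    }
    return f"{API_ROOT}?{urlencode(params)}"


def fetch_page(url):
    """Download and decode one page of results (makes a network call)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def extract_photos(response_json):
    """Flatten a response into one row per photo."""
    rows = []
    for obs in response_json.get("results", []):
        taxon = obs.get("taxon") or {}
        for photo in obs.get("photos", []):
            rows.append({
                "observation_id": obs.get("id"),
                "taxon_name": taxon.get("name"),
                "license": photo.get("license_code"),
                "url": photo.get("url"),
            })
    return rows
```

you would loop over `page` values, calling `extract_photos(fetch_page(build_page_url("my-project", page)))` until a page comes back empty, with a pause between requests to stay polite to the server.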
hopefully that gets you started. if you need more help, you’ll need to provide more information on exactly what you’re trying to do. (which project are you dealing with?)
Thanks Psium! I’m a researcher in ecology and I’m starting projects on less invasive but accurate entomological surveys with the help of motivated 2.0 entomologists, deep learning and macrophotography. I’m looking for a web platform or a DAM that allows me to compile data (mainly photos) from experts (naturalists, scientists) and non-experts in France and Spain. We are targeting taxa that are difficult to identify without killing them. I believe that a lot of what we call cryptic species could be identified reliably with CNNs given several good pictures. We are now at the step of building the database through a collaborative effort. If we can easily download iNat photos with the appropriate licence, that would be ideal, as it could serve both our research projects and the iNat database and models. I’m more than happy to share with iNat what comes out of the research.
if you’re primarily interested in getting various training sets of photos by taxon, it’s likely that the AWS Open Data approach is the way to go. (that’s more or less what that data set was built for.) the iNat folks have decent documentation about how to use the data set here: https://github.com/inaturalist/inaturalist-open-data/tree/documentation, and i wrote something which may or may not be useful: https://forum.inaturalist.org/t/getting-the-inaturalist-aws-open-data-metadata-files-and-working-with-them-in-a-database/22135. the metadata files in the open data set contain (photo) license information, taxon, and photo file locations. so you can use that to grab a bunch of photos for the appropriate taxa and licenses that you’re interested in.
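to make the metadata step a bit more concrete, here is a small sketch of the filtering idea using only the standard library. it assumes the metadata files are tab-separated with columns named `observation_uuid`, `taxon_id`, `photo_id`, `extension`, and `license`, and that photos live at `<bucket>/photos/<photo_id>/<size>.<extension>`; that is my reading of the open data documentation, so verify the column names, license spellings, and URL pattern there. the function name and the taxon ids are made up for the example.

```python
import csv

# Assumed public URL prefix for photos in the open data bucket
BUCKET = "https://inaturalist-open-data.s3.amazonaws.com/photos"


def select_photo_urls(observations_tsv, photos_tsv, taxon_ids,
                      allowed_licenses, size="medium"):
    """Return photo URLs for the given taxa and licenses.

    observations_tsv / photos_tsv are open text files (or any iterable
    of lines) with a header row; taxon_ids is a set of id strings.
    """
    # Pass 1: collect the observations that belong to the wanted taxa
    obs_reader = csv.DictReader(observations_tsv, delimiter="\t")
    wanted_obs = {
        row["observation_uuid"]
        for row in obs_reader
        if row["taxon_id"] in taxon_ids
    }

    # Pass 2: keep photos of those observations with an allowed license
    urls = []
    for row in csv.DictReader(photos_tsv, delimiter="\t"):
        if (row["observation_uuid"] in wanted_obs
                and row["license"] in allowed_licenses):
            urls.append(f"{BUCKET}/{row['photo_id']}/{size}.{row['extension']}")
    return urls
```

for the real files you would open them with `gzip.open(path, "rt", newline="")`. the full metadata is very large, so holding the matching observation ids in a set works for a handful of taxa, but loading everything into a database is probably the more comfortable route for anything broader.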
There are also photo training sets here: https://www.kaggle.com/c/inaturalist-2021
thanks! very useful!