How to obtain the UUID with downloaded observations?

When data is exported/downloaded, it contains a lot of information but not the unique UUID of the observation. Is there a simple solution for obtaining this when exporting? I’d like to integrate the data into a database and the other data has a UUID.
Best regards

as far as i know, this can’t be done with the standard CSV export currently. you’d probably have to use the API to get a set of observations with both numeric ID and UUID, but it’s not very efficient if you have more than 10,000 records in your set.
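A minimal Python sketch of that workaround, assuming you already have a mapping from numeric observation ID to UUID (for instance collected through the API) and a standard CSV export that contains an `id` column. The function name `add_uuid_column` is hypothetical and not part of any iNaturalist tool.

```python
import csv
import io


def add_uuid_column(export_csv_text, uuid_by_id):
    """Return the CSV text with an extra `uuid` column, looked up by numeric id.

    export_csv_text: contents of a CSV export (must contain an `id` column).
    uuid_by_id: dict mapping numeric observation id (int) -> UUID string.
    Rows whose id is missing from the mapping get an empty uuid.
    """
    reader = csv.DictReader(io.StringIO(export_csv_text))
    fieldnames = list(reader.fieldnames) + ["uuid"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["uuid"] = uuid_by_id.get(int(row["id"]), "")
        writer.writerow(row)
    return out.getvalue()
```

To use it on a real file, read the export with `open(path, newline="", encoding="utf-8").read()` and write the returned text to a new file.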

it looks like the DWCA export files that go to GBIF, ALA, etc. can include UUID, but they either don’t or aren’t processed by the receiving system(s).

the only other thing i can think of is, if iNat development staff don’t get around to it, and you have the skills to do so, you could update the iNat code to get the UUID into the CSV export.

Thank you for your reply,
I’m an entomologist, not a computer scientist, and I don’t know how to use the API, so I can’t imagine modifying the code myself…
If someone can explain to me how to query iNaturalist via the API, with an example request… I’d be interested.
For example, I’m looking for Cerambycidae data for the French region of Île-de-France, about 1,400 records.

where did this other data come from?

what fields are you looking for from this data set (besides uuid and the numeric observation id)?

what tool / scripting language are you hoping to use to make your request?

French public national, local and private databases.

id
observed_on_string
observed_on
user_name
quality_grade
license
url
captive_cultivated
place_guess
latitude
longitude
positional_accuracy
place_town_name
place_county_name
geoprivacy
coordinates_obscured
species_guess
scientific_name
common_name
iconic_taxon_name
taxon_id

HTML request?
R script?
What is possible?

Thanks

are you sure that the UUID that exists in your sources matches the observation UUIDs used by iNaturalist? (if they are different concepts, you won’t be able to match the data by UUID, and it’ll be pointless to get data from iNat with UUID.)

the only major data source that i’m aware of that includes iNat observation UUIDs is the AWS Open Data Set, and that’s not what you’re describing here.

can you provide a couple of examples of records from each of your data sources so that it can be verified that these data are actually using iNat’s observation UUIDs?

The structure of a UUID is standardized: https://en.wikipedia.org/wiki/Universally_unique_identifier
Here are a few examples of UUIDs (supposedly from iNaturalist, but passed through GBIF… and returned to France):
3feced0e-3d38-449f-a018-203c3f0b46af
c8a9ca6e-1b1f-41d7-9706-a5a216e7e0ea
e3ac9c8c-ad9d-4b00-9de1-791a64930aaa

Here are a few lines from an example table, for one database (Openobs.mnhn.fr), the others are similar. I hope this will be “readable”.

idSINPOccTax libelleCadreAcquisition idCadreAcquisition descriptionCadreAcquisition objectifCadreAcquisition motsClefsCadreAcquisition referenceBiblioCadreAcquisition maitreOuvrage maitreOeuvre financeur contactPrincipal typeFinancement libelleJeuDonnees idJeuDonnees descriptionJeuDonnees objectifJeuDonnees jsonProtocole libelleProtocole idCampanuleProtocole motsClefsJeuDonnees territoireJeuDonnees fournisseurJeuDonnees producteurJeuDonnees typeDonneesJeuDonnees idOrigine statutSource statutObservation observateur determinateur nomScientifiqueRef nomCite nomVernaculaire cdNom cdRef rangTaxo regne classe ordre famille genre espece groupeTaxoSimple groupeTaxoAvance dateObservation datePrecision decennie annee mois dateDetermination latitude longitude precisionGeometrieMetres systemeCoordonnees precisionLocalisation typeObjetSource toponyme commune codeInseeCommune EPCI codeInseeEPCI departement codeInseeDepartement region codeInseeRegion altitudeMin atitudeMax profondeurMin profondeurMax codeMaille10Km objetGeoWKT denombrementMinMax Abondance Source
302E11B1-FFFB-4321-A741-7852EE087819 Inventaire des coléoptères saproxyliques de France métropolitaine 4A9DDA1F-B5E7-3E13-E053-2614A8C02B7C L’inventaire national des coléoptères saproxyliques (SAPROX) a pour objectif de dresser la répartition actuelle de chaque espèce de Coléoptère lié au bois mort (maille départementale et à terme carroyage 10 km), de recueillir des éléments sur la répartition passée et de favoriser l’intégration de ces données dans d’autres programmes de connaissances ou de conservation. Inventaires et cartographie Atlas Bois Mort Coléoptères Conservation Distribution Insectes Inventaire MINISTERE DE LA TRANSITION ECOLOGIQUE ET SOLIDAIRE MUSEUM NATIONAL D HISTOIRE NATURELLE OFFICE POUR LES INSECTES ET LEUR ENVIRONNEMENT MUSEUM NATIONAL D HISTOIRE NATURELLE Mixte Données entomologiques Bruno Mériguet dans le cadre des études réalisées par l’OPIE et à titre personnel ED22D691-46D8-0A71-E053-0514A8C00AF0 Ensemble des données collectées entre 1998 et 2022 (dont données de spécimens anciens) au cours des prospections pour études entomologiques professionnelles (OPIE) et personnelles. Inventaire pour étude d’espèces ou de communautés Coléoptères Ile-de-France Inventaire Pimul Piègeage Polytrap Saproxylique France métropolitaine Occurrence de taxon 4495 Terrain Présent MERIGUET Bruno (AEV-Opie) MERIGUET Bruno (OPIE) Rutpela maculata Rutpela maculata (Poda 1761) Lepture tachetée, Lepture cycliste 223152 223152 species Animalia Insecta Coleoptera Cerambycidae Rutpela Rutpela maculata Insectes et araignées Insectes 08/07/2002 02:00 DAY 2000 2002 7 09/02/2003 48.81654 2.65571 200 WGS84 XY point Géometrie Forêt régionale de Ferrières Croissy-Beaubourg 77146 CA Paris - Vallée de la Marne 200057958
304EE20B-C3D8-4AC2-8877-3ABDECAC2E79 Inventaire des coléoptères saproxyliques de France métropolitaine 4A9DDA1F-B5E7-3E13-E053-2614A8C02B7C L’inventaire national des coléoptères saproxyliques (SAPROX) a pour objectif de dresser la répartition actuelle de chaque espèce de Coléoptère lié au bois mort (maille départementale et à terme carroyage 10 km), de recueillir des éléments sur la répartition passée et de favoriser l’intégration de ces données dans d’autres programmes de connaissances ou de conservation. Inventaires et cartographie Atlas Bois Mort Coléoptères Conservation Distribution Insectes Inventaire MINISTERE DE LA TRANSITION ECOLOGIQUE ET SOLIDAIRE MUSEUM NATIONAL D HISTOIRE NATURELLE OFFICE POUR LES INSECTES ET LEUR ENVIRONNEMENT MUSEUM NATIONAL D HISTOIRE NATURELLE Mixte Données entomologiques Bruno Mériguet dans le cadre des études réalisées par l’OPIE et à titre personnel ED22D691-46D8-0A71-E053-0514A8C00AF0 Ensemble des données collectées entre 1998 et 2022 (dont données de spécimens anciens) au cours des prospections pour études entomologiques professionnelles (OPIE) et personnelles. Inventaire pour étude d’espèces ou de communautés Coléoptères Ile-de-France Inventaire Pimul Piègeage Polytrap Saproxylique France métropolitaine Occurrence de taxon 37763 Terrain Présent MERIGUET Bruno (MAB-Opie) MERIGUET Bruno (OPIE) Trichoferus pallidus Trichoferus pallidus (Olivier 1790) Clyte pâle 223140 223140 species Animalia Insecta Coleoptera Cerambycidae Trichoferus Trichoferus pallidus Insectes et araignées Insectes 25/08/2002 02:00 DAY_RANGE 2000 2002 8 28/05/2013 48.40929 2.6642 200 WGS84 XY point Géometrie Ft de Fontainebleau Fontainebleau 77186 CA du Pays de Fontainebleau 200072346

the structure is standard, but it’s effectively just a format for an identifier. if that identifier is not shared between systems, then it doesn’t matter if two different systems both use UUIDs. they won’t match up.

in this case, it doesn’t look like the UUIDs that you’re providing match up to any iNat observation UUIDs. for example, this is not a valid iNat observation: https://www.inaturalist.org/observations/c8a9ca6e-1b1f-41d7-9706-a5a216e7e0ea

… whereas this is a valid iNat observation (one of mine): https://www.inaturalist.org/observations/1860fd4f-4a5d-43f4-855a-62dc639b707d

it doesn’t seem like getting iNat observations with their UUIDs is going to help you.
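The distinction between "same format" and "same identifier" can be sketched in Python. This is only an illustration with hypothetical helper names (`normalize_uuid`, `shared_identifiers`): it puts UUID strings from two sources into a canonical lowercase form and keeps those present in both. An empty result means the two systems do not share identifiers, even though both use well-formed UUIDs.

```python
import uuid


def normalize_uuid(text):
    """Return the canonical lowercase form of a UUID string, or None if it is not a UUID."""
    try:
        return str(uuid.UUID(text.strip()))
    except ValueError:
        return None


def shared_identifiers(source_a, source_b):
    """Return the set of UUIDs (canonical form) that appear in both sources."""
    a = {normalize_uuid(x) for x in source_a} - {None}
    b = {normalize_uuid(x) for x in source_b} - {None}
    return a & b
```

Normalizing first matters because some systems export UUIDs in uppercase (as in the OpenObs sample above) and others in lowercase.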

Thanks for the detailed information.
I have integrated uuids into my analysis tool as unique references and a standard field in my database. I’m still convinced that it’s a problem to have different reference typologies (ID field) depending on the sources, especially when there is an identical typology in the different systems.
The uuid is useful not just for cross-referencing information between different sources, but also when integrating new data from the same system:

  • Using UUIDs enables new incoming data to be detected and examined as a priority.
  • It also allows you to check whether the identifications have been modified between the two versions of the dataset.
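The two checks in this list could look roughly like the Python sketch below. It assumes each version of the dataset has been loaded into a dict that maps a unique identifier to the identified taxon name. The function name and the sample identifiers in the usage note are hypothetical, and the same logic works with any unique key, UUID or not.

```python
def compare_versions(old, new):
    """Compare two versions of a dataset keyed by a unique identifier.

    old, new: dicts mapping identifier -> identified taxon name.
    Returns (added, changed): identifiers found only in `new`, and
    identifiers whose taxon name differs between the two versions.
    """
    added = sorted(k for k in new if k not in old)
    changed = sorted(k for k in new if k in old and new[k] != old[k])
    return added, changed
```

For instance, `compare_versions({"a": "Rutpela maculata"}, {"a": "Rutpela maculata", "b": "Trichoferus pallidus"})` reports `"b"` as new and nothing as changed.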

Many thanks for these answers.
I’m still on the lookout for simple solutions to recover these UUIDs. In the meantime I’m going to build a unique ID like “INaturalist”&{Id}.

I found a partial solution here:
https://api.inaturalist.org/v2/docs/#/
with RISON encoding, and the parameters documented here:
https://api.inaturalist.org/v1/docs/#!/Observations/get_observations_updates

The request below returns the results in pages of 200 records; I only need to change the page number.
https://api.inaturalist.org/v2/observations?total_results=1500&per_page=1500&page=1&fields=species_guess,observed_on,id&taxon_id=47961&verifiable=true&place_id=10577&rank=species

Maybe there is a way to retrieve all the data at once.

i don’t know why this should matter. if you’re putting data from various sources into the same table, your key on that table really should be something like source_identifier + source_system, even if the source_identifier is a UUID.
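As a rough illustration of that kind of key, here is a small Python sketch using an in-memory SQLite database. The table name, column names, and sample values are hypothetical; the point is only that the primary key combines the source system with the source's own identifier, so the same identifier can arrive from two systems without colliding.

```python
import sqlite3


def make_database():
    """Create an in-memory database whose key is (source_system, source_identifier)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE observation (
            source_system     TEXT NOT NULL,
            source_identifier TEXT NOT NULL,
            scientific_name   TEXT,
            PRIMARY KEY (source_system, source_identifier)
        )
        """
    )
    return conn


def insert_observation(conn, source_system, source_identifier, scientific_name):
    """Insert one record; return False if this (system, identifier) pair already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO observation VALUES (?, ?, ?)",
                (source_system, str(source_identifier), scientific_name),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

The identifier is stored as text so that numeric IDs and UUIDs can share the same column.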

iNat has the numeric observation ID in addition to the observation UUID. both of these are unique identifiers. so i’m not sure why you need the UUID to do the comparisons you’re talking about.

as mentioned before, and as you’re discovering, you can get this information from the API. i guess if you’re willing to spend the time to get this information that way, that’s fine.

i would get the data from /v1/observations rather than /v2 though. as far as i know, /v2 is still under development, and it’s probably better to use something that is a little more tested.
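A hedged Python sketch of collecting numeric IDs and UUIDs from /v1/observations, reusing the taxon_id and place_id values from the request earlier in the thread. Instead of incrementing `page`, it sorts by ID and passes the last ID seen as `id_above`; this is a swapped-in paging approach rather than what was described above, chosen on the assumption that page-based requests stop being usable somewhere around 10,000 results. The function names and the one-second pause are assumptions for illustration, and the default `fetch_json` needs network access.

```python
import json
import time
import urllib.parse
import urllib.request

API_URL = "https://api.inaturalist.org/v1/observations"


def build_url(taxon_id, place_id, id_above=0, per_page=200):
    """Build a /v1/observations URL for one page, ordered by ascending id."""
    params = {
        "taxon_id": taxon_id,
        "place_id": place_id,
        "verifiable": "true",
        "rank": "species",
        "order_by": "id",
        "order": "asc",
        "id_above": id_above,
        "per_page": per_page,
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def fetch_json(url):
    """Download one page and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def collect_ids(taxon_id, place_id, fetch=fetch_json, pause=1.0):
    """Return a list of (id, uuid) pairs, requesting pages until one comes back empty."""
    pairs = []
    id_above = 0
    while True:
        results = fetch(build_url(taxon_id, place_id, id_above)).get("results", [])
        if not results:
            return pairs
        pairs.extend((obs["id"], obs["uuid"]) for obs in results)
        id_above = results[-1]["id"]  # next page starts after the last id seen
        time.sleep(pause)  # be polite to the API between requests
```

Calling `collect_ids(47961, 10577)` would then return the pairs for the example query; the `fetch` argument exists so the loop can be tried with canned data before hitting the real API.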

This is in fact what was done in certain databases in France, but it is no longer done (to my knowledge). From my point of view, and by design, the UUID already fulfils this function. A unique identifier of the numeric ID type, unless combined with the name of the source, is potentially not unique when several databases are grouped together.

I already use uuids for different databases, so it’s out of a concern for unity that I don’t want to mix different formats in the same field. I think both points of view are valid.

Many thanks for all your advice and the time you took to reply.
