I just spent a good while searching to see if someone had already asked/addressed this and didn’t find anything, but my apologies if I missed something… My request: To have the option to download through the main Export Observations page all the fields that are available to admins/curators under the “Export with Hidden Coordinates” Project Curator Tool on traditional projects AND have the option to filter by date, taxa, etc. as we can in the Export Observations page.
I manage several traditional projects that our agency uses to assess rare species across our state. Beyond access to true coordinates, one of the most important things we get from the traditional projects is the ability to see IDs from our project curators, which gives us even more quality control beyond the Research Grade designation (we don’t use a RG observation unless it’s been vetted by a curator). We’re at a point where a few of these projects are close to or beyond the 200,000 observation download limit, and the curator tool “Export with Hidden Coordinates” does not give users the ability to filter observations; it’s just a pre-set download of the entire project. Even with small traditional projects, I’m finding that it takes a very long time to respond (typically, it takes hours of clicking through time-outs to get a download).
And while the Export Observations page allows project curators/admins to download private_latitude & private_longitude fields when a relevant project is specified, I have not found a way to include the following fields in the download:
Being able to access those fields in the download and filter by date to be able to pull smaller chunks of data would be extremely helpful, as it’s incredibly time-consuming to constantly be watching your export request spin and time out and then have to click again to start over via the traditional project export tool. Working through the Export Observations portal, users can queue a request once and just let it process. And one of the reasons we want to see exactly which curator has added what ID is so that we can go back over subsequent downloads and scan for any changes/updates to those fields. So just using a URL search for “pcid=true” doesn’t get us what we need.
Just thought I’d put this out there in case this is of interest to anyone else, and if anyone has suggestions for workarounds, I’d love to hear them!
regarding workarounds, while it is possible to get individual identifications associated with an observation via the current v1 Node.js API or to get just curator IDs via the old Rails API, i don’t think the APIs are really designed to handle sets of observations that are as large as you’re talking about here (approaching 200,000 records). that said, i don’t know of any other reasonably efficient way for folks without direct access to the database to get this kind of data.
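for anyone who wants to try the API route on a smaller set, here is a minimal Python sketch, not a tested workflow. it assumes the v1 `/observations` endpoint returns each observation with an `identifications` list whose items carry `user.login`, `taxon.name`, `current`, and `created_at`, and that paging with `id_above` works the way i expect. the function names and the curator logins are hypothetical, and true coordinates would additionally need an authenticated request, which is not shown here.

```python
import json
import time
import urllib.parse
import urllib.request

API_URL = "https://api.inaturalist.org/v1/observations"


def fetch_project_observations(project_id, per_page=200, pause=1.0):
    """Yield observation dicts for a project, paging by ascending observation ID.

    Assumes the endpoint accepts project_id, per_page, order_by, order and
    id_above, and returns matching records under a "results" key.
    """
    id_above = 0
    while True:
        query = urllib.parse.urlencode({
            "project_id": project_id,
            "per_page": per_page,
            "order_by": "id",
            "order": "asc",
            "id_above": id_above,
        })
        with urllib.request.urlopen(f"{API_URL}?{query}") as response:
            results = json.load(response).get("results", [])
        if not results:
            return
        yield from results
        # Resume after the highest ID seen so far.
        id_above = results[-1]["id"]
        # Pause between requests to stay gentle on the API.
        time.sleep(pause)


def curator_identifications(observation, curator_logins):
    """Return the current identifications on one observation made by curators.

    curator_logins is a set of login names you maintain yourself (hypothetical).
    """
    rows = []
    for ident in observation.get("identifications", []):
        login = (ident.get("user") or {}).get("login")
        if login in curator_logins and ident.get("current"):
            rows.append({
                "observation_id": observation.get("id"),
                "curator": login,
                "taxon": (ident.get("taxon") or {}).get("name"),
                "identified_at": ident.get("created_at"),
            })
    return rows
```

you would loop over `fetch_project_observations("your-project-slug")` and pass each record to `curator_identifications` with your own set of curator logins. at 200 records per request, a 200,000-observation project is on the order of 1,000 requests, which is why i doubt this scales well.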
suppose you were able to get the data in the format you’re looking for. how exactly would you use that data? (for example, if a given observation had an identification by a curator that conflicted with the community ID, i assume you would take the curator’s ID. but what if two curators provided conflicting IDs?) also, why do you need to export all that data? are you saving that off somewhere just for backup, or are you feeding it into some other system or something else?
i don’t usually have a use for getting the true coordinates of obscured observations – so i haven’t really dealt with exporting these much – but i believe that if you just use the standard export page, and include private coordinates in your selected fields for export, then you should be able to filter as much as that page will allow you to filter.
EDIT: i read your original post again with rested eyes, and i think i understand it a little better. clicking “Export with Hidden Coordinates” on a traditional project page produces a CSV file with not only hidden coordinates but also extra curator ID columns. so you want those extra curator ID columns to be selectable/available in the regular export page.
i guess the standard export page would have to be able to check to make sure that you’re a curator for any selected project, although i have seen the page change the selectable observation fields based on the project selected. so it must be possible to add that kind of logic.
i’ve never used the curator ID functionality before. so i’m still having trouble conceptualizing what kind of workflow you would have with those kinds of results. you mentioned that you want to potentially go back and scan for changes/updates to the curator IDs. but why does that matter? is it just that a particular observation gets any curator to look at it, or is it more important that a curator is reidentifying a particular observation?
We use the data from our traditional projects to feed community science data into our Texas Natural Diversity Database, which serves a variety of purposes, from research to conservation and development planning. That database is also our conduit to the NatureServe network. Our standard for observations entering this database is pretty high, which is why we don’t just pull the GBIF dataset and instead pull a subset of RG observations that have been vetted by our curators. So, yes, if we have a conflict between curator & community IDs, we’ll generally give more credence to the curator, and if we have a conflict within the curator IDs, or a subsequent curator ID change that we notice when we compare downloads in Excel, our database team will reach out for clarification.
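For what it’s worth, the comparison between two downloads could also be scripted rather than done by eye in Excel. Here is a rough Python sketch, assuming both exports are CSV files that share an `id` column. The curator column names used below (`curator_ident_taxon_name`, `curator_ident_user_login`) are placeholders, so substitute whatever headers your export actually contains.

```python
import csv
import io


def load_by_id(csv_file, key="id"):
    """Read a CSV export into a dict keyed by observation ID."""
    return {row[key]: row for row in csv.DictReader(csv_file)}


def changed_curator_ids(old_file, new_file, columns, key="id"):
    """List (id, column, before, after) for curator fields that differ.

    Only observations present in both exports are compared; observations
    that appear only in the newer export are skipped.
    """
    old_rows = load_by_id(old_file, key)
    new_rows = load_by_id(new_file, key)
    changes = []
    for obs_id, new_row in new_rows.items():
        old_row = old_rows.get(obs_id)
        if old_row is None:
            continue
        for column in columns:
            before = old_row.get(column, "")
            after = new_row.get(column, "")
            if before != after:
                changes.append((obs_id, column, before, after))
    return changes


# Tiny in-memory example; the column names are placeholders.
old_export = io.StringIO(
    "id,curator_ident_taxon_name,curator_ident_user_login\n"
    "101,Quercus buckleyi,curator_a\n"
    "102,Quercus stellata,curator_b\n"
)
new_export = io.StringIO(
    "id,curator_ident_taxon_name,curator_ident_user_login\n"
    "101,Quercus buckleyi,curator_a\n"
    "102,Quercus marilandica,curator_b\n"
    "103,Quercus virginiana,curator_a\n"
)
changes = changed_curator_ids(
    old_export,
    new_export,
    ["curator_ident_taxon_name", "curator_ident_user_login"],
)
for change in changes:
    print(change)
```

With real files you would pass `open("older.csv", newline="", encoding="utf-8")` and the newer file instead of the in-memory strings. Each change it reports is a case our database team would follow up on.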
The export page must somehow already be able to match up my curator status on projects: when I filter by a traditional project that I admin/curate, it allows me to populate the private_latitude & private_longitude fields with true coordinates for my download within that filter: