Is it possible to filter by observer's original guess?

I am interested in the ability to search and export records based on the taxon proposed by the original observer, regardless of the current community identification. So if someone posted an observation of what they called an Eastern Bluebird but several subsequent IDers called it an Indigo Bunting, I’d like to be able to export this record in a search for Eastern Bluebirds. Is this possible?

the only way i know how to 100% satisfy this kind of request is via scripts that get data from the API and then do some additional client-side filtering / processing. there are 3 ways that I can think of to get this:

  1. get identifications where the observer identified as Eastern Bluebird. get observations (only ids are required) where any user identified Indigo Bunting. for any observations where you can join / match the observation ID between the two sets, those are the observations you care about.

  2. get observations where any user identified Indigo Bunting. parse through each observation’s identifications to see if the observer identified Eastern Bluebird, and keep only those observations.

  3. get identifications where the observer identified as Eastern Bluebird. (if you can assume that the observation taxon likely is not Eastern Bluebird, then you can also add this as a filter parameter.) parse through the identifications in the associated observation to see if any reflect Indigo Bunting, and keep only those observations.
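Approach 1 above comes down to a set intersection on observation IDs. A minimal sketch, assuming both result sets have already been downloaded and parsed from JSON, and assuming each identification record carries its observation as a nested `observation` object (the function name and toy records are made up for illustration):

```python
def matching_observation_ids(observer_idents, candidate_observations):
    """Return observation IDs that appear in both result sets."""
    # Assumption: each identification has a nested "observation" dict with an "id".
    from_idents = {ident["observation"]["id"] for ident in observer_idents}
    from_obs = {obs["id"] for obs in candidate_observations}
    return sorted(from_idents & from_obs)


# Toy records standing in for parsed API results.
idents = [
    {"id": 1, "observation": {"id": 101}},
    {"id": 2, "observation": {"id": 102}},
]
observations = [{"id": 102}, {"id": 103}]

print(matching_observation_ids(idents, observations))  # [102]
```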

Sorry, the Indigo Bunting part wasn’t particularly relevant. In this example, I’d want all records in which the observer called the organism an Eastern Bluebird, whether it turned out to be an Eastern Bluebird, Indigo Bunting, Osprey, Water Oak, whatever. The current ID is not immediately relevant (for my purposes!).

if you want to just find cases where the observer identified something as Eastern Bluebird, you can just look for identifications where the observer identified as Eastern Bluebird: https://api.inaturalist.org/v1/identifications?taxon_id=12942&own_observation=true. this can help visualize the results in a human-friendly format: https://jumear.github.io/stirfry/iNatAPIv1_identifications?taxon_id=12942&own_observation=true.
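For anyone who wants to pull these results from a script rather than a browser, here is a rough sketch using only the Python standard library. The `taxon_id` and `own_observation` parameters come from the URL above; `page` and `per_page` are the usual paging parameters, and the `results`, `per_page`, and `total_results` response keys are assumed from typical /v1 responses. The function names, the `max_pages` cap, and the one-second pause are illustrative choices, not anything prescribed by the API.

```python
import json
import time
import urllib.parse
import urllib.request

API_URL = "https://api.inaturalist.org/v1/identifications"


def build_url(taxon_id, page=1, per_page=200):
    """Build the identifications URL for one page of observer-made IDs."""
    params = {
        "taxon_id": taxon_id,
        "own_observation": "true",
        "page": page,
        "per_page": per_page,
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def fetch_observer_identifications(taxon_id, max_pages=3):
    """Download up to max_pages pages of results (performs network requests)."""
    results = []
    for page in range(1, max_pages + 1):
        with urllib.request.urlopen(build_url(taxon_id, page)) as response:
            payload = json.load(response)
        results.extend(payload.get("results", []))
        # Stop once the pages fetched so far cover the reported total.
        if page * payload.get("per_page", 200) >= payload.get("total_results", 0):
            break
        time.sleep(1)  # pause between requests to be gentle on the API
    return results


print(build_url(12942))
```

Calling `fetch_observer_identifications(12942)` would then return a list of identification records to filter on your side.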

It sounds like you are interested in the first ID entered by the observer – i.e., you want to see what images observers identify as Eastern Bluebird before receiving feedback from the community. Is that correct?

In other words, Eastern Bluebird is not necessarily the current ID of the observer, because you would also want to
a) include observations where the observer originally entered an ID of Eastern Bluebird but later withdrew or changed it, and
b) not include observations where the observer originally entered a different ID and later changed it to Eastern Bluebird.

Yes, that is correct. I’m interested in the first ID entered by observers, and I want to include a) and exclude b). With nearly 90K observations, Eastern Bluebird was maybe not the best choice for discussion; let’s go with Clapper Rail instead!
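The a)/b) distinction can be sketched as a client-side check on each observation's full identification list. This assumes the observation record includes an `identifications` array (withdrawn ones included), that each identification has `user.id`, `taxon.id`, and an ISO-8601 `created_at` with a UTC offset, and that the observation has `user.id`. The helper names and the taxon id 999 are made up; 12942 is the Eastern Bluebird id from the URL earlier in the thread.

```python
from datetime import datetime


def observer_first_identification(observation):
    """Return the observer's earliest identification, withdrawn or not."""
    observer_id = observation["user"]["id"]
    own = [
        ident
        for ident in observation.get("identifications", [])
        if ident["user"]["id"] == observer_id
    ]
    if not own:
        return None
    return min(own, key=lambda ident: datetime.fromisoformat(ident["created_at"]))


def observer_first_guessed(observation, taxon_id):
    """True if the observer's first identification was the given taxon."""
    first = observer_first_identification(observation)
    return first is not None and first["taxon"]["id"] == taxon_id


def _ident(user_id, taxon_id, created_at, current=True):
    # Tiny helper for building toy identification records.
    return {"user": {"id": user_id}, "taxon": {"id": taxon_id},
            "created_at": created_at, "current": current}


# Case a): observer starts with 12942, later switches to another taxon.
case_a = {"user": {"id": 1}, "identifications": [
    _ident(1, 12942, "2023-05-01T08:00:00-04:00", current=False),
    _ident(2, 999, "2023-05-02T08:00:00-04:00"),
    _ident(1, 999, "2023-05-03T08:00:00-04:00"),
]}
# Case b): observer starts with another taxon, later switches to 12942.
case_b = {"user": {"id": 1}, "identifications": [
    _ident(1, 999, "2023-05-01T08:00:00-04:00", current=False),
    _ident(2, 12942, "2023-05-02T08:00:00-04:00"),
    _ident(1, 12942, "2023-05-03T08:00:00-04:00"),
]}

print(observer_first_guessed(case_a, 12942))  # True
print(observer_first_guessed(case_b, 12942))  # False
```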

Poking around on this example, it’s possible this might contain the essential information I need. It looks like when there is a subsequent discrepancy, the “ID vs Obs” value changes from “identical” to “descendant,” though I’m not sure if that precisely defines the intent of “descendant.” I just need to find a definitive example of spiphany’s case b) to confirm that this is doing what I hope it’s doing!

the thing i referenced in my last post won’t get exactly what spiphany described above, but it’s not clear to me that what you want is exactly what spiphany described either.

unless you can describe more clearly exactly what you want and why you want it, rather than continuing to guess at what you want, i think the best thing to do is simply to point you to the API and let you figure things out for yourself: https://api.inaturalist.org.

in that page, the “ID vs Obs” column just describes how the taxon of that particular identification compares to the current observation taxon. if it says “descendant”, then that means the current observation taxon would be an ancestor taxon of the identification taxon (ex. obs = “birds” when identification = “eastern bluebird”).
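To make that relationship concrete, here is a small sketch of how such a comparison could be computed. It is not the page's actual code. It assumes each taxon record has an `id` and an `ancestor_ids` list; only “identical” and “descendant” are labels mentioned in this thread, while “ancestor” and “other” are placeholders added for completeness, and the ids are toy values.

```python
def id_vs_obs(ident_taxon, obs_taxon):
    """Describe the identification taxon relative to the observation taxon."""
    if ident_taxon["id"] == obs_taxon["id"]:
        return "identical"
    # The ID is finer than the observation taxon: the observation taxon is
    # one of the identification taxon's ancestors.
    if obs_taxon["id"] in ident_taxon.get("ancestor_ids", []):
        return "descendant"
    # The ID is coarser than the observation taxon (placeholder label).
    if ident_taxon["id"] in obs_taxon.get("ancestor_ids", []):
        return "ancestor"
    return "other"  # placeholder label for unrelated branches


# Toy taxa: 3 stands in for "birds", 12942 for "eastern bluebird".
birds = {"id": 3, "ancestor_ids": []}
bluebird = {"id": 12942, "ancestor_ids": [3]}

print(id_vs_obs(bluebird, bluebird))  # identical
print(id_vs_obs(bluebird, birds))     # descendant
```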

Okay, hopefully last question (after this explanation). It turns out that at one level the API search you posted does have all the information I need; unfortunately, at 1300 lines of code per observation, it has rather an enormous amount of irrelevant information, which I assume is why it’s limited to 200 observations. What I am interested in is whether the accuracy of initial identifications by local observers changes as a new invasion front moves through an area. So the only info I need from that very comprehensive parameter set would be the observation number, location, and posting date, its current ID, and the “observer’s ID sequence within the current IDs,” as this determines whether they were the ones who first proposed the name.

I more or less understand the code generated by this search, but I can’t tell if there is a way to return only selected parameters for each record. Is this possible? If not, I can “brute force” strip out the five out of 1300 lines of code I need for each species, and do this 31 times (I’m looking at 6200 records), but of course a more efficient search would be preferable.

the /v1 API does not allow users to specify which fields are returned. the /v2 does allow users to specify which fields should be returned, but it is in active development, and its use is not recommended at this time. so then that means the best approach right now is to parse the /v1 results as you need them. (i’m guessing that parsing is what you mean by “brute force strip out”, though the way you describe this suggests that you might be planning to do the parsing in an inefficient way.)
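Parsing the /v1 results down to a handful of fields could look something like the sketch below. It assumes each identification record nests its observation under `observation`, and that the observation has `id`, `location`, `created_at`, and a `taxon` with a `name`; verify those key names against a real response. The function and column names are made up.

```python
import csv
import io

FIELDS = ["observation_id", "location", "created_at", "current_taxon"]


def slim_record(ident):
    """Keep only the few observation fields of interest from one identification."""
    obs = ident.get("observation") or {}
    taxon = obs.get("taxon") or {}
    return {
        "observation_id": obs.get("id"),
        "location": obs.get("location"),
        "created_at": obs.get("created_at"),
        "current_taxon": taxon.get("name"),
    }


def to_csv(rows):
    """Serialize the slimmed rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Toy record standing in for one parsed API result.
sample = {
    "id": 1,
    "observation": {
        "id": 101,
        "location": "35.1,-79.0",
        "created_at": "2023-05-01T08:00:00-04:00",
        "taxon": {"id": 999, "name": "Toy taxon"},
        "photos": [],  # one of the many fields that get dropped
    },
}

print(to_csv([slim_record(sample)]))
```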

note that the “observer’s ID sequence within the current IDs” would have to be computed on your end and is not provided in the API results directly.

you work with what you have. if you already have a script to get data in the way that you want, it’s relatively easy to get the data. the time-consuming part is writing the base script you need to get data.

i don’t have anything already written to get exactly what you’re describing here, but you could adapt something that i wrote to get /v1/observations to get /v1/identifications instead. see: https://forum.inaturalist.org/t/whats-the-best-way-to-share-python-code-nowadays/48554.

pyiNaturalist might be another thing that could help you get data more easily.

My initial understanding of the own_observation parameter was that it is true if the observer is also the initial identifier, otherwise false. However, this is clearly not the case. I wondered whether that is because I’m embedding this parameter in a search for a particular taxon, but either way I can’t figure out what this parameter is actually doing.

For example, this record shows up when you set the own_observation parameter to true for Hairy Crabweed records in North America, whereas this virtually identical record shows up when you change the parameter to false.

In general, I can’t find an unambiguous way to distinguish true from false own_observations. Can anyone tell me what I’m missing, or is this likely to be a bug?

I think you might be confusing identifications with observations. The search you linked doesn’t return observations, it returns identifications. The own_observation parameter works on identifications; a true value means that the identifier is the observer and a false value means that the identifier is not the observer. The same observation will be returned multiple times by that search if there are multiple identifications on it that meet the search parameters.

The first observation you linked, 196548106, can be found in both the true and the false results. It’s in the true results for the ID made by the observer and in the false results for the ID not made by the observer.

The same is true for observation 194118184.
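A small sketch of why the same observation can land in both result sets: group identification records by observation and note who made each one. This assumes each identification record has a boolean `own_observation` field and a nested `observation.id`; the function name and the toy records are illustrative only.

```python
from collections import defaultdict


def observations_by_identifier_role(identifications):
    """Map each observation ID to the set of roles that identified it."""
    roles = defaultdict(set)
    for ident in identifications:
        role = "observer" if ident["own_observation"] else "other"
        roles[ident["observation"]["id"]].add(role)
    return dict(roles)


# Toy records: one observation with an ID by the observer and one by someone else.
idents = [
    {"id": 1, "own_observation": True, "observation": {"id": 196548106}},
    {"id": 2, "own_observation": False, "observation": {"id": 196548106}},
]

roles = observations_by_identifier_role(idents)
print(sorted(roles[196548106]))  # ['observer', 'other']
```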

this is not correct. if true, you will get identifications by the observer, regardless of whether the identification is the first identification in the observation. if false, you will get identifications by users other than the observer.

if you want to find observations where the first identification is from the observer, you’d have to parse through all the identifications (including withdrawn ones) in each observation to see if the first identification was by the observer.
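That parsing step could be sketched roughly as follows: sort an observation's identifications by creation time and report where the observer's earliest one falls, so a position of 1 means the observer was first. It assumes the observation record has `user.id` and an `identifications` array (withdrawn ones included) whose items have `user.id` and an ISO-8601 `created_at` with a UTC offset; the function name is made up.

```python
from datetime import datetime


def observer_id_position(observation):
    """Return the 1-based chronological position of the observer's first ID."""
    observer_id = observation["user"]["id"]
    ordered = sorted(
        observation.get("identifications", []),
        key=lambda ident: datetime.fromisoformat(ident["created_at"]),
    )
    for position, ident in enumerate(ordered, start=1):
        if ident["user"]["id"] == observer_id:
            return position
    return None  # the observer never added an identification


# Toy observation: listed out of order and with mixed UTC offsets.
observation = {
    "user": {"id": 1},
    "identifications": [
        {"user": {"id": 1}, "created_at": "2023-05-01T12:00:00+02:00"},  # 10:00 UTC
        {"user": {"id": 2}, "created_at": "2023-05-01T09:00:00+00:00"},  # 09:00 UTC
    ],
}

print(observer_id_position(observation))  # 2
```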

this thread seems to be a continuation of https://forum.inaturalist.org/t/is-it-possible-to-filter-by-observers-original-guess/50320, and it’s probably best to continue conversations on existing threads rather than starting a new thread and losing the original context.

Moved the above three posts to this topic to keep the conversation together.