Easy way to mark multiple-species observations

It seems we could work this out in stages:

  1. First iNat adopts @jeanphilippeb’s suggestion to add a DQA field for “Images are of the same individual(s)”. That allows us to identify problematic observations.
  2. Ideally at the same time, iNat adds functionality to notify observers when their observation is assessed as not being a single individual/species, along with some guided approach to resolving the issue. That allows observers to understand and fix the problem.
  3. After this has been in place for some time, iNat staff could then do some data analysis to see:
    a. What proportion of “multiple” observations are fixed by the observer? (The stats should probably focus on observations made after the new functionality is added; for earlier observations there’s a much higher chance the observer has already left iNat.)
    b. What are the stats for image count, location and date-time for unfixed observations: e.g. how many observations have 2, 3, 4, etc. images; how many have location/date-time metadata in the images; when metadata is present, what’s the spread of location/date-time data among the images?
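
To make 3b a bit more concrete, here is a rough sketch of the per-observation stats I have in mind. It is only an illustration: `ImageMeta` and `metadata_spread` are names I made up, and it assumes the timestamp and coordinates have already been pulled out of each image's metadata somehow. Nothing here reflects how iNat actually stores this data.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import combinations
from math import asin, cos, radians, sin, sqrt
from typing import List, Optional


@dataclass
class ImageMeta:
    """Hypothetical container for metadata already extracted from one image."""
    taken_at: Optional[datetime] = None
    lat: Optional[float] = None
    lon: Optional[float] = None


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two lat/lon points."""
    earth_radius_m = 6371000.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))


def metadata_spread(images: List[ImageMeta]) -> dict:
    """Summarise image count, metadata coverage, and time/location spread."""
    times = [img.taken_at for img in images if img.taken_at is not None]
    points = [(img.lat, img.lon) for img in images
              if img.lat is not None and img.lon is not None]

    # The spread is only defined when at least two images carry the value.
    time_spread_min = None
    if len(times) >= 2:
        time_spread_min = (max(times) - min(times)).total_seconds() / 60

    # Largest pairwise distance between any two geotagged images.
    max_distance_m = max(
        (haversine_m(*a, *b) for a, b in combinations(points, 2)),
        default=None,
    )

    return {
        "image_count": len(images),
        "with_time": len(times),
        "with_location": len(points),
        "time_spread_min": time_spread_min,
        "max_distance_m": max_distance_m,
    }
```

Running something like this over all the unfixed “multiple” observations and then tabulating the results would give the distributions described in 3b.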

With that info, we would be better informed to decide what kind of auto-split policy could reliably avoid exposing private data. For example, we might find that among unfixed “multiple” observations where all images have metadata, 91% have timestamps within 30 minutes and locations within 250 m of the data in the iNat record. We might then decide that those criteria are restrictive enough that the location privacy settings from the original observation can safely be applied to the new observations created by the split.

We might then determine additional rules for auto-splitting observations that don’t reach that threshold. For example, the newly created “child” observations might derive their time and location from the image metadata but be set from the outset to have the location obscured.
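
As a sketch of how those two tiers of rules could fit together: the 30-minute and 250 m thresholds below are just the example figures from above, not analysed values, and every name (`Stamp`, `SplitAction`, `choose_split_action`) is hypothetical. The third outcome, leaving an observation for manual review when some image has no metadata, is my own assumption about a sensible fallback.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from math import asin, cos, radians, sin, sqrt
from typing import List, Optional

# Example thresholds only; real values would come from the data analysis.
MAX_TIME_GAP = timedelta(minutes=30)
MAX_DISTANCE_M = 250.0


class SplitAction(Enum):
    INHERIT_PRIVACY = "inherit privacy settings from the original observation"
    OBSCURE_FROM_METADATA = "use image metadata, but obscure location from the outset"
    LEAVE_FOR_REVIEW = "do not auto-split"  # assumed fallback, not proposed above


@dataclass
class Stamp:
    """Hypothetical time/location pair for the iNat record or for one image."""
    taken_at: Optional[datetime]
    lat: Optional[float]
    lon: Optional[float]


def distance_m(a: Stamp, b: Stamp) -> float:
    """Great-circle (haversine) distance in metres between two stamps."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(h))


def choose_split_action(record: Stamp, images: List[Stamp]) -> SplitAction:
    """Pick a split policy; assumes the record itself has a time and location."""
    if not images:
        return SplitAction.LEAVE_FOR_REVIEW
    # Any image without full metadata cannot be checked against the record.
    if any(v is None for img in images for v in (img.taken_at, img.lat, img.lon)):
        return SplitAction.LEAVE_FOR_REVIEW

    all_close = all(
        abs(img.taken_at - record.taken_at) <= MAX_TIME_GAP
        and distance_m(record, img) <= MAX_DISTANCE_M
        for img in images
    )
    if all_close:
        return SplitAction.INHERIT_PRIVACY
    return SplitAction.OBSCURE_FROM_METADATA
```

The point of structuring it this way is that the default is always the more cautious option: an observation only inherits the original privacy settings when every image passes both checks.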

I think we can start with tools that help users and the iNat community address the problem, and later assess which automated fixes could be made safely.
