Fingerprints, human faces and AI

This is an open question, and I’m not sure if it’s been addressed before.
I would like to know what happens with human faces, human fingerprints and the like in photographs/observations, and how (if at all) the AI used in iNaturalist uses this information.


iNat’s computer vision isn’t directly using them. The CV is focused solely on identifying taxa. It may pick up on contextual cues, though: early on, photos that included human hands were sometimes labelled as lizards, because many of the lizard photos fed into the system showed lizards being held in a hand, so the model came to associate human hands with lizards. With the greater variety of training photos now, that kind of association is less likely.
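To illustrate how that kind of spurious association can arise, here's a toy sketch (nothing like iNat's actual model): a naive classifier that scores labels by how often each image feature co-occurred with them in training. The feature names ("hand", "scales", etc.) are made up for the example.

```python
# Toy confound demo: if most training photos of lizards also contained a
# hand, a co-occurrence-based "model" will call a hand-only photo a lizard.
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (feature_set, label) pairs."""
    counts = defaultdict(Counter)
    for features, label in examples:
        for f in features:
            counts[f][label] += 1
    return counts

def predict(counts, features):
    """Pick the label that co-occurred most with the given features."""
    score = Counter()
    for f in features:
        score.update(counts.get(f, Counter()))
    return score.most_common(1)[0][0]

training = [
    ({"scales", "hand"}, "lizard"),
    ({"scales", "hand"}, "lizard"),
    ({"scales"}, "lizard"),
    ({"feathers"}, "bird"),
]
model = train(training)
print(predict(model, {"hand"}))  # → lizard
```

Real CV models learn far subtler features, but the failure mode is the same: the label gets attached to whatever reliably appears in its training photos, whether or not it's the organism.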

A greater concern, in my opinion, is the general issue of human faces and features in Creative Commons-licensed photos being collected by third-party companies or researchers. Several facial recognition databases have been built from photos scraped from Flickr, and the researchers and companies involved typically do not inform the people whose images were used:

To build its Diversity in Faces dataset, IBM says it drew upon a collection of 100 million images published with Creative Commons licenses that Flickr’s owner, Yahoo, released as a batch for researchers to download in 2014. IBM narrowed that dataset down to about 1 million photos of faces that have each been annotated, using automated coding and human estimates, with almost 200 values for details such as measurements of facial features, pose, skin tone and estimated age and gender, according to the dataset obtained by NBC News.

Olivia Solon, Facial recognition’s ‘dirty little secret’: Millions of online photos scraped without consent, NBC News, 2019-03-12.

There’s even been a researcher who’s helped people to find out if their Flickr photos have been used in this fashion. Here’s a story about it:

Thomas Macaulay, Check if your photos were used to develop facial recognition systems with this free tool, TheNextWeb, 2021-02-01.

And here is the tool:

Unfortunately that tool does not cover private databases such as Clearview, which the NYTimes described as “The Secretive Company That Might End Privacy as We Know It”, and which also scraped millions of users’ profile pictures across various social media services, including YouTube, Twitter, and LinkedIn.

Perhaps the best thing that iNat could do is try to detect human faces in photos posted by users and automatically blur them, or limit the license options available for such photos to prevent abuse.
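For what the blurring half of that would involve, here's a minimal sketch of just the blur step. Face *detection* itself would need a library (e.g. OpenCV's Haar cascades), so the bounding box is assumed given here, and the "image" is simply a 2D list of grayscale pixel values:

```python
# Box-blur a rectangular region of a grayscale image in place.
# In a real pipeline the (x0, y0, x1, y1) box would come from a face
# detector; here it's supplied by hand.

def blur_region(image, x0, y0, x1, y1, radius=1):
    """Replace each pixel in [y0:y1, x0:x1] with the mean of its neighborhood."""
    src = [row[:] for row in image]  # read from an unmodified copy
    h, w = len(image), len(image[0])
    for y in range(y0, y1):
        for x in range(x0, x1):
            window = [
                src[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            image[y][x] = sum(window) // len(window)
    return image

# A sharp bright square gets averaged into its dark surroundings:
img = [[0, 0, 0, 0],
       [0, 255, 255, 0],
       [0, 255, 255, 0],
       [0, 0, 0, 0]]
blur_region(img, 0, 0, 4, 4)
```

A production version would of course repeat the blur (or use a Gaussian) at a radius large enough that the face is unrecoverable, not just softened.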


I don’t think this is something to worry about with iNat’s Computer Vision model (though it is potentially an issue with facial recognition algorithms more broadly). The CV would need a lot of different faces and/or fingerprints and a reference library to be trained to recognize them individually. iNat’s CV doesn’t have this (training data will be identified as taxa, not people).

For fingerprints specifically, while it is sometimes possible to extract fingerprint information from photos with processing, iNat’s CV isn’t doing that type of processing. I think it’s incredibly unlikely that specific fingerprint information is making its way into the CV.

If we’re speculating, I think something like @murphyslab’s examples about early training runs with hands is probably the farthest that the CV model might get with non-targeted recognition of human faces. As a thought experiment, consider a case in which, for one species (XYZ) included in the CV model, there’s only one main observer, and that observer includes their face in a lot of their pictures. The model might learn to associate that observer’s face with Species XYZ. Since Homo sapiens is included in the CV model already, it would likely lead to misidentifying Species XYZ as a human. In the event that the model did somehow distinguish that face as specifically different from Homo sapiens, the observer’s face would be misidentified as Species XYZ which they uploaded so many observations of. Kind of a weird compliment? But not really a security risk.

I suppose that it could be possible to harvest photos depicting humans with CC licenses, but honestly, searching iNat photos wouldn’t be a very efficient way to do this, as most photos are not of humans. Given that most other social networks have a much higher proportion of photos of humans than iNat, and that those images are linked to much more personal data, I think iNat would be much less likely to be targeted. Harvesting could potentially occur via the iNaturalist Licensed Observation Images dataset in the Amazon Open Data Sponsorship Program (ODP) by screening for Homo sapiens images (I think these would be included in the dataset, but haven’t checked).
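If someone did try that screening approach, it might look something like the sketch below. Note that the field names (`taxon_name`, `license`, `photo_id`) are assumptions for illustration, not the actual schema of the iNaturalist Open Data files:

```python
# Hypothetical sketch: filter a metadata listing down to CC-licensed
# Homo sapiens photos. Field names and license strings are assumed.

CC_LICENSES = {"CC0", "CC-BY", "CC-BY-NC"}

def human_photos(records):
    """Return photo ids of CC-licensed Homo sapiens images."""
    return [
        r["photo_id"]
        for r in records
        if r["taxon_name"] == "Homo sapiens" and r["license"] in CC_LICENSES
    ]

sample = [
    {"photo_id": 1, "taxon_name": "Homo sapiens", "license": "CC-BY"},
    {"photo_id": 2, "taxon_name": "Anolis carolinensis", "license": "CC0"},
    {"photo_id": 3, "taxon_name": "Homo sapiens", "license": "All Rights Reserved"},
]
print(human_photos(sample))  # → [1]
```

Which is also why the All Rights Reserved suggestion below works: as in record 3, a non-CC license drops the photo out of exactly this kind of filter (and out of the Amazon dataset entirely).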

Fortunately, a user concerned about this can easily prevent any issues by not uploading photos containing their face or fingerprints, or by cropping or blurring those details. They could also give any Homo sapiens images an All Rights Reserved license to prevent their inclusion in the Amazon dataset.

NB: I don’t have any special knowledge of iNat’s CV - this is just my speculation based on a limited general knowledge of how these types of models can work.


I’ll note that at the time, Homo sapiens wasn’t in the CV model, so it would never have returned Homo sapiens as a suggestion. The model was probably just going “uhhhhh…looks like these things in lizard photos?”

iNat doesn’t care about recognizing individual people (humans are so boring compared to slime molds!), so the model wouldn’t be trained to associate faces or fingerprints to individuals.

