That article does exist, but I think there are some potential problems. First, staff have made it clear in the past that iNat isn’t intended to be used as an alternative NHC database, which I would argue the paper more or less proposes.
I think that the duplicates are a fairly minor issue, but they can cause confusion. For instance, identifications for entries in NHC databases are often fairly static - they are rarely changed unless re-examined by an expert. On the other hand, the community taxon of an observation on iNat may change easily and quickly. This is also true for other aspects of the observation: fields, date, time, and location are all editable, though for most observations edits are reasonably rare. If any of these change and the NHC record is not updated, there are now two records of the same thing, one on iNat and one in the NHC, with different info (which makes them quite difficult to identify as duplicates). At the extreme, iNat observations can be deleted by a user at any time. All of these attributes make iNat less suitable for NHC collections, which generally aim to be permanent.
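As a rough sketch of how a collection could at least detect this kind of drift, here is a comparison between the copy stored at import time and the observation as it looks now. The field names and the `changed_fields` helper are hypothetical and chosen only for illustration; they are not taken from the paper or from the iNat API.

```python
from typing import Any, Dict, List, Optional

# Hypothetical field names, for illustration only.
TRACKED_FIELDS = ("taxon", "observed_on", "latitude", "longitude")


def changed_fields(
    snapshot: Dict[str, Any], current: Optional[Dict[str, Any]]
) -> List[str]:
    """Return the tracked fields that differ between the stored copy and
    the current observation. `current` is None if the observation is gone."""
    if current is None:
        return ["<deleted>"]
    return [f for f in TRACKED_FIELDS if snapshot.get(f) != current.get(f)]


# Copy made when the record was imported into the collection database.
nhc_copy = {
    "taxon": "Quercus alba",
    "observed_on": "2021-05-01",
    "latitude": 40.0,
    "longitude": -75.0,
}
# The same observation later, after the community taxon changed.
inat_now = dict(nhc_copy, taxon="Quercus stellata")

print(changed_fields(nhc_copy, inat_now))
print(changed_fields(nhc_copy, None))
```

A check like this only works if the collection keeps the observation ID alongside its own record and re-checks periodically, which is exactly the ongoing maintenance a one-time export skips.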
I also feel like in some cases this type of approach can amount to NHCs co-opting work done by others (iNat users) as their own. If an employee at an NHC uploads their own observations, or those created by volunteers specifically on an NHC sponsored/coordinated project, I don’t think there’s much of an issue. However, if NHCs are linking to observations or pulling info from the API for import to their own database and then presenting that data as their own, I think this is a potential issue. For one thing, it’s possible to do this without respecting the observation/media licenses that users have chosen (so that would need to be handled carefully). Also, for NHC specimens, there’s an expected level of rigor when it comes to identification - I think most people expect that a specimen ID from a museum record has been made or checked by museum staff. If they are pulling in community taxon IDs, annotations, or other data entered by iNat users, this seems like benefiting from the work of iNat identifiers without necessarily crediting them.
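To make the "handled carefully" part concrete, here is a minimal sketch of an import step that skips records whose license the institution cannot honor and keeps a credit line for the observer and identifiers. The record layout (`license_code`, `observer`, `identifiers`), the `ALLOWED_LICENSES` set, and both helper functions are assumptions for illustration, not the actual API response format or anyone's real policy.

```python
from typing import Any, Dict, List

# Assumption: the licenses this hypothetical institution has decided it can honor.
ALLOWED_LICENSES = {"cc0", "cc-by", "cc-by-sa"}


def importable(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only records with an allowed license.
    A missing license is treated here as not importable."""
    return [
        r for r in records
        if (r.get("license_code") or "").lower() in ALLOWED_LICENSES
    ]


def credit_line(record: Dict[str, Any]) -> str:
    """Build an attribution string naming the observer and every identifier."""
    observer = record.get("observer", "unknown observer")
    identifiers = sorted(set(record.get("identifiers", [])))
    if identifiers:
        return f"Observed by {observer}; identified by {', '.join(identifiers)}"
    return f"Observed by {observer}"


records = [
    {"id": 1, "license_code": "cc-by", "observer": "alice",
     "identifiers": ["bob", "carol", "bob"]},
    {"id": 2, "license_code": None, "observer": "dave", "identifiers": []},
    {"id": 3, "license_code": "cc-by-nc", "observer": "erin",
     "identifiers": ["bob"]},
]

for rec in importable(records):
    print(rec["id"], credit_line(rec))
```

Storing the credit line with the imported record would at least make it visible that the ID came from iNat identifiers rather than museum staff.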
Relevant sections of the cited paper indicate their proposed workflow may be subject to some of these issues: “Taxonomic identification is facilitated by artificial intelligence features in iNaturalist, the iNaturalist community, and a community‐curated taxonomic nomenclature. iNaturalist improves efficiency and accuracy for botanists relative to field guides or memory alone by providing a list of identification suggestions and a set of pre‐defined taxonomic names from which to choose (e.g., avoid misspellings, taxonomic synonyms). Accuracy is also improved with identification suggestions or verifications from other iNaturalist users. Second, once observations are complete, these data can be easily exported and directly converted into herbarium labels and merged into collections databases.”
This implies to me that the workflow assumes the AI suggestions are good and that iNat IDers will have higher ID accuracy than botanists (true in some cases, definitely not in others). It also implies a one-time conversion of iNat data (which may later change) into the database and collection, along with the potentially uncredited use of data from other users (annotations, IDs, etc.), which I think is ethically problematic.