iNat data is already so thoroughly biased as to be useless for many purposes: the density of iNat observations follows human population density; iNat observations are mostly just the things iNat observers managed to photograph; they often come only from the rare times when an observer took a particularly good photograph (a surprising number of people are too embarrassed to upload low-quality photos); some observers only make one observation per species so they can build a life-list; and there are probably a lot more reasons I don’t know about. If you want easy-to-analyze data, you need a regular sample/survey method, both spatially and temporally. “Research Grade” is a misleading label for most observations, unless all you’re attempting to research is presence at particular times and places. And for that, duplicates aren’t a problem.
In summary: attempting to use iNat data to work out abundance or even trends will inevitably produce very skewed results, so removing “Research Grade” from duplicate observations won’t noticeably improve results for researchers.
(And researchers really, really need to be warned that analyzing iNat observations to figure out abundance, density, or population size is a waste of time if they’re aiming at finding and sharing true information in addition to publishing a paper or completing a thesis. These days, thanks to computers, analyzing data is a lot easier than collecting useful data, and iNat can’t be used as a shortcut to skip the hard part. eBird, maybe.)
Phew. End rant. Sorry, I work with population biologists, and my parents were population biologists, so I’ve been inculcated with this point of view. It looks like the developers are going to address exact duplicates at the source, so that’ll help. Maybe they’ll test their system by checking for existing exact duplicates, and test the tool for merging and splitting observations on the ones with the same “Research Grade” ID.