I am curious about error rates, and know there has been some work attempting to quantify them. I think it’s probably safe to assume that overzealous identifiers, almost by definition, will have higher error rates than other identifiers. But I see that as one end of a spectrum of identifier quality, and maybe it’s worth considering how to encourage a good balance between identifying more things and maintaining a reasonably high level of accuracy in the identifications.
I think @charlie has mentioned that, for plants, he isn’t convinced the error rate on iNaturalist is much higher than that for herbarium collections (forgive me if I’m misremembering, and please correct me if so). And there’s no way of knowing the error rate for data recorded without vouchers; in my personal experience, I’ve seen enough to suggest that the accuracy of plant data is highly dependent on the observer.
Personally, I know that I feel kind of bad about making mistakes in identifications, and sometimes feel like maybe I should be more conservative, to avoid mistakes like that. However, based on this measure at least, my overall error rate is pretty low, and I do use the corrections to help calibrate which things I should feel reasonably confident in (acknowledging that my confidence is based in large part on knowing what species are expected to occur in the area where I focus).
When I do learn about a species I wasn’t aware of previously, I become more conservative unless/until I feel confident I know how to distinguish it from others that occur in the region I look at. As an example, when @markegger identified some observations as C. chrymactis, a species I had not previously been aware of, I began limiting my identifications to genus for observations in the (relatively limited) area where that species occurs, since I didn’t know how to differentiate it from the other species expected in the region.
In contrast to my approach, a friend of mine only wants to identify something based on seeing all the key characters, even if there’s not really anything else expected from the region that is likely to fit what’s shown in the observation.
While I would certainly acknowledge that the quality of identifications is likely to be higher in that case, it’s unclear whether that (possibly small) gain in accuracy is worth the cost of many more observations going unidentified, when most of those identifications would have been correct.
I suppose there will always be a tension between putting names on things and holding back, and views will differ depending on the person and how they want to use the observations and identifications.