Audio recordings of bird song/calls

Given the way that the iNat CV model works, this wouldn’t really be possible. It is certainly possible to train a machine learning model to identify species from spectrograms (this is what Merlin does, for instance). But Merlin’s sound model is separate from its photo ID model, and each is trained only on its own class of inputs.

iNat’s model doesn’t know that an uploaded picture is a spectrogram; it will be treated the same as any other picture of the organism. So adding spectrograms would introduce unnecessary error into model training. An additional issue is that spectrograms would need to be produced in a consistent manner to allow for comparisons (again, this is what Merlin and similar tools do). Using unstandardized spectrograms, even in a spectrogram-specific model, would lead to poor results.
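To make the "consistent manner" point concrete, here is a rough sketch of what standardizing might look like. It is not how Merlin or iNat actually do it. The settings (`TARGET_RATE`, `N_FFT`, `HOP`), the function `standardized_spectrogram`, and the helper `tone` are all made up for illustration, and it assumes NumPy is available. The idea is just that every recording gets resampled to one rate, cut into windows of one size, and scaled relative to its own loudest point.

```python
import numpy as np

# Hypothetical fixed settings; what matters is that every recording uses the same ones.
TARGET_RATE = 22050   # Hz
N_FFT = 512           # samples per analysis window
HOP = 128             # samples between successive windows


def standardized_spectrogram(samples, sample_rate):
    """Return a log-magnitude spectrogram (freq bins x frames) with fixed settings."""
    samples = np.asarray(samples, dtype=np.float64)

    # Crude linear-interpolation resampling to a common rate (no anti-alias filter).
    duration = len(samples) / sample_rate
    n_out = int(round(duration * TARGET_RATE))
    old_t = np.arange(len(samples)) / sample_rate
    new_t = np.arange(n_out) / TARGET_RATE
    resampled = np.interp(new_t, old_t, samples)

    # Short-time Fourier transform with a Hann window.
    window = np.hanning(N_FFT)
    n_frames = 1 + (len(resampled) - N_FFT) // HOP
    frames = np.stack(
        [resampled[i * HOP : i * HOP + N_FFT] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T  # (N_FFT // 2 + 1, n_frames)

    # Decibels relative to the loudest bin, so recording level does not change the picture.
    peak = max(magnitude.max(), 1e-10)
    db = 20.0 * np.log10(np.maximum(magnitude, 1e-10) / peak)
    return np.clip(db, -80.0, 0.0)


def tone(freq_hz, sample_rate, seconds=1.0):
    """Hypothetical test signal: a pure sine tone."""
    t = np.arange(int(sample_rate * seconds)) / sample_rate
    return np.sin(2 * np.pi * freq_hz * t)


# Same 3 kHz tone, recorded at different sample rates and volumes.
spec_a = standardized_spectrogram(tone(3000, 44100), 44100)
spec_b = standardized_spectrogram(0.1 * tone(3000, 48000), 48000)
```

With fixed settings, both spectrograms come out the same shape and the tone lands in the same frequency row, so the two can be compared directly. Change the window size or sample rate between recordings and the same call ends up looking like a different picture.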

If you want a more “official” take on spectrograms:
