Spectrogram browser extension for iNaturalist - feedback welcome!

I was doing my usual review of recent frog postings in my area, and on one audio-only observation someone commented that it would’ve been useful to see a spectrogram to help with the ID. So I thought, “why not?!”

So I made iNatSpectro, a browser extension that adds a spectrogram for any audio files on an observation. It’s free, of course, and I’d love it if anyone has any feedback on how I could make it more useful.

https://chromewebstore.google.com/detail/inatspectro/dkcpffpppiggohlejjcoafbhhdcmnapc?authuser=0&hl=en-GB&pli=1

Once installed in your browser, viewing observations with audio attached will show a spectrogram like this one for an observation of a bat call:

Clicking the little cog reveals a number of settings that can also be adjusted.

Feedback is welcome! If this is something the iNaturalist team would like to incorporate into the site itself, please consider it my gift to help get started!

(I’m still making my way through all the various postings on the forums about spectrograms on iNaturalist, but in the meantime, as a proof-of-concept (and if desired, a basis) for having them built into the website, there’s this.)

47 Likes

This is a fantastic tool! I often wish iNaturalist provided something like this. Thanks for your work.

However, when I try it on my observations, the bottom part looks weird. The only editing I do on the WAV sounds recorded by phone is normalizing, and sometimes a bit of denoising. I tried adjusting the parameters but it didn’t make much of a difference. My browser is the latest Vivaldi (Chromium-based) on Windows.


https://www.inaturalist.org/observations/287230171


https://www.inaturalist.org/observations/264976046

4 Likes

Thanks so much for testing it out!

For the Corn Crake, these settings seem to get a fairly representative result for me?

And maybe these settings for the Red Kite?

I am working on being able to toggle between linear and logarithmic views, which might help too, so hopefully the next update!

3 Likes

Thank you! Works flawlessly from my tests so far. Really appreciate having this feature available.

6 Likes

Thank you, seems I didn’t tweak the parameters enough :-) It might indeed be because of the logarithmic scale. eBird seems to use a linear one; this is the same corn crake observation (and files) I uploaded there: https://ebird.org/checklist/S247180263. I don’t know whether they apply the same gamma, smoothing, etc. parameters to every sound recording there.

I’ll be happy to test the feature more!

3 Likes

xeno-canto.org also uses a linear scale. Perhaps they have the same settings as eBird? I think it is older than eBird, but perhaps less active now.

3 Likes

Awesome, thank you!
Is this only for Chrome or also Firefox?
Are you planning on releasing the source?

4 Likes

My pleasure! I’ve done some testing with linear vs logarithmic and I definitely think the linear mode for birds makes things a lot clearer. I’ve submitted an update to the Chrome Store, so once it’s been reviewed, it should be available for you!

2 Likes

Thanks! I’ve tested in Chrome and Edge, but not in Firefox as yet, so I’m not sure.

I’m not opposed to releasing the source at some stage, once I’m happy with where it’s at.

4 Likes

This is cool! I’ll have to try it.

1 Like

Ooh, naturalists are so talented and generous and wonderful!!! Thank you @japh :-)

Have tested on Brave (Chromium-based) on Linux and it works well. I’m getting the same “test pattern” lines as @caroig. I messed about with the settings a bit but couldn’t improve it that much (any resolution above 350 nearly crashes my machine - this will be my machine and the fact that I have too many tabs open at all times despite trying to keep it tidy ;-)

This does not bother me overmuch coz I find the waveform easier to read than a spectrogram anyway. In fact, maybe it would be a nice option to choose to show one or both?

4 Likes

@japh Thanks for making the extension! For birds, I wonder if one could make the spectrogram more fine-grained. For example, for this American Robin observation, the produced spectrogram (with minimum frequency adjusted from 100Hz to 1000Hz):

It’d be great if it could be more fine-grained, something like this (from Merlin Bird ID):

A more fine-grained spectrogram would be easier to compare with some well-known samples, e.g., those from Peterson Field Guide to Bird Sounds.

I tried tweaking the various provided settings. None of them seems to change how fine-grained the spectrogram is.

6 Likes

On linear vs logarithmic scale, I’m under the impression that the extension currently uses a logarithmic scale.

For birds, my impression is that the common practice is to use a logarithmic scale, e.g., the spectrograms in the well-known samples of Peterson Field Guide to Bird Sounds.
The typical minimum frequency is probably around 1000 Hz instead of 100 Hz (the default in the extension’s bird profile).
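
To make the difference between the two scales concrete, here is a minimal sketch of how a frequency could be mapped to a vertical position on each kind of axis. The function name `freq_to_fraction` and the 100 Hz to 10 kHz display range are assumptions for illustration only; nothing here is taken from the extension's code.

```python
import math


def freq_to_fraction(freq_hz, min_hz, max_hz, scale="linear"):
    """Return the vertical position of freq_hz as a fraction from 0 (bottom) to 1 (top)."""
    if not (0 < min_hz < max_hz):
        raise ValueError("need 0 < min_hz < max_hz")
    # Clamp so out-of-range frequencies land on the edge of the plot.
    freq_hz = min(max(freq_hz, min_hz), max_hz)
    if scale == "linear":
        return (freq_hz - min_hz) / (max_hz - min_hz)
    if scale == "log":
        return math.log(freq_hz / min_hz) / math.log(max_hz / min_hz)
    raise ValueError(f"unknown scale: {scale}")


# Assumed display range of 100 Hz to 10 kHz (hypothetical, not the extension's defaults).
for f in (100, 1000, 5000, 10000):
    lin = freq_to_fraction(f, 100, 10000, "linear")
    log = freq_to_fraction(f, 100, 10000, "log")
    print(f"{f:6d} Hz  linear={lin:.2f}  log={log:.2f}")
```

With this range, the 100 to 1000 Hz band fills the bottom half of a logarithmic axis but only about 9% of a linear one, which is why raising the minimum frequency changes a logarithmic view so much.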

1 Like

Have just discovered an issue. When I upload, the spectrogram bit is included on the card like so:

And the file name does not show (which it usually does). I need the file name because it shows the date:time & species, which I need to enter manually coz iNat can’t fetch it from the metadata (which seems to be stripped when I convert my videos to audio files in Audacity).

I don’t think it’s necessary to have it on the upload screen. What do others think?

2 Likes

I think eBird and xeno-canto, both very popular online resources (xeno-canto is older, eBird is probably growing much more now), both use a linear scale.

So this is what my audio cards usually look like:

But, Good News!, I have discovered that one can change the options on the extension from “On inaturalist.org” to “When you click this extension”. So I can turn it off when uploading.

1 Like

I tested this on my bird and bat observations. It worked great on my bird observations. However, I was only able to get it to work on one of my bat observations. My bat recordings are slowed to 1/10th speed (which is a common convention so that they are within the human auditory range). It worked on this one: https://www.inaturalist.org/observations/66054745, but not these: https://www.inaturalist.org/observations/66055763, https://www.inaturalist.org/observations/66030666. For the latter two, it only shows an empty purple field. Do I need to change the settings? Increase the volume of the recordings?

Thanks for building such a useful browser extension! I also hope you can release the source for it one day (even if it isn’t completely polished). Maybe it could even be integrated into the iNaturalist software.

2 Likes

Never mind, it looks like I just needed to dramatically lower the frequency range, due to it being slowed down. It would be cool if it automatically detected the correct frequency range, but I imagine that’s easier said than done. Maybe you could create a “Bat 1/10th speed” profile.
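
For anyone else with slowed-down recordings, the arithmetic is simple: slowing playback by some factor divides every frequency by that factor. Below is a rough sketch of how a profile's range could be rescaled. The function name and the 15 to 120 kHz real-time range are hypothetical placeholders, not values from the extension.

```python
def scale_frequency_range(min_hz, max_hz, slowdown_factor):
    """Map a real-time frequency range to where it appears after slowing playback."""
    if slowdown_factor <= 0:
        raise ValueError("slowdown_factor must be positive")
    return min_hz / slowdown_factor, max_hz / slowdown_factor


# Hypothetical real-time bat range of 15-120 kHz, recording slowed to 1/10th speed.
low_hz, high_hz = scale_frequency_range(15_000, 120_000, 10)
print(f"Display range for the slowed recording: {low_hz:.0f}-{high_hz:.0f} Hz")
```

So a call that was at 45 kHz in real time shows up at 4.5 kHz in a 1/10th speed file, and the display range has to drop by the same factor.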

1 Like

Hi, great work! I have been advocating for an iNaturalist solution for spectrograms ever since I saw the frequently uploaded screenshots from other apps. I do not use Chrome, so I could not test it. But one thing that is missing is documentation about how the spectrum is generated. For interpreting a short-time Fourier transform (STFT), it is crucial to know the parameters, as different parameters create different-looking spectrograms, especially when you want to optimize time vs frequency resolution. This can also lead to blocky-looking images, as @caroig observed. Because of the Rayleigh frequency limit, low-frequency signals may be filtered out when the time window is short. This could maybe be ignored for birds but could be bad for toads. So ideally you want to have control over these parameters too (hop size, window size/time frame). Maybe you can add this to the profile settings? Maybe there could be a profile setting for birds and for other species groups to make this easier for inexperienced users.
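
As a rough illustration of the time vs frequency trade-off described above, the sketch below computes the frequency bin width and the time spans for a few window sizes. The 44.1 kHz sample rate, the window sizes, and the quarter-window hop are assumptions picked for the example; the extension's actual parameters are not documented in this thread.

```python
def stft_resolution(sample_rate_hz, window_size, hop_size):
    """Return (bin_width_hz, window_duration_s, hop_duration_s) for an STFT."""
    bin_width_hz = sample_rate_hz / window_size       # spacing between frequency bins
    window_duration_s = window_size / sample_rate_hz  # time span covered by one column
    hop_duration_s = hop_size / sample_rate_hz        # time between adjacent columns
    return bin_width_hz, window_duration_s, hop_duration_s


# Assumed 44.1 kHz audio with a hop of one quarter of the window.
for window_size in (256, 1024, 4096):
    bin_hz, win_s, hop_s = stft_resolution(44100, window_size, window_size // 4)
    print(f"window={window_size:5d}  bin={bin_hz:7.2f} Hz  "
          f"span={win_s * 1000:6.2f} ms  hop={hop_s * 1000:5.2f} ms")
```

The bin width is the reciprocal of the window duration, so a short window gives sharp timing but coarse frequency bins (about 172 Hz for a 256-sample window at this sample rate), which is where low-pitched calls can get smeared together.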

Are the Min and Max frequency from your profile settings a filter before the STFT is done or does it just cut the shown spectrogram?
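
For illustration only: if the Min and Max settings merely crop the display, the operation amounts to choosing which STFT rows to draw, with no effect on the transform itself. The sketch below shows that interpretation. The function name, sample rate, and window size are assumptions, and the thread does not say which approach the extension takes.

```python
import math


def visible_bins(min_hz, max_hz, sample_rate_hz, window_size):
    """Return (first_bin, last_bin) of the STFT rows whose frequency lies in [min_hz, max_hz]."""
    bin_width_hz = sample_rate_hz / window_size
    nyquist_bin = window_size // 2  # highest bin of a real-input FFT
    first_bin = max(0, math.ceil(min_hz / bin_width_hz))
    last_bin = min(nyquist_bin, math.floor(max_hz / bin_width_hz))
    return first_bin, last_bin


# Assumed 44.1 kHz audio, 1024-sample window, display range 1-10 kHz.
first, last = visible_bins(1000, 10000, 44100, 1024)
print(f"Draw rows {first} to {last} of {1024 // 2 + 1}")
```

A filter applied before the STFT would instead change the audio samples, so energy outside the range would be removed rather than just hidden.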

2 Likes

I stand corrected - I took another look. Peterson Field Guide to Bird Sounds and the Merlin Bird ID app both also use a linear scale for the frequency / y-axis.

1 Like