Spectrogram browser extension for iNaturalist - feedback welcome!

I notice that the spectrograms here include frequencies all the way down to 100 Hz. The portions of the graph above 1 kHz look fine, so the logarithmic display of the very low frequencies seems to be the issue.

Notwithstanding the obvious value of having visual displays of sounds, there is still the outstanding issue of whether it is appropriate to include spectrograms among iNat images. I may be corrected, but I think staff have previously discouraged uploading such graphics (as jpg images, that is) because they would be rolled into any selection of images used to train Computer Vision. Mixing images of birds with spectrograms may throw a monkey wrench into the training sets. The standard advice I’ve heard is not to upload a spectrogram as the first image in any event. I don’t know whether this would also apply to an embedded link.
@tiwane Am I speaking out of turn?

No, but I don’t know how much of a difference they make (I’m not involved in the nitty gritty of CV training). As far as we can tell, they haven’t been an issue so far.

I’m so sorry about that! It was a bug that snuck in, which I have since fixed, so I’m glad the update resolved it for you!

Thank you! Yes, there is an option to change between frequency modes: you can select logarithmic, linear, or mel, and you should see an improvement if you switch to whichever one is best suited to the sounds you’re viewing.
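To illustrate why the choice of mode matters for low frequencies, here is a minimal sketch of how a frequency could be mapped to a position on the axis in each mode. This is not the extension's code: the function names, the 100 Hz to 10 kHz axis range, and the HTK-style mel formula are all assumptions made for the example.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert Hz to mels using the common HTK-style formula (an assumption here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def freq_to_position(freq_hz: float, f_min: float, f_max: float, mode: str = "linear") -> float:
    """Return where freq_hz sits on the frequency axis as a fraction in [0, 1].

    0.0 is the bottom of the axis (f_min) and 1.0 is the top (f_max).
    """
    if not 0 < f_min < f_max:
        raise ValueError("expected 0 < f_min < f_max")

    # Pick the scaling function for the requested mode.
    if mode == "linear":
        scale = float
    elif mode == "log":
        scale = math.log10
    elif mode == "mel":
        scale = hz_to_mel
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    position = (scale(freq_hz) - scale(f_min)) / (scale(f_max) - scale(f_min))
    # Clamp so frequencies outside the axis range land on its edges.
    return min(1.0, max(0.0, position))


if __name__ == "__main__":
    # Hypothetical axis range: 100 Hz to 10 kHz.
    for mode in ("linear", "log", "mel"):
        share = freq_to_position(1000.0, 100.0, 10000.0, mode)
        print(f"{mode:>6}: 100 Hz to 1 kHz takes {share:.0%} of the axis")
```

With these assumed limits, the band from 100 Hz to 1 kHz fills half of a logarithmic axis but only about 9% of a linear one, with mel falling in between. That is why low-frequency artefacts stand out so much more in logarithmic mode.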

Right, this is actually one advantage of using something like this browser extension (or of having the feature integrated): it lets you visualise the sound without “polluting” (for want of a better term) the image data. The visualisation is rendered on the fly and is therefore not stored as an image. If everyone had access to this functionality, there would be no need to upload images of spectrograms to observations, as the audio would stand on its own, able to be both heard and seen.
In addition, if this sort of functionality were incorporated into iNaturalist itself, and the audio parsing used to generate spectrograms also allowed that audio data to be stored in a form more accessible to Machine Learning, then iNaturalist would begin building a bank of trainable data for identifying creatures from audio as well as from images. @tiwane may know if this is already something iNaturalist is working towards.

I have just been doing some bird call IDs on the Identify page, and I see the spectrograms don’t show there. Would be very cool if they did!

Great addition! Thanks.

I have added it successfully to both Firefox and Chrome.

One difference I did notice: in Firefox, it failed to load the spectrogram for a longer recording (1:51). That same file worked fine in Chrome. I guess Firefox has an earlier “time out” failure for extensions?

This is going to be great when trying to ID other people’s recordings.

Chris

Love the project! Would you be willing to upload it to GitHub for contributions?

Oh! Do you have a link to an example page like this please? Then I can take a look and see what I can do. If there are a lot on a single page, it might need to be triggered for each one for performance reasons.

Yeah, I noticed that Firefox performance is slower, and I believe it also has a lower limit on the size of the spectrogram. I’ve been working on some performance improvements, trying a couple of different approaches to see what works best, so hopefully this will be addressed in a future update!

Thank you! The GitHub question has been asked before. The repo is currently private, but I would consider making it public in the future. Part of the reason I haven’t yet is that I’m still exploring the direction and don’t want to manage contributions at this stage, but once it settles I think I could. Feature requests and suggestions here are more than welcome in the meantime!

Identify example: https://www.inaturalist.org/observations/identify?verifiable=true&project_id=185251

This is the basic Identify page, and the spectrograms must NOT load here:

Then when one clicks on an ob it opens a modal, and the spectrogram should SHOW here:

I imagine the spectrogram should only load per ob that is viewed, not all at the same time. Coz that will crash my machine.
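As a rough sketch of that idea (render only the observation being viewed, and keep very few results in memory), something like the following could work. It is written in Python purely to show the logic and is not how the extension is implemented; `LazySpectrogramCache`, `render`, and `max_items` are hypothetical names.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable


class LazySpectrogramCache:
    """Render a spectrogram only when an observation is opened, keeping few in memory."""

    def __init__(self, render: Callable[[Hashable], Any], max_items: int = 2) -> None:
        if max_items < 1:
            raise ValueError("max_items must be at least 1")
        self._render = render          # expensive function: observation id -> spectrogram
        self._max_items = max_items    # upper bound on spectrograms held at once
        self._cache: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, observation_id: Hashable) -> Any:
        """Return the spectrogram for one observation, rendering it on first use."""
        if observation_id in self._cache:
            # Mark as most recently viewed so it is evicted last.
            self._cache.move_to_end(observation_id)
            return self._cache[observation_id]

        result = self._render(observation_id)
        self._cache[observation_id] = result
        if len(self._cache) > self._max_items:
            # Drop the least recently viewed spectrogram to cap memory use.
            self._cache.popitem(last=False)
        return result

    def __len__(self) -> int:
        return len(self._cache)


if __name__ == "__main__":
    # Stand-in renderer; a real one would decode audio and compute an FFT.
    cache = LazySpectrogramCache(lambda ob_id: f"spectrogram for {ob_id}", max_items=2)
    print(cache.get("ob-1"))  # rendered now, not when the page listing loaded
    print(len(cache))
```

Nothing is rendered when the list of observations loads; work happens only inside `get`, which would be called when a modal opens. Capping `max_items` trades some re-rendering for a bounded memory footprint.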

Japh, the 400 default resolution is too high for my computer/browser/RAM/whatever. I lower it, but it is back to 400 on the next ob. Would it be possible to lock the resolution (make it ‘sticky’) per user? Or be able to save settings or something? Once that dang kswapd0 kicks in it basically crashes my computer. It is the bane of my life and I probably need more RAM. Or a new computer. Or to go outside and get off the computer completely. Or maybe just eat some cake. Yup, I think cake is the winner :wink: Here I go in a cloud of dust!

Thank you so much for the additional info!

The resolution change should already be “sticky”: the extension remembers, on your machine, the last setting used for the active profile. However, if you have multiple tabs open, or perhaps in this modal situation, I could see that it might not be respected. I have noted it down as something to address in the next update too. (Which will hopefully also address performance a bit… we’ll see… maybe you can get away with your current RAM, and some cake!)

Right, the RAM has been ordered :wink: This being Africa it will take 2 weeks to get here. Hope springs eternal :rofl:

RAM arrived. Installed and all is well in ram-world :slight_smile: The sonograms load fine now, and the resolution is sticky. Oh the joy!

Hooray! I’m glad to hear it. I’m working on a big upgrade. It’s taking a while, but hopefully it will be worth it.
