iNaturalist Sound Classifier (browser extension by @k4u3)

Hello!

As I began working in bioacoustics, I often searched iNaturalist for audio from sound-based observations to train or validate custom sound classification models. That got me thinking about a more accessible approach to sound classification that could help iNaturalist users with their sound-only observations.

This project is a first step in that direction: a free, open-source desktop browser extension that runs bioacoustic neural network models directly in the browser. It enables users to analyze sound recordings on iNaturalist observation pages, performing local analysis to identify species and cross-check detections against geographic occurrence data from GBIF and iNaturalist. Classification models can also include geographic bounding boxes that define where they are applicable.
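
To illustrate the bounding-box idea, here is a minimal sketch of how an applicability check could work. It is not the extension's actual code: the `BoundingBox` class, its field names, and the example coordinates are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical region in decimal degrees where a model is applicable."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lon: float) -> bool:
        """Return True if the observation location falls inside the box."""
        if not (self.south <= lat <= self.north):
            return False
        if self.west <= self.east:
            return self.west <= lon <= self.east
        # The box crosses the antimeridian (e.g. west=170, east=-170).
        return lon >= self.west or lon <= self.east


# Rough, made-up box around Texas, used only as an example.
texas_box = BoundingBox(south=25.8, west=-106.7, north=36.5, east=-93.5)
print(texas_box.contains(30.27, -97.74))  # True: a point inside the box
print(texas_box.contains(-42.0, 147.0))   # False: a point in Tasmania
```

A check like this would only say whether a model is meant for the observation's region. It would not replace the cross-check against GBIF and iNaturalist occurrence data.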

For this initial release, I chose to support models in ONNX format. Two well-known bioacoustic models (BirdNET v2.4 and Perch v2.0) have been adapted to ONNX by Justin Chu and are included as built-in options. BirdNET v2.4 provides near-global coverage for birds, while Perch v2.0 was trained on multiple taxa, including sound data from iNaturalist. Users should be mindful of model resource requirements. For example, Perch v2.0 (size ~400 MB) is significantly larger than BirdNET v2.4 (~60 MB) and may require more RAM during analysis.

There has been significant progress in developing specialized classifiers in this field, and one of my goals is to support the conversion of custom models to ONNX whenever possible and make them available through this extension. I am also exploring support for additional model formats in the future.

Currently available versions:

This is the first release, and feedback is very welcome. The project is open source, available on GitHub, and open to contributions and issue reports.

Cheers,
Kauê

This is fantastic! I tried it out on a half-dozen species of Texas frogs and toads and it was accurate on all but one. It didn’t get the tree crickets correct though. The models will get better with time. I noticed that it showed an older (or different) taxonomy for some species…but linked to the correct species page in iNat.

It took seconds to install and worked perfectly…providing a popup for observations with sounds. For me, running Perch, it only used about 2 GB of RAM. Not bad at all.

Here’s a shortcut to observations with sounds:

https://www.inaturalist.org/observations?sounds&term_id=1

Just a reminder for folks using this:

Please ID based on your own expertise, using the model as a tool, and don’t just enter a result from this (or any other classification or CV model, such as Merlin) as your ID.

Or the iNat CV.

Indeed:

Thanks for pointing that out! Models can make mistakes, so it’s best to consider them as tools that support and spark interest in learning about animal sounds. If people are unsure about a suggestion, I suggest comparing it with other sound recordings to confirm the identification, either within iNat or through external sources such as:

The extension was quick and easy to install and easy to use. However, I did not get the expected results with several of my bird sound recordings (all of which had been independently verified with BirdNET and their sonograms on eBird). Even an easily recognizable species like a Red junglefowl failed with both BirdNET and Perch. Here’s the observation; let me know if you get a positive result:

https://www.inaturalist.org/observations/255360428

One thing you can try is testing different values for the “Advanced parameters”, especially the “Confidence Threshold”. I set 0.5 as the default to avoid showing too many results (especially wrong ones), but sometimes the model identifies the correct species with low confidence. If you reduce the threshold with BirdNET to 0.1, it suggests the expected species for the link you shared. Another thing is that each model needs a specific time window duration (3 s for BirdNET and 5 s for Perch). If the recording is shorter than this, the extension fills the gap with silence, which can affect the model's classification.
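
To make those two points concrete, here is a rough sketch of silence padding and confidence filtering. It is not the extension's actual code: the function names, the 48 kHz sample rate, and the scores are assumptions for illustration only.

```python
def pad_to_window(samples: list[float], sample_rate: int, window_s: float) -> list[float]:
    """Append zeros (silence) until the clip fills one analysis window."""
    target = round(sample_rate * window_s)
    missing = max(0, target - len(samples))
    return samples + [0.0] * missing


def filter_by_confidence(scores: dict[str, float], threshold: float) -> list[tuple[str, float]]:
    """Keep labels scoring at or above the threshold, best first."""
    kept = [(label, score) for label, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical 1-second clip at an assumed 48 kHz, padded to a 3-second window.
clip = [0.1] * 48_000
padded = pad_to_window(clip, sample_rate=48_000, window_s=3.0)
print(len(padded) / 48_000)  # 3.0 seconds, of which 2.0 are silence

# Made-up scores: the right species is present, but with low confidence.
scores = {"Gallus gallus": 0.12, "Pavo cristatus": 0.03}
print(filter_by_confidence(scores, threshold=0.5))  # []
print(filter_by_confidence(scores, threshold=0.1))  # [('Gallus gallus', 0.12)]
```

In this made-up case two thirds of the window is silence, which is one reason a short recording may score lower than a longer one of the same call.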

Can you say what OS/browser etc. you use? I’m using Chrome on Windows 10, and the extension just keeps crashing. With the Birdnet model an ID briefly flashes, then the tab closes with an “out of memory” message. The Chrome task manager reports about 500 MB just before the crash. The Perch model downloads, and then just sits there forever. There is plenty of system memory. Using the extension or not barely registers on the Windows task manager’s memory display.

Vivaldi browser (Chrome-based) on Linux Mint OS.

This seems like a broader, different question than the OP’s intent for the thread (which is specifically about their extension). It’s probably best to keep the conversation in this thread focused more narrowly on that. You could make a separate thread with this question and note it was inspired by this tool, ask mods to move this to its own thread, perhaps revive an older thread if there’s one that has a similar question, etc.

Edit: The post this was replying to has been deleted by the poster.

Ah, thanks. I’ll hold off trying to use it then, maybe it’ll become more stable in the future.

I’ve tried this out on a number of my recordings from Tasmania, and most either return no result or return an incorrect one (the majority I tested are RG). I tested both models on each recording.

However for a few of them it was spot on :)

I’ve been using whoBIRD to get an ID for my recordings, then validating the ID by comparing my recordings with audio on xeno-canto and other bird ID websites. Your model seems more convenient.

I also tested on a machine with Windows 10, and it worked well in all browsers (Chrome, Brave, Edge, Firefox). While running BirdNET, the extension might require around 500 MB of RAM (and Perch ~2 GB, as already mentioned). If you provide further information about your machine, your specific Windows version, and your Chrome version, I’ll try to investigate the issue. I would suggest continuing our conversation outside this post, though. You can send an e-mail to info@biodiversica.xyz or, if you’re a GitHub user, describe the issue here.

Thanks for replying. I wrote some more details about my setup on GitHub.

These do not work for ultrasound i.e., bats. Just a warning. :wink:

Love seeing Perch 2.0 integration.

Also, thanks for open-sourcing it. I will use it as inspiration to see what other ONNX models I can take for a spin.

Indeed, the maximum frequencies covered by the currently available models are:

  • Perch v2.0: 16 kHz
  • BirdNET v2.4: 15 kHz

Do you know BattyBirdNET? It is a BirdNET adaptation for classifying bat sounds (with custom classifiers for specific regions of Europe and the US). It basically slows down the audio so that the bat sounds fit within the BirdNET frequency range (this also changes the analysis window, which covers a shorter stretch of the original recording). I don't know how well it performs, so let me know if you have tested it or heard any feedback about it.
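
The slow-down arithmetic can be sketched as follows. This is only an illustration of the general time-expansion idea, not BattyBirdNET's code: the `time_expansion` function is hypothetical, and the 256 kHz recording rate and 48 kHz model rate are assumed values. The 3 s window and 15 kHz limit are the BirdNET figures mentioned in this thread.

```python
def time_expansion(
    recording_rate_hz: float,
    model_rate_hz: float,
    model_window_s: float,
    model_max_freq_hz: float,
) -> dict[str, float]:
    """Work out what a model 'sees' when audio is replayed at a lower sample rate."""
    # Reading samples recorded at recording_rate_hz as if they were sampled at
    # model_rate_hz slows the audio down by this factor.
    factor = recording_rate_hz / model_rate_hz
    return {
        "factor": factor,
        # One model window then covers a shorter stretch of the real recording.
        "real_window_s": model_window_s / factor,
        # The model's frequency ceiling maps to a higher real-world frequency.
        "real_max_freq_hz": model_max_freq_hz * factor,
    }


# Assumed example: a 256 kHz bat recording fed to a model expecting 48 kHz audio.
example = time_expansion(256_000, 48_000, model_window_s=3.0, model_max_freq_hz=15_000)
for key, value in example.items():
    print(f"{key}: {value:.4g}")
```

Under these assumed rates, the audio is slowed roughly 5.3 times, each 3 s model window covers a bit more than half a second of the original recording, and the 15 kHz ceiling corresponds to about 80 kHz in the original.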

I am curious to test it as a custom model in the extension, but that is not possible in the current release; I would have to make some changes to the code first.