Issues with quality of sound recordings

I have been both recording recently with the iNaturalist iPhone app and doing identifications of sounds. It seems clear to me that something has changed in the sound recording component that has greatly degraded the quality. Many of the recordings are so bad as to be unidentifiable. Things were not the same a year or so ago. Recordings used to be of reasonable quality. Please fix!

–bug report template added by moderator–

Please fill out the following sections to the best of your ability. It will help us investigate bugs if we have this information at the outset. Screenshots are especially helpful, so please provide those if you can.

Platform (Android, iOS, Website):

App version number, if a mobile app issue (shown under Settings or About):

Browser, if a website issue (Firefox, Chrome, etc) :

URLs (aka web addresses) of any relevant observations or pages:

Screenshots of what you are seeing (instructions for taking a screenshot on computers and mobile devices: https://www.take-a-screenshot.org/):

Description of problem (please provide a set of steps we can use to replicate the issue, and make as many as you need.):

Step 1:

Step 2:

Step 3:

Please include links to observations. Without specific information, there’s not much anyone can do. This is why we ask folks to fill out the new bug report template when making a new topic in Bug Reports.

We id and observe sounds. Our sound observations are done mostly in inat’s iphone app.

Here’s a link to our sound observations from the past year (mostly, not all, on inat iphone app). We haven’t noticed any quality difference or any difference in id rate.

https://www.inaturalist.org/observations?d1=2023-06-07&place_id=any&sounds&subview=table&user_id=wildwestnature&verifiable=any

earlier discussion started by dan_johnson (but still without any specific examples): https://forum.inaturalist.org/t/sounds-recorded-with-app-warbled/51838

I’ve switched over to the BirdNET Android app - https://youtu.be/J8-sl3T8BH8?si=OtudtKbJppKBpDZp

example: https://www.inaturalist.org/observations/220567594

I changed not so much because the canned app was bad - this is just better! I didn’t mention it, but you can actually select numerous portions from the same recording.

I’ve had a semi-similar type of issue on the Android app: https://forum.inaturalist.org/t/audio-recorded-in-app-on-lg-g6-is-worse-quality-than-audio-recorded-otherwise/49551
I have no point of reference for how things were a year ago though, so I don’t know if it also changed from okay to bad on Android.

Here is an example where the warbling recording artifact is as loud as the sound that’s the object of the recording (Swamp Cicadae): https://www.inaturalist.org/observations/220533584. At the end of the recording is a Superb Dog Day Cicada that is reasonably clear.

I did a couple of test recordings last night. Both are as follows, with the first recording in the iNaturalist app and the 2nd in VoiceMemos.

https://www.inaturalist.org/observations/221507528
https://www.inaturalist.org/observations/221507517

In both cases, it wasn’t that bad, but you do hear warbling in the iNaturalist app that you don’t hear in the VoiceMemos app. I have had a number of recordings like the first recording where the warbling was quite loud compared to the organism I was trying to record. I suspect that recording of quieter sounds is where the problem occurs mostly.

My iNaturalist version is 3.3.2, build 716, on iOS 17.5.1. franpfer confirmed that his recording was also made with the iNaturalist app, with no editing after recording.

i don’t hear any obvious differences in the quality of the recordings. looking at the files in your observations, they all look to be compressed audio. the iNaturalist iOS app uses an MPEG-4 AAC Low Complexity codec at a 44.1kHz sample rate, and it seems to have used those settings for quite a while now. I’m not sure which codec is used when you record directly from the VoiceMemos app, but it seems like it’s one of the MPEG-4 AAC codecs at 48kHz. so the quality should be similar. (the higher 48kHz sample rate shouldn’t matter since the recorded audio doesn’t go much above 16kHz in any of the recordings, and although the VoiceMemos app does seem to offer an option to record lossless audio, it doesn’t look like your files are lossless.)
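As a rough illustration of why the 44.1kHz vs 48kHz difference shouldn’t matter here: sampled audio can only represent frequencies up to half its sample rate (the Nyquist limit), and both rates leave plenty of room above 16kHz. The sketch below is just that arithmetic; the function names are made up for illustration and have nothing to do with either app’s code.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2


def can_capture(sample_rate_hz: float, highest_content_hz: float) -> bool:
    """True if content up to highest_content_hz fits under the Nyquist limit."""
    return highest_content_hz <= nyquist_hz(sample_rate_hz)


if __name__ == "__main__":
    # 16 kHz is roughly the highest content mentioned for these recordings.
    for rate in (44_100, 48_000):
        print(rate, nyquist_hz(rate), can_capture(rate, 16_000))
```

Both rates pass the check, so any audible difference between the two apps would have to come from something other than the sample rate, such as the codec or its bitrate.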

i do hear some digital compression artifacts in this recording, but there’s a lot going on in this recording, including handling noise.

that said, assuming these recordings were made using the stock microphone on a mobile phone, none of these are what i would describe as particularly poor quality for the hardware.

in other words, i don’t see any obvious problem here.

if you’re hearing differences in playback in the iNat app vs the VoiceMemos app, then maybe it’s just that those apps play audio back slightly differently. can you play back the exact same file in both apps, and see if there’s any difference in the playback between apps?

this seems like a legitimate problem, but i assume the issue in your case is something that is device-specific – like maybe the default recording settings for your default recording app are just really low, and maybe the iNat app just uses the default settings.

i finally figured out how to record stuff on my Android iNat app (i had to grant microphone privileges to my default sound recording app), and i don’t see any problems when recording via the iNat app.

maybe try to set up another app as your default sound recording app, and see if that helps.

I have no experience recording with the iNat apps, but while identifying bird sounds, I’ve noticed that low sound quality has become more common in recent years. (Just a gut feeling, no hard data.) Maybe some cheaper phone models have low-quality mics? Or perhaps casual iNat users, who are becoming more common, pay less attention to sound quality, posting very poor recordings that more experienced users wouldn’t share?

Here are some examples of low-quality and/or very faint sounds, all except one submitted with the Android app:

https://inaturalist.laji.fi/observations/206590099
https://inaturalist.laji.fi/observations/169346009
https://inaturalist.laji.fi/observations/204266722
https://inaturalist.laji.fi/observations/181886706

(I think the biggest problem with identifying sounds on iNat is that their volume isn’t normalized: one recording might be very faint, requiring full volume to hear anything, and the next one very loud, hurting my ears.)
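For anyone curious what normalizing volume means in practice, here is a minimal sketch of peak normalization, assuming the audio has already been decoded to floating-point samples between -1.0 and 1.0. It is not something iNat does, and `peak_normalize` and `target_peak` are hypothetical names. Peak normalization only matches the loudest sample, so it is a cruder fix than true loudness matching.

```python
def peak_normalize(samples, target_peak=0.9):
    """Scale float samples (nominally -1.0..1.0) so the loudest one hits target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence or empty input: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


if __name__ == "__main__":
    faint = [0.01, -0.02, 0.015]   # a very quiet recording
    loud = [0.5, -1.0, 0.75]       # a recording that already hits full scale
    # After normalization both recordings share the same peak level.
    print(max(abs(s) for s in peak_normalize(faint)))
    print(max(abs(s) for s in peak_normalize(loud)))
```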

based on what i hear and also what i see in the frequencies captured in their other recordings, it looks like this user’s audio recording hardware is malfunctioning – like a bad wire somewhere or maybe a malfunctioning microphone / headphone jack.

these are similar to @ngoomie’s reported issue, where, for some reason, these files are recorded only at 8000 Hz sampling rate. both of these phones appear to be Huawei models, which is different from ngoomie’s LG.
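To put the 8000 Hz figure in context: a file sampled at 8000 Hz cannot contain anything above 4000 Hz, and a lot of bird and insect sound sits above that. The sketch below shows one way to read the sample rate from a file header with Python’s standard `wave` module. It assumes the recording has first been converted to WAV (the files discussed here may well be in another container), and it writes a silent stand-in file just so the example runs on its own.

```python
import os
import tempfile
import wave


def wav_sample_rate(path):
    """Return the sample rate (Hz) stored in a WAV file's header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def write_silent_wav(path, sample_rate_hz, seconds=1):
    """Write a mono 16-bit WAV of silence, purely as a stand-in test file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00\x00" * sample_rate_hz * seconds)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.wav")
        write_silent_wav(path, 8000)
        rate = wav_sample_rate(path)
        print(rate, "Hz sample rate ->", rate / 2, "Hz highest representable frequency")
```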

this is recorded on an iPhone, and has some processing or compression artifacts, similar to what dan_johnson suggests is an issue.

based on what i’m hearing, i think the “problem” is really just related to hardware limitations and maybe recording technique – probably not related to any specific issues in the iNat app.

I’ve started mostly recording my sounds via VoiceMemos on my iPhone because I’ve felt the sound quality is noticeably better than in the iNaturalist app. But after uploading the observations and attaching the sounds, it was clear the sound quality was degraded compared to my original recordings. Here is an example observation: https://www.inaturalist.org/observations/222597396. The original is on Google Drive here: https://drive.google.com/file/d/1Hz5ZZbwgLrZ3eUtkbtfbLXAFUtTqyNVb/view?usp=sharing. Google doesn’t know how to play it though, so I converted it to a WAV file. It sounds the same as the original to me: https://drive.google.com/file/d/1bUXEFVG0V3UI1ZDgfODJMXgKpJeE8P7F/view?usp=drive_link. Looking at the numbers, my original recording is 724 KB, but the recording downloaded from iNaturalist is 94 KB. That’s 13% of the original file size. To me, the sound quality is starkly inferior. The warbling, presumably from compression, is clear.
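The 13% figure can be checked with a couple of lines; `size_ratio_percent` is just an illustrative helper. A smaller file does not by itself prove audible degradation, since lossy codecs are designed to shrink files a lot, but a ratio this low shows that heavy compression was applied somewhere.

```python
def size_ratio_percent(original_kb, processed_kb):
    """Size of the processed file as a percentage of the original."""
    return processed_kb / original_kb * 100


if __name__ == "__main__":
    # File sizes quoted in the post above.
    print(f"{size_ratio_percent(724, 94):.1f}% of the original size")
```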

I know photos are also compressed to a degree, but it’s fair to say that most people couldn’t tell the difference between compressed ones and original ones. Shouldn’t sound be handled in the same way? Shouldn’t the compression be kept at a level where there is no detectable audible difference from the original?

Were they uploaded through the iPhone app or website?

You could add the file somewhere like Google Drive or Box, set it to public view, and share the link.

So you recorded the sound in Voice Memos, then uploaded it to an existing observation via the iNat website, I presume?

To do the observation, I did a “No Media” observation with the iPhone iNaturalist app, uploaded it, then went to the observation on the website, then did drag and drop onto the observation.

Good idea. I loaded it on Google Drive and edited my original post to include it.

I thought it was just me noticing iNat recordings were lower quality. I am not sure why they would be, unless the app is intentionally compressing them somehow.

ok. prior to this, you seemed to be describing a situation where the iOS iNat app records lossy compressed audio, whereas your iOS VoiceMemo app is theoretically capable of recording lossless compressed audio. it wasn’t obvious from the information that you provided previously that you were recording lossless compressed audio in VoiceMemo, but your new information does suggest that you probably are recording lossless here.

but then on top of the iOS iNat lossy vs VoiceMemo lossless difference, there’s probably a second thing related to processing on the iNat server when sound files are uploaded.

as of late 2019, it looks like uploaded sound files which are not WAV or MP3 are sent through a thing that converts them to a common (AAC?) format in an M4A container. in mid 2020, there were additional changes to reduce the amount of compression that is applied in this process (to the minimum compression).

so this is probably where potential problems could be introduced upon upload. however, the timing of the code changes doesn’t seem to fit the description of a reduction in sound quality starting “a year or so ago”.

i didn’t dig around deep enough to figure out exactly which version of ffmpeg and which codecs are being used for this conversion, but unless one of those changed within the last year or so, i don’t think there would be any other way something would have changed recently. if there is a problem, it likely would have been a problem for about 5 years now.

i don’t load many non-WAV/-MP3 files, but i did find a case from 2022 where i loaded an M4A file, and i do see that in the observation, the sound file was actually resaved as a higher bitrate file than the original (probably since the original file would have been somewhat heavily compressed, and the conversion during upload would have re-saved it with minimal compression). i reuploaded the original file again to the observation, and the resulting second audio file was similar to the file on the previous observation (though not exact), suggesting that no major changes in the way the system does that conversion had occurred since then.

your original recording is lossless compressed audio (ALAC in a M4A container). however, because it’s not WAV, the file gets converted to lossy compressed audio (AAC in M4A) when you upload it to iNat.

comparing the file in the observation vs your original file, i can see that probably the biggest difference is that you’ve lost most frequencies above about 16kHz, and then there are some compression artifacts on top of that, i think.

i’m a little surprised that AAC with minimum compression would produce a file that is noticeably different, but maybe additional tweaking of settings or maybe a better codec could make the differences less noticeable. i’m not sure why there is that process to convert stuff into AAC. maybe because MPEG4 files could contain many different types of audio, some folks run into issues playing some MPEG4 files, and saving them all as AAC makes them (more) universally playable?

for non-WAV/-MP3 files, i think it should be possible to determine what the underlying audio codec is. so if a file was recorded as ALAC, and if ALAC isn’t considered acceptable for universal playability, maybe it could be converted to WAV instead of AAC?
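To make that idea concrete, here is a rough sketch of what such a check could look like, using `ffprobe` to read the codec name and `ffmpeg` to convert. This is not iNat’s actual server code. The policy in `choose_target`, the `LOSSLESS_CODECS` set, and all file names are assumptions for illustration. The example only builds and prints the commands, so it runs without ffmpeg installed; `detect_codec` would need ffprobe on the PATH.

```python
import subprocess

# Assumed set of lossless codec names as reported by ffprobe; illustration only.
LOSSLESS_CODECS = {"alac", "flac"}


def probe_codec_cmd(path):
    """Build an ffprobe command that prints the first audio stream's codec name."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "a:0",
        "-show_entries", "stream=codec_name",
        "-of", "default=noprint_wrappers=1:nokey=1",
        path,
    ]


def detect_codec(path):
    """Run ffprobe (must be installed) and return the codec name, e.g. 'alac'."""
    result = subprocess.run(
        probe_codec_cmd(path), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def choose_target(codec_name):
    """Hypothetical policy: keep lossless sources lossless by converting to WAV."""
    return "wav" if codec_name.strip().lower() in LOSSLESS_CODECS else "aac"


def convert_cmd(src, dst_stem, target):
    """Build an ffmpeg command for the chosen target format."""
    if target == "wav":
        # 16-bit PCM; a 24-bit source would need pcm_s24le to avoid losing depth.
        return ["ffmpeg", "-i", src, "-c:a", "pcm_s16le", dst_stem + ".wav"]
    return ["ffmpeg", "-i", src, "-c:a", "aac", dst_stem + ".m4a"]


if __name__ == "__main__":
    # Show what would be run for an ALAC source, without invoking ffmpeg.
    target = choose_target("alac")
    print(convert_cmd("original.m4a", "converted", target))
```

The tradeoff is file size: WAV output from a lossless source would be several times larger than the AAC version, which may be part of why everything is funneled into AAC today.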

This latest comment was originally posted in General, but the admins merged it into this thread.

Many thanks for your detailed analysis of the issues!

We happened to have recorded the same bird in Merlin and on the iNat app one minute apart, so we uploaded both audios to the same observation https://www.inaturalist.org/observations/221341419, and then added a third audio, which is the Merlin recording edited in Audacity to remove finger noises at beginning and end and also normalized.

We often observe with audio recorded directly in iNat and also record in Merlin and edit through Audacity. We only rarely record audio on iNat, upload it, download the audio file and edit it in Audacity (to reduce background noise, or amplify, or remove loud noises, etc.). Via these three avenues, we have noticed no change in quality during the past year.

One characteristic of this observation is that the normalized audio is, as @mikkohei13 describes, much, much louder than the other two files.

We do not identify audio using headphones or earbuds because we are sensitive to the volume changes and to loud noises. Our recordings via the iNat app usually have loud noises at beginning and end from our fingers starting and stopping the recording. We have a protective case on our phone (always have) and that may affect the finger noises.