this doesn't seem to work on short files, about 3s or under? #87
Comments
Please provide some details as mentioned in the issue template. |
I'm running this command:
And I find that short files, less than 3 seconds or so, don't get normalized. This may be an artefact of the algorithm needing more samples to work?
ProductName: Mac OS X
Python 2.7.10 (default, Oct 6 2017, 22:29:07)
ffmpeg version 4.0.2 Copyright (c) 2000-2018 the FFmpeg developers
DEBUG LOG:
Ross-MBP:audio_clips rossarnott$ ffmpeg-normalize W5S1-Rest-3-9-Introduction.m4v -c:a aac -nt ebu -t -5 -f --debug -o processed_audio/W5S1-Rest-3-9-Introduction2.m4v
DEBUG: Found audio stream at index 0 |
In the log it says that an output file was written. Is this file the same as the input file, or silent, or...? It could be that the EBU-type normalization requires more input, but I'd have to check. |
Thanks for the quick response! The output file is the same amplitude as the input file for the example given. The longer files get normalised, which in this case is mostly making them significantly louder. The short files don't get changed. I'm batch processing dozens of files and I end up with a few (the short ones, it seems) at significantly different volume levels. |
OK, thanks for clarifying this. I'll see if there's a way to tune the parameters to make it work for small files. If not I'll have to at least print a warning. |
I can tell you this: it is an ffmpeg thing. In their docs and even on their
mailing list it's stated that ffmpeg needs at least 30 seconds of sound to be
able to do any kind of normalization. There's no way around it other than
adding however many seconds of silence to the end of the file to pad it to the
1-minute mark.
|
Can you please point to a reference for your claim that ffmpeg requires at least 30 seconds of audio material to be able to normalize a file? |
Fair enough. Empirically it looks like ffmpeg-normalize does actually work on files of about 4s or longer, but that's not exactly a scientific test and it could be luck or depend on the audio content. |
You can use the option to print the statistics and inspect the loudness before and after. But that's not a proper solution either. I'll see what I can do. |
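As a rough illustration of inspecting the loudness before and after, here is a minimal Python sketch that calls plain ffmpeg rather than ffmpeg-normalize's own statistics option. It assumes the `loudnorm` filter's `print_format=json` output, which ffmpeg writes to stderr; the helper names `measure_command` and `parse_loudnorm_stats` are made up for this example.

```python
import json
import re


def measure_command(path, target_i=-23.0):
    """Build an ffmpeg command that only analyses the file (no output is written)."""
    return [
        "ffmpeg", "-hide_banner", "-nostdin", "-i", path,
        "-af", f"loudnorm=I={target_i}:print_format=json",
        "-f", "null", "-",
    ]


def parse_loudnorm_stats(stderr_text):
    """Extract the JSON object that loudnorm prints near the end of stderr."""
    match = re.search(r"\{[^{}]*\"input_i\"[^{}]*\}", stderr_text)
    if match is None:
        raise ValueError("no loudnorm statistics found")
    # loudnorm reports every value as a string, e.g. {"input_i": "-27.61", ...}
    return json.loads(match.group(0))
```

Running `subprocess.run(measure_command("clip.m4a"), capture_output=True, text=True).stderr` through `parse_loudnorm_stats` for both the input and the output file gives two `input_i` values to compare. If they are equal, the file was not changed.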
I've known about this for a while but haven't had time to fix it. I should really just fix the ffmpeg filter. Can you leave this open and assign to me? |
I guess that would be way more efficient than me digging through your code. Thanks! |
Seems I can't assign you, unless you're a collaborator or the OP. I'll leave it open and assign to me in the meantime. |
@kylophone Hoping for some input before putting shovels in the ground: Is file-length the only issue here? Do you expect a scripted solution that pads a file with 5 seconds of blank audio, runs loudnorm, and then strips the blank audio to work? |
The problem has to do with the definition of Integrated Loudness in BS 1770 / EBU R128. IL by definition needs at least 3 seconds. I haven't had a chance to look but padding with silence should work, I think. |
That's as much of a go-ahead as I need. It'll be at least a couple of weeks before I try this, but I'll report back. Thanks. |
For anyone interested, the steps outlined in my issue work just fine:
Pad the audio with ten seconds of silence at the beginning (necessary because of this bug).
Get loudness data from the file.
Feed the loudness data back into the normalization algorithm for better results.
Remove the ten seconds of silence.
This could be added to this library as a work-around for the upstream bug. |
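The four steps above can be sketched as ffmpeg filter chains. This is not the exact set of commands from the linked issue, only one possible reading of it in Python: it assumes the `adelay`, `loudnorm`, `atrim` and `asetpts` filters, that the ffmpeg build is recent enough for `adelay`'s `all=1` option, and that `stats` is the JSON dictionary printed by a first `loudnorm` pass. All function names are hypothetical.

```python
PAD_SECONDS = 10  # padding length taken from the comment above


def pad_filter(pad_seconds=PAD_SECONDS):
    # adelay takes milliseconds; all=1 applies the delay to every channel.
    return f"adelay=delays={pad_seconds * 1000}:all=1"


def measure_filter(target_i, pad_seconds=PAD_SECONDS):
    # Pass 1: pad, then let loudnorm print its measurements as JSON.
    return f"{pad_filter(pad_seconds)},loudnorm=I={target_i}:print_format=json"


def normalize_filter(target_i, stats, pad_seconds=PAD_SECONDS):
    # Pass 2: pad, feed the measured values back in, then cut the padding off.
    loudnorm = (
        f"loudnorm=I={target_i}"
        f":measured_I={stats['input_i']}"
        f":measured_TP={stats['input_tp']}"
        f":measured_LRA={stats['input_lra']}"
        f":measured_thresh={stats['input_thresh']}"
        f":offset={stats['target_offset']}"
        ":linear=true"
    )
    trim = f"atrim=start={pad_seconds},asetpts=PTS-STARTPTS"
    return f"{pad_filter(pad_seconds)},{loudnorm},{trim}"
```

Each returned string would be passed to ffmpeg with `-af`. The first pass can write to `-f null -`, and the second writes the real output file.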
Thanks for sharing this. I have to admit, I'm not in favor of adding functionality to automatically pad and truncate the audio streams. That always bears potential for issues with audio-video sync. I'd rather just provide a warning when the audio stream is < 3 s and link to a FAQ entry. |
The warning pointing to this issue appears even when using RMS normalization ( |
The method presented by @NiloCK works nicely, except that it can clip the file a little bit, e.g.: I am not an ffmpeg expert, but after some experiments I have concluded that the following step
works only with 2048-sample accuracy. Thus for my 16 kHz sound files I use 16 seconds of padding (16 s = 256 kS, the least common multiple of 16k and 2048) to avoid clipping. |
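The padding arithmetic in the comment above can be checked with a few lines of Python. This sketch takes the 2048-sample granularity reported there as given (it is an empirical observation from this thread, not a documented ffmpeg guarantee) and computes the shortest padding that is both a whole number of seconds and a whole number of 2048-sample blocks.

```python
from math import gcd


def padding_samples(sample_rate, block=2048):
    """Least common multiple of the sample rate and the block size."""
    return sample_rate * block // gcd(sample_rate, block)


def padding_seconds(sample_rate, block=2048):
    """The same padding expressed in whole seconds."""
    return padding_samples(sample_rate, block) // sample_rate


print(padding_samples(16000))  # 256000
print(padding_seconds(16000))  # 16
```

Under the same assumption, 48 kHz audio would also need 16 seconds, while 44.1 kHz audio would need 512 seconds, which may make this approach impractical at that rate.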
That would make this otherwise wonderful tool useless for shorter clips, which, as evidenced by the existence of this bug, people need. One use case for shorter audio clips is normalizing single spoken words when learning a language, as seen here. In this use case, audio normalization is important but the ability to sync is not. Therefore I suggest implementing the warning that audio may not sync properly after normalization, while still allowing the pad-then-truncate to happen. |
@dotancohen Thanks for your feedback. I'm not against such a feature per se, it's just that it is a bit of additional work and may lead to files out of sync, so it needs to be well-tested. I'll look into how to implement it, but I can't give you an ETA on it, unfortunately. |
@5tan This way the codec will be preserved instead of copied (the exact difference I was not able to understand so far 👍 ) Yes, (when using the approach with Honestly, I did not fully understand what was going on, but I have a table if someone wants to experiment with it more 😄 |
Another potentially useful distinction is that there are no sync issues on pure audio files (ie, non-videos). From this thread, it looks like most people running into this bug are normalizing single spoken words, which is much more likely to be audio than video. Clipping issues notwithstanding (thanks to everyone who pointed this out), I think a "better" fix for THIS utility might be to keep throwing that error for <3s video files, but do a pad-and-truncate hack on audio files and spit out a warning. Would you consider a PR that adds this behavior? Heck, vine doesn't even exist anymore! (although, honestly, some ex-vine content processing people are exactly the ones who have a fully-baked solution to this problem!) |
I agree that this is the best solution, given the use cases stated above. |
Yes, that seems like a useful solution. It should apply to audio-only files then, which would make the processing easier. |
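A minimal sketch of the behavior proposed above, assuming the file has already been probed with `ffprobe -show_streams -show_format -of json` and the result parsed into a dictionary. The function name, the returned labels and the 3-second constant are illustrative only and do not reflect how ffmpeg-normalize is implemented.

```python
MIN_SECONDS = 3.0  # threshold discussed in this thread


def choose_action(probe):
    """Decide how to treat a file, given parsed ffprobe JSON output."""
    types = {s.get("codec_type") for s in probe.get("streams", [])}
    duration = float(probe["format"]["duration"])
    if duration >= MIN_SECONDS:
        return "normalize"
    if "video" in types:
        # Padding and trimming could break audio-video sync, so only warn.
        return "warn"
    return "pad_and_truncate"
```

One caveat: embedded cover art in audio files is also reported as a video stream, so a real check would need to account for that.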
Can this be reproduced reliably somehow? I could not find any input sample to test. |
Thanks! I guess in particular this one: FFmpeg/FFmpeg@36572a0 I will leave this open until I get time to test that. I will leave the warning in until this fix lands in a specific ffmpeg version. |
Maybe, maybe not. There is also a fix for the report of 0.0 for LRA on short audio, and I will look into posting that too. |
Can the current warning for <3s files be ignored somehow? Or maybe it's been solved? |
This fix should be in FFmpeg v6.0 or higher. I will close this issue for now. |
if it's fixed, then why does it show warning redirecting here
|
Because I forgot to remove it. It no longer shows a warning now. |
Hey there, sorry but I just want to clarify... I am trying to get short (< 3s) spoken word audio to normalize to around -14 LUFS, is this supported or not? Cheers thanks. |
This should work better now. Just make sure to use a recent ffmpeg version. |
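To check whether an installed ffmpeg is "recent" in the sense used above (v6.0 or higher, per the earlier comment), one option is to parse the first line of `ffmpeg -version`. The sketch below is an assumption about how such a check could look, with hypothetical function names. It returns `None` when the version string is not a plain release number.

```python
import re


def parse_ffmpeg_version(banner):
    """Return (major, minor) from the first line of `ffmpeg -version`, or None."""
    match = re.match(r"ffmpeg version n?(\d+)\.(\d+)", banner)
    if match is None:
        return None  # e.g. git snapshot builds use a non-numeric version string
    return int(match.group(1)), int(match.group(2))


def has_short_file_fix(banner, minimum=(6, 0)):
    """True/False for release builds, None when the version cannot be determined."""
    version = parse_ffmpeg_version(banner)
    return None if version is None else version >= minimum
```

For the banner quoted in the original report (`ffmpeg version 4.0.2 ...`), `has_short_file_fix` returns `False`.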