-
-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deprecate low-quality/limited IMA-ADPCM compression from the WAV importer #4264
Comments
IMA ADPCM is super cheap to decode and resample, orders of magnitude more than mp3 or ogg. Not a good idea to remove. You can have hundreds of these decoding at no cost. |
I think a better idea is to just improve/fix/replace the encoder & decoder all together. No (or significantly less) compatibility breaking here. For some reason Godot's encoder (atleast in versions up to and including 3.4.4) in my experience produces results ranging from fine/acceptable, to "this literally sounds like the file was corrupted in export." Some sampling rates tend to sound 8 times worse than they should too (I.E. 32khz sounds much more like 16khz to me). If you export a file with IMA-ADPCM from Audacity, and compare it to a file encoded via Godot, the file from Audacity is going to sound much better. I've included a sample project (using a voice line from Half-Life 2 if that's ok) which shows what I'm talking about (although believe it or not, this is one the less egregious instances I've encountered. If I find a sound that causes this, I'll share it). At the start of the IMA-ADPCM encoded sounds there is some very obvious popping. I've also included some files exported from Audacity to show how the files encoded by Godot should probably sound much more like. I am also wondering if it might be a good idea to basically just use a different ADPCM format altogether (maybe even a custom one lol). |
reduz recently suggested that we look into adding support for FLAC decoding. This would provide the same quality as uncompressed WAV, but with smaller file sizes. This would also benefit games that can load user-provided music (as FLAC has been increasingly replacing MP3 in people's local music libraries). There's a BSD0-licensed library for it: https://github.com/jprjr/miniflac I don't know how much CPU FLAC decoding requires though – probably more than IMA-ADPCM, but maybe less than Ogg Vorbis? |
Note that lossy WAV is an alternative that can be explored if Ogg Vorbis or even MP3 decoding are too expensive in your project. It's compatible with any program that can read WAV files, without requiring dedicated support. |
I just thought it might be because the audio is unfiltered, and that's indeed the case. The encoding is fine, it's a bug. Anyways, sorry for rambling, my final comment is that this leads to a slightly cleaner sound when max rate is set to 24khz than off. |
If FLAC import was added, I'd use that instead of WAV just to save space in the game repository. |
@unfa I posted godotengine/godot#80160 pull request for FLAC import. |
I'd like to mention the existence of ADPCM-XQ, developed by the creator of WavPack. It uses dynamic noise shaping to reduce quantization noise and supports encode lookahead, where samples can go from 0 to 8 which could be exposed as an import option. Even at level 0 it manages to get less audible artifacts than Godot or FFmpeg's implementations. At 5 samples (although going above that occasionally takes forever to encode), the result is usually more faithful to the original when compared to those. And there seems to be no size penalty by using it. The encoding result is readable by any program that supports adpcm_ima_wav. One thing to note, however, is that the repo's README mentions some bugs. But with a bit of adjustments, I think this can be integrated into Godot. |
In addition, I should mention the QOA format. Finished in April 2023, it is a lossy format specifically designed as an alternative to ADPCM:
On the other hand, I don't see any popular software currently adopting the format, likely because it's quite new. |
QOA looks pretty interesting to me, especially if we manage to integrate an encoder in Godot. This way, you can compress your WAVs to QOA on import (and keep your original lossless audio for iterating upon). This means third-party app support doesn't matter here, as only Godot needs to encode and decode the output QOA files. |
My proposal for FLAC support was rejected so I don't know if QOA will be accepted. |
OfftopicIs anyone interested in a custom QOA module (to read QOA files, not convert from WAV)? I kinda tried to write one but couldn't get playback instance and mixing to work. Available here. |
Offtopic (part 2)The QOA module I mentioned above is now somewhat functional if anyone's interested 😃. |
Following the documentation's guide, I was able to compile the engine on it's own, but not with the module. I might try with Visual Studio instead of SCons or maybe even Linux to see if that works. An encoder should definitely be included due to QOA's current lack of support and for the reasons mentioned by @Calinou. |
Every time I have an issue w/ ADPCM I end up here, so I may as well chime in w/ some experience (about a project NOT using Godot): We use ADPCM on most SFX where it doesn't matter, like engine sounds, explosions, bullet pew pew, etc. For music we use Opus. We use adpcm_xq for encoding. We didn't try QOA, but sounds interesting and a better alternative that ticks the same checkboxes.
This is "technically true, BUT..." on mid & low end mobile, OggVorbis/Opus decoding takes around 3-10% of CPU time per stream. This means:
ADPCM is a fixed-size block format. Which means arbitrary seeking is really easy to calculate back and forth. Only special care needs to be taken for the last block because e.g. an audio file with 2000 samples and 1024 samples per block means the last block is gonna have 2000 - 1024 = 976 samples of useful data and 48 samples of garbage. More importantly, if the audio files has 1500 samples, then the first block will have 1024 samples, but the last one will use 512 samples instead of 1024. Here's example code where the wanted sample is at mCurrIndex. /// The returned value is independent of the number of channels.
/// e.g. if the input has 505 samples, it returns 505 whether it's stereo or mono
inline size_t getImaNumSamplesPerBlock( const size_t blockSizeBytes, const size_t numChannels )
{
return ( blockSizeBytes - 4u * numChannels ) * ( numChannels ^ 3 ) + 1u;
}
/// The inverse of getImaNumSamplesPerBlock.
/// (numSamples - 1) must be multiple of 8
inline size_t getImaBlockSize( const size_t numSamples, const size_t numChannels )
{
return ( numSamples - 1u ) / ( numChannels ^ 3 ) + ( numChannels * 4 );
}
const size_t samplesPerBlock =
AudioFormats::getImaNumSamplesPerBlock( mBytesPerBlock, mNumChannels );
const size_t samplesInLastBlock = mNumFrames % samplesPerBlock;
const size_t framesLeft = mNumFrames - mCurrIndex;
const size_t compressedBlockIdx = mCurrIndex / samplesPerBlock;
const size_t uncompressedOffset = mCurrIndex - ( compressedBlockIdx * samplesPerBlock );
size_t blockSizeBytes = mBytesPerBlock;
if( framesLeft <= samplesInLastBlock )
{
const size_t lastBlockNumAdpcmSamples = ( ( samplesInLastBlock + 6u ) & ~7u ) + 1u;
blockSizeBytes = AudioFormats::getImaBlockSize( lastBlockNumAdpcmSamples, mNumChannels );
}
adpcm_decode_block( decodedFrames, mData + compressedBlockIdx * mBytesPerBlock, blockSizeBytes,
mNumChannels );
const int16_t * inputBuffer16 = decodedFrames + uncompressedOffset * mNumChannels; // Decoded data is here |
After a long time I'd like to show my support for a QOA encoder. Since it's a different encoding format from WAV it could be implemented as an alternative importer, similar to how there are different importers for image files (Texture2D, Texture3D, etc.). However, I don't see how encoders can be implemented without duplicating a lot of WAV import code (after all you need to ensure all WAV formats that can be imported can also be converted). If it's done this way, however, the rest is a simple link to QOA's internal encoder (that is, if I understood the code correctly). |
I might have good news.
|
Is it out of the question to reduce the bit depth before encoding? |
Godot itself converts a WAV file from all supported bit depths to 16 bit while importing (except 8-bit), so it's just a matter of spending time to implement it. I was just focused on making the basic converter work. |
This comment was marked as outdated.
This comment was marked as outdated.
How would deprecating IMA-ADPCM work? Throw a warning? Put its code within |
I suggest doing both:
|
Related to #1144.
Describe the project you are working on
The Godot editor 🙂
Describe the problem or limitation you are having in your project
Godot supports several audio formats:
WAV supports lossy compression via IMA-ADPCM, which is a compression technique originally developed in the 1970s:
Unfortunately, this compression is considered very low-quality by today's standards, with Ogg Vorbis or even MP3 performing significantly better in terms of quality. Unlike Ogg Vorbis and MP3, IMA-ADPCM compression also doesn't offer any way to balance between file size and quality with a bitrate control.
In Godot, IMA-ADPCM compression also doesn't support playing audio from a given position.
Describe the feature / enhancement and how it helps to overcome the problem or limitation
Deprecate the IMA-ADPCM encoder and its associated import option in the WAV import dock, but keep the IMA-ADPCM decoder for run-time WAV loading once it's implemented. There are still WAV files out there with IMA-ADPCM compression, and loading those should remain supported.
Any CPU released in the last 12 years can decode several Ogg Vorbis or MP3 streams at the same time (even on mobile), so IMA-ADPCM's comparatively faster decoding speed is not relevant anymore.
Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams
Since Godot 4.0 was released after this proposal was opened, this proposal now only suggests deprecating rather than removing IMA-ADPCM to avoid breaking compatibility.
If this enhancement will not be used often, can it be worked around with a few lines of script?
No.
Is there a reason why this should be core and not an add-on in the asset library?
This is about removing an option from a core importer.
This proposal is about avoiding "footguns" in terms of audio quality, by steering users towards higher quality audio compression options when needed.
The text was updated successfully, but these errors were encountered: