Add missing Rubber Band padding, preventing it from eating the initial transient #11120
Changes from 6 commits
dabf1d4
d4ad20d
e1db59a
5d4f5c6
3fec0ea
1769fea
4ebe88f
78f5252
755a4a4
9b8e82a
b73bdc0
c108ac9
d3aa210
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,5 @@ | ||
#include "engine/bufferscalers/enginebufferscalerubberband.h" | ||
|
||
#include <rubberband/RubberBandStretcher.h> | ||
|
||
#include <QtDebug> | ||
|
||
#include "control/controlobject.h" | ||
|
@@ -19,7 +17,7 @@ namespace { | |
// TODO (XXX): this should be removed. It is only needed to work around | ||
// a Rubberband 1.3 bug. | ||
// This is the default increment from RubberBand 1.8.1. | ||
size_t kRubberBandBlockSize = 256; | ||
constexpr size_t kRubberBandBlockSize = 256; | ||
|
||
#define RUBBERBANDV3 (RUBBERBAND_API_MAJOR_VERSION >= 2 && RUBBERBAND_API_MINOR_VERSION >= 7) | ||
|
||
|
@@ -28,22 +26,16 @@ size_t kRubberBandBlockSize = 256; | |
EngineBufferScaleRubberBand::EngineBufferScaleRubberBand( | ||
ReadAheadManager* pReadAheadManager) | ||
: m_pReadAheadManager(pReadAheadManager), | ||
m_buffer_back(SampleUtil::alloc(MAX_BUFFER_LEN)), | ||
m_buffers{mixxx::SampleBuffer(MAX_BUFFER_LEN), mixxx::SampleBuffer(MAX_BUFFER_LEN)}, | ||
m_bufferPtrs{m_buffers[0].data(), m_buffers[1].data()}, | ||
m_interleavedReadBuffer(MAX_BUFFER_LEN), | ||
m_bBackwards(false), | ||
m_useEngineFiner(false) { | ||
m_retrieve_buffer[0] = SampleUtil::alloc(MAX_BUFFER_LEN); | ||
m_retrieve_buffer[1] = SampleUtil::alloc(MAX_BUFFER_LEN); | ||
// Initialize the internal buffers to prevent re-allocations | ||
// in the real-time thread. | ||
onSampleRateChanged(); | ||
} | ||
|
||
EngineBufferScaleRubberBand::~EngineBufferScaleRubberBand() { | ||
SampleUtil::free(m_buffer_back); | ||
SampleUtil::free(m_retrieve_buffer[0]); | ||
SampleUtil::free(m_retrieve_buffer[1]); | ||
} | ||
|
||
void EngineBufferScaleRubberBand::setScaleParameters(double base_rate, | ||
double* pTempoRatio, | ||
double* pPitchRatio) { | ||
|
@@ -142,7 +134,7 @@ void EngineBufferScaleRubberBand::clear() { | |
VERIFY_OR_DEBUG_ASSERT(m_pRubberBand) { | ||
return; | ||
} | ||
m_pRubberBand->reset(); | ||
reset(); | ||
} | ||
|
||
SINT EngineBufferScaleRubberBand::retrieveAndDeinterleave( | ||
|
@@ -151,22 +143,25 @@ SINT EngineBufferScaleRubberBand::retrieveAndDeinterleave( | |
SINT frames_available = m_pRubberBand->available(); | ||
SINT frames_to_read = math_min(frames_available, frames); | ||
SINT received_frames = static_cast<SINT>(m_pRubberBand->retrieve( | ||
m_retrieve_buffer, frames_to_read)); | ||
m_bufferPtrs.data(), frames_to_read)); | ||
We need to read frames_to_read + frame_offset here.
Fixed.
No, did you forget to push a commit? The function expects
See 755a4a4.
The linked code makes the function return fewer frames than it could have read. So we should not clamp the value to frames, but to frames + frame_offset in line 132.
No we should not. I can't check what
It can only read a small window of output at a time. Presumably this is 1/4th of the FFT window size, so 1024 samples. That means the first two calls to this function (each producing 1024 samples of garbage output) should be thrown away.
I added a bit of debug code that shows the issue after a seek:
In the second call, we read only 256 frames of the available 512. The available frames stack up before they are reduced again. We need to read all available frames in that case. This needs to be fixed.
Oh yeah I see what you mean now! Fixed in d3aa210.
|
||
|
||
DEBUG_ASSERT(received_frames <= static_cast<ssize_t>(m_buffers[0].size())); | ||
|
||
SampleUtil::interleaveBuffer(pBuffer, | ||
m_retrieve_buffer[0], | ||
m_retrieve_buffer[1], | ||
received_frames); | ||
m_buffers[0].data(), | ||
m_buffers[1].data(), | ||
received_frames); | ||
return received_frames; | ||
} | ||
|
||
void EngineBufferScaleRubberBand::deinterleaveAndProcess( | ||
const CSAMPLE* pBuffer, SINT frames, bool flush) { | ||
DEBUG_ASSERT(frames <= static_cast<ssize_t>(m_buffers[0].size())); | ||
When I try to compile this locally on Windows with VS2022, it fails with an error that the identifier ssize_t is undefined.
Oh I didn't realize
Afaik it's part of the C standard library. I guess you could also
It doesn't appear to be. Libusb and the other libraries Mixxx uses that also use
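For context on the thread above: `ssize_t` is a POSIX type (declared in `<sys/types.h>`), not part of ISO C or C++, which is why MSVC does not know it. A minimal sketch of a portable bounds check follows. `FrameCount`, `fitsInBuffer`, and `debugCheckFrames` are hypothetical names used only for illustration; `FrameCount` stands in for Mixxx's `SINT`, and this is not necessarily what the PR ended up doing.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for Mixxx's SINT (assumption: a signed integer type).
using FrameCount = std::ptrdiff_t;

// Returns true if a signed frame count fits into a buffer of bufferSize
// elements. Negative counts are rejected before converting to unsigned,
// so there is no signed/unsigned comparison and no need for ssize_t.
inline bool fitsInBuffer(FrameCount frames, std::size_t bufferSize) {
    if (frames < 0) {
        return false;
    }
    return static_cast<std::make_unsigned_t<FrameCount>>(frames) <= bufferSize;
}

// Debug-only check in the spirit of the DEBUG_ASSERT lines in the diff.
inline void debugCheckFrames(FrameCount frames, const std::vector<float>& buffer) {
    assert(fitsInBuffer(frames, buffer.size()));
}
```

Unlike the cast in the diff, this variant also rejects negative frame counts. If that is not wanted, casting the buffer size to `std::ptrdiff_t` instead is another standard-only option.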
||
|
||
SampleUtil::deinterleaveBuffer( | ||
m_retrieve_buffer[0], m_retrieve_buffer[1], pBuffer, frames); | ||
m_buffers[0].data(), m_buffers[1].data(), pBuffer, frames); | ||
|
||
m_pRubberBand->process(m_retrieve_buffer, | ||
m_pRubberBand->process(m_bufferPtrs.data(), | ||
frames, | ||
flush); | ||
} | ||
|
@@ -199,10 +194,10 @@ double EngineBufferScaleRubberBand::scaleBuffer( | |
read += getOutputSignal().frames2samples(received_frames); | ||
|
||
if (break_out_after_retrieve_and_reset_rubberband) { | ||
//qDebug() << "break_out_after_retrieve_and_reset_rubberband"; | ||
// qDebug() << "break_out_after_retrieve_and_reset_rubberband"; | ||
// If we break out early then we have flushed RubberBand and need to | ||
// reset it. | ||
m_pRubberBand->reset(); | ||
reset(); | ||
break; | ||
} | ||
|
||
|
@@ -221,26 +216,26 @@ double EngineBufferScaleRubberBand::scaleBuffer( | |
iLenFramesRequired = kRubberBandBlockSize; | ||
} | ||
} | ||
//qDebug() << "iLenFramesRequired" << iLenFramesRequired; | ||
// qDebug() << "iLenFramesRequired" << iLenFramesRequired; | ||
|
||
if (remaining_frames > 0 && iLenFramesRequired > 0) { | ||
SINT iAvailSamples = m_pReadAheadManager->getNextSamples( | ||
// The value doesn't matter here. All that matters is we | ||
// are going forward or backward. | ||
(m_bBackwards ? -1.0 : 1.0) * m_dBaseRate * m_dTempoRatio, | ||
m_buffer_back, | ||
getOutputSignal().frames2samples(iLenFramesRequired)); | ||
// The value doesn't matter here. All that matters is we | ||
// are going forward or backward. | ||
(m_bBackwards ? -1.0 : 1.0) * m_dBaseRate * m_dTempoRatio, | ||
m_interleavedReadBuffer.data(), | ||
getOutputSignal().frames2samples(iLenFramesRequired)); | ||
SINT iAvailFrames = getOutputSignal().samples2frames(iAvailSamples); | ||
|
||
if (iAvailFrames > 0) { | ||
last_read_failed = false; | ||
deinterleaveAndProcess(m_buffer_back, iAvailFrames, false); | ||
deinterleaveAndProcess(m_interleavedReadBuffer.data(), iAvailFrames, false); | ||
} else { | ||
if (last_read_failed) { | ||
// Flush and break out after the next retrieval. If we are | ||
// at EOF this serves to get the last samples out of | ||
// RubberBand. | ||
deinterleaveAndProcess(m_buffer_back, 0, true); | ||
deinterleaveAndProcess(m_interleavedReadBuffer.data(), 0, true); | ||
break_out_after_retrieve_and_reset_rubberband = true; | ||
} | ||
last_read_failed = true; | ||
|
@@ -286,3 +281,40 @@ int EngineBufferScaleRubberBand::runningEngineVersion() { | |
return 2; | ||
#endif | ||
} | ||
|
||
void EngineBufferScaleRubberBand::reset() { | ||
m_pRubberBand->reset(); | ||
|
||
// As mentioned in the docs (https://breakfastquay.com/rubberband/code-doc/) | ||
// and FAQ (https://breakfastquay.com/rubberband/integration.html#faqs), you | ||
// need to run some silent samples through the time stretching engine first | ||
// before using it. Otherwise it will add a short fade-in, destroying | ||
// the initial transient. | ||
#if RUBBERBANDV3 | ||
size_t remaining_padding = m_pRubberBand->getPreferredStartPad(); | ||
Swiftb0y marked this conversation as resolved.
|
||
#else | ||
// This _should_ be equal to the latency in older Rubber Band versions: | ||
// https://github.com/breakfastquay/rubberband/blob/c5f99d5ff2cba2f4f1def6c38c7843bbb9ac7a78/main/main.cpp#L652 | ||
size_t remaining_padding = m_pRubberBand->getLatency(); | ||
The input and output padding differ by the playback rate. The interesting commit is this: So we need to do something similar here:
Ah yeah, good catch. Looking at the changes to the actual time stretchers it seems like that's how you get the new
Done.
||
#endif | ||
std::fill_n(m_buffers[0].span().begin(), kRubberBandBlockSize, 0.0f); | ||
Please add a comment that filling with zeros is the correct solution here, because we always fade in from zero after a seek to avoid click sounds. This means that the first half of the buffer needs to be zero anyway. I have verified with a recording in Audacity that this looks good. My first assumption, to use real samples here, is correct, but they would have a gain of zero anyway.
Yes, the idea of the zero padding is that if the STFT process is working correctly, then those samples will not have any impact on the rest of the output. See my comment above. I added a link to the comment in the code.
||
std::fill_n(m_buffers[1].span().begin(), kRubberBandBlockSize, 0.0f); | ||
while (remaining_padding > 0) { | ||
const size_t pad_samples = std::min<size_t>(remaining_padding, kRubberBandBlockSize); | ||
m_pRubberBand->process(m_bufferPtrs.data(), pad_samples, false); | ||
|
||
remaining_padding -= pad_samples; | ||
} | ||
|
||
#if RUBBERBANDV3 | ||
size_t padding_to_drop = m_pRubberBand->getStartDelay(); | ||
#else | ||
size_t padding_to_drop = m_pRubberBand->getLatency(); | ||
#endif | ||
while (padding_to_drop > 0) { | ||
I think we need to call while m_pRubberBand->available() before this loop. At least in the rubberband main, this is not done immediately after filling in the padding. I am afraid dropping at this point is not correct, because we likely need to keep the Rubber Band buffers filled with at least the startDelay() to perform windowing around the current sample.
I was also concerned about how big the internal buffer of rubberband is, since I see here memory allocation: The buffer is initial The default start pads is All this makes kind of sense. To retrieve 1 sample, we need half a window of samples before the sample and half a window of silence after it. In the case of Mixxx, after seeking within a track, it is not true that these samples are zero. So I think it is required to use the real samples before the start instead of zeros.
The confusing point is that we also have the available() function. In theory there must not be any output available before we have fed in more than a whole m_aWindowSize. Is this the case? I think the initial getSamplesRequired() returns m_aWindowSize.
The padding is half of the window size, so that should definitely fit in Rubber Band's buffers. And thanks to the
That was part of my experiments in solving #11125, but it's a bit more difficult than it seems. Especially because the read ahead manager is shared by multiple things and reading data through it invokes a bunch of side effects. Older samples also may not be cached (I've noticed this causing issues when beat jumping somewhere for the first time that weren't present when not reading past samples). I have not yet had time to dig into this further, so if you have suggestions for how to compensate for the latency here without messing with the rest of the deck's state (so switching back and forth between linear interpolation and Rubber Band time stretching should be seamless and not shift any timings around) then that would be great.
Yes, like I mentioned before, they're using an STFT (Short-time Fourier Transform). It's an overlap-add process where you apply a window function to a chunk of the input data, take the discrete Fourier transform of that, process the frequency domain signal, take the inverse discrete Fourier transform of that, apply another window function, and then add that to the output buffer, partially overlapping the previous output. The reason why the required padding is half of the input buffer is because most window functions used for these things are raised cosine windows, like the Hann/Hanning function I linked below. Because you apply (= multiply the chunk of audio with) the window twice, once before the DFT and once after the IDFT, you've effectively applied the square of the window function to the signal if you didn't do anything to the signal in the frequency domain. And if you try adding up squared Hann functions with a quarter period difference between them, you'll notice that after a period the resulting function forms a flat line. That explains why the required padding is half a window, and why you get that smooth fade-in if you don't add the padding. Some Wikipedia links if you're curious about how and why this all works: https://en.wikipedia.org/wiki/Short-time_Fourier_transform
But yeah I guess you are right, we shouldn't be reading and dropping samples immediately. We should be doing the same thing the rubberband cli is doing, and process the audio as normal but drop the first
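The squared-Hann argument in the STFT explanation above can be checked numerically. The following is a minimal sketch, assuming a periodic Hann window and a hop of a quarter window; the window length, the hop, and the helper names `hann` and `overlapAddedGain` are illustrative choices and are not taken from Rubber Band's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Periodic Hann window value at index n for a window of length size.
double hann(std::size_t n, std::size_t size) {
    const double kPi = 3.14159265358979323846;
    return 0.5 * (1.0 - std::cos(2.0 * kPi * static_cast<double>(n) /
                                 static_cast<double>(size)));
}

// Overlap-adds the squared window (analysis window times synthesis window)
// for numWindows windows spaced hop samples apart, starting at sample 0.
// The result is the gain an unmodified signal would see at each sample.
std::vector<double> overlapAddedGain(
        std::size_t size, std::size_t hop, std::size_t numWindows) {
    assert(numWindows > 0);
    std::vector<double> gain(hop * (numWindows - 1) + size, 0.0);
    for (std::size_t w = 0; w < numWindows; ++w) {
        for (std::size_t n = 0; n < size; ++n) {
            const double h = hann(n, size);
            gain[w * hop + n] += h * h;
        }
    }
    return gain;
}
```

With `size = 1024` and `hop = 256`, the gain starts at zero, ramps up while fewer than four windows overlap, and then stays flat at 1.5. That ramp is the fade-in that appears when no padding is fed in first.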
||
const size_t drop_samples = std::min<size_t>(padding_to_drop, kRubberBandBlockSize); | ||
m_pRubberBand->retrieve(m_bufferPtrs.data(), drop_samples); | ||
|
||
padding_to_drop -= drop_samples; | ||
} | ||
} |
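The alternative raised in the review thread, processing audio as normal and discarding the start delay only as output becomes available, could be sketched as follows. `FakeStretcher` is a hypothetical pure delay line, not the Rubber Band API; only its method names mirror the `process`, `available`, and `retrieve` calls in the diff. `retrieveSkippingDelay` is likewise an illustrative helper and not code from this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for a time stretcher: a mono delay line that emits
// `delay` zero frames before the first input frame reaches the output.
class FakeStretcher {
  public:
    explicit FakeStretcher(std::size_t delay) : m_queue(delay, 0.0f) {}

    void process(const float* in, std::size_t frames) {
        m_queue.insert(m_queue.end(), in, in + frames);
    }

    std::size_t available() const { return m_queue.size(); }

    std::size_t retrieve(float* out, std::size_t frames) {
        const std::size_t n = std::min(frames, m_queue.size());
        std::copy_n(m_queue.begin(), n, out);
        m_queue.erase(m_queue.begin(),
                      m_queue.begin() + static_cast<std::ptrdiff_t>(n));
        return n;
    }

  private:
    std::deque<float> m_queue;
};

// Retrieves up to `frames` frames into `out`, first discarding whatever part
// of the start delay is still owed. `framesToDrop` persists between calls.
// Returns the number of useful frames written. The scratch vector is allocated
// here for brevity; real-time code would use a preallocated buffer.
std::size_t retrieveSkippingDelay(FakeStretcher& stretcher,
        std::size_t& framesToDrop, float* out, std::size_t frames) {
    std::vector<float> scratch(256);
    while (framesToDrop > 0 && stretcher.available() > 0) {
        const std::size_t n = std::min(
                {framesToDrop, stretcher.available(), scratch.size()});
        framesToDrop -= stretcher.retrieve(scratch.data(), n);
    }
    if (framesToDrop > 0) {
        return 0; // Still inside the start delay, nothing useful yet.
    }
    assert(framesToDrop == 0);
    return stretcher.retrieve(out, std::min(frames, stretcher.available()));
}
```

Keeping the remaining drop count as state means the delay is skipped across however many calls it takes, rather than draining the stretcher immediately after padding as the `reset()` shown above does.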
Are you able to update the comment?
Our oldest supported version is 1.8.2 from Ubuntu Focal
As in, completely remove the block size limit?
Yes.
I have double checked, the bug has been fixed in Rubberband 1.7
breakfastquay/rubberband@c26dc1d
By the way, Rubberband will never request more than getPreferredStartPad() * 2
https://github.com/breakfastquay/rubberband/blob/de56cd114a678003dfef17abbd74ebd9203964eb/src/finer/R3Stretcher.cpp#L565
Maybe we can use this info to improve our buffers.
I reverted most of adb034b in 9b8e82a. It Works On My System ™️ with librubberband 3.1.2.