According to the file `common/consts.py`, we know the values of `AUDIO_SHAPE`, `SR`, `FPS`, and `FRAMES_PER_SAMPLE`. From the first three constants, we can compute

`num_frames = AUDIO_SHAPE / SR * FPS = 67267 / 16000 * 15 = 63.06281249999999`

which is about one whole frame less than `FRAMES_PER_SAMPLE`.

We encountered this problem when testing the model on a longer audio sequence, where the misalignment is magnified.
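The arithmetic above can be checked with a short Python sketch. The values of `AUDIO_SHAPE`, `SR`, and `FPS` are taken from the computation quoted in this issue. The value `FRAMES_PER_SAMPLE = 64` is an assumption inferred from "about one whole frame less", and the helper names `frames_covered` and `frame_mismatch` are hypothetical and not part of the repository. The second helper also assumes the mismatch scales linearly with audio length, which is one plausible reading of why longer sequences make it worse.

```python
# Values taken from the computation quoted in this issue.
AUDIO_SHAPE = 67267  # audio samples per example
SR = 16000           # audio sample rate in Hz
FPS = 15             # video frames per second

# ASSUMPTION: not stated in the issue text; 64 is inferred from
# "about one whole frame less than FRAMES_PER_SAMPLE".
FRAMES_PER_SAMPLE = 64


def frames_covered(num_audio_samples, sr=SR, fps=FPS):
    """Video frames actually spanned by the given number of audio samples."""
    return num_audio_samples / sr * fps


def frame_mismatch(num_audio_samples):
    """Gap between the frame count implied by the constants and the true one.

    The implied count treats AUDIO_SHAPE samples as exactly
    FRAMES_PER_SAMPLE frames; the true count uses SR and FPS.
    ASSUMPTION: the gap grows linearly with the audio length.
    """
    implied = num_audio_samples * FRAMES_PER_SAMPLE / AUDIO_SHAPE
    return implied - frames_covered(num_audio_samples)


if __name__ == "__main__":
    print("frames covered by one example:", frames_covered(AUDIO_SHAPE))
    for scale in (1, 10, 100):
        n = scale * AUDIO_SHAPE
        print(f"{scale:>3}x length -> mismatch of {frame_mismatch(n):.2f} frames")
```

Under these assumptions the gap is a little under one frame for a single example and grows in proportion to the input length, which matches the observation that longer audio magnifies the misalignment.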