Hey!
When using the dance model for inference, I found that I can't use my own wav audio files, because the script that processes a wav file into a pkl file isn't provided. I understand that the pkl file stores audio features, in particular the three features used for dance-model inference: chroma, spectral_flux, and beat_activations. I tried to reproduce them with the following script:
"""
import pickle as pkl

import librosa
from madmom.audio.signal import Signal
from madmom.audio.spectrogram import Spectrogram
from madmom.features.downbeats import RNNDownBeatProcessor
from madmom.features.onsets import spectral_flux

wav_path = 'kthstreet_gLO_sFM_cAll_d02_mLO_ch01_arethafranklinrocksteady_002.wav'
pkl_path = 'kthstreet_gLO_sFM_cAll_d02_mLO_ch01_arethafranklinrocksteady_002_00.audio29_30fps.pkl'

# reference features shipped with the dataset
with open(pkl_path, 'rb') as f:
    ctrl = pkl.load(f)

# my attempt at re-extracting the three features
signal = Signal(wav_path, sample_rate=48000)
y, sr = librosa.load(wav_path, sr=None)

# chroma: hop_length=1600 at 48 kHz corresponds to 30 fps
chroma = librosa.feature.chroma_stft(y=y, sr=sr, n_fft=2048, hop_length=1600, n_chroma=5)

# spectral flux from a madmom spectrogram with the same frame/hop sizes
# (note: madmom's plain Spectrogram silently ignores fmin/fmax/num_bins/log;
# band filtering and log scaling would need FilteredSpectrogram / LogarithmicSpectrogram)
spectrogram0 = Spectrogram(signal, frame_size=2048, hop_size=1600,
                           fmin=0.0, fmax=8000.0, num_bins=27, log=True)
spec_flux0 = spectral_flux(spectrogram0)

# beat/downbeat activations; RNNDownBeatProcessor runs at a fixed 100 fps,
# so passing fps=30 has no effect, and it resamples internally, so no sr argument is needed
proc = RNNDownBeatProcessor()
beatactivations = proc(wav_path)

print(ctrl)
print(chroma.shape, chroma)
print(spec_flux0.shape, spec_flux0)
print(beatactivations.shape, beatactivations)
"""
The wav file and pkl file are both from the folder \ListenDenoiseAction\data\motorica_dance.
I tried to extract the chroma, spectral_flux, and beat_activations features myself, but the results are inconsistent with the features stored in the pkl file, presumably because I don't know the exact processing methods and parameters. Could you provide the script that processes a wav file into a pkl file?
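To show what I mean by "inconsistent", below is the kind of check I'm running for the beat feature. The linear resampling from madmom's fixed 100 fps down to the pkl's 30 fps is purely my guess at the processing, and the column index into the stored features is a placeholder, since I don't know the real layout:
"""
import pickle as pkl
import numpy as np
from madmom.features.downbeats import RNNDownBeatProcessor

def resample_to_fps(act, src_fps=100.0, dst_fps=30.0):
    # Guess: linearly interpolate a per-frame activation curve from
    # madmom's fixed 100 fps onto the 30 fps grid used by the dataset.
    n_dst = int(round(len(act) * dst_fps / src_fps))
    src_t = np.arange(len(act)) / src_fps
    dst_t = np.arange(n_dst) / dst_fps
    return np.interp(dst_t, src_t, act)

wav_path = 'kthstreet_gLO_sFM_cAll_d02_mLO_ch01_arethafranklinrocksteady_002.wav'
pkl_path = 'kthstreet_gLO_sFM_cAll_d02_mLO_ch01_arethafranklinrocksteady_002_00.audio29_30fps.pkl'

# RNNDownBeatProcessor output has shape (frames, 2) at 100 fps:
# column 0 = beat activation, column 1 = downbeat activation
acts = RNNDownBeatProcessor()(wav_path)
beat_30 = resample_to_fps(acts[:, 0])

with open(pkl_path, 'rb') as f:
    ref = np.asarray(pkl.load(f))

# Placeholder comparison: I don't know which column of the stored features
# holds the beat activation, so the index 0 below is made up.
# ref_beat = ref[:, 0]
# n = min(len(beat_30), len(ref_beat))
# print('correlation:', np.corrcoef(beat_30[:n], ref_beat[:n])[0, 1])
"""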
Thanks!