How to convert a wav file to a pkl file? #18

Open
ideal-ai-mu opened this issue Sep 27, 2024 · 0 comments

Hey!

When running inference with the dance model, I couldn't use my own wav files because I couldn't find the script that converts a wav file into a pkl file. I understand that the pkl file stores audio features, in particular the three used for dance model inference: chroma, spectral_flux, and beat_activations. I tried the following script:
"""
from madmom.audio.signal import Signal
from madmom.audio.spectrogram import Spectrogram
from madmom.features.onsets import spectral_flux
import librosa
import numpy as np
import pickle as pkl
from madmom.features.downbeats import RNNDownBeatProcessor

wav_path='kthstreet_gLO_sFM_cAll_d02_mLO_ch01_arethafranklinrocksteady_002.wav'
pkl_path='kthstreet_gLO_sFM_cAll_d02_mLO_ch01_arethafranklinrocksteady_002_00.audio29_30fps.pkl'
with open(pkl_path, 'rb') as f:
    ctrl = pkl.load(f)

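# Load the audio twice: as a madmom Signal at 48 kHz, and with librosa at the file's native rate
signal = Signal(wav_path, sample_rate=48000)
y, sr = librosa.load(wav_path, sr=None)

# Chroma via librosa; hop_length=1600 yields 30 frames per second only if sr is 48000
chroma = librosa.feature.chroma_stft(y=y, sr=sr, n_fft=2048, hop_length=1600, n_chroma=5)

# Spectral flux computed from a madmom spectrogram (parameters are my guesses)
spectrogram0 = Spectrogram(signal, frame_size=2048, hop_size=1600, fmin=0.0, fmax=8000.0, num_bins=27, log=True)
spec_flux0 = spectral_flux(spectrogram0)

# Beat/downbeat activations from madmom's RNN processor
proc = RNNDownBeatProcessor(fps=30)
beatactivations = proc(wav_path, sr=sr)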

print(ctrl)
print(chroma.shape, chroma)
print(spec_flux0.shape, spec_flux0)
print(beatactivations.shape, beatactivations)
"""
The wav file and pkl file are from the folder: \ListenDenoiseAction\data\motorica_dance.
The chroma, spectral_flux, and beat_activations features I extracted do not match the ones in the provided pkl file, because I don't know the exact processing methods and parameters. Could you provide the script that converts a wav file into a pkl file?

Thanks!
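
While waiting for the original script, one way to narrow down the mismatch is to put the extracted features on the same time grid as the reference pkl and check which reference column each one resembles. The sketch below is only that, a sketch: the helper names (`hop_length_for`, `resample_to_fps`, `best_matching_column`) are hypothetical, it assumes the reference can be viewed as a (frames, features) array at 30 fps, and it is not the authors' preprocessing. Two things it relies on: librosa returns chroma as (n_chroma, frames), so it has to be transposed before comparing, and madmom's RNN beat processors, as far as I know, output activations at 100 fps, so they would need resampling to 30 fps.

```python
import numpy as np


def hop_length_for(sample_rate: int, fps: int) -> int:
    """Samples per frame so that features come out at exactly `fps` frames per second."""
    hop, remainder = divmod(sample_rate, fps)
    if remainder:
        raise ValueError(f"{sample_rate} Hz is not an integer multiple of {fps} fps")
    return hop


def resample_to_fps(feature, src_fps: float, dst_fps: float) -> np.ndarray:
    """Linearly interpolate a (frames,) or (frames, dims) array from src_fps to dst_fps.

    Assumption: linear interpolation is just one plausible choice; the original
    pipeline may have used something else (e.g. max-pooling or decimation).
    """
    feature = np.asarray(feature, dtype=float)
    if feature.shape[0] == 0:
        raise ValueError("feature has no frames")
    squeeze = feature.ndim == 1
    if squeeze:
        feature = feature[:, None]
    src_t = np.arange(feature.shape[0]) / src_fps        # timestamps of source frames
    n_dst = int(np.floor(src_t[-1] * dst_fps)) + 1       # frames that fit in the same span
    dst_t = np.arange(n_dst) / dst_fps                   # timestamps of target frames
    out = np.stack([np.interp(dst_t, src_t, col) for col in feature.T], axis=1)
    return out[:, 0] if squeeze else out


def best_matching_column(candidate, reference):
    """Return (column index, Pearson correlation) of the reference column closest to `candidate`.

    Correlation ignores scale and offset, which helps if the reference
    features were normalized (an assumption, not something confirmed here).
    """
    candidate = np.asarray(candidate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n = min(len(candidate), reference.shape[0])          # trim both to the common length
    cand = candidate[:n] - candidate[:n].mean()
    ref = reference[:n] - reference[:n].mean(axis=0)
    denom = np.linalg.norm(cand) * np.linalg.norm(ref, axis=0)
    # Constant columns have zero norm; report their correlation as 0
    corr = np.divide(ref.T @ cand, denom, out=np.zeros(ref.shape[1]), where=denom > 0)
    best = int(np.argmax(np.abs(corr)))
    return best, float(corr[best])
```

With the variables from the script above, this could be used roughly as `best_matching_column(spec_flux0, np.asarray(ctrl))`, `best_matching_column(chroma.T[:, 0], np.asarray(ctrl))`, or `best_matching_column(resample_to_fps(beatactivations, 100, 30)[:, 0], np.asarray(ctrl))`, assuming `ctrl` converts cleanly to a 2-D numeric array. A correlation near 1 would suggest the extraction method is right and only the scaling differs; a low value would point at different parameters or a different algorithm.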
