
How could we preprocess the raw dataset from ".wav"&".bvh" files to ".pkl"? #17

Open
DarLikeStudy opened this issue Aug 10, 2024 · 6 comments

Comments

@DarLikeStudy

Thanks for your work!
It is outstanding work for dance motion generation.
I have read your "LDA" paper over and over again, and I would really like to know how you preprocess the raw data with Madmom and PyMO. I am confused about how the music features (Spectralflux, Chroma, Beat and Beatactivation) are computed, and about how the motion features are produced from .bvh files with an sklearn.pipeline (MocapParameterizer and so on).
Could you share the preprocessing scripts for the dataset?
Sorry for any possible trouble caused.
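
For reference, below is a minimal sketch of how the four audio features named above could be computed with madmom. The exact processors, frame rates and any post-processing the LDA authors actually used are not documented here, so every parameter and file name below is an assumption.

from madmom.features.onsets import SpectralOnsetProcessor
from madmom.features.beats import RNNBeatProcessor, DBNBeatTrackingProcessor
from madmom.audio.chroma import DeepChromaProcessor

wav = "example.wav"  # hypothetical input file

spectral_flux = SpectralOnsetProcessor()(wav)               # frame-wise spectral-flux onset curve
chroma = DeepChromaProcessor()(wav)                         # (frames, 12) chromagram
beat_activation = RNNBeatProcessor()(wav)                   # frame-wise beat activation (100 fps)
beats = DBNBeatTrackingProcessor(fps=100)(beat_activation)  # beat times in seconds

# These features come at different frame rates, so they would still need to be
# resampled/aligned to the motion frame rate before being stacked into one matrix.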

@DarLikeStudy DarLikeStudy changed the title How can we preprocess the raw dataset from ".wav"&".bvh" files to ".pkl"? How could we preprocess the raw dataset from ".wav"&".bvh" files to ".pkl"? Aug 10, 2024
@AbelDoc

AbelDoc commented Aug 10, 2024

+1
Although I'm not 100% sure, by reading the code and digging into the .sav file that contains the data-processing pipeline used to train the dance model, I came up with some information that could be useful to you.
They use sklearn.pipeline to process their data; you can find an example in another of their repos, which I used:
https://github.com/simonalexanderson/StyleGestures/blob/master/data_processing/prepare_gesture_datasets.py
From reading the .sav file I found that they are using the following pipeline for motion data:
Pipeline([
    ("jtsel", JointSelector(["Spine", "Spine1", "Neck", "Head", "RightShoulder", "RightArm", "RightForeArm", "RightHand", "LeftShoulder", "LeftArm", "LeftForeArm", "LeftHand", "RightUpLeg", "RightLeg", "RightFoot", "LeftUpLeg", "LeftLeg", "LeftFoot"], include_root=True)),
    ("root", RootTransformer(method="pos_rot_deltas", position_smoothing=..., rotation_smoothing=...)),
    ("feats", MocapParameterizer(param_type="expmap", ref_pose=...)),
    ("cnst", ConstantsRemover()),
    ("cnt", FeatureCounter()),
    ("npy", Numpyfier()),
])
Fields with '...' indicate values I was not able to recover; they certainly differ from the code's defaults, though, otherwise they would not have been stored in the .sav file.
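Here is a rough sketch of how a pipeline like this could be run on raw .bvh files with pymo (the package bundled with StyleGestures). The clip name and the cut-down step list are placeholders; in practice you would use the full pipeline above with your own values for the '...' fields.

import joblib
from sklearn.pipeline import Pipeline
from pymo.parsers import BVHParser
from pymo.preprocessing import JointSelector, MocapParameterizer, Numpyfier

parser = BVHParser()
mocap = [parser.parse("clip_001.bvh")]   # hypothetical input clip

# Cut-down pipeline, just to show the mechanics.
pipe = Pipeline([
    ("jtsel", JointSelector(["Spine", "Neck", "Head"], include_root=True)),
    ("feats", MocapParameterizer("expmap")),
    ("npy", Numpyfier()),
])
features = pipe.fit_transform(mocap)     # one (frames, n_features) array per clip
joblib.dump(pipe, "my_data_pipe.sav")    # the fitted pipeline is needed again for decoding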
The output will be a numpy array, so I think you just need to pickle it into a .pkl file and you should be good to go. For the audio features I cannot help; for my application I will just go with MFCC features. But again, from reading the code, you need to save your features into a pandas DataFrame whose columns are the feature names (you can find examples of the column names in the .txt files in the data folder) and whose values are your features; save this DataFrame into a .pkl file and that should also work.
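Again only a sketch of the saving step, with made-up column and file names; the real column names can be found in the .txt files in the repo's data folder.

import pickle
import numpy as np
import pandas as pd

motion_features = np.zeros((300, 60))        # placeholder for the pipeline output of one clip
with open("clip_001.pkl", "wb") as f:        # hypothetical file name
    pickle.dump(motion_features, f)

audio_features = np.zeros((300, 3))          # placeholder: one row per motion frame
audio_df = pd.DataFrame(
    audio_features,
    columns=["spectralflux", "chroma_0", "beatactivation"],  # hypothetical column names
)
audio_df.to_pickle("clip_001_audio.pkl")     # hypothetical file name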
I will try to train the network using this information; I will edit this message if it goes well.

Hope it helps !

@DarLikeStudy
Author


Thanks for the reply! I have already tried to use sklearn.pipeline to preprocess the dataset. For LDA I tried data_pipe.expmap_30fps.sav on the original motorica_dance dataset, following the prepare_gesture_datasets.py approach. If I use my own JointSelector, how should I name the index? I am confused about how to name the columns of the motion-feature numpy array, e.g. 'Hips_Yposition', 'reference_dXposition', 'reference_dZposition', 'reference_dYrotation' and so on.

@AbelDoc

AbelDoc commented Aug 21, 2024

The JointSelector uses joint names directly, so Hips, Spine, Spine1, ...
Those other names are what is stored in the DataFrame; the JointSelector keeps every column named X_Y, where X is one of the joint names you gave and Y is the channel.
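To illustrate the naming convention with a small, made-up example (the joint list below is not the full Motorica skeleton):

from pymo.preprocessing import JointSelector

jtsel = JointSelector(["Hips", "Spine", "Spine1"], include_root=True)
# After the transform, the remaining DataFrame columns look like
#   Hips_Xposition, Hips_Yposition, Hips_Zposition,
#   Hips_Xrotation, Hips_Yrotation, Hips_Zrotation,
#   Spine_Xrotation, ...              (one column per joint/channel pair)
# As far as I can tell, the reference_dXposition / reference_dYrotation columns
# mentioned above are added later by RootTransformer("pos_rot_deltas"),
# not by the JointSelector.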

I was successful in launching a training run that did seem to converge well; however, the inference process was completely off and unstable, with no proper results. Unfortunately I don't have more time to spend on it, so I will have to exclude this model from my research :/

@DarLikeStudy
Author

I tried the .sav files to preprocess the data. During model training, epoch 0's synthesis step failed with: synthesize DataLoader 0: 0%| | 0/235 [00:00<?, ?it/s], AttributeError: 'Numpyfier' object has no attribute 'org_mocap_'.
My .bvh data do not have the same joint names as the Motorica dance data, so I can't use the original .sav shipped with the dataset. It seems the decoding from the .pkl data back to .bvh fails.

@AbelDoc

AbelDoc commented Aug 26, 2024

You cannot use their .sav file if you do not have the same joint names; you need to create your own pipeline with the Pipeline object I described above.
That also means you need to create your own .pkl files.
The .sav file they provide only works with those exact .pkl files, and thus only works for the Motorica dance dataset, unfortunately.
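To make that coupling concrete, here is a sketch of the decoding direction, assuming your own pipeline was fitted on your own .bvh files and saved with joblib, and that every step implements inverse_transform (the pymo transformers listed above do, as far as I can tell). Shapes and file names are placeholders.

import joblib
import numpy as np
from pymo.writers import BVHWriter

pipe = joblib.load("my_data_pipe.sav")           # hypothetical: your own *fitted* pipeline
generated = np.zeros((1, 300, 60))               # placeholder; must match the pipeline's feature count
mocap_clips = pipe.inverse_transform(generated)  # back to MocapData objects

with open("generated.bvh", "w") as f:
    BVHWriter().write(mocap_clips[0], f)

# The AttributeError about Numpyfier's missing org_mocap_ above is what you would
# expect when inverse_transform is called on a pipeline that was never fitted on your data.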

If you want to try to use their network on other data, you have to:
create a pipeline, process the data, save the pipeline as .sav, save the data as .pkl, and then use their network.

But even with that this network seems really hard to train from scratch :/

@DarLikeStudy
Author

Yes, I understand. I found the problem and created my own Pipeline, with my own JointSelector, to process my motion data, but the error still occurred during training. I will try to edit the .bvh files before the dataset preprocessing.
I will also consider other dance datasets and projects as well.
