Fine-tune MERT model on a custom dataset #18

Open

kv-42 opened this issue Dec 26, 2024 · 0 comments
kv-42 commented Dec 26, 2024

I would like to fine-tune the model on my custom audio collection to learn better representations of my specific audio dataset. Specifically, I am looking to fine-tune the MERT-v1-330M checkpoint; the HuggingFace model already works quite well for me, but I would still like to improve it a bit for my task. What is the best way to accomplish that? There are also a few technical details I would like to clarify:

  1. Is the MERTModel class suitable for fine-tuning? It looks like setting the task.fine_tuning parameter to True breaks the __init__ of MERTModel, since the code expects the task's dictionaries property to be defined while building the model, but when task.fine_tuning is True only the target_dictionary property is defined.
    Note: for HuBERT, a dedicated class (HubertCtc) appears to be used for fine-tuning: https://github.com/facebookresearch/fairseq/blob/main/examples/hubert/config/finetune/base_10h.yaml#L63
  2. What file format is expected when fine-tuning the model? Should it be:
    TASK_LABELS_POSTFIX='["encodec_0","encodec_1","encodec_2","encodec_3","encodec_4","encodec_5","encodec_6","encodec_7"]'
    or something else?
  3. scripts/prepare_manifest.py and scripts/prepare_codecs_from_manifest.py produce train.codec_*.txt files, while according to the existing manifest files the expected extensions are train.encodec_*.txt (at least for training from scratch). Which files are expected? (I sketch the two layouts right after this list.)
  4. I also see that when task.fine_tuning is False, the code tries to load the existing checkpoint for continued training, but it fails with:
    RuntimeError: Cannot resume training due to dataloader mismatch, please report this to the fairseq developers. You can relaunch training with --reset-dataloader and it should work.
    It looks like I can simply set --reset-dataloader, but I wonder whether that is the correct approach for fine-tuning.
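
To make point 3 concrete, here are the two naming schemes side by side; the indices 0–7 are my assumption, matching the encodec labels from point 2:

```
# produced by scripts/prepare_manifest.py and scripts/prepare_codecs_from_manifest.py
train.codec_0.txt ... train.codec_7.txt

# referenced by the existing manifests (training from scratch)
train.encodec_0.txt ... train.encodec_7.txt
```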

In general, I would love to see an example configuration for fine-tuning, if possible, or some hints to make it work.
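
For concreteness, this is roughly the shape of fine-tuning I have in mind on the HuggingFace side. Only the two from_pretrained calls follow the m-a-p/MERT-v1-330M model card; the task head, learning rates, and my_dataloader are placeholders for my own setup:

```python
import torch
from transformers import AutoModel, Wav2Vec2FeatureExtractor

# Loading as in the m-a-p/MERT-v1-330M model card
mert = AutoModel.from_pretrained("m-a-p/MERT-v1-330M", trust_remote_code=True)
processor = Wav2Vec2FeatureExtractor.from_pretrained(
    "m-a-p/MERT-v1-330M", trust_remote_code=True
)

num_classes = 10  # placeholder for my task
head = torch.nn.Linear(mert.config.hidden_size, num_classes)

# Smaller learning rate on the pre-trained backbone than on the fresh head
optimizer = torch.optim.AdamW(
    [
        {"params": mert.parameters(), "lr": 1e-5},
        {"params": head.parameters(), "lr": 1e-4},
    ]
)

mert.train()
for waveform, labels in my_dataloader:  # hypothetical: 24 kHz mono clips + class indices
    inputs = processor(waveform, sampling_rate=24000, return_tensors="pt")
    hidden = mert(**inputs).last_hidden_state   # (batch, frames, hidden_size)
    logits = head(hidden.mean(dim=1))           # mean-pool over time
    loss = torch.nn.functional.cross_entropy(logits, labels)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Would something like this be a reasonable starting point, or is the fairseq route the intended one?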

Thank you for the great work!
