I would like to fine-tune the model on my custom audio collection to learn better representations of my specific audio dataset. I plan to fine-tune the MERT-v1-330M checkpoint; the HuggingFace model already works quite well for me, but I would still like to improve it a bit for my task. What is the best way to accomplish that? There are also a few technical details I would like to clarify:
Is the `MERTModel` class suitable for fine-tuning? It looks like setting the `task.fine_tuning` parameter to True breaks the `__init__` of `MERTModel`, since the code expects the `dictionaries` property to be defined while building the model, but when `task.fine_tuning` is True, the `target_dictionary` property is defined instead.
Note: for HuBERT it looks like a dedicated class is used for fine-tuning (`HubertCtc`): https://github.com/facebookresearch/fairseq/blob/main/examples/hubert/config/finetune/base_10h.yaml#L63
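For reference, the HuBERT recipe linked above is launched roughly like this (the `base_10h` config switches the model to the `HubertCtc` wrapper and loads the pretrained weights via `model.w2v_path`). Whether MERT needs an analogous wrapper/config, or whether `MERTModel` can be fine-tuned directly, is exactly what I am unsure about; the paths below are placeholders:

```bash
# Sketch based on the fairseq HuBERT fine-tuning example (not a MERT recipe):
# base_10h selects the HubertCtc model and points model.w2v_path at the
# pretrained checkpoint.  All paths here are placeholders.
fairseq-hydra-train \
  --config-dir examples/hubert/config/finetune \
  --config-name base_10h \
  task.data=/path/to/manifests \
  task.label_dir=/path/to/labels \
  model.w2v_path=/path/to/pretrained_checkpoint.pt
```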
What file format is expected when fine-tuning the model? Should it be: TASK_LABELS_POSTFIX='["encodec_0","encodec_1","encodec_2","encodec_3","encodec_4","encodec_5","encodec_6","encodec_7"]'
or something else?
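To make the question concrete, here is how I would guess the postfix list ends up being used, assuming (as in HuBERT-style tasks) that each entry in `task.labels` maps to a `<split>.<label>.txt` file in the label directory; the variable substitution into `task.labels` is my guess, not something I have verified in the MERT run scripts:

```bash
# Assumption: the run script forwards TASK_LABELS_POSTFIX into task.labels,
# so each list entry needs matching <split>.<label>.txt files on disk,
# e.g. train.encodec_0.txt ... train.encodec_7.txt (and the same for valid).
export TASK_LABELS_POSTFIX='["encodec_0","encodec_1","encodec_2","encodec_3","encodec_4","encodec_5","encodec_6","encodec_7"]'
fairseq-hydra-train \
  --config-dir /path/to/mert/config \
  --config-name my_finetune_config \
  task.labels="${TASK_LABELS_POSTFIX}" \
  task.label_dir=/path/to/labels
```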
`scripts/prepare_manifest.py` and `scripts/prepare_codecs_from_manifest.py` produce `train.codec_*.txt` files, while according to the existing manifest files the expected extensions are `train.encodec_*.txt` (at least for training from scratch). Which files are expected?
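In case this is just a naming mismatch between the preparation scripts and the task, a rename along these lines would be my workaround (assuming the file contents are the same and only the postfix differs, which is what I would like to confirm):

```bash
# Hypothetical workaround: align the label-file postfix emitted by the
# preparation scripts (codec_*) with the postfix the manifests seem to
# expect (encodec_*).  Only sensible if the contents are indeed identical.
cd /path/to/labels
for f in train.codec_*.txt valid.codec_*.txt; do
  mv "$f" "${f/codec_/encodec_}"
done
```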
I also see that when `task.fine_tuning` is False, the code tries to load the existing checkpoint for continual training, but it fails with: `RuntimeError: Cannot resume training due to dataloader mismatch, please report this to the fairseq developers. You can relaunch training with --reset-dataloader and it should work.`
It looks like I can still set `--reset-dataloader`, but I wonder whether that is the correct way to fine-tune.
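If continuing training from the released checkpoint is the intended route, I assume the cleaner way is fairseq's checkpoint options rather than only `--reset-dataloader`, something like the sketch below (paths and config names are placeholders; whether this is the recommended setup for MERT is exactly my question):

```bash
# checkpoint.finetune_from_model loads only the model weights and resets the
# optimizer, lr scheduler, meters and dataloader; alternatively one could keep
# checkpoint.restore_file and set the individual checkpoint.reset_* flags.
fairseq-hydra-train \
  --config-dir /path/to/mert/config \
  --config-name my_pretrain_or_finetune_config \
  checkpoint.finetune_from_model=/path/to/MERT-v1-330M/checkpoint.pt \
  task.data=/path/to/manifests \
  task.label_dir=/path/to/labels
```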
In general, I would love to see an example configuration for fine-tuning, if possible, or some hints on how to make it work.
Thank you for the great job!