I’m currently fine-tuning the 20220506_u2pp_conformer_exp_wenetspeech model on Windows, starting from its .pt file. My dataset consists of WAV files and their corresponding SRT files. I’ve processed the data into four Kaldi-style files (spk2utt, text, utt2spk, and wav.scp) and intend to use these as the training input.
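For reference, here is a minimal sketch of how the wav.scp and text files described above could be merged into a single JSON-lines list file. The field names `key`, `wav`, and `txt`, the output filename, and the helper names `read_kaldi_table` and `build_data_list` are assumptions made for illustration; they are modeled on the "raw" list format used in WeNet's example recipes and are not confirmed anywhere in this post.

```python
import json
from pathlib import Path


def read_kaldi_table(path):
    """Read a Kaldi-style '<utt_id> <value>' file into a dict (hypothetical helper)."""
    table = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        # Split only on the first whitespace run so transcripts keep their spaces.
        parts = line.split(maxsplit=1)
        table[parts[0]] = parts[1] if len(parts) > 1 else ""
    return table


def build_data_list(wav_scp, text, out_path):
    """Write one JSON object per utterance; return the number of entries written.

    Assumed output fields: key (utterance id), wav (audio path), txt (transcript).
    Utterances that have no transcript in `text` are skipped.
    """
    wavs = read_kaldi_table(wav_scp)
    txts = read_kaldi_table(text)
    written = 0
    with open(out_path, "w", encoding="utf-8") as fout:
        for key, wav in wavs.items():
            if key not in txts:
                continue
            entry = {"key": key, "wav": wav, "txt": txts[key]}
            fout.write(json.dumps(entry, ensure_ascii=False) + "\n")
            written += 1
    return written
```

Calling `build_data_list("wav.scp", "text", "data.list")` would return the number of utterances written. A return value of 0 would suggest that the utterance ids in the two files do not match.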
In the fine-tuning process:
(1) My program calls the init_dataset_and_dataloader function in wenet/utils/train_utils.py to initialize the dataset and dataloader.
(2) Inside init_dataset_and_dataloader, the init_dataset function from wenet/utils/init_dataset.py is called. Within init_dataset.py, the init_asr_dataset function is then invoked, which in turn calls the Dataset class in dataset.py.
After step (2), train_dataset reports a Train dataset size of 0.
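One way to narrow down a dataset size of 0 is to inspect the list file that is handed to the dataset before training starts. The sketch below is only a diagnostic aid. It assumes the list file is JSON lines with the fields `key`, `wav`, and `txt`, which is an assumption about the expected input format and is not stated in this post. The helper name `check_data_list` is hypothetical.

```python
import json
import os


def check_data_list(path):
    """Count entries in a JSON-lines list file and flag common problems.

    Assumed fields per line: key, wav, txt (an assumption, not confirmed here).
    """
    stats = {"total": 0, "bad_json": 0, "missing_wav": 0, "empty_txt": 0}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            stats["total"] += 1
            try:
                item = json.loads(line)
            except json.JSONDecodeError:
                # Typical when a Kaldi-style file is passed instead of JSON lines.
                stats["bad_json"] += 1
                continue
            if not isinstance(item, dict):
                stats["bad_json"] += 1
                continue
            if not os.path.isfile(str(item.get("wav", ""))):
                stats["missing_wav"] += 1
            if not str(item.get("txt", "")).strip():
                stats["empty_txt"] += 1
    return stats
```

If `bad_json` equals `total`, the file is probably not in JSON-lines form at all, for example a wav.scp passed directly. A high `missing_wav` count points to path problems, which are easy to hit with Windows path separators.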
Here are my questions:
(1) Based on the steps above, are there any obvious mistakes in my fine-tuning data setup, function usage, or data format?
(2) Are there any fine-tuning examples I can refer to? I don’t need the original data, but would appreciate guidance on the configuration and workflow.
Thank you for your help!