[vec]voxceleb convert dataset format to paddlespeech #1630

LeoMax-Xiong · 2022-03-31T02:30:21Z

PR types

PR changes

Describe

Change the VoxCeleb dataset format from the CSV format of paddleaudio.dataset to the JSON Lines format of paddlespeech.dataset.
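
As a rough illustration of the kind of format change described here, the sketch below turns CSV rows into JSON Lines records using only the standard library. The helper `csv_to_jsonlines` and the column names (`utt_id`, `wav`, `label`) are hypothetical and not taken from this PR; the scripts in this PR may work differently.

```python
import csv
import io
import json


def csv_to_jsonlines(csv_text: str) -> str:
    """Convert CSV text with a header row into JSON Lines text (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Emit one JSON object per CSV row, one object per line.
    return "\n".join(json.dumps(row, ensure_ascii=False) for row in reader)


# Hypothetical manifest columns; the real PaddleSpeech fields may differ.
example_csv = (
    "utt_id,wav,label\n"
    "id0001,/data/a.wav,spk1\n"
    "id0002,/data/b.wav,spk2\n"
)
jsonl = csv_to_jsonlines(example_csv)
print(jsonl)
```

Each output line is an independent JSON object, so a manifest in this form can be read and filtered line by line without loading the whole file.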
examples/voxceleb/sv0/local/data.sh

examples/voxceleb/sv0/local/make_csv_dataset_from_json.py

examples/voxceleb/sv0/local/make_rirs_noise_csv_dataset_from_json.py

paddlespeech/vector/exps/ecapa_tdnn/train.py

paddlespeech/vector/io/augment.py

paddlespeech/vector/io/dataset.py

zh794390558 · 2022-04-02T15:39:11Z

paddlespeech/vector/io/dataset.py

        sample = self.data[idx]

        record = {}
        # To show all fields in a namedtuple: `type(sample)._fields`
-        for field in type(sample)._fields:
-            record[field] = getattr(sample, field)
+        for field in fields(sample):


I suggest using dataclass.fields like this.
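
A minimal sketch of what this suggestion could look like, assuming the sample is a dataclass instance rather than a namedtuple. The `Meta` class and `to_record` helper are hypothetical and are not the actual code in `paddlespeech/vector/io/dataset.py`. Note that the standard-library `dataclasses.fields` returns `Field` objects, so the attribute name is read from `field.name`.

```python
from dataclasses import dataclass, fields


@dataclass
class Meta:
    """Hypothetical sample type; the real dataset fields may differ."""
    utt_id: str
    label: str


def to_record(sample) -> dict:
    """Copy every dataclass field of `sample` into a plain dict."""
    record = {}
    for field in fields(sample):
        # `field` is a dataclasses.Field; its `.name` is the attribute name.
        record[field.name] = getattr(sample, field.name)
    return record


record = to_record(Meta(utt_id="id0001", label="spk1"))
print(record)
```

Compared with `type(sample)._fields`, this avoids relying on an underscore-prefixed namedtuple attribute, but it only works when the sample type is declared with `@dataclass`.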

CLAassistant · 2022-04-06T15:56:38Z

All committers have signed the CLA.

convert jsonfile to csv file

ec24a16

LeoMax-Xiong requested review from qingen and zh794390558 March 31, 2022 02:30

LeoMax-Xiong self-assigned this Mar 31, 2022

mergify bot added the Example label Mar 31, 2022

LeoMax-Xiong added Vector SID/LID/etc. and removed Example labels Mar 31, 2022

mergify bot added the Example label Mar 31, 2022

LeoMax-Xiong added this to the r1.0.0 milestone Mar 31, 2022

LeoMax-Xiong linked an issue Mar 31, 2022 that may be closed by this pull request

[vec]use paddlespeech.dataset to prepare the voxceleb experiment #1618

Closed

zh794390558 reviewed Mar 31, 2022

mergify bot added the Audio label Apr 1, 2022

LeoMax-Xiong added 2 commits April 1, 2022 19:03

convert rirs noise to csv file

9944fec

add voxceleb and rirs noise dataset

965f486

mergify bot added the Dataset label Apr 1, 2022

train process add new voxceleb and rirs dataset, test=doc

5b05300

zh794390558 reviewed Apr 2, 2022

LeoMax-Xiong added 2 commits April 2, 2022 21:10

add vector csv dataset format, test=doc

30b5b3c

add some annotations, test=doc

57c11dc

zh794390558 reviewed Apr 2, 2022

LeoMax-Xiong added 3 commits April 3, 2022 19:50

change the vector csv.spk_id to csv.label, test=doc

acebfad

test.py update the CSVDataset, test=doc

ebfe3e6

refactor voxceleb2 data download, test=doc

38e4e9c

update the note, test=doc

a8244dc

LeoMax-Xiong marked this pull request as ready for review April 8, 2022 07:09

LeoMax-Xiong added 3 commits April 9, 2022 14:23

fix rirs noise download bug, test=doc

2b4b3e1

wrap the embedding mean and std norm, test=doc

567286a

fix vector ips log bug, test=doc

4af007c

qingen approved these changes Apr 11, 2022

LeoMax-Xiong merged commit 48e0177 into PaddlePaddle:develop Apr 11, 2022
