[vec]voxceleb convert dataset format to paddlespeech #1630

Merged
merged 13 commits into PaddlePaddle:develop
Apr 11, 2022

@LeoMax-Xiong LeoMax-Xiong commented Mar 31, 2022

PR types

PR changes

Describe

Change the VoxCeleb dataset format from the CSV manifests used by paddleaudio.dataset to the JSON Lines (jsonline) manifests used by paddlespeech.dataset.
#1630
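
To illustrate the kind of conversion described above, here is a minimal sketch that turns a CSV manifest into JSON Lines, one JSON object per row. It is not the code from this PR: the function name `csv_to_jsonl` and the column names (`utt_id`, `duration`, `wav`, `spk_id`) are assumptions chosen for illustration, and the actual manifest fields in PaddleSpeech may differ.

```python
import csv
import io
import json


def csv_to_jsonl(csv_text: str) -> str:
    """Convert CSV text with a header row into JSON Lines text.

    Hypothetical helper for illustration; not part of PaddleSpeech.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    # csv.DictReader yields every value as a string; cast numeric
    # columns yourself if the consumer expects numbers.
    lines = [json.dumps(row, ensure_ascii=False) for row in reader]
    return "\n".join(lines) + ("\n" if lines else "")


# Hypothetical manifest row; the column names are assumptions.
example_csv = (
    "utt_id,duration,wav,spk_id\n"
    "id10001-utt1,3.2,/data/id10001/utt1.wav,id10001\n"
)
example_jsonl = csv_to_jsonl(example_csv)
print(example_jsonl, end="")
```

Each output line is an independent JSON object, so a reader can stream the manifest line by line with `json.loads` instead of parsing the whole file at once.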

@LeoMax-Xiong LeoMax-Xiong self-assigned this Mar 31, 2022
@mergify mergify bot added the Example label Mar 31, 2022
@LeoMax-Xiong LeoMax-Xiong added Vector SID/LID/etc. and removed Example labels Mar 31, 2022
@mergify mergify bot added the Example label Mar 31, 2022
@LeoMax-Xiong LeoMax-Xiong added this to the r1.0.0 milestone Mar 31, 2022
@LeoMax-Xiong LeoMax-Xiong linked an issue Mar 31, 2022 that may be closed by this pull request
examples/voxceleb/sv0/local/data.sh (outdated)
examples/voxceleb/sv0/local/data.sh (outdated)
examples/voxceleb/sv0/local/data.sh (outdated)
examples/voxceleb/sv0/local/make_csv_dataset_from_json.py (outdated)
@mergify mergify bot added the Audio label Apr 1, 2022
@mergify mergify bot added the Dataset label Apr 1, 2022
paddlespeech/vector/exps/ecapa_tdnn/train.py (outdated)
paddlespeech/vector/exps/ecapa_tdnn/train.py (outdated)
paddlespeech/vector/io/augment.py (outdated)
paddlespeech/vector/io/augment.py (outdated)
paddlespeech/vector/io/dataset.py (outdated)
paddlespeech/vector/exps/ecapa_tdnn/train.py (outdated)
paddlespeech/vector/io/dataset.py (outdated)
paddlespeech/vector/io/dataset.py (outdated)
paddlespeech/vector/io/dataset.py
      sample = self.data[idx]

      record = {}
      # To show all fields in a namedtuple: `type(sample)._fields`
    - for field in type(sample)._fields:
    -     record[field] = getattr(sample, field)
    + for field in fields(sample):
Collaborator:

I suggest using dataclass.fields like this.
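
The review suggestion to iterate with `fields` from the standard `dataclasses` module can be sketched as follows. The `Sample` class and its attributes are hypothetical stand-ins, not the PR's actual data type. One detail worth noting: `dataclasses.fields` returns `Field` objects rather than strings, so the loop needs `field.name`, whereas a namedtuple's `_fields` is a tuple of plain strings.

```python
from dataclasses import dataclass, fields


@dataclass
class Sample:
    """Hypothetical sample type; attribute names are assumptions."""
    utt_id: str
    duration: float
    wav: str


def convert_to_record(sample) -> dict:
    """Copy every dataclass field of `sample` into a plain dict."""
    record = {}
    for field in fields(sample):
        # `field` is a dataclasses.Field; its attribute name is `field.name`.
        record[field.name] = getattr(sample, field.name)
    return record


record = convert_to_record(Sample("id10001-utt1", 3.2, "/data/utt1.wav"))
print(record)
```

For a flat dataclass like this one, `dataclasses.asdict(sample)` gives the same result in one call; the explicit loop mirrors the shape of the reviewed code.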

CLAassistant commented Apr 6, 2022

CLA assistant check
All committers have signed the CLA.

@LeoMax-Xiong LeoMax-Xiong marked this pull request as ready for review April 8, 2022 07:09
@LeoMax-Xiong LeoMax-Xiong merged commit 48e0177 into PaddlePaddle:develop Apr 11, 2022
Development

Successfully merging this pull request may close these issues.

[vec]use paddlespeech.dataset to prepare the voxceleb experiment
4 participants