Contact for any inquiries Email: [email protected] | Web: http://deeplyinc.com/ | Tel: (+82) 70-7459-0704
The Nonverbal Vocalization Dataset is a dataset of human nonverbal vocal sounds (a.k.a. vocal characterizers) consisting of 56.7 hours of short clips from 1419 speakers, crowdsourced from the general public in South Korea. The dataset also includes metadata such as age, sex, noise level, and quality of utterance. It covers 16 classes of human nonverbal sounds: ‘teeth-chattering’, ‘teeth-grinding’, ‘tongue-clicking’, ‘nose-blowing’, ‘coughing’, ‘yawning’, ‘throat-clearing’, ‘sighing’, ‘lip-popping’, ‘lip-smacking’, ‘panting’, ‘crying’, ‘laughing’, ‘sneezing’, ‘moaning’, and ‘screaming’.
Device | Android phones |
---|---|
Volume (Sample) | ~57 (~0.6) hours, ~70,000 (~800) utterances, ~18 (~0.1) GB, ~1500 (~500) speakers |
Format | wav/h5 (16/44.1 kHz, 16-bit, mono) |
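As a quick sanity check on the format row, the following minimal sketch reads a wav header with Python's standard `wave` module and compares it to the stated format. The function names `describe_wav` and `matches_spec` are hypothetical helpers, and reading "16/44.1 kHz" as "either 16 kHz or 44.1 kHz" is an assumption, not something the dataset description confirms.

```python
import wave


def describe_wav(path):
    """Read basic format information from a wav file header."""
    with wave.open(str(path), "rb") as w:
        sample_rate = w.getframerate()
        return {
            "sample_rate": sample_rate,
            "bit_depth": 8 * w.getsampwidth(),  # sample width is in bytes
            "channels": w.getnchannels(),
            "duration_s": w.getnframes() / sample_rate,
        }


def matches_spec(info):
    """Check a clip against the format stated in the table.

    Assumption: "16/44.1 kHz" means the clip is sampled at
    either 16 kHz or 44.1 kHz.
    """
    return (
        info["sample_rate"] in (16000, 44100)
        and info["bit_depth"] == 16
        and info["channels"] == 1
    )
```

Calling `matches_spec(describe_wav(path))` on each downloaded clip is one way to catch files that were resampled or converted to stereo along the way.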
Refer to the dataset descriptions in 'docs' for a detailed description and statistics of the full dataset.
The sample audio data is a subset (approximately 1%) of a much bigger dataset, which was recorded under the same conditions as these open-source samples. Please contact us ([email protected]) for pricing and licensing.
- Coughing Sound [sample sound on soundcloud]
- Crying Sound [sample sound on soundcloud]
- Screaming Sound [sample sound on soundcloud]
- Moaning Sound [sample sound on soundcloud]
- Laughing Sound [sample sound on soundcloud]
- And 11 more sounds!
Click here to download entire sample data
The illustrations below show statistics of the Deeply Nonverbal Vocalization dataset. The first two are from the sample audio data, and the others are from the full dataset. For more insight into the dataset, please refer to the detailed description in 'docs'.
├── dataset
│   ├── Nonverbal_Vocalization_metadata.json
│   ├── coughing
│   │   ├── 0C1S_4_8_0_27_0_1_1.wav
│   │   ├── ...
│   ├── crying
│   │   ├── 1TCO_11_10_0_20_0_0_0.wav
│   │   ├── ...
│   ├── ...
│   ├── ...
│   ├── tongue-clicking
│   │   ├── 06RU_2_7_1_38_0_0_0.wav
│   │   ├── ...
│   └── yawning
│       ├── 0DYI_5_10_1_12_0_1_0.wav
│       ├── ...
└── docs
    ├── Deeply\ Nonverbal\ Vocalization\ Dataset\ description_Eng.pdf
    └── Deeply\ Nonverbal\ Vocalization\ Dataset\ description_Kor.pdf
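Given the layout above, with one sub-directory per class under `dataset`, a minimal sketch for counting clips per class could look like this. The function name `count_clips` and the path passed to it are hypothetical; the sketch only assumes that each class directory directly contains its `.wav` files.

```python
from pathlib import Path


def count_clips(dataset_root):
    """Return {class directory name: number of .wav files directly inside it}."""
    root = Path(dataset_root)
    return {
        class_dir.name: sum(1 for _ in class_dir.glob("*.wav"))
        for class_dir in sorted(root.iterdir())
        if class_dir.is_dir()  # skips the metadata JSON at the top level
    }
```

For example, `count_clips("dataset")` would return a dictionary keyed by class directory name, such as `coughing` or `yawning`.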
Nonverbal_Vocalization_metadata.json
{
    'LAA7': {'sex': 'Male',
             'age': 22,
             'class': ['teeth-chattering', 'teeth-grinding', 'lip-smacking']},
    ...
    'WVST': {'sex': 'Female',
             'age': 15,
             'class': ['nose-blowing', 'coughing', 'yawning', 'throat-clearing', 'sighing',
                       'lip-popping', 'sneezing', 'screaming']}
}
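The metadata maps each speaker ID to that speaker's sex, age, and recorded classes. As a minimal sketch, the structure can be inverted to look up which speakers recorded a given class. The helper names `load_metadata` and `speakers_by_class` are hypothetical, and the sketch assumes the file on disk is valid JSON with the keys shown above.

```python
import json
from collections import defaultdict


def load_metadata(path):
    """Load the speaker metadata (assumes the file is valid JSON)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def speakers_by_class(metadata):
    """Invert {speaker_id: {..., 'class': [...]}} into {class: [speaker_id, ...]}."""
    index = defaultdict(list)
    for speaker_id, info in metadata.items():
        for label in info["class"]:
            index[label].append(speaker_id)
    return dict(index)


# The two entries shown in the excerpt above, used as in-memory sample data.
sample = {
    "LAA7": {"sex": "Male", "age": 22,
             "class": ["teeth-chattering", "teeth-grinding", "lip-smacking"]},
    "WVST": {"sex": "Female", "age": 15,
             "class": ["nose-blowing", "coughing", "yawning", "throat-clearing",
                       "sighing", "lip-popping", "sneezing", "screaming"]},
}
index = speakers_by_class(sample)
print(index["coughing"])      # ['WVST']
print(index["lip-smacking"])  # ['LAA7']
```

With the real file, `speakers_by_class(load_metadata("dataset/Nonverbal_Vocalization_metadata.json"))` would build the same index for all speakers, assuming that relative path matches where the data was unpacked.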
{speaker_ID}_{class}_{trial}_{sex}_{age}_{location}_{quality}_{noise}.wav
Class: {0: 'teeth-chattering', 1: 'teeth-grinding', 2: 'tongue-clicking', 3: 'nose-blowing',
4: 'coughing', 5: 'yawning', 6: 'throat-clearing', 7: 'sighing', 8: 'lip-popping',
9: 'lip-smacking', 10: 'panting', 11: 'crying', 12: 'laughing', 13: 'sneezing', 14: 'moaning', 15: 'screaming'}
Sex: {0: 'Female', 1: 'Male'}
Location: {0: 'indoor', 1: 'outdoor'}
Quality: {0: 'High', 1: 'Low'}
Noise: {0: 'Noiseless', 1: 'Noisy'}
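The naming scheme and code tables above can be turned into a small filename parser. This is a minimal sketch: `ClipInfo` and `parse_clip_name` are hypothetical names, and the code assumes every numeric field in the filename is an integer that indexes the tables exactly as listed.

```python
from pathlib import Path
from typing import NamedTuple

# Lookup tables copied from the code lists above (list index == code).
CLASSES = ["teeth-chattering", "teeth-grinding", "tongue-clicking", "nose-blowing",
           "coughing", "yawning", "throat-clearing", "sighing", "lip-popping",
           "lip-smacking", "panting", "crying", "laughing", "sneezing",
           "moaning", "screaming"]
SEX = ["Female", "Male"]
LOCATION = ["indoor", "outdoor"]
QUALITY = ["High", "Low"]
NOISE = ["Noiseless", "Noisy"]


class ClipInfo(NamedTuple):
    speaker_id: str
    label: str
    trial: int
    sex: str
    age: int
    location: str
    quality: str
    noise: str


def parse_clip_name(filename):
    """Decode {speaker_ID}_{class}_{trial}_{sex}_{age}_{location}_{quality}_{noise}.wav."""
    # Split from the right so the speaker ID stays whole even if it
    # ever contained an underscore.
    parts = Path(filename).stem.rsplit("_", 7)
    if len(parts) != 8:
        raise ValueError(f"unexpected filename format: {filename!r}")
    speaker_id = parts[0]
    cls, trial, sex, age, location, quality, noise = (int(p) for p in parts[1:])
    return ClipInfo(speaker_id, CLASSES[cls], trial, SEX[sex], age,
                    LOCATION[location], QUALITY[quality], NOISE[noise])


info = parse_clip_name("0C1S_4_8_0_27_0_1_1.wav")
print(info.label, info.sex, info.age, info.quality, info.noise)
# coughing Female 27 Low Noisy
```

Returning a `NamedTuple` keeps the decoded fields readable while still letting them be unpacked positionally in the same order as the filename.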
Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
- Deeply Korean Read Speech Corpus
- Pairs of Korean speakers reading scripts with 3 text sentiments using 3 vocal sentiments. Recorded in 3 types of places, at 3 distinct distances, with 2 types of smartphone.
- Deeply Parent-Child Vocal Interaction Dataset
- Interactions between parent-child pairs (reading fairy tales, singing children’s songs, conversing, and others). Recorded in 3 types of places, at 3 distinct distances, with 2 types of smartphone.