
A PyTorch implementation (with batch and channel support) of Google Brain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition


guynich/SpecAugmentPyTorch

 
 


SpecAugment

An implementation of SpecAugment for PyTorch.

How to use

Install PyTorch (version 1.6.0 was used for testing).

import torch
from spec_augment_pytorch import SpecAugmentTorch
from spec_augment_pytorch import visualization_spectrogram

# Augmentation parameters; the names appear to follow the notation of the SpecAugment paper [3].
p = {'W':40, 'F':29, 'mF':2, 'T':50, 'p':1.0, 'mT':2, 'batch':False}
specaug_fn = SpecAugmentTorch(**p)

# Input layout: [batch, c, frequency, n_frame]; c=1 for magnitude or mel-spec, c=2 for complex stft
spec = torch.randn(1, 1, 257, 150)  # c=1 here
spec_aug = specaug_fn(spec)  # [b, c, f, t]
visualization_spectrogram(spec_aug[0][0], "blabla")
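
The snippet above uses random data with c=1. As a rough sketch of how real inputs in the [batch, c, frequency, n_frame] layout might be prepared, the example below builds both a c=2 tensor (real and imaginary parts stacked as channels) and a c=1 magnitude tensor from a waveform. It assumes a recent PyTorch release in which torch.stft accepts return_complex=True (newer than the 1.6.0 mentioned above), and it assumes that "c=2 for complex stft" means real/imaginary channels. The waveform, sample rate, and hop length are hypothetical.

```python
import torch

# Hypothetical input: one second of random "audio" at 16 kHz, shape [batch, samples].
waveform = torch.randn(1, 16000)

n_fft = 512  # yields n_fft // 2 + 1 = 257 frequency bins, as in the example above
stft = torch.stft(
    waveform,
    n_fft=n_fft,
    hop_length=128,  # hypothetical hop length
    window=torch.hann_window(n_fft),
    return_complex=True,
)  # complex tensor of shape [b, f, t]

# Assumption: real and imaginary parts become the two channels.
# [b, f, t] complex -> [b, f, t, 2] real -> [b, 2, f, t]
complex_stft = torch.view_as_real(stft).permute(0, 3, 1, 2).contiguous()

# Magnitude spectrogram with a single channel: [b, 1, f, t]
magnitude = stft.abs().unsqueeze(1)

print(complex_stft.shape, magnitude.shape)
```

Either tensor could then be passed to specaug_fn in place of the random input, provided the layout assumption holds for this repository.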

Run the command python spec_augment_pytorch.py to generate examples (processed wav files and spectrogram visualizations).

1089-0001: spectrogram
1089-0001-SpecAug: augmented spectrogram
1089-0002: spectrogram
1089-0002-SpecAug: augmented spectrogram
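
SpecAugment, as described in [3], combines time warping with frequency masking and time masking. The masking part can be illustrated with a minimal sketch in plain PyTorch. This is not the code of this repository: the function name mask_spectrogram is hypothetical, time warping is omitted, and the same masks are applied to every batch item and channel for simplicity. The parameters F, mF, T, mT and p are used in the sense of the paper (maximum mask widths, mask counts, and the upper bound on time-mask width as a fraction of the frames).

```python
import torch


def mask_spectrogram(spec, F=29, mF=2, T=50, mT=2, p=1.0, generator=None):
    """Zero random frequency bands and time spans of a [b, c, f, t] tensor.

    Illustrative sketch only (hypothetical helper, no time warping).
    """
    out = spec.clone()  # leave the input untouched
    n_freq, n_time = out.shape[-2], out.shape[-1]

    # Frequency masking: mF bands, each of width drawn from [0, F].
    max_f = min(F, n_freq)
    for _ in range(mF):
        width = int(torch.randint(0, max_f + 1, (1,), generator=generator))
        start = int(torch.randint(0, max(1, n_freq - width), (1,), generator=generator))
        out[..., start:start + width, :] = 0.0

    # Time masking: mT spans, each of width drawn from [0, min(T, p * n_time)].
    max_t = min(T, int(p * n_time))
    for _ in range(mT):
        width = int(torch.randint(0, max_t + 1, (1,), generator=generator))
        start = int(torch.randint(0, max(1, n_time - width), (1,), generator=generator))
        out[..., start:start + width] = 0.0

    return out


spec = torch.ones(1, 1, 257, 150)  # [batch, c, frequency, n_frame]
masked = mask_spectrogram(spec)
print(masked.shape)
```

Masking with zeros is one common choice; masking with the mean value of the spectrogram is another, and which one this repository uses is not stated here.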

Reference

[1] DemisEom/SpecAugment

[2] zcaceres/spec_augment issue17

[3] SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
