Sound classification using Recurrent Neural Networks

This repository is an RNN implementation in TensorFlow that classifies audio clips of different lengths. The input of the neural network is not the raw sound but its MFCC features (20 features).

As shown in the following figure, each audio file is first converted to MFCC features and then split into sub-samples of 2 seconds. The result of the preprocessing is a list of fixed-length sequences with 20 features each (here, the file produces 3 sequences).
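The splitting step can be sketched as follows. This is a minimal illustration and not the code from RNN.py: the function name `split_into_sequences`, the `(n_frames, 20)` layout of the MFCC matrix, and the value of 87 frames per 2-second window are all assumptions (the real frame count depends on the sample rate and hop length used for the MFCC extraction).

```python
import numpy as np

def split_into_sequences(mfcc, frames_per_sequence):
    """Split a (n_frames, n_features) MFCC matrix into fixed-length chunks.

    The last chunk is zero-padded so every sequence has the same length.
    Returns (sequences, lengths), where sequences has shape
    (n_sequences, frames_per_sequence, n_features) and lengths holds the
    number of real (unpadded) frames in each sequence.
    """
    n_frames, n_features = mfcc.shape
    # Ceiling division; keep at least one sequence even for an empty input.
    n_sequences = max(1, -(-n_frames // frames_per_sequence))
    padded = np.zeros((n_sequences * frames_per_sequence, n_features), dtype=mfcc.dtype)
    padded[:n_frames] = mfcc
    sequences = padded.reshape(n_sequences, frames_per_sequence, n_features)
    lengths = np.full(n_sequences, frames_per_sequence)
    lengths[-1] = n_frames - (n_sequences - 1) * frames_per_sequence
    return sequences, lengths

# Hypothetical file: 200 MFCC frames with 20 features, 87 frames per window.
mfcc = np.ones((200, 20), dtype=np.float32)
sequences, lengths = split_into_sequences(mfcc, frames_per_sequence=87)
print(sequences.shape, lengths)
```

With these assumed numbers the file yields three sequences, the last of which contains 26 real frames followed by zero padding.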

If necessary, the sequences are padded with zeros so that the input of the neural network has a fixed size. The network can still retrieve the effective length of each sequence and skip the padded zeros, which makes it more efficient.
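One common way to recover the effective length from a zero-padded batch is to count the frames that contain at least one non-zero feature. The sketch below shows the idea in NumPy; the function name `effective_lengths` is hypothetical, and whether RNN.py computes the length exactly this way is an assumption. In TensorFlow 1.x, a length vector like this can be passed as the `sequence_length` argument of `tf.nn.dynamic_rnn` so that padded steps are skipped.

```python
import numpy as np

def effective_lengths(batch):
    """Return the number of non-padded frames for each sequence in the batch.

    batch has shape (n_sequences, max_frames, n_features). A frame counts
    as padding when all of its features are exactly zero.
    """
    # 1.0 for frames with any non-zero feature, 0.0 for all-zero frames.
    used = np.sign(np.max(np.abs(batch), axis=2))
    return used.sum(axis=1).astype(np.int32)

# Two sequences padded to 5 frames: the first has 3 real frames, the second 5.
batch = np.zeros((2, 5, 20), dtype=np.float32)
batch[0, :3] = 1.0
batch[1, :5] = 0.5
print(effective_lengths(batch))
```

Note that this counts non-zero frames, so a real frame whose features are all exactly zero would be treated as padding. With MFCC features this is unlikely in practice.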

Since one file can yield several sequences, the predictions for all sequences from the same file are averaged so that each file receives a single label.
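The per-file averaging can be sketched as below. It assumes that what is averaged is the vector of class probabilities predicted for each sequence, and that the label is the class with the highest mean probability. The function name `average_per_file` and the file identifiers are hypothetical.

```python
import numpy as np

def average_per_file(probabilities, file_ids):
    """Average per-sequence class probabilities and pick one label per file.

    probabilities: array-like of shape (n_sequences, n_classes).
    file_ids: array-like of length n_sequences naming the source file
    of each sequence.
    Returns a dict mapping each file id to its predicted class index.
    """
    probabilities = np.asarray(probabilities, dtype=float)
    file_ids = np.asarray(file_ids)
    labels = {}
    for fid in np.unique(file_ids):
        mean_probs = probabilities[file_ids == fid].mean(axis=0)
        labels[fid.item()] = int(np.argmax(mean_probs))
    return labels

# Three sequences from "a.wav" and one from "b.wav", two classes.
probs = [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7], [0.6, 0.4]]
ids = ["a.wav", "a.wav", "a.wav", "b.wav"]
print(average_per_file(probs, ids))
```

Averaging probabilities rather than taking a majority vote lets a confident sequence outweigh several uncertain ones, although either choice would be a reasonable design.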

I used this network to classify sounds for my first Kaggle competition, but I still need to dig into the data to improve the results.

Sources

This repository and this notebook helped me understand MFCC feature extraction.
This post explains how to take the variable length of the sequences into account.

fabien-brulport/RNN-Sound-classification
