Based on: SciPy, NumPy, scikit-learn.
The code in mfcc.py partly originates from scikits.talkbox.
Developed on Python 3.
The program currently uses MFCC (Mel Frequency Cepstral Coefficients), △MFCC and △△MFCC as the feature coefficients.
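To illustrate how the three feature sets relate, here is a minimal sketch that derives △MFCC and △△MFCC from an MFCC matrix and stacks them. It assumes one row per frame and uses `np.gradient` as the difference operator; the delta computation this project actually uses may differ, and the names `delta` and `stack_with_deltas` are hypothetical.

```python
import numpy as np

def delta(features: np.ndarray) -> np.ndarray:
    """Approximate the time derivative along axis 0 (frames).

    np.gradient uses central differences in the interior and
    one-sided differences at the first and last frame, so the
    output has the same shape as the input.
    """
    return np.gradient(features, axis=0)

def stack_with_deltas(mfcc: np.ndarray) -> np.ndarray:
    """Return [MFCC | delta | delta-delta] side by side for each frame."""
    d1 = delta(mfcc)   # first-order differences (△MFCC)
    d2 = delta(d1)     # second-order differences (△△MFCC)
    return np.hstack([mfcc, d1, d2])

# Stand-in data (assumption): 100 frames of 13 coefficients each.
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((100, 13))
features = stack_with_deltas(mfcc)
print(features.shape)  # (100, 39)
```

Stacking the deltas triples the feature dimension while keeping one row per frame, so the classifier sees both the spectral shape and how it changes over time.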
An SVM is used as the classifier and is trained with the "one-against-one" approach.
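As a rough illustration of the one-against-one scheme, the sketch below trains scikit-learn's `SVC` on synthetic data. The data, the default kernel, and the instrument labels are assumptions made for the example; they are not the settings used in trainmodel_SVM.py.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-in features (assumption): 4 well-separated classes,
# 30 samples each, 39 dimensions (e.g. 13 MFCC + 13 delta + 13 delta-delta).
rng = np.random.default_rng(0)
labels = ["piano", "guitar", "violin", "flute"]
centers = [0.0, 5.0, 10.0, 15.0]
X = np.vstack([rng.standard_normal((30, 39)) + c for c in centers])
y = np.repeat(labels, 30)

# SVC handles multiclass problems with one-against-one internally;
# decision_function_shape="ovo" exposes the raw pairwise scores.
clf = SVC(decision_function_shape="ovo")
clf.fit(X, y)

# With 4 classes there are 4 * 3 / 2 = 6 pairwise classifiers.
scores = clf.decision_function(X[:1])
print(scores.shape)  # (1, 6)
print(clf.predict(X[:1]))
```

With k instruments, one-against-one trains k(k-1)/2 binary classifiers and predicts by voting among them.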
Follow these steps to run it:
- Convert all the training and testing audio files to 16-bit, 32-bit, or floating-point .wav files. pydub may help you convert MP3 to WAV.
- Arrange the training audio files in this structure:
  + Store the audio files played by the same instrument in the same folder.
  + Name each folder after its instrument.
  + Put all the folders under the same path.
  + Make sure the path contains no audio files other than the training audio.
- Put the testing audio files together in one folder. The structure should look like this:
training audios/
|
|-piano/
| |-*.wav
| |-*.wav
| |-...
|
|-guitar/
| |-*.wav
| |-*.wav
| |-...
|
|-violin/
| |-*.wav
| |-*.wav
| |-...
|
|-...
testing audios/
|-*.wav
|-*.wav
|-...
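The layout above can be read programmatically by treating each subfolder name as an instrument label. The following is only a sketch of that idea; the helper `collect_training_files` is hypothetical and is not taken from generateMFCC.py.

```python
import tempfile
from pathlib import Path

def collect_training_files(root):
    """Map each instrument folder name to a sorted list of its .wav files."""
    root = Path(root)
    files_by_instrument = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        wavs = sorted(folder.glob("*.wav"))
        if wavs:  # skip folders that contain no .wav files
            files_by_instrument[folder.name] = wavs
    return files_by_instrument

# Demo on a throwaway directory that mimics the "training audios" layout.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("piano", "guitar"):
        (Path(tmp) / name).mkdir()
        (Path(tmp) / name / "a.wav").touch()
    found = collect_training_files(tmp)
    print(sorted(found))  # ['guitar', 'piano']
```

Because the folder name becomes the label, any stray folder under the training path would be treated as an extra instrument, which is why the path should contain training audio only.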
- Run generateMFCC.py. Follow the program's instructions and enter the path of the training audio files (in the example above, this is the path of the folder called "training audios"). The MFCC, △MFCC and △△MFCC features will be saved in instrument_name.npy files.
- Run trainmodel_SVM.py. This produces the SVM model, named model_svm, and a file named names, which stores the names of the instruments.
- Run test.py and enter the path of the testing audio files. The detection results will be shown.
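Before running the steps above, it can help to confirm that each converted file really has one of the sample formats named in the first step. The sketch below does this with `scipy.io.wavfile.read`; the helper `is_supported_wav` and the exact set of accepted dtypes are assumptions for illustration, not part of this project. Note that the check looks only at the dtype SciPy returns, so a file that SciPy widens on read (for example 24-bit audio in recent SciPy versions) is not distinguished from true 32-bit audio.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

# Assumed mapping of "16bit/32bit/floating-point" to NumPy dtypes.
SUPPORTED_DTYPES = {
    np.dtype("int16"),
    np.dtype("int32"),
    np.dtype("float32"),
    np.dtype("float64"),
}

def is_supported_wav(path):
    """Return True if the file reads as 16-bit, 32-bit, or floating-point audio."""
    try:
        _rate, data = wavfile.read(path)
    except ValueError:
        # Raised by SciPy when the file is not a readable WAV file.
        return False
    return data.dtype in SUPPORTED_DTYPES

# Demo: write one float32 file and one 8-bit file, then check both.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.wav")
    eight_bit = os.path.join(tmp, "eight_bit.wav")
    wavfile.write(good, 16000, np.zeros(160, dtype=np.float32))
    wavfile.write(eight_bit, 16000, np.zeros(160, dtype=np.uint8))
    print(is_supported_wav(good), is_supported_wav(eight_bit))
```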