Music Source Separation (undergrad. thesis)

This repo contains the pytorch implementations of MDensenet [1] and Sparse-NMF [2] that were used for the experiments of my undergraduate thesis. These models can be used in a supervised manner to learn from clean vocals and accompaniment data in order to separate these sources at test time. Also, there is a pytorch implementation for the NNDSVD method [4] of initializing the NMF matrices.

The structure of the repo and setup instructions are presented next. Finally, some notes about the experiments are given.

Structure

The repo contains the source relevant to the models as well as setup files. In the root directory there is environment-gpu-cuda10.yml that can be used to create a conda environment with the necesesary modules. This configuration file is based on the one used by Open-Unmix.

In mdensenet/ there is the MDensenet model's source, a script to separate a track using a trained model and an exemplar json file with mdensenet's parameters. The directory contains the efficient_densenet_pytorch repo as a submodule, because it is used as a dependency.

In sparse-nmf/ there is the Sparse-NMF implementation along with scripts for training a source dictionary with MUSDB18's [3] tracks and using trained dictionaries for the separation of mixtures.

Setup instructions

In order to setup the necessary dependencies it is recommended to update the densenet repo submodule and use anaconda to create a dedicated environment with the dependencies listed in environment-gpu-cuda10.yml. For these, you can run the following commands from your CLI:

conda env create -f environment-gpu-cuda10.yml
git submodule update --init --recursive

To activate the dedicated environment, use conda activate mss-thesis-pytorch-gpu.

(The conda env setup is based on the Open-Unmix repo, that shows this handy way of installing the necessary dependencies 🙏)

Notes on the experiments

The MUSDB18 dataset [3] was used along with source from the Open-Unmix repo [5] for the experiments, in order to train the models and evaluate them on the separation quality as well as on train and separation times.

References

[1] Takahashi, Naoya, and Yuki Mitsufuji. "Multi-scale multi-band densenets for audio source separation." 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). IEEE, 2017.

[2] Le Roux, Jonathan, Felix J. Weninger, and John R. Hershey. "Sparse NMF–half-baked or well done?." Mitsubishi Electric Research Labs (MERL), Cambridge, MA, USA, Tech. Rep., no. TR2015-023 11 (2015): 13-15.

[3] Rafii, Zafar, et al. "MUSDB18-a corpus for music separation." (2017).

[4] Boutsidis, Christos, and Efstratios Gallopoulos. "SVD based initialization: A head start for nonnegative matrix factorization." Pattern recognition 41.4 (2008): 1350-1362.

[5] Stöter, Fabian-Robert, et al. "Open-unmix-a reference implementation for music source separation." Journal of Open Source Software 4.41 (2019): 1667.

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
mdensenet		mdensenet
sparse-nmf		sparse-nmf
.gitignore		.gitignore
.gitmodules		.gitmodules
README.md		README.md
environment-gpu-cuda10.yml		environment-gpu-cuda10.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Music Source Separation (undergrad. thesis)

Structure

Setup instructions

Notes on the experiments

References

About

Releases

Packages

Languages

georgetz15/mss-thesis

Folders and files

Latest commit

History

Repository files navigation

Music Source Separation (undergrad. thesis)

Structure

Setup instructions

Notes on the experiments

References

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages