Audio prompt tuning for universal sound separation

This is the official repository for the paper "Audio Prompt Tuning for Universal Sound Separation". Audio prompt tuning (APT) is a simple yet effective approach for enhancing existing universal sound separation systems. It improves separation performance on specific sources by training a small number of prompt parameters on limited data, while preserving the generalization of the universal sound separation model by keeping its parameters frozen. The number of tuned parameters is less than 0.1% of the number of parameters in the backbone model.
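The idea of tuning only a prompt while the backbone stays frozen can be illustrated with a toy example. The sketch below is not the code of this repository: the one-gain "backbone", the names `separate` and `tune_prompt`, and the synthetic data are all assumptions made for illustration. It only shows that gradient descent updates the prompt vector while the backbone weights are never modified.

```python
import random


def separate(frozen_v, prompt, mixture):
    # Toy stand-in for a separation backbone (an assumption, not the real model):
    # the prompt determines a single gain that is applied to the mixture.
    gain = sum(v * p for v, p in zip(frozen_v, prompt))
    return [gain * x for x in mixture]


def mse(estimate, target):
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


def average_loss(frozen_v, prompt, pairs):
    return sum(mse(separate(frozen_v, prompt, m), t) for m, t in pairs) / len(pairs)


def tune_prompt(frozen_v, prompt, pairs, lr=0.1, steps=200):
    """Gradient descent on the prompt only; frozen_v is read but never written."""
    prompt = list(prompt)
    for _ in range(steps):
        grad = [0.0] * len(prompt)
        for mixture, target in pairs:
            estimate = separate(frozen_v, prompt, mixture)
            # Derivative of the MSE with respect to the gain.
            d_gain = sum(
                2 * (e - t) * x for e, t, x in zip(estimate, target, mixture)
            ) / len(target)
            # Chain rule: d(gain)/d(prompt[i]) = frozen_v[i].
            for i, v in enumerate(frozen_v):
                grad[i] += d_gain * v / len(pairs)
        prompt = [p - lr * g for p, g in zip(prompt, grad)]
    return prompt


random.seed(0)
frozen_v = [0.5, -0.3, 0.8]           # stands in for the frozen backbone weights
backbone_snapshot = list(frozen_v)    # kept to check that nothing changed

# Synthetic "limited data": the target source is 0.6 times the mixture.
pairs = []
for _ in range(4):
    mixture = [random.uniform(-1, 1) for _ in range(16)]
    pairs.append((mixture, [0.6 * x for x in mixture]))

initial_prompt = [0.0, 0.0, 0.0]
loss_before = average_loss(frozen_v, initial_prompt, pairs)
tuned_prompt = tune_prompt(frozen_v, initial_prompt, pairs)
loss_after = average_loss(frozen_v, tuned_prompt, pairs)

print("loss before tuning:", loss_before)
print("loss after tuning: ", loss_after)
print("backbone unchanged:", frozen_v == backbone_snapshot)
```

In a real system the backbone would be a large neural network and the prompt an embedding fed to it as a query; the division of roles is the same, with only the prompt receiving gradient updates.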

Demo Page

Demo page

Results

We evaluate our method on the MUSDB18 and ESC-50 datasets. The following table lists the average SDR scores of APT and of the average prompt embedding without tuning (Baseline).

Model	MUSDB18_fulldata	ESC-50_fulldata
APT	4.98	8.50
Baseline	4.31	6.44
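For readers unfamiliar with the metric, the sketch below computes a signal-to-distortion ratio in its plain energy-ratio form, SDR = 10 log10(||s||^2 / ||s - s_hat||^2), and averages it over several clips. This README does not state which SDR variant or toolkit was used for the scores above, so treat the definition, the `sdr` and `average_sdr` names, and the `eps` term as assumptions.

```python
import math


def sdr(reference, estimate, eps=1e-10):
    """Plain energy-ratio SDR in dB (one common definition, assumed here)."""
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps guards against division by zero and log of zero.
    return 10 * math.log10((signal_energy + eps) / (error_energy + eps))


def average_sdr(pairs):
    """Mean SDR over (reference, estimate) pairs."""
    return sum(sdr(ref, est) for ref, est in pairs) / len(pairs)


# Toy check: an estimate with a 10% amplitude error gives approximately 20 dB.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9 * x for x in reference]
score = sdr(reference, estimate)
print(round(score, 3))
```

A higher SDR means the estimate is closer to the reference source.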

Few-shot experiments are carried out on the ESC-50 dataset.

Model	ESC-50_1-shot	ESC-50_5-shot	ESC-50_10-shot
APT	4.57	6.68	7.59
Baseline	4.09	5.59	6.10
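A k-shot training subset could be drawn as sketched below, assuming that "k-shot" means k training clips per category. The README does not describe the sampling protocol, so the `sample_k_shot` helper, the clip identifiers, and the fixed seed are hypothetical.

```python
import random
from collections import defaultdict


def sample_k_shot(items, k, seed=0):
    """Pick k clips per label from a list of (clip_id, label) pairs.

    Hypothetical helper: the actual few-shot protocol is not given in this README.
    """
    by_label = defaultdict(list)
    for clip_id, label in items:
        by_label[label].append(clip_id)
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    return {
        label: rng.sample(sorted(clips), k)
        for label, clips in sorted(by_label.items())
    }


# Toy metadata: 3 made-up categories with 40 clips each
# (ESC-50 itself has 50 categories with 40 clips each).
items = [
    (f"{label}_{i:02d}.wav", label)
    for label in ("dog", "rain", "siren")
    for i in range(40)
]

five_shot = sample_k_shot(items, k=5)
for label, clips in five_shot.items():
    print(label, clips)
```

Calling `sample_k_shot` with `k=1`, `k=5`, or `k=10` would give subsets matching the three settings in the table above.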

Detailed results for the 50 categories of the ESC-50 dataset are available in Results-ESC50.csv.

Cite our work

To be added after publication.
