A Comprehensive Audio Analysis Pipeline
MIRAGE is a Python-based pipeline designed for researchers to analyze, modify, and compare audio files, with a particular focus on music. It uses Demucs for stem separation, Librosa for feature extraction, and SQL-based storage for high-level data organization and retrieval. MIRAGE is versatile, handling tasks that range from extracting features and separating stems to converting songs into formula-based representations and reconstructing audio from them.
MIRAGE stores song information and extracted features in a SQLite database for easy access and management.
id | artist_name | song_name |
---|---|---|
1 | Artist Name | Song Title |
Each song entry is unique by artist name and song title. MIRAGE checks for duplicate entries before adding a new song to the database.
song_id | stem | feature_name | feature_values |
---|---|---|---|
1 | vocals | mfccs | -123.45, -120.67, ... |
1 | vocals | chroma | 0.12, 0.34, ... |
1 | drums | mfccs | -130.56, -125.78, ... |
... | ... | ... | ... |
Each row corresponds to a specific feature (such as `mfccs` or `mel_spectrogram`) of a particular stem (e.g., vocals, bass, or drums). Feature values are stored as serialized arrays to enable efficient data access and comparison.
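The two tables above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not MIRAGE's actual schema: the column types, the `UNIQUE` constraint used for the duplicate check, and the JSON serialization of feature arrays are all assumptions.

```python
import json
import sqlite3

def init_db(path=":memory:"):
    """Create the songs and features tables shown above (illustrative schema)."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS songs (
            id          INTEGER PRIMARY KEY AUTOINCREMENT,
            artist_name TEXT NOT NULL,
            song_name   TEXT NOT NULL,
            UNIQUE (artist_name, song_name)
        );
        CREATE TABLE IF NOT EXISTS features (
            song_id        INTEGER REFERENCES songs(id),
            stem           TEXT,
            feature_name   TEXT,
            feature_values TEXT  -- serialized array (JSON here)
        );
    """)
    return conn

def add_song(conn, artist, title):
    """Insert a song unless the (artist, title) pair already exists; return its id."""
    conn.execute(
        "INSERT OR IGNORE INTO songs (artist_name, song_name) VALUES (?, ?)",
        (artist, title),
    )
    row = conn.execute(
        "SELECT id FROM songs WHERE artist_name = ? AND song_name = ?",
        (artist, title),
    ).fetchone()
    return row[0]

def add_feature(conn, song_id, stem, name, values):
    """Store one feature row, serializing the array as JSON text."""
    conn.execute(
        "INSERT INTO features VALUES (?, ?, ?, ?)",
        (song_id, stem, name, json.dumps(list(values))),
    )
```

The `INSERT OR IGNORE` plus the `UNIQUE` constraint is one way to realize the duplicate check by artist name and song title described above.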
For each stem, MIRAGE extracts the following features:
- MFCCs (Mel-Frequency Cepstral Coefficients): 20 coefficients to capture spectral characteristics.
- Chroma: 12 chroma bins representing the 12 semitones in an octave.
- Spectral Contrast: 7 coefficients capturing contrast between peaks and valleys in each sub-band.
- Tonnetz: A 6-dimensional representation of tonal centroid features.
Additional features such as the mel spectrogram are available and stored as needed. The dimensionality of each feature type follows the standard default parameters used in `librosa`.
Upon running the pipeline, the user is prompted to select one of the following options:
Add a New Song
- User Inputs: Artist name, song title, and file path.
- Process:
- The audio file is loaded, normalized, and saved in stereo format.
- MIRAGE extracts selected features (MFCCs, chroma, mel spectrogram) for both the full song and individual stems (vocals, drums, bass, other).
- MIRAGE checks for duplicate songs based on artist name, song title, and feature similarity. If the song is nearly identical to an existing entry, it performs stem separation only, without adding a duplicate record.
- Output:
- Original audio and stems saved in organized directories.
- OSCR metric (original and reconstructed song similarity) calculated and displayed.
- Song data, stem features, and computed metrics saved to the database.
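The load-and-normalize step in the process above might look like the following sketch. In the pipeline the samples would come from a call such as `librosa.load(path, mono=False)`; the peak-normalization strategy and mono-to-stereo duplication shown here are assumptions:

```python
import numpy as np

def normalize_stereo(y):
    """Peak-normalize audio and ensure a stereo (2, n_samples) layout."""
    y = np.atleast_2d(np.asarray(y, dtype=np.float32))
    if y.shape[0] == 1:
        # mono input: duplicate the channel to produce stereo
        y = np.vstack([y, y])
    peak = np.max(np.abs(y))
    # scale so the loudest sample sits at full scale; leave silence untouched
    return y / peak if peak > 0 else y
```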
View Database
Displays all songs and features currently stored in the database. This option helps researchers quickly access the stored audio files and their associated features.
Compare Two Songs
- User Prompts:
- User selects two songs from the database.
- Process:
- For each selected feature (mel spectrogram, MFCCs, chroma, spectral contrast, and tonnetz), MIRAGE calculates the cosine similarity between the corresponding stems.
- Output:
- Cosine similarity scores for each feature are displayed, allowing users to quantify similarities between two songs at various levels of the audio structure.
Reconstruct Audio from Stems
- User Prompts:
- Select a song to reconstruct.
- If the stems are unavailable, the user is prompted to split them.
- User selects which stems (e.g., vocals, drums, bass) to combine into a new audio file.
- Process:
- MIRAGE combines the selected stems and saves the reconstructed audio in a custom directory.
- Output:
- A combined audio file is saved, allowing researchers to analyze specific portions of the song or create new audio samples.
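Combining the selected stems can be sketched as a sample-wise sum followed by re-normalization. The zero-padding of shorter stems and the peak-normalization are assumptions, not necessarily what MIRAGE does:

```python
import numpy as np

def merge_stems(stems):
    """Sum selected stems sample-wise, padding shorter stems with silence."""
    n = max(s.shape[-1] for s in stems)
    mix = np.zeros(n)
    for s in stems:
        mix[: s.shape[-1]] += s  # shorter stems contribute silence at the tail
    peak = np.max(np.abs(mix))
    # re-normalize so the summed mix does not clip
    return mix / peak if peak > 0 else mix
```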
Convert a Song to Formulas
- User Prompts:
- Select a song to convert.
- Process:
- MIRAGE performs a Short-Time Fourier Transform (STFT) on the audio file to represent each time slice as a formula based on frequency, magnitude, and phase.
- The resulting equations are stored in a text file.
- Output:
- A directory containing the song's formula text file. This file serves as an analytical representation of the song's structure and allows researchers to synthesize the song later.
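One time slice of the STFT can be rendered as a sum-of-sinusoids formula built from each bin's frequency, magnitude, and phase. The text layout below is hypothetical; the actual formula format is defined by MIRAGE:

```python
import numpy as np

def frame_to_formula(frame, sr):
    """Turn one time slice into a sum-of-cosines formula string (illustrative format)."""
    spec = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    mags, phases = np.abs(spec), np.angle(spec)
    # keep only bins with non-negligible energy
    terms = [
        f"{m:.4f}*cos(2*pi*{f:.1f}*t + {p:.4f})"
        for f, m, p in zip(freqs, mags, phases)
        if m > 1e-6
    ]
    return " + ".join(terms)
```

Writing one such line per STFT frame yields a plain-text file that captures magnitude and phase per frequency bin, from which the waveform can later be resynthesized.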
Synthesize a Song from a Formula File
- User Prompts:
- Select a formula text file to synthesize.
- Process:
- MIRAGE uses the saved formula to recreate the song by reconstructing the original waveform.
- Output:
- A synthesized version of the song is saved in the designated directory.
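Synthesis inverts the formula step: each stored magnitude/phase pair is recombined into a complex spectrum and inverse-transformed back into samples. A minimal per-frame sketch (windowing and overlap-add across frames are omitted):

```python
import numpy as np

def synthesize_frame(mags, phases, n):
    """Rebuild one time slice from stored magnitudes and phases."""
    spec = np.asarray(mags) * np.exp(1j * np.asarray(phases))
    # inverse real FFT recovers the original n-sample frame
    return np.fft.irfft(spec, n=n)
```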
Output Directory Structure
- output/original: Contains original versions of all added songs.
- output/stems: Stores the separated stems (vocals, drums, bass, other) for each song.
- output/reconstructed: Contains audio reconstructed from the saved stems.
- output/formulas: Stores formula-based representations of songs as text files.
- output/synthesized: Contains synthesized versions of songs recreated from formulas.
- output/custom merged stems: Stores custom reconstructions where the user has combined specific stems.
This folder uses the `rvc_python` package to provide a command-line tool for running inference with pre-trained retrieval-based voice conversion (RVC) models on input audio files. Specific steps for running rvc_programmatic can be found in {root_directory}/research/rvc_programmatic/README.md
audio_splitter is a command-line tool that segments audio files into roughly even-length segments for use as training data for the RVC model. Segments are only roughly even because the audio is preprocessed to discard silent chunks, and the remaining non-silent chunks vary in length. Chunks are not combined across silences to equalize segment length, because audio on either side of a silence can differ greatly and would make poor training data. Steps for running this command-line tool can be found in {root_directory}/audio_splitter/README.md
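The silence-discarding step can be sketched as follows. The frame size and amplitude threshold are illustrative, not audio_splitter's actual defaults, and the real tool works on files rather than in-memory arrays:

```python
import numpy as np

def nonsilent_intervals(y, sr, frame_ms=25, threshold=1e-3):
    """Return (start, end) sample intervals of non-silent audio.

    Intervals are never merged across silence, so their lengths vary,
    which is why the resulting training segments are only roughly even.
    """
    frame = max(1, int(sr * frame_ms / 1000))
    # mark each fixed-size frame as active if its peak exceeds the threshold
    active = [
        np.max(np.abs(y[i:i + frame])) > threshold
        for i in range(0, len(y), frame)
    ]
    intervals, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i * frame            # entering a non-silent run
        elif not a and start is not None:
            intervals.append((start, i * frame))  # leaving it
            start = None
    if start is not None:
        intervals.append((start, len(y)))
    return intervals
```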
- Main Folder: Google Drive Link
- Documentation: MIRAGE Documentation
- XAI Project Thought & Task List: Access this on the iPhone Notes app (if shared).