NOTE: This project is under heavy development at the moment, and this description may not reflect recent (potentially large) changes.
This is a demo for capturing spectrograms, mel-scale spectrograms, and mel-frequency cepstral coefficients (MFCCs), and identifying formants using GDScript.
The demo includes a scene + script for capturing audio over time and generating images on close.
The demo also includes a scene + script for showing a spectrogram in realtime over a short time window.
Both demos identify the first 4 formants in the analyzed audio. The formants are drawn in green on the spectrogram image. The realtime demo uses a faster/less-accurate dynamic compression method for this purpose.
Both demos give a realtime visualization of the bucket levels using progress bars, and labels for the formant frequencies.
The spectrogram images look like:
The mel-scale spectrogram images look like:
The MFCC images look like:
The non-realtime demo generates the capture on exit_tree, which means you have to close the app with the X on the window for it to work. Pressing the stop debugging button doesn't trigger the signal. You can bind this to anything; I was just lazy while prototyping. Capture starts as soon as you press play.
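If closing the window is inconvenient, the save step can be bound to an input action instead. The following is only a minimal sketch: `save_capture()` is a hypothetical stand-in for whatever the demo script currently does on exit_tree, and `ui_accept` is just Godot's default Enter/Space action, chosen for illustration.

```gdscript
extends Node

# Hypothetical placeholder: move the image-generation code that
# currently runs on exit_tree into a function like this one.
func save_capture() -> void:
	print("Saving capture...")

func _unhandled_input(event: InputEvent) -> void:
	# "ui_accept" is a built-in action (Enter/Space by default).
	# Any action defined in the project's Input Map would work here.
	if event.is_action_pressed("ui_accept"):
		save_capture()
```

With something like this in place, the capture could be written out while running from the editor, without relying on the window's X button.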
You may need to adjust the FFT size or NUM_BUCKETS to suit your needs and/or hardware capabilities.
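To get a feel for what these two settings trade off, the sketch below computes the frequency width of one FFT bin and a set of band edges spaced evenly on the mel scale. It is not taken from the demo scripts: the constant values, the function names, and the assumption that buckets are mel-spaced are all illustrative, and the Hz-to-mel conversion uses the common HTK-style formula.

```gdscript
extends Node

# Assumed values for illustration; the demo's actual settings may differ.
const SAMPLE_RATE := 44100.0
const FFT_SIZE := 2048
const NUM_BUCKETS := 40

# Hz -> mel, HTK-style formula. GDScript's log() is the natural log,
# so divide by log(10.0) to get a base-10 logarithm.
func hz_to_mel(hz: float) -> float:
	return 2595.0 * log(1.0 + hz / 700.0) / log(10.0)

# Inverse of hz_to_mel().
func mel_to_hz(mel: float) -> float:
	return 700.0 * (pow(10.0, mel / 2595.0) - 1.0)

# Returns count + 1 band edges (in Hz) spaced evenly on the mel scale.
func mel_bucket_edges(min_hz: float, max_hz: float, count: int) -> Array:
	var edges := []
	var lo := hz_to_mel(min_hz)
	var hi := hz_to_mel(max_hz)
	for i in range(count + 1):
		edges.append(mel_to_hz(lo + (hi - lo) * float(i) / float(count)))
	return edges

func _ready() -> void:
	# Width of one FFT bin in Hz: a larger FFT gives finer resolution
	# but needs more samples (and more work) per frame.
	print("Hz per FFT bin: ", SAMPLE_RATE / FFT_SIZE)
	print(mel_bucket_edges(0.0, SAMPLE_RATE / 2.0, NUM_BUCKETS))
```

With these assumed values, each FFT bin covers roughly 21.5 Hz. Raising NUM_BUCKETS narrows each band, but the lowest mel bands can end up narrower than a single FFT bin, so the two settings are best tuned together.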
The .gitignore is set to ignore the .tres files in the captures folder because they can be too large for GitHub, depending on the length of the capture.
This software contains assets from the Librosa repo (sample sounds for validation). See LICENSE.LIBROSA.md for information on permissions.
This software is released under the MIT License. See LICENSE for more information.
Created By: Ryan Powell, 2024.