ExaggeratedRumors/demooder

Demooder

Android support Model In progress

Android application that uses recorded sound input to recognize voice.

Release

in progress

Technologies

  • Gradle 8.2
  • JVM 11
  • Android SDK 34
  • Kotlin 1.9.20
  • Jetpack Compose 1.6.10
  • Compose Multiplatform 1.6.10
  • KotlinDL 0.5.2
  • cudnn 7.6.3

Modules

  • app - mobile application.
  • model - CNN model execution.
  • processing - Common library.

Executing

  1. Clone the repository:
https://github.com/ExaggeratedRumors/demooder.git
  2. Download the AudioWav data: Download from Kaggle.
  3. Unzip the WAV files into the data_audio directory (demooder-model/data_audio relative to the repository root).
  4. [optional] Run the data augmentation task:
./gradlew :model:dataAugmentation
  5. Run the spectrogram creation task:
./gradlew :model:createSpectrograms
  6. Run the model training task:
./gradlew :model:trainModel
  7. The output model is saved in the data_models directory.

Sound data

Source: CREMA-D

Audio data augmentation

  1. Audio data augmentation: about audio data augmentation.
  2. Gaussian noise.
  3. Time stretching.
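
The two augmentations listed above can be sketched in plain Kotlin as follows. This is only an illustration, not the code of the :model:dataAugmentation task: the function names, the sigma and rate parameters, and the test tone are assumptions. The stretch shown here is a naive linear-interpolation resampling, which also shifts pitch; a pitch-preserving time stretch would need something like a phase vocoder.

```kotlin
import java.util.Random
import kotlin.math.PI
import kotlin.math.sin

/** Adds zero-mean Gaussian noise with standard deviation [sigma] to every sample. */
fun addGaussianNoise(samples: DoubleArray, sigma: Double, rng: Random = Random()): DoubleArray =
    DoubleArray(samples.size) { i -> samples[i] + sigma * rng.nextGaussian() }

/**
 * Naive stretch by linear interpolation (hypothetical helper).
 * rate > 1 shortens the signal, rate < 1 lengthens it. Pitch changes as well.
 */
fun stretchLinear(samples: DoubleArray, rate: Double): DoubleArray {
    require(rate > 0.0) { "rate must be positive" }
    if (samples.isEmpty()) return samples
    val outSize = maxOf(1, (samples.size / rate).toInt())
    return DoubleArray(outSize) { i ->
        val pos = i * rate                                   // position in the input signal
        val lo = pos.toInt().coerceAtMost(samples.size - 1)  // left neighbour
        val hi = (lo + 1).coerceAtMost(samples.size - 1)     // right neighbour, clamped at the end
        val frac = pos - lo
        samples[lo] * (1.0 - frac) + samples[hi] * frac
    }
}

fun main() {
    // One second of a 440 Hz tone at 16000 Hz, used only as demo input.
    val tone = DoubleArray(16000) { sin(2.0 * PI * 440.0 * it / 16000.0) }
    val noisy = addGaussianNoise(tone, sigma = 0.01, rng = Random(42))
    val slower = stretchLinear(tone, rate = 0.8)
    println("noisy: ${noisy.size} samples, slower: ${slower.size} samples")
}
```

Passing a seeded Random keeps an augmented data set reproducible between runs.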

Sound signal processing

  1. Read WAV files according to the header scheme: wav file format.
  2. Convert byte data to complex numbers.
  3. Signal windowing: about windowing.
  4. Use Short-Time Fourier Transform (STFT): about STFT, about FFT.
  5. Filter by A-weighting or C-weighting: about weighting.
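
A minimal sketch of the processing chain above, assuming 16-bit little-endian mono PCM data and a Hann window (the README names neither). All function names are hypothetical. For brevity the transform is a naive per-frame DFT rather than an FFT, and samples are treated as complex numbers with a zero imaginary part. The A-weighting curve uses the commonly published analog formula with rounded constants.

```kotlin
import kotlin.math.PI
import kotlin.math.cos
import kotlin.math.hypot
import kotlin.math.log10
import kotlin.math.sin
import kotlin.math.sqrt

/** Decodes 16-bit little-endian PCM bytes into samples in [-1, 1). */
fun pcm16ToSamples(bytes: ByteArray): DoubleArray =
    DoubleArray(bytes.size / 2) { i ->
        val lo = bytes[2 * i].toInt() and 0xFF   // low byte, unsigned
        val hi = bytes[2 * i + 1].toInt()        // high byte, sign-extended
        ((hi shl 8) or lo) / 32768.0
    }

/** Symmetric Hann window of length [n]. */
fun hann(n: Int): DoubleArray =
    DoubleArray(n) { 0.5 - 0.5 * cos(2.0 * PI * it / (n - 1)) }

/** Magnitude STFT: one array of frameSize / 2 + 1 bin magnitudes per frame. */
fun stftMagnitudes(x: DoubleArray, frameSize: Int, hop: Int): List<DoubleArray> {
    val window = hann(frameSize)
    val frames = mutableListOf<DoubleArray>()
    var start = 0
    while (start + frameSize <= x.size) {
        frames += DoubleArray(frameSize / 2 + 1) { k ->
            var re = 0.0
            var im = 0.0
            for (n in 0 until frameSize) {
                val v = x[start + n] * window[n]          // windowed real sample
                val angle = 2.0 * PI * k * n / frameSize
                re += v * cos(angle)
                im -= v * sin(angle)
            }
            hypot(re, im)
        }
        start += hop
    }
    return frames
}

/** A-weighting gain in dB at frequency [f] in Hz (about 0 dB at 1000 Hz). */
fun aWeightingDb(f: Double): Double {
    val f2 = f * f
    val num = 12194.0 * 12194.0 * f2 * f2
    val den = (f2 + 20.6 * 20.6) *
        sqrt((f2 + 107.7 * 107.7) * (f2 + 737.9 * 737.9)) *
        (f2 + 12194.0 * 12194.0)
    return 20.0 * log10(num / den) + 2.0
}

fun main() {
    val signal = DoubleArray(2048) { cos(2.0 * PI * 1000.0 * it / 16000.0) }
    val frames = stftMagnitudes(signal, frameSize = 256, hop = 128)
    println("frames: ${frames.size}, bins per frame: ${frames[0].size}")
    println("A-weighting at 100 Hz: ${aWeightingDb(100.0)} dB")
}
```

To apply the weighting to a frame, each bin magnitude at frequency f can be multiplied by 10^(A(f)/20), where f is the bin index times the sample rate divided by the frame size.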

Predicting

  1. Read classifier model.
  2. Record voice signal.
  3. Save as WAV file.
  4. Down-sample the signal from 48000 Hz to 16000 Hz: about resampling.
  5. Convert byte data to complex numbers.
  6. Signal windowing and filter by weighting.
  7. Predict.
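
The down-sampling step (48000 Hz to 16000 Hz, an integer factor of 3) could look roughly like the sketch below. It is an assumption, not the app's resampler: the function name is hypothetical, and averaging each group of samples is only a crude box low-pass filter. A production resampler would apply a proper anti-aliasing FIR filter before decimation.

```kotlin
/**
 * Decimates [x] by an integer [factor], averaging each group of [factor]
 * samples as a crude anti-aliasing filter. Trailing samples that do not
 * fill a whole group are dropped.
 */
fun downsample(x: DoubleArray, factor: Int): DoubleArray {
    require(factor >= 1) { "factor must be at least 1" }
    return DoubleArray(x.size / factor) { i ->
        var sum = 0.0
        for (j in 0 until factor) sum += x[i * factor + j]
        sum / factor
    }
}

fun main() {
    val recorded = DoubleArray(48000) { it % 100 / 100.0 }  // one second of dummy data at 48000 Hz
    val resampled = downsample(recorded, factor = 48000 / 16000)
    println("input: ${recorded.size} samples, output: ${resampled.size} samples")
}
```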

Visualizing

  1. Read data.
  2. Use FFT.
  3. Convert the FFT output to spectral amplitude.
  4. Convert to octave/thirds bands: about octave to third conversion.
  5. Filter by A-weighting or C-weighting.
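
The band conversion in the steps above can be illustrated as follows. This sketch assumes base-2 third-octave bands centered on 1000 * 2^(n/3) Hz with edges at a factor of 2^(1/6) on either side, and an energy sum when merging three third-octave levels into one octave level. The function names and parameters are hypothetical and not taken from the processing module.

```kotlin
import kotlin.math.log10
import kotlin.math.pow
import kotlin.math.sqrt

/** Center frequency in Hz of third-octave band [n]; band 0 is centered on 1000 Hz. */
fun thirdOctaveCenter(n: Int): Double = 1000.0 * 2.0.pow(n / 3.0)

/**
 * Groups an amplitude spectrum into third-octave bands.
 * amplitudes[k] belongs to frequency k * sampleRate / fftSize.
 * Returns, per band index, the square root of the summed squared amplitudes.
 */
fun thirdOctaveBands(
    amplitudes: DoubleArray,
    sampleRate: Int,
    fftSize: Int,
    bands: IntRange
): Map<Int, Double> = bands.associateWith { n ->
    val fc = thirdOctaveCenter(n)
    val lower = fc * 2.0.pow(-1.0 / 6.0)
    val upper = fc * 2.0.pow(1.0 / 6.0)
    var energy = 0.0
    for (k in amplitudes.indices) {
        val f = k.toDouble() * sampleRate / fftSize
        if (f >= lower && f < upper) energy += amplitudes[k] * amplitudes[k]
    }
    sqrt(energy)
}

/** Combines band levels given in dB (e.g. three thirds into one octave) by energy summation. */
fun combineDb(levels: List<Double>): Double =
    10.0 * log10(levels.sumOf { 10.0.pow(it / 10.0) })

fun main() {
    // Dummy flat spectrum: 513 bins from a 1024-point FFT at 16000 Hz.
    val spectrum = DoubleArray(513) { 1.0 }
    val thirds = thirdOctaveBands(spectrum, sampleRate = 16000, fftSize = 1024, bands = -6..6)
    thirds.forEach { (n, amp) -> println("band $n (${thirdOctaveCenter(n)} Hz): $amp") }
    println("octave level from three 60 dB thirds: ${combineDb(listOf(60.0, 60.0, 60.0))} dB")
}
```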

Build network

  1. Build VGG architecture model: about VGG.

Additional requirements

  1. CUDA for training the model on a GPU (Nvidia graphics cards).
  2. NNAPI for hardware-accelerated inference on mobile devices: about inference on Android.
