NNSE (Neural Network Speech Enhancement) is a speech-denoiser optimized to run on Ambiq's low power platform

NN Speech Enhancement

NN Speech Enhancement (NNSE) is a speech enhancement (SE) model based on recurrent neural networks (RNNs).

Directory contents

nnse/ # root
    evb/ # for EVB deployment
        build/      # bin files
        includes/   # required includes
        libs/       # required libs
        make/       # make.mk
        pack/
        src/        # C source code
        Makefile
        autogen.mk
    ns-nnsp/  # C code to build the nnsp library (used only when re-building the library)
    python/   # for NN training
    README.md # this readme

Prerequisites

Software

To work on Apollo4, you need

  • Arm GNU Toolchain 11.3
  • Segger J-Link v7.56+

Speech Enhancement

This speech enhancement model operates at a 16 kHz sampling rate. The model size is about 100 kB.
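Since the model assumes 16 kHz input, it can be worth sanity-checking a recording's sample rate before feeding it in. A minimal sketch using only Python's standard wave module (the file name probe.wav is illustrative, not part of the repo):

```python
import wave

def check_sample_rate(path, expected_hz=16000):
    """Return True if the WAV file matches the model's expected sample rate."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate() == expected_hz

# Create a dummy 1-second 16 kHz mono WAV just to demonstrate the check.
with wave.open("probe.wav", "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)                     # 16-bit PCM
    wf.setframerate(16000)
    wf.writeframes(b"\x00\x00" * 16000)    # one second of silence

print(check_sample_rate("probe.wav"))  # True for a 16 kHz file
```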

Dataset

The SE model is trained on several audio datasets, including human speech and noise recordings. Before you use this repo, please read their license agreements carefully here.

Compiling and Running a Pre-Trained Model

From the nnse/evb/ directory:

  1. make clean

  2. make

  3. make deploy. Prepare two USB cables, and ensure your board is connected via both the JLINK USB port and the audio USB port. Then power on the EVB.

  4. Plug a mic into the 3.5 mm port, and press BTN0 to start voice recording.

  5. make view provides SWO output while the device is running.

  6. In your terminal, run

    $ python ../python/tools/audioview_se.py --tty=/dev/tty.usbmodem1234561 --playback=1

    You should see a GUI pop up as below. Click the record button to start recording, and click the stop button to finish. The top panel shows the raw audio captured by the microphone, and the bottom panel shows the enhanced audio.

    • You might need to change the --tty option depending on your OS.
    • The option --playback=1 means the enhanced speech is played to the other computer over the internet. One simple example is to use MS Teams (see here).
      • Note: we suggest using earphones on the host side to avoid echo.
  7. Check the two recording files under nnse/evb/audio_result/.

    • audio_raw.wav: the raw PCM data from your mic.
    • audio_se.wav: the enhanced speech.
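To inspect the two recordings, a small helper that summarizes a PCM WAV file can be sketched with the standard wave module. The commented-out lines show the intended use on the files above; the demo.wav file is only a synthetic stand-in so the sketch is self-contained:

```python
import wave

def wav_summary(path):
    """Return (sample_rate, duration_seconds, num_channels) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        return rate, wf.getnframes() / rate, wf.getnchannels()

# Intended use on the device output (paths from the step above):
# print(wav_summary("audio_result/audio_raw.wav"))
# print(wav_summary("audio_result/audio_se.wav"))

# Self-contained demo on a synthetic half-second 16 kHz mono file.
with wave.open("demo.wav", "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(16000)
    wf.writeframes(b"\x00\x00" * 8000)

print(wav_summary("demo.wav"))  # (16000, 0.5, 1)
```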

Re-Training a New Model

Our approach to training the model can be found in README.md. The trained model is saved in evb/src/def_nn3_se.c and evb/src/def_nn3_se.h.
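Trained weights end up as C tables such as def_nn3_se.c. As a rough illustration of how a Python-side export script might render a weight tensor as a C initializer (the function name, array name, and int16 quantization here are hypothetical, not the repo's actual export code):

```python
import numpy as np

def weights_to_c_array(name, w):
    """Render a weight tensor as a C int16_t initializer string."""
    flat = ", ".join(str(int(v)) for v in np.asarray(w).flatten())
    return f"const int16_t {name}[{w.size}] = {{ {flat} }};"

# Hypothetical 2x2 weight matrix, flattened row-major into a C array.
print(weights_to_c_array("nn3_se_w0", np.array([[1, -2], [3, 4]])))
# const int16_t nn3_se_w0[4] = { 1, -2, 3, 4 };
```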

NS-NNSP Library Overview

The NeuralSPOT NNSP library, ns-nnsp.a, is a C library for building a pipeline, including feature extraction and a neural network, to run on Apollo4. The source code is under the folder ns-nnsp/. You can modify or rebuild it via NeuralSPOT, Ambiq's AI Enablement Library. In brief, there are two basic building blocks inside ns-nnsp.a: feature extraction and the neural network. In ns-nnsp.a, these are called FeatureClass, defined in feature_module.h, and NeuralNetClass, defined in neural_nets.h, respectively. Furthermore, NNSPClass in nn_speech.h encapsulates both to form a concrete instance. We illustrate this in Fig. 1.

Fig. 1: Illustration of `ns-nnsp`
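The encapsulation in Fig. 1 can be sketched as a toy Python analogue. The real FeatureClass, NeuralNetClass, and NNSPClass are C structs in the headers named above; only the names and the wiring are taken from the library, and the internals here are placeholders, not the actual DSP or RNN code:

```python
import numpy as np

class FeatureClass:
    """Toy stand-in for FeatureClass (feature_module.h): frame -> feature vector."""
    def __init__(self, num_mels=40):
        self.num_mels = num_mels

    def extract(self, frame):
        # Placeholder: the real module computes a Mel spectrogram.
        return np.resize(np.abs(np.fft.rfft(frame)), self.num_mels)

class NeuralNetClass:
    """Toy stand-in for NeuralNetClass (neural_nets.h): one identity layer."""
    def __init__(self, dim):
        self.weights = np.eye(dim)

    def forward(self, features):
        return self.weights @ features

class NNSPClass:
    """Mirrors NNSPClass (nn_speech.h): wires feature extraction to the network."""
    def __init__(self, num_mels=40):
        self.feat = FeatureClass(num_mels)
        self.nn = NeuralNetClass(num_mels)

    def process_frame(self, frame):
        return self.nn.forward(self.feat.extract(frame))

out = NNSPClass().process_frame(np.zeros(160))  # one 10 ms frame at 16 kHz
print(out.shape)  # (40,)
```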

Also, in our specific s2i NN case, def_nn0_s2i.c serves two purposes:

  1. For feature extraction, we use a Mel spectrogram with 40 Mel-scale filter banks. Standardizing the features requires the statistical mean and standard deviation of the training dataset, which are defined in def_nn0_s2i.c.
  2. For the neural network, it also points to the trained weight table defined in def_nn0_s2i.c.
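The standardization step itself is just (x - mean) / std applied per feature bin. A minimal sketch with made-up statistics (in the firmware, the actual per-bin mean and standard deviation live in def_nn0_s2i.c):

```python
import numpy as np

# Hypothetical per-bin statistics for 40 Mel bins; the real values are
# computed over the training dataset and stored in def_nn0_s2i.c.
FEAT_MEAN = np.full(40, -4.0)
FEAT_STD = np.full(40, 3.0)

def standardize(mel_features):
    """Apply (x - mean) / std standardization before the network."""
    return (mel_features - FEAT_MEAN) / FEAT_STD

# A frame exactly equal to the stored mean maps to all zeros.
print(standardize(np.full(40, -4.0)))
```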

Build NS-NNSP library from NeuralSPOT (Optional)

If you want to modify or re-build the ns-nnsp.a library, follow the steps below.

  1. Download NeuralSPOT
$ git clone https://github.com/AmbiqAI/neuralSPOT.git ../neuralSPOT
  2. Copy the source code of NS-NNSP to NeuralSPOT. Then go to the NeuralSPOT folder.
$ cp -a ns-nnsp ../neuralSPOT/neuralspot; cd ../neuralSPOT
  3. Open neuralSPOT/Makefile and append ns-nnsp to the library modules as below
# NeuralSPOT Library Modules
modules      := neuralspot/ns-harness 
modules      += neuralspot/ns-peripherals 
modules      += neuralspot/ns-ipc
modules      += neuralspot/ns-audio
modules      += neuralspot/ns-usb
modules      += neuralspot/ns-utils
modules      += neuralspot/ns-rpc
modules      += neuralspot/ns-i2c
modules      += neuralspot/ns-nnsp # <---add this line

# External Component Modules
modules      += extern/AmbiqSuite/$(AS_VERSION)
modules      += extern/tensorflow/$(TF_VERSION)
modules      += extern/SEGGER_RTT/$(SR_VERSION)
modules      += extern/erpc/$(ERPC_VERSION)
  4. Compile
$ make clean; make; make nest
  5. Copy the necessary folders back to the nnsp folder
$ cd nest; cp -a pack includes libs ../nnsp/evb
