Introduction

Phil Schatzmann edited this page Nov 25, 2023 · 127 revisions

Audio Data

Audio data on microcontrollers is usually represented as a stream of signed integers that oscillate around 0. This audio format is commonly called PCM or RAW.

To specify the format of an audio stream we need to know:

  • how many bits are used to represent the integer (e.g. 16 bits = int16_t, 24 bits = int24_t or 32 bits = int32_t)
  • how many channels are available (e.g. 2 channels with left and right (=stereo) data or 1 for mono)
  • the sample rate (e.g. 44100 Hz)

The number of bits defines the supported value range: 16 bit values lie between -32768 and 32767. A typical stream of audio data on microcontrollers uses 16 bits and 2 channels and looks as follows:

[Figure: Stream]

In our library the format is represented with the AudioInfo class.
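The three format parameters above determine both the value range of a sample and the amount of data a stream produces. The following is a minimal sketch in plain C++ (no library calls); the helper names are hypothetical, and it assumes packed samples whose bit width is a multiple of 8 and the usual interleaved channel order (left, right, left, right, ...).

```cpp
#include <cstddef>
#include <cstdint>

// Smallest value of a signed integer sample with the given bit width.
int64_t minSampleValue(int bits) { return -(int64_t(1) << (bits - 1)); }

// Largest value of a signed integer sample with the given bit width.
int64_t maxSampleValue(int bits) { return (int64_t(1) << (bits - 1)) - 1; }

// Number of bytes that one second of PCM audio occupies.
size_t bytesPerSecond(int sampleRate, int channels, int bitsPerSample) {
  return size_t(sampleRate) * channels * (bitsPerSample / 8);
}

// Position of a sample in an interleaved buffer (assumed layout: L0 R0 L1 R1 ...).
size_t interleavedIndex(size_t frame, int channel, int channels) {
  return frame * channels + channel;
}
```

For the typical format of 44100 Hz, 2 channels and 16 bits, bytesPerSecond() gives 176400 bytes per second, which is the data rate the rest of the sketch has to sustain.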

Please note that on regular desktop computers you will often also find floats, which are scaled between -1.0 and 1.0. Since floating point operations are expensive, they are usually avoided on microcontrollers. Processing 8 bit numbers is also not supported by most of the classes, but you can easily read and write floats or signed or unsigned 8 bit numbers by using a Codec.
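To make the relation between the two representations concrete, here is a small sketch of one common way to convert between floats in the range -1.0 to 1.0 and 16 bit integers. The function names are hypothetical and this is not the library's Codec implementation, just an illustration of the scaling involved.

```cpp
#include <cmath>
#include <cstdint>

// Converts a float in [-1.0, 1.0] to a 16 bit sample; out-of-range input is clipped.
int16_t floatToInt16(float value) {
  if (value > 1.0f) value = 1.0f;
  if (value < -1.0f) value = -1.0f;
  return static_cast<int16_t>(std::lround(value * 32767.0f));
}

// Converts a 16 bit sample to a float in [-1.0, 1.0).
float int16ToFloat(int16_t sample) {
  return static_cast<float>(sample) / 32768.0f;
}
```

Other scaling conventions exist (for example multiplying by 32768 and clipping), so treat the constants above as one possible choice.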

Logging

Logging is important to figure out what the program is actually doing. I recommend that you start with the log level Info.

AudioLogger::instance().begin(Serial, AudioLogger::Info);

Recommendation: If you need to see more details you can switch to the log level Debug. Once things work, switch to Warning or Error to avoid any negative impact on the audio quality.

Arduino Audio Streams

Both the input and output of audio are handled via Arduino streams, so you can process them in exactly the same way as files, Serial etc. However, we should avoid the single-byte operations and use the methods that work on a block of memory. The most important operations are:

  • int availableForWrite()
  • size_t write(const uint8_t *buffer, size_t size)
  • int available()
  • size_t readBytes(uint8_t *buffer, size_t size)

It is also important that the formats of the input and output streams match. You can determine the default settings of a stream by calling the defaultConfig() method, and you start a stream by calling begin(), passing the configuration as argument.

Recommendation: Keep the bits per sample at 16 for maximum processing speed!

Generating a Test Tone

Here is a sample sketch that sets up the input from a SineWaveGenerator and writes the output to a CsvOutput.

#include "AudioTools.h"

AudioInfo info(44100, 2, 16);
SineWaveGenerator<int16_t> sineWave(32000);                // subclass of SoundGenerator with max amplitude of 32000
GeneratedSoundStream<int16_t> in(sineWave);                // Stream generated from sine wave
CsvOutput<int16_t> out(Serial); 

// Arduino Setup
void setup(void) {  
  // Open Serial 
  Serial.begin(115200);
  AudioLogger::instance().begin(Serial, AudioLogger::Info);

  // Define CSV Output
  auto config = out.defaultConfig();
  config.copyFrom(info);
  out.begin(config);

  // Setup sine wave
  in.begin(info);
  sineWave.begin(info, N_B4); // frequency of note B4
}
  • Instead of calling sineWave.begin(info, N_B4); you could also call sineWave.setFrequency(N_B4); Please note that N_B4 is the frequency of note B4, which is 493.88f, so instead of N_B4 you could pass the value 493.88 directly.
  • SineWaveGenerator is just one of many implementations. Try replacing it with a noise generator: did you know about the different colors of noise?
  • Try replacing the CsvOutput with another Audio Sink class.
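To see how the amplitude (32000), the frequency (493.88 Hz) and the sample rate (44100 Hz) fit together, the following sketch computes a single sine sample in plain C++. The helper sineSample is hypothetical and only illustrates the underlying formula; it is not how SineWaveGenerator is implemented.

```cpp
#include <cmath>
#include <cstdint>

// Sample number n of a sine wave: amplitude * sin(2 * pi * frequency * n / sampleRate).
int16_t sineSample(float amplitude, float frequency, int sampleRate, uint32_t n) {
  const double kTwoPi = 6.283185307179586;
  double phase = kTwoPi * frequency * n / sampleRate;
  return static_cast<int16_t>(std::lround(amplitude * std::sin(phase)));
}
```

Calling sineSample(32000, 493.88f, 44100, n) for n = 0, 1, 2, ... produces one channel of the tone. For stereo output each value would be written once per channel.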

Copy

So far we have set up an audio source and an audio target and all we need to do is to copy the data in the loop.

uint8_t buffer[1024];
void loop() {
   size_t read = in.readBytes(buffer, 1024);
   out.write(buffer, read);
}

There is one slight complication, however: for most Stream implementations a write is not blocking, so it is not guaranteed that all bytes are processed. You therefore need to implement some logic that re-writes the unprocessed bytes. To simplify things we can use the StreamCopy class, which takes care of this:

StreamCopy copier(out, in);  // copies from in (source) to out (sink)
void loop() {
  copier.copy();
}

The generated sound data will be displayed as CSV on the serial monitor. The Arduino Serial Plotter is the perfect tool to visualize the result.

I also suggest that you study the available methods of the StreamCopy class: you will discover that instead of using copy in individual small steps you can call copyAll, or you can specify the time in milliseconds with copyMs.
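For illustration, the re-writing logic mentioned above could look roughly like the following sketch. PartialSink is a hypothetical stand-in for a non-blocking stream that accepts only part of the data per call, and writeAll is a hypothetical helper; StreamCopy's actual implementation may differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sink that accepts at most `chunk` bytes per write() call.
struct PartialSink {
  size_t chunk;
  std::vector<uint8_t> received;
  size_t write(const uint8_t* data, size_t len) {
    size_t n = std::min(len, chunk);
    received.insert(received.end(), data, data + n);
    return n;
  }
};

// Calls write() repeatedly until all bytes are accepted, or until the sink
// has made no progress for maxRetries consecutive attempts.
template <typename Sink>
size_t writeAll(Sink& sink, const uint8_t* data, size_t len, int maxRetries = 100) {
  size_t written = 0;
  int retries = 0;
  while (written < len && retries < maxRetries) {
    size_t n = sink.write(data + written, len - written);
    if (n == 0) {
      ++retries;  // sink is busy: try again (a real sketch might wait briefly here)
    } else {
      retries = 0;
      written += n;
    }
  }
  return written;  // may be smaller than len if the sink stayed busy
}
```

The retry limit is a design choice that keeps the loop from hanging forever when the sink never accepts data.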

Additional Processing in the Loop

The loop proposed so far is quite lean and only contains a copy call. It is critical that the audio data is provided fast enough! Adding longer delays can cause the Audio Sink to run out of data. This is not critical in the CsvOutput example above, but if you output e.g. to I2S, any buffer underflow will be audible!

Therefore consider the following for your logic in the loop:

  • only call methods that are processed fast
  • you can optimize the processing by increasing the copy buffer and/or the I2S buffers, or by calling copier.copyN(number) instead.
  • never add any (long) delay() calls in the loop!

If your sketch does not allow you to follow this advice, then you can just move the copy() to a separate task!
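To get a feeling for how little time the loop has, it helps to work out how long a given amount of buffered audio lasts. The sketch below is plain arithmetic with a hypothetical helper name; the effective time budget on a real device also depends on the size of the sink's own buffers, which is not modelled here.

```cpp
#include <cstddef>

// Milliseconds of audio contained in `bytes` of PCM data in the given format.
double bufferDurationMs(size_t bytes, int sampleRate, int channels, int bitsPerSample) {
  double bytesPerSecond = double(sampleRate) * channels * (bitsPerSample / 8);
  return 1000.0 * bytes / bytesPerSecond;
}
```

With 44100 Hz, 2 channels and 16 bits, a block of 1024 bytes corresponds to roughly 5.8 ms of audio, so any additional work in the loop that takes noticeably longer than the buffered amount risks an underflow.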

Samples as Array

Some people are confused about why the data used by the streams is defined as uint8_t. This has been done so that we can process any data type! If you know that you have PCM data with bits_per_sample of 16, you can simply cast the data to the proper type if you want to access it as an array:

int16_t samples[512];
void loop() {
   size_t bytes_read  = in.readBytes((uint8_t*) samples, 512 * sizeof(int16_t));
   size_t samples_read = bytes_read/sizeof(int16_t);
}

Or

uint8_t buffer[1024];  // readBytes() expects a uint8_t buffer
int16_t *samples = (int16_t*) buffer;

void loop() {
   size_t bytes_read = in.readBytes(buffer, 1024);
   size_t samples_read = bytes_read/sizeof(int16_t);
}

You can e.g. access the first sample with samples[0], so

for(int i=0; i<samples_read; i++){
   Serial.println(samples[i]);
}

is printing all received samples. Accessing data this way however is usually not necessary because there are better ways to process the data, as we will see in the next chapter.
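If you do need direct access, a typical use is a simple measurement over the block that was just read. The following sketch, with the hypothetical helper peakAmplitude, finds the largest absolute value in a block of 16 bit samples, starting from the byte count that readBytes() returned.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Returns the largest absolute sample value in a block of 16 bit PCM data.
// bytes_read is the number of bytes that were read into the samples array.
int32_t peakAmplitude(const int16_t* samples, size_t bytes_read) {
  size_t samples_read = bytes_read / sizeof(int16_t);
  int32_t peak = 0;
  for (size_t i = 0; i < samples_read; i++) {
    // Widen to 32 bits first: the absolute value of -32768 does not fit into int16_t.
    int32_t magnitude = std::abs(static_cast<int32_t>(samples[i]));
    if (magnitude > peak) peak = magnitude;
  }
  return peak;
}
```

In a sketch you would call it right after reading, e.g. peakAmplitude(samples, bytes_read), using the samples array and bytes_read value from the first variant above.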

Changing the Signal

You can easily change the signal by chaining transformation stream classes:

#include "AudioTools.h"

AudioInfo info(44100, 2, 16);
SineWaveGenerator<int16_t> sineWave(32000);                // subclass of SoundGenerator with max amplitude of 32000
GeneratedSoundStream<int16_t> in(sineWave);                // Stream generated from sine wave
CsvStream<int16_t> csv(Serial); 
VolumeStream vol(csv);
StreamCopy copier(vol, in);                             

// Arduino Setup
void setup(void) {  
  // Open Serial 
  Serial.begin(115200);
  AudioLogger::instance().begin(Serial, AudioLogger::Info);

  // Define CSV Output
  auto config = csv.defaultConfig();
  config.copyFrom(info);
  csv.begin(config);

  // setup volume
  auto config_vol = vol.defaultConfig();
  config_vol.copyFrom(info);
  config_vol.volume = 0.5;  // half the volume
  vol.begin(config_vol);

  // Setup sine wave
  in.begin(info);             // start the generated stream, as in the first example
  sineWave.begin(info, N_B4);
}

void loop() {
  copier.copy();
}

In the example above we copy the audio data to the volume control, which forwards the output to the CsvStream after adjusting the volume (=amplitude) of the signal.
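Conceptually, adjusting the volume means multiplying every sample by a factor. The sketch below shows this for 16 bit samples with a hypothetical helper applyVolume; it illustrates the idea only and is not taken from the VolumeStream implementation, which may work differently.

```cpp
#include <cstddef>
#include <cstdint>

// Scales each 16 bit sample in place by `factor` and clips the result
// to the int16_t range so that amplification cannot overflow.
void applyVolume(int16_t* samples, size_t count, float factor) {
  for (size_t i = 0; i < count; i++) {
    float scaled = samples[i] * factor;
    if (scaled > 32767.0f) scaled = 32767.0f;
    if (scaled < -32768.0f) scaled = -32768.0f;
    samples[i] = static_cast<int16_t>(scaled);
  }
}
```

With a factor of 0.5, a sine wave with a maximum amplitude of 32000 ends up with a maximum amplitude of 16000.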

Building your own Sketch

You can build your own sketch by replacing the input (source) and/or the output (sink) with an alternative implementation. However, before you start with any complicated combination, I suggest that you first test:

  • a new input together with the CsvStream
  • a new output together with the GeneratedSoundStream

As “Audio Sources” we will have e.g.:

As “Audio Sinks” we will have e.g:

Happy Coding...

External Resources