Repository for EEC 201 Winter 2019 Final Project B. The goal is to design and implement an LPC vocoder and synthesizer in Matlab with a GUI.
- Abhinav Kamath - agkamath(at)ucdavis.edu
- Mason del Rosario - mdelrosa(at)ucdavis.edu
Within a short windowed frame, a speech signal can be modeled as the output of a linear time-invariant (LTI) system excited by a periodic impulse train.
Thus, the main goal in analyzing speech is to determine the transfer function of the LTI system for each windowed frame. The discrete-time transfer function can be modeled as an all-pole IIR filter with the general form shown below.
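One common way to write the all-pole form, assuming a gain $G$, a predictor order $p$, and the sign convention in which the predictor coefficients $a_k$ are subtracted in the denominator (these symbols are assumptions, not taken from the project code):

$$
H(z) = \frac{G}{1 - \sum_{k=1}^{p} a_k z^{-k}}
$$

Under this convention each sample is predicted as $\hat{s}[n] = \sum_{k=1}^{p} a_k s[n-k]$. Matlab's `lpc` returns the denominator polynomial directly, so its coefficients carry the opposite sign.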
The autocorrelation method determines the denominator coefficients of the all-pole transfer function by solving the following equation:
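In the standard textbook formulation (assumed here, with $R(k)$ the autocorrelation of the windowed frame $s[n]$ at lag $k$ and $p$ the predictor order), these are the normal equations:

$$
\sum_{k=1}^{p} a_k \, R(|i-k|) = R(i), \qquad i = 1, \dots, p, \qquad R(k) = \sum_{n} s[n]\, s[n+k]
$$

Because the matrix of $R(|i-k|)$ values is Toeplitz, the system can be solved efficiently with the Levinson-Durbin recursion.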
The relevant code which performs this analysis can be found in `lpc_analysis.m`.
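As a rough illustration of the autocorrelation method, here is a minimal Python sketch of the Levinson-Durbin recursion. It is not the code in `lpc_analysis.m`: the function names `autocorr` and `levinson_durbin` are hypothetical, and the sign convention assumes the predictor form $\hat{s}[n] = \sum_k a_k s[n-k]$.

```python
def autocorr(frame, max_lag):
    """Autocorrelation R[0..max_lag] of one windowed frame (plain sum, no normalization)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve sum_k a_k R(|i-k|) = R(i) for i = 1..order.

    r     -- autocorrelation values R[0..order]
    order -- predictor order p
    Returns (a, err): a[k-1] holds a_k, err is the residual prediction energy.
    """
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        if err <= 0.0:
            # Silent or perfectly predictable frame: stop early.
            break
        # Reflection coefficient for model order i + 1.
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err
        # Update the lower-order coefficients, then append the new one.
        prev = a[:i]
        for j in range(i):
            a[j] = prev[j] - k * prev[i - 1 - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a, err
```

For a frame `s`, calling `levinson_durbin(autocorr(s, p), p)` would give the `p` predictor coefficients and the residual energy, from which a gain can be derived.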
Given the all-pole transfer function above, the main goal in synthesizing an approximation of the original speech sample is to determine the instantaneous pitch of the excitation pulse train. In this work, we take two approaches to estimating pitch: the cepstrum-based method and the autocorrelation method.
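The synthesis model itself (a periodic impulse train driving the all-pole filter) can be sketched in a few lines of Python. This is an illustration under assumptions, not the project's synthesizer: `impulse_train` and `all_pole_filter` are hypothetical names, the pitch period is held constant, and the coefficients follow the predictor convention $y[n] = G\,x[n] + \sum_k a_k y[n-k]$.

```python
def impulse_train(n_samples, period):
    """Unit impulses every `period` samples, starting at sample 0."""
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]


def all_pole_filter(excitation, a, gain=1.0):
    """Run the excitation through y[n] = gain * x[n] + sum_k a[k-1] * y[n-k]."""
    y = []
    for n, x in enumerate(excitation):
        acc = gain * x
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc += ak * y[n - k]
        y.append(acc)
    return y
```

With a pitch estimate `f0` and sampling rate `fs`, the period passed to `impulse_train` would be roughly `round(fs / f0)` samples.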
The cepstrum of a sequence exposes its fundamental period: a periodic excitation shows up as a peak at the quefrency equal to the pitch period. The cepstrum is defined below:
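Assuming the real cepstrum is meant (the complex cepstrum also retains phase), a standard definition for a frame $x[n]$ with discrete Fourier transform $\mathcal{F}$ is:

$$
c[n] = \mathcal{F}^{-1}\left\{ \log \left| \mathcal{F}\{x[n]\} \right| \right\}
$$

The index $n$ of $c[n]$ is the quefrency, measured in samples.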
The relevant code which performs the cepstrum-based pitch detection can be found in `pitch_detect_candidates.m`.
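A minimal Python sketch of cepstrum-based pitch estimation follows. It is not a port of `pitch_detect_candidates.m`. It uses a direct O(N²) DFT to stay dependency-free, and the names `dft`, `real_cepstrum`, `cepstral_pitch`, the 60 to 400 Hz search range, and the `1e-9` log floor are all assumptions made for illustration.

```python
import cmath
import math


def dft(x, inverse=False):
    """Direct O(N^2) DFT; adequate for short frames in a sketch."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(frame, eps=1e-9):
    """Inverse DFT of the log-magnitude spectrum (eps avoids log(0))."""
    log_mag = [math.log(abs(v) + eps) for v in dft(frame)]
    return [v.real for v in dft(log_mag, inverse=True)]


def cepstral_pitch(frame, fs, f_min=60.0, f_max=400.0):
    """Return a pitch estimate in Hz, or None if the frame is too short."""
    ceps = real_cepstrum(frame)
    q_lo = int(fs / f_max)                        # shortest period searched
    q_hi = min(int(fs / f_min), len(frame) // 2)  # longest period searched
    if q_lo < 1 or q_lo > q_hi:
        return None
    # The strongest cepstral peak in the search range marks the pitch period.
    peak = max(range(q_lo, q_hi + 1), key=lambda q: ceps[q])
    return fs / peak
```

Restricting the search to a plausible pitch range keeps the low-quefrency part of the cepstrum, which describes the vocal-tract envelope, from being mistaken for the pitch peak.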
The autocorrelation function (ACF) is computed frame by frame. For each frame, the lag in samples between the central (zero-lag) peak and the next-highest peak is taken as the pitch-period estimate. To locate that second peak, the central peak and all negative autocorrelation values are suppressed first. The fundamental-frequency (pitch) estimate is then the sampling frequency of the audio signal divided by the pitch-period estimate.
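The steps above can be sketched in Python as follows. This is an illustration, not the project's Matlab code: `acf_pitch` is a hypothetical name, only non-negative lags are computed because the ACF is symmetric, and "suppressing the central peak" is assumed to mean zeroing every lag up to the first non-positive ACF value.

```python
def acf_pitch(frame, fs):
    """Estimate pitch in Hz from one frame's autocorrelation, or None if no peak is found."""
    n = len(frame)
    # One-sided autocorrelation; lag 0 is the central peak.
    acf = [sum(frame[i] * frame[i + lag] for i in range(n - lag))
           for lag in range(n)]
    # Suppress negative autocorrelation values.
    clipped = [max(v, 0.0) for v in acf]
    # Suppress the central peak: zero lags until the ACF first drops to zero or below.
    lag = 0
    while lag < n and clipped[lag] > 0.0:
        clipped[lag] = 0.0
        lag += 1
    if lag >= n or max(clipped) == 0.0:
        return None  # silent frame, or no secondary peak
    # The highest remaining peak gives the pitch period in samples.
    period = max(range(n), key=lambda k: clipped[k])
    return fs / period
```

For example, a 200 Hz sinusoid sampled at 8000 Hz has a 40-sample period, so the secondary ACF peak falls at lag 40 and the estimate is 8000 / 40 = 200 Hz.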