Given an audio file, the beat tracker uses a dynamic programming algorithm to compute an estimated sequence of beat times[1].
The system is made up of three distinct phases:
- First, perceptually weighted spectral flux analysis extracts an onset strength envelope (OSE) from an audio file.
- This OSE is then used to estimate a tempo period for the audio via a perceptually weighted auto-correlation.
- Finally, a dynamic programming algorithm is used to identify the most likely sequence of beats based on the OSE and estimated tempo period.
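
The three phases above can be sketched with NumPy alone. This is a simplified illustration, not the project's actual code: the function names (`onset_strength`, `estimate_period`, `track_beats`), the parameter defaults (`width_octaves`, `tightness`), and the use of a plain magnitude spectrogram instead of a perceptually weighted (e.g. mel-scaled) one are all assumptions made for brevity. The transition penalty `-(log(interval / period))**2` follows the form given in [1].

```python
import numpy as np

def onset_strength(spec):
    """Phase 1 (simplified): half-wave rectified spectral flux.

    `spec` is a (bins x frames) magnitude spectrogram. The perceptual
    weighting is omitted here; this is an assumption of the sketch.
    """
    flux = np.diff(np.log1p(spec), axis=1)      # frame-to-frame change per bin
    return np.maximum(flux, 0.0).sum(axis=0)    # keep increases only, sum over bins

def estimate_period(ose, min_lag, max_lag, centre_lag, width_octaves=1.4):
    """Phase 2: pick the lag (in frames) maximising a weighted autocorrelation.

    The log-Gaussian window biases the choice towards `centre_lag`;
    `width_octaves` is an illustrative default.
    """
    x = ose - ose.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]   # lags 0 .. N-1
    lags = np.arange(min_lag, max_lag + 1)
    weight = np.exp(-0.5 * (np.log2(lags / centre_lag) / width_octaves) ** 2)
    return int(lags[np.argmax(weight * ac[lags])])

def track_beats(ose, period, tightness=100.0):
    """Phase 3: dynamic programming over frames.

    score[t] is the best total score of any beat sequence ending at frame t;
    backlink[t] is the previous beat on that sequence (-1 if t starts it).
    """
    n = len(ose)
    score = np.asarray(ose, dtype=float).copy()
    backlink = np.full(n, -1)
    for t in range(n):
        # Candidate previous beats lie roughly half to two periods back.
        prev = np.arange(max(t - 2 * period, 0), max(t - period // 2, 0))
        if prev.size == 0:
            continue
        penalty = -tightness * np.log((t - prev) / period) ** 2
        cand = score[prev] + penalty
        best = int(np.argmax(cand))
        if cand[best] > 0:                      # otherwise start a new sequence at t
            score[t] += cand[best]
            backlink[t] = prev[best]
    # Trace back from the best-scoring frame to recover the beat sequence.
    beats = [int(np.argmax(score))]
    while backlink[beats[-1]] >= 0:
        beats.append(int(backlink[beats[-1]]))
    return np.array(beats[::-1])
```

Each frame looks back over a window proportional to the period, so the search costs on the order of `len(ose) * period` operations rather than growing exponentially with the number of candidate beat sequences. Note that a suitable `tightness` depends on the scale of the OSE, so normalising the envelope first is advisable. Beat frames convert to times by multiplying by the hop duration used when computing the spectrogram.
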
Because the beat search is solved with dynamic programming, the algorithm is efficient: it can process the full Ballroom dance dataset and evaluate performance on it in roughly 2-3 minutes.
[1] D. P. W. Ellis, ‘Beat Tracking by Dynamic Programming’, Journal of New Music Research, vol. 36, no. 1, pp. 51–60, Mar. 2007.