Speech and Audio Processing: Theory & Applications

2015 Globex Julmester

College of Engineering, Peking University

July 6-25, 2015

Tan Lee

DSP and Speech Technology Laboratory

Department of Electronic Engineering

The Chinese University of Hong Kong

tanlee@ee.cuhk.edu.hk

Speech is inarguably the most natural and preferred means of human communication. It is transmitted from a speaker to a listener as an acoustic signal. This signal carries abundant information, including the linguistic content, the speaker’s voice characteristics, the speaker’s health and emotional condition, and the ambient environment. Speech signals have many distinctive features that are not found in other signals from the natural world.

In the first part of this course, students will study the fundamental theory of digital speech signal processing. Important time-domain and frequency-domain properties of speech signals will be investigated. Other types of audio signals, namely music and noise, will also be covered. The second part of the course will focus on a few selected applications of speech and audio processing, including automatic speech recognition, music classification, and hearing and speaking aids. The basic principles of system design will be introduced and the major technological challenges will be discussed.

Students who take this course are expected to have fundamental knowledge of signals and systems and experience in using MATLAB.

List of topics

Digital signal processing: discrete Fourier transform, short-time Fourier transform, digital filters;
Speech communication: human speech production, human auditory perception, types of speech sounds;
Speech analysis: short-time stationarity, time-domain features, frequency-domain features, pitch and tone;
Music analysis: pitch and harmonics, notes, tempo, rhythm, melody and timbre;
Selected applications: hearing aids, automatic speech recognition, music transcription, audio search.

Prerequisites

Time-varying signals: waveform, amplitude, phase, periodicity, time-shift transformation
Linear systems: linearity, impulse response, convolution, system block diagram
Fourier transform: continuous-time Fourier series, continuous-time Fourier transform, frequency spectrum, fundamental frequency and harmonics
Sampling & digitization: band-limited signals, sampling frequency, sampling theorem, quantization, linear PCM
Use of MATLAB: representation of signals, m-functions, graph plotting