Electronic Engineering Department, The Chinese University of Hong Kong - ELEG5421 - Audio Signal Processing

Objective

This course is an in-depth exploration of audio and music processing using neural networks. It starts with an introduction to audio and music problems, then covers audio features and human labels, filtering and digital signal processing for audio, audio and music tagging with convolutional neural networks, audio and music transcription with recurrent neural networks, and audio compression. It then bridges audio and language with sequence-to-sequence models and turns to generation: symbolic music generation; audio and music generation with pipelines, vocoders, and autoregressive models; generation with VAEs and diffusion models; and controllable generation from text and multiple modalities. The course closes with open problems and future directions in the field.

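This course explores audio and music processing. It begins with an introduction to audio and music problems and covers audio features and human labels, digital signal processing, audio and music tagging with convolutional neural networks, audio and music transcription with recurrent neural networks, audio compression, and linking audio and language with sequence-to-sequence models. It also explores audio and music generation, including symbolic music generation, vocoders, autoregressive models, generation with VAEs and diffusion models, and controllable audio and music generation from text and multiple modalities.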

Syllabus

  1. Introduction: Audio and music problems.
  2. Audio features and human labels.
  3. Filtering and digital signal processing for audio processing (Assignment 1).
  4. Audio and music tagging with convolutional neural networks.
  5. Audio and music transcription with recurrent neural networks.
  6. Audio compression.
  7. Bridging audio and language: audio and music captioning with sequence-to-sequence models (Assignment 2).
  8. Symbolic music generation.
  9. Audio and music generation: pipelines, vocoders, and autoregressive models.
  10. Audio and music generation with probabilistic models, such as diffusion models.
  11. Controllable audio and music generation from texts and multiple modalities (Assignment 3).
  12. Open problems and future directions.

Learning Outcome

Through a combination of lectures, assignments, and projects, students will gain hands-on experience working with state-of-the-art tools and techniques for audio and music processing. By the end of the course, students will have a solid foundation in the latest techniques for audio and music processing using neural networks, and will be able to apply these techniques to real-world problems in the field.
