Electronic Engineering Department, The Chinese University of Hong Kong - ELEG5760 - Machine Learning for Multimedia Applications


This course aims to give students a general understanding of the computational techniques that enable machines to understand different types of multimedia data, including text, speech, images and video. The course covers methods used to analyze, classify and detect the underlying information, properties and modalities inherent in complex data. Students will learn the theories, models, algorithms and operation of machine learning tools that have been successfully developed and deployed for speech/audio, image/video and other multimedia applications. Both the basics of machine learning techniques and recent progress in the field will be introduced.


  1. Introduction to the course
  2. Mathematical & programming basics
  3. Supervised Learning
    • Logistic Regression
    • Linear Classifier
    • EM Algorithm
    • Support Vector Machine
  4. Unsupervised Learning
    • Principal Component Analysis
    • ZCA Whitening
    • Clustering
  5. Basics of Neural Networks & Multi-layer Perceptron
  6. Convolutional Neural Network
  7. Recurrent Neural Network
  8. Generative Adversarial Networks
  9. Machine learning for different data modalities
    • Feature representations for different modalities
  10. Practice on neural networks

Learning Outcomes
Upon completion of this course, students will be able to:

  • Describe the properties of different types of multimedia data
  • Explain the fundamental concepts, theories and algorithms of machine learning techniques
  • Describe the advantages, limitations and trends of machine learning techniques
  • Apply machine learning algorithms to solve given problems involving various types of multimedia data
  • Use machine learning tools to implement a multimedia data processing system
