Sep 2015 - now
CUHK Ph.D. student [CV]
Deep Learning, Computer Vision, Machine Learning. My supervisor is Prof. Xiaogang Wang.
Summer 2014
Adobe Research Internship
Salient Object Detection for Images using Deep Learning.
Summer 2012
Mitacs Research Internship at University of Victoria
Devised an algorithm to reduce server bandwidth consumption in a P2P VoD system.
Bachelor's Degree at Dalian University of Technology
Majored in Electronic and Information Engineering.
How to pronounce my name Hongyang? It's Home + Young :)

I am currently a 3rd-year (2017.8 - 2018.7) PhD student at The Chinese University of Hong Kong under the supervision of Prof. Xiaogang Wang. My research covers a wide span of Deep Learning and its applications in Computer Vision. In particular, I am interested in fully end-to-end learning with different types of models (CNN, RNN, etc.) for object detection. Recently, I have become more interested in using novel Deep RL algorithms to solve various vision tasks.


Neat version. For a full list, check the Google Scholar page.

Rethinking Feature Discrimination and Polymerization for Large-scale Recognition
Features matter. How to train a deep network to acquire discriminative features across categories and polymerized features within classes has always been at the core of many computer vision tasks. In this paper, we address this problem based on the simple intuition that features in high-dimensional space should be close in cosine distance within one class and far apart across categories. To this end, we propose the congenerous cosine (COCO) algorithm to simultaneously optimize the cosine similarity among data. It inherits the softmax property of making inter-class features discriminative and shares the idea of class centroids from metric learning.
Yu Liu*, Hongyang Li*, Xiaogang Wang (* equal contribution)
NIPS 2017 workshop
Recurrent Scale Approximation for Object Detection in CNN
Since CNNs lack an inherent mechanism to handle large scale variations, we always need to compute feature maps multiple times for multiscale object detection. To address this, we devise a recurrent scale approximation that computes the feature map only once and approximates the maps at the other levels from this map alone. To further increase efficiency and accuracy, we (a) design a scale-forecast network to globally predict potential scales in the image, since there is no need to compute maps on all levels, and (b) propose a landmark retracing network to retrace the locations of the regressed landmarks and generate a confidence score for each landmark.
Yu Liu, Hongyang Li, Junjie Yan, F. Wei, Xiaogang Wang, X. Tang
ICCV 2017
Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection
We propose a zoom-out-and-in network for generating object proposals. A key observation is that it is difficult to classify anchors of different sizes with the same set of features. A map attention decision (MAD) unit is further proposed to aggressively search for neuron activations among two streams and attend to the ones that contribute most to feature learning for the final loss. The unit serves as a decision maker to adaptively activate maps along certain channels with the sole purpose of optimizing the overall training loss. One advantage of MAD is that the learned weights enforced on each feature channel are predicted on-the-fly based on the input context, which is more suitable than the fixed enforcement of a convolutional kernel.
Hongyang Li, Yu Liu, Wanli Ouyang, Xiaogang Wang
arXiv preprint 2017
Multi-Bias Non-linear Activation in Deep Neural Networks
In this paper, we propose a multi-bias non-linear activation (MBA) layer to explore the information hidden in the magnitudes of responses. It is placed after the convolution layer to decouple the responses to a convolution kernel into multiple maps by multi-thresholding magnitudes, thus generating more patterns in the feature space at a low computational cost. It provides great flexibility in selecting responses to different visual patterns in different magnitude ranges to form rich representations in higher layers.
Hongyang Li, Wanli Ouyang, Xiaogang Wang
ICML 2016
Learning Deep Representation with Large-scale Attributes
This paper contributes a large-scale object attribute database that contains rich attribute annotations (rotation, viewpoint, occlusion, etc.) for around 180k samples and 494 object classes. We use this dataset to train deep representations and extensively evaluate how useful these attributes are for the general object detection task.
Wanli Ouyang, Hongyang Li, Xingyu Zeng, Xiaogang Wang
ICCV 2015
Dual Deep Network for Visual Tracking
In this paper, we propose a dual network to better utilize features among layers for visual tracking. It is observed that features in higher layers encode semantic context while their counterparts in lower layers are sensitive to discriminative appearance. Thus we exploit the hierarchical features in different layers of a deep model and design a dual structure to obtain better feature representation from various streams, which is rarely investigated in previous work. To leverage the robustness of our dual network, we train it with random patches measuring the similarities between the network activation and target appearance. Quantitative and qualitative evaluations on two large-scale benchmark data sets show that the proposed algorithm performs favourably against state-of-the-art methods.
Zhizhen Chi, Hongyang Li, Huchuan Lu, Ming-Hsuan Yang
IEEE Trans. on Image Processing (TIP) 2017
Inner and Inter Label Propagation: Salient Object Detection in the Wild
We propose a novel label-propagation-based method for saliency detection. A key observation is that saliency in an image can be estimated by propagating the labels extracted from the most certain background and object regions. A co-transduction algorithm is devised to fuse both boundary and objectness labels based on an inter-propagation scheme to effectively improve the saliency accuracy.
Hongyang Li, Huchuan Lu, Zhe Lin, Xiaohui Shen, Brian Price
IEEE Trans. on Image Processing (TIP) 2015
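
A minimal sketch of the multi-bias idea from the MBA entry above, assuming the "multi-thresholding" is a ReLU applied after adding a separate bias for each output map. The function name `multi_bias_activation` and the bias values are illustrative assumptions; in the paper the biases are learned, and this is not the released implementation.

```python
def multi_bias_activation(response_map, biases):
    """Decouple one response map into len(biases) maps.

    Each output map is ReLU(x + b) for one bias b, so every bias keeps
    responses from a different magnitude range.
    """
    return [
        [[max(0.0, x + b) for x in row] for row in response_map]
        for b in biases
    ]


# Toy 2x2 response of a single convolution kernel (hypothetical values).
response = [[-1.5, 0.2], [0.8, 2.0]]

# Hand-picked biases for illustration only; a real layer would learn them.
maps = multi_bias_activation(response, biases=[0.0, -1.0, 1.0])

for bias_map in maps:
    print(bias_map)
```

With a bias of -1.0 only the strongest response survives, while the other two biases pass progressively weaker responses, which is how one kernel can yield several patterns for the next layer.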


As a Graduate Student
Foundations of Optimization. Big Data Analytics.
Pattern Recognition. Computer Vision. Advanced Machine Learning.
As a Teaching Assistant
Digital Circuits and Systems. Introduction to Probability. Introduction to Deep Learning.


My Academic Blog. A good (probably the best) way of sharing ideas, better than bustling around as most people do at conferences.
Friends and stars in academia. I 'mark and fork' friends and geniuses to learn the best from them. Update: CV family tree.
Thanks to my Ph.D. work, I travel a lot and really enjoy every moment of it. There is a gallery if you want to take a peek.
Conferences and Journals in computer vision. Readable for both professionals and laymen.
Still more unsorted stuff
- Meta talk: How to give a talk so good that there will be pizza left for you
- Computer vision industries summarized by David Lowe. A rough list of companies related to computer vision and robotics.