Wanli Ouyang, Ph.D., IEEE Senior Member.
Research Assistant Professor, working with Prof. Xiaogang Wang
I am with the MMLab and the IVP Lab.
IVP Lab, Dept. Electronic Engineering,
The Chinese University of Hong Kong,
Hong Kong, China
Phone: +852 2609 8251
Fax: +852 2603 5558
ee.cuhk.edu.hk

Biography

Wanli Ouyang obtained his Ph.D. from the Department of Electronic Engineering, The Chinese University of Hong Kong. He is now a Research Assistant Professor at CUHK. His research interests include deep learning and its applications to computer vision and pattern recognition, and image and video processing.

Download Wanli Ouyang's Full CV (PDF)      View Wanli Ouyang's LinkedIn Profile      View Wanli Ouyang's Google Scholar Citations

Back To Top

Information for potential Postdoctoral Fellows, Master's and Ph.D. students

I am moving to the School of Electrical and Information Engineering, The University of Sydney, as a Senior Lecturer in June. If you are interested in my research topics and this university, please feel free to contact me.

News

Our team ranked #1 for object detection (with provided data and with external data) and #1 for video object detection/tracking in the ImageNet Large Scale Visual Recognition Challenge 2016. Project page with source code

Good resources on Paper Writing

How to do good research on computer vision (Chinese)
Making Data Meaningful
Slides on "How to get your paper rejected" by Prof. Ming-Hsuan Yang (UC Merced)
Chinese blog on how to publish in a top journal

Journal Papers

Wanli Ouyang, Hui Zhou, Hongsheng Li, Quanquan Li, Junjie Yan, Xiaogang Wang, "Jointly learning deep features, deformable parts, occlusion and classification for pedestrian detection," IEEE Trans. Pattern Anal. Mach. Intell. (PAMI), accepted, 2017. [Abstract] [BibTeX] [Full Text]

Abstract: Feature extraction, deformation handling, occlusion handling, and classification are four important components in pedestrian detection. Existing methods learn or design these components either individually or sequentially. The interaction among these components is not yet well explored. This paper proposes that they should be jointly learned in order to maximize their strengths through cooperation. We formulate these four components into a joint deep learning framework and propose a new deep network architecture (code available at www.ee.cuhk.edu.hk/~wlouyang/projects/ouyangWiccv13Joint/index.html). By establishing automatic, mutual interaction among components, the deep model achieves an average miss rate of 8.57%/11.71% on the Caltech benchmark dataset with new/original annotations.

BibTeX entry:
@article{ouyang2016Jointly, title={Jointly learning deep features, deformable parts, occlusion and classification for pedestrian detection}, author={Wanli Ouyang and Hui Zhou and Hongsheng Li and Quanquan Li and Junjie Yan and Xiaogang Wang}, journal={IEEE Trans. Pattern Anal. Mach. Intell.}, pages={1--14}, year={2017}, publisher={IEEE} }
Xingyu Zeng (equal contribution), Wanli Ouyang (equal contribution), Junjie Yan, Hongsheng Li, Tong Xiao, Kun Wang, Yu Liu, Yucong Zhou, Bin Yang, Zhe Wang, Hui Zhou, Xiaogang Wang, "Crafting GBD-Net for Object Detection," IEEE Trans. Pattern Anal. Mach. Intell. (PAMI), accepted, 2017. [Abstract] [BibTeX] [Full Text] [Code] [Project page & code]

Abstract: The visual cues from multiple support regions of different sizes and resolutions are complementary in classifying a candidate box in object detection. Effective integration of local and contextual visual cues from these regions has become a fundamental problem in object detection. In this paper, we propose a gated bi-directional CNN (GBD-Net) to pass messages among features from different support regions during both feature learning and feature extraction. Such message passing can be implemented through convolution between neighboring support regions in two directions and can be conducted in various layers. Therefore, local and contextual visual patterns can validate the existence of each other by learning their nonlinear relationships and their close interactions are modeled in a more complex way. It is also shown that message passing is not always helpful but dependent on individual samples. Gated functions are therefore needed to control message transmission, whose on-or-off states are controlled by extra visual evidence from the input sample. The effectiveness of GBD-Net is shown through experiments on three object detection datasets, ImageNet, Pascal VOC2007 and Microsoft COCO. Besides the GBD-Net, this paper also shows the details of our approach in winning the ImageNet object detection challenge of 2016, with source code provided on https://github.com/craftGBD/craftGBD. In this winning system, the modified GBD-Net, new pretraining scheme and better region proposal designs are provided. We also show the effectiveness of different network structures and existing techniques for object detection, such as multi-scale testing, left-right flip, bounding box voting, NMS, and context.
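Illustrative sketch (Python/NumPy) of one gated message-passing step between the features of two support regions, as described in the abstract. The 1x1 channel-mixing weights stand in for the paper's convolutions, and the weight names, gate form and ReLU update are illustrative assumptions rather than the published implementation.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_message_step(h_src, h_dst, w_msg, w_gate, b_gate):
    # h_src, h_dst: (H, W, C) feature maps from two support regions, assumed spatially aligned
    # w_msg, w_gate: (C, C) channel-mixing matrices (1x1 convolutions); b_gate: (C,) gate bias
    msg = np.einsum('hwc,cd->hwd', h_src, w_msg)                       # message computed from the source features
    gate = sigmoid(np.einsum('hwc,cd->hwd', h_src, w_gate) + b_gate)   # gate driven by visual evidence in the source
    return np.maximum(h_dst + gate * msg, 0.0)                         # gated update of the destination features (ReLU)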

BibTeX entry:
@article{zeng2017crafting, title={Crafting GBD-Net for Object Detection}, author={Zeng, Xingyu and Ouyang, Wanli and Yan, Junjie and Li, Hongsheng and Xiao, Tong and Wang, Kun and Liu, Yu and Zhou, Yucong and Yang, Bin and Wang, Zhe and Zhou, Hui and Wang, Xiaogang}, journal={IEEE Trans. Pattern Anal. Mach. Intell.}, pages={1--14}, year={2017}, publisher={IEEE} }
Wanli Ouyang, Xingyu Zeng, Xiaogang Wang, et al., "DeepID-Net: Object Detection with Deformable Part Based Convolutional Neural Networks," IEEE Trans. Pattern Anal. Mach. Intell. (PAMI), accepted, 2016. [Abstract] [BibTeX] [Full Text] [Project]

Abstract: In this paper, we propose deformable deep convolutional neural networks for generic object detection. This new deep learning object detection framework has innovations in multiple aspects. In the proposed new deep architecture, a new deformation constrained pooling (def-pooling) layer models the deformation of object parts with geometric constraint and penalty. A new pre-training strategy is proposed to learn feature representations more suitable for the object detection task and with good generalization capability. By changing the net structures, training strategies, adding and removing some key components in the detection pipeline, a set of models with large diversity are obtained, which significantly improves the effectiveness of model averaging. The proposed approach improves the mean averaged precision obtained by RCNN [16], which was the state-of-the-art, from 31% to 50.3% on the ILSVRC2014 detection test set. It also outperforms the winner of ILSVRC2014, GoogLeNet, by 6.1%. Detailed component-wise analysis is also provided through extensive experimental evaluation, which provides a global view for people to understand the deep learning object detection pipeline.
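A simplified sketch (Python/NumPy) of the deformation-constrained pooling idea mentioned above: each output cell takes the maximum, over nearby displacements, of the part score minus a geometric deformation penalty. The quadratic penalty form and the parameter values are illustrative assumptions, not the paper's learned deformation parameters.

import numpy as np

def def_pooling(part_scores, stride=2, radius=2, penalty_coef=0.1):
    # part_scores: (H, W) response map of one part detector
    H, W = part_scores.shape
    out = np.full((H // stride, W // stride), -np.inf)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            cy, cx = i * stride, j * stride                  # anchor position of this output cell
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    y, x = cy + dy, cx + dx
                    if 0 <= y < H and 0 <= x < W:
                        penalty = penalty_coef * (dy * dy + dx * dx)   # penalty grows with displacement from the anchor
                        out[i, j] = max(out[i, j], part_scores[y, x] - penalty)
    return out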

BibTeX entry:
@article{ouyang2016DeepID-Net, title={DeepID-Net: Object Detection with Deformable Part Based Convolutional Neural Networks}, author={Wanli Ouyang and Xingyu Zeng and Xiaogang Wang and Shi Qiu and Ping Luo and Yonglong Tian and Hongsheng Li and Shuo Yang and Zhe Wang and Hongyang Li and Kun Wang and Junjie Yan and Chen-Change Loy and Xiaoou Tang}, journal={IEEE Trans. Pattern Anal. Mach. Intell.}, pages={1--14}, year={2016}, publisher={IEEE} }
Wanli Ouyang, Tianle Zhao, Wai-Kuen Cham, Liying Wei, "Fast Full-Search Equivalent Pattern Matching Using Asymmetric Haar Wavelet Packets," IEEE Trans. Circuits and Systems for Video Technology (TCSVT), accepted, 2016. [Abstract] [BibTeX] [Full Text] [Project & source code] [Slides]

Abstract: Pattern matching is widely used in signal processing, computer vision, image and video processing. One efficient approach is to perform pattern matching in a transform domain that has good energy packing ability and so allows early rejection of most mismatched candidates. Calculating the transforms of pixels in sliding windows requires much computation, and so fast algorithms are employed. Existing methods require O(u) additions per pixel for projecting input pixels onto u 2D basis vectors. In this paper, we propose a new 2D transform, called asymmetric 2D Haar transform (A2DHT), and extend it to wavelet packets that contain an exponentially large number of bases. A basis selection algorithm is then proposed to search for the optimal basis in the wavelet packets. A fast algorithm is also developed which can compute u projection coefficients with only O(log u) additions per pixel. Results of experiments show that the proposed fast algorithm and the proposed transform can significantly accelerate the full-search equivalent pattern matching process and outperform state-of-the-art methods.
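A minimal sketch (Python) of the early-rejection principle behind such transform-domain pattern matching, assuming an orthonormal basis ordered by energy so that, by Parseval's relation, the partial squared distance over the first few projection coefficients lower-bounds the full sum of squared differences; the function name and threshold are illustrative, not the paper's A2DHT machinery.

def transform_domain_reject(pattern_coeffs, window_coeffs, threshold):
    # coeffs: projections of the pattern / candidate window onto an orthonormal basis, most energetic first
    lower_bound = 0.0
    for p, w in zip(pattern_coeffs, window_coeffs):
        lower_bound += (p - w) ** 2
        if lower_bound > threshold:
            return True    # early rejection: this window can never match within the threshold
    return False           # not rejected; an exact spatial-domain check can confirm the match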

BibTeX entry:
@article{ouyang2016fast, title={Fast Full-Search Equivalent Pattern Matching Using Asymmetric Haar Wavelet Packets}, author={Ouyang, Wanli and Zhao, Tianle and Cham, Wai-kuen and Wei, Liying}, journal={IEEE Transactions on Circuits and Systems for Video Technology}, year={2016}, publisher={IEEE} }
Wanli Ouyang, Xingyu Zeng and Xiaogang Wang, "Learning Mutual Visibility Relationship for Pedestrian Detection with a Deep Model," International Journal of Computer Vision (IJCV), accepted, 2016. [Abstract] [BibTeX] [Full Text] [Project and code]

Abstract: Detecting pedestrians in cluttered scenes is a challenging problem in computer vision. The difficulty is added when several pedestrians overlap in images and occlude each other. We observe, however, that the occlusion/visibility statuses of overlapping pedestrians provide useful mutual relationship for visibility estimation - the visibility estimation of one pedestrian facilitates the visibility estimation of another. In this paper, we propose a mutual visibility deep model that jointly estimates the visibility statuses of overlapping pedestrians. The visibility relationship among pedestrians is learned from the deep model for recognizing co-existing pedestrians. Then the evidence of co-existing pedestrians is used for improving the single pedestrian detection results. Compared with existing image-based pedestrian detection approaches, our approach has the lowest average miss rate on the Caltech-Train dataset and the ETH dataset. Experimental results show that the mutual visibility deep model effectively improves the pedestrian detection results. The mutual visibility deep model leads to 6% - 15% improvements on multiple benchmark datasets.

BibTeX entry:
@article{ouyang2016learning, title={Learning Mutual Visibility Relationship for Pedestrian Detection with a Deep Model}, author={Ouyang, Wanli and Zeng, Xingyu and Wang, Xiaogang}, journal={International Journal of Computer Vision}, pages={1--14}, year={2016}, publisher={Springer} }
Rui Zhao, Wanli Ouyang and Xiaogang Wang, "Person Re-identification by Saliency Learning," IEEE Trans. Pattern Analysis and Machine Intelligence (TPAMI), accepted, 2016. [Abstract] [BibTeX] [Full Text] [Project and dataset (N/A)]

Abstract: Human eyes can recognize person identities based on small salient regions, i.e. person saliency is distinctive and reliable in pedestrian matching across disjoint camera views. However, such valuable information is often hidden when computing similarities of pedestrian images with existing approaches. Inspired by our user study result of human perception on person saliency, we propose a novel perspective for person re-identification based on learning person saliency and matching saliency distribution. The proposed saliency learning and matching framework consists of four steps: (1) To handle misalignment caused by drastic viewpoint change and pose variations, we apply adjacency constrained patch matching to build dense correspondence between image pairs. (2) We propose two alternative methods, i.e. K-Nearest Neighbors and One-class SVM, to estimate a saliency score for each image patch, through which distinctive features stand out without using identity labels in the training procedure. (3) Saliency matching is proposed based on patch matching. Matching patches with inconsistent saliency brings penalty, and images of the same identity are recognized by minimizing the saliency matching cost. (4) Furthermore, saliency matching is tightly integrated with patch matching in a unified structural RankSVM learning framework. The effectiveness of our approach is validated on the four public datasets. Our approach outperforms the state-of-the-art person re-identification methods on all these datasets.

BibTeX entry:
@article{zhao2014person, title={Person Re-identification by Saliency Learning}, author={Zhao, Rui and Ouyang, Wanli and Wang, Xiaogang}, journal={IEEE Trans. Pattern Anal. Mach. Intell.}, year={2016} }
Wanli Ouyang, Xingyu Zeng and Xiaogang Wang, "Partial Occlusion Handling in Pedestrian Detection with a Deep Model," IEEE Trans. Circuits and Systems for Video Technology (TCSVT), accepted, 2015. [Abstract] [BibTeX] [Full Text] [Project]

Abstract: Part-based models have demonstrated their merit in object detection. However, there is a key issue to be solved on how to integrate the inaccurate scores of part detectors when there are occlusions, abnormal deformations, appearances or illuminations. To handle the imperfection of part detectors, this paper presents a probabilistic pedestrian detection framework. In this framework, a deformable part-based model is used to obtain the scores of part detectors and the visibilities of parts are modeled as hidden variables. Once the occluded parts are identified, their effects are properly removed from the final detection score. Unlike previous occlusion handling approaches that assumed independence among the visibility probabilities of parts or manually defined rules for the visibility relationship, a deep model is proposed in this paper for learning the visibility relationship among overlapping parts at multiple layers. The proposed approach can be viewed as a general post-processing of part-detection results and can take detection scores of existing part-based models as input. Experimental results on three public datasets (Caltech, ETH and Daimler) and a new CUHK occlusion dataset, which is specially designed for the evaluation of occlusion handling approaches, show the effectiveness of the proposed approach.

BibTeX entry:
@article{ouyangPartial2015,
title={Partial Occlusion Handling in Pedestrian Detection with a Deep Model},
author={Ouyang, Wanli and Zeng, Xingyu and Wang, Xiaogang},
journal={IEEE Trans. Circuits and Systems for Video Technology},
year={2015}
}
Wanli Ouyang, Xingyu Zeng and Xiaogang Wang, "Single-Pedestrian Detection Aided by Two-Pedestrian Detection," IEEE Trans. Pattern Analysis and Machine Intelligence (TPAMI), accepted, 2015. [Abstract] [BibTeX] [Full Text] [Project, source code and dataset]

Abstract: In this paper, we address the challenging problem of detecting pedestrians who appear in groups. A new approach is proposed for single-pedestrian detection aided by two-pedestrian detection. A mixture model of two-pedestrian detectors is designed to capture the unique visual cues which are formed by nearby pedestrians but cannot be captured by single-pedestrian detectors. A probabilistic framework is proposed to model the relationship between the configurations estimated by single- and two-pedestrian detectors, and to refine the single-pedestrian detection result using two-pedestrian detection. The two-pedestrian detector can integrate with any single-pedestrian detector. Twenty-five state-of-the-art single-pedestrian detection approaches are combined with the two-pedestrian detector on three widely used public datasets: Caltech, TUD-Brussels, and ETH. Experimental results show that our framework improves all these approaches. The average improvement is 9 percent on the Caltech-Test dataset, 11 percent on the TUD-Brussels dataset and 17 percent on the ETH dataset in terms of average miss rate. The lowest average miss rate is reduced from 37 to 32 percent on the Caltech-Test dataset, from 55 to 50 percent on the TUD-Brussels dataset and from 43 to 38 percent on the ETH dataset.

BibTeX entry:
@article{ouyangsingle,
title={Single-Pedestrian Detection Aided by Two-Pedestrian Detection},
author={Ouyang, Wanli and Zeng, Xingyu and Wang, Xiaogang},
journal={IEEE Trans. Pattern Anal. Mach. Intell.},
year={2015}
}
Wanli Ouyang, Renqi Zhang and Wai-Kuen Cham, "Segmented Gray-Code Kernels for Fast Pattern Matching," IEEE Trans. Image Processing (TIP), 22(4):1512-1525, Apr. 2013. [Abstract] [BibTeX] [Full Text] [Project, source code and dataset]

Abstract: The gray-code kernels (GCK) family, which has Walsh Hadamard transform on sliding windows as a member, is a family of kernels that can perform image analysis efficiently using a fast algorithm, such as the GCK algorithm. The GCK has been successfully used for pattern matching. In this paper, we propose the G4-GCK algorithm, which is more efficient than the previous algorithm in computing GCK. The G4-GCK algorithm requires four additions per pixel for three basis vectors independent of transform size and dimension. Based on the G4-GCK algorithm, we then propose the segmented GCK (SegGCK). By segmenting input data into Ls parts, the SegGCK requires only four additions per pixel for 3Ls basis vectors. Experimental results show that the proposed algorithm can significantly accelerate the full-search equivalent pattern matching process and outperforms state-of-the-art methods.

BibTeX entry:
@ARTICLE{Ouyang2013SegGCK,
author = {Wanli Ouyang and Renqi Zhang and Wai-Kuen Cham},
title = {Segmented Gray-Code Kernels for Fast Pattern Matching},
journal = {IEEE Trans. Image Processing},
year = {2013},
volume = {22},
pages = {1512-1525},
number = {4},
month = {Apr.},
}
Wanli Ouyang, Federico Tombari, Stefano Mattoccia, Luigi Di Stefano and Wai-Kuen Cham, "Performance Evaluation of Full Search Equivalent Pattern Matching Algorithms," IEEE Trans. Pattern Analysis and Machine Intelligence (TPAMI), 34(1):127-143, Jan. 2012. [Abstract] [BibTeX] [Full Text] [Project, source code, and dataset]

Abstract: Pattern matching is widely used in signal processing, computer vision, and image and video processing. Full search equivalent algorithms accelerate the pattern matching process and, in the meantime, yield exactly the same result as the full search. This paper proposes an analysis and comparison of state-of-the-art algorithms for full search equivalent pattern matching. Our intention is that the data sets and tests used in our evaluation will be a benchmark for testing future pattern matching algorithms, and that the analysis concerning state-of-the-art algorithms could inspire new fast algorithms. We also propose extensions of the evaluated algorithms and show that they outperform the original formulations.

BibTeX entry:
@ARTICLE{Ouyang:PME,
author = {Wanli Ouyang and Federico Tombari and Stefano Mattoccia and Luigi
Di Stefano and Wai-Kuen Cham},
title = {Performance Evaluation of Full Search Equivalent Pattern Matching
Algorithms},
journal = {IEEE Trans. Pattern Anal. Mach. Intell.},
year = {2012},
volume = {34},
pages = {127 - 143},
number = {1},
month = {Jan.},
}
F. Tombari, Wanli Ouyang, L. Di Stefano, W.K. Cham, “Adaptive Low Resolution Pruning for Fast Full-Search Equivalent Pattern Matching,” Pattern Recognition Letters (JPRL), 32(15), pp 2119-2127, November 2011 [Abstract] [BibTeX] [ Full Text]

Abstract: Several recent proposals have shown the feasibility of significantly speeding-up pattern matching by means of Full Search-equivalent techniques, i.e. without approximating the outcome of the search with respect to a brute force investigation. These techniques are generally heavily based on efficient incremental calculation schemes aimed at avoiding unnecessary computations. In a very recent and extensive experimental evaluation, Low Resolution Pruning turned out to be in most cases the best performing approach. In this paper we propose a computational analysis of several incremental techniques specifically designed to enhance the efficiency of LRP. In addition, we propose a novel LRP algorithm aimed at minimizing the theoretical number of operations by adaptively exploiting different incremental approaches. We demonstrate the effectiveness of our proposal by means of experimental evaluation on a large dataset.

BibTeX entry:
@ARTICLE{Tombari2011Adaptive,
author = {Federico Tombari and Wanli Ouyang and Luigi Di Stefano and Wai-Kuen Cham},
title = {Adaptive Low Resolution Pruning for Fast Full-Search Equivalent Pattern Matching},
journal = {Pattern Recognition Letters (JPRL)},
year = {2011},
volume = {32},
pages = {2119-2127},
number = {15},
month = {Nov.},
}
Wanli Ouyang and Wai-Kuen Cham, "Fast Algorithm for Walsh Hadamard Transform on Sliding Windows", IEEE Trans. Pattern Analysis and Machine Intelligence (TPAMI), 32(1):165-171, Jan. 2010. [Abstract] [BibTeX] [Matlab Code for the proposed fast algorithm] [Project & source code] [Full Text]

Abstract: This paper proposes a fast algorithm for Walsh Hadamard Transform on sliding windows which can be used to implement pattern matching most efficiently. The computational requirement of the proposed algorithm is about 1.5 additions per projection vector per sample, which is the lowest among existing fast algorithms for Walsh Hadamard Transform on sliding windows.
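For orientation only, the toy sketch below (Python) shows the incremental idea behind sliding-window transforms for the simplest basis vector (all ones): each new window is obtained from the previous one with two additions. The paper's algorithm covers all Walsh Hadamard basis vectors and reaches about 1.5 additions per projection vector per sample, which this sketch does not reproduce.

def sliding_window_sums(signal, n):
    # Projection of every length-n window onto the all-ones basis vector.
    sums = [sum(signal[:n])]                                       # first window computed directly
    for i in range(1, len(signal) - n + 1):
        sums.append(sums[-1] - signal[i - 1] + signal[i + n - 1])  # slide: drop the old sample, add the new one
    return sums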

BibTeX entry:
@ARTICLE{Wanli:GCK,
author = {W. Ouyang and W.K. Cham},
title = {Fast Algorithm for {W}alsh {H}adamard transform on Sliding Windows},
journal = {IEEE Trans. Pattern Anal. Mach. Intell.},
year = {2010},
volume = {32},
pages = {165-171},
number = {1},
month = {Jan.},
}
Renqi Zhang, Wanli Ouyang and Wai-Kuen Cham, "Image Edge Detection Using Hidden Markov Chain Model Based on the Non-decimated Wavelet," International Journal of Signal Processing, Image Processing and Pattern Recognition, Vol.1, No.2, March 2009, pp.109-117. [Abstract] [BibTeX]

Abstract: Edge detection plays an important role in digital image processing. Based on the non-decimated wavelet, which is shift invariant, we develop a new edge detection technique using a hidden Markov chain (HMC) model. In this proposed model (NWHMC), each wavelet coefficient contains a hidden state; we adopt a Laplacian model and a Gaussian model to represent the states "big" and "small". The model can be trained by the EM algorithm, and we then employ the Viterbi algorithm to reveal the hidden state of each coefficient according to MAP estimation. Detection results on several images are provided to evaluate the algorithm. In addition, the algorithm can be applied to noisy images efficiently.
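As a rough illustration of the decoding step mentioned above, the sketch below (Python/NumPy) runs a generic two-state Viterbi decoder over a chain of wavelet coefficients, with a Gaussian emission for the "small" state and a Laplacian emission for the "big" state as in the abstract; the transition matrix and noise parameters are hypothetical stand-ins for the quantities the paper estimates with EM.

import numpy as np

def viterbi_two_state(coeffs, trans, sigma_small, b_big, prior=(0.5, 0.5)):
    # coeffs: 1-D array of wavelet coefficients; trans: 2x2 state transition probabilities
    def log_emit(x):
        log_gauss = -0.5 * np.log(2 * np.pi * sigma_small ** 2) - x ** 2 / (2 * sigma_small ** 2)  # state 0: "small"
        log_lap = -np.log(2 * b_big) - abs(x) / b_big                                              # state 1: "big"
        return np.array([log_gauss, log_lap])
    T, logA = len(coeffs), np.log(np.asarray(trans))
    delta = np.log(np.asarray(prior)) + log_emit(coeffs[0])
    back = np.zeros((T, 2), dtype=int)
    for t in range(1, T):
        cand = delta[:, None] + logA            # cand[i, j]: best score of being in state i then moving to j
        back[t] = np.argmax(cand, axis=0)
        delta = np.max(cand, axis=0) + log_emit(coeffs[t])
    states = [int(np.argmax(delta))]
    for t in range(T - 1, 0, -1):               # backtrack the MAP state sequence
        states.append(int(back[t, states[-1]]))
    return states[::-1]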

BibTeX entry:
@ARTICLE{Zhang2009Edge,
author = {Renqi Zhang and Wanli Ouyang and Wai-Kuen Cham},
title = {Image Edge Detection Using Hidden Markov Chain Model Based on the Non-decimated Wavelet},
journal = {International Journal of Signal Processing, Image Processing and Pattern Recognition},
year = {2009},
volume = {1},
pages = {109-117},
number = {2},
month = {Mar.}
}
Wanli Ouyang, C. Xiao, and G. Liu. "A new IDCT and motion compensation algorithm based on VLIW (in Chinese)," ACTA ELECTRONICA SINICA (one of the best Electronic Engineering journals in China), 33(11):2074-2079, Nov. 2005. [Abstract] [BibTeX]

Abstract: In this paper, the matrix multiplication theory is utilized to obtain Loeffler's DCT algorithm and Feig's DCT algorithm. In addition, the Feig's algorithm is extended to other three forms. Utilizing matrix decomposition representation, the links and differences between the two algorithms are revealed. This decomposition representation helps understanding and further improving the algorithms using matrix theory.

BibTeX entry:

@ARTICLE{Ouyang:IDCT,
author = {Wanli Ouyang and Chuangbai Xiao and Guang Liu},
title = {A New IDCT and Motion Compensation Algorithm Based on VLIW},
journal = {ACTA ELECTRONICA SINICA},
year = {2005},
volume = {33},
pages = {2074-2079},
number = {11},
month = {Nov.},
}
Back To Top

Conference Papers

Kai Kang, Hongsheng Li, W. Ouyang, Junjie Yan, Xihui Liu, Tong Xiao, Xiaogang Wang. "Object Detection in Videos with Tubelet Proposal Networks", Proc. CVPR, 2017. [Full Text]

Feng Zhu, Hongsheng Li, W. Ouyang, Nenghai Yu, Xiaogang Wang. "Learning Spatial Regularization with Image-level Supervisions for Multi-label Image Classification", Proc. CVPR, 2017. [Full Text]

Yu Liu, Junjie Yan, W. Ouyang. "Quality Aware Network for Set to Set Recognition", Proc. CVPR, 2017. [Full Text]

Yikang Li, W. Ouyang, Xiaogang Wang. "ViP-CNN: A Visual Phrase Reasoning Convolutional Neural Network for Visual Relationship Detection", Proc. CVPR, 2017. [Full Text]

Xiao Chu, Wei Yang, W. Ouyang, Xiaogang Wang, Alan Yuille. "Multi-Context Attention for Human Pose Estimation", Proc. CVPR, 2017. [Full Text] [Code]

Dan Xu, Elisa Ricci, W. Ouyang, Xiaogang Wang, Nicu Sebe. "Multi-Scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation", Proc. CVPR, 2017. [Full Text]

Dan Xu, W. Ouyang, Elisa Ricci, Xiaogang Wang, Nicu Sebe. "Learning Cross-Modal Deep Representations for Robust Pedestrian Detection", Proc. CVPR, 2017. [Full Text]
X. Chu, W. Ouyang, H. Li, X. Wang. "CRF-CNN: Modeling Structured Information in Human Pose Estimation", Advances in Neural Information Processing Systems (NIPS), 2016. [Full Text] [Demo Results]
Xingyu Zeng, Wanli Ouyang, Bin Yang, Junjie Yan, Xiaogang Wang, "Gated Bidirectional CNN for Object Detection", In Proc. ECCV 2016. [Full Text]

Z. Wang, H. Li, W. Ouyang, X. Wang, "Learnable Histogram: Statistical Context Features for Deep Neural Networks", In Proc. ECCV 2016. [Full Text]
Wanli Ouyang, X. Wang, C. Zhang, and X. Yang. "Factors in finetuning deep model for object detection", In Proc. CVPR 2016. [Full Text]

Wei Yang, Wanli Ouyang, Hongsheng Li and Xiaogang Wang, "End-to-End Learning of Deformable Mixture of Parts and Deep Convolutional Neural Networks for Human Pose Estimation", In Proc. CVPR 2016 (Oral). [Full Text] [Project]

Lijun Wang, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "STCT: Sequentially Training Convolutional Networks for Visual Tracking", In Proc. CVPR 2016. [Full Text]

K. Kang, Wanli Ouyang, H. Li, and X. Wang. "Object detection from video tubelets with convolutional neural networks", In Proc. CVPR 2016. [Full Text]

X. Chu, Wanli Ouyang, H. Li, and X. Wang. "Structured feature learning for pose estimation", In Proc. CVPR 2016. [Full Text] [Project and dataset] [Spotlight talk] [Source code] [Supplementary]

Tong Xiao, Hongsheng Li, Wanli Ouyang, Xiaogang Wang, "Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification", In Proc. CVPR 2016. [Full Text]

Hongyang Li, Wanli Ouyang, Xiaogang Wang, "Multi-Bias Non-linear Activation in Deep Neural Networks", In Proc. ICML 2016. [Full Text] [Slides] [Code on Github]
Wanli Ouyang, Hongyang Li, Xingyu Zeng, and Xiaogang Wang, "Learning Deep Representation with Large-scale Attributes", In Proc. ICCV 2015. [Abstract] [BibTeX] [Full Text] [Project and dataset ]

Abstract: Learning strong feature representations from large scale supervision has achieved remarkable success in computer vision with the emergence of deep learning techniques. It is driven by big visual data with rich annotations. This paper contributes a large-scale object attribute database that contains rich attribute annotations (over 300 attributes) for ~180k samples and 494 object classes. Based on the ImageNet object detection dataset, it annotates the rotation, viewpoint, object part location, part occlusion, part existence, common attributes, and class-specific attributes. Then we use this dataset to train deep representations and extensively evaluate how these attributes are useful on the general object detection task. In order to make better use of the attribute annotations, a deep learning scheme is proposed by modeling the relationship of attributes and hierarchically clustering them into semantically meaningful mixture types. Experimental results show that the attributes are helpful in learning better features and improving the object detection accuracy by 2.6% in mAP on the ILSVRC 2014 object detection dataset and 2.4% in mAP on the PASCAL VOC 2007 object detection dataset. Such improvement is well generalized across datasets.

BibTeX entry:
@CONFERENCE{ouyang2015learning,
title={Learning Deep Representation With Large-Scale Attributes},
author={Ouyang, Wanli and Li, Hongyang and Zeng, Xingyu and Wang, Xiaogang},
booktitle={Proceedings of the IEEE International Conference on Computer Vision},
pages={1895--1903},
year={2015}
}
Xiao Chu, Wanli Ouyang, Wei Yang, and Xiaogang Wang, "Multi-task Recurrent Neural Network for Immediacy Prediction", In Proc. ICCV 2015. (Oral) [Abstract] [BibTeX] [Full Text] [Project and dataset ] [Oral presentation on videolectures] [Poster]

Abstract: In this paper, we propose to predict immediacy for interacting persons from still images. A complete immediacy set includes interactions, relative distance, body leaning direction and standing orientation. These measures are found to be related to the attitude, social relationship, social interaction, action, nationality, and religion of the communicators. A large-scale dataset with 10,000 images is constructed, in which all the immediacy cues and the human poses are annotated. We propose a rich set of immediacy representations that help to predict immediacy from imperfect 1-person and 2-person pose estimation results. A multi-task deep recurrent neural network is constructed to take the proposed rich immediacy representations as the input and learn the complex relationship among immediacy predictions through multiple steps of refinement. The effectiveness of the proposed approach is proved through extensive experiments on the large-scale dataset.

BibTeX entry:
@inproceedings{chu2015multi,
title={Multi-Task Recurrent Neural Network for Immediacy Prediction},
author={Chu, Xiao and Ouyang, Wanli and Yang, Wei and Wang, Xiaogang},
booktitle={Proceedings of the IEEE International Conference on Computer Vision},
pages={3352--3360},
year={2015}
}
Lijun Wang, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu, "Visual Tracking with Fully Convolutional Networks", In Proc. ICCV 2015. [Abstract] [BibTeX] [Full Text] [Project and source code ]

Abstract: We propose a new approach for general object tracking with fully convolutional neural network. Instead of treating convolutional neural network (CNN) as a black-box feature extractor, we conduct in-depth study on the properties of CNN features offline pre-trained on massive image data and classification task on ImageNet. The discoveries motivate the design of our tracking system. It is found that convolutional layers in different levels characterize the target from different perspectives. A top layer encodes more semantic features and serves as a category detector, while a lower layer carries more discriminative information and can better separate the target from distracters with similar appearance. Both layers are jointly used with a switch mechanism during tracking. It is also found that for a tracking target, only a subset of neurons are relevant. A feature map selection method is developed to remove noisy and irrelevant feature maps, which can reduce computation redundancy and improve tracking accuracy. Extensive evaluation on the widely used tracking benchmark [36] shows that the proposed tracker outperforms the state-of-the-art significantly.

BibTeX entry:
@inproceedings{wang2015visual,
title={Visual Tracking With Fully Convolutional Networks},
author={Wang, Lijun and Ouyang, Wanli and Wang, Xiaogang and Lu, Huchuan},
booktitle={Proceedings of the IEEE International Conference on Computer Vision},
pages={3119--3127},
year={2015}
}
Details that show how our team achieved #2 in the ImageNet Large Scale Visual Recognition Challenge 2014:
Wanli Ouyang, Xiaogang Wang, Xingyu Zeng, Shi Qiu, Ping Luo, Yonglong Tian, Hongsheng Li, Shuo Yang, Zhe Wang, Chen-Change Loy and Xiaoou Tang, "DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection", In Proc. CVPR 2015. [Abstract] [BibTeX] [Full Text] [Project]

Abstract: In this paper, we propose deformable deep convolutional neural networks for generic object detection. This new deep learning object detection diagram has innovations in multiple aspects. In the proposed new deep architecture, a new deformation constrained pooling (def-pooling) layer models the deformation of object parts with geometric constraint and penalty. A new pre-training strategy is proposed to learn feature representations more suitable for the object detection task and with good generalization capability. By changing the net structures, training strategies, adding and removing some key components in the detection pipeline, a set of models with large diversity are obtained, which significantly improves the effectiveness of model averaging. The proposed approach improves the mean averaged precision obtained by RCNN [13], which is the state-of-the-art, from 31% to 50.3% on the ILSVRC2014 detection dataset. Detailed component-wise analysis is also provided through extensive experimental evaluation, which provides a global view for people to understand the deep learning object detection pipeline.

BibTeX entry:
@CONFERENCE{ouyang2015deepid,
author = {Ouyang, Wanli and Wang, Xiaogang and Zeng, Xingyu and Qiu, Shi and Luo, Ping and Tian, Yonglong and Li, Hongsheng and Yang, Shuo and Wang, Zhe and Loy, Chen-Change and Tang, Xiaoou},
title = {DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection},
booktitle = {CVPR},
year = {2015},
}
Rui Zhao, Wanli Ouyang, Hongsheng Li, and Xiaogang Wang, "Saliency Detection by Multi-context Deep Learning", In Proc. CVPR 2015. [Abstract] [BibTeX] [Full Text] [Code] [Supplementary Material]

Abstract: Low-level saliency cues or priors do not produce good enough saliency detection results especially when the salient object presents in a low-contrast background with confusing visual appearance. This issue raises a serious problem for conventional approaches. In this paper, we tackle this problem by proposing a multi-context deep learning framework for salient object detection. We employ deep Convolutional Neural Networks to model saliency of objects in images. Global context and local context are both taken into account, and are jointly modeled in a unified multi-context deep learning framework. To provide a better initialization for training the deep neural networks, we investigate different pre-training strategies, and a task-specific pre-training scheme is designed to make the multi-context modeling suited for saliency detection. Furthermore, recently proposed contemporary deep models in the ImageNet Image Classification Challenge are tested, and their effectiveness in saliency detection are investigated. Our approach is extensively evaluated on five public datasets, and experimental results show significant and consistent improvements over the state-of-the-art methods.

BibTeX entry:
@CONFERENCE{zhao2015saliency,
title={Saliency Detection by Multi-Context Deep Learning},
author={Zhao, Rui and Ouyang, Wanli and Li, Hongsheng and Wang, Xiaogang},
booktitle = {CVPR},
year={2015}
}
Xingyu Zeng, Wanli Ouyang, Xiaogang Wang, "Deep Learning of Scene-Specific Classifier for Pedestrian Detection", In Proc. ECCV 2014. [Abstract] [BibTeX] [Full Text]

Abstract: The performance of a detector depends much on its training dataset and drops significantly when the detector is applied to a new scene due to the large variations between the source training dataset and the target scene. In order to bridge this appearance gap, we propose a deep model to automatically learn scene-specific features and visual patterns in static video surveillance without any manual labels from the target scene. It jointly learns a scene-specific classifier and the distribution of the target samples. Both tasks share multi-scale feature representations with both discriminative and representative power. We also propose a cluster layer in the deep model that utilizes the scene-specific visual patterns for pedestrian detection. Our specifically designed objective function not only incorporates the confidence scores of target training samples but also automatically weights the importance of source training samples by fitting the marginal distributions of target samples. It significantly improves the detection rates at 1 FPPI by 10% compared with the state-of-the-art domain adaptation methods on the MIT Traffic Dataset and the CUHK Square Dataset.

BibTeX entry:
@CONFERENCE{Zeng2014Deep,
author = {X. Zeng and W. Ouyang and X. Wang},
title = {Deep Learning of Scene-Specific Classifier for Pedestrian Detection},
booktitle = {ECCV},
year = {2014},
}
Wanli Ouyang, Xiao Chu, Xiaogang Wang, "Multi-source Deep Learning for Human Pose Estimation", In Proc. IEEE CVPR 2014. [Abstract] [BibTeX] [Full Text]

Abstract: Visual appearance score, appearance mixture type and deformation are three important information sources for human pose estimation. This paper proposes to build a multi-source deep model in order to extract non-linear representation from these different aspects of information sources. With the deep model, the global, high-order human body articulation patterns in these information sources are extracted for pose estimation. The task for estimating body locations and the task for human detection are jointly learned using a unified deep model. The proposed approach can be viewed as a post-processing of pose estimation results and can flexibly integrate with existing methods by taking their information sources as input. By extracting the non-linear representation from multiple information sources, the deep model outperforms state-of-the-art by up to 8.6 percent on three public benchmark datasets.

BibTeX entry:
@CONFERENCE{Ouyang2014Multi,
author = {W. Ouyang and X. Chu and X. Wang},
title = {Multi-source Deep Learning for Human Pose Estimation},
booktitle = {CVPR},
year = {2014},
}
Rui Zhao, Wanli Ouyang, and Xiaogang Wang, "Learning Mid-level Filters for Person Re-Identification", In Proc. IEEE CVPR 2014. [Abstract] [BibTeX] [Project] [Code] [Full Text]

Abstract: In this paper, we propose a novel approach of learning mid-level filters from automatically discovered patch clusters for person re-identification. It is well motivated by our study on what are good filters for person re-identification. Our mid-level filters are discriminatively learned for identifying specific visual patterns and distinguishing persons, and have good cross-view invariance. First, local patches are qualitatively measured and classified with their discriminative power. Discriminative and representative patches are collected for filter learning. Second, patch clusters with coherent appearance are obtained by pruning hierarchical clustering trees, and a simple but effective cross-view training strategy is proposed to learn filters that are view-invariant and discriminative. Third, filter responses are integrated with patch matching scores in RankSVM training. The effectiveness of our approach is validated on the VIPeR dataset and the CUHK01 dataset. The learned mid-level features are complementary to existing handcrafted low-level features, and improve the best Rank-1 matching rate on the VIPeR dataset by 14%.

BibTeX entry:

@CONFERENCE{Zhao2014Midlevel,
title = {Learning Mid-level Filters for Person Re-identification},
author={Zhao, Rui and Ouyang, Wanli and Wang, Xiaogang},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2014},
month = {June},
address = {Columbus, USA}
}
Wanli Ouyang, Xiaogang Wang, "Joint Deep Learning for Pedestrian Detection", In Proc. IEEE ICCV 2013. [Abstract] [BibTeX] [Project & Source code] [Full Text]

Abstract: Feature extraction, deformation handling, occlusion handling, and classification are four important components in pedestrian detection. Existing methods learn or design these components either individually or sequentially. The interaction among these components is not yet well explored. This paper proposes that they should be jointly learned in order to maximize their strengths through cooperation. We formulate these four components into a joint deep learning framework and propose a new deep network architecture. By establishing automatic, mutual interaction among components, the deep model achieves a 9% reduction in the average miss rate compared with the current best-performing pedestrian detection approaches on the largest Caltech benchmark dataset.

BibTeX entry:
@CONFERENCE{Ouyang2013Joint,
author = {W. Ouyang and X. Wang},
title = {Joint Deep Learning for Pedestrian Detection},
booktitle = {ICCV},
year = {2013},
}
Xingyu Zeng, Wanli Ouyang, Xiaogang Wang, "Multi-Stage Contextual Deep Learning for Pedestrian Detection", In Proc. IEEE ICCV 2013. [Abstract] [BibTeX] [Project] [Full Text]

Abstract: Multi-stage classifiers have been widely used in pedestrian detection and achieved great success. However, these classifiers are usually trained sequentially without joint optimization. In this paper, we propose a new deep architecture that can jointly train multiple classifiers through several stages of back-propagation. Unsupervised pre-training and specifically designed stage-wise supervised training are used to regularize the optimization problem. Through a specific design of the training scheme, this deep architecture is able to simulate the cascaded classifiers in using hard samples to train the network stage-by-stage. Both theoretical analysis and experimental results show that the training scheme helps to avoid overfitting. Experimental results on three datasets (Caltech, ETH and TUD-Brussels) show that our approach performs better than the state-of-the-art approaches.

BibTeX entry:
@CONFERENCE{Zeng2013Multi,
author = {X. Zeng and W. Ouyang and X. Wang},
title = {Multi-Stage Contextual Deep Learning for Pedestrian Detection},
booktitle = {ICCV},
year = {2013},
}
Rui Zhao, Wanli Ouyang, Xiaogang Wang, "Person Re-identification by Salience Matching", In Proc. IEEE ICCV 2013. [Abstract] [BibTeX] [Project] [Full Text]

Abstract: Human salience is distinctive and reliable information in matching pedestrians across disjoint camera views. In this paper, we exploit the pairwise salience distribution relationship between pedestrian images, and solve the person re-identification problem by proposing a salience matching strategy. To handle the misalignment problem in pedestrian images, patch matching is adopted and patch salience is estimated. Matching patches with inconsistent salience brings penalty. Images of the same person are recognized by minimizing the salience matching cost. Furthermore, our salience matching is tightly integrated with patch matching in a unified structural RankSVM learning framework. The effectiveness of our approach is validated on the VIPeR dataset and the CUHK Campus dataset. It outperforms the state-of-the-art methods on both datasets.

BibTeX entry:
@CONFERENCE{Zhao2013Person,
author = {R. Zhao and W. Ouyang and X. Wang},
title = {Person Re-identification by Salience Matching},
booktitle = {ICCV},
year = {2013},
}
Wanli Ouyang, Xiaogang Wang, "Single-Pedestrian Detection Aided by Multi-pedestrian Detection", In Proc. IEEE CVPR 2013. [Abstract] [BibTeX] [Project & Source code] [Full Text]

Abstract: In this paper, we address the challenging problem of detecting pedestrians who appear in groups and have interaction. A new approach is proposed for single-pedestrian detection aided by multi-pedestrian detection. A mixture model of multi-pedestrian detectors is designed to capture the unique visual cues which are formed by nearby multiple pedestrians but cannot be captured by single-pedestrian detectors. A probabilistic framework is proposed to model the relationship between the configurations estimated by single- and multi-pedestrian detectors, and to refine the single-pedestrian detection result with multi-pedestrian detection. It can integrate with any single-pedestrian detector without significantly increasing the computation load. 15 state-of-the-art single-pedestrian detection approaches are investigated on three widely used public datasets: Caltech, TUD-Brussels and ETH. Experimental results show that our framework significantly improves all these approaches. The average improvement is 9% on the Caltech-Test dataset, 11% on the TUD-Brussels dataset and 17% on the ETH dataset in terms of average miss rate. The lowest average miss rate is reduced from 48% to 43% on the Caltech-Test dataset, from 55% to 50% on the TUD-Brussels dataset and from 51% to 41% on the ETH dataset.

BibTeX entry:
@CONFERENCE{Ouyang2013MultiPed,
author = {W. Ouyang and X. Wang},
title = {Single-Pedestrian Detection aided by Multi-pedestrian Detection},
booktitle = {CVPR},
year = {2013},
}
Wanli Ouyang, Xingyu Zeng and Xiaogang Wang, "Modeling Mutual Visibility Relationship in Pedestrian Detection", In Proc. IEEE CVPR 2013. [Abstract] [BibTeX] [Project] [Full Text]

Abstract: Detecting pedestrians in cluttered scenes is a challenging problem in computer vision. The difficulty is added when several pedestrians overlap in images and occlude each other. We observe, however, that the occlusion/visibility statuses of overlapping pedestrians provide useful mutual relationship for visibility estimation - the visibility estimation of one pedestrian facilitates the visibility estimation of another. In this paper, we propose a mutual visibility deep model that jointly estimates the visibility statuses of overlapping pedestrians. The visibility relationship among pedestrians is learned from the deep model for recognizing co-existing pedestrians. Experimental results show that the mutual visibility deep model effectively improves the pedestrian detection results. Compared with existing image-based pedestrian detection approaches, our approach has the lowest average miss rate on the Caltech-Train dataset, the Caltech-Test dataset and the ETH dataset. Including mutual visibility leads to 4%-8% improvements on multiple benchmark datasets.

BibTeX entry:
@CONFERENCE{Ouyang2013MutualDBN,
author = {Wanli Ouyang and Xingyu Zeng and Xiaogang Wang},
title = {Modeling Mutual Visibility Relationship in Pedestrian Detection},
booktitle = {CVPR},
year = {2013},
}
Rui Zhao, Wanli Ouyang, and Xiaogang Wang, "Unsupervised Salience Learning for Person Re-identification", In Proc. IEEE CVPR 2013. [Project & source code] [PDF]
Wanli Ouyang and Xiaogang Wang, "A Discriminative Deep Model for Pedestrian Detection with Occlusion Handling," In Proc. IEEE CVPR 2012. [Abstract] [BibTeX] [Project] [Full Text] [CUHK Occlusion Dataset]

Abstract: Part-based models have demonstrated their merit in object detection. However, there is a key issue to be solved on how to integrate the inaccurate scores of part detectors when there are occlusions or large deformations. To handle the imperfectness of part detectors, this paper presents a probabilistic pedestrian detection framework. In this framework, a deformable part-based model is used to obtain the scores of part detectors and the visibilities of parts are modeled as hidden variables. Unlike previous occlusion handling approaches that assume independence among visibility probabilities of parts or manually define rules for the visibility relationship, a discriminative deep model is used in this paper for learning the visibility relationship among overlapping parts at multiple layers. Experimental results on three public datasets (Caltech, ETH and Daimler) and a new dataset (the new dataset will be released to the public) specially designed for the evaluation of occlusion handling approaches show the effectiveness of the proposed approach.

BibTeX entry:
@CONFERENCE{Ouyang:DBNHuman,
author = {Wanli Ouyang and Xiaogang Wang},
title = {A Discriminative Deep Model for Pedestrian Detection with Occlusion
Handling},
booktitle = {CVPR},
year = {2012}, }
Wanli Ouyang, Renqi Zhang and Wai-Kuen Cham, "Fast pattern matching using orthogonal Haar transform", In Proc. IEEE CVPR 2010. [Abstract] [BibTeX] [Project & source code] [Slides] [Full Text]

Abstract: This paper introduces strip sum on the image. The sum of pixels in a rectangle can be computed by one addition using the strip sum. Then we propose to use the orthogonal Haar transform (OHT) for pattern matching. Applied for pattern matching, the algorithm using strip sum requires O(log u) additions per pixel to project input data of size N by N onto u 2-D OHT basis while existing fast algorithms require O(u) additions per pixel to project the same data onto u 2-D WHT or GCK basis.
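A minimal sketch (Python/NumPy) of the strip-sum idea stated above: after precomputing horizontal prefix sums of height-h column strips, the sum of pixels in any h-by-w rectangle needs just one further addition (a subtraction of two precomputed values). The projection onto the orthogonal Haar basis built on top of these rectangle sums is not shown, and the function names are illustrative.

import numpy as np

def strip_prefix(img, h):
    # prefix[y, x] = sum of img[y:y+h, 0:x]  (horizontal prefix of height-h column strips)
    col_cum = np.vstack([np.zeros((1, img.shape[1])), np.cumsum(img, axis=0)])
    strips = col_cum[h:, :] - col_cum[:-h, :]              # strips[y, x] = sum of img[y:y+h, x]
    return np.hstack([np.zeros((strips.shape[0], 1)), np.cumsum(strips, axis=1)])

def rect_sum(prefix, y, x, w):
    # Sum of the h-by-w rectangle whose top-left corner is (y, x): a single subtraction.
    return prefix[y, x + w] - prefix[y, x]

For example, rect_sum(strip_prefix(img, 8), y, x, 8) gives the pixel sum of the 8x8 block at (y, x).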

BibTeX entry:
@CONFERENCE{Ouyang:OHT,
author = {W. Ouyang and R. Zhang and W.K. Cham},
title = {Fast pattern matching using orthogonal {H}aar transform},
booktitle = {CVPR},
year = {2010},
}
Renqi Zhang, Wanli Ouyang and Wai-Kuen Cham, "Image Deblocking using Dual Adaptive FIR Wiener Filter in the DCT Transform Domain," In Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, ICASSP 2009, Taiwan, April 19-24, 2009, pp.1181-1184. [Abstract] [BibTeX]

Abstract: Blocking artifacts exist in images and video sequences compressed to low bit rates using block discrete cosine transform (DCT) compression standards. In order to reduce blocking artifacts, a novel DCT domain technique is presented in this paper. Firstly, a new FIR Wiener filter which exploits the dependence of neighboring DCT coefficients based on the linear minimum mean-square-error (LMMSE) criterion is proposed. Then we apply the new FIR Wiener filter twice in a dual adaptive filtering structure to restore each quantized DCT coefficient. In addition, an efficient parameter estimation method is proposed for the designed filter. Experimental results show that the performance of the proposed method is comparable to the state-of-the-art methods but has low computational complexity.
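For reference, the generic scalar LMMSE (Wiener) shrinkage of a noisy coefficient has the form below; the paper's dual adaptive FIR filter goes further by exploiting the dependence among neighboring DCT coefficients, so this is only the baseline idea, with all symbols generic rather than the paper's notation.

\hat{x} = \mu_x + \frac{\sigma_x^{2}}{\sigma_x^{2} + \sigma_n^{2}} \left( y - \mu_x \right)

Here y is the observed (quantized) coefficient, \mu_x and \sigma_x^2 are the prior mean and variance of the clean coefficient, and \sigma_n^2 is the noise (quantization error) variance.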

BibTeX entry:
@CONFERENCE{Zhangrq:WienerDeblkICAASP,
author = {Renqi Zhang and Wanli Ouyang and W.K. Cham},
title = {Image Deblocking using Dual Adaptive FIR Wiener Filter in the {DCT} Transform Domain},
booktitle = {Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP)},
year = {2009},
pages = {1181-1184},
address = {Taiwan},
month = {April 19-24},
}
Renqi Zhang, Wanli Ouyang and Wai-Kuen Cham, "Image Multi-scale Edge Detection using 3-D Hidden Markov Model based on the Non-decimated Wavelet," In Proc. 2009 IEEE International Conference on Image Processing (ICIP), Cairo, Egypt, November 7-10, 2009, pp.2173-2176. [Abstract] [BibTeX]

Abstract: Edge detection plays an important role in digital image processing. Based on the non-decimated wavelet which is shift-invariant, in this paper, we develop a new edge detecting technique using 3-D Hidden Markov Model. Our proposed model can not only capture the relationship of the wavelet coefficients inter-scale, but also consider the intra-scale dependence. A computationally efficient maximum likelihood (ML) estimation algorithm is employed to compute parameters and the hidden state of each coefficient is revealed by maximum a posteriori (MAP) estimation. Experimental results of natural images are provided to evaluate the algorithm. In addition, the proposed model has the potential to be an efficient multi-scale statistical modeling tool for other image or video processing tasks.

BibTeX entry:
@CONFERENCE{Renqi:Edge,
author = {Renqi Zhang and Wanli Ouyang and Wai-Kuen Cham},
title = {Image Multi-scale Edge Detection using 3-D Hidden Markov Model based
on the Non-decimated Wavelet},
booktitle = {Proc. 2009 IEEE International Conference on Image Processing (ICIP)},
year = {2009},
}
Wanli Ouyang, D. Song, C. Xiao, and W. Ju. The matrix decomposition representation of DCT algorithms. In IEEE Midwest Symp. Circuits and Syst. (MWSCAS), 2005.
Wanli Ouyang, C. Xiao, W. Ju, and D. Song. The dynamic range acquisition of DCT and IDCT algorithms. In IEEE Midwest Symp. Circuits and Syst. (MWSCAS), 2005.
Wanli Ouyang, C. Xiao, W. Ju, and D. Song. Practical fast asymmetric DCT algorithm based on SIMD and VLIW. In IEEE Int. Symp. Intelligent Signal Processing, 2005.
Back To Top