Prof. Yizhou Yu
The University of Hong Kong, Hong Kong (China)
Visual Intelligence Based on Deep Learning

Deep learning is a powerful machine learning paradigm that uses deep neural network architectures to extract high-level representations from multi-dimensional sensory data. Such high-level representations are essential for many intelligence-related tasks, including visual recognition, speech perception, and language understanding. In this talk, I first give an overview of deep learning and its applications in computer vision and visual perception. Then I present one of the deep learning projects for visual intelligence carried out in my research group. This project addresses scene labeling, which is also known as semantic scene segmentation. It is one of the most fundamental problems in computer vision, and refers to associating every pixel in an image with a semantic object category label, such as 'building', 'car', and 'table'. High-quality scene labeling can benefit many intelligent tasks, including robot task planning, pose estimation, context-based image retrieval, and automatic photo adjustment. Our project focuses on semantic labeling of RGB-D scenes, and generates pixel-wise, fine-grained label maps from simultaneously sensed photometric (RGB) and depth channels. Specifically, we tackle this problem by i) developing a novel Long Short-Term Memorized Context Fusion (LSTM-CF) model that captures image contexts from a global perspective and deeply fuses contextual information from multiple sources (i.e., photometric and depth channels), and ii) incorporating this model into deep convolutional neural networks (CNNs) for end-to-end training. Experiments on the large-scale SUNRGBD benchmark and the canonical NYUDv2 benchmark demonstrate that our method outperforms existing state-of-the-art methods. In addition, we have found that our scene labeling results can be leveraged to improve the ground-truth annotations of newly captured RGB-D images in the SUNRGBD dataset.


Professor Yu received his PhD degree in computer science from the University of California, Berkeley in 2000. He also holds an MS degree in applied mathematics and a BE degree in computer science and engineering from Zhejiang University. He was first a tenure-track and then a tenured professor at the University of Illinois at Urbana-Champaign for more than ten years, and a visiting researcher at Microsoft Research Asia in 2001 and 2008.

Professor Yu has made significant contributions to visual computing, including computer vision, computer graphics, and pattern recognition. He is a recipient of the 2011 and 2005 ACM SIGGRAPH/EG SCA Best Paper Awards, the 2007 NNSF China Overseas Distinguished Investigator Award, the 2002 US National Science Foundation CAREER Award, and a 1998 Microsoft Graduate Fellowship. His current research interests include deep learning methods for visual recognition, digital geometry processing, video surveillance, and biomedical data analysis.

Professor Yu is an associate editor of IEEE Transactions on Visualization and Computer Graphics and the International Journal of Software and Informatics. He was a program chair of Computer Animation and Social Agents 2012 and Pacific Graphics 2009, and a conference chair of the ACM SIGGRAPH/EG Symposium on Computer Animation 2013. He has served on the program committees of many leading international conferences, including SIGGRAPH, SIGGRAPH Asia, and the International Conference on Computer Vision.
