CVPR 2010 Tutorial on Learning multi-subspaces in computer vision
Description
Subspace methods are important for solving numerous problems in machine learning, pattern recognition, and computer vision. While subspace methods have been studied intensively (from PCA and beyond), new results are emerging from the areas of sparse coding, matrix factorization, and matrix completion. These results, in contrast with more classical subspace learning approaches, deal with multiple subspaces, of varying dimensions, automatically learned from the data. This makes the model extremely non-linear and rich. Applications of such methods range from computer vision (motion estimation, scene analysis, etc) to data mining (collaborative filtering, gene expression, etc). The goal of this tutorial is to present the audience with a unifying perspective of this problem, introducing the basic concepts and connecting multi-subspace methods with sparse modeling, matrix factorization, and dimensionality reduction. The presentation of the theoretical foundations will be complemented with applications in computer vision including motion segmentation, face recognition, clustering, and registration.
Organizers
Yi Ma
Guillermo Sapiro
René Vidal
Keynote Speaker
Emmanuel Candés is a Professor of Mathematics, a Professor of Statistics, and a member of the Institute of Computational and Mathematical Engineering at Stanford University. He is also the Ronald and Maxine Linde Professor of Applied and Computational Mathematics at the California Institute of Technology (on leave). He received the Ph.D. degree in statistics from Stanford University in 1998. His research interests are in computational harmonic analysis, multiscale analysis, mathematical optimization, statistical estimation and detection with applications to the imaging sciences, signal processing, scientific computing, and inverse problems. Professor Candés received numerous awards, most notably the 2006 Alan T. Waterman Medal, which is the highest honor bestowed by the National Science Foundation and which recognizes the achievements of scientists who are no older than 35, or not more than seven years beyond their doctorate. Other awards include the 2008 Information Theory Society Paper Award, the 2005 James H. Wilkinson Prize in Numerical Analysis and Scientific Computing awarded by the Society of Industrial and Applied Mathematics (SIAM) and the 2010 George Polya Prize awarded by SIAM. He has given over 40 plenary lectures at major international conferences.
Syllabus
  1. Introduction (René Vidal - 5 minutes)

  2. Robust Principal Component Analysis: Theory and Algorithms (Emanuel Candés - 45 minutes) This part of this tutorial is about a curious phenomenon. Suppose we have a data matrix, which is the superposition of a low-rank component and a sparse component. Can we recover each component individually? We prove that under some suitable assumptions, it is possible to recover both the low-rank and the sparse components exactly by solving a very convenient convex program; among all feasible decompositions, simply minimize a weighted combination of the nuclear norm and of the L1 norm. This suggests the possibility of a principled approach to robust principal component analysis since our methodology and results assert that one can recover the principal components of a data matrix even though a positive fraction of its entries are arbitrarily corrupted. This extends to the situation where a fraction of the entries are missing as well. We discuss an algorithmic approach for solving this optimization problem emphasizing the suitability of this approach for large scale problems.

  3. Robust Principal Component Analysis: Applications (Yi Ma - 45 minutes) This part of the tutorial will present several applications of Robust PCA to various problems in image/video processing, computer vision, and webdata analysis. In the area of video surveillance for example, our methodology allows for the detection of objects in a cluttered background, and in the area of face recognition, it offers a principled way of removing shadows and specularities in images of faces. Moreover, we will show how a special extension of the RPCA can readily handle simultaneously transformation in domain and corruptions in intensity, for various purposes such image alignment and rectification.

  4. Multi-Subspace Learning and Clustering via Sparse Representation (René Vidal - 45 minutes)
    1. Recovery of Block-Sparse Signals via Sparse Representation
    2. Clustering of Multiple Subspaces via Sparse Representation
    3. Applications to Motion and Video Segmentation

  5. Collaborative Dictionary and Multi-Subspace Learning (Guillermo Sapiro - 45 minutes) This part of the tutorial will show the need to work with learned multiple subspaces and connect it with dictionary learning. After presenting the underlying structure of the problem, we will primarily concentrate on two examples illustrating both the need and the power of multiple subspaces. The first one shows how to learn multiple PCAs (or GMM) to obtain state-of-the-art image enhancement results at a fraction of the computation cost of previous approaches. The second one presents a hierarchical model where the multiple sub spaces are exploited in a collaborative and hierarchical fashion, with applications ranging from classification and clustering to source separation.
    1. Supervised and Unsupervised Clustering via Subspaces and Learned Dictionaries
    2. Collaborative Hierarchical Sparse Models for Clasification and Reconstruction
    3. Applications to Object Recognition

  6. Discussion (15 minutes)
Prerequisites
The course is at the intermediate level which requires the audience to have some basic knowledge in images, linear algebra, and optimization. Some knowledge of introductory-level computer vision, image processing, and machine learning may increase the appreciation of some of the theory and applications, but is not crucial. The course will be sufficiently self-contained, making even beginning graduate students appreciate the value of the framework.
Intended Audience
The intended audience are academicians, graduate students, and industrial researchers who are interested in the state-of-the-art data modeling and machine learning techniques for the modeling and segmenting data that are considered to be mixed, multi-modal, inhomogeneous, heterogeneous, or hybrid. Audience with mathematical and theoretical inclination will enjoy the course as much as the audience with practical tendency.
Tutorial History
This is the first time such a tutorial is delivered, since most of the topics of this tutorial are being developed in the last year or so. Related tutorials include GPCA at CVPR '07 and '08 (Ma and Vidal), sparse modeling at CVPR '09 (Ma) and sparse modeling at ICCV '09 (Sapiro). Those tutorials, that had audience at the level of 50-100, include some overlapping background material, but quite minimal. Participants from those tutorials will receive over 80% of new material in this new tutorial. The speakers have created semester long courses in the topics of this tutorial at their respective institutions.
Funding
Limited funding is available from Microsoft Research Asia for students to partially cover their expenses for attending the course. Interested students should have their advisors send an e-mail to rvidal@jhu.edu by May 15th in order to apply for financial support.
Sponsors
The organizers acknowledge the financial support of Microsoft Research.