Non-Convex Methods for Discovering High-Dimensional Structures in Big and Corrupted Data

Project Summary

Discovering structure in high-dimensional data, such as images, videos and 3D point clouds, has become an essential part of scientific discovery in many disciplines, including machine learning, computer vision, pattern recognition, and signal processing. This has motivated extraordinary advances in the past decade, including various sparse and low-rank modeling methods based on convex optimization with provable theoretical guarantees of correct recovery. However, existing theory and algorithms rely on the assumption that high-dimensional data can be well approximated by low-dimensional structures. While this assumption is adequate for some datasets, e.g., images of faces under varying illumination, it may be violated in many emerging datasets, e.g., 3D point clouds. The goal of this project is to develop a mathematical modeling framework and associated non-convex optimization tools for discovering high-dimensional structures in big and corrupted data.
This project will develop provably correct and scalable optimization algorithms for learning a union of high-dimensional subspaces from big and corrupted data. The proposed algorithms will be based on a novel framework called Dual Principal Component Pursuit that, instead of learning a basis for each subspace, seeks to learn a basis for their orthogonal complements. In sharp contrast with existing sparse and low-rank methods, which require both the dimensions of the subspaces and the percentage of outliers to be sufficiently small, the proposed framework will lead to results where even subspaces of the highest possible dimension (i.e., hyperplanes) can be correctly recovered from highly corrupted data. This will be achieved by solving a family of non-convex sparse representation problems whose analysis will require the development of novel theoretical results to guarantee the correct recovery of the subspaces from corrupted data. The project will also develop scalable algorithms for solving these non-convex optimization problems and study conditions for their convergence to the global optimum. These algorithms will be evaluated in two major applications in computer vision: segmentation of point clouds and clustering of image categorization datasets.

Detecting Outliers in a High-Dimensional Subspace

In [1-4], we studied and developed an L1-based non-convex optimization technique, called Dual Principal Component Pursuit (DPCP), which can be applied to single- as well as multiple-subspace learning. The method is motivated by the fact that state-of-the-art subspace methods for outlier rejection, such as Outlier Pursuit (Xu et al. 2010), self-expressiveness approaches (Soltanolkotabi et al. 2011), and RANSAC (Fischler and Bolles 1981), as well as state-of-the-art subspace clustering methods, such as Sparse Subspace Clustering (Elhamifar and Vidal 2009, 2013) and Low-Rank Subspace Clustering (Liu et al. 2010), require the dimension of the subspace to be sufficiently small, and thus fail for high-dimensional subspaces such as hyperplanes (subspaces of codimension 1). DPCP fills this gap in the literature by solving a non-convex L1 problem on the sphere, whose global solution is a vector orthogonal to the subspace (e.g., the normal vector to the subspace, if the subspace is a hyperplane). Under a random spherical model, we showed [3,4] that DPCP can tolerate as many outliers as the square of the number of inliers, thus improving upon other provably correct robust PCA methods. In [1,2], we proved that the non-convex problem can be solved through a sequence of linear programs, which minimize the same L1 objective locally on a sequence of tangent spaces to the sphere, and we showed that the solutions of these linear programs converge to the global minimum of the non-convex problem in a finite number of steps. In [3,4], we proposed a scalable Projected Sub-Gradient Method (DPCP-PSGM) for solving the DPCP problem and showed that it achieves linear convergence even though the underlying optimization problem is non-convex and non-smooth. By repeating this process as many times as the codimension of the subspace, a basis for the orthogonal complement of the subspace is found.
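For concreteness, the following is a minimal NumPy sketch of a DPCP-style projected subgradient iteration. It is a simplified stand-in for DPCP-PSGM from [3,4]: the spectral initialization, the normalized step, and the geometrically decaying step size are illustrative choices, not the exact settings analyzed in those papers.

```python
import numpy as np

def dpcp_psgm(X, mu0=0.1, beta=0.8, iters=60):
    """Minimize ||X^T b||_1 over the unit sphere ||b||_2 = 1.

    X is D x N with one (unit-norm) data point per column; for inliers
    lying in a hyperplane, the minimizer is the hyperplane's normal."""
    # Spectral initialization: the left singular vector with the smallest
    # singular value is typically already close to the sought normal.
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    b = U[:, -1].copy()
    mu = mu0
    for _ in range(iters):
        g = X @ np.sign(X.T @ b)      # a subgradient of ||X^T b||_1
        g -= (g @ b) * b              # project onto the tangent space at b
        norm_g = np.linalg.norm(g)
        if norm_g < 1e-12:
            break
        b -= mu * g / norm_g          # normalized, geometrically decaying step
        b /= np.linalg.norm(b)        # project back onto the sphere
        mu *= beta
    return b

# Demo: 500 inliers on the hyperplane {x : x[4] = 0} in R^5, plus 100 outliers.
rng = np.random.default_rng(0)
inliers = np.vstack([rng.standard_normal((4, 500)), np.zeros((1, 500))])
outliers = rng.standard_normal((5, 100))
X = np.hstack([inliers, outliers])
X /= np.linalg.norm(X, axis=0)        # normalize each column
b = dpcp_psgm(X)                      # recovered normal, close to +/- e5
```

Note that the method only touches matrix-vector products with X per iteration, which is what makes the PSGM approach scale to large N.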
Experiments with real and synthetic data demonstrated that DPCP advances the state of the art in outlier rejection for principal component analysis.

Clustering in a Union of Hyperplanes

In [5,6] we extended the theoretical analysis and algorithms of DPCP to the case where the data are drawn from a union of hyperplanes. We proposed a sequential hyperplane learning algorithm that first learns the dominant hyperplane H1 by applying DPCP to the data, then computes the normal vector to the second dominant hyperplane H2, and so on. The extensive theoretical analysis in [5,6] reveals that, as long as the dataset contains enough points from the dominant hyperplane H1 (relative to the points from the other hyperplanes), the normal vector to H1 is the unique (up to sign) solution to the L1 non-convex problem. Moreover, this solution can be provably computed by the efficient algorithms developed for DPCP. To compute the normal vector to H2, instead of removing the points that are close to H1, a procedure known to be sensitive to the choice of the thresholding parameter, we weight all points according to their distance to H1 and then solve the non-convex problem on the weighted data. Experiments on synthetic data show that our algorithm dramatically outperforms similar sequential algorithms based on traditional approaches such as RANSAC or modern methods such as REAPER.
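The weighting scheme can be illustrated with the following sketch. The `dpcp_psgm` routine is a simplified stand-in for the solvers of [1-4], and taking the weight of each point to be its absolute correlation with the estimated normal (i.e., its distance to the recovered hyperplane) is one concrete instance of the distance-based weighting described above.

```python
import numpy as np

def dpcp_psgm(X, mu0=0.1, beta=0.8, iters=60):
    # Simplified projected subgradient sketch: min ||X^T b||_1 s.t. ||b||_2 = 1.
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    b = U[:, -1].copy()                   # spectral initialization
    mu = mu0
    for _ in range(iters):
        g = X @ np.sign(X.T @ b)          # subgradient of ||X^T b||_1
        g -= (g @ b) * b                  # tangent-space projection
        norm_g = np.linalg.norm(g)
        if norm_g < 1e-12:
            break
        b -= mu * g / norm_g
        b /= np.linalg.norm(b)
        mu *= beta
    return b

def sequential_hyperplanes(X, k):
    """Learn k hyperplane normals sequentially. Instead of removing points
    close to each recovered hyperplane, reweight every point by its
    distance to that hyperplane before solving the next problem."""
    normals, W = [], np.ones(X.shape[1])
    for _ in range(k):
        b = dpcp_psgm(X * W)              # weighted DPCP: columns scaled by W
        normals.append(b)
        W = W * np.abs(b @ X)             # downweight points near this hyperplane
    return normals

# Demo: union of two hyperplanes in R^3, the dominant one {x : x[2] = 0}
# (300 points) and a second one {x : x[0] = 0} (150 points).
rng = np.random.default_rng(1)
H1 = np.vstack([rng.standard_normal((2, 300)), np.zeros((1, 300))])
H2 = np.vstack([np.zeros((1, 150)), rng.standard_normal((2, 150))])
X = np.hstack([H1, H2])
X /= np.linalg.norm(X, axis=0)
b1, b2 = sequential_hyperplanes(X, 2)     # b1 near +/- e3, b2 near +/- e1
```

Because the weights decay smoothly with distance, no hard threshold has to be tuned, which is the advantage over removal-based schemes noted above.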

Acknowledgement

Work supported by NSF grant 1704458.

Publications

[1] Dual Principal Component Pursuit. ICCV Workshop on Robust Subspace Learning and Computer Vision, 2015.
[2] Dual Principal Component Pursuit. Journal of Machine Learning Research (JMLR), 2018.
[3] Dual Principal Component Pursuit: Improved Analysis and Efficient Algorithms. Neural Information Processing Systems (NeurIPS), 2018.
[4] Noisy Dual Principal Component Pursuit. International Conference on Machine Learning (ICML), 97:1617-1625, 2019.
[5] Hyperplane Clustering via Dual Principal Component Pursuit. International Conference on Machine Learning (ICML), 2017.
[6] Hyperplane Clustering via Dual Principal Component Pursuit. arXiv:1706.01604v2 [cs.CV].
[7] Basis Pursuit and Orthogonal Matching Pursuit for Subspace-preserving Recovery: Theoretical Analysis. Journal of Machine Learning Research, 2020 (submitted).
[8] Self-Representation Based Unsupervised Exemplar Selection in a Union of Subspaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020 (submitted).
[9] What is the Largest Sparsity Pattern that Can Be Recovered by 1-Norm Minimization? IEEE Transactions on Information Theory, 2019 (submitted).
[10] Finding the Sparsest Vectors in a Subspace: Theory, Algorithms, and Applications. IEEE Signal Processing Magazine, 2020 (submitted).
[11] Regional Complexity Analysis of Algorithms for Nonconvex Smooth Optimization. Mathematical Programming, 2020.
[12] Exploiting Negative Curvature in Deterministic and Stochastic Optimization. Mathematical Programming, 176(1):69-94, 2019.
[13] Nonconvex Robust Low-Rank Matrix Recovery. SIAM Journal on Optimization, 30(1):660-686, 2020.
[14] A Nonconvex Formulation for Low Rank Subspace Clustering: Algorithms and Convergence Analysis. Computational Optimization and Applications, 70(2):95-418, 2018.
[15] A Critique of Self-Expressive Deep Subspace Clustering. Neural Information Processing Systems (NeurIPS), 2020 (submitted).
[16] Dual Principal Component Pursuit for Learning a Union of Hyperplanes: Theory and Algorithms. Neural Information Processing Systems (NeurIPS), 2020 (submitted).
[17] Generalized Nullspace Properties for Subspace-Preserving Recovery. Neural Information Processing Systems (NeurIPS), 2020 (submitted).
[18] Robust Homography Estimation via Dual Principal Component Pursuit. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[19] Analysis of the Optimization Landscapes for Overcomplete Representation Learning. International Conference on Learning Representations (ICLR), 2020.
[20] A Linearly Convergent Method for Non-Smooth Non-Convex Optimization on the Grassmannian with Applications to Robust Subspace and Dictionary Learning. Neural Information Processing Systems (NeurIPS), 2019.
[21] A Nonconvex Approach for Exact and Efficient Multichannel Sparse Blind Deconvolution. Neural Information Processing Systems (NeurIPS), 2019.
[22] Exact and Efficient Multi-Channel Sparse Blind Deconvolution: A Nonconvex Approach. IEEE Asilomar Conference on Signals, Systems & Computers, 2019.
[23] Is an Affine Constraint Needed for Affine Subspace Clustering? IEEE/CVF International Conference on Computer Vision (ICCV), pp. 9915-9924, 2019.
[24] Classifying and Comparing Approaches to Subspace Clustering with Missing Data. IEEE/CVF International Conference on Computer Vision (ICCV), 2019.
[25] Adaptive Online k-Subspaces with Cooperative Re-Initialization. IEEE/CVF International Conference on Computer Vision (ICCV), 2019.
