JHU Computer Vision Machine Learning

Project Summary

Dynamical systems are widely used for the analysis, verification, and control of physical, mechanical, thermal, chemical and biological processes. However, there are many emerging applications in which one also needs to "compare the dynamics of two processes". In computer vision, for example, one can use dynamical models to describe kinematic and video data of human motion. While different people move differently, the dynamical models of two people performing the same task (e.g., walking) should be "closer" to each other than the models of two people performing different tasks (e.g., walking vs running). This project willl develop a mathematical framework for efficiently comparing dynamical systems identified from high-dimensional time-series data as well as algorithms for clustering, classification, and statistical analysis of such data. The proposed framework will be developed for spaces of linear dynamical systems whose quotient structure is defined by the action of a group on a smooth manifold. A family of efficiently computable distances in the ambient space will be used to define a family of "group-action-induced distances" in the quotient space. Such distances will be used to develop methods for performing classification, clustering and statistical analysis on spaces of dynamical systems. The proposed methods will be evaluated on kinematic and video data of human activities.

Initial-State Invariant Binet-Cauchy Kernels for the Comparison of Linear Dynamical Systems

Linear Dynamical Systems (LDSs) have been extensively used for modeling and recognition of dynamic visual phenomena such as human activities, dynamic textures, facial deformations and lip articulations. In these applications, a huge number of LDSs identified from high-dimensional time-series need to be compared. Over the past decade, three computationally efficient distances have emerged: the Martin distance, distances obtained from the subspace angles between observability subspaces, and distances obtained from the family of Binet-Cauchy kernels. The main contribution of our work [1] is to show that the first two distances are particular cases of the latter family obtained by making the Binet-Cauchy kernels invariant to the initial states of the LDSs. We also extend Binet-Cauchy kernels to take into account the mean of the dynamical process. We evaluate the performance of our metrics on several human activity recognition datasets and show similar or better results.

The Alignment Distance on Spaces of Linear Dynamical Systems

In [2], we introduce a family of group action induced distances on spaces of Linear Dynamical Systems (LDSs) of fixed size and order. The distance between two LDSs is computed by finding the change of basis that best aligns the state-space realizations of the two LDSs, hence the name alignment distance. This distance can be computed efficiently, hence it is particularly suitable for applications in modern dynamic data analysis (e.g., video sequence classification and clustering), where a large number of high-dimensional LDSs may need to be compared. Based on the alignment distance, we also define a notion of average between LDSs of the same size and order with the property that the order and in some cases stability are naturally preserved. Various extensions to the basic notion of alignment distance are also proposed.

Distances on Spaces of High-Dimensional Linear Stochastic Processes

In [3] we study the geometrization of certain spaces of stochastic processes. In the first part of the paper, we provide a rather extensive review of some existing approaches to defining distances on spaces of stochastic processes. The majority of these distances are, in one way or another, based on comparing power spectral densities of the processes. In the second part, we focus on the space of processes generated by (stochastic) linear dynamical systems (LDSs) of fixed size and order, on which we recently introduced a class of group action induced distances called the alignment distances. This space is a natural choice in some pattern recognition applications and is also of great interest in control theory. We review and elaborate on the notion of alignment distance. Often it is convenient to represent LDSs in state-space form, in which case the space (more precisely manifold) of LDSs can be considered as the base space of a principal fiber bundle comprised of state-space realizations. This is due to a Lie group action symmetry present in the state-space representation of LDSs. The basic idea behind the alignment distance is to compare two LDSs by first aligning a pair of their realizations along the respective fibers. Upon a standardization (or bundle reduction) step this alignment process can be expressed as a minimization problem over orthogonal matrices, which can be solved efficiently. The alignment distance differs from most existing distances in that it is a structural or generative distance, since in some sense it compares how two processes are generated. We also briefly discuss averaging LDSs using the alignment distance via minimizing a sum of the squares of distances (namely, the so-called Frechet mean).

Bundle Reduction and the Alignment Distance on Spaces of State-Space LTI Systems

In [4] we introduce a large class of differential- geometric distances between finite-dimensional linear dynamical systems, collectively called the alignment distance. Contrary to the existing distances, the alignment distance is based on the state- space description of dynamical systems, and is defined on mani- folds of systems of fixed order and fixed input-output dimension under a matrix rank constraint (e.g., minimality, controllability, or observability). While the quotient topology and principal fiber bundle structure associated with such manifolds have been known since the early days of modern control theory, distances natural to this structure have not been studied. The starting point for defining such a distance is to identify a linear system of order n with its equivalence class of state-space realizations, all related by the so-called similarity action, i.e., state-space change of basis under GL(n), the Lie group of nonsingular n x n matrices. The main idea of the alignment distance is to first find the best state-space change of basis that brings a realization of a system "as close as possible" to a realization of another system (the alignment step), and then compare the aligned realizations. A direct implementation of this idea, due to noncompactness of GL(n), is complicated. However, using the notion of "reduction of the structure group" of a principal bundle, we show that the change of basis can be restricted to an orthogonal change of basis, provided one uses realizations in a reduced subbundle. This key observation brings about significant computational benefits. As a technical contribution (possibly of independent interest), we show that several forms of realization balancing available in the control literature have differential- geometric significance, and are, indeed, examples of reducing the structure group from GL(n) to its subgroup of orthogonal matrices O(n). The alignment distance can be defined for stable and unstable systems, discrete or continuous-time, and stochastic systems.

Model Order Reduction using the Alignment Distance

In [5,6] we formulate model order reduction for discrete-time LTI (MIMO) systems in terms of the alignment distance. The intuition behind our formulation is to consider systems of orders lower than n as boundary points of the manifold minimal systems of order n and fixed size, in an appropriate ambient space. The goal is to find a system of order at most r (on the boundary) closest to a given system of order n, where closeness is measured in the alignment distance. We introduce an algorithm for this minimization problem and give some a-priori error bounds in terms of the Hankel singular values of the system. Interesting relations and resemblances emerge with the popular balanced truncation reduction, which is a method not based on any optimality criterion. We show that in certain cases (but not always) balanced truncation provides a good approximation to reduction based on the alignment distance. In fact, our approach can be considered as a principled attempt to put balanced truncation in an optimization framework, and in doing so we allude to a shortcoming of balanced truncation that highlights an advantage of our approach. The proposed approach is general and can be extended to other classes of systems.

Linear Dynamical Systems with Sparse Inputs for Modeling Complex Dynamics

In [7] we introduce a new class of linear time-invariant systems for which, at each time instant, the input is sparse with respect to an overcomplete dictionary of inputs. Such systems may be appropriate for modeling a system which exhibits multiple discrete behaviors orchestrated by the sparse input. Although the input is assumed to be unknown, we show that the additional structure imposed on the input allows us to recover both the initial state and the sparse, but unknown, input from output measurements alone. For this purpose, we derive sufficient observability and sparse recovery conditions that integrate classical observability conditions for linear systems with incoherence conditions for sparse recovery. We also propose a convex optimization algorithm for jointly estimating the initial condition and recovering the sparse input.
In [8], based on this idea, we propose a surgical gesture segmentation and classification method based on shared, discriminative, sparse dictionary learning, which can be used to effectively analyze complex surgical gestures recorded by the da Vinci robotic surgical system. Rather than learning a separate dictionary for each gesture in an unsupervised manner, we propose an algorithm for jointly learning a common overcomplete dictionary for all gestures together with a multi-class linear support vector machine for classifying each gesture. Experiments on the JHU-ISI Gesture and Skill Assessment Dataset (JIGSAWS), which contains the motions from three surgical tasks, reveal that the proposed method performs on par or better than state-of-the-art methods based only on kinematic data.

ALignment DIstance Package (ALDIP)

The first version of the ALignment DIstance Package (ALDIP) will be availabel online. The package includes codes for calculating the alignment distace, averaging linear dynamical systems, model order reduction, etc.

Acknowledgement

This project is supported by grant NSF 1335035.

Publications

[1]

R. Chaudhry and R. Vidal.

Initial-state invariant Binet-Cauchy kernels for the comparison of Linear Dynamical Systems.

IEEE Conference on Decision and Control, pp. 1162-1167, 2013.
Download: [pdf] [HTML]

[2]

B. Afsari and R.Vidal.

The alignment distance on Spaces of Linear Dynamical Systems.

IEEE Conference on Decision and Control, 2013.
Download: [pdf]

[3]

B. Afsari and R.Vidal.

Distances on Spaces of High-Dimensional Linear Stochastic Processes: A Survey.

Geometric Theory of Information, Signals and Communication Technology, 2014.
Download: [pdf] [HTML]

[4]

B. Afsari and R.Vidal.

Bundle Reduction and the Alignment Distance on Spaces of State-Space LTI Systems.

IEEE Transactions of Automatic Control, 2017.
Download: [pdf]

[5]

B. Afsari and R.Vidal.

Model order reduction for discrete-time LTI systems using the alignment distance.

American Control Conference (ACC), pp. 2868-2875, 2015.

[6]

B. Afsari and R.Vidal.

Model Order Reduction in the Alignment Distance and Metrization of the Kalman Decomposition.

IEEE Transactions of Automatic Control, 2016. (Under Review)

[7]

S. Sefati, N. J. Cowan and R. Vidal

Linear Systems with Sparse Inputs: Observability and Input Recovery.

American Control Conference (ACC), 2015.
Download: [pdf]

[8]

S. Sefati, N. J. Cowan and R. Vidal

Learning Shared, Discriminative Dictionaries for Surgical Gesture Segmentation and Classification.

Modeling and Monitoring of Computer Assisted Interventions (MMCAI), 2015.
Download: [pdf]