Learning Theory II: Modeling and Segmentation of Multivariate Mixed Data
(BME 580.692, CS 600.462)
Instructor: René Vidal
Phone: 410-516-7306, E-mail: rvidal @ cis.jhu.edu
Time/Place: T-Th 4:30pm-6pm, Hodson 301
Office Hours: Mondays 5-6, 308B Clark Hall
The aim of this two-semester course is to study the foundations of
computational methods for the statistical and dynamical modeling of
multivariate data. The emphasis of Learning Theory I is to use probability
theory to build models of data in the framework of regression,
classification, and data reduction. The emphasis of Learning Theory II is to
use methods from algebraic geometry, probability theory, and dynamical
systems theory to build models of data in the framework of linear and
polynomial algebra and dynamical systems. Topics will include linear and
nonlinear dimensionality reduction (PCA, LLE, Isomap), unsupervised learning
(central clustering, subspace clustering, Generalized PCA), and estimation
and identification of dynamical systems (Kalman filtering, subspace
identification, hybrid system identification). We will apply these tools to
model data from computer vision, biomedical imaging, neuroscience, and
computational biology.
Course Syllabus
Introduction (Chapter 1)
Linear and Nonlinear Dimensionality Reduction (Chapter 2)
- 09/12: Principal Component Analysis (PCA)
- 09/14: Model Selection and Robust PCA
- 09/19: Nonlinear and Kernel PCA
- 09/21: Locally Linear Embedding (LLE)
Iterative Methods for Unsupervised Learning (Chapter 3)
- 09/26-28: Central Clustering: K-means, Expectation Maximization (EM)
- 10/03-05: Subspace Clustering: K-subspaces, EM for Mixtures of PCAs
Algebraic Methods for Unsupervised Learning (Chapter 4)
- 10/10-12: Line, plane, and hyperplane clustering
- 10/17-19: Subspace Clustering: Generalized Principal Component Analysis (GPCA)
Applications in Computer Vision
- 10/24-26: 3-D Motion Segmentation (Chapter 8)
- 10/31-11/02: Spatial and Temporal Video Segmentation (Chapter 9)
- 11/07: Midterm
Estimation and Segmentation of Hybrid Dynamical Models (Chapters 10-11)
- 11/09: Linear systems: input/output (ARX) and state space (ARMA) representation
- 11/14-16: State estimation: observability, observer design, Kalman filter
- 11/21-28: Identification: linear parameter identification, subspace identification, recursive identification
- 11/30: Identification of hybrid systems
- 12/05-07: Presentation of projects
References
- R. Vidal, Y. Ma, and S. Sastry. Generalized Principal Component Analysis. Springer Verlag, 2007. (In preparation)
Grading Policy
- Homework (30%): Homework problems will include both analytical exercises
and programming assignments in MATLAB.
- Midterm (30%): There will be one midterm on November 7th.
- Project (40%): There will be a final project in which each student will
either apply techniques from the course to solve a real problem or solve an
open research problem. Each student will submit a 1-page project description
by October 19th (5%), a 3-page progress report by November 16th (5%), and a
6-page final report by December 7th (15%), and will give a 20-minute
presentation on December 5th or 7th (15%).
Honor system
Homeworks, midterms and projects will be individual. The
strength of the university depends on academic and personal integrity. In
this course, you must be honest and truthful. Ethical violations include
cheating on exams, plagiarism, reuse of assignments, improper use of the
Internet and electronic devices, unauthorized collaboration, alteration of
graded assignments, forgery and falsification, lying, facilitating academic
dishonesty, and unfair competition. All these will be severely penalized.
Announcements
- The first class will meet on Thursday 09/07. We can discuss a change of
time in the first class.
Handout
Homeworks: Please submit the code of your homework at Submit HW
- Homework 2: Principal Component Analysis (PCA). Due Thursday September 28th, 2006, beginning of class.
Dataset (images and MATLAB functions).
Errata corrige
- The prototype of the function vector2image is [img]=vector2image(vectorimg,sz) instead of [vectorimg,sz]=vector2image(img).
- Improved the clarity for the description of the function reconstruct.
- In the second part (Experiments), question (f), "Set B" must be substituted with "Set A, Validation Set".
Thank you for reporting these errors.
- Homework 3: Nonlinear Dimensionality Reduction (KPCA, LLE and Isomap). Due Thursday October 5th, 2006, beginning of class.
Dataset (images and MATLAB functions).
Clarifications
- The code given for MATLAB function handles was tested in
MATLAB 7.1. If your version is 6.5 or older, check the help, as you may
have to use "feval": y=feval(kernel,x,var).
- When discussing classification rates, please give the
percentage of correct classification in addition to any plots you may
have.
- Homework 4: Central and Subspace Clustering (K-means, EM and K-subspaces). Due Thursday October 12th, 2006, beginning of class.
Images for Intensity Segmentation
Images for Texture Segmentation
Face Dataset
Note: Resize the images to 30 x 40 to run K-subspaces. Please run
K-subspaces on the dataset (after resizing it) both without dimensionality
reduction and after reducing the dimension using PCA.
Errata
1. Question 2 modified.
2. Mean added to the return values of the K-subspaces algorithm.
- Homework 5: EM, MPPCA, Polysegment. Due Thursday October 19th, 2006, beginning of class.
- Homework 6: Generalized PCA. Due Thursday October 26th, 2006, beginning of class.
Test dataset for problem 5
- Homework 7: Applications of GPCA. Due Thursday November 2nd, 2006, beginning of class.
Kanatani1 Kanatani2 Kanatani3 three-cars canbook FaceData
Errata
1. The mat files were updated to include the true classification. Please download the new version.
- Homework 8: Linear and Hybrid Systems. Due Thursday November 30th, 2006, beginning of class.
dytex.m
synth.m
ocean
Code to read avi
ocean-steam
Midterms: