Mathematics of Deep Learning
Time: F 1:00-3:00 pm (10-04-19 to 12-06-19)

Place: Shaffer 300

Instructor: René Vidal (OH: F 3:00-4:00 pm, Clark 302B)

TA: Connor Lane (OH: Tu 4:00-5:00 pm, Clark 311A or B)

Course Description
The past few years have seen a dramatic increase in the performance of recognition systems thanks to the introduction of deep networks for representation learning. However, the mathematical reasons for this success remain elusive. For example, a key issue is that the training problem is nonconvex, hence optimization algorithms are not guaranteed to return a global minimum. Another key issue is that, although the number of parameters in deep networks is very large relative to the number of training examples, deep networks appear to generalize very well to unseen examples and new tasks. This course will overview recent work on the theory of deep learning that aims to understand the interplay between architecture design, regularization, generalization, and optimality properties of deep networks.
Class Schedule
• 10/04: Introduction.
• 10/11: Optimization landscape for shallow networks.
• 10/18: No class (Fall break).
• 10/22: (Optional) MINDS/CIS Seminar: Dr. Jason D. Lee (12:00 PM, Hodson 310)
• 10/25: Optimization landscape for deep networks.
• 11/01: Analysis of SGD and entropy SGD (Guest lecture by Dr. Chaudhari).
• 11/08: Analysis of inductive bias of dropout.
• Reading: (Cavazza et al., 2018), (Mianjy et al., 2018).
• Homework 2 released, due Thursday 11/21 11:59 PM ET.
• For problem 1, let $$\mathcal{W} = \{(\mathbf{U}^*, \mathbf{V}^*) \:|\: f(\mathbf{U}^*, \mathbf{V}^*) = \min_{\mathbf{U}, \mathbf{V}} f(\mathbf{U}, \mathbf{V})\}$$. Then we say $$\mathcal{W}$$ is bounded if there exists some $$M > 0$$ such that for all $$(\mathbf{U}^*, \mathbf{V}^*) \in \mathcal{W}$$, $$(\|\mathbf{U}^*\|_F^2 + \|\mathbf{V}^*\|_F^2)^{\frac{1}{2}} \leq M$$.
• Homework 2 solutions here.
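As a quick sanity check on the boundedness definition above, the quantity $$(\|\mathbf{U}\|_F^2 + \|\mathbf{V}\|_F^2)^{1/2}$$ can be computed numerically for a candidate minimizer pair. The matrices and the bound $$M$$ below are made-up illustrations, not part of the homework:

```python
import numpy as np

# Hypothetical minimizer pair (U*, V*); values are illustrative only.
U = np.array([[1.0, 0.0], [0.0, 2.0]])
V = np.array([[3.0], [4.0]])

# Combined Frobenius norm (||U||_F^2 + ||V||_F^2)^(1/2).
combined = np.sqrt(np.linalg.norm(U, 'fro')**2 + np.linalg.norm(V, 'fro')**2)

# W is bounded iff a single M works for *every* minimizer pair in W;
# here we just check one pair against an assumed bound M = 6.
M = 6.0
print(combined <= M)  # sqrt(1 + 4 + 9 + 16) = sqrt(30) ≈ 5.48, so True
```

Note that checking one pair only verifies membership in the ball of radius $$M$$; establishing boundedness of $$\mathcal{W}$$ itself requires an argument covering all minimizers.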
• 11/15: Generalization theory for deep and shallow networks.
• 11/20: (Optional) MINDS Symposium on the Foundations of Data Science (all day in Shriver Hall).
• 11/22: Approximation theory: sparsity (Guest lecture by Dr. Jeremias Sulam).
• 11/29: No class (Thanksgiving).
• 12/06: Approximation theory for deep and shallow networks.
• Homework 3 released, due Tuesday 12/17 11:59 PM ET.
• See here for an example of what to expect for problem 2(e).
• See here for a few hints.