RTG Summer School: Mathematical Foundations of Deep Learning


📁 Curriculum Topics

  • 01 Optimization Theory


    Geometry & saddles, convergence theory, stochastic algorithms, and adaptive methods.

  • 02 Approximation Theory


    Universal approximation, depth efficiency, harmonic perspectives, and KAN theory.

  • 03 Statistical Learning


    Concentration of measure, VC theory, PAC-Bayes, and modern generalization.

  • 04 Random Matrix Theory & NTK


    High-dimensional geometry, spectral laws, free probability, and Neural Tangent Kernels.

  • 05 Information Theory


    Entropy foundations, Information Bottleneck, VAEs, and Information Geometry.

  • 06 Geometry & Topology


    Group theory, equivariance, GNN theory, TDA, and Optimal Transport.

  • 07 Differential Equations


    Neural ODEs, Diffusion SDEs, PINNs, and Symplectic Integrators.

  • 08 Bayesian ML


    Probabilistic foundations, BNNs, MCMC/HMC, and Variational Inference.

  • 09 Kernel Methods


    RKHS foundations, Representer theorem, Mercer theory, and Mean Embeddings.

  • 10 Transformers


    Attention mathematics, meta-optimization, scaling laws, and interpretability.


🎓 Prerequisite Tiers

Tier 1 — Foundational

Appropriate after 1st/2nd year of a math/CS degree.

  • Multivariable calculus, Linear algebra (eigendecomposition, SVD)
  • Basic probability (expectation, variance, CLT)
  • Python + NumPy/PyTorch basics
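As a rough self-check for Tier 1, the sketch below (values and tolerances are illustrative, not part of the curriculum) exercises the assumed NumPy fluency: reconstructing a matrix from its SVD, and a Monte Carlo illustration of the CLT with uniform samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear algebra: the SVD factors A = U diag(s) V^T exactly reconstruct A.
A = rng.standard_normal((5, 3))
U, s, Vt = np.linalg.svd(A, full_matrices=False)
assert np.allclose(U @ np.diag(s) @ Vt, A)

# Probability: means of 100 Uniform(0,1) draws concentrate near 0.5,
# with spread ~ sqrt(1/12)/10 ≈ 0.029, as the CLT predicts.
means = rng.uniform(0, 1, size=(10_000, 100)).mean(axis=1)
print(means.mean(), means.std())
```

If either check feels unfamiliar, the Tier 1 bullet points above are the place to start.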

Tier 2 — Intermediate

Typically 2nd/3rd year. Includes Tier 1 plus:

  • Real analysis (convergence, continuity, \(\varepsilon\)–\(\delta\) arguments)
  • Intro probability theory (σ-algebras, conditional expectation)
  • Convex analysis / convex optimization, Intro statistics
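For concreteness, the \(\varepsilon\)–\(\delta\) criterion referenced above: a function \(f\) is continuous at \(x_0\) if

\[
\forall \varepsilon > 0 \;\exists \delta > 0 : \quad |x - x_0| < \delta \implies |f(x) - f(x_0)| < \varepsilon.
\]

Comfort reading and writing statements at this level of formality is the heart of the Tier 2 analysis prerequisite.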

Tier 3 — Advanced

Typically late 3rd/4th year. Includes Tier 2 plus:

  • Measure-theoretic probability, Functional analysis (Hilbert spaces)
  • Stochastic processes / SDEs, Graduate-level ML/Optimization
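To give a flavor of the SDE prerequisite, here is a minimal Euler–Maruyama sketch (all parameters are hypothetical choices for illustration) simulating the Ornstein–Uhlenbeck process \(dX_t = -\theta X_t\,dt + \sigma\,dW_t\), whose stationary variance \(\sigma^2/(2\theta)\) can be checked empirically.

```python
import numpy as np

# Euler–Maruyama for the OU SDE dX_t = -theta*X_t dt + sigma dW_t.
# Parameters below are illustrative, not from any course material.
rng = np.random.default_rng(1)
theta, sigma, dt = 2.0, 0.5, 1e-3
n_steps, n_paths = 5_000, 2_000          # total horizon T = 5

X = np.ones(n_paths)                      # X_0 = 1 on every path
for _ in range(n_steps):
    dW = rng.standard_normal(n_paths) * np.sqrt(dt)   # Brownian increment
    X += -theta * X * dt + sigma * dW                 # one EM step

# By T = 5 the process is near stationarity: mean ≈ 0,
# variance ≈ sigma^2 / (2*theta) = 0.0625.
print(X.mean(), X.var())
```

Recognizing why the drift pulls the mean to zero while the noise sustains a nonzero stationary variance is exactly the kind of reasoning Tier 3 assumes.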

📚 Core Textbooks

Book                              | Authors                            | Use
--------------------------------- | ---------------------------------- | ------------------------------------------
High-Dimensional Probability      | Vershynin (2018)                   | Concentration, RMT, Johnson–Lindenstrauss
Understanding Machine Learning    | Shalev-Shwartz & Ben-David (2014)  | Statistical learning theory
Convex Optimization               | Boyd & Vandenberghe (2004)         | Optimization
Foundations of Machine Learning   | Mohri et al. (2018)                | Generalization bounds
Mathematics for Machine Learning  | Deisenroth et al. (2020)           | Prerequisite reference