Depen Morwani defends his PhD
Congratulations to Depen Morwani, one of the first PhD students from the group, on successfully defending his thesis. Onward.
The Harvard Machine Learning Foundations Group studies the theoretical and empirical foundations of machine learning. The group is led by Boaz Barak, Sham Kakade, and David Alvarez-Melis, alongside affiliated faculty across computer science, applied mathematics, and statistics.
Group members have produced foundational results on generalization in deep learning. These include double descent, the deep bootstrap, and hidden progress in SGD. They have also developed widely used optimization methods for large-scale training, such as Shampoo, SOAP, and schedule-free optimizers. Current work extends into the theory of reinforcement learning and post-training, sampling and diffusion-based generative models, and the algorithmic study of large language models.
We are affiliated with the Kempner Institute at Harvard.
Principal investigators
Boaz Barak · Sham Kakade · David Alvarez-Melis
Affiliated faculty
Lucas Janson · Kianté Brantley · Yilun Du · Cengiz Pehlevan · Sitan Chen
Each links to recent publications on that topic.
Why over-parameterized models generalize, the role of inductive biases, sample complexity, and the limits of statistical learning.
Second-order methods, learning-rate schedules, weight decay, and the geometry of training in large models.
Treating large models as empirical objects — scaling laws, capability emergence, and the dynamics of pretraining.
Diffusion, flow matching, masked diffusion, and the algorithmic foundations of generation.
What transformers can compute, length generalization, the structure of attention, and the dynamics of in-context learning.
Distribution shift, conformal inference, interpretability, alignment, and the social context of machine learning.
A Simplified Analysis of SGD for Linear Regression with Weight Averaging
Alexandru Meterez, Depen Morwani, Costin-Andrei Oncescu, Jingfeng Wu, Cengiz Pehlevan, Sham Kakade · arXiv 2025
Any-Order Flexible Length Masked Diffusion
Jaeyeon Kim, C. Lee, Carles Domingo-Enrich, Yilun Du, Sham Kakade, Timothy Ngotiaoco, Sitan Chen, M. Albergo · arXiv 2025
Cognitive models can reveal interpretable value trade-offs in language models
Sonia K. Murthy, Rosie Zhao, Jennifer Hu, Sham Kakade, Markus Wulfmeier, Peng Qian, Tomer D. Ullman · arXiv 2025
Random Scaling of Emergent Capabilities
Rosie Zhao, Tian Qin, David Alvarez-Melis, Sham Kakade, Naomi Saphra · arXiv 2025
SOAP: Improving and Stabilizing Shampoo using Adam
Nikhil Vyas, Depen Morwani, Rosie Zhao, Itai Shapira, David Brandfonbrener, Lucas Janson, Sham Kakade · arXiv 2024
Universal Length Generalization with Turing Programs
Kaiying Hou, David Brandfonbrener, Sham Kakade, Samy Jelassi, Eran Malach · arXiv 2024
We are recruiting at every level: postdocs, graduate students, and undergraduate researchers. Several fellowships are open in parallel each year, and applying to more than one is encouraged. See the join page for the full list, or follow @boazbaraktcs for openings.