Research
Our research questions don't fit neatly into a single subfield: statistics, optimization, theory of computation, and large-scale empirical study all play a part. Below are some of the themes our work clusters around. Each links to recent publications on that topic.
Theory of learning and generalization
Why over-parameterized models generalize, the role of inductive biases, sample complexity, and the limits of statistical learning.
Optimization for deep learning
Second-order methods, learning-rate schedules, weight decay, and the geometry of training in large models.
Scaling, emergence, and the empirical study of deep learning
Treating large models as empirical objects — scaling laws, capability emergence, and the dynamics of pretraining.
Sampling and generative modeling
Diffusion, flow matching, masked diffusion, and the algorithmic foundations of generation.
Architectures, attention, and in-context learning
What transformers can compute, length generalization, the structure of attention, and the dynamics of in-context learning.
Robust and trustworthy machine learning
Distribution shift, conformal inference, interpretability, alignment, and the social context of machine learning.