Research

Our research questions don't fit neatly into a single subfield: they draw on statistics, optimization, theory of computation, and large-scale empirical study. Below are some of the themes our work clusters around. Each theme links to recent publications on that topic.

Theory of learning and generalization

Why over-parameterized models generalize, the role of inductive biases, sample complexity, and the limits of statistical learning.

Optimization for deep learning

Second-order methods, learning-rate schedules, weight decay, and the geometry of training in large models.

Scaling, emergence, and the empirical study of deep learning

Treating large models as empirical objects: scaling laws, capability emergence, and the dynamics of pretraining.

Sampling and generative modeling

Diffusion, flow matching, masked diffusion, and the algorithmic foundations of generation.

Architectures, attention, and in-context learning

What transformers can compute, length generalization, the structure of attention, and the dynamics of in-context learning.

Robust and trustworthy machine learning

Distribution shift, conformal inference, interpretability, alignment, and the social context of machine learning.