How could machines learn as efficiently as humans and animals? How could machines learn how the world works and acquire common sense? …

Large language models are capable of an incredible array of tasks. Language models are pre-trained on large amounts of text data from …

Scale appears to be the winning recipe in today’s leaderboards. And yet, extreme-scale neural models are (un)surprisingly brittle …

In this talk, I’ll describe some recent work outlining how distribution shifts are fundamental to working with human-centric …

Naturalistic experimental paradigms in cognitive neuroscience arose from a pressure to test, in real-world contexts, the validity of …

The new wave of AI systems, ChatGPT and its more powerful successors, exhibit extraordinary capabilities across a broad swath of …

Researchers have proposed many methods to make neural networks more reliable under distribution shift, yet there is still considerable room …

When machine learning models are deployed into the world, they inevitably encounter scenarios that differ from their training data, …

How can we trace surprising behaviors of machine learning models back to their training data? Influence functions aim to predict how …

Deep learning algorithms are responsible for a technological revolution in a variety of tasks, yet understanding why they work remains …

The brain is a very noisy place: when a spike arrives at a pre-synaptic terminal, about half the time neurotransmitter fails to …

The emergence of machines that seem to offer the same or better capabilities than humans has raised interest in many sectors that are eager …

Network data is ubiquitous, and many examples can be found in domains ranging from biology to social sciences. Learning from graph data …

This talk gives an overview of distribution-free predictive inference, using conformal prediction. Conformal prediction essentially …

Language models can be dramatically improved by reward models, which predict the quality of a sample. Two approaches for combining …

Universality is a fascinating high-dimensional phenomenon. It points to the existence of universal laws that govern the macroscopic …

Geometric Deep Learning is an attempt at geometric unification of a broad class of ML problems from the perspectives of symmetry and …

Once described as alchemy, a quantitative science of machine learning is emerging. This talk will seek to unify the scientific …

Visual intelligence is a cornerstone of intelligence. From passive perception to embodied interaction with the world, vision plays a …

Video games have become an attractive testbed for evaluating AI systems, by capturing some aspects of real-world complexity (rich …

Multiagent reinforcement learning has received growing interest, with various problem settings and applications. We will first present …

How and why are we succeeding in training huge non-convex deep networks? How can deep neural networks with billions of parameters …

Inverse problems in image processing and computer vision are often solved using prior probability densities, such as spectral or …

What is the relationship between task geometry, network architecture, and emergent feature learning dynamics in nonlinear deep …

I will describe a personal perspective on a few key problems in learning theory at the moment. Several different architectures that …

I will present three learning algorithms fusing scientific computing and AI for the prediction and control of complex physical systems. The …

Neural networks have been shown to significantly outperform kernel methods (including neural tangent kernels) in problems such as image …

A central goal in neuroscience is to understand how orchestrated computations in the brain arise from the properties of single neurons …

The successes of deep learning critically rely on the ability of neural networks to output meaningful predictions on unseen data …

Deep learning has significantly changed the fields of speech recognition, computer vision and natural language processing, to name a …

Large neural networks perform extremely well in practice, providing the backbone of modern machine learning. The goal of this talk is …

This talk will introduce two new tools for summarizing a probability distribution more effectively than independent sampling or …

I’ll discuss empirical work on neural scaling laws, emphasizing their apparent precision, universality, and ubiquity. Along the …

We study the geometry of deep learning through the lens of approximation theory via spline functions and operators. Our key result is …

Graph Neural Networks (GNNs) have become a popular tool for learning representations of graph-structured inputs, with applications in …

An exciting area of intellectual activity in this century may well revolve around a synthesis of machine learning, theoretical physics, …

Standard machine learning produces models that are accurate on average but degrade dramatically when the test distribution of …

As datasets continue to grow in size, in many settings the focus of data collection has shifted away from testing pre-specified …

Deep learning seeks to discover universal models that work across all modalities and tasks. While self-attention has enhanced the …

Many supervised learning methods are naturally cast as optimization problems. For prediction models which are linear in their …

Gradient descent algorithms and their noisy variants, such as the Langevin dynamics or multi-pass SGD, are at the center of attention …

From music recommendations to high-stakes medical treatment selection, complex decision-making tasks are increasingly automated as …

Quantifying uncertainty in deep learning is a challenging and as yet unsolved problem. Predictive uncertainty estimates are important to …

A major challenge in the theory of deep learning is to understand the computational complexity of learning basic families of neural …

GPT-3 has shown that large generative models are unexpectedly powerful and capable. In this talk, I will review some of these …

Why do large learning rates often produce better results? Why do “infinitely wide” networks trained using kernel methods …

“What is learnable?” is a fundamental question in learning theory. The talk will address this question for deep learning, …

Convolution is one of the most essential components of architectures used in computer vision. As machine learning moves towards …

One desired capability for machines is the ability to transfer their understanding of one domain to another domain where data is …

The fundamental breakthroughs in machine learning, and the rapid advancements of the underlying deep neural network models have enabled …

Understanding deep learning calls for addressing the questions of: (i) optimization — the effectiveness of simple gradient-based …

Modern deep generative models like GANs, VAEs and invertible flows are showing amazing results on modeling high-dimensional …

As neural networks become wider their accuracy improves, and their behavior becomes easier to analyze theoretically. I will give an …

Autonomous systems require efficient learning mechanisms that are fully integrated with the control loop. We need robust learning …

A common view of deep learning is that deep networks provide a hierarchical means of processing input data, where early layers extract …

When predictions support decisions they may influence the outcome they aim to predict. We call such predictions performative; the …

Machine Learning is invaluable for extracting insights from large volumes of data. A key assumption enabling many methods, …

How should we go about creating a science of deep learning? One might be tempted to focus on replicability, reproducibility, and …

The existence of adversarial examples, in which tiny changes in the input can fool well-trained neural networks, has many applications …

We examine gradient descent on unregularized logistic regression problems, with homogeneous linear predictors on linearly separable …

This talk will survey the role played by margins in optimization, generalization, and representation of neural networks. A specific …

Deep Learning has had phenomenal empirical successes in many domains including computer vision, natural language processing, and speech …

Classical theory that guides the design of nonparametric prediction methods like deep neural networks involves a tradeoff between the …

Much recent theoretical work has concentrated on “solving deep learning”. Yet, deep learning is not a thing in itself and …

Inductive biases from specific training algorithms like stochastic gradient descent play a crucial role in learning overparameterized …

Machine learning has made tremendous progress over the last decade. It’s thus tempting to believe that ML techniques are a …

Algorithms in deep learning have a regularization effect: different optimizers with different hyper-parameters, on the same training …