Surbhi Goel - Computational Complexity of Learning Neural Networks over Gaussian Marginals


A major challenge in the theory of deep learning is to understand the computational complexity of learning basic families of neural networks (NNs). The challenge here arises from the non-convexity of the associated optimization problem. It is well known that the learning problem is computationally intractable in the worst case. Prior works have circumvented this hardness by making assumptions on the distribution as well as the label noise. In this talk, we focus on the problem of learning shallow NNs under the benign gaussian input distribution. We first show a super-polynomial Statistical Query (SQ) lower bound in the noiseless setting. We further show how to use this result to obtain a super-polynomial SQ lower bound for learning a single neuron in the agnostic noise model. Lastly, on the positive side, we describe a gradient-based algorithm for approximately learning ReLUs which runs in polynomial time. This talk is based on multiple works with Ilias Diakonikolas, Aravind Gollakota, Zhihan Jin, Sushrut Karmalkar, Adam Klivans and Mahdi Soltanolkotabi.