Equilibrium approaches to deep learning: One (implicit) layer is all you need
Does deep learning actually need to be deep? In this talk, I will present some of our recent and ongoing work on Deep Equilibrium (DEQ) Models, an approach that demonstrates we can achieve most of the benefits of modern deep learning systems using very shallow models, but ones that are defined implicitly via finding a fixed point of a nonlinear dynamical system. I will show that these methods can achieve results on par with the state of the art in domains spanning large-scale language modeling, image classification, and semantic segmentation, while requiring less memory and substantially simplifying architectures. I will also highlight some recent work analyzing the theoretical properties of these systems, where we show that certain classes of DEQ models are guaranteed to have a unique fixed point, easily controlled Lipschitz constants, and efficient algorithms for finding the equilibria. I will conclude by discussing ongoing work and future directions for these classes of models.
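The core idea of the abstract can be sketched in a few lines: rather than stacking many distinct layers, a single weight-tied transformation f is iterated until it reaches an equilibrium z* = f(z*, x), and that equilibrium serves as the "output" of an infinitely deep network. The sketch below is illustrative only, assuming a simple tanh cell and naive forward iteration (the actual DEQ work uses more sophisticated root-finding and implicit differentiation); all names here are hypothetical, not from any DEQ library.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(8, 8))  # small weights keep f a contraction
U = rng.normal(scale=0.5, size=(8, 8))

def f(z, x):
    """One application of the weight-tied layer."""
    return np.tanh(W @ z + U @ x)

def fixed_point(x, tol=1e-6, max_iter=100):
    """Naive forward iteration z <- f(z, x) until convergence.

    The equilibrium z* plays the role of the output of an
    'infinitely deep' network that repeats the same layer.
    """
    z = np.zeros_like(x)
    for _ in range(max_iter):
        z_next = f(z, x)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

x = rng.normal(size=8)
z_star = fixed_point(x)
# At equilibrium, z* satisfies z* = f(z*, x) to within tolerance.
residual = np.linalg.norm(z_star - f(z_star, x))
```

Because the layer is only ever applied to the current iterate, the forward pass needs memory for a single layer's activations regardless of how many iterations are run, which is the source of the memory savings mentioned above.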
Zico Kolter is an Associate Professor in the Computer Science Department at Carnegie Mellon University, and also serves as chief scientist of AI research for the Bosch Center for Artificial Intelligence. His work spans the intersection of machine learning and optimization, with a large focus on developing more robust and rigorous methods in deep learning. In addition, he has worked in a number of application areas, highlighted by work on sustainability and smart energy systems. He is a recipient of the DARPA Young Faculty Award, a Sloan Fellowship, and best paper awards at NeurIPS, ICML (honorable mention), IJCAI, KDD, and PESGM.