Neural Execution Engines
Speaker:
Milad Hashemi, Google
Date and Time:
Wednesday, November 13, 2019 - 2:30pm to 3:00pm
Location:
Fields Institute, Stewart Library
Abstract:
Computer software and hardware systems provide a rich source of problems and data for machine learning research. In this talk, I give an overview of the potential of this area and focus on how fundamental computer science algorithms (such as sorting and graph processing) can be used to study strong generalization in neural networks. We find that, given appropriate supervision and structure, fairly standard transformers can implement these algorithms with near-perfect accuracy, even though the algorithms are difficult to learn end-to-end. Moreover, the networks retain this accuracy while generalizing to unseen data and to long sequences outside the training distribution.