Momentum Stiefel Optimizer, with Applications to Orthogonal Attention, and Optimal Transport
This talk will report a construction of momentum-accelerated gradient descent algorithms on Riemannian manifolds, focusing on a particular case known as the Stiefel manifold. The treatment will be based on, first, the design of continuous-time optimization dynamics on the manifold, and then a careful time-discretization that preserves all geometric structures. Since the Stiefel manifold corresponds to matrices satisfying an orthogonality constraint, two practical applications will also be described: (1) we markedly improved the performance of a Vision Transformer trained from scratch by appropriately placing orthogonality into its self-attention mechanism, and (2) our optimizer also makes the useful notion of Projection Robust Wasserstein Distance for high-dimensional optimal transport even more effective.
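To make the general idea concrete, a generic momentum step on the Stiefel manifold can be sketched as follows: project the Euclidean gradient onto the tangent space, take a heavy-ball momentum step, and retract back onto the manifold. This is only an illustrative recipe (using a QR retraction and a simple projection-based momentum transport), not the specific structure-preserving discretization developed in the talk; all function names here are hypothetical.

```python
import numpy as np

def qr_retract(Y):
    """Map a matrix back onto the Stiefel manifold via QR decomposition."""
    Q, R = np.linalg.qr(Y)
    # Fix column signs so the retraction is deterministic.
    return Q * np.sign(np.sign(np.diag(R)) + 0.5)

def stiefel_momentum_step(X, M, grad, lr=0.05, beta=0.9):
    """One heavy-ball step for X on the Stiefel manifold (illustrative only).

    X    : current point, an n-by-p matrix with X^T X = I
    M    : momentum buffer (tangent vector at X)
    grad : Euclidean gradient of the objective at X
    """
    # Project the Euclidean gradient onto the tangent space at X:
    # riem_grad = grad - X * sym(X^T grad)
    XtG = X.T @ grad
    riem_grad = grad - X @ (XtG + XtG.T) / 2
    # Heavy-ball momentum update in the tangent space.
    M = beta * M - lr * riem_grad
    # Retract the updated point back onto the manifold.
    X_new = qr_retract(X + M)
    # Transport momentum to the new point by tangent-space projection.
    XtM = X_new.T @ M
    M = M - X_new @ (XtM + XtM.T) / 2
    return X_new, M
```

For example, running this step on the objective -trace(X^T A X) for a symmetric A drives X toward the leading eigenvectors of A, while every iterate remains exactly orthonormal by construction of the QR retraction.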