On Stochastic Optimization with Conditional Expectation
We study a class of compositional stochastic optimization problems involving conditional expectations, which arise in a wide spectrum of applications, particularly Markov decision processes and robust and invariant learning. Classical approaches such as stochastic approximation (SA) and sample average approximation (SAA) fail in this setting because they require repeated sampling from the conditional distribution, which is often infeasible. We propose a novel saddle-point reformulation and a stochastic-approximation-based algorithm that requires only one sample at a time from the conditional distribution and significantly reduces the sample complexity. The method combines Fenchel duality, the interchangeability principle, and kernel embedding techniques, and achieves state-of-the-art empirical performance in several applications.
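As a concrete illustration (a sketch under assumptions, since the abstract does not fix notation; the loss $\ell$, function class $\mathcal{F}$, and dual class $\mathcal{U}$ below are illustrative), a representative instance of this problem class is
\[
\min_{f \in \mathcal{F}} \; \mathbb{E}_{x}\Big[\ell\big(\mathbb{E}_{y \mid x}[f(x, y)]\big)\Big],
\]
where the conditional expectation sits inside the loss, so an unbiased gradient estimate would require many samples of $y$ for each fixed $x$. When $\ell$ is convex, Fenchel duality gives $\ell(v) = \max_{u} \{\, u v - \ell^{*}(u) \,\}$, and the interchangeability principle allows the pointwise maximization to be pulled outside the expectation as a maximization over a dual function $u : \mathcal{X} \to \mathbb{R}$ (representable via kernel embeddings in an RKHS), yielding the saddle-point reformulation
\[
\min_{f \in \mathcal{F}} \; \max_{u \in \mathcal{U}} \; \mathbb{E}_{x, y}\big[u(x)\, f(x, y)\big] \;-\; \mathbb{E}_{x}\big[\ell^{*}\big(u(x)\big)\big].
\]
Because this objective is a plain expectation over joint samples $(x, y)$, a single sample per iteration suffices to form unbiased stochastic gradients for both $f$ and $u$, which is what enables the one-sample-at-a-time algorithm described above.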