Reinforcement Learning with Risk Preferences
The first part of this talk explores various methods for incorporating risk measures into Markov decision processes, with a focus on a framework built from nested compositions of conditional risk mappings. We propose a distributional viewpoint on this framework that accommodates weakly continuous dynamics, latent costs, and randomized actions. Additionally, we introduce a novel distributional reinforcement learning method that approximates optimal strategies in discrete environments.
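To illustrate the flavour of nested conditional risk mappings, the sketch below runs risk-averse value iteration on a finite MDP, replacing the expectation in each Bellman backup with a one-step CVaR of the successor-value distribution. This is only a minimal illustration of the nested-composition idea: the CVaR level `alpha`, the tabular arrays `P` and `c`, and the cost-minimization convention are assumptions for this sketch, and the distributional viewpoint, weakly continuous dynamics, and randomized actions discussed in the talk are not captured here.

```python
import numpy as np

def cvar(values, probs, alpha):
    """CVaR at level alpha of a discrete cost distribution (higher value = worse)."""
    order = np.argsort(values)[::-1]          # worst outcomes first
    v, p = values[order], probs[order]
    cum = np.cumsum(p)
    # probability mass each outcome contributes within the worst alpha-tail
    w = np.clip(np.minimum(cum, alpha) - (cum - p), 0.0, None) / alpha
    return float(np.dot(w, v))

def risk_averse_value_iteration(P, c, gamma=0.95, alpha=0.2, iters=500):
    """Nested-risk value iteration: each backup applies CVaR to the next-state values.

    P : (S, A, S) transition probabilities, c : (S, A) immediate costs.
    """
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = np.empty((S, A))
        for s in range(S):
            for a in range(A):
                # one-step conditional risk mapping applied to the value of the next state
                Q[s, a] = c[s, a] + gamma * cvar(V, P[s, a], alpha)
        V = Q.min(axis=1)                     # costs, so the agent minimizes
    return V, Q.argmin(axis=1)

# Toy usage on a random 4-state, 2-action MDP.
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(4), size=(4, 2))
c = rng.uniform(0.0, 1.0, size=(4, 2))
V, policy = risk_averse_value_iteration(P, c)
print(V, policy)
```

Setting `alpha = 1.0` recovers the ordinary risk-neutral backup, so the nesting level of risk aversion is controlled entirely by the choice of the one-step risk mapping.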
The second part of this talk is based on an ongoing project. We focus on the problem of learning conditional distributions on continuous spaces, with a view towards developing a risk-averse reinforcement learning method for continuous environments. Our method clusters data near varying query points in the input space to form empirical distributions in the output space. I will discuss some convergence results for this approach and demonstrate an implementation using neural networks.
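As a rough illustration of the clustering idea, the sketch below estimates a conditional distribution at a query point by collecting the outputs of the k nearest input samples and treating them as an empirical measure. The choice of k, the Euclidean metric, and the toy data are assumptions made purely for this sketch; the talk's clustering scheme, its convergence analysis, and the neural-network implementation may differ.

```python
import numpy as np

def empirical_conditional(X, Y, x_query, k=100):
    """Empirical distribution of Y given X near x_query (nearest-neighbour variant)."""
    dists = np.linalg.norm(X - x_query, axis=1)   # distance of every input to the query
    idx = np.argsort(dists)[:k]                   # indices of the k closest inputs
    atoms = Y[idx]                                # output samples forming the empirical law
    weights = np.full(len(idx), 1.0 / len(idx))   # uniform weights on the atoms
    return atoms, weights

# Toy usage: heteroscedastic data, conditional law queried at x = 0.5.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(5000, 1))
Y = np.sin(2 * np.pi * X[:, 0]) + X[:, 0] * rng.normal(size=5000)
atoms, weights = empirical_conditional(X, Y, x_query=np.array([0.5]))
print("estimated conditional mean:", np.dot(weights, atoms))
```

The resulting atoms and weights can then be fed to any statistic of interest, for instance a tail risk measure of the output, which is what makes this construction a natural building block for risk-averse reinforcement learning in continuous environments.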