Mengdi Wang: Primal-Dual Pi Learning Using State and Action Features (Sep 5, 2018 11:55 AM)

We survey recent advances in the complexity of, and methods for, solving Markov decision problems (MDPs) with finitely many states and actions, a basic mathematical model for reinforcement learning (RL).

For model reduction of large-scale MDPs in reinforcement learning, we propose a bilinear primal-dual pi learning method that utilizes given state and action features. The method is motivated by a saddle-point formulation of the Bellman equation. The sample complexity of bilinear pi learning depends only on the number of parameters and is invariant with respect to the dimension of the problem.
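
For concreteness, here is a sketch of the saddle-point formulation that the method builds on, in our own notation (the abstract does not fix one): gamma is the discount factor, rho an initial state distribution, r and P the reward and transition model, and Phi and Psi the given state and state-action feature matrices. The Lagrangian of the linear-programming form of the Bellman equation is

    L(v, \mu) = (1 - \gamma)\, \rho^\top v
              + \sum_{s,a} \mu(s,a) \big[ r(s,a) + \gamma \sum_{s'} P(s'|s,a)\, v(s') - v(s) \big]

whose saddle point is attained at the optimal value function v* and the discounted state-action occupancy measure mu*. Substituting the bilinear parameterization v = \Phi w and \mu = \Psi \theta yields a saddle-point problem over (w, \theta) alone, which is why the number of unknowns, and hence the sample complexity, scales with the number of feature weights rather than with the numbers of states and actions.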

In the second part, we study statistical state compression of general Markov processes. We propose a spectral state compression method for learning state features from data. The method is able to "sketch" a black-box Markov process from its empirical data and output state features, for which we provide both minimax statistical guarantees and scalable computational tools.
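
As a rough illustration of the spectral idea, the following Python sketch estimates an empirical transition matrix from a single observed trajectory and takes its truncated SVD, using the leading singular factors as low-dimensional state features. This is our own minimal rendering under those assumptions; the function name, interface, and rank choice are illustrative, not the talk's actual algorithm:

    import numpy as np

    def spectral_state_compression(trajectory, num_states, rank):
        """Sketch a black-box Markov chain via a low-rank
        (truncated-SVD) approximation of its empirical transition
        matrix. Illustrative only; the interface is hypothetical."""
        # Count empirical one-step transitions along the trajectory.
        counts = np.zeros((num_states, num_states))
        for s, s_next in zip(trajectory[:-1], trajectory[1:]):
            counts[s, s_next] += 1
        # Row-normalize to an empirical transition matrix; unvisited
        # states get a uniform row so the matrix stays stochastic.
        row_sums = counts.sum(axis=1, keepdims=True)
        P_hat = np.where(row_sums > 0,
                         counts / np.maximum(row_sums, 1),
                         1.0 / num_states)
        # The leading singular factors act as state features.
        U, sigma, Vt = np.linalg.svd(P_hat, full_matrices=False)
        return U[:, :rank] * sigma[:rank], Vt[:rank, :].T

    # Example: compress a random 50-state chain observed for 10,000 steps.
    rng = np.random.default_rng(0)
    n = 50
    P_true = rng.dirichlet(np.ones(n), size=n)
    traj = [0]
    for _ in range(10_000):
        traj.append(rng.choice(n, p=P_true[traj[-1]]))
    features, _ = spectral_state_compression(traj, n, rank=5)
    print(features.shape)  # (50, 5)

The plain SVD here is only the computational core; the minimax guarantees mentioned in the abstract concern the statistical estimator built around it.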
