Code for Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms. Paper accepted at NeurIPS 2023!
Authors: Shenao Zhang, Boyi Liu, Zhaoran Wang*, Tuo Zhao* (* indicates equal advising)
To set up the code, run:
git clone https://github.com/agentification/RP_PGM.git
cd RP_PGM
python setup.py develop
After setup, run the following example to train RP-DP-SN in the Ant environment:
python train.py env=mbpo_ant device=cuda:0 seed=0
To train in other environments, set the env argument to the name of one of the configurations in ./config/env. Our code is adapted from the repository of the SVG-SAC algorithm.
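To see which values the env argument can take, a small helper along the following lines may be useful. It is only a sketch and not part of the repository: it assumes each environment is a single .yaml file in ./config/env whose file name (without the extension) is the value passed to env, and the function names list_env_names and build_train_command are hypothetical. It prints one training command per configuration it finds, using the same key=value arguments as the example above.

```python
from pathlib import Path


def list_env_names(config_dir="./config/env"):
    """Return the config names (file stems) found in config_dir, sorted.

    Assumption: one .yaml file per environment, named after the env value.
    """
    directory = Path(config_dir)
    if not directory.is_dir():
        # Nothing to list if the script is not run from the repository root.
        return []
    return sorted(p.stem for p in directory.glob("*.yaml"))


def build_train_command(env, seed=0, device="cuda:0"):
    """Build the train.py invocation as a list of arguments."""
    return ["python", "train.py", f"env={env}", f"device={device}", f"seed={seed}"]


if __name__ == "__main__":
    # Print one command per available environment config.
    for name in list_env_names():
        print(" ".join(build_train_command(name)))
```

Run from the repository root, this prints commands that can be copied into a shell. Change the seed or device arguments of build_train_command to match your setup.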
