This is an implementation of Flow Matching Policy Gradients [https://arxiv.org/abs/2507.21053] written in the style of RSL-RL. You can use it the same way you would use the RSL-RL PPO implementation.
This implementation was originally written for the Toddlerbot project [https://toddlerbot.github.io/].
The implementation is based on rsl-rl v2.3.3.
Author: Yao He