This repository contains minimal implementations of the SSPG algorithm from the NeurIPS 2022 paper Policy Gradient With Serial Markov Chain Reasoning. We provide two separate codebases, one for the proprioceptive experiments and one for the pixel-based experiments, located in the sspg_mujoco and sspg_dmc subfolders, respectively. See the README.md file in each subfolder for instructions on installation and on replicating the results.
For any extension, query, or question, feel free to open a pull request or an issue, or to contact Edoardo Cetin at [email protected].
To cite our work, you can use:
@inproceedings{cetin2022serialMCR,
author = {Cetin, Edoardo and Celiktutan, Oya},
booktitle = {Advances in Neural Information Processing Systems},
editor = {S. Koyejo and S. Mohamed and A. Agarwal and D. Belgrave and K. Cho and A. Oh},
pages = {8824--8839},
publisher = {Curran Associates, Inc.},
title = {Policy Gradient With Serial Markov Chain Reasoning},
url = {https://proceedings.neurips.cc/paper_files/paper/2022/file/39fac857b4467e3ef4f358186bb07d81-Paper-Conference.pdf},
volume = {35},
year = {2022}
}