Implementation of Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Another branch, SAC_V1, has been added for Soft Actor-Critic Algorithms and Applications.
Soft Q-Learning uses the following objective function instead of the conventional expected cumulative return:
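Written out in the notation of the Haarnoja et al. papers referenced below (where α is the entropy temperature, ρ_π the state-action distribution induced by the policy, and H the entropy), this maximum-entropy objective is:

```latex
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(\mathbf{s}_t, \mathbf{a}_t) \sim \rho_\pi}
\Bigl[ r(\mathbf{s}_t, \mathbf{a}_t) + \alpha \, \mathcal{H}\bigl(\pi(\cdot \mid \mathbf{s}_t)\bigr) \Bigr]
```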
The entropy term is also maximized, which has two major benefits (a minimal sketch of how this term enters the actor update follows the list):
- Exploration is tuned intelligently and used only as much as needed, so the exploration/exploitation trade-off is handled well.
- It prevents the learning procedure from getting stuck in a local optimum, which would otherwise lead to a suboptimal policy.
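As a concrete illustration (not this repository's exact code), the entropy term appears in the SAC actor update as an α-weighted log-probability penalty. The tensors below are random placeholders standing in for quantities that would come from the policy and Q-networks:

```python
import torch

# Placeholders: log pi(a|s) for actions sampled from the current policy,
# and the soft Q-network's estimate Q(s, a) for those actions.
log_prob = torch.randn(64, 1, requires_grad=True)
q_value = torch.randn(64, 1)
alpha = 0.2  # entropy temperature (fixed here; it can also be learned)

# Maximizing E[Q(s, a) + alpha * H(pi(.|s))] is equivalent to minimizing
# E[alpha * log pi(a|s) - Q(s, a)], which is the SAC policy loss.
policy_loss = (alpha * log_prob - q_value).mean()
policy_loss.backward()
```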
Demos and result plots for Humanoid-v2, Walker2d-v2, and Hopper-v2.
- gym == 0.17.2
- mujoco-py == 2.0.2.13
- numpy == 1.19.1
- psutil == 5.4.2
- torch == 1.4.0
pip3 install -r requirements.txt
python3 main.py
- You may use the `Train` flag to specify whether to train the agent (when it is `True`) or test it (when it is `False`).
- There are some pre-trained weights in the pre-trained models dir; to test the agent with them, put them in the root folder of the project and set the `Train` flag to `False`.
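For reference, loading such a checkpoint in PyTorch amounts to the minimal sketch below; the actual file name and the network it belongs to are defined by this repository's code, and `pretrained_weights.pth` here is only a placeholder:

```python
import torch

# Placeholder path: the real checkpoint in the pre-trained models dir
# has a different name.
checkpoint_path = "pretrained_weights.pth"

# Load the parameters on CPU so the example also works without a GPU.
state_dict = torch.load(checkpoint_path, map_location="cpu")

# The state dict would then be restored into the corresponding network, e.g.:
# policy_network.load_state_dict(state_dict)
print(f"Loaded {len(state_dict)} parameter tensors")
```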
- Humanoid-v2
- Hopper-v2
- Walker2d-v2
- HalfCheetah-v2
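As a quick sanity check that these MuJoCo environments are available (this assumes mujoco-py and a working MuJoCo installation are already set up), any of them can be instantiated through gym:

```python
import gym

# Instantiate one of the supported environments and inspect its spaces.
env = gym.make("Humanoid-v2")
obs = env.reset()
print("observation shape:", env.observation_space.shape)
print("action shape:", env.action_space.shape)
env.close()
```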
- Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, Haarnoja et al., 2018
- Soft Actor-Critic Algorithms and Applications, Haarnoja et al., 2018
All credit goes to @pranz24 for his brilliant PyTorch implementation of SAC.
Special thanks to @p-christ for SAC.py