- DQN
- Dueling Double DQN
- Categorical DQN (C51)
- Categorical Dueling Double DQN
- Proximal Policy Optimization (PPO)
  - discrete (episodic, n-step)
- Group Relative Policy Optimization (GRPO)
- Random Network Distillation (RND)
The results below are from runs that met the environment-defined "solving" criteria.
- Dueling Double DQN
  - Only one hyperparameter, "UP_COEF", was adjusted.
- Proximal Policy Optimization (PPO)
  - continuous