[Question] Can't solve Gymnasium Frozenlake-v1 8x8 with A2C #1670
Comments
Have you tried other algorithms?
I tried DQN without any luck. For A2C, I tried changing the size of the policy and value networks as well as the entropy and value-function coefficients. Someone in this post mentioned that a tabular Q-learning method would be more efficient than DQN or A2C. I'll look into hyperparameter tuning anyway, but if anyone can point me in the right direction, that would be great. Thanks in advance.
By the way, what exactly do you mean by solving? A reward always equal to 1?
Solving the environment means reaching the goal state.
Yes, but always, or at least in some cases?
Simpler doesn't mean worse; tabular Q-learning is tailored for that env.
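To make the tabular Q-learning suggestion concrete, here is a minimal sketch that is not taken from this thread. It assumes an env with discrete states and actions that follows the Gymnasium `reset`/`step` API. The function name `q_learning` and every hyperparameter value are illustrative assumptions, not tuned settings for FrozenLake.

```python
import random


def q_learning(env, n_states, n_actions, episodes=5000, alpha=0.1,
               gamma=0.99, eps_start=1.0, eps_end=0.05, seed=0):
    """Tabular Q-learning sketch for a discrete env with the Gymnasium API.

    All default values are assumptions for illustration, not tuned settings.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for ep in range(episodes):
        # Linearly decay the exploration rate from eps_start to eps_end.
        eps = eps_start + (eps_end - eps_start) * ep / max(1, episodes - 1)
        state, _ = env.reset()
        done = False
        while not done:
            if rng.random() < eps:
                action = rng.randrange(n_actions)
            else:
                # Greedy action, with ties broken at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(n_actions) if q[state][a] == best])
            next_state, reward, terminated, truncated, _ = env.step(action)
            # Bootstrap from the next state unless the episode terminated.
            target = reward + (0.0 if terminated else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            done = terminated or truncated
    return q


# Hypothetical usage (requires gymnasium; not run here):
#   import gymnasium as gym
#   env = gym.make("FrozenLake-v1", map_name="8x8", is_slippery=True)
#   q = q_learning(env, env.observation_space.n, env.action_space.n)
```

The table has one row per state and one column per action, which is why this approach fits a small grid such as the 64-state 8x8 map without any function approximation.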
I'm using the non-deterministic version of the env (is_slippery=True), and it can solve it approximately 60 times out of 100. With regular Q-learning, none. Same with A2C.
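One way to pin down a figure like "60 times out of 100" is to count the episodes that end with a positive reward. The sketch below assumes an env following the Gymnasium `reset`/`step` API and relies on FrozenLake giving a reward of 1 only when the goal is reached. The names `success_rate` and `policy` are hypothetical helpers, not part of any library.

```python
def success_rate(env, policy, n_episodes=100):
    """Fraction of episodes whose final reward is positive.

    `policy` is any callable mapping an observation to an action
    (a hypothetical interface chosen for this sketch).
    """
    successes = 0
    for _ in range(n_episodes):
        obs, _ = env.reset()
        done = False
        reward = 0.0
        while not done:
            obs, reward, terminated, truncated, _ = env.step(policy(obs))
            done = terminated or truncated
        # In FrozenLake only the goal yields reward 1, so a positive
        # final reward marks a successful episode.
        successes += int(reward > 0)
    return successes / n_episodes


# Hypothetical usage (requires gymnasium; not run here):
#   import gymnasium as gym
#   env = gym.make("FrozenLake-v1", map_name="8x8", is_slippery=True)
#   print(success_rate(env, lambda obs: env.action_space.sample()))
```

Because the slippery env is stochastic, the estimate varies between runs; a larger `n_episodes` gives a steadier number.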
With those commands, I managed to get ~60% success.
FrozenLake-v1:
  n_timesteps: !!float 1e6
  policy: 'MlpPolicy'
  n_envs: 8
Thank you for your reply! I'll try it and see whether I can replicate these results. In any case, I think this should be added to the RL Zoo repo.
❓ Question
Hello, I'm trying to solve the FrozenLake-v1 environment with is_slippery=True (non-deterministic) using the Stable-Baselines3 A2C algorithm. I can solve the 4x4 version, but I can't get any results with the 8x8 version. I also checked the RL Zoo for tuned hyperparameters for this environment, but found none. What adjustments can I make to get it working?