Peer review Lab10 #3

Niiikkkk · 2023-12-29T13:16:01Z

Hi, the code looks correct, but I don't understand how the reward is computed. Also, the RL agent never explores when choosing a move. You could add a 20% chance of making a random move instead of always playing the optimal one.
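To make the exploration suggestion concrete, here is a minimal epsilon-greedy sketch. It is not based on the reviewed code: the name `choose_move`, the dict `q_values` mapping moves to learned values, and the default value of 0.0 for unseen moves are all assumptions made for illustration, so they would need adapting to however the agent actually stores its estimates.

```python
import random


def choose_move(q_values, legal_moves, epsilon=0.2, rng=random):
    """Pick a move epsilon-greedily (hypothetical helper, not from the lab code).

    q_values: dict mapping a move to its learned value (assumed layout).
    legal_moves: non-empty list of moves available in the current state.
    epsilon: probability of exploring, 0.2 matches the 20% suggested above.
    rng: source of randomness; pass random.Random(seed) for reproducible runs.
    """
    # Explore: with probability epsilon, ignore the learned values entirely.
    if rng.random() < epsilon:
        return rng.choice(legal_moves)
    # Exploit: otherwise play the move with the highest learned value.
    # Moves that were never seen are treated as having value 0.0 (assumption).
    return max(legal_moves, key=lambda move: q_values.get(move, 0.0))
```

Exploration like this is usually only wanted during training. Calling the function with `epsilon=0.0` at evaluation time makes the agent always play its best known move.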
