Thanks a lot for the heads up.
I have added the fix in PR #281 mostly as suggested.
I'm not sure it will have much impact on the results, as the noise application process is ... well, noisy.
Problem Description
There are two bugs in cleanrl/cleanrl/td3_continuous_action.py (line 209 in e466f6e):
(1) The same noise is used for all the batch actions.
(2) Action scale is not taken into account for the noise.
Checklist
poetry install (see CleanRL's installation guideline).
Current Behavior
(1) The noise shape is taken from actions[0], so the same noise is applied to every action in the batch.
(2) No scaling is performed on the noise, but the policy could have a different scale (see cleanrl/cleanrl/td3_continuous_action.py, line 114 in e466f6e).
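The first point can be illustrated with a small standalone sketch. It is not the CleanRL code: the sizes, the noise parameters, and the names `target_actions`, `single_action`, and `shared_noise` are hypothetical, and it assumes actions[0] is a 1-D vector of length `action_dim`. Under that assumption, the noise has the shape of one action and is broadcast over the batch, so every row receives the same perturbation.

```python
import torch

torch.manual_seed(0)

# Hypothetical sizes and noise parameters, chosen only for illustration.
batch_size, action_dim = 4, 2
policy_noise, noise_clip = 0.2, 0.5

# Stand-ins: the target actor's output for a batch, and a single action
# (assumed to play the role of actions[0], shape (action_dim,)).
target_actions = torch.zeros(batch_size, action_dim)
single_action = torch.zeros(action_dim)

# Noise shaped like ONE action, as in the current behavior described above.
shared_noise = (torch.randn_like(single_action) * policy_noise).clamp(
    -noise_clip, noise_clip
)

# Broadcasting adds the same noise vector to every row of the batch.
noisy_actions = target_actions + shared_noise
rows_identical = bool((noisy_actions == noisy_actions[0]).all())
print(shared_noise.shape, rows_identical)
```

Since `target_actions` is all zeros here, each row of `noisy_actions` is just `shared_noise`, which makes the shared perturbation easy to see.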
Expected Behavior
(1) The noise should take the shape of data.actions, so each action in the batch gets its own noise sample.
(2) The noise should be scaled according to the policy scale.
Possible Solution
(1) Replace torch.Tensor(actions[0]) with torch.Tensor(data.actions)
(2) Multiply the noise with target_actor.action_scale
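A minimal sketch of the two suggested changes combined, written as a standalone function rather than as a patch to the script. The function name `target_policy_noise`, its default values, and the example tensors are hypothetical; `batch_actions` stands in for data.actions and `action_scale` for target_actor.action_scale. Applying the scale after clipping is an assumption of this sketch, so that the clip bound is scaled together with the noise.

```python
import torch


def target_policy_noise(batch_actions, action_scale, policy_noise=0.2, noise_clip=0.5):
    """Sample clipped Gaussian noise per batch element, then scale it.

    batch_actions: tensor of shape (batch_size, action_dim), used only for its shape.
    action_scale:  tensor broadcastable to batch_actions (e.g. shape (action_dim,)).
    """
    # (1) One independent noise sample per action in the batch.
    noise = torch.randn_like(batch_actions) * policy_noise
    noise = noise.clamp(-noise_clip, noise_clip)
    # (2) Scale the clipped noise by the policy's action scale.
    return noise * action_scale


torch.manual_seed(0)
batch_actions = torch.zeros(4, 2)          # hypothetical batch of 4 two-dimensional actions
action_scale = torch.tensor([2.0, 0.5])    # hypothetical per-dimension scale
noise = target_policy_noise(batch_actions, action_scale)
print(noise.shape)
```

With this shape, each row of `noise` is drawn independently, and each action dimension is bounded by `noise_clip` times its own scale.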
Steps to Reproduce