-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[rllib] Revert [rllib] Port DDPG to the build_tf_policy pattern #5626
Conversation
@pcmoritz can we also cherry pick this into the 0.7.4. branch? |
Test FAILed. |
Sounds good, yeah! Shall we also merge it into master? |
Yep
…On Tue, Sep 3, 2019, 3:55 PM Philipp Moritz ***@***.***> wrote:
Sounds good, yeah! Shall we also merge it into master?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#5626?email_source=notifications&email_token=AAADUSVQMRPW2J2LMUAZL3DQH3TPBA5CNFSM4ITK7DTKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5Z2AKA#issuecomment-527671336>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAADUSWISPVBJEHOY5JVQ6LQH3TPBANCNFSM4ITK7DTA>
.
|
Test FAILed. |
There is another Jenkins error (it might involve copying a lot of code back, since it is shared by the SAC agent). |
Test FAILed. |
Jenkins retest this please |
Test FAILed. |
Test FAILed. |
Test FAILed. |
Jenkins retest this please |
Test FAILed. |
Test FAILed. |
This reverts #5242 to fix #5604
Two issues: (1) blowup of Q-values during training of MountainCar (and also Pendulum if you use higher tau values), (2) eager mode acts differently from graph mode somehow.
These are likely due to a subtle bug in the port to the build_tf_policy pattern. We should re-visit the port, but this PR reverts it quickly for 0.7.4.