Debugging PPO training #57

jcallaham · 2022-09-27T11:53:22Z

Explanation

Previously, the cylinder environment would train with RLlib's PPO implementation, but performance never improved (fixes #49). I think there were a few issues, including:

Environments (and their flows/timesteppers) not properly resetting
Control inputs not being properly applied within rollouts

These at least seem to be fixed, although now there's a new issue where RLlib crashes after 3 epochs with a hard-to-understand error message (see #55).
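For reference, the two issues listed above (incomplete resets and control inputs not reaching the rollout) can be caught with two small checks. This is only a minimal sketch, not the code in this PR: `ToyFlowEnv`, `rollout`, `check_reset_is_clean`, and `check_actions_have_effect` are hypothetical names, the dynamics are a toy scalar ODE standing in for the flow solver, and the older four-value `step` return is assumed for brevity.

```python
# Hypothetical toy environment with a Gym-style reset()/step() interface.
# None of these names come from the package in this PR.
class ToyFlowEnv:
    def __init__(self, dt=0.1):
        self.dt = dt
        self.reset()

    def reset(self):
        # Restore *all* mutable state: solver time, flow state, and control.
        self.t = 0.0
        self.state = 1.0
        self.control = 0.0
        return self.state

    def step(self, action):
        # Store the action so the time-stepper actually uses it.
        self.control = float(action)
        # Toy linear dynamics dx/dt = -x + u, advanced with forward Euler.
        self.state += self.dt * (-self.state + self.control)
        self.t += self.dt
        reward = -abs(self.state)
        done = self.t >= 1.0
        return self.state, reward, done, {}


def rollout(env, actions):
    """Reset the environment, apply each action, and record observations."""
    obs = [env.reset()]
    for a in actions:
        o, _, _, _ = env.step(a)
        obs.append(o)
    return obs


def check_reset_is_clean(env, actions):
    """Two rollouts with identical actions should match exactly after reset()."""
    return rollout(env, actions) == rollout(env, actions)


def check_actions_have_effect(env, n_steps=5):
    """Rollouts with different constant actions should diverge."""
    zeros = rollout(env, [0.0] * n_steps)
    ones = rollout(env, [1.0] * n_steps)
    return zeros != ones
```

If `reset()` left the time-stepper or the control value untouched, the first check would fail; if `step()` dropped the action before it reached the dynamics, the second check would fail. The same pattern should carry over to a real environment, although a stochastic or slow solver would likely need a tolerance and fewer steps.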

Current status

All tests passing
RLlib will run without error for some amount of time and then crash

Plan

The way the package is currently structured makes the memory issue a bit hard to debug, especially with recent version changes in OpenAI Gym. Even though our package is in a sense "broken" at this point with respect to training RL agents, it's no more broken than it was before this PR. Since fixing this issue may require resolving #54, #55, and #56, I plan to merge as-is and then work through all of these with a cleaner package (post-#56).

jcallaham · 2022-09-30T07:39:34Z

Also fixes #55

jcallaham added 10 commits September 3, 2022 07:06

Added basic PPO for debugging

64bd1dc

Added cylinder example with homebrew PPO

1e7284a

Added cylinder example with homebrew PPO

d6c1dd6

Debugging actuation updates

086446a

Refactoring control handling (cyl only) and testing PPO

575e2d2

Testing RLlib PPO on cylinder and updating other environments

1e35a63

Moving common functions to FlowConfigBase (tests passing except env_grad

ee42c93

Re-installed pre-commit

89d547d

Fixed IPCS_diff (env_grad passing)

2fc6c07

Debugging memory leak

5a75bd2

jcallaham self-assigned this Sep 27, 2022

jcallaham added 2 commits September 30, 2022 00:31

Fixing intermittent crash in PPO training

99678c6

Notebook formatting

a5cde38

jcallaham added 3 commits September 30, 2022 01:43

Merging main

969eddc

Merging main

042e96d

Tests passing (changed TAU back to 0.05)

02dbd38

jcallaham marked this pull request as ready for review September 30, 2022 09:01

jcallaham merged commit 208fcf4 into main Sep 30, 2022

jcallaham deleted the jc/ppo-train-debug branch September 30, 2022 09:09
