[RLlib] Add metrics to buffers. #49822
First changed file (the algorithm's metric-key imports and training step):
@@ -51,6 +51,7 @@
     NUM_ENV_STEPS_SAMPLED_LIFETIME,
     NUM_TARGET_UPDATES,
     REPLAY_BUFFER_ADD_DATA_TIMER,
+    REPLAY_BUFFER_RESULTS,
     REPLAY_BUFFER_SAMPLE_TIMER,
     REPLAY_BUFFER_UPDATE_PRIOS_TIMER,
     SAMPLE_TIMER,
@@ -660,6 +661,11 @@ def _training_step_new_api_stack(self):
                 sample_episodes=True,
             )
+        replay_buffer_results = self.local_replay_buffer.get_metrics()
+
+        self.metrics.merge_and_log_n_dicts(
+            [replay_buffer_results], key=REPLAY_BUFFER_RESULTS
+        )
 
         # Perform an update on the buffer-sampled train batch.
         with self.metrics.log_time((TIMERS, LEARNER_UPDATE_TIMER)):
             learner_results = self.learner_group.update_from_episodes(

Review comment on the `get_metrics()` line: "nice. Unified API names"

Review thread on the `merge_and_log_n_dicts()` call:
"Yeah, I wonder why … Maybe b/c in …"
"I need to check it."
"So, basically, the lifetime metrics are somehow wrongly accumulated and grow exponentially. They probably need to be reduced before being given to the …"
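The over-counting described in that thread is easy to reproduce outside of RLlib. Below is a minimal, framework-free sketch (plain Python, not RLlib's actual `MetricsLogger` or buffer code; all names are illustrative) of what happens when a buffer's lifetime counter is merged into an aggregate unreduced every iteration, versus merging only the per-iteration delta:

```python
# Illustrative sketch of the accumulation problem discussed above.
# Plain Python only -- these are not RLlib classes or APIs.

def merge_unreduced(n_iters: int, added_per_iter: int) -> int:
    """The buffer reports its full lifetime counter every iteration and the
    aggregator adds it on top of its own running total -> over-counting."""
    buffer_lifetime = 0
    aggregated = 0
    for _ in range(n_iters):
        buffer_lifetime += added_per_iter  # buffer-side lifetime stat
        aggregated += buffer_lifetime      # merged without reducing first
    return aggregated

def merge_deltas(n_iters: int, added_per_iter: int) -> int:
    """The buffer's stats are reduced to per-iteration deltas before merging."""
    aggregated = 0
    for _ in range(n_iters):
        aggregated += added_per_iter       # only the new increment is merged
    return aggregated

print(merge_unreduced(10, 100))  # 5500 -- grows far faster than the true count
print(merge_deltas(10, 100))     # 1000 -- matches the true lifetime count
```

Whether that reduction should happen inside the buffer's `get_metrics()` or on the caller's side before `merge_and_log_n_dicts()` is exactly what the thread above leaves open.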
Second changed file (an example config):
@@ -18,11 +18,15 @@
         lr=0.0005 * (args.num_learners or 1) ** 0.5,
         train_batch_size_per_learner=32,
         replay_buffer_config={
-            "type": "PrioritizedEpisodeReplayBuffer",
+            "type": "EpisodeReplayBuffer",
             "capacity": 50000,
             "alpha": 0.6,
             "beta": 0.4,
         },
+        # replay_buffer_config={
+        #     "type": "PrioritizedEpisodeReplayBuffer",
+        #     "capacity": 50000,
+        #     "alpha": 0.6,
+        #     "beta": 0.4,
+        # },
         n_step=(2, 5),
         double_q=True,
         dueling=True,

Review thread on the `"type": "EpisodeReplayBuffer"` change:
"is this just for testing?"
"Yeah, I wanted to check with you if we proceed like this, and then all buffers get the metrics. Then I can test with any of them."
Review comment: "Cool!"
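The author's point in the buffer-type discussion above ("all buffers get the metrics") appears to be that the metrics plumbing sits on shared buffer code, so every buffer type reports through the same `get_metrics()` call and any one of them can be used for testing. A framework-free sketch of that pattern (class and metric names are illustrative assumptions, not RLlib's actual buffer classes):

```python
# Illustrative sketch of "metrics live on the shared base class, every buffer
# subclass inherits them" -- not RLlib's actual replay-buffer implementation.
from collections import defaultdict

class BufferBase:
    def __init__(self):
        self._metrics = defaultdict(float)
        self.data = []

    def add(self, items):
        # Every subclass funnels through here, so the counters stay in sync.
        self._metrics["num_items_added"] += len(items)
        self._store(items)

    def get_metrics(self):
        # Return a plain dict so the caller can merge it into its own logger.
        return dict(self._metrics)

    def _store(self, items):
        raise NotImplementedError

class SimpleBuffer(BufferBase):
    def _store(self, items):
        self.data.extend(items)

class PrioritizedBuffer(BufferBase):
    def _store(self, items):
        # Priority bookkeeping would go here; the metrics come for free.
        self.data.extend(items)

for buf in (SimpleBuffer(), PrioritizedBuffer()):
    buf.add([1, 2, 3])
    print(type(buf).__name__, buf.get_metrics())  # both report the same keys
```

Under that assumption, switching the example config between `EpisodeReplayBuffer` and `PrioritizedEpisodeReplayBuffer` exercises the same metrics code path, which is why testing with either buffer type would be sufficient.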