[RLlib] Reinstate trajectory view API tests. #18809
Conversation
a lot of nice cleanups, awesome change.
I just have a couple of questions, mostly for my own education.
thanks.
results = trainer.train()
assert results["train_batch_size"] == config["train_batch_size"]
assert results["timesteps_total"] == config["train_batch_size"]
I have a random question that I have been curious about for a while: how closely do we honor the train_batch_size param here?
For example, in complete_episodes mode, or if there is sample replay, will we ever produce a training batch of a very different size?
thanks.
We try to do our best to honor it, but it's not guaranteed to always be exact.
The reason is that we do parallel rollouts with a fixed (or full-episode-length) step limit per vectorized(!) environment. Depending on the number of vectorized sub-envs per worker and the number of workers, the final train batch may be slightly off. For PPO, for example, we auto-correct the rollout_fragment_length (since a few releases ago) based on these factors to better match the train_batch_size, but of course if you have lots of odd numbers in these settings, you will not get the train batch size exactly right.
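To make the arithmetic concrete, here is a rough sketch; the numbers and the rounding rule are illustrative assumptions, not RLlib's actual auto-correction code:

```python
import math

# Illustrative rollout settings (assumed, not from this PR).
num_workers = 3
num_envs_per_worker = 4
rollout_fragment_length = 100
train_batch_size = 1000

# With batch_mode="truncate_episodes", one sampling round collects roughly:
collected = num_workers * num_envs_per_worker * rollout_fragment_length
print(collected)  # 1200 -> 200 steps more than train_batch_size

# A PPO-style auto-correction instead derives the fragment length from the
# desired train batch size and the degree of rollout parallelism:
corrected = math.ceil(train_batch_size / (num_workers * num_envs_per_worker))
print(corrected, num_workers * num_envs_per_worker * corrected)  # 84, 1008
```

With "odd" combinations of workers, sub-envs, and batch size, the product can only approximate train_batch_size, which is why the reported size may be slightly off.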
thanks for the explanation, that's my impression as well.
I actually have a feeling that sometimes we may be off by a lot.
I can probably do some testing when I get a chance.
thanks.
SampleBatch.ACTIONS,
shift=1,
space=action_space,
used_for_compute_actions=False)
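For context, the excerpt above is the tail of a ViewRequirement declaration. A minimal, self-contained sketch of such a declaration might look like the following; the "next_actions" key and the Discrete(2) action space are illustrative stand-ins, not taken from this PR:

```python
import gym
from ray.rllib.policy.sample_batch import SampleBatch
from ray.rllib.policy.view_requirement import ViewRequirement

action_space = gym.spaces.Discrete(2)  # stand-in for the env's action space

view_requirements = {
    # Expose the action taken one step *after* the current timestep (shift=1)
    # to the train batch / loss, but keep it out of the inputs used when
    # computing actions at sampling time.
    "next_actions": ViewRequirement(
        SampleBatch.ACTIONS,
        shift=1,
        space=action_space,
        used_for_compute_actions=False,
    ),
}
```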
maybe we should have validation for this field somewhere? It seems easy to miss, and not straightforward for regular users.
You mean to check whether it's even possible to have this in the action computation path at all, given that the shift is >0 on a "collected" field like actions? Great idea!
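A hypothetical sketch of what such a validation could look like; the function name and the exact rule are assumptions, not part of this PR:

```python
from ray.rllib.policy.sample_batch import SampleBatch

def validate_view_requirement(key, vr):
    """Hypothetical check: a future-shifted (shift > 0), collected column
    cannot exist yet at action-computation time, so flag that combination."""
    collected_cols = {SampleBatch.ACTIONS, SampleBatch.REWARDS, SampleBatch.DONES}
    # `shift` may also be a string range like "-50:-1"; only plain ints are checked here.
    if (vr.used_for_compute_actions and vr.data_col in collected_cols
            and isinstance(vr.shift, int) and vr.shift > 0):
        raise ValueError(
            f"ViewRequirement '{key}': shift={vr.shift} (>0) on collected column "
            f"'{vr.data_col}' cannot be used with used_for_compute_actions=True.")
```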
Why are these changes needed?
This PR reinstates the commented-out trajectory view API tests in rllib/evaluation/tests/test_trajectory_view_api.py, and fixes .is_recurrent() when used with attention nets.
Related issue number
Checks
I've run scripts/format.sh to lint the changes in this PR.