
[RLlib] Reinstate trajectory view API tests. #18809

Merged: 3 commits into ray-project:master on Sep 23, 2021

Conversation

@sven1977 (Contributor) commented Sep 22, 2021

This PR reinstates the commented-out trajectory view API tests in rllib/evaluation/tests/test_trajectory_view_api.py.

  • Fixes some minor bugs along the way: the tf-eager policy did not return the correct value from is_recurrent() when used with attention nets (see the sketch after this list).
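
As a rough illustration of the fixed behavior, here is a minimal sketch (not the PR's actual test; the CartPole-v0 env and the config values are assumptions) that checks is_recurrent() on an eager-mode policy wrapped with an attention net:

import ray
from ray.rllib.agents.ppo import PPOTrainer

ray.init()

# Build a PPO trainer in tf-eager mode with the auto-attention wrapper.
config = {
    "env": "CartPole-v0",
    "framework": "tf2",  # eager mode, where the bug showed up
    "model": {"use_attention": True},
    "num_workers": 0,
}
trainer = PPOTrainer(config=config)

# An attention net carries state between timesteps, so the policy
# should report itself as recurrent.
assert trainer.get_policy().is_recurrent()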

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@gjoliver (Member) left a comment

a lot of nice cleanups, awesome change.
I just have a couple of questions, mostly for my own education.
thanks.

results = trainer.train()
assert results["train_batch_size"] == config["train_batch_size"]
assert results["timesteps_total"] == config["train_batch_size"]
@gjoliver (Member):

I have a random question that I've been curious about for a while: how closely do we honor the train_batch_size param here?
For example, in complete_episodes mode, or if there is sample replay, will we ever produce a training batch of a very different size?
thanks.

@sven1977 (Contributor, Author):

We try our best to honor it, but it's not guaranteed to always be exact.
The reason is that we do parallel rollouts with a fixed (or full-episode-length) step limit per vectorized(!) environment. Depending on the number of vectorized sub-envs per worker and the number of workers, the final train batch may be slightly off. For PPO, for example, we auto-correct the rollout_fragment_length (since a few releases ago) based on these factors to better match the train_batch_size, but of course if you have lots of odd numbers in these settings, you will not get the train batch size exactly right.
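
To make the auto-correction concrete, here is a small arithmetic sketch (the function name and rounding rule are illustrative, not RLlib's actual code):

def adjusted_rollout_fragment_length(train_batch_size, num_workers,
                                     num_envs_per_worker):
    # One sampling round collects one fragment per sub-env across all
    # workers (the local worker samples if num_workers == 0).
    parallel_envs = max(num_workers, 1) * num_envs_per_worker
    # Round up so a single round yields at least train_batch_size steps.
    return -(-train_batch_size // parallel_envs)  # ceil division

# E.g. train_batch_size=4000 over 3 workers x 5 sub-envs each:
# ceil(4000 / 15) = 267, so one round collects 3 * 5 * 267 = 4005
# timesteps -- slightly over the configured 4000.
print(adjusted_rollout_fragment_length(4000, 3, 5))  # -> 267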

@gjoliver (Member):

thanks for the explanation, that's my impression as well.
I actually have a feeling we may sometimes be off by a lot.
I can probably do some testing when I get a chance.
thanks.

SampleBatch.ACTIONS,
shift=1,
space=action_space,
used_for_compute_actions=False)
@gjoliver (Member):

maybe we should have validation for this field somewhere? It seems easy to miss, and it's not straightforward for regular users.

@sven1977 (Contributor, Author):

You mean check whether it's even valid to use such a field for action computation in the first place, given that it has a shift > 0 from a "collected" field like actions? Great idea!
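
Something along these lines could work (a hypothetical sketch; validate_view_requirements is a made-up helper, and it only handles integer shifts, although shift can also be a range string like "-50:-1"):

def validate_view_requirements(view_requirements):
    # Reject requirements that look into the future (shift > 0) but are
    # still marked as inputs to compute_actions: data produced after the
    # action cannot be available at action-computation time.
    for col, vr in view_requirements.items():
        if (vr.used_for_compute_actions and isinstance(vr.shift, int)
                and vr.shift > 0):
            raise ValueError(
                f"ViewRequirement '{col}' has shift={vr.shift} (> 0), but "
                "is marked used_for_compute_actions=True.")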

@sven1977 added the tests-ok label ("The tagger certifies test failures are unrelated and assumes personal liability.") Sep 22, 2021
@sven1977 sven1977 merged commit a96dbd8 into ray-project:master Sep 23, 2021
@sven1977 sven1977 deleted the reinstate_traj_view_api_tests branch June 2, 2023 20:16