[RLlib] "APPO-accelerate" vol 01: Make AggregatorActors work with IMPALA/APPO. #49284

sven1977 · 2024-12-16T14:45:02Z

"APPO-accelerate" vol 01: Make AggregatorActors work with IMPALA/APPO.

This PR adds aggregation actor support for APPO and IMPALA on the new API stack.
Aggregation actors now own the learner connector, offloading the task of translating episodes into batches from the Learner to these actors. They also pre-load the ready batch directly onto the correct GPU and pass the Ray reference directly to the Learner, thus avoiding any GPU-CPU transfers and allowing the Learner to use the train batch directly on the correct GPU.
Co-locates them with the n Learner actors and makes them configurable as num_aggregator_actors_per_learner(!).
Deprecates old AggregationWorkers on old API stack.
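
A minimal plain-Python sketch of the hand-off described above, not RLlib code: `AggregatorSketch`, `LearnerSketch`, and `episodes_to_batch` are hypothetical names, and plain lists stand in for GPU tensors and Ray object references. It only illustrates that the episode-to-batch translation runs on the aggregator, so the Learner receives a ready train batch.

```python
from typing import Callable, Dict, List

Episode = Dict[str, List[float]]
Batch = Dict[str, List[float]]


def episodes_to_batch(episodes: List[Episode]) -> Batch:
    """Hypothetical stand-in for a learner connector: concatenate the
    per-episode columns into one flat train batch."""
    batch: Batch = {}
    for episode in episodes:
        for column, values in episode.items():
            batch.setdefault(column, []).extend(values)
    return batch


class AggregatorSketch:
    """Owns the connector, so batching happens off the learner."""

    def __init__(self, connector: Callable[[List[Episode]], Batch]):
        self.connector = connector

    def get_batch(self, episodes: List[Episode]) -> Batch:
        # The PR additionally pre-loads the batch onto the learner's GPU and
        # hands it over by reference; this sketch simply returns it.
        return self.connector(episodes)


class LearnerSketch:
    """Consumes ready batches; never sees raw episodes."""

    def update(self, batch: Batch) -> int:
        # Placeholder "update": report how many timesteps were trained on.
        return len(batch["rewards"])


episodes = [
    {"obs": [0.1, 0.2], "actions": [0, 1], "rewards": [1.0, 1.0]},
    {"obs": [0.3], "actions": [1], "rewards": [1.0]},
]
aggregator = AggregatorSketch(episodes_to_batch)
learner = LearnerSketch()
train_batch = aggregator.get_batch(episodes)
num_timesteps = learner.update(train_batch)
```

Keeping the connector on the aggregator means the learner's `update` only ever touches batch-shaped data, which is the property the description relies on.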

Why are these changes needed?

Related issue number

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

…r) on 1 local(!) GPU, 29 EnvRunners: python cartpole_impala.py --num-env-runners=29 --num-envs-per-env-runner=20 --stop-iters=10 --num-learners=0 --num-gpus-per-learner=0.98 Signed-off-by: sven1977 <[email protected]>

- co-locate each agg-actor with exactly one learner and keep the exact mapping on the algo (to later match which gpu-batch-ref should go to which learner)
- formalize config option to NOT build learner connector on learner (b/c it's already built on agg actors)
- remove garbage code
- deprecate agg workers on old api stack entirely.
Signed-off-by: sven1977 <[email protected]>
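
The one-learner-per-aggregator mapping in this commit message can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code in the PR: the function names, the contiguous index layout, and the string "refs" are all hypothetical.

```python
from typing import Dict, List, Tuple


def build_aggregator_to_learner_map(
    num_learners: int, num_aggregator_actors_per_learner: int
) -> Dict[int, int]:
    """Assign every aggregator index to exactly one learner index.

    Assumption: aggregators are numbered contiguously per learner.
    """
    mapping: Dict[int, int] = {}
    for learner_idx in range(num_learners):
        for k in range(num_aggregator_actors_per_learner):
            agg_idx = learner_idx * num_aggregator_actors_per_learner + k
            mapping[agg_idx] = learner_idx
    return mapping


def route_batch_refs(
    ready: List[Tuple[int, str]], mapping: Dict[int, int]
) -> Dict[int, List[str]]:
    """Group (aggregator_idx, batch_ref) pairs by the owning learner."""
    per_learner: Dict[int, List[str]] = {}
    for agg_idx, batch_ref in ready:
        per_learner.setdefault(mapping[agg_idx], []).append(batch_ref)
    return per_learner


# 2 learners with 2 aggregators each; strings stand in for object refs.
agg_to_learner = build_aggregator_to_learner_map(2, 2)
routed = route_batch_refs(
    [(0, "ref-a"), (3, "ref-b"), (1, "ref-c")], agg_to_learner
)
```

Because a batch that was pre-loaded onto one learner's GPU is only useful to that learner, the lookup has to be exact; a round-robin hand-out of refs would not preserve that.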

simonsays1980

LGTM. Already reviewed for OSS Ray. One small question about the value of num_aggregator_actors_per_learner, which could be None, as it looks in the code.

simonsays1980 · 2025-01-13T16:52:43Z

rllib/algorithms/algorithm.py

+                        0,
+                        (
+                            cf.num_gpus_per_learner
+                            - 0.01 * cf.num_aggregator_actors_per_learner


How does this not fail, if cf.num_aggregator_actors_per_learner=None?

B/c it's always some int. By default 0.
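
The arithmetic under discussion can be checked with a small standalone sketch. It assumes the truncated diff lines sit inside a `max(...)` call (the enclosing call is not visible in the excerpt), and `learner_gpu_share` is a hypothetical helper name; only the `0.01` factor and the two config fields come from the diff.

```python
def learner_gpu_share(
    num_gpus_per_learner: float,
    num_aggregator_actors_per_learner: int = 0,
) -> float:
    """GPU fraction left for a learner after each co-located aggregator
    takes 0.01 of a GPU, clamped at 0 (assumed max(0, ...) wrapper)."""
    return max(
        0,
        num_gpus_per_learner - 0.01 * num_aggregator_actors_per_learner,
    )


share_default = learner_gpu_share(1)      # default of 0 aggregators
share_two = learner_gpu_share(1, 2)       # two aggregators on the same GPU
share_clamped = learner_gpu_share(0, 2)   # CPU learner: never negative

# The reviewer's concern: a None value would fail at the multiplication.
try:
    learner_gpu_share(1, None)
    none_fails = False
except TypeError:
    none_fails = True
```

With the integer default of 0 the expression reduces to `num_gpus_per_learner`, which is why the answer above holds as long as the setting is never `None`.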

sven1977 added 30 commits December 12, 2024 11:51

wip

9664f97

Signed-off-by: sven1977 <[email protected]>

1194

ad6cc99

Signed-off-by: sven1977 <[email protected]>

wip

acea140

Signed-off-by: sven1977 <[email protected]>

wip

2b52303

Signed-off-by: sven1977 <[email protected]>

wip

27fb72c

Signed-off-by: sven1977 <[email protected]>

wip

7380f46

Signed-off-by: sven1977 <[email protected]>

wip

0050c8a

Signed-off-by: sven1977 <[email protected]>

wip

4465ad2

Signed-off-by: sven1977 <[email protected]>

wip

0e8b9a2

Signed-off-by: sven1977 <[email protected]>

fix

b8397e9

Signed-off-by: sven1977 <[email protected]>

wip

409c95b

Signed-off-by: sven1977 <[email protected]>

1194

123c8b5

Signed-off-by: sven1977 <[email protected]>

wip

b8ac459

Signed-off-by: sven1977 <[email protected]>

wip

8213928

Signed-off-by: sven1977 <[email protected]>

Merge branch 'master' of https://github.com/ray-project/ray into appo…

07c6b5c

…_accelerate

up and running (and learning cartpole) w/ 1 aggregation worker on lap…

8a937b3

…top (local CPU learner). Signed-off-by: sven1977 <[email protected]>

fix num_gpus=0.01 for aggregator actor

f4510b5

Signed-off-by: sven1977 <[email protected]>

fix num_gpus=0.01 for aggregator actor

eda58ab

Signed-off-by: sven1977 <[email protected]>

Merge branch 'master' of https://github.com/ray-project/ray into appo…

843be59

…_accelerate

fixes

e38fcbc

Signed-off-by: sven1977 <[email protected]>

fix num_gpus=0.01 for aggregator actor

10f2a23

Signed-off-by: sven1977 <[email protected]>

fix

df18279

Signed-off-by: sven1977 <[email protected]>

fix

94e02e2

Signed-off-by: sven1977 <[email protected]>

- even with evaluation active (2 env runners), we almost don't have a…

d8edc3c

…ny slowdowns wrt runs w/o evaluation active; and those might be due to the fact that we have 2 env runners less Signed-off-by: sven1977 <[email protected]>

Merge branch 'master' of https://github.com/ray-project/ray into appo…

086fab1

…_accelerate

fix adding RLModule to aggregation workers, so all connector pieces c…

e831f89

…an be reverted to their "normal" master versions (add states and to-numpy). Signed-off-by: sven1977 <[email protected]>

fixes

59d300c

Signed-off-by: sven1977 <[email protected]>

fix

3acc5ba

Signed-off-by: sven1977 <[email protected]>

sven1977 added 7 commits January 6, 2025 20:39

fixes

c76f5b1

Signed-off-by: sven1977 <[email protected]>

fixes

72e30e1

Signed-off-by: sven1977 <[email protected]>

Merge branch 'master' of https://github.com/ray-project/ray into appo…

9420e8f

…_accelerate

wip

ca0d05b

Signed-off-by: sven1977 <[email protected]>

Merge branch 'master' of https://github.com/ray-project/ray into appo…

4307c6e

…_accelerate Signed-off-by: sven1977 <[email protected]> # Conflicts: # rllib/algorithms/impala/impala_learner.py # rllib/connectors/learner/learner_connector_pipeline.py

wip

b296e8e

Signed-off-by: sven1977 <[email protected]>

wip

a888cc4

Signed-off-by: sven1977 <[email protected]>

sven1977 enabled auto-merge (squash) January 8, 2025 12:13

github-actions bot disabled auto-merge January 8, 2025 12:13

github-actions bot added the go add ONLY when ready to merge, run all tests label Jan 8, 2025

sven1977 added 2 commits January 8, 2025 13:57

wip

d968845

Signed-off-by: sven1977 <[email protected]>

fix

deec8b3

Signed-off-by: sven1977 <[email protected]>

sven1977 enabled auto-merge (squash) January 8, 2025 16:38

wip

765476b

Signed-off-by: sven1977 <[email protected]>

github-actions bot disabled auto-merge January 9, 2025 11:29

sven1977 added 9 commits January 9, 2025 13:22

merge

b71fb78

Signed-off-by: sven1977 <[email protected]>

merge

313f893

Signed-off-by: sven1977 <[email protected]>

merge

edbcad3

Signed-off-by: sven1977 <[email protected]>

Merge branch 'master' of https://github.com/ray-project/ray into appo…

96c65c6

…_accelerate

Merge branch 'master' of https://github.com/ray-project/ray into appo…

36904c6

…_accelerate

wip

82cf4da

Signed-off-by: sven1977 <[email protected]>

Merge branch 'master' of https://github.com/ray-project/ray into appo…

69a3b5b

…_accelerate Signed-off-by: sven1977 <[email protected]> # Conflicts: # rllib/algorithms/algorithm_config.py

Merge branch 'master' of https://github.com/ray-project/ray into do_n…

7a86c43

…ot_finalize_episodes_sent_to_buffer Signed-off-by: sven1977 <[email protected]> # Conflicts: # rllib/utils/replay_buffers/episode_replay_buffer.py

wip

4c66864

Signed-off-by: sven1977 <[email protected]>

sven1977 enabled auto-merge (squash) January 13, 2025 16:50

simonsays1980 approved these changes Jan 13, 2025

sven1977 merged commit 093ed4c into ray-project:master Jan 13, 2025
6 checks passed

sven1977 deleted the appo_accelerate branch January 13, 2025 18:15

srinathk10 pushed a commit that referenced this pull request Feb 2, 2025

[RLlib] "APPO-accelerate" vol 01: Make AggregatorActors work with IMP…

d99d6c7

…ALA/APPO. (#49284)

anyadontfly pushed a commit to anyadontfly/ray that referenced this pull request Feb 13, 2025

[RLlib] "APPO-accelerate" vol 01: Make AggregatorActors work with IMP…

afb9c40

…ALA/APPO. (ray-project#49284) Signed-off-by: Puyuan Yao <[email protected]>
