-
In the skrl examples, the timesteps are defined by the above equation, but why? I thought the number of parallel environments would be involved as well.
Thank you in advance.
-
Hi @berttggg, as indicated in the skrl documentation, the training configuration is mapped as far as possible from the rl_games configuration. In rl_games, the training duration is basically defined by the following loop:

    while True:
        epoch_num += 1
        for n in range(self.horizon_length):
            ...  # collect data (rollouts)
        ...  # train the agent
        if epoch_num >= self.max_epochs:
            break

Reference:
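For a rough sense of the mapping, here is a minimal sketch; the horizon_length and max_epochs values are made up for illustration:

    # Assumed example values (not from any real config)
    horizon_length = 16   # rollout steps collected per epoch in rl_games
    max_epochs = 1000     # training epochs in rl_games

    # skrl's trainer is driven by a single `timesteps` setting; matching the
    # loop above, each epoch contributes horizon_length timesteps:
    timesteps = horizon_length * max_epochs
    print(timesteps)  # 16000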
-
Thank you so much for the explanation. But I am confused. In the rl_games documentation, the data size is governed by both horizon_length and num_actors. In the skrl documentation, num_actors (num_envs) seems to be left out?
Hi @berttggg,
rl_games counts (for statistics) the timesteps of each parallel environment (num_actors) independently, while skrl counts them for the whole task (where calling env.step(...) executes a timestep for all parallel environments in one go).
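To make the two counting conventions concrete, here is a small sketch with assumed values:

    # Assumed example values (not from any real config)
    num_envs = 4096       # parallel environments (num_actors in rl_games)
    horizon_length = 16
    max_epochs = 1000

    # skrl: one timestep == one env.step(...) call that advances ALL envs at once
    skrl_timesteps = horizon_length * max_epochs      # 16000

    # rl_games statistics: each parallel environment's steps are counted separately
    rl_games_env_steps = skrl_timesteps * num_envs    # 65,536,000

    # Both run the same number of env.step(...) calls; only the bookkeeping differs.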