Enabling periodic evaluation #202

elle-miller · 2024-09-19T09:38:22Z


Hi there,

I propose that SKRL provide a way to periodically evaluate the agent during training, for example by using SequentialTrainer in an alternating train() -> eval() -> train() fashion. The provided examples only show evaluation after training: https://skrl.readthedocs.io/en/latest/api/trainers/sequential.html

Here is an example of an agent in the Isaac Lab Cartpole environment. The evaluation returns communicate the true learning state of the agent, without the stochasticity of the sampled actions. I was always confused by how the performance would degrade or oscillate in Rewards/Total reward (mean) throughout training; evaluating with the mean action removes that noise.

The code needed for this change, and to reproduce the plots, is here: https://github.com/elle-miller/skrl_testing

In this example, I train num_envs environments for 1000 timesteps each and evaluate 10 times throughout the process: evaluate, train for 100 timesteps, and repeat 10 times.

max_timesteps = 1000
num_eval = 10
# timesteps per training chunk: 1000 / 10 = 100
train_timesteps = max_timesteps // num_eval
for step in range(num_eval):
    # global_step includes only training timesteps
    global_step = step * train_timesteps

    # compute evaluation returns
    returns = trainer.eval()
    agent.writer.add_scalar("Eval / Returns", returns['returns'].mean().cpu(), global_step=global_step)

    # train
    trainer.train(train_timesteps)
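To make the schedule concrete, here is a minimal, self-contained sketch of the same evaluate-then-train loop, using the numbers from the description (1000 timesteps, 10 evaluations). StubTrainer and run_with_periodic_eval are hypothetical names invented for illustration; the stub is not the skrl SequentialTrainer and only counts timesteps.

```python
from dataclasses import dataclass


@dataclass
class StubTrainer:
    """Hypothetical stand-in for a trainer; NOT the skrl SequentialTrainer."""

    trained_timesteps: int = 0

    def train(self, timesteps: int) -> None:
        # a real trainer would step the environments and update the agent here
        self.trained_timesteps += timesteps

    def eval(self) -> float:
        # a real trainer would run evaluation episodes and return their mean return;
        # this stub simply reports how much training has happened so far
        return float(self.trained_timesteps)


def run_with_periodic_eval(trainer, max_timesteps: int, num_eval: int):
    """Evaluate, then train one chunk, num_eval times; return (global_step, return) pairs."""
    train_timesteps = max_timesteps // num_eval
    curve = []
    for step in range(num_eval):
        global_step = step * train_timesteps  # counts training timesteps only
        curve.append((global_step, trainer.eval()))
        trainer.train(train_timesteps)
    return curve


curve = run_with_periodic_eval(StubTrainer(), max_timesteps=1000, num_eval=10)
print(curve[:3])  # [(0, 0.0), (100, 100.0), (200, 200.0)]
```

With this ordering the first evaluation scores the untrained agent at step 0 and the last one happens at step 900, so the fully trained agent is only scored if eval() is called once more after the loop.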

Code modifications

The train() function is modified to reset the memory and the rollout counter:

self.agents.memory.reset()
self.agents._rollout = 0
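The post does not spell out why this reset is needed. One plausible reason is that a train() call can end partway through a rollout, so leftover transitions would otherwise be batched together with data collected after the evaluation phase. The toy sketch below illustrates that effect under this assumption; ToyAgent, RolloutBuffer, train_chunk and mixed_batches are hypothetical and do not mirror skrl's internal classes.

```python
class RolloutBuffer:
    """Toy on-policy buffer (hypothetical; not skrl's memory class)."""

    def __init__(self):
        self.data = []

    def add(self, transition):
        self.data.append(transition)

    def reset(self):
        self.data.clear()


class ToyAgent:
    def __init__(self, rollouts: int):
        self.rollouts = rollouts  # transitions collected before each update
        self.memory = RolloutBuffer()
        self._rollout = 0
        self.updates = []  # one batch of transitions per simulated update

    def record(self, transition):
        self.memory.add(transition)
        self._rollout += 1
        if self._rollout % self.rollouts == 0:
            # simulate a policy update on the collected rollout, then clear it
            self.updates.append(list(self.memory.data))
            self.memory.reset()


def train_chunk(agent, chunk_id: int, timesteps: int, reset_first: bool):
    if reset_first:
        agent.memory.reset()
        agent._rollout = 0
    for t in range(timesteps):
        agent.record((chunk_id, t))


def mixed_batches(reset_first: bool) -> int:
    """Count update batches that contain transitions from more than one train() call."""
    agent = ToyAgent(rollouts=4)
    for chunk_id in range(2):
        # 6 timesteps per chunk leaves 2 transitions over after the first update
        train_chunk(agent, chunk_id, timesteps=6, reset_first=reset_first)
    return sum(len({chunk for chunk, _ in batch}) > 1 for batch in agent.updates)


print(mixed_batches(reset_first=False))  # 1
print(mixed_batches(reset_first=True))   # 0
```

The trade-off in this sketch is that resetting discards the leftover transitions, which is avoided if the chunk length is a multiple of the rollout length.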

The act() function in PPO is modified to return the mean action during evaluation instead of a sampled one:

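# sample an action from the policy; outputs also carries the mean of the action distribution
actions, log_prob, outputs = self.policy.act({"states": self._state_preprocessor(states)}, role="policy")
# eval is the evaluation flag (note that the name shadows Python's builtin eval)
if eval:
    # use the mean action; log_prob still refers to the sampled action
    actions = outputs['mean_actions']
self._current_log_prob = log_prob
return actions, log_prob, outputs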
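The effect of this switch can be illustrated without skrl. The sketch below is a toy example with a one-dimensional Gaussian policy and a made-up quadratic reward; the act and episode_return functions here are hypothetical helpers, not the PPO methods above.

```python
import random
import statistics


def act(mean: float, std: float, evaluate: bool, rng: random.Random) -> float:
    """Return the mean action when evaluating, otherwise a Gaussian sample around it."""
    if evaluate:
        return mean
    return rng.gauss(mean, std)


def episode_return(evaluate: bool, rng: random.Random, steps: int = 50) -> float:
    # toy reward: 0 at the optimal action (0.0), a quadratic penalty elsewhere
    policy_mean, policy_std = 0.0, 0.5
    return sum(-(act(policy_mean, policy_std, evaluate, rng) ** 2) for _ in range(steps))


rng = random.Random(0)
train_returns = [episode_return(False, rng) for _ in range(20)]
eval_returns = [episode_return(True, rng) for _ in range(20)]

# evaluation episodes are identical because no sampling noise is involved
print(statistics.pstdev(eval_returns))  # 0.0
# sampled actions are penalised for exploration noise, so training returns look worse
print(statistics.mean(train_returns) < statistics.mean(eval_returns))  # True
```

In this toy setting the policy mean is already optimal, so the whole gap between the two curves comes from exploration noise, which is the kind of effect the evaluation returns are meant to factor out.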

The eval() method
Fixing reset method

Let me know what you think. I can open a PR if you want to integrate this.

elle-miller · 2024-09-30T09:57:09Z


@Toni-SM just FYI, I resolved the initial issues I had when I opened this discussion 2 weeks ago, and have edited the post to reflect the current working state.


Toni-SM Sep 30, 2024
Maintainer

Hi @elle-miller

Yeah, this sounds good. Please open a PR for this.
By the way, it might be necessary to take into account the problem mentioned in #154.
