RedispReward returns reward greater than the reward_max #187

Closed
mlanden opened this issue Apr 13, 2021 · 3 comments · Fixed by #190
Labels
bug Something isn't working

Comments


mlanden commented Apr 13, 2021

Environment

  • Grid2op version: 1.4.0
  • System: Red Hat Enterprise Linux Server 7.9 (Maipo)

Bug description

RedispReward is returning rewards that are greater than the maximum reward it calculates.

How to reproduce

Create an environment with the default reward and compare the rewards returned to the max reward computed in the initialize function.


Code snippet

```python
import grid2op

env = grid2op.make()
_, reward, _, _ = env.step(env.action_space({}))

# compare the reward to the maximum advertised by the environment
print(f"reward: {reward}, reward_max: {env.reward_range[1]}")
```
The output of the code snippet above

The reward returned is greater than the reward_max computed in initialize. 
@mlanden mlanden added the bug Something isn't working label Apr 13, 2021

BDonnot commented Apr 13, 2021

Hello,
Thanks for noticing, I will try to update it as soon as possible.

It should not be a problem; in the meantime, you can have a look at other environments, which I hope behave more normally.

The other environment closest to the default one is:

```python
import grid2op

env_name = "l2rpn_case14_realistic"
env = grid2op.make(env_name)
```


mlanden commented Apr 13, 2021

Unfortunately, that does not seem to be the case. With the IEEE 14-bus environment, I still see:

```
Max: 706.4000244140625, Reward: 1085.8973388671875 redispach 1085.8973388671875, overflow 0.0
```

where the overflow reward is a term I am adding myself; it is not relevant here because it is 0. I am interpolating this reward with np.interp, so the advertised max needs to be accurate for the results to be correct.
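For context, a minimal sketch of that kind of normalization, assuming the reward is mapped into [0, 1] with np.interp (the environment name is taken from the comment above; the rest is illustrative, not grid2op API):

```python
import numpy as np
import grid2op

env = grid2op.make("l2rpn_case14_realistic")
obs, reward, done, info = env.step(env.action_space({}))

reward_min, reward_max = env.reward_range  # bounds advertised by the env

# np.interp clamps inputs outside [reward_min, reward_max] to the endpoints,
# so a raw reward above a wrongly computed reward_max silently saturates at 1.0
normalized = np.interp(reward, [reward_min, reward_max], [0.0, 1.0])
print(f"raw: {reward:.2f}, max: {reward_max:.2f}, normalized: {normalized:.2f}")
```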


BDonnot commented Apr 14, 2021

For me, for later reuse:

```python
import grid2op

env_name = "l2rpn_case14_sandbox"
env = grid2op.make(env_name)
obs = env.reset()
obs, reward, done, info = env.step(env.action_space())
print(f"obtained reward: {reward:.2f}")
print(f"max reward: {env.reward_range[1]:.2f}")
```

So indeed there is a problem with the way the reward max is computed for this specific case.
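To make the violation easy to spot over a whole episode, a minimal sketch of such a check, assuming the environment's default RedispReward and using do-nothing actions as in the snippets above, could look like this:

```python
import grid2op

env = grid2op.make("l2rpn_case14_sandbox")
obs = env.reset()
_, reward_max = env.reward_range  # (min, max) advertised by the env

done = False
while not done:
    # step with a do-nothing action
    obs, reward, done, info = env.step(env.action_space({}))
    if reward > reward_max:
        # this should never trigger; with the bug reported here, it does
        print(f"reward {reward:.2f} exceeds reward_max {reward_max:.2f}")
```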

@BDonnot BDonnot mentioned this issue Apr 14, 2021
BDonnot added a commit that referenced this issue Apr 15, 2021
Proposing a fix for issue #187
Adding the doc for issue #179
Adding other doc for issue #184 (the documentation of the opponent)

See changelog for more information
@BDonnot BDonnot linked a pull request Apr 15, 2021 that will close this issue
BDonnot added a commit that referenced this issue Mar 1, 2024