RedispReward returns reward greater than the reward_max #187
Comments
Hello, it should not be a problem. In the meantime, maybe you can have a look at other environments, which should behave more normally, I hope. The other env close to the default one is:

```python
import grid2op
env_name = "l2rpn_case14_realistic"
env = grid2op.make(env_name)
```
Unfortunately, that does not seem to be the case. With the IEEE 14-bus environment, I still see the same behavior. The overflow reward shown is one I'm adding, but it is not relevant here because it is 0. I am applying interpolation to this reward, so the max needs to be accurate to get correct results from np.interp.
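For context on why an accurate maximum matters here: `np.interp` clamps inputs that fall outside the `xp` range to the endpoint values of `fp`, so any reward above the advertised maximum is silently mapped to the top of the target range. A minimal sketch (the reward bounds below are hypothetical, not grid2op's actual values):

```python
import numpy as np

# Hypothetical reward bounds, standing in for env.reward_range
reward_min, reward_max = 0.0, 10.0

# A reward inside the advertised range maps linearly onto [0, 1]
in_range = np.interp(7.5, [reward_min, reward_max], [0.0, 1.0])
print(in_range)  # 0.75

# A reward ABOVE the advertised maximum is clamped to fp[-1] = 1.0,
# so an underestimated reward_max silently distorts the scaled values
out_of_range = np.interp(12.0, [reward_min, reward_max], [0.0, 1.0])
print(out_of_range)  # 1.0
```

This is why a `reward_max` smaller than the rewards actually returned breaks interpolation-based scaling.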
For me, for later reuse:

```python
import grid2op

env_name = "l2rpn_case14_sandbox"
env = grid2op.make(env_name)
obs = env.reset()
obs, reward, done, info = env.step(env.action_space())
print(f"obtained reward: {reward:.2f}")
print(f"max reward: {env.reward_range[1]:.2f}")
```

So indeed there is a problem with the way the reward max is computed for this specific case.
Environment
1.4.0
Red Hat Enterprise Linux Server 7.9 (Maipo)
Bug description
RedispReward is returning rewards that are greater than the maximum reward it calculates.
How to reproduce
Create an environment with the default reward and compare the rewards returned to the max reward computed in the `initialize` function.
The output of the code snippet above