Reward not called on diverging flow error #146

Closed
bwitherspoon opened this issue Sep 11, 2020 · 2 comments
Labels
enhancement New feature or request

Comments

@bwitherspoon
Contributor

Environment

  • Grid2op version: 1.2.2
  • System: Fedora 32

Bug description

The reward function is not called when a diverging power flow exception (and possibly other errors) occurs. The value of reward_min is returned instead.

Code snippet

import grid2op
from grid2op.Reward import BaseReward
from grid2op.dtypes import dt_float

class Reward(BaseReward):
    def __init__(self):
        super().__init__()
        self.reward_min = dt_float(100.0)  # Note: deliberately differs from the values returned by __call__ below
        self.reward_max = dt_float(0.0)

    def __call__(self, action, env, has_error, is_done, is_illegal, is_ambiguous):
        if has_error:
            return dt_float(-10.0)
        else:
            return dt_float(1.0)

env = grid2op.make("l2rpn_case14_sandbox", reward_class=Reward)

obs = env.reset()
while True:
    act = env.action_space.sample()
    obs, reward, done, info = env.step(act)

    if len(info["exception"]) > 0:
        for e in info["exception"]:
            if type(e) is grid2op.Exceptions.DivergingPowerFlow:
                print("reward -", reward)
    if done:
        break

Current output

reward - 100.0

Expected output

reward - -10.0
@bwitherspoon bwitherspoon added the bug Something isn't working label Sep 11, 2020
@BDonnot BDonnot added enhancement New feature or request and removed bug Something isn't working labels Sep 12, 2020
@BDonnot
Collaborator

BDonnot commented Sep 12, 2020

Hello,
Thanks for reporting this bug :-)
I'll try to have a fix ready for the next version (hopefully released next week, but no promises).

Since "reward_min" is required to actually be the minimum reward, I moved this issue from "bug" to "enhancement": if that requirement is met, the behaviour is normal.
Though I totally agree this is quite surprising (and not really explicit) behaviour.
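
To illustrate the requirement, here is a minimal sketch that runs without Grid2Op installed. `SketchReward` is a hypothetical stand-in for a reward class (not the real `BaseReward`), and `step_reward` is an assumed helper that only mimics the behaviour described in this issue, where the reward is not called on error and `reward_min` is returned instead. It is not Grid2Op's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class SketchReward:
    """Hypothetical stand-in for a reward class (not grid2op's BaseReward)."""

    reward_min: float
    reward_max: float

    def __call__(self, has_error: bool) -> float:
        # Same logic as the reward in the report: penalize errors.
        return -10.0 if has_error else 1.0


def step_reward(reward: SketchReward, has_error: bool) -> float:
    """Assumed model of the reported behaviour: on an error the reward
    is not called and reward_min is returned directly."""
    if has_error:
        return reward.reward_min
    return reward(has_error)


# Bounds as in the report: reward_min is not the true minimum.
inconsistent = SketchReward(reward_min=100.0, reward_max=0.0)
# Bounds matching what __call__ can actually return.
consistent = SketchReward(reward_min=-10.0, reward_max=1.0)

print(step_reward(inconsistent, has_error=True))  # 100.0
print(step_reward(consistent, has_error=True))    # -10.0
```

With consistent bounds, the value returned on error matches what the reward itself would have produced, which is why the reported output only looks wrong when `reward_min` is not the real minimum.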

Thanks for the issue :-)

Benjamin

BDonnot added a commit to BDonnot/Grid2Op that referenced this issue Sep 15, 2020
@BDonnot
Collaborator

BDonnot commented Sep 25, 2020

Fixed in version 1.2.3 :-) [release in progress at this very moment]

@BDonnot BDonnot closed this as completed Sep 25, 2020
BDonnot added a commit that referenced this issue Jan 19, 2022
Include example for noisy observation