Reward not called on diverging flow error #146

bwitherspoon · 2020-09-11T18:21:22Z

Environment

Grid2op version: 1.2.2
System: Fedora 32

Bug description

The reward function is not called when a step raises a diverging power flow exception, and possibly other errors. The value of reward_min is returned instead.

Code snippet

import grid2op
from grid2op.Reward import BaseReward
from grid2op.dtypes import dt_float

class Reward(BaseReward):
    def __init__(self):
        super().__init__()
        self.reward_min = dt_float(100.0) # Deliberately differs from the -10.0 returned on error below
        self.reward_max = dt_float(0.0)

    def __call__(self, action, env, has_error, is_done, is_illegal, is_ambiguous):
        if has_error:
            return dt_float(-10.0)
        else:
            return dt_float(1.0)

env = grid2op.make("l2rpn_case14_sandbox", reward_class=Reward)

obs = env.reset()
while True:
    act = env.action_space.sample()
    obs, reward, done, info = env.step(act)

    if len(info["exception"]) > 0:
        for e in info["exception"]:
            if type(e) is grid2op.Exceptions.DivergingPowerFlow:
                print("reward -", reward)
    if done:
        break

Current output

reward - 100.0

Expected output

reward - -10.0

BDonnot · 2020-09-12T14:11:33Z

Hello,
Thanks for reporting this bug :-)
I'll try to have a fix ready for the next version (hopefully released next week, but no promises)

As "reward_min" is required to actually be the minimum reward, I moved this issue from "bug" to "enhancement": if the requirement is met, then the behaviour is normal.
Though I totally agree this is quite surprising (and not really explicit) behaviour.

Thanks for the issue :-)

Benjamin
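A minimal sketch of the point made above, assuming nothing beyond what this thread states. It does not import Grid2Op: `ConsistentReward` only mimics the call signature from the report, and `simulate_step_reward` is a hypothetical helper that models the two behaviours under discussion, not Grid2Op code. It shows that when reward_min really is the smallest value the reward can return, the shortcut on error and an actual call to the reward give the same result.

```python
# Stand-in sketch: no Grid2Op import, so it runs anywhere.
# `simulate_step_reward` is a hypothetical helper, not a Grid2Op function.


class ConsistentReward:
    """Mimics the reward call signature used in the report."""

    def __init__(self):
        # reward_min really is the smallest value __call__ can return.
        self.reward_min = -10.0
        self.reward_max = 1.0

    def __call__(self, action, env, has_error, is_done, is_illegal, is_ambiguous):
        return self.reward_min if has_error else self.reward_max


def simulate_step_reward(reward, has_error, call_on_error):
    """Model the two behaviours discussed in this issue.

    call_on_error=False: behaviour reported for 1.2.2 (reward_min returned directly).
    call_on_error=True: behaviour the reporter expected (the reward is called).
    """
    if has_error and not call_on_error:
        return reward.reward_min
    return reward(None, None, has_error, has_error, False, False)


reward = ConsistentReward()
shortcut = simulate_step_reward(reward, has_error=True, call_on_error=False)
called = simulate_step_reward(reward, has_error=True, call_on_error=True)
print(shortcut, called)
```

With the reward from the original report (reward_min set to 100.0), the same two calls would disagree, which is the surprising output described above.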

BDonnot · 2020-09-25T14:49:05Z

Fixed in version 1.2.3 :-) [release in progress at this very moment]

bwitherspoon added the bug Something isn't working label Sep 11, 2020

BDonnot added enhancement New feature or request and removed bug Something isn't working labels Sep 12, 2020

BDonnot added a commit to BDonnot/Grid2Op that referenced this issue Sep 15, 2020

adressing and making a test for enhancement Grid2op#146

6cefb57

This was referenced Sep 25, 2020

Update to version 1.2.3 BDonnot/Grid2Op#112

Merged

update to version 1.2.3 #150

Merged

BDonnot closed this as completed Sep 25, 2020

bwitherspoon mentioned this issue Dec 12, 2020

Incorrect reward on end of episode without error #164

Closed

BDonnot added a commit that referenced this issue Jan 19, 2022

Merge pull request #146 from BDonnot/bd_dev

35cf69e

Include example for noisy observation
