Line that will be in maintenance next time step are not taken into account in the "simulate" function #148

Closed
DesmondZhong opened this issue Sep 15, 2020 · 2 comments
Labels
bug Something isn't working

Comments

@DesmondZhong

Environment

  • Grid2op version: 1.2.2
  • System: Archlinux

Bug description

Line maintenance seems to be treated as an attack, which is reflected in the attack duration.

How to reproduce

Code snippet

from lightsim2grid import LightSimBackend
from grid2op import make
backend = LightSimBackend()
env = make("l2rpn_neurips_2020_track1_small", backend=backend, difficulty="0")
env.seed(3) # for reproducibility
obs = env.reset()

def print_obs(obs):
    print(f"line status: {obs.line_status}")
    print(f"attack_duration: {obs.time_before_cooldown_line}")
    print(f"time next maintenance {obs.time_next_maintenance}")
    print(f"maintenance duration {obs.duration_next_maintenance}")

print("\n-------------initial observation----------------\n")
print_obs(obs) 
# from the observation, we know line 18 is scheduled for maintenance 
# in 684 time steps, we then do nothing for 683 time steps

from grid2op.Agent import DoNothingAgent
do_nothing_agent = DoNothingAgent(env.action_space)

# do nothing for 683 time steps
for i in range(683):
    obs, reward, done, info = env.step(do_nothing_agent.act(observation=None, reward=None))

print("\n-------------observation after 683 steps-------------\n")
print_obs(obs)
# notice now line 18 is still connected in the power grid

# first simulate one time step and actually step one time step
sim_obs, sim_reward, sim_done, sim_info = obs.simulate(do_nothing_agent.act(observation=None, reward=None))
obs, reward, done, info = env.step(do_nothing_agent.act(observation=None, reward=None))

print("\n------------simulation-------------\n")
print_obs(sim_obs)
# notice in the simulation, line 18 is still connected
print("\n---------true------------\n")
print_obs(obs)
# notice that in the actual step, line 18 is down,
# and the attack_duration of line 18 changes to 96,
# which seems to indicate that line 18 is undergoing an attack

Current output

------------simulation-------------

line status: [ True  True  True  True  True  True  True  True  True  True  True  True
  True False  True  True  True  True  True  True  True  True  True False
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True False  True  True]
attack_duration: [ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0 31  0  0]
time next maintenance [3745   -1   -1   -1   -1   -1   -1   -1   -1 1729   -1   -1   -1   -1
   -1   -1   -1   -1    1   -1   -1   -1   -1   -1   -1   -1   -1   -1
   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1
   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1
   -1   -1   -1]
maintenance duration [96  0  0  0  0  0  0  0  0 96  0  0  0  0  0  0  0  0 96  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0]

---------true------------

line status: [ True  True  True  True  True  True  True  True  True  True  True  True
  True False  True  True  True  True False  True  True  True  True False
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True False  True  True]
attack_duration: [ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 96  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0 31  0  0]
time next maintenance [3744   -1   -1   -1   -1   -1   -1   -1   -1 1728   -1   -1   -1   -1
   -1   -1   -1   -1    0   -1   -1   -1   -1   -1   -1   -1   -1   -1
   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1
   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1
   -1   -1   -1]
maintenance duration [96  0  0  0  0  0  0  0  0 96  0  0  0  0  0  0  0  0 96  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0]

Expected output

Please look at my comments in the code snippet. I found this while testing the behavior of obs.simulate: the maintenance of duration 96 is treated as an attack of duration 96, as indicated by the attack_duration of power line 18. As I understand it, true attacks are currently hard-coded to have a duration of 48. I think the above results indicate that line maintenance is treated as an attack in the environment, and that obs.simulate cannot predict a line maintenance because it is essentially handled as an attack.

@DesmondZhong DesmondZhong added the bug Something isn't working label Sep 15, 2020
@BDonnot
Collaborator

BDonnot commented Sep 16, 2020

Hello,

I think there is a misunderstanding here about "obs.time_before_cooldown_line", which is not exactly the attack duration. A cooldown on a line can actually arise in 3 different ways:

  • the agent changed the status of the powerline
  • there is a maintenance
  • there is an attack

Cooldown only means "you cannot act on the status of this powerline for XXX steps"
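
To make the distinction concrete, here is a minimal sketch (plain Python, not part of grid2op) that guesses why a line is in cooldown by cross-checking the cooldown against the maintenance schedule. The helper names are hypothetical. It assumes that a value of 0 in obs.time_next_maintenance means the line is currently in maintenance, which is what the "true" output above suggests for line 18. It cannot tell an agent action from an attack, since both only show up as a cooldown.

```python
def cooldown_cause(cooldown, time_next_maintenance):
    """Guess why one powerline is in cooldown (illustrative helper, not a grid2op API)."""
    if cooldown <= 0:
        return "none"
    # Assumption: 0 in time_next_maintenance means "maintenance is ongoing".
    if time_next_maintenance == 0:
        return "maintenance"
    # The cooldown alone cannot separate these two cases.
    return "action_or_attack"


def summarize_cooldowns(cooldowns, next_maintenances):
    """Map each line id that is in cooldown to its guessed cause."""
    return {
        line_id: cooldown_cause(cd, tnm)
        for line_id, (cd, tnm) in enumerate(zip(cooldowns, next_maintenances))
        if cd > 0
    }
```

With an observation, this would be called as summarize_cooldowns(obs.time_before_cooldown_line, obs.time_next_maintenance).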

So yes, you have the impression that "line maintenance seems to be treated as an attack, which is reflected in the attack duration" because you only looked at the cooldown, which also covers maintenance and agent actions.

However, you are right: it appears that maintenance, on its first time step, is not correctly taken into account in simulate (there is a difference of 1 time step).
As a workaround, you can for example manually disconnect the affected powerlines when you "simulate".
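
A minimal sketch of that workaround, under stated assumptions: the helper below is hypothetical (not a grid2op function) and only builds a "set_line_status"-style vector, with -1 for every line whose maintenance starts within the given horizon and 0 (no change) elsewhere. It deliberately skips lines already in maintenance (value 0), since acting on a line in cooldown could make the action illegal.

```python
def maintenance_disconnections(time_next_maintenance, horizon=1):
    """Build a set_line_status-style vector for lines about to enter maintenance.

    -1 means "disconnect this line", 0 means "do not change its status".
    Illustrative helper only; it is not part of grid2op.
    """
    return [-1 if 1 <= int(t) <= horizon else 0 for t in time_next_maintenance]


# Hypothetical usage with grid2op (not executed here), reusing `env` and `obs`
# from the snippet in the issue:
#
#   import numpy as np
#   vect = np.array(maintenance_disconnections(obs.time_next_maintenance), dtype=int)
#   act = env.action_space({"set_line_status": vect})
#   sim_obs, sim_reward, sim_done, sim_info = obs.simulate(act)
```

To simulate another action on top of this, the disconnections would need to be merged into that action's own line status vector; this sketch only covers the do-nothing case.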

@BDonnot BDonnot changed the title line maintenance are treated as attack, unexpected behavior in obs.simulate Line that will be in maintenance next time step are not taken into account in the "simulate" function Sep 16, 2020
BDonnot added a commit to BDonnot/Grid2Op that referenced this issue Sep 16, 2020
@DesmondZhong
Author

Thanks for your explanation! It's good to know that the behavior of simulate in 1.2.2 does not take maintenance into account and I guess it will remain as it is in the current competition environment.

Actually, I kind of like the wrong behavior of the "simulate" function, since it could make my RL agent easier to code. I know that, given the definition of "simulate", you probably want to fix it so that it takes maintenance into account. Maybe it would be a good idea to make the fix the default while retaining an option to ignore maintenance. I don't know whether other people want this feature.

Anyway, thanks for addressing this issue! I'll close it since it has been fixed.

BDonnot added a commit that referenced this issue Feb 7, 2022
Some improvments, mainly for gym_compat