Unable to run l2rpn_neurips_2020_track1_small using Rllib integration code #196
Comments
Hello, I will have a look at the first error. It seems that the upper / lower bounds of the observation space are too tight. I will try to reproduce it and fix it as soon as possible. Given the error you provided, that does not seem to be the actual problem, though (I mean: the observation given appears to be inside the observation space, so this might, and I stress might, be a ray / rllib or a gym error rather than a grid2op error). For the second one, I am aware that multiprocessing does not work well on macOS / Windows due to the internal workings of that package.
There is little I can do about it for now, unfortunately. It works fine on Linux-based machines, and on macOS with Python 3.6 or 3.7 I believe; they "broke" it starting from Python 3.8 on macOS if I recall correctly. There is a good chance the code works if you run it outside a Jupyter notebook, though (multiprocessing in a Jupyter notebook on macOS and Windows pretty much never works, see e.g. https://stackoverflow.com/questions/23641475/multiprocessing-working-in-python-but-not-in-ipython/23641560#23641560 for a more detailed explanation).
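If you do move it to a standalone script, the usual trick to make multiprocessing behave with the "spawn" start method (the default on Windows, and on macOS from Python 3.8) is to guard the entry point. A minimal sketch, purely for illustration (the script name and the worker function are placeholders, not grid2op code):

# run_training.py  (hypothetical standalone script, not part of grid2op)
import multiprocessing as mp

def run_one_worker(seed):
    # placeholder for the real work (e.g. building the environment and training)
    return seed * 2

if __name__ == "__main__":
    # "spawn" re-imports this module in every child process, so everything that
    # starts workers must live behind this guard
    mp.set_start_method("spawn", force=True)
    with mp.Pool(processes=2) as pool:
        print(pool.map(run_one_worker, [0, 1]))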
Hi, the issue with the "obs not in obs space" error (or something like that) is indeed due to the declared bounds on "gen_p" and "load_p" being too tight.
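If you want to check on your side which attributes are at fault, something like the small helper below works for any gym "Dict"-like observation space (find_out_of_bounds is just a hypothetical name, it is not part of grid2op):

def find_out_of_bounds(gym_env):
    """Print the observation attributes that fall outside their declared bounds."""
    obs = gym_env.reset()
    for key, subspace in gym_env.observation_space.spaces.items():
        if not subspace.contains(obs[key]):
            print(f"'{key}' is outside the bounds declared by {subspace}")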
You can fix them with:

import gym
import numpy as np


class MyEnv(gym.Env):
    def __init__(self, env_config):
        import grid2op
        from grid2op.gym_compat import GymEnv
        from grid2op.gym_compat import ScalerAttrConverter, ContinuousToDiscreteConverter, MultiToTupleConverter
        from lightsim2grid import LightSimBackend

        # 1. create the grid2op environment
        if "env_name" not in env_config:
            raise RuntimeError("The configuration for RLLIB should provide the env name")
        nm_env = env_config["env_name"]
        del env_config["env_name"]
        self.env_glop = grid2op.make(nm_env, backend=LightSimBackend(), **env_config)
        self.env_glop.deactivate_forecast()

        # 2. create the gym environment
        self.env_gym = GymEnv(self.env_glop)
        obs_gym = self.env_gym.reset()

        # 3. (optional) customize it (see section above for more information)
        ## customize action space
        self.env_gym.action_space = self.env_gym.action_space.ignore_attr("set_bus").ignore_attr("set_line_status")
        self.env_gym.action_space = self.env_gym.action_space.reencode_space(
            "redispatch", ContinuousToDiscreteConverter(nb_bins=11))
        self.env_gym.action_space = self.env_gym.action_space.reencode_space(
            "change_bus", MultiToTupleConverter())
        self.env_gym.action_space = self.env_gym.action_space.reencode_space(
            "change_line_status", MultiToTupleConverter())
        self.env_gym.action_space = self.env_gym.action_space.reencode_space(
            "redispatch", MultiToTupleConverter())

        ## customize observation space
        ob_space = self.env_gym.observation_space
        ob_space = ob_space.keep_only_attr(["rho", "gen_p", "load_p", "topo_vect", "actual_dispatch"])
        ob_space = ob_space.reencode_space(
            "actual_dispatch", ScalerAttrConverter(substract=0., divide=self.env_glop.gen_pmax))
        ob_space = ob_space.reencode_space(
            "gen_p", ScalerAttrConverter(substract=0., divide=self.env_glop.gen_pmax))
        ob_space = ob_space.reencode_space(
            "load_p", ScalerAttrConverter(substract=obs_gym["load_p"], divide=0.5 * obs_gym["load_p"]))
        self.env_gym.observation_space = ob_space

        # 4. specific to rllib
        self.action_space = self.env_gym.action_space
        self.observation_space = self.env_gym.observation_space
        self.step_count = 0

        # workaround for the "obs not in obs space" error: relax the bounds
        # that are too tight for "gen_p" and "load_p" after scaling
        self.observation_space["gen_p"].low[:] = -np.inf
        self.observation_space["gen_p"].high[:] = np.inf
        self.observation_space["load_p"].low[:] = -np.inf
        self.observation_space["load_p"].high[:] = np.inf

    def reset(self):
        obs = self.env_gym.reset()
        self.step_count = 0
        return obs

    def step(self, action):
        self.step_count += 1
        obs, reward, done, info = self.env_gym.step(action)
        return obs, reward, done, info

A fix (partial for the first issue) will come soon. The "complete" fix for the issue is to put …
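In case it helps, here is roughly how the MyEnv wrapper above gets handed over to RLlib (only a sketch, assuming a ray 1.x style API where the PPO trainer lives in ray.rllib.agents.ppo; the number of workers and of training iterations are placeholders):

import ray
from ray.rllib.agents import ppo

if __name__ == "__main__":
    ray.init()
    trainer = ppo.PPOTrainer(
        env=MyEnv,
        config={
            "env_config": {"env_name": "l2rpn_neurips_2020_track1_small"},
            "num_workers": 1,
        },
    )
    for _ in range(3):
        result = trainer.train()
        print(result["episode_reward_mean"])
    ray.shutdown()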
The code is working on Linux now.
Environment
1.5.1
Bug description
The code works for l2rpn-sandbox but produces an error for the l2rpn_neurips_2020_track1_small environment.
Code snippet
The output of the code snippet above (in Colab):
The output of the code snippet above (on a local machine):