Feature request for adding Dict type observation space #947
Duplicate of e.g. #100. There is no support for Tuple/Dict spaces, but support is planned for stable-baselines3. In your case you could concatenate the actions into one longer vector and split them later in the environment (e.g. first 100 for "position", next 100 for "price"). Also check out the tips on custom envs, especially on rescaling the values. You may close this issue if there are no further issues/enhancements related to stable-baselines.
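The concatenate-and-split workaround described above can be sketched like this (a minimal illustration with numpy; the helper name and the component sizes of 100 each are hypothetical, taken from the example in the comment):

```python
import numpy as np

def split_action(action, position_size=100):
    """Split one flat action vector back into its named components.
    Hypothetical helper: the first `position_size` entries are "position",
    the rest are "price", matching the example above."""
    action = np.asarray(action)
    position, price = np.split(action, [position_size])
    return {"position": position, "price": price}
```

Inside the environment's `step()`, you would call this on the incoming flat action before using the components.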
Yes, the wrapper is called
Thanks!
Any advice on which one is preferred?
I have a similar problem: I planned to use spaces.Tuple(spaces.Box(), spaces.MultiBinary()) as my observation space. It seems I had to concatenate them into a matrix and use spaces.Box instead. I want to know: did you solve it this way, or did you come up with other ideas for the observation space? Does it work?
The "flattening into a box" approach works well for observations; there are no bigger issues there as long as you normalize your values appropriately (see the docs and RL tips on that). The biggest snag is if you have e.g. images and 1D vector data, and you want to pass the image through convolutions and concatenate the 1D data later.
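The flattening approach mentioned above can be sketched as follows (a minimal illustration, assuming a uint8 image in [0, 255] and a float feature vector; the helper name is made up and is not part of stable-baselines):

```python
import numpy as np

def flatten_obs(image, vector):
    """Flatten an image and a 1D feature vector into one normalized
    1D array suitable for a flat spaces.Box observation space.
    Hypothetical helper: assumes a uint8 image scaled to [0, 1]."""
    image_part = image.astype(np.float32).ravel() / 255.0
    vector_part = np.asarray(vector, dtype=np.float32)
    return np.concatenate([image_part, vector_part])
```

The matching observation space would be a single `spaces.Box` whose length is the image size plus the vector size.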
@Miffyli Thanks a lot! But when it comes to more complex cases, e.g. images and 1D vector data, I need to:
I referred to your answer in , and I'm really new to DQN and gym, trying to figure out how I can properly fit the input into the correct space form. I guess maybe these are silly questions (or, if there is any project in a similar case you can recommend, I could learn from it).
Yes, you need to take a few extra steps, as described in the issue you linked. You do not need the FlattenObservation wrapper; instead you need to manually add the 1D vector to a final channel of the image. I used the following wrapper along with the network modules I linked in the other thread. Note that we do not offer tech support, so unfortunately I won't be answering further questions about "how to do X" unless they are specific stable-baselines issues ^^'

```python
import gym
import numpy as np
from gym import spaces


class AppendFeaturesToImageWrapper(gym.Wrapper):
    """
    Append direct features to the image observation on the last channel.
    Assumes the underlying observation space is a Tuple of (image_obs, feature_obs).
    """

    def __init__(self, env):
        super().__init__(env)
        # Check that observation_space is valid
        if not isinstance(env.observation_space, spaces.Tuple) or len(env.observation_space.spaces) != 2:
            raise ValueError("Underlying observation_space should be a tuple of two spaces")
        image_space, feature_space = env.observation_space.spaces
        self.image_height = image_space.shape[0]
        self.image_width = image_space.shape[1]
        self.original_channels = image_space.shape[2]
        self.num_image_values = self.image_height * self.image_width
        self.num_features = feature_space.shape[0]
        self.num_padding = self.num_image_values - self.num_features
        # Make sure the image is large enough to store the direct features
        if self.num_padding < 0:
            raise ValueError("Images are too small to contain all features ({})".format(
                self.num_features
            ))
        self.observation_space = spaces.Box(
            low=0,
            high=1,
            shape=(self.image_height, self.image_width, self.original_channels + 1),
            dtype=np.float32,
        )

    def _append_features_to_image(self, image, features):
        """
        Append features on a new channel in the image.
        """
        # Zero-pad the features to the number of values in one image
        # channel, reshape to image size and append as a new channel
        features = np.concatenate((
            features,
            np.zeros((self.num_padding,), dtype=np.float32)
        ))
        features = features.reshape((self.image_height, self.image_width, 1))
        image = np.concatenate((image, features), axis=2)
        return image

    def step(self, action):
        obs, reward, terminal, info = self.env.step(action)
        obs = self._append_features_to_image(obs[0], obs[1])
        return obs, reward, terminal, info

    def reset(self, **kwargs):
        obs = self.env.reset(**kwargs)
        obs = self._append_features_to_image(obs[0], obs[1])
        return obs
```
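The pad-and-reshape trick the wrapper uses can be shown in isolation, independent of gym (a sketch; the helper name is hypothetical):

```python
import numpy as np

def append_features_channel(image, features):
    """Zero-pad `features` to H*W values, reshape them to (H, W, 1),
    and append the result as an extra image channel.
    Hypothetical standalone version of the wrapper's core step."""
    height, width, _ = image.shape
    padded = np.zeros(height * width, dtype=np.float32)
    padded[:features.size] = features
    return np.concatenate([image, padded.reshape(height, width, 1)], axis=2)
```

The features end up row-by-row in the top-left of the new channel, with zeros filling the rest, so a convolutional network receives them alongside the image data.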
@Miffyli Thanks a lot!!
I want to train a model in which the observation space is like
in gym language. But it seems the stable_baselines repo has not yet added Dict: https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/input.py
I'd like some advice on implementing a Dict-type observation space in stable_baselines; otherwise, any easy workarounds are welcome.
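Until Dict spaces are supported, one common workaround (consistent with the flattening advice earlier in this thread) is to concatenate the dict's values into a single vector with a fixed key order, and use a flat spaces.Box instead. A sketch, with a made-up helper name:

```python
import numpy as np

def dict_obs_to_vector(obs_dict):
    """Concatenate dict observation values in sorted key order,
    so the layout is deterministic across steps.
    Hypothetical helper, not part of stable-baselines."""
    keys = sorted(obs_dict)
    return np.concatenate(
        [np.asarray(obs_dict[k], dtype=np.float32).ravel() for k in keys]
    )
```

The environment would apply this in `step()`/`reset()` and declare a single `spaces.Box` of the combined length as its observation space.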