Feature request for adding Dict type observation space #947
Duplicate of e.g. #100. There is no support for Tuple/Dict spaces, but support is planned for stable-baselines3. In your case you could concatenate the actions into one longer vector and split them later in the environment (e.g. first 100 for "position", next 100 for "price"). Also check out the tips on custom envs, especially on rescaling the values. You may close this issue if there are no further issues/enhancements related to stable-baselines.
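The concatenate-and-split workaround described above can be sketched like this (a minimal illustration with numpy; the helper name and the component sizes of 100 each are hypothetical, taken from the example in the comment):

```python
import numpy as np

def split_action(action, position_size=100):
    """Split one flat action vector back into its named components.
    Hypothetical helper: the first `position_size` entries are "position",
    the rest are "price", matching the example above."""
    action = np.asarray(action)
    position, price = np.split(action, [position_size])
    return {"position": position, "price": price}
```

Inside the environment's `step()`, you would call this on the incoming flat action before using the components.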
Yes, the wrapper is called
Thanks!
Any advice on which one is preferred?
I have a similar problem: I planned to use spaces.Tuple(spaces.Box(), spaces.MultiBinary()) as my observation space. It seems I had to concatenate them into a matrix and use spaces.Box instead. I want to know: did you solve it this way, or did you come up with other ideas for the observation space? Does it work?
The "flattening into a box" approach works well for observations; there are no bigger issues there as long as you normalize your values appropriately (see the docs and RL tips on that). The biggest snag is if you have e.g. images and 1D vector data, and you want to pass the image through convolutions and concatenate the 1D data later.
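The flattening approach mentioned above can be sketched as follows (a minimal illustration, assuming a uint8 image in [0, 255] and a float feature vector; the helper name is made up and is not part of stable-baselines):

```python
import numpy as np

def flatten_obs(image, vector):
    """Flatten an image and a 1D feature vector into one normalized
    1D array suitable for a flat spaces.Box observation space.
    Hypothetical helper: assumes a uint8 image scaled to [0, 1]."""
    image_part = image.astype(np.float32).ravel() / 255.0
    vector_part = np.asarray(vector, dtype=np.float32)
    return np.concatenate([image_part, vector_part])
```

The matching observation space would be a single `spaces.Box` whose length is the image size plus the vector size.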
@Miffyli Thanks a lot! But when it comes to more complex cases, e.g. images and 1D vector data, I need to:
I referred to your answer in , and I'm really new to DQN and gym, trying to figure out how I can properly fit the input into the correct space form. I guess maybe these are silly questions (or, if there is any project in a similar case you can recommend, I could learn from it).
Yes, you need to take a few extra steps, as described in the issue you linked. You do not need the FlattenObservation wrapper; instead you need to manually add the 1D vector to a final channel of the image. I used the following wrapper along with the network modules I linked in the other thread. Note that we do not offer tech support, so unfortunately I won't be answering further questions about "how to do X" unless they are specific stable-baselines issues ^^'

```python
import gym
import numpy as np
from gym import spaces


class AppendFeaturesToImageWrapper(gym.Wrapper):
    """
    Append direct features to the image observation on the last channel.
    Assumes the underlying observation space is a Tuple of (image_obs, feature_obs).
    """

    def __init__(self, env):
        super().__init__(env)
        # Check that observation_space is valid
        if not isinstance(env.observation_space, spaces.Tuple) or len(env.observation_space.spaces) != 2:
            raise ValueError("Underlying observation_space should be a tuple of two spaces")
        image_space, feature_space = env.observation_space.spaces
        self.image_height = image_space.shape[0]
        self.image_width = image_space.shape[1]
        self.original_channels = image_space.shape[2]
        self.num_image_values = self.image_height * self.image_width
        self.num_features = feature_space.shape[0]
        self.num_padding = self.num_image_values - self.num_features
        # Make sure the image is large enough to store the direct features
        if self.num_padding < 0:
            raise ValueError("Images are too small to contain all features ({})".format(
                self.num_features
            ))
        self.observation_space = spaces.Box(
            low=0,
            high=1,
            shape=(self.image_height, self.image_width, self.original_channels + 1),
            dtype=np.float32,
        )

    def _append_features_to_image(self, image, features):
        """
        Append features on a new channel in the image.
        """
        # Zero-pad the features to the number of values in one image
        # channel, reshape to image size and append as a new channel
        features = np.concatenate((
            features,
            np.zeros((self.num_padding,), dtype=np.float32)
        ))
        features = features.reshape((self.image_height, self.image_width, 1))
        image = np.concatenate((image, features), axis=2)
        return image

    def step(self, action):
        obs, reward, terminal, info = self.env.step(action)
        obs = self._append_features_to_image(obs[0], obs[1])
        return obs, reward, terminal, info

    def reset(self, **kwargs):
        obs = self.env.reset(**kwargs)
        obs = self._append_features_to_image(obs[0], obs[1])
        return obs
```
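The pad-and-reshape trick the wrapper uses can be shown in isolation, independent of gym (a sketch; the helper name is hypothetical):

```python
import numpy as np

def append_features_channel(image, features):
    """Zero-pad `features` to H*W values, reshape them to (H, W, 1),
    and append the result as an extra image channel.
    Hypothetical standalone version of the wrapper's core step."""
    height, width, _ = image.shape
    padded = np.zeros(height * width, dtype=np.float32)
    padded[:features.size] = features
    return np.concatenate([image, padded.reshape(height, width, 1)], axis=2)
```

The features end up row-by-row in the top-left of the new channel, with zeros filling the rest, so a convolutional network receives them alongside the image data.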
@Miffyli Thanks a lot!!
I want to train a model in which the observation space is like
in gym language. But it seems the stable_baselines repo has not yet added Dict: https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/input.py
I'd like some advice on implementing a Dict-type observation space in stable_baselines; otherwise, any easy workarounds are welcome.
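Until Dict spaces are supported, one common workaround (consistent with the flattening advice earlier in this thread) is to concatenate the dict's values into a single vector with a fixed key order, and use a flat spaces.Box instead. A sketch, with a made-up helper name:

```python
import numpy as np

def dict_obs_to_vector(obs_dict):
    """Concatenate dict observation values in sorted key order,
    so the layout is deterministic across steps.
    Hypothetical helper, not part of stable-baselines."""
    keys = sorted(obs_dict)
    return np.concatenate(
        [np.asarray(obs_dict[k], dtype=np.float32).ravel() for k in keys]
    )
```

The environment would apply this in `step()`/`reset()` and declare a single `spaces.Box` of the combined length as its observation space.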