
Feature request for adding Dict type observation space #947

Closed
sophiagu opened this issue Jul 14, 2020 · 8 comments
Labels
duplicate (This issue or pull request already exists), question (Further information is requested)

Comments

@sophiagu commented Jul 14, 2020

I want to train a model in which the observation space is like

observation_space = spaces.Dict({
    'position': spaces.Box(low=-100, high=100, shape=()),
    'price': spaces.Box(low=0, high=100, shape=()),
})

in gym terms. But it seems the stable_baselines repo has not yet added Dict support: https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/input.py

I'd like some advice on implementing a Dict type observation space in stable_baselines; otherwise, any easy workaround is welcome.

@Miffyli Miffyli added duplicate This issue or pull request already exists question Further information is requested labels Jul 15, 2020
@Miffyli (Collaborator) commented Jul 15, 2020

Duplicate of e.g. #100

There is no support for Tuple/Dict spaces yet, but support is planned for stable-baselines3. In your case you could concatenate the observations into one longer vector and split them back inside the environment (e.g. first 100 for "position", next 100 for "price"); a sketch follows below. Also check out the tips on custom envs, especially on rescaling the values.
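For illustration, a minimal sketch of that workaround for the two scalar observations in the original question (the bounds and the split helper are illustrative, not stable-baselines API):

import numpy as np
from gym import spaces

# One flat Box whose bounds are the per-component bounds, concatenated:
# index 0 <-> "position" in [-100, 100], index 1 <-> "price" in [0, 100]
observation_space = spaces.Box(
    low=np.array([-100.0, 0.0], dtype=np.float32),
    high=np.array([100.0, 100.0], dtype=np.float32),
    dtype=np.float32,
)

def split_observation(obs):
    # Recover the named components from the flat vector inside the env
    return {"position": obs[0], "price": obs[1]}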

You may close this issue if there are no further issues/enhancements related to stable-baselines.

@araffin (Collaborator) commented Jul 15, 2020

> In your case you could concatenate the observations into one longer vector and split them back inside the environment

Yes, the wrapper is called FlattenObservation: https://github.com/openai/gym/blob/master/gym/wrappers/flatten_observation.py
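A minimal usage sketch, assuming a custom env that exposes the Dict observation space from the original post (MyDictObsEnv is a placeholder name):

from gym.wrappers import FlattenObservation

# Wrap the Dict-observation env; observations become a flat Box vector
env = FlattenObservation(MyDictObsEnv())
print(env.observation_space)  # e.g. Box(2,) for the two scalar entries
obs = env.reset()             # now a 1D np.ndarray instead of a dict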

@sophiagu (Author)

Thanks!
I also tried the following:

observation_space = spaces.Box(
    low=np.array([-100 * self.length, 0]),
    high=np.array([100 * self.length, self.equilibrium_price * 10.**self.high]),
    shape=(2,))

Any advice on which one is preferred?

@berlintofind commented Jul 25, 2020

I have a similar problem: I planned to use spaces.Tuple(spaces.Box(), spaces.MultiBinary()) for my observation space. It seems I have to concatenate them into a matrix and use spaces.Box instead. Did you solve it that way, or did you come up with other ideas for the observation space? Does it work?
Also, I don't understand why I would still need FlattenObservation() to go one step further. Since Box is supported as an observation space, concatenating the matrix and vector (padding with zeros) into one Box should be enough, I think. Why should I flatten the matrix-form box into a vector?

@Miffyli (Collaborator) commented Jul 25, 2020

@berlintofind

The "flattening into a box" works well for observations, no bigger issues there as long you normalize your values appropriately (see docs and rl tips on that). Biggest nag is if you have e.g. images and 1D vector data, and you want to pass image through convolutions and concatenate 1D data later.

@berlintofind
@Miffyli Thanks a lot!
So if I understand correctly, flattening into a box (matrix form) should work fine for common cases.

But when it comes to more complex cases, e.g. images and 1D vector data, I need to:

  1. flatten the image data into 1D data using FlattenObservation(), concatenate it with the 1D vector, and feed the combined vector as the state of the env.
  2. split the augmented observation back into the actual image and the direct features (e.g. 1D sensor data), and process the image and the 1D data separately.
  3. concatenate the image and direct features again to prepare for the next state.

I was referring to your answer in , as I'm really new to DQN and gym and trying to figure out how to fit the input into the correct space form. These may be silly questions (or if there is a project with a similar setup you can recommend, I could learn from it).
Anyway, thanks in advance!

@Miffyli (Collaborator) commented Jul 25, 2020

Yes, you need to take a few extra steps, as described in the issue you linked. You do not need the FlattenObservation wrapper; rather, you need to manually add the 1D vector to a final channel of the image. I used the following wrapper along with the network modules I linked in the other thread.

Note that we do not offer tech support, so unfortunately I won't be answering further questions regarding "how to do X", unless they are specific stable-baselines issues ^^'

import gym
import numpy as np
from gym import spaces


class AppendFeaturesToImageWrapper(gym.Wrapper):
    """
    Append direct features to the image observation on a new last channel.

    Assumes the underlying observation space is a Tuple of (image_obs, feature_obs).
    """

    def __init__(self, env):
        super().__init__(env)

        # Check that observation_space is valid
        if not isinstance(env.observation_space, spaces.Tuple) or len(env.observation_space.spaces) != 2:
            raise ValueError("Underlying observation_space should be a tuple of two spaces")

        # Cache image and feature dimensions from the two sub-spaces
        image_space, feature_space = env.observation_space.spaces
        self.image_height = image_space.shape[0]
        self.image_width = image_space.shape[1]
        self.original_channels = image_space.shape[2]
        self.num_image_values = self.image_height * self.image_width
        self.num_features = feature_space.shape[0]

        self.num_padding = self.num_image_values - self.num_features

        # Make sure image is large enough to store the direct features
        if self.num_padding < 0:
            raise ValueError("Images are too small to contain all features ({})".format(
                self.num_features
            ))

        # Assumes image values are scaled to [0, 1]; dtype avoids gym's autodetect warning
        self.observation_space = spaces.Box(
            low=0,
            high=1,
            shape=(self.image_height, self.image_width, self.original_channels + 1),
            dtype=np.float32
        )

    def _append_features_to_image(self, image, features):
        """
        Append features on a new channel of the image
        """
        # Turn features to same size as number of values in image channel,
        # resize and append to image

        features = np.concatenate((
            features,
            np.zeros((self.num_padding,), dtype=np.float32)
        ))

        features = features.reshape((self.image_height, self.image_width, 1))

        image = np.concatenate((image, features), axis=2)

        return image

    def step(self, action):
        obs, reward, terminal, info = self.env.step(action)
        obs = self._append_features_to_image(obs[0], obs[1])
        return obs, reward, terminal, info

    def reset(self, **kwargs):
        obs = self.env.reset(**kwargs)
        obs = self._append_features_to_image(obs[0], obs[1])
        return obs
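For context, a hypothetical usage of the wrapper above (MyTupleObsEnv stands in for an env whose observation_space is Tuple((Box(H, W, C), Box(N,)))):

# Hypothetical usage; the env name is a placeholder
env = AppendFeaturesToImageWrapper(MyTupleObsEnv())
obs = env.reset()
print(obs.shape)  # (H, W, C + 1): the original channels plus one feature channel

Padding the features out to a full H x W channel keeps the observation a single Box, so standard CNN policies accept it; a custom network can then slice the last channel back off and route the features past the convolutions.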

@berlintofind
@Miffyli Thanks a lot!!
I will study it carefully ^-^
