
[question] [feature request] support for Dict and Tuple spaces #133

Closed

AloshkaD opened this issue Dec 15, 2018 · 37 comments

Labels: enhancement (New feature or request), question (Further information is requested), v3 (Discussion about V3)

Comments

@AloshkaD commented Dec 15, 2018

I want to train using two images from different cameras and an array of 1-D data from a sensor. I'm passing these inputs as my env state. Obviously I need a CNN that can take those inputs, concatenate them, and train on them. My question is how to pass these inputs to such a custom CNN in policies.py. Also, I tried to pass two images and apparently dummy_vec_env.py had trouble with that:
obs = env.reset()
  File "d:\resources\stable-baselines\stable_baselines\common\vec_env\dummy_vec_env.py", line 57, in reset
    self._save_obs(env_idx, obs)
  File "d:\resources\stable-baselines\stable_baselines\common\vec_env\dummy_vec_env.py", line 75, in _save_obs
    self.buf_obs[key][env_idx] = obs
ValueError: cannot copy sequence with size 2 to array axis with dimension 80

I appreciate any thoughts or examples.

@araffin (Collaborator) commented Dec 16, 2018

Hello,

Could you please provide a minimal code example to reproduce the error?

@AloshkaD (Author) commented Dec 16, 2018

@araffin For simplicity, say I need to pass two RGB images (observations from two cameras onboard a robot) of size (80, 160, 4) as states, like this:

class MyCustomEnv(gym.Env):

    def __init__(self):
        self.observation_space = spaces.Box(low=0, high=255, shape=(80, 160, 4), dtype=np.float32)
        self.state = (np.zeros((80, 160, 4), dtype=np.uint8), np.zeros((80, 160, 4), dtype=np.uint8))
        ...

    def step(self, action):
        ...
        self.state = self.rgbimage_1, self.rgbimage_2
        return self.state, reward, done, info

    def reset(self):
        ...
        self.state = self.rgbimage_1, self.rgbimage_2
        return self.state
I hope this is good enough. I also suspect my definition of the observation_space might not be correct, but I tried different methods to define an observation space for two images and nothing worked. I saw that you are a contributor here, and I hope you will be able to help with defining the ob_space too.
For the record, I tried to build an observation space like this:

self.nested_observation_space = spaces.Dict({
    'sensors': spaces.Dict({
        # 'position': spaces.Box(low=-100, high=100, shape=(3,)),
        # 'velocity': spaces.Box(low=-1, high=1, shape=(3,)),
        'front_cam': spaces.Tuple((
            spaces.Box(low=0, high=255, shape=(80, 160, 4)),
            spaces.Box(low=0, high=255, shape=(80, 160, 4))
        )),
    })
})
but that didn't work either, and returned the error:

env = DummyVecEnv([lambda: env])  # The algorithms require a vectorized environment to run
  File "d:\stable-baselines\stable_baselines\common\vec_env\dummy_vec_env.py", line 31, in __init__
    shapes[key] = box.shape
For simplicity I passed this and still got the same error:

self.nested_observation_space = spaces.Tuple((
    spaces.Box(low=0, high=255, shape=(80, 160, 4)),
    spaces.Box(low=0, high=255, shape=(80, 160, 4))
))

I can send you the complete class if you like.
Thanks

@AloshkaD (Author) commented Dec 17, 2018

The problem appears to be with vectorizing the env. I get:

  File "d:\stable-baselines\stable_baselines\common\vec_env\dummy_vec_env.py", line 35, in __init__
    self.buf_obs = {k: np.zeros((self.num_envs,) + tuple(shapes[k]), dtype=dtypes[k]) for k in self.keys}
  File "d:\stable-baselines\stable_baselines\common\vec_env\dummy_vec_env.py", line 35, in <dictcomp>
    self.buf_obs = {k: np.zeros((self.num_envs,) + tuple(shapes[k]), dtype=dtypes[k]) for k in self.keys}
TypeError: 'NoneType' object is not iterable

when defining the observation space like this:

self.observation_space = spaces.Tuple((
    spaces.Box(low=0, high=255, shape=(80, 160, 4), dtype=np.uint8),
    spaces.Box(low=0, high=255, shape=(80, 160, 4), dtype=np.uint8)
))

@araffin (Collaborator) commented Dec 17, 2018

Hello,
Dict and Tuple spaces are not supported as observation spaces. Did you try concatenating the images along the channel axis?

@AloshkaD (Author) commented Dec 18, 2018

I could concatenate the images and then separate them when they are fed to the CNN. I could also pad the sensor signal with zeros and concatenate it as an extra channel. I'm worried about the scalability of this approach, though.
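Roughly, the packing I have in mind (a sketch with illustrative names; it assumes two (80, 160, 4) images and a 1-D sensor vector short enough to fit in one extra channel):

import numpy as np

def pack_observation(img_1, img_2, sensor_data):
    """Concatenate two (80, 160, 4) images along the channel axis and
    append the 1-D sensor signal as one extra zero-padded channel."""
    h, w, _ = img_1.shape
    sensor_channel = np.zeros((h, w, 1), dtype=img_1.dtype)
    # Write the sensor values into the (row-major) start of the extra channel
    sensor_channel.flat[:len(sensor_data)] = sensor_data
    # Resulting shape: (80, 160, 4 + 4 + 1) = (80, 160, 9)
    return np.concatenate([img_1, img_2, sensor_channel], axis=-1)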

@pulver22 commented:

> Dict and Tuple spaces are not supported as observation spaces. Did you try concatenating the images along the channel axis?

Why aren't they supported? I would also like to pass an image plus scalars as input to the policy, and at the current stage this is not possible. I don't know whether it is more convenient to write code for this or to just append a vector of scalars at the end of the image and separate it later.

@hill-a (Owner) commented Dec 19, 2018

Hey,

@pulver22 Well, Tuple spaces could be supported with some effort (IIRC you can feed tuples into the feed_dict with a tf.concat of placeholders).

However Dict would require quite a bit of reworking for it to be compatible with all the models, as each placeholder for each tensor would be called by name, and not by sequential order.

EDIT: if anyone can see a quick hack that could work in https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/input.py without changing anything else, it would be awesome to hear from you.

EDIT2: Just tried the Tuple with tf.concat and tf.stack, and it doesn't seem to want to play nice. Which makes sense when you think about it: to concatenate an image of 90x128 integers with 4 floating point values, the code would need to flatten the input and make it all floating point numbers, and that would only work with MLP policies.

@AloshkaD (Author) commented Dec 19, 2018

@hill-a There seems to be a hack proposed by @Atcold here, but it does not seem to generalize to all envs:
Atcold@dbc329f

@AloshkaD changed the title from "[question] how to go about an env with two input images and an array" to "[question] [feature request] support for Dict and Tuple spaces" on Dec 19, 2018
@araffin added the enhancement (New feature or request), question (Further information is requested), and help wanted (Help from contributors is needed) labels on Dec 20, 2018
@hill-a (Owner) commented Dec 21, 2018

@AloshkaD That seems to be more of an alteration of the models, which is exactly what I would like to avoid doing, as it might generate more unforeseen issues and bugs when changing all the models in such a way.

I was hoping to be able to simply change the input parsing code (stable_baselines/common/input.py) that almost all the models use.

However, if this is unlikely to be possible, then a redesign of the return type of the input parsing code might be a more viable solution to this problem.

@araffin removed the help wanted (Help from contributors is needed) label on Dec 21, 2018
@AloshkaD (Author) commented:

Agreed! Thank you @hill-a and @araffin.

@Atcold commented Jan 3, 2019

Sorry, I've been away these past two weeks...
Thanks @AloshkaD for the ping, btw.

What you found is a working hack.
Currently, in order to avoid headaches while pulling the latest master, I've resorted to reshaping all my observations into 1-D tensors (long vectors) and concatenating them all. Later on, in my neural net, I take the observation apart and send the different parts to different encoders. See traffic_models.py.
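In pseudo-form, the trick looks like this (a sketch with hypothetical sizes, not the actual traffic_models.py code):

import numpy as np

IMG_SIZE = 80 * 160 * 4  # hypothetical per-image size

def flatten_observation(img_1, img_2, sensor_data):
    """Flatten every component and concatenate into one long 1-D vector."""
    return np.concatenate(
        [img_1.ravel(), img_2.ravel(), sensor_data]).astype(np.float32)

def split_observation(obs):
    """Take the observation apart again inside the network code."""
    img_1 = obs[:IMG_SIZE].reshape(80, 160, 4)
    img_2 = obs[IMG_SIZE:2 * IMG_SIZE].reshape(80, 160, 4)
    sensor_data = obs[2 * IMG_SIZE:]
    return img_1, img_2, sensor_data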

@AloshkaD (Author) commented Jan 3, 2019

@Atcold thank you. Similarly, concatenating images on the channel axis worked for me, but it caused many issues with TensorBoard logging. The logging expects an image with at most 4 channels, and passing 6 channels makes it fail. Even if I initialize the empty tensor to the right shape, the incoming images have 6 channels. I'm going to dedicate more time to fixing this issue over the weekend.

@Atcold commented Jan 16, 2019

@AloshkaD, you can always reshape your data before logging it.
From the TensorFlow documentation we have that:

The summary has up to max_outputs summary values containing images. The images are built from tensor which must be 4-D with shape [batch_size, height, width, channels] and where channels can be:

  • 1: tensor is interpreted as Grayscale.
  • 3: tensor is interpreted as RGB.
  • 4: tensor is interpreted as RGBA.

You can use channels = 1 and width = 6 * original_width, so a simple reshape should be sufficient.
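For example (a sketch, assuming a (batch, 80, 160, 6) observation tensor; the transpose lays the six channels out side by side instead of interleaving them):

import tensorflow as tf

# obs: (batch, 80, 160, 6) -- too many channels for tf.summary.image
obs_t = tf.transpose(obs, (0, 1, 3, 2))             # (batch, 80, 6, 160)
obs_gray = tf.reshape(obs_t, (-1, 80, 6 * 160, 1))  # channels tiled along the width
tf.summary.image('observation', obs_gray, max_outputs=4)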
Please, let me know if you have any other issue.

@srivatsankrishnan commented Mar 29, 2019

Hi,
Is the Dict space working with stable-baselines? I am confused, since the documentation doesn't mention it. It seems that this PR (#207) doesn't work. I see the code changes in utils.py, but the error I am getting is in stable_baselines/common/input.py, and I don't see any code that corresponds to the Dict space in input.py.

My requirement is also similar to @AloshkaD's: I want to process multiple images and measurement vectors. I am open to trying to concatenate the images. Did you pad the 1-D vector with zeros to concatenate it with the images? Do you have reference code somewhere that I can use as a starting point?

@araffin (Collaborator) commented Mar 29, 2019

> Is the Dict space working with stable-baselines?

Hi, it is mentioned here in the doc:
"Non-array spaces such as Dict or Tuple are not currently supported by any algorithm."

The PR you are referring to only adds support for the VecEnvs, not the algorithms.

@srivatsankrishnan commented:

Thanks for your quick response and clarification. I was thinking that the feature was supported but the documentation was out of date. So the workaround is basically to do what @AloshkaD did: concatenate the images across the channel axis.

@bschreck commented:

@araffin Has anyone proposed a PR to implement Tuple/Dict/etc. for the action space? I came across this in a project I'm working on: I need to specify both discrete values (which internally in the env represent indexes into an array) and continuous ones (specifying new amounts to add to the array, to simplify a bit). I'm open to working on a PR if none is in the works.

@bschreck commented:

Experimented a bit with a MultiMixedProbabilityDistribution: https://github.com/hill-a/stable-baselines/compare/master...bschreck:add-multi-mixed-proba?expand=1

Not tested at all yet.

@araffin (Collaborator) commented Apr 15, 2019

Hello,
for now, nobody is working on that.
However, there are two important things that need to be taken into account when creating a PR for that feature:

  • it should not break previous versions
  • the changes should be as minimal as possible (so the code stays readable)

@araffin (Collaborator) commented Apr 30, 2019

Small update on that topic: Dict observation spaces will be supported for HER (see #273) when using gym.GoalEnv. For now, it requires all keys to have the same type.
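For context, a gym.GoalEnv observation space looks like this (shapes are illustrative; note that all keys share the same dtype):

import numpy as np
from gym import spaces

# The three keys below are the standard gym.GoalEnv keys used by HER
observation_space = spaces.Dict({
    'observation':   spaces.Box(-np.inf, np.inf, shape=(10,), dtype=np.float32),
    'achieved_goal': spaces.Box(-np.inf, np.inf, shape=(3,), dtype=np.float32),
    'desired_goal':  spaces.Box(-np.inf, np.inf, shape=(3,), dtype=np.float32),
})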

@AloshkaD (Author) commented:

Thanks @gautams3. As @araffin mentioned, this may not work for my case, where I have images and 1-D sensor data. I'm still using the workaround in which I convert my state observations into an image with multiple channels (3 for RGB, 1 for depth, and one for each sensor) and recover the signal data before feeding them to the network. I'm using PPO2.

@nkleber1 commented:

Hello @AloshkaD,
I think your workaround is interesting. Could you please explain how you recover the signal data before feeding it to the network?
Thanks in advance.

@Miffyli (Collaborator) commented Sep 21, 2019

@nkleber1

You can use a custom policy for this. In the case of a CNN policy, you can replace the cnn_extractor with a head of your liking, where you split the augmented image into the actual image and the direct features (e.g. 1-D sensor data). Like so:

import numpy as np
import tensorflow as tf
from stable_baselines import PPO2
from stable_baselines.a2c.utils import conv, linear, conv_to_fc

num_direct_features = NUMBER_OF_DIRECT_FEATURES

def augmented_nature_cnn(scaled_images, **kwargs):
    """
    Copied from stable_baselines policies.py.
    This is the nature CNN head where the last channel of the image
    contains the direct features.

    :param scaled_images: (TensorFlow Tensor) Image input placeholder
    :param kwargs: (dict) Extra keywords parameters for the convolutional layers of the CNN
    :return: (TensorFlow Tensor) The CNN output layer
    """
    activ = tf.nn.relu

    # Take last channel as direct features
    other_features = tf.contrib.slim.flatten(scaled_images[..., -1])
    # Take known amount of direct features, rest are padding zeros
    other_features = other_features[:, :num_direct_features]

    scaled_images = scaled_images[..., :-1]

    layer_1 = activ(conv(scaled_images, 'cnn1', n_filters=32, filter_size=8, stride=4, init_scale=np.sqrt(2), **kwargs))
    layer_2 = activ(conv(layer_1, 'cnn2', n_filters=64, filter_size=4, stride=2, init_scale=np.sqrt(2), **kwargs))
    layer_3 = activ(conv(layer_2, 'cnn3', n_filters=64, filter_size=3, stride=1, init_scale=np.sqrt(2), **kwargs))
    layer_3 = conv_to_fc(layer_3)

    img_output = activ(linear(layer_3, 'cnn_fc1', n_hidden=512, init_scale=np.sqrt(2)))

    # Append direct features to the final output of the extractor
    concat = tf.concat((img_output, other_features), axis=1)

    return concat

# Pass the function itself (not a call) as the cnn_extractor
policy_kwargs = {
    "cnn_extractor": augmented_nature_cnn
}

agent = PPO2("CnnPolicy", env, policy_kwargs=policy_kwargs)
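On the environment side, the augmented observation would be built to match (a sketch with hypothetical names; note that the direct features go through the same scaling as the image pixels, see the remark on normalization below):

import numpy as np

def make_augmented_observation(image, direct_features):
    """Append one extra channel whose (row-major) start holds the
    direct features, zero-padded to fill the rest of the channel."""
    h, w, _ = image.shape
    feature_channel = np.zeros((h, w, 1), dtype=image.dtype)
    feature_channel.flat[:len(direct_features)] = direct_features
    return np.concatenate([image, feature_channel], axis=-1)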

@araffin (Collaborator) commented Sep 21, 2019

Additional remark: you should be careful regarding the automatic normalization; cf. discussion #456.

@radusl commented Dec 4, 2019

For most of the Atari games, the observation space is quite simple: you have either a Box or a Discrete. The problem is that when working with real-world environments or business cases, some have more complex observation spaces: single/multiple Boxes or a combination of Box and Discrete. Hence support for Tuple would be very nice.

The custom environment I am trying to implement with stable-baselines has a Tuple observation space of 4 different time series represented as Box, each with a different shape. After reading the comments in this thread, I understood that one can merge all of them for the input and then split them apart in the custom policy. Can somebody give an example of how this might be achieved?

@Miffyli (Collaborator) commented Dec 4, 2019

@radusl

You can append the "direct features" (non-image features) on e.g. the last channel of the image, and pad them with zeros to match the other dimensions. Then you can use a cnn_extractor like the one returned by this function to process the actual image with convolutions and then append the direct features:

import numpy as np
import tensorflow as tf
from stable_baselines.a2c.utils import conv, linear, conv_to_fc

def create_augmented_nature_cnn(num_direct_features):
    """
    Create and return a function for augmented_nature_cnn
    used in stable-baselines.

    num_direct_features tells how many direct features there
    will be in the image.
    """

    def augmented_nature_cnn(scaled_images, **kwargs):
        """
        Copied from stable_baselines policies.py.
        This is nature CNN head where last channel of the image contains
        direct features.

        :param scaled_images: (TensorFlow Tensor) Image input placeholder
        :param kwargs: (dict) Extra keywords parameters for the convolutional layers of the CNN
        :return: (TensorFlow Tensor) The CNN output layer
        """
        activ = tf.nn.relu

        # Take last channel as direct features
        other_features = tf.contrib.slim.flatten(scaled_images[..., -1])
        # Take known amount of direct features, rest are padding zeros
        other_features = other_features[:, :num_direct_features]

        scaled_images = scaled_images[..., :-1]

        layer_1 = activ(conv(scaled_images, 'cnn1', n_filters=32, filter_size=8, stride=4, init_scale=np.sqrt(2), **kwargs))
        layer_2 = activ(conv(layer_1, 'cnn2', n_filters=64, filter_size=4, stride=2, init_scale=np.sqrt(2), **kwargs))
        layer_3 = activ(conv(layer_2, 'cnn3', n_filters=64, filter_size=3, stride=1, init_scale=np.sqrt(2), **kwargs))
        layer_3 = conv_to_fc(layer_3)

        # Append direct features to the final output of extractor
        img_output = activ(linear(layer_3, 'cnn_fc1', n_hidden=512, init_scale=np.sqrt(2)))

        concat = tf.concat((img_output, other_features), axis=1)

        return concat

    return augmented_nature_cnn
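For instance (a hypothetical setup, assuming the environment packs 10 direct features into the image's last channel):

from stable_baselines import PPO2

policy_kwargs = dict(cnn_extractor=create_augmented_nature_cnn(10))
model = PPO2('CnnPolicy', env, policy_kwargs=policy_kwargs, verbose=1)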

@pirobot commented Dec 10, 2019

I am very interested in getting mixed dictionary input spaces officially supported in stable-baselines, and would be willing to pay someone to do the work, since I doubt I have the skills to do it myself. If anyone here has the skills, or knows of a pay-for service where I might post the project, please let me know.

@nicofirst1 commented:

Is there any update on this?
I am trying to use a mixed dictionary space (Discrete + MultiDiscrete) as the action space, but rllib yields:

NotImplementedError: Dict action spaces are not supported, consider using gym.spaces.Tuple instead

@Miffyli (Collaborator) commented Feb 15, 2020

@nicofirst1
No updates yet. We are focusing on transitioning to the new backend first (v3.0), after which this will be one of the high-priority updates for v3.1.

@nicofirst1 commented:

Any clue on how long it will take?

@Miffyli (Collaborator) commented Feb 15, 2020

I cannot give any exact times, but at least a month, I would say.

Regarding your rllib problem: you could modify your space to be a Tuple, no? Just make sure you provide observations in the same order on each step. Please do not continue this discussion here; it's just food for thought.

@araffin (Collaborator) commented Feb 15, 2020

Regarding your problem, it seems to me that Discrete is a special case of MultiDiscrete, so you could use only a MultiDiscrete space in your case.
Btw, we plan to support Dict observation spaces first; Dict action spaces are an open research question.
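For example (a sketch): a Discrete(4) combined with a MultiDiscrete([3, 3]) can be folded into a single space:

from gym import spaces

# First component replaces the Discrete(4); the rest cover the MultiDiscrete([3, 3])
action_space = spaces.MultiDiscrete([4, 3, 3])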

@araffin (Collaborator) commented May 11, 2021

Closing this, as DLR-RM/stable-baselines3#243 is now merged into SB3 master =)
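For reference, with that PR merged, Dict observation spaces work out of the box in SB3 via the multi-input policies (a sketch, assuming a recent SB3 release and an env with a Dict observation space):

from stable_baselines3 import PPO

# 'MultiInputPolicy' enables the combined feature extractor for Dict observations
model = PPO('MultiInputPolicy', env, verbose=1)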

@araffin closed this as completed on May 11, 2021