[feature request] Support of gym.spaces.Tuple #100

Closed
RGring opened this issue Nov 27, 2018 · 16 comments
Labels
enhancement New feature or request

Comments

@RGring

RGring commented Nov 27, 2018

Thanks for the nice and clean drl library!
Is the support of gym.spaces.Tuple coming in the near future? :) Would be helpful for more complex problems...

@araffin araffin added the enhancement New feature or request label Nov 27, 2018
@araffin
Collaborator

araffin commented Nov 27, 2018

Hello,

> Thanks for the nice and clean drl library!

You're welcome =)

> Is the support of gym.spaces.Tuple coming in the near future?

It is not planned for now, but PRs are welcome ;) Do you have a concrete use case where a tuple space is needed?
Also, I don't know how easy it will be to integrate it.

@RGring
Author

RGring commented Nov 27, 2018

I want to define an action space for a mobile robot that covers its translational and rotational velocity. The two have different limits (low/high), so it's not possible to put them in one gym.spaces.Box.

@araffin
Collaborator

araffin commented Nov 27, 2018

> They have different limits (low/high). It's not possible to put them in one gym.spaces.Box.

Are you sure? I would do something like (if my limits are [-1, 1] and [-5, 5] for instance):

spaces.Box(low=np.array([-1, -5]), high=np.array([1, 5]), dtype=np.float32)

EDIT: tuple spaces are normally useful when you mix for instance discrete and continuous spaces
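
The point of the Box suggestion above is that `low` and `high` are arrays, so each dimension carries its own limits. The following is a minimal NumPy-only sketch of that idea. It does not import gym; the helper names `contains` and `clip_to_limits` are hypothetical and only mimic a per-dimension bounds check.

```python
import numpy as np

# Per-dimension limits, matching the Box suggested above:
# index 0 = translational velocity in [-1, 1],
# index 1 = rotational velocity in [-5, 5].
low = np.array([-1.0, -5.0], dtype=np.float32)
high = np.array([1.0, 5.0], dtype=np.float32)


def contains(action):
    """Hypothetical helper: True if every element lies within its own limits."""
    action = np.asarray(action, dtype=np.float32)
    return action.shape == low.shape and bool(
        np.all((action >= low) & (action <= high))
    )


def clip_to_limits(action):
    """Hypothetical helper: clip each element to its own [low, high] range."""
    return np.clip(np.asarray(action, dtype=np.float32), low, high)


inside = contains([0.5, -4.0])         # both elements within their limits
outside = contains([0.5, 6.0])         # rotational velocity exceeds 5
clipped = clip_to_limits([2.0, -7.0])  # each element clipped separately
```

Because the bounds are applied element-wise, a single two-dimensional Box is enough for this use case and no Tuple space is needed.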

@hill-a
Owner

hill-a commented Nov 27, 2018

It should not be too hard to implement for the action_space if it is really important (only actor-critic models). You just need to imitate MultiCategoricalProbabilityDistribution (https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/distributions.py#L333)

The hard part might be the observation_space however.
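
One way to read this suggestion is to treat the components of a tuple action as independent factors, so that the joint log-probability is the sum of the per-component log-probabilities. The sketch below illustrates that idea in plain NumPy for one categorical and one diagonal Gaussian component. The class `HybridDistribution` and its methods are hypothetical; this is not the stable-baselines API and it leaves out everything needed to plug into a policy.

```python
import numpy as np


class HybridDistribution:
    """Hypothetical joint distribution: one categorical and one diagonal Gaussian factor."""

    def __init__(self, logits, mean, log_std):
        self.logits = np.asarray(logits, dtype=np.float64)
        self.mean = np.asarray(mean, dtype=np.float64)
        self.log_std = np.asarray(log_std, dtype=np.float64)

    def _categorical_log_probs(self):
        # Log-softmax, shifted by the max logit for numerical stability.
        shifted = self.logits - self.logits.max()
        return shifted - np.log(np.exp(shifted).sum())

    def sample(self, rng):
        probs = np.exp(self._categorical_log_probs())
        discrete = int(rng.choice(len(probs), p=probs))
        noise = rng.standard_normal(self.mean.shape)
        continuous = self.mean + np.exp(self.log_std) * noise
        return discrete, continuous

    def log_prob(self, discrete, continuous):
        # Independent factors: the joint log-probability is the sum of both parts.
        cat_lp = self._categorical_log_probs()[discrete]
        var = np.exp(2.0 * self.log_std)
        gauss_lp = -0.5 * np.sum(
            (continuous - self.mean) ** 2 / var
            + 2.0 * self.log_std
            + np.log(2.0 * np.pi)
        )
        return cat_lp + gauss_lp


dist = HybridDistribution(logits=[0.0, 0.0], mean=[0.0], log_std=[0.0])
rng = np.random.default_rng(0)
action = dist.sample(rng)                     # (int, array of shape (1,))
joint_lp = dist.log_prob(1, np.array([0.0]))  # log(0.5) + log N(0 | 0, 1)
```

A real implementation would also need entropy and KL terms and a way to pass the tuple action through the rest of the training code.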

@RGring
Author

RGring commented Nov 27, 2018

@araffin okay, true - that solves the problem. Thanks a lot! And sorry for requesting the "wrong" issue.

@VXU1230

VXU1230 commented Oct 25, 2019

> It should not be too hard to implement for the action_space if it is really important (only actor-critic models). You just need to imitate MultiCategoricalProbabilityDistribution (https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/distributions.py#L333)
>
> The hard part might be the observation_space however.

@hill-a Is it possible to implement other models (DQN etc) with tuple action space?

@denyHell

denyHell commented Jan 16, 2020

Hi,

How can one define an action space where an action consists of a categorical variable (e.g. 0 or 1) and a continuous variable (any real number in the interval (0,10))?

If I were to use spaces.Tuple in my action space definition, and have implemented a corresponding probability distribution (e.g. TupleProbabilityDistribution), what may go wrong if I would like to try a model, say PPO2, on my environment?

Thank you very much!

@Miffyli
Collaborator

Miffyli commented Jan 16, 2020

@denyHell

The issue lies in how observations are handled under the hood: They are being concatenated into numpy arrays of right shape and whatnot, and thus they won't work when observation is a Tuple/Dict. This would require major rework all around the code and it is currently planned for v3.1 (the next update after migrating to TF2).

@denyHell

> @denyHell
>
> The issue lies in how observations are handled under the hood: They are being concatenated into numpy arrays of right shape and whatnot, and thus they won't work when observation is a Tuple/Dict. This would require major rework all around the code and it is currently planned for v3.1 (the next update after migrating to TF2).

Thank you for the quick reply! I am not using Tuple/Dict for observation space, but action space. Would that also cause a lot of issues?

@Miffyli
Collaborator

Miffyli commented Jan 16, 2020

Ah, sorry for misunderstanding! The issue is still the same, though: Actions are being stacked into arrays of right shapes, and thus Tuple/Dict spaces won't work even if you have necessary distributions available etc. Support for this too is planned for v3.1.

@denyHell

For my first question: when designing a custom environment, how can one define an observation/action space where an obs/action consists of categorical variables (e.g. 0 or 1) and continuous variables (e.g. any real number in the interval (0,10))?

Do you have any suggestion? Thanks a lot!

@Miffyli
Collaborator

Miffyli commented Jan 16, 2020

As mentioned, the correct way to do this (Tuple/Dict) is not supported in stable-baselines as of writing. You could try some trickery around this, e.g. defining a single action space made of a bunch of continuous actions (Box), then slicing off the variables you want to be discrete (categorical) and thresholding them (i.e. if above 0.5, set to 1, otherwise 0). There are zero guarantees this will work, though. Other than that I do not have tips to give :/
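
A minimal sketch of that workaround, assuming a two-element Box action where the first element is thresholded into a binary choice and the second is used as the continuous variable. The bounds, the layout, and the helper `decode_action` are illustrative assumptions and not part of stable-baselines; the decoding would typically live inside the environment's `step` method.

```python
import numpy as np

# Hypothetical layout of a 2-element Box action (bounds are illustrative):
#   action[0] in [0, 1]  -> thresholded into a binary choice
#   action[1] in [0, 10] -> used directly as the continuous variable
LOW = np.array([0.0, 0.0], dtype=np.float32)
HIGH = np.array([1.0, 10.0], dtype=np.float32)


def decode_action(action, threshold=0.5):
    """Split a flat Box action into (discrete, continuous) parts."""
    action = np.clip(np.asarray(action, dtype=np.float32), LOW, HIGH)
    discrete = int(action[0] > threshold)  # above threshold -> 1, otherwise 0
    continuous = float(action[1])
    return discrete, continuous


decoded_a = decode_action([0.8, 3.5])   # first element above 0.5
decoded_b = decode_action([0.2, 12.0])  # second element is clipped to 10
```

As noted above, nothing guarantees that the agent learns well with this encoding, since the policy still treats the thresholded element as a continuous output.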

@BarisYazici

Any recommendation on how to combine a 3-dimensional image observation (RGB) with a 1-dimensional continuous observation? I was thinking about using Tuple, but just noticed that it's not supported. The warning from the check_env helper recommends flattening the Tuple by unpacking it. I can flatten the whole image observation, but then I don't know how it affects the optimizer when the dimensionality of the image observation is gone. Then I wouldn't be able to use CNN policies, I guess.

@Miffyli
Collaborator

Miffyli commented Apr 9, 2020

@BarisYazici

With a bit of trickery, you can do this. See this example. TL;DR: Put the 1D stuff on the (new) last channel of the RGB image, and extract accordingly in the network.
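
The linked example is not reproduced in this thread, so the following is only a NumPy sketch of the packing idea described in the TL;DR. The helpers `pack_observation` and `unpack_observation` are hypothetical, and the sketch assumes a float32 observation so that the 1D values are not quantized. In a real setup the unpacking would happen inside the policy network, in whatever tensor framework it uses.

```python
import numpy as np


def pack_observation(image, vector):
    """Append one extra channel to an (H, W, 3) image and store `vector` at its start."""
    height, width, _ = image.shape
    if vector.size > height * width:
        raise ValueError("vector does not fit into one image channel")
    extra = np.zeros(height * width, dtype=np.float32)
    extra[: vector.size] = vector  # remaining entries stay zero
    extra = extra.reshape(height, width, 1)
    return np.concatenate([image.astype(np.float32), extra], axis=-1)


def unpack_observation(packed, vector_size):
    """Recover the RGB image and the 1D vector from a packed (H, W, 4) array."""
    image = packed[..., :-1]
    vector = packed[..., -1].reshape(-1)[:vector_size]
    return image, vector


rgb = np.random.randint(0, 256, size=(4, 4, 3)).astype(np.float32)
state = np.array([0.25, -1.5, 3.0], dtype=np.float32)

packed = pack_observation(rgb, state)  # shape (4, 4, 4)
image_out, state_out = unpack_observation(packed, state.size)
```

The observation space would then be a single Box with one more channel than the image, which keeps the observation a plain array.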

@GraderYuval

Hi @Miffyli,
Is there any update regarding support for mixed continuous and discrete action spaces (e.g. support for gym spaces such as Tuple and Dict)?

@araffin
Collaborator

araffin commented Apr 12, 2022

> Hi @Miffyli, Is there any update regarding the support of continuous and discrete action space (e.g support of gym spaces such as tuple, dict)?

See DLR-RM/stable-baselines3#731 and DLR-RM/stable-baselines3#527

8 participants