[feature request] Support of gym.spaces.Tuple #100
Hello,
Thanks for the nice and clean drl library!
Is the support of gym.spaces.Tuple coming in the near future? :) It would be helpful for more complex problems...

Comments
You're welcome =)
It is not planned for now, but PRs are welcome ;) Do you have a concrete use case where a tuple space is needed?
I want to define an action space for a mobile robot that covers both the translational and the rotational velocity. They have different limits (low/high), so it's not possible to put them in one gym.spaces.Box.
Are you sure? If my limits are [-1, 1] and [-5, 5], for instance, I would do something like: spaces.Box(low=np.array([-1, -5]), high=np.array([1, 5]), dtype=np.float32). EDIT: tuple spaces are normally useful when you mix, for instance, discrete and continuous spaces.
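As a minimal sketch of that suggestion (the environment class, observation space, and dynamics below are illustrative placeholders, not something from this thread), the two velocity limits go into a single Box with per-dimension bounds:

```python
import numpy as np
import gym
from gym import spaces

class MobileRobotEnv(gym.Env):
    """Hypothetical env: one Box action holds both velocity commands."""

    def __init__(self):
        super().__init__()
        # Index 0 = translational velocity in [-1, 1], index 1 = rotational velocity in [-5, 5].
        self.action_space = spaces.Box(
            low=np.array([-1.0, -5.0]), high=np.array([1.0, 5.0]), dtype=np.float32
        )
        self.observation_space = spaces.Box(low=-np.inf, high=np.inf, shape=(4,), dtype=np.float32)

    def reset(self):
        return np.zeros(4, dtype=np.float32)

    def step(self, action):
        v_trans, v_rot = action  # each dimension has its own bounds declared above
        obs = np.zeros(4, dtype=np.float32)  # placeholder dynamics
        return obs, 0.0, False, {}
```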
It should not be too hard to implement for the action_space if it is really important (and only for actor-critic models). You just need to imitate MultiCategoricalProbabilityDistribution (https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/distributions.py#L333). The hard part might be the observation_space, however.
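For intuition only, here is a plain-NumPy sketch of what such a combined distribution does: each component of the tuple gets its own distribution, samples are drawn independently, and the joint log-probability is the sum over components. The class and method names are invented for illustration; a real stable-baselines implementation would subclass its ProbabilityDistribution classes and operate on TensorFlow tensors instead.

```python
import numpy as np

class CategoricalPart:
    """Discrete component parameterized by logits."""
    def __init__(self, logits):
        z = logits - logits.max()
        self.probs = np.exp(z) / np.exp(z).sum()

    def sample(self):
        return np.random.choice(len(self.probs), p=self.probs)

    def log_prob(self, x):
        return np.log(self.probs[x])

class GaussianPart:
    """Continuous component parameterized by mean and log std."""
    def __init__(self, mean, log_std):
        self.mean, self.std = mean, np.exp(log_std)

    def sample(self):
        return np.random.normal(self.mean, self.std)

    def log_prob(self, x):
        return -0.5 * ((x - self.mean) / self.std) ** 2 - np.log(self.std) - 0.5 * np.log(2 * np.pi)

class TuplePart:
    """Joint distribution over a tuple action with independent components."""
    def __init__(self, parts):
        self.parts = parts

    def sample(self):
        return tuple(p.sample() for p in self.parts)

    def log_prob(self, action):
        # Independence => joint log-prob is the sum over the components.
        return sum(p.log_prob(a) for p, a in zip(self.parts, action))

# Example: a (Discrete(2), Box-like continuous) style action.
dist = TuplePart([CategoricalPart(np.array([0.3, 1.2])), GaussianPart(5.0, 0.0)])
act = dist.sample()
print(act, dist.log_prob(act))
```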
@araffin okay, true - that solves the problem. Thanks a lot! And sorry for opening the "wrong" issue.
@hill-a Is it possible to implement other models (DQN etc.) with a tuple action space?
Hi, how can one define an action space where an action consists of a categorical variable (e.g. 0 or 1) and a continuous variable (any real number in the interval (0, 10))? If I were to use spaces.Tuple in my action space definition and had implemented a corresponding probability distribution (e.g. TupleProbabilityDistribution), what might go wrong if I wanted to try a model, say PPO2, on my environment? Thank you very much!
The issue lies in how observations are handled under the hood: they are concatenated into numpy arrays of the right shape, and thus they won't work when the observation is a Tuple/Dict. This would require a major rework all around the code, and it is currently planned for v3.1 (the next update after migrating to TF2).
Thank you for the quick reply! I am not using Tuple/Dict for the observation space, but for the action space. Would that also cause a lot of issues?
Ah, sorry for misunderstanding! The issue is still the same, though: actions are stacked into arrays of the right shapes, and thus Tuple/Dict spaces won't work even if you have the necessary distributions available etc. Support for this is also planned for v3.1.
For my first question: to design a customized environment, how can one define an observation/action space where an obs/action consists of categorical variables (e.g. 0 or 1) and continuous variables (e.g. any real number in the interval (0, 10))? Do you have any suggestions? Thanks a lot!
As mentioned, the correct way to do this (Tuple/Dict) is not supported in stable-baselines as of writing. You could try some trickery around this, e.g. defining a single action space made of a bunch of continuous actions (Box), then slicing off the variables you want to be discrete (categorical) and thresholding them (i.e. if above 0.5, set to 1, otherwise 0); see the sketch below. There are zero guarantees this will work, though. Other than that I do not have tips to give :/
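A rough sketch of that workaround, with the environment, shapes, and rescaling all as illustrative assumptions: the agent only ever sees a Box action, and the environment re-interprets one dimension as the categorical choice and maps the other onto (0, 10).

```python
import numpy as np
import gym
from gym import spaces

class MixedActionEnv(gym.Env):
    """Illustrative env: one Box action, first dim re-interpreted as discrete."""

    def __init__(self):
        super().__init__()
        # Dim 0 will be thresholded into {0, 1}; dim 1 is rescaled onto (0, 10).
        self.action_space = spaces.Box(low=0.0, high=1.0, shape=(2,), dtype=np.float32)
        self.observation_space = spaces.Box(low=-np.inf, high=np.inf, shape=(3,), dtype=np.float32)

    def reset(self):
        return np.zeros(3, dtype=np.float32)

    def step(self, action):
        discrete_part = 1 if action[0] > 0.5 else 0   # threshold the "categorical" dim
        continuous_part = float(action[1]) * 10.0     # map [0, 1] onto (0, 10)
        obs = np.zeros(3, dtype=np.float32)
        reward = 0.0  # placeholder reward; real logic would use both parts
        return obs, reward, False, {"discrete": discrete_part, "continuous": continuous_part}
```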
Any recommendation on how to use a 3-dimensional image observation (RGB) together with a 1-dimensional continuous observation? I was thinking about using Tuple, but just noticed that it's not supported. The warning from the check_env helper recommends flattening the Tuple by unpacking it. I could flatten the whole image observation, but then I don't know how it would affect the optimizer once the dimensionality of the image observation is gone. I wouldn't be able to use CNN policies then, I guess.
With a bit of trickery, you can do this. See this example. TL;DR: put the 1D stuff on a (new) last channel of the RGB image, and extract it accordingly in the network.
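A sketch of that packing trick (the helper names, image size, and scaling are assumptions, not taken from the linked example): the 1D features are written into a new fourth channel of the image, and the policy's feature extractor would slice that channel back out before the convolutional layers.

```python
import numpy as np

def pack_observation(rgb, features, feature_scale=255.0):
    """Append a 4th channel to an HxWx3 uint8 image that stores the 1D features.

    The features (assumed to lie in [0, 1]) are written into the first
    len(features) pixels of the new channel; the rest stays zero.
    """
    h, w, _ = rgb.shape
    extra = np.zeros((h, w, 1), dtype=rgb.dtype)
    flat = extra.reshape(-1)  # view into `extra`
    flat[: len(features)] = np.clip(features, 0.0, 1.0) * feature_scale
    return np.concatenate([rgb, extra], axis=-1)  # shape (H, W, 4)

def unpack_observation(obs, n_features, feature_scale=255.0):
    """Recover (rgb, features) from a packed (H, W, 4) observation."""
    rgb = obs[..., :3]
    features = obs[..., 3].reshape(-1)[:n_features] / feature_scale
    return rgb, features

# Usage with assumed shapes: a 64x64 RGB frame plus 5 extra scalars in [0, 1].
rgb = np.zeros((64, 64, 3), dtype=np.uint8)
packed = pack_observation(rgb, np.array([0.1, 0.5, 0.9, 0.0, 1.0]))
image, scalars = unpack_observation(packed, n_features=5)
```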
Hi @Miffyli, |
See DLR-RM/stable-baselines3#731 and DLR-RM/stable-baselines3#527 |