Fixed bug when buffer not reinitialized on reset
yannbouteiller committed Jan 24, 2022
1 parent 516ddd1 commit e2a7304
Showing 3 changed files with 14 additions and 2 deletions.
10 changes: 10 additions & 0 deletions README.md
@@ -570,6 +570,8 @@ However, note that doing this would imply using a longer action buffer.

#### Bonus 3: Pro tips

##### a) Elasticity

The time-step's maximum elasticity defines the tolerance of your environment in terms of time-wise precision.
It is set in the configuration dictionary as the `"time_step_timeout_factor"` entry.
This can be any value `> 0.0`.
@@ -590,6 +592,14 @@ You may want this to be as tight as possible.
In such a situation, keep in mind that inference must finish before the end of this next time-step, since the computed action is to be applied there.
Otherwise, your time-steps will time out.
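
As a rough illustration of the elasticity setting, the sketch below builds a configuration dictionary and derives a timeout from it. Only the `"time_step_timeout_factor"` entry is taken from the text above. The `"time_step_duration"` entry, the `time_step_timeout` helper, and the assumption that the timeout equals the duration multiplied by the factor are illustrative and should be checked against the library.

```python
# Hypothetical configuration fragment; only "time_step_timeout_factor"
# is named in the README text above.
config = {
    "time_step_duration": 0.05,       # assumed entry: nominal time-step length in seconds
    "time_step_timeout_factor": 1.0,  # maximum elasticity, any value > 0.0
}


def time_step_timeout(cfg):
    """Return the assumed timeout in seconds: duration multiplied by the factor."""
    factor = cfg["time_step_timeout_factor"]
    if factor <= 0.0:
        raise ValueError("time_step_timeout_factor must be > 0.0")
    return cfg["time_step_duration"] * factor


timeout = time_step_timeout(config)
```

Under this assumption, a smaller factor gives a tighter tolerance, which in turn leaves less slack for inference to finish.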

##### b) Reset

In `rtgym`, the default action is sent when `reset()` is called.
This is to maintain the real-time flow of time-steps during reset transitions.

You may prefer to repeat the previous action instead, for instance because a no-op action is hard to implement in your application.

To achieve this, replace the `default_action` attribute of your environment with the action you want sent, right before calling `reset()`.
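
The tip above can be sketched with a toy stand-in for the environment. `ToyRealTimeEnv` and its `sent` list are hypothetical and exist only for illustration. The only things taken from the text are the `default_action` attribute and the fact that `reset()` sends it.

```python
class ToyRealTimeEnv:
    """Hypothetical stand-in for illustration, not the real rtgym environment."""

    def __init__(self, default_action):
        self.default_action = default_action
        self.sent = []  # record of the actions "sent" to the device

    def step(self, action):
        self.sent.append(action)

    def reset(self):
        # As described in the README, the default action is sent on reset.
        self.sent.append(self.default_action)


env = ToyRealTimeEnv(default_action=0.0)

last_action = 0.7
env.step(last_action)

# Repeat the previous action during the reset transition
# instead of sending the original default action.
env.default_action = last_action
env.reset()
```

With a real environment, the same two lines (the assignment to `default_action` followed by `reset()`) would presumably be all that is needed. Whether the original default action has to be restored afterwards depends on your application.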

---

2 changes: 2 additions & 0 deletions rtgym/envs/real_time_env.py
@@ -476,6 +476,8 @@ def reset(self):
         self.current_step = 0
         if self.reset_act_buf:
             self.init_action_buffer()
+        else:
+            self.act_buf.append(self.default_action)
         elt = self.interface.reset()
         if self.act_in_obs:
             elt = elt + list(self.act_buf)
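
The effect of the added `else` branch can be sketched with a simplified stand-in. `BufferOnlyEnv` is hypothetical: it keeps only the action buffer logic from the hunk above and assumes the buffer is a fixed-length `deque` filled with the default action on initialization.

```python
from collections import deque


class BufferOnlyEnv:
    """Hypothetical, simplified stand-in that keeps only the action buffer."""

    def __init__(self, default_action, act_buf_len, reset_act_buf):
        self.default_action = default_action
        self.act_buf_len = act_buf_len
        self.reset_act_buf = reset_act_buf
        self.init_action_buffer()

    def init_action_buffer(self):
        # Assumption: the buffer starts filled with the default action.
        self.act_buf = deque(
            [self.default_action] * self.act_buf_len, maxlen=self.act_buf_len
        )

    def step(self, action):
        self.act_buf.append(action)

    def reset(self):
        if self.reset_act_buf:
            self.init_action_buffer()
        else:
            # The default action is sent on reset, so it is recorded too.
            self.act_buf.append(self.default_action)
        return list(self.act_buf)


kept = BufferOnlyEnv(default_action=0, act_buf_len=3, reset_act_buf=False)
kept.step(1)
kept.step(2)
kept_after_reset = kept.reset()

cleared = BufferOnlyEnv(default_action=0, act_buf_len=3, reset_act_buf=True)
cleared.step(1)
cleared.step(2)
cleared_after_reset = cleared.reset()
```

When the buffer is not reinitialized, it now shifts by one and ends with the default action, so it stays consistent with what was actually sent during the reset.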
4 changes: 2 additions & 2 deletions setup.py
@@ -7,14 +7,14 @@

 setup(name='rtgym',
       packages=[package for package in find_packages()],
-      version='0.5',
+      version='0.6',
       license='MIT',
       description='Easily implement custom OpenAI Gym environments for real-time applications',
       long_description=long_description,
       long_description_content_type="text/markdown",
       author='Yann Bouteiller',
       url='https://github.com/yannbouteiller/rtgym',
-      download_url='https://github.com/yannbouteiller/rtgym/archive/refs/tags/v0.5.tar.gz',
+      download_url='https://github.com/yannbouteiller/rtgym/archive/refs/tags/v0.6.tar.gz',
       keywords=['gym', 'real', 'time', 'custom', 'environment', 'reinforcement', 'learning', 'random', 'delays'],
       install_requires=['gym', 'numpy'],
       classifiers=[
