add SingleRoomUndirected & SingleRoomDirected #153

Merged

Sid-Bhatia-0
Member

@findmyway I have rethought the design of this package to achieve our various goals, including decoupling it from the RLBase API. Here's what I think:

RL environments are games first and environments second. For GridWorlds.jl, we don't need to create yet another powerful RL API like RLBase. We just need to be able to support RLBase (or CommonRLInterface, or anything else for that matter) on top of a lightweight API that covers only the core logic of the grid-world games.

It turns out that all the game logic fits nicely into two methods: reset! and act!. Beyond the core logic, the GridWorlds.jl API doesn't need to explicitly provide methods to get the state, for example, because states are only defined in the context of RL, and getting a state via a method is not part of the core logic of a grid-world game. GridWorlds.jl just needs to run the game logic; all the RL-related methods will be implemented against the relevant RL API, such as RLBase. We will provide the implementations for RLBase, but they will be decoupled from the core game logic, which will be governed by the GridWorlds.jl-specific methods GW.reset! and GW.act!.

This way, we can later support APIs other than RLBase, such as CommonRLInterface.

For this, I am adding a new abstract type AbstractGridWorldGame <: Any instead of the current AbstractGridWorld <: RLBase.AbstractEnv. All the RLBase API methods for all the games will live in a separate module called RLBaseGridWorldModule.
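
To make the split concrete, here is a minimal sketch of the idea, not code from this PR. The toy game `LineGame`, the module names `ToyGridWorlds` and `ToyRLWrapper`, and the wrapper functions `state`, `reward`, and `is_terminated` are all hypothetical. A real wrapper would extend the functions of RLBase or CommonRLInterface rather than define its own.

```julia
# Hypothetical sketch: none of these definitions are copied from GridWorlds.jl.
module ToyGridWorlds

# Subtypes Any, so the game layer knows nothing about any RL API.
abstract type AbstractGridWorldGame end

# The two generic functions that make up the whole game API.
function reset! end
function act! end

# A toy game: an agent on a 1-D strip of cells, with the goal at the right end.
mutable struct LineGame <: AbstractGridWorldGame
    width::Int
    agent_position::Int
    reward::Float64
    done::Bool
end

LineGame(width::Integer) = LineGame(width, 1, 0.0, false)

function reset!(game::LineGame)
    game.agent_position = 1
    game.reward = 0.0
    game.done = false
    return nothing
end

# Action 1 moves left, any other action moves right; moves are clamped to the strip.
function act!(game::LineGame, action::Integer)
    step = action == 1 ? -1 : 1
    game.agent_position = clamp(game.agent_position + step, 1, game.width)
    game.done = game.agent_position == game.width
    game.reward = game.done ? 1.0 : 0.0
    return nothing
end

end # module ToyGridWorlds

# RL-facing layer, kept apart from the game logic.
module ToyRLWrapper

import ..ToyGridWorlds
const TGW = ToyGridWorlds

struct LineEnv
    game::TGW.LineGame
end

# RL concepts (state, reward, termination) are defined only here.
state(env::LineEnv) = env.game.agent_position
reward(env::LineEnv) = env.game.reward
is_terminated(env::LineEnv) = env.game.done

# Stepping and resetting simply delegate to the game API.
reset!(env::LineEnv) = TGW.reset!(env.game)
act!(env::LineEnv, action) = TGW.act!(env.game, action)

end # module ToyRLWrapper

env = ToyRLWrapper.LineEnv(ToyGridWorlds.LineGame(3))
ToyRLWrapper.act!(env, 2)
ToyRLWrapper.act!(env, 2)
```

In this sketch, supporting a second RL API would mean adding another wrapper module next to `ToyRLWrapper`, with no change to `ToyGridWorlds`.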

As for separating environments, each grid-world game will live in its own module. For example, SingleRoomUndirected is in SingleRoomUndirectedModule and SingleRoomDirected is in SingleRoomDirectedModule. This lets each environment define its own consts, isolated from those of other environments.
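
As a rough illustration of the isolation this gives, two modules can define a const with the same name without clashing. The module names, const names, and values below are hypothetical and are not taken from the PR.

```julia
# Hypothetical sketch: module names, const names, and values are assumptions.
module UndirectedSketchModule
const NUM_ACTIONS = 4   # assumed: one action per direction of movement
end

module DirectedSketchModule
const NUM_ACTIONS = 3   # assumed: move forward, turn left, turn right
end

# Each const is reached through its own module, so the two never collide.
undirected_actions = UndirectedSketchModule.NUM_ACTIONS
directed_actions = DirectedSketchModule.NUM_ACTIONS
```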

Another interesting thing to notice is that directed and undirected environments differ only in how navigation occurs; the rest of the logic is mostly the same within a given type of environment, such as SingleRoom. Undirected navigation is also the more primitive of the two, since it doesn't need to keep track of direction, so directed environments can reuse significant functionality from undirected ones. SingleRoomUndirected and SingleRoomDirected in this PR are an example.
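
One way to picture this reuse is sketched below. It assumes a directed agent is an undirected agent plus a heading, and that moving forward is an undirected move along that heading. The types `UndirectedAgent` and `DirectedAgent`, the helper functions, and the action set are hypothetical, and the actual implementation in this PR may be organized differently.

```julia
# Hypothetical sketch; names and action sets are assumptions, not GridWorlds.jl code.

# (row, column) offsets for up, right, down, left, in clockwise order.
const OFFSETS = ((-1, 0), (0, 1), (1, 0), (0, -1))

# Undirected navigation: the action itself names the direction of the move.
mutable struct UndirectedAgent
    position::Tuple{Int, Int}
end

function move!(agent::UndirectedAgent, direction::Integer, height::Integer, width::Integer)
    d_row, d_col = OFFSETS[direction]
    row = clamp(agent.position[1] + d_row, 1, height)
    col = clamp(agent.position[2] + d_col, 1, width)
    agent.position = (row, col)
    return nothing
end

# Directed navigation: the same agent plus a heading (an index into OFFSETS).
mutable struct DirectedAgent
    undirected::UndirectedAgent
    direction::Int
end

function turn_left!(agent::DirectedAgent)
    agent.direction = mod1(agent.direction - 1, 4)
    return nothing
end

function turn_right!(agent::DirectedAgent)
    agent.direction = mod1(agent.direction + 1, 4)
    return nothing
end

# Moving forward reuses the undirected move along the current heading.
function move_forward!(agent::DirectedAgent, height::Integer, width::Integer)
    return move!(agent.undirected, agent.direction, height, width)
end

# A 3x3 room: start in the center facing up, step forward, turn right, step forward.
agent = DirectedAgent(UndirectedAgent((2, 2)), 1)
move_forward!(agent, 3, 3)
turn_right!(agent)
move_forward!(agent, 3, 3)
```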

@codecov-commenter

codecov-commenter commented Jun 30, 2021

Codecov Report

Merging #153 (8bd9159) into master (6d87e18) will decrease coverage by 2.32%.
The diff coverage is 54.54%.

@@            Coverage Diff             @@
##           master     #153      +/-   ##
==========================================
- Coverage   78.21%   75.89%   -2.33%     
==========================================
  Files          21       25       +4     
  Lines        2226     2468     +242     
==========================================
+ Hits         1741     1873     +132     
- Misses        485      595     +110     
| Impacted Files | Coverage Δ |
| --- | --- |
| src/GridWorlds.jl | 100.00% <ø> (ø) |
| src/abstract_grid_world.jl | 14.28% <0.00%> (-60.72%) ⬇️ |
| src/play.jl | 0.00% <0.00%> (ø) |
| src/rlbase.jl | 64.70% <64.70%> (ø) |
| src/envs/single_room_directed.jl | 71.42% <71.42%> (ø) |
| src/envs/single_room_undirected.jl | 79.10% <79.10%> (ø) |
| src/actions.jl | 83.33% <86.95%> (+4.38%) ⬆️ |
| src/directions.jl | 100.00% <100.00%> (ø) |
| src/envs/envs.jl | 100.00% <100.00%> (ø) |

... and 3 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@findmyway
Member

Cool, glad we reached an agreement on it ;)

@Sid-Bhatia-0
Member Author

Here is an example of SingleRoomUndirected.

[image: single_room_undirected]

Here is an example of SingleRoomDirected.

[image: single_room_directed]

There is no need for colors in this environment since all objects are unique.

@Sid-Bhatia-0
Member Author

Here is an example of replaying animations at a given frame_rate, as well as stepping through the animation manually (going to the next frame, the previous frame, or the first frame as many times as the user likes).

[image: replay]
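
The replay behavior described above can be pictured as a cursor over a list of recorded frames. The sketch below is only illustrative: the `Replay` type, its functions, and the use of strings as frames are hypothetical and do not reflect the actual code in src/play.jl. It assumes at least one recorded frame.

```julia
# Hypothetical sketch; not the replay implementation from this PR.
# Frames are plain strings here; assumes at least one frame was recorded.
mutable struct Replay
    frames::Vector{String}
    current::Int
end

Replay(frames::Vector{String}) = Replay(frames, 1)

# Manual stepping: each call moves the cursor and returns the frame now shown.
function next_frame!(replay::Replay)
    replay.current = min(replay.current + 1, length(replay.frames))
    return replay.frames[replay.current]
end

function previous_frame!(replay::Replay)
    replay.current = max(replay.current - 1, 1)
    return replay.frames[replay.current]
end

function first_frame!(replay::Replay)
    replay.current = 1
    return replay.frames[replay.current]
end

# Automatic playback: print every frame, pausing 1 / frame_rate seconds between them.
function play(replay::Replay; frame_rate::Real = 2, io::IO = stdout)
    for frame in replay.frames
        println(io, frame)
        sleep(1 / frame_rate)
    end
    return nothing
end

replay = Replay(["frame 1", "frame 2", "frame 3"])
```

Stepping past either end of the recording is clamped in this sketch, so the user can keep pressing next or previous without errors.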

Sid-Bhatia-0 marked this pull request as ready for review July 1, 2021 14:16
Sid-Bhatia-0 merged commit c585c20 into JuliaReinforcementLearning:master Jul 1, 2021
Sid-Bhatia-0 deleted the single_room branch July 1, 2021 14:24