RL models clean up #112

Closed
22 commits
9c06583
Updated RL docs with latest models
Jun 24, 2020
33be076
Merge branch 'master' of https://github.com/PyTorchLightning/pytorch-…
Jun 25, 2020
fdc92f9
Merge branch 'master' of https://github.com/PyTorchLightning/pytorch-…
Jun 28, 2020
682bbe6
Merge branch 'master' of https://github.com/PyTorchLightning/pytorch-…
Jun 30, 2020
17073bc
Merge branch 'master' of https://github.com/PyTorchLightning/pytorch-…
Jun 30, 2020
d05db21
Merge branch 'master' of https://github.com/PyTorchLightning/pytorch-…
Jul 8, 2020
8cde396
Updated RL docs with latest models
Jun 24, 2020
96aaa97
Merge branch 'master' of https://github.com/djbyrne/pytorch-lightning…
djbyrne Jul 11, 2020
885be16
Cleaned up avg_reward calculation
djbyrne Jul 11, 2020
00a8547
Refactored DQN to use train_batch structure
djbyrne Jul 12, 2020
0aca98d
Merge branch 'master' into enhancement/rl_models_clean_up
djbyrne Jul 12, 2020
cfd139e
Cleaned up VPG metrics
djbyrne Jul 12, 2020
2741c5b
Refactore double dqn to use train_batch structure
djbyrne Jul 12, 2020
ad54460
Refactored noisy dqn to use train_batch structure
djbyrne Jul 12, 2020
164c7b4
Refactored per dqn to use train_batch structure
djbyrne Jul 12, 2020
407ff94
Updated docstrings
djbyrne Jul 12, 2020
6df878f
format
Borda Jul 12, 2020
44e0006
Apply suggestions from code review
Borda Jul 12, 2020
4f3d164
typo
Borda Jul 13, 2020
0333ebb
Merge branch 'enhancement/rl_models_clean_up' of https://github.com/d…
Borda Jul 13, 2020
2e18e19
Fixed pep8 errors
djbyrne Jul 14, 2020
79e7e5c
Fixed flake8 errors
djbyrne Jul 14, 2020
Refactore double dqn to use train_batch structure
djbyrne committed Jul 12, 2020
commit 2741c5b901e132bbd37fe1a21bb6b880833d74f9
17 changes: 0 additions & 17 deletions pl_bolts/models/rl/double_dqn_model.py
@@ -71,30 +71,13 @@ def training_step(self, batch: Tuple[torch.Tensor, torch.Tensor], _) -> OrderedD
        Returns:
            Training loss and log metrics
        """
        self.agent.update_epsilon(self.global_step)

        # step through environment with agent and add to buffer
        exp, reward, done = self.source.step(self.device)
        self.buffer.append(exp)

        self.episode_reward += reward
        self.episode_steps += 1

        # calculates training loss
        loss = double_dqn_loss(batch, self.net, self.target_net)

        if self.trainer.use_dp or self.trainer.use_ddp2:
            loss = loss.unsqueeze(0)

        if done:
            self.total_reward = self.episode_reward
            self.reward_list.append(self.total_reward)
            self.avg_reward = sum(self.reward_list[-100:]) / 100
            self.episode_count += 1
            self.episode_reward = 0
            self.total_episode_steps = self.episode_steps
            self.episode_steps = 0

        # Soft update of target network
        if self.global_step % self.sync_rate == 0:
            self.target_net.load_state_dict(self.net.state_dict())
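
For readers who have not looked inside `double_dqn_loss`, here is a minimal sketch of the Double DQN target for a single transition, in plain Python rather than the batched tensor code in pl_bolts. The function names, argument names and the `gamma` default are assumptions made for illustration and are not the library's API. The second helper shows one possible way to average over the episodes actually recorded, where the line `sum(self.reward_list[-100:]) / 100` in the hunk above always divides by 100.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    done: bool,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,  # assumed discount factor, not taken from the PR
) -> float:
    """Double DQN bootstrap target for one transition (hypothetical helper)."""
    if done:
        # Terminal transitions do not bootstrap from the next state.
        return reward
    # The online network selects the greedy next action ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... and the target network evaluates that selected action.
    return reward + gamma * next_q_target[best_action]


def mean_recent_reward(reward_list: Sequence[float], window: int = 100) -> float:
    """Mean episode reward over at most the last `window` episodes (hypothetical helper)."""
    recent = reward_list[-window:]
    # Divide by the number of episodes actually present, not by `window`.
    return sum(recent) / len(recent) if recent else 0.0
```

Separately, the `load_state_dict` call in the hunk copies all weights every `sync_rate` steps, which is usually described as a hard update even though the code comment calls it a soft update.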