Feature/split p2e #151

michele-milesi · 2023-11-09T13:32:09Z

Summary

Describe the purpose of the pull request, including:

Split p2e_dv1 into two separate scripts: p2e_dv1_exploration and p2e_dv1_finetuning
Split p2e_dv2 into two separate scripts: p2e_dv2_exploration and p2e_dv2_finetuning
Finetuning can now start either from the buffer used during exploration or from a completely new one. If the buffer is new, the user can choose whether the first experiences (until learning_start is reached) are collected with the actor_exploration or the actor_task. After learning_start, only the actor_task collects experiences.
Added the p2e_dv3 algorithm. It works like the other P2E algorithms, except that it allows more than one exploration critic to be defined. Each critic learns to predict either the intrinsic or the task-related targets. Moreover, for actor learning, each critic has a weight that sets its importance in the actor loss.
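
To make the two behaviours described above concrete, here is a minimal, dependency-free sketch. It is not the code from this PR: the function names, argument names, and the string return values are hypothetical. It also rests on three assumptions the description does not state: that learning_start is reached when the step counter equals it, that a buffer reused from exploration is always filled by the actor_task, and that per-critic weights are combined as a plain weighted sum.

```python
from __future__ import annotations


def select_collection_actor(
    policy_step: int,
    learning_start: int,
    buffer_is_new: bool,
    prefill_with_exploration_actor: bool,
) -> str:
    """Pick the actor that collects experience at this finetuning step.

    Hypothetical helper: names and return values are illustrative only.
    """
    # The user's choice only applies while a *new* buffer is being prefilled,
    # i.e. before learning_start is reached (boundary assumed to be exclusive).
    in_prefill = buffer_is_new and policy_step < learning_start
    if in_prefill and prefill_with_exploration_actor:
        return "actor_exploration"
    # After learning_start (or with a reused buffer, by assumption),
    # only the task actor collects experience.
    return "actor_task"


def combine_critic_terms(
    critic_terms: dict[str, float],
    critic_weights: dict[str, float],
) -> float:
    """Weight each exploration critic's contribution to the actor objective.

    Assumption: the contributions are combined as a weighted sum. The PR text
    only says each critic has a weight that sets its importance.
    """
    return sum(critic_weights[name] * term for name, term in critic_terms.items())


if __name__ == "__main__":
    # New buffer, prefill with the exploration actor, learning_start = 100.
    for step in (0, 99, 100):
        print(step, select_collection_actor(step, 100, True, True))

    # Two hypothetical critics: one on intrinsic targets, one on task targets.
    terms = {"intrinsic": 2.0, "task": 4.0}
    weights = {"intrinsic": 0.5, "task": 0.25}
    print(combine_critic_terms(terms, weights))
```

In a real training loop the critic terms would be tensors rather than floats; plain floats are used here only to keep the sketch self-contained.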

Type of Change

Please select the one relevant option below:

Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist

Please confirm that the following tasks have been completed:

I have tested my changes locally and they work as expected. (Please describe the tests you performed.)
I have added unit tests for my changes, or updated existing tests if necessary.
I have updated the documentation, if applicable.
I have installed pre-commit and run locally for my code changes.

* feat: implemented p2e_dv3
* feat: added the possibility to have more critics for exploration
* tests: added p2e_dv3 test
* docs: update p2e_dv3 docs
* docs: update
* fix: p2e_dv3 refactoring
* fix: checkpoint
* Fix missing 0.5 value
* feat: add validate args to p2e_dv3
* feat: uniform p2e_dv3 with last improvements
* fix: ppo tests
* feat: split exploration and finetuning
* fix: resume from checkpoint controls
* fix: bugs
* tests: added p2e_dv3 and resume from checkpoint tests
* fix: p2e dv3 resume from checkpoint
* tests: update p2e dv3 test
* feat: added p2e_dv3 evaluation
* fix: evaluate and __init__
* fix: cli controls
* fix: added detach() when learning world model in exploration
* fix: checks in cli
* fix: exploration amount
* fix: removed minedojo test cfgs

---------

Co-authored-by: belerico_t <[email protected]>

michele-milesi added 4 commits November 9, 2023 12:35

p2e dv1 and p2e dv2 split into exploration and finetuning

c30e6ae

fix: exploration amount

01a09c2

fix: change actor from exploration to task when starting training

eb176b8

fix: from __future__ import annotations

43af3e9

michele-milesi requested a review from belerico November 9, 2023 13:32

michele-milesi and others added 5 commits November 9, 2023 14:53

fix: exploration amount

9161569

docs: added p2e readme

92eeb6a

merge: main into split-p2e

915faea

fix: buffer load

cbcd02d

michele-milesi marked this pull request as ready for review November 9, 2023 16:27

belerico approved these changes Nov 13, 2023

belerico merged commit 9b68f22 into main Nov 13, 2023
7 checks passed

michele-milesi deleted the feature/split-p2e branch November 15, 2023 09:15
