Feature/split p2e #151

Merged
belerico merged 9 commits into main from feature/split-p2e on Nov 13, 2023

@michele-milesi michele-milesi commented Nov 9, 2023

Summary

Describe the purpose of the pull request, including:

  • Split p2e_dv1 into two separate scripts: p2e_dv1_exploration and p2e_dv1_finetuning
  • Split p2e_dv2 into two separate scripts: p2e_dv2_exploration and p2e_dv2_finetuning
  • Finetuning can now start either from the buffer used during exploration or from a completely new one. With a new buffer, the user chooses whether the first experiences (until learning_start is reached) are collected with actor_exploration or with actor_task. After learning_start, only actor_task collects experiences.
  • Added the p2e_dv3 algorithm. It works like the other P2E algorithms, except that it allows more than one exploration critic to be defined. Each critic learns to predict either the intrinsic or the task-related targets. For actor learning, each critic also has a weight that sets its importance in the actor loss.
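
The two behavioural changes above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the code merged in this PR: `CriticSpec`, `select_collection_actor`, `weighted_actor_objective`, and the `new_buffer` / `prefill_with_exploration` flags are hypothetical names chosen for the example. It also assumes that when the exploration buffer is reused, actor_task collects from the first step, which the description does not state explicitly.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CriticSpec:
    """Hypothetical description of one exploration critic."""

    weight: float      # importance of this critic in the actor loss
    reward_type: str   # "intrinsic" or "task": which targets the critic predicts


def select_collection_actor(
    step: int,
    learning_start: int,
    new_buffer: bool,
    prefill_with_exploration: bool,
) -> str:
    """Return the name of the actor that collects experience during finetuning.

    Only a brand-new buffer may be prefilled with actor_exploration, and only
    until learning_start is reached. In every other case actor_task collects
    (assumption for the reused-buffer case, see the lead-in).
    """
    if new_buffer and prefill_with_exploration and step < learning_start:
        return "actor_exploration"
    return "actor_task"


def weighted_actor_objective(
    critics: Dict[str, CriticSpec],
    advantages: Dict[str, float],
) -> float:
    """Combine per-critic advantage estimates into one weighted scalar."""
    missing = set(critics) - set(advantages)
    if missing:
        raise KeyError(f"No advantage estimate for critics: {sorted(missing)}")
    return sum(spec.weight * advantages[name] for name, spec in critics.items())


if __name__ == "__main__":
    critics = {
        "intrinsic": CriticSpec(weight=1.0, reward_type="intrinsic"),
        "extrinsic": CriticSpec(weight=0.5, reward_type="task"),
    }
    print(select_collection_actor(0, 100, True, True))
    print(weighted_actor_objective(critics, {"intrinsic": 2.0, "extrinsic": 4.0}))
```

In a real training loop the advantages would be tensors computed from imagined trajectories rather than floats. The scalar version only shows how the per-critic weights enter the sum.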

Type of Change

Please select the one relevant option below:

  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist

Please confirm that the following tasks have been completed:

  • I have tested my changes locally and they work as expected. (Please describe the tests you performed.)
  • I have added unit tests for my changes, or updated existing tests if necessary.
  • I have updated the documentation, if applicable.
  • I have installed pre-commit and run locally for my code changes.

michele-milesi and others added 5 commits November 9, 2023 14:53
* feat: implemented p2e_dv3

* feat: added the possibility to have more critics for exploration

* tests: added p2e_dv3 test

* docs: update p2e_dv3 docs

* docs: update

* fix: p2e_dv3 refactoring

* fix: checkpoint

* Fix missing 0.5 value

* feat: add validate args to p2e_dv3

* feat: uniform p2e_dv3 with last improvements

* fix: ppo tests

* feat: split exploration and finetuning

* fix: resume from checkpoint controls

* fix: bugs

* tests: added p2e_dv3 and resume from checkpoint tests

* fix: p2e dv3 resume from checkpoint

* tests: update p2e dv3 test

* feat: added p2e_dv3 evaluation

* fix: evaluate and __init__

* fix: cli controls

* fix: added detach() when learning world model in exploration

* fix: checks in cli

* fix: exploration amount

* fix: removed minedojo test cfgs

---------

Co-authored-by: belerico_t <[email protected]>
@michele-milesi michele-milesi marked this pull request as ready for review November 9, 2023 16:27
@belerico belerico merged commit 9b68f22 into main Nov 13, 2023
7 checks passed
@michele-milesi michele-milesi deleted the feature/split-p2e branch November 15, 2023 09:15