Summary
Describe the purpose of the pull request, including:
Split p2e_dv1 into two separate scripts: p2e_dv1_exploration and p2e_dv1_finetuning.
Split p2e_dv2 into two separate scripts: p2e_dv2_exploration and p2e_dv2_finetuning.
learning_start is reached) with the actor_exploration or the actor_task. After learning_start, only the actor_task collects experiences.
p2e_dv3 algorithm. It works like the other P2E algorithms, except that it allows more than one exploration critic to be defined. Each critic learns to predict either the intrinsic or the task-related targets. Moreover, for actor learning, each critic has a weight that sets its importance in the actor loss.
Type of Change
Please select the one relevant option below:
Checklist
Please confirm that the following tasks have been completed:
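As an illustration of the p2e_dv3 description in the Summary, the sketch below shows one way per-critic weights could enter the actor loss. It is not taken from the pull request: the CriticSpec class, the weighted_actor_objective function, the critic names, and the choice of a plain weighted sum are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CriticSpec:
    """Hypothetical description of one exploration critic."""

    weight: float      # importance of this critic in the actor loss
    reward_type: str   # "intrinsic" or "task"


def weighted_actor_objective(
    critics: Dict[str, CriticSpec],
    values: Dict[str, List[float]],
) -> List[float]:
    """Combine per-critic value estimates into one per-step objective.

    Assumption: the combination is a plain weighted sum over critics.
    """
    horizon = len(next(iter(values.values())))
    total = [0.0] * horizon
    for name, spec in critics.items():
        for t, value in enumerate(values[name]):
            total[t] += spec.weight * value
    return total


# Two hypothetical critics: one trained on intrinsic targets, one on task targets.
critics = {
    "intrinsic": CriticSpec(weight=0.5, reward_type="intrinsic"),
    "extrinsic": CriticSpec(weight=1.0, reward_type="task"),
}
# Made-up value estimates over a two-step imagined trajectory.
values = {
    "intrinsic": [2.0, 4.0],
    "extrinsic": [1.0, 0.5],
}

objective = weighted_actor_objective(critics, values)
# The actor maximizes the objective, so the loss is its negated mean.
actor_loss = -sum(objective) / len(objective)
```

Setting a critic's weight to zero removes its influence on the actor in this sketch, while the critic itself could still be trained on its own targets.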