Missing Regularization in GCL #627

Closed · 7 tasks · Tracked by #548
zhixiongzh opened this issue Mar 29, 2023 · 1 comment

Labels: discussion (Discussion of a typical issue)

@zhixiongzh

  • I have marked all applicable categories:
    • exception-raising bug
    • RL algorithm bug
    • system worker bug
    • system utils bug
    • code design/refactor
    • documentation request
    • [+] new feature request
  • [+] I have visited the readme and doc
  • [+] I have searched through the issue tracker and pr tracker
  • [+] I have mentioned version numbers, operating system and environment, where applicable:
    >>> import ding, torch, sys
    >>> print(ding.__version__, torch.__version__, sys.version, sys.platform)
    v0.4.6 1.10.0 3.7.11 (default, Jul 27 2021, 14:32:16)
    [GCC 7.5.0] linux

Dear Developers,

I am looking into the implementation of the guided cost reward model. In the training process there is only the IOC loss, but not the regularization terms g_lcr and g_mono. Did I miss them, or are they just not implemented in the code?
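
For reference, here is my reading of the two regularizers from the paper as a minimal PyTorch sketch (my own illustration, not DI-engine code; cost is assumed to be a 1-D tensor of per-time-step costs c_theta(x_t) along one trajectory):

    import torch

    def g_lcr(cost: torch.Tensor) -> torch.Tensor:
        # Local constant rate: penalizes rapid changes in the slope of the
        # learned cost along the trajectory,
        # sum_t [(c[t+1] - c[t]) - (c[t] - c[t-1])]^2.
        diff = cost[1:] - cost[:-1]
        return ((diff[1:] - diff[:-1]) ** 2).sum()

    def g_mono(cost: torch.Tensor) -> torch.Tensor:
        # Monotonicity: encourages the cost to decrease along a demonstration,
        # sum_t [max(0, c[t] - c[t-1] - 1)]^2.
        diff = cost[1:] - cost[:-1]
        return (torch.clamp(diff - 1.0, min=0.0) ** 2).sum()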

In addition, in the guided cost learning paper, the IOC loss L_IOC is computed over different trajectories, each of which should be a complete episode. However, in DI-engine, the training data consists of time-steps sampled from trajectories, which means the time-steps in the training data are not from a complete episode and may also contain repeated time-steps due to sampling. Is this designed on purpose, or am I misunderstanding the paper?

Best regards
Zhixiong

@PaParaZz1 added the discussion label Mar 29, 2023
@PaParaZz1 (Member)

At the beginning of the GCL implementation, we discussed whether to use whole episodes or fixed-length, non-overlapping trajectories. We checked the theoretical details in the original paper and conducted some comparison experiments. The episode version didn't show an obvious performance gain, so we added the trajectory version to DI-engine as the default for simplicity.
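
Schematically, the trajectory version just splits each collected episode into fixed-length, non-overlapping chunks before reward-model training, roughly like this simplified sketch (illustrative names, not the actual DI-engine code):

    from typing import Dict, List

    def split_episode(episode: List[Dict], traj_len: int) -> List[List[Dict]]:
        # Cut one episode (a list of time-steps) into fixed-length,
        # non-overlapping trajectories; the short tail is dropped here,
        # though it could also be padded or sampled with replacement.
        return [
            episode[i:i + traj_len]
            for i in range(0, len(episode) - traj_len + 1, traj_len)
        ]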

The regularization terms in GCL are designed to help the specific RL algorithm used in the paper (Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics). When we combine GCL with more recent DRL algorithms like DQN, we can omit these terms, so we don't implement them in the current version.
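
What remains is then the sample-based IOC objective alone. Roughly, assuming cost_demo and cost_samp hold summed per-trajectory costs c_theta(tau) for demonstration and sampled trajectories, and omitting the importance weights for brevity (a sketch, not the exact DI-engine implementation):

    import math
    import torch

    def ioc_loss(cost_demo: torch.Tensor, cost_samp: torch.Tensor) -> torch.Tensor:
        # L_IOC = (1/N) * sum_i c(tau_demo_i)
        #         + log((1/M) * sum_j exp(-c(tau_samp_j)))
        # The second term is a sample-based estimate of the log partition
        # function; logsumexp keeps it numerically stable.
        demo_term = cost_demo.mean()
        sample_term = torch.logsumexp(-cost_samp, dim=0) - math.log(cost_samp.shape[0])
        return demo_term + sample_term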
