RL Implementation #3

Open
andreasbinder opened this issue Aug 2, 2023 · 2 comments
Comments

@andreasbinder

Hi, thank you for the paper and the interesting concept!

We want to build on your idea using RL methods. However, I could not find the policy implementations in your code.
I assume they are supposed to be here.

It would be great if you could share your code so that I can also experiment on my own :)
Keep up the good work!

@Sheerkay

In your paper, you mentioned using an improved version of the REINFORCE algorithm [32] to directly train the ToT (Tree of Thought) controller and the prompt agent. However, in the code of your GitHub project, I did not find the corresponding reinforcement learning method. It seems that your strategy still relies on natural language prompts.

@phuleratribhuwan

@Sheerkay I came across the same thing while exploring the repository. The code appears to be fairly basic and relies mainly on well-crafted text prompts for the problem. The true power seems to lie in the NLP or GPT capabilities behind it.
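For anyone who wants a starting point while the policy code is unavailable, here is a minimal sketch of plain REINFORCE with a running-average baseline. It is not the authors' implementation and not the improved variant the paper refers to; it only illustrates the basic policy-gradient update on a toy two-armed bandit. The function names (`softmax`, `train_reinforce`), the reward probabilities, and the hyperparameters are all assumptions made up for illustration.

```python
import math
import random


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_reinforce(reward_probs, steps=5000, lr=0.1, seed=0):
    """Train a softmax policy on a Bernoulli bandit with REINFORCE.

    reward_probs[i] is the (hypothetical) chance that action i pays 1.0.
    Returns the final action probabilities of the policy.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(reward_probs)  # policy parameters
    baseline = 0.0                      # running average of rewards

    for _ in range(steps):
        probs = softmax(logits)
        # Sample an action from the current policy.
        action = rng.choices(range(len(probs)), weights=probs)[0]
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Subtracting a baseline reduces the variance of the update.
        advantage = reward - baseline

        # For a softmax policy, d log pi(action) / d logit_k
        # equals 1[k == action] - probs[k].
        for k in range(len(logits)):
            grad_log_pi = (1.0 if k == action else 0.0) - probs[k]
            logits[k] += lr * advantage * grad_log_pi

        # Move the baseline toward the observed reward.
        baseline += 0.05 * (reward - baseline)

    return softmax(logits)


if __name__ == "__main__":
    final_probs = train_reinforce([0.2, 0.8])
    print("action probabilities:", final_probs)
```

In a ToT-style setup, the bandit would presumably be replaced by the controller's decisions (for example, whether to backtrack) and the reward by whether the final solution checks out, but that mapping is a guess and is not confirmed by the repository.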


3 participants