microsoft/Everything-of-Thoughts-XoT #919
Labels
AI-Agents
Autonomous AI agents using LLMs
Algorithms
Sorting, Learning or Classifying. All algorithms go here.
Git-Repo
Source code repository, like GitLab or GitHub.
in-context-learning
Examples of few-shot prompts for in-context learning.
llm
Large Language Models
MachineLearning
ML Models, Training and Inference
Papers
Research papers
prompt-engineering
Developing and optimizing prompts to efficiently use language models for various applications and research topics.
software-engineering
Best practice for software engineering
Everything of Thoughts (XoT): Defying the Law of Penrose Triangle for Thought Generation
This repository contains the implementation of our novel thought-prompting approach, "Everything of Thoughts" (XoT). XoT combines pretrained reinforcement learning and Monte Carlo Tree Search (MCTS) to incorporate external domain knowledge into thoughts, thereby enhancing the capabilities of Large Language Models (LLMs) and enabling them to generalize to unseen problems efficiently. This approach addresses the limitations of existing thought paradigms, particularly the "Penrose triangle" issue, which holds that a thought can exhibit at most two of the three attributes: performance, efficiency, and flexibility.
Set Up
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
To use the framework, you need access to a GPT model. Set up an OpenAI API key and store it in the environment variable OPENAI_API_KEY.
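For example, in a Unix-like shell:
export OPENAI_API_KEY="sk-..."  # replace with your actual key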
Clone the repository and install the required packages.
git clone https://github.com/microsoft/Everything-of-Thoughts-XoT.git
cd Everything-of-Thoughts-XoT
conda create -n xot python=3.8
conda activate xot
pip install -r requirements.txt
Training
The ./xot_mcts/ directory contains everything needed for training. Datasets for Game of 24, 8-Puzzle, and Pocket Cube are under ./xot_mcts/{env}/data/, respectively. You can specify the parameters for self-play in main.py. To begin training a model for a specific task, use the provided training and test data. For example:
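(An illustrative invocation: the flags are documented under Args below, but the data path and numeric values here are assumptions rather than values taken from the repository.)
python main.py --env game24 --mode train --training_env game24/data/train.csv --numMCTSSims 2000 --numEps 100 --numIters 10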
Args:
--env: Select your desired framework and game.
--mode: train / test.
--training_env: Training data path.
--numMCTSSims: Number of game moves for MCTS to simulate.
--arenaCompare: Number of games to play during arena play to determine whether the new network will be accepted.
--numEps: Number of complete self-play games to simulate during a new iteration.
--numIters: Number of iterations.
The scripts for replicating the MCTS module, as mentioned in the paper, can be found in ./xot_mcts/scripts/. The core training loop is contained within ./xot_mcts/Coach.py, and ./xot_mcts/MCTS.py is used to perform the Monte Carlo Tree Search. Additional parameters for the Policy/Value network are located in ./xot_mcts/{env}/pytorch/NNet.py; here you can specify options such as the CUDA flag, batch size, number of epochs, learning rate, and more.
Once training is complete, you can find the trained models in the ./xot_mcts/temp/{env}/ directory. For inference, move the trained Policy/Value model to ./xot_all_in_one/models/{env}/.
To evaluate the performance of a single MCTS module, use the test mode. Here are two examples:
(single solution)
python main.py --env game24 --mode test --test_env game24/data/test.csv --numMCTSSims 2000 --arenaCompare 137 --multi_sol 0
(multi-solution)
python main.py --env game24 --mode test --test_env game24/data/test.csv --numMCTSSims 2000 --arenaCompare 137 --multi_sol 1 --multi_times 500
Test logs are written to ./xot_mcts/logs/.
Inference
The ./xot_all_in_one/ directory houses all the necessary resources for performing inference. In addition to XoT, we also offer implementations of IO, CoT, CoT-SC, ToT, and GoT. For each method, we provide a corresponding solver located in ./xot_all_in_one/xot/controller/solver/{method}.py.
Model Path
For inference, the trained Policy/Value model for the MCTS module should be placed in ./xot_all_in_one/models/{env}/.
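For example, for Game of 24 (the checkpoint filename below is hypothetical; use whatever name your training run produced under ./xot_mcts/temp/game24/):
cp ./xot_mcts/temp/game24/best.pth.tar ./xot_all_in_one/models/game24/  # hypothetical filename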
Executing the Scripts
We provide a variety of scripts to execute XoT on several tasks, such as Game of 24, 8-Puzzle, and Pocket Cube, located in ./xot_all_in_one/scripts/. Here are some sample commands for the scripts:
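(A hypothetical invocation; check ./xot_all_in_one/scripts/ for the actual script names, which may differ.)
sh ./xot_all_in_one/scripts/game24.sh  # hypothetical script name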
Configuration
For more details about configurations for other baselines, refer to the files located in the ./xot_all_in_one/config/ directory.
Note that the default value for task_end_index in the config is set to 1. However, this value should be adjusted based on the data size of each specific task. To perform a complete inference run, set task_end_index to 137 for Game of 24, 119 for 8-Puzzle, and 183 for Pocket Cube.
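To locate the task_end_index entries to edit (assuming the configs are plain-text files, which this README does not confirm):
grep -rn "task_end_index" ./xot_all_in_one/config/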
Paper Results
We evaluated XoT on both single-solution and multi-solution problem-solving tasks. Our results demonstrate that XoT significantly outperforms existing approaches in various dimensions, showcasing its remarkable proficiency in addressing complex problems across diverse domains. All the paper results can be found in ./xot_all_in_one/experiments/.
Citations
Your support would be greatly appreciated if you find XoT interesting or useful. Please acknowledge our work by citing the paper and giving this repository a star. For any inquiries, don't hesitate to contact us at [email protected] or simply open an issue. Thank you!
Paper link: https://arxiv.org/abs/2311.04254.
Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repositories using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
License
This project is licensed under the MIT License. See the LICENSE file for details.
Trademarks
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
Suggested labels
None