O1-CODER: An O1 Replication for Coding (Paper)
O1-CODER is an attempt to replicate OpenAI's O1 model, focused on coding tasks. The approach combines Reinforcement Learning (RL) and Monte Carlo Tree Search (MCTS) to enhance the model’s System-2 thinking capabilities, aiming to generate more efficient and logical code.
The core components of O1-CODER are:
- Test Case Generator (TCG): Automatically generates standardized test cases that are used to evaluate the correctness of the generated code (a sketch of how such test-case feedback could be turned into a reward appears after this list).
- Self-Play and Reinforcement Learning: The model generates reasoning data through self-play and uses RL together with MCTS to iteratively optimize the policy model (see the MCTS sketch below). The two components run in an iterative cycle, continuously refining the model's systematic reasoning on coding tasks.
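To make the TCG's role concrete, here is a minimal sketch of how generated test cases could be turned into an outcome reward: run the candidate program against each case and use the pass rate as the score. The test-case format (stdin text plus expected stdout) and the subprocess-based harness are assumptions for illustration, not the repository's actual interface.

```python
import os
import subprocess
import tempfile

def pass_rate(candidate_code: str, test_cases: list[dict], timeout: float = 5.0) -> float:
    """Hypothetical outcome reward: fraction of generated test cases the candidate passes.

    Each test case is assumed to be a dict with 'input' (stdin text) and
    'expected' (expected stdout); this is one common convention, not
    necessarily the one used by O1-CODER's Test Case Generator.
    """
    passed = 0
    # Write the candidate program to a temporary file so it can be executed.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(candidate_code)
        path = f.name
    try:
        for case in test_cases:
            try:
                result = subprocess.run(
                    ["python", path],
                    input=case["input"],
                    capture_output=True,
                    text=True,
                    timeout=timeout,
                )
                if result.stdout.strip() == case["expected"].strip():
                    passed += 1
            except subprocess.TimeoutExpired:
                continue  # timed-out runs count as failures
    finally:
        os.unlink(path)
    return passed / max(len(test_cases), 1)
```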
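And here is a generic sketch of the MCTS loop such a reward could plug into, with partial reasoning/code steps as tree nodes. `propose_steps` (candidate next steps from the policy model) and `evaluate_terminal` (e.g. the pass-rate reward above) are placeholder hooks, and the exploration constant is illustrative rather than taken from the paper.

```python
import math
import random
from dataclasses import dataclass, field

@dataclass
class Node:
    state: list[str]                       # partial reasoning/code steps so far
    parent: "Node | None" = None
    children: list["Node"] = field(default_factory=list)
    visits: int = 0
    value: float = 0.0                     # accumulated reward

def ucb(node: Node, c: float = 1.4) -> float:
    """Upper confidence bound used for child selection."""
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits
    )

def mcts(root: Node, propose_steps, evaluate_terminal, is_terminal, n_sims: int = 100) -> Node:
    """Generic MCTS loop: select by UCB, expand with policy proposals,
    roll out to a terminal state, and back up the test-case reward.
    Assumes the root state is non-terminal."""
    for _ in range(n_sims):
        # Selection: descend to a leaf by UCB.
        node = root
        while node.children:
            node = max(node.children, key=ucb)
        # Expansion: add policy-proposed next steps.
        if not is_terminal(node.state):
            for step in propose_steps(node.state):
                node.children.append(Node(state=node.state + [step], parent=node))
            node = random.choice(node.children)
        # Simulation: roll out randomly to a terminal state and score it.
        state = node.state
        while not is_terminal(state):
            state = state + [random.choice(propose_steps(state))]
        reward = evaluate_terminal(state)
        # Backpropagation: propagate the reward up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Return the most-visited child as the chosen next step.
    return max(root.children, key=lambda n: n.visits)
```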
Recent updates:
- Updated the Reward Aggregator.
- Updated the training code for the process reward model and the Test Case Generator.
- Updated the MCTS-based data synthesis code for O1-CODER.
- Updated the technical report for O1-CODER.
TODO: Reinforcement Learning code, curated datasets, and derived models.
TODO: Reinforcement Fine-Tuning (RFT) version of O1-Coder. Because the test case generator can produce diverse process supervision data from only a small amount of ground-truth code, the RFT version will skip using CoT data to initialize the policy model.
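As a rough illustration of what such an RFT-style pipeline could look like, the sketch below performs rejection sampling: sample several candidate solutions per problem, score them with TCG-generated test cases, and keep only the passing ones as fine-tuning data. All function names, argument layouts, and thresholds here are hypothetical placeholders, not the repository's actual API.

```python
from typing import Callable

def build_rft_dataset(
    problems: list[dict],
    sample_solutions: Callable[[str, int], list[str]],
    generate_tests: Callable[[str], list[dict]],
    score: Callable[[str, list[dict]], float],
    n_samples: int = 16,
    keep_threshold: float = 1.0,
) -> list[dict]:
    """Hypothetical rejection-sampling loop for building RFT data.

    For each problem, sample candidate solutions from the current policy,
    score them against TCG-generated test cases (e.g. with a pass-rate
    reward like the sketch above), and keep only candidates that clear the
    threshold as fine-tuning targets. Names and data layout are assumptions.
    """
    dataset = []
    for problem in problems:
        tests = generate_tests(problem["prompt"])            # TCG-style generated tests
        for solution in sample_solutions(problem["prompt"], n_samples):
            if score(solution, tests) >= keep_threshold:
                dataset.append({"prompt": problem["prompt"], "completion": solution})
    return dataset
```

A threshold of 1.0 keeps only solutions that pass every generated test case; a lower value would trade data quality for quantity.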
This work is released under the MIT License. See the LICENSE file for more details. By using this code or associated materials, you agree to comply with the terms outlined in the license.
If you use O1-CODER or parts of this work in your research or applications, please cite the following paper:
@misc{zhang2024o1codero1replicationcoding,
  title={O1-Coder: An O1 Replication for Coding},
  author={Yuxiang Zhang and Shangxi Wu and Yuqi Yang and Jiangming Shu and Jinlin Xiao and Chao Kong and Jitao Sang},
  year={2024},
  eprint={2412.00154},
  archivePrefix={arXiv},
  primaryClass={cs.SE},
  url={https://arxiv.org/abs/2412.00154},
}