-
ProgCo: Program Helps Self-Correction of Large Language Models,
arXiv, 2501.01264
Xiaoshuai Song, Yanan Wu, Weixun Wang, ..., Wenbo Su, Bo Zheng
-
Dynamic Scaling of Unit Tests for Code Reward Modeling,
arXiv, 2501.01054
Zeyao Ma, Xiaokang Zhang, Jing Zhang, ..., Sijia Luo, Jie Tang
-
Training Software Engineering Agents and Verifiers with SWE-Gym,
arXiv, 2412.21139
Jiayi Pan, Xingyao Wang, Graham Neubig, ..., Alane Suhr, Yizhe Zhang · (SWE-Gym - SWE-Gym)
-
Outcome-Refining Process Supervision for Code Generation,
arXiv, 2412.15118
Zhuohao Yu, Weizheng Gu, Yidong Wang, ..., Wei Ye, Shikun Zhang · (ORPS - zhuohaoyu)
-
🌟 o1-Coder: an o1 Replication for Coding,
arXiv, 2412.00154
Yuxiang Zhang, Shangxi Wu, Yuqi Yang, ..., Chao Kong, Jitao Sang · (O1-CODER - ADaM-BJTU)
-
Leveraging training and search for better software engineering agents
· (𝕏)
-
Bug fixes & analysis for Qwen 2.5 𝕏
· (t)
-
GitChameleon: Unmasking the Version-Switching Capabilities of Code Generation Models,
arXiv, 2411.05830
Nizar Islah, Justine Gehring, Diganta Misra, ..., Terry Yue Zhuo, Massimo Caccia · (GitChameleon - NizarIslah)
-
🌟 Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level,
arXiv, 2411.03562
Antoine Grosnit, Alexandre Maraval, James Doran, ..., Haitham Bou-Ammar, Jun Wang
-
🌟 OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models,
arXiv, 2411.04905
Siming Huang, Tianhao Cheng, Jason Klein Liu, ..., Yinghui Xu, Wei Chu · (opencoder-llm.github)
-
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions,
arXiv, 2410.20424
Ziming Li, Qianbo Zang, David Ma, ..., Wenhao Huang, Ge Zhang · (AutoKaggle - multimodal-art-projection)
-
SelfCodeAlign: Self-Alignment for Code Generation,
arXiv, 2410.24198
Yuxiang Wei, Federico Cassano, Jiawei Liu, ..., Arjun Guha, Lingming Zhang · (selfcodealign - bigcode-project)
-
The Heap: A Contamination-Free Multilingual Code Dataset for Evaluating Large Language Models,
arXiv, 2501.09653
Jonathan Katzy, Razvan Mihai Popescu, Arie van Deursen, ..., Maliheh Izadi
-
🌟 CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings,
arXiv, 2501.01257
Shanghaoran Quan, Jiaxi Yang, Bowen Yu, ..., Binyuan Hui, Junyang Lin · (codeelo-bench.github)
-
HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation,
arXiv, 2412.21199
Zhaojian Yu, Yilun Zhao, Arman Cohan, ..., Xiao-Ping Zhang · (CodeEval-Pro - CodeEval-Pro) · (answers111.github)
-
Evaluating and Aligning CodeLLMs on Human Preference,
arXiv, 2412.05210
Jian Yang, Jiaxi Yang, Ke Jin, ..., Binyuan Hui, Junyang Lin · (Qwen2.5-Coder - QwenLM) · (arxiv) · (huggingface)
-
Can Language Models Replace Programmers? REPOCOD Says 'Not Yet',
arXiv, 2410.21647
Shanchao Liang, Yiran Hu, Nan Jiang, ..., Lin Tan
-
cursor + claude is cool but not coming for our jobs either imo 𝕏
-
SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI,
arXiv, 2410.11096
Yu Yang, Yuzhou Nie, Zhun Wang, ..., Bo Li, Dawn Song · (seccodeplt.github) · (huggingface)
-
deepseek-engineer - Doriandarko
-
SWE-Gym - SWE-Gym
-
gitingest - cyclotruc
-
llamacoder - Nutlope
-
MPLSandbox - Ablustrund
· (arxiv)
-
Lingma-SWE-GPT - LingmaTongyi
SoftWare Engineering Process Data Synthesis and Inference Workflow for Lingma SWE-GPT
-
Awesome-Code-LLM - huybery
-
aider - Aider-AI
-
screenshot-to-code - abi
-
composio - ComposioHQ
-
fast-apply - kortix-ai
Pipeline for Data Generation & Fine-Tuning Qwen2.5 Coder Models
-
sage - Storia-AI
Chat with any codebase
- PearAI: The Open Source AI Code Editor
- NEO: A fully autonomous Machine Learning Engineer
- The first agentic IDE, and then some. The Windsurf Editor is where the work of developers and AI truly flow together, allowing for a coding experience that feels like literal magic
- Edit your codebase and run commands quickly with natural language in your terminal.
- Find out how we’re evolving GitHub and GitHub Copilot—and get access to the latest previews and GA releases.