This repository contains the code and resources for our AI-powered theorem-proving project, designed to tackle complex formal proofs using language models.
For more details, please refer to our paper.
To get started with model deployment and experimentation, follow these setup instructions.
We provide a complete setup for the Isabelle environment to ensure smooth integration with the project’s scripts. Please refer to the Isabelle pipeline for detailed instructions.
Ensure that you have a conda environment with PyTorch and CUDA installed. Then, run the following command to set up the required environment:
cd project_directory
pip install -r requirements.txt
We provide curated datasets essential for initializing the expert learning phase. These datasets can be downloaded from our datasets hub. The full data preparation pipeline for generating these datasets is located in data preparation.
We offer a comprehensive training pipeline for model development. Please refer to the training pipeline to get started.
Our inference pipeline is designed for applying models to new proof tasks efficiently. Please refer to the inference pipeline for instructions.
To ensure the correctness of generated proofs, we provide a verification pipeline. Details can be found in the verification pipeline.
We provide our model weights, which you can use directly for model deployment and experimentation. The weights can be downloaded from our model hub.
If you use our code or data, please cite our paper:
@article{zhao2024subgoalxl,
title = {SubgoalXL: Subgoal-based Expert Learning for Theorem Proving},
author = {Zhao, Xueliang and Zheng, Lin and Bo, Haige and Hu, Changran and Thakker, Urmish and Kong, Lingpeng},
journal={arXiv preprint arXiv:2408.11172},
url={https://arxiv.org/abs/2408.11172},
year = {2024},
}
Our research was greatly accelerated by SambaNova Systems' advanced technology, which enabled us to surpass the 1000 tokens per second barrier for large language model inference. This capability was critical in generating complex formal statements and proofs with exceptional speed, leveraging the outstanding performance of the SN20, SN30, and SN40 series chips.