Fast Apply v1.0 Release Notes
We're excited to announce the release of Fast Apply v1.0, a pipeline for data generation and fine-tuning of Qwen2.5 Coder models designed for instant code application.
Key Features
- High-speed code editing while maintaining accuracy:
  - 1.5B model: ~340 tokens/second
  - 7B model: ~150 tokens/second
- Optimized for deployment on fast providers like Fireworks
- Designed to power SoftGen AI (https://softgen.ai/)
Models and Dataset
The models and dataset are available on HuggingFace.
Technical Details
- Fine-tuned using QLoRA with 4-bit quantization
- Base models: Qwen2.5 Coder (1.5B and 7B versions)
- Dataset: ~5,600 examples (80% TypeScript/TSX, 15% Python, 5% Other)
- Hyperparameters (a rough configuration sketch follows this list):
  - 1.5B model: rank (r) = 32, alpha = 16
  - 7B model: rank (r) = 16, alpha = 16
  - Training epochs: 1
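As a rough, unofficial sketch of what this QLoRA setup looks like with Unsloth (the sequence length, target modules, and exact base-model checkpoint below are assumptions for illustration, not values taken from the repository):

```python
# Illustrative QLoRA setup with Unsloth for the 1.5B variant.
# Checkpoint name, max_seq_length, and target_modules are assumptions.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-Coder-1.5B-Instruct",  # assumed base checkpoint
    max_seq_length=8192,                            # assumed context length
    load_in_4bit=True,                              # 4-bit quantization (QLoRA)
)

model = FastLanguageModel.get_peft_model(
    model,
    r=32,            # LoRA rank for the 1.5B model (16 for the 7B model)
    lora_alpha=16,   # LoRA alpha used for both models
    lora_dropout=0,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
# Training then runs for a single epoch, e.g. with TRL's SFTTrainer.
```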
Usage
Inference prompt structure:
<|im_start|>user
Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.
<code>{original_code}</code>
<update>{update_snippet}</update>
Provide the complete updated code.
Expected model output:
<|im_start|>assistant
<updated-code>[Complete updated file]</updated-code>
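A minimal, illustrative way to build this prompt and pull the result back out of the tags (the helper names below are ours, not part of the repository):

```python
# Sketch: format the prompt for a chat-style request and extract the
# merged file from the model's <updated-code> tags.
import re

PROMPT_TEMPLATE = """Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.
<code>{original_code}</code>
<update>{update_snippet}</update>
Provide the complete updated code."""

def build_messages(original_code: str, update_snippet: str) -> list[dict]:
    """Wrap the user prompt in a chat message for the model's chat template."""
    return [{"role": "user",
             "content": PROMPT_TEMPLATE.format(original_code=original_code,
                                               update_snippet=update_snippet)}]

def extract_updated_code(completion: str) -> str:
    """Return the file contents enclosed in <updated-code> ... </updated-code>."""
    match = re.search(r"<updated-code>(.*?)</updated-code>", completion, re.DOTALL)
    if match is None:
        raise ValueError("Model output did not contain <updated-code> tags")
    return match.group(1).strip()
```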
Deployment
Instructions for deploying on Fireworks are available in the repository.
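For reference, a deployed model can be called through Fireworks' OpenAI-compatible endpoint. The sketch below is illustrative only: the model identifier is a placeholder for your own deployment, and it reuses the helpers from the Usage sketch above.

```python
# Illustrative request to a Fast Apply deployment on Fireworks via the
# OpenAI-compatible API. The model identifier is a placeholder.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="YOUR_FIREWORKS_API_KEY",
)

response = client.chat.completions.create(
    model="accounts/<your-account>/models/<your-fast-apply-deployment>",  # placeholder
    messages=build_messages(original_code, update_snippet),  # helpers from the Usage sketch
    max_tokens=8192,
    temperature=0,  # deterministic merges are usually preferable
)
print(extract_updated_code(response.choices[0].message.content))
```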
Contributing
We welcome contributions to improve Fast Apply! Check out our GitHub repository for ways to contribute, including:
- Adding more diverse language data
- Reporting bugs
- Requesting features
- Submitting code improvements
- Sharing fine-tuning optimizations
Acknowledgements
This project builds on open-source NextJS-like projects for its dataset and uses Unsloth for fine-tuning.
For more information and detailed documentation, please visit our GitHub repository.