In progress: StarCoder2 AWQ
- Python 3.11
- NVIDIA GPU (for running the model on the server); only tested on 20-series cards and newer
- Install the required dependencies:
pip install -r requirements.txt
- Launch the server:
python3 server.py
This should start the server on the default port used by the HF extension. The vLLM library downloads the model and performs the required transformations, so startup can take a while before the server accepts requests.
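Because the model download and transformations happen at startup, it can help to wait until the server is actually accepting connections before pointing the extension at it. The following is a minimal sketch, not part of this repository: `wait_for_port` is a hypothetical helper, and the host and port you pass in are assumptions that must match whatever `server.py` actually binds to.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout.

    Hypothetical helper: it only checks that something is listening,
    not that the model has finished loading or that requests succeed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, after starting `python3 server.py` in another terminal, calling `wait_for_port("127.0.0.1", port, timeout=600)` with the port the server reports would block until the server is reachable or ten minutes have passed. A long timeout is a reasonable choice here since the first launch includes the model download.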