Popular repositories
- onnxruntime (C++, forked from microsoft/onnxruntime)
  ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
- neural-compressor (Python, forked from intel/neural-compressor)
  Intel® Neural Compressor (formerly known as Intel® Low Precision Optimization Tool), which aims to provide unified APIs for network compression technologies such as low precision quantization, spar…
- transformers (Python, forked from huggingface/transformers)
  🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
- streaming-llm (Python, forked from mit-han-lab/streaming-llm)
  Efficient Streaming Language Models with Attention Sinks
- vllm (Python, forked from vllm-project/vllm)
  A high-throughput and memory-efficient inference and serving engine for LLMs
- ipex-llm (Python, forked from intel-analytics/ipex-llm)
  Accelerate local LLM inference and fine-tuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc,…