zhentaoyu

🎯

Focusing

5 followers · 16 following

intel
Shanghai
densecollections.top

Pinned

intel/neural-speed Public archive

An innovative library for efficient LLM inference via low-bit quantization

C++ · 349 stars · 38 forks

intel/intel-extension-for-transformers Public archive

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Python · 2.1k stars · 211 forks

ggerganov/llama.cpp Public

LLM inference in C/C++

C++ · 69.5k stars · 10k forks

leejet/stable-diffusion.cpp Public

Stable Diffusion and Flux in pure C/C++

C++ · 3.6k stars · 310 forks

vllm-fork Public

Forked from HabanaAI/vllm-fork

A high-throughput and memory-efficient inference and serving engine for LLMs

Python

intel/neural-compressor Public

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Python · 2.3k stars · 258 forks