Starred repositories
A framework for few-shot evaluation of language models.
[arXiv] On-device Sora: Enabling Diffusion-Based Text-to-Video Generation for Mobile Devices
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
High-speed and easy-to-use LLM serving framework for local deployment
Train transformer language models with reinforcement learning.
My own approach at trying to create a conversational GPT
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including CUDA, x86 and ARMv9.
C++ implementations for various tokenizers (sentencepiece, tiktoken etc).
Generative AI extensions for onnxruntime
Code for the NeurIPS 2024 paper: QuaRot, end-to-end 4-bit inference of large language models.
🦀⚙️ Sudoless performance monitoring for Apple Silicon processors. CPU / GPU / RAM usage, power consumption & temperature 🌡️
SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.
Make websites accessible for AI agents
Fully open reproduction of DeepSeek-R1
Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU)
A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
[NeurIPS2024] "Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design", Ruisi Cai, Yeonju Ro, Geon-Woo Kim, Peihao Wang, Babak Ehteshami Bejnordi, Aditya Akella, Z…
Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
Mixture-of-Agents Framework Implementation at Distributed Edge Devices with Theoretical Guarantee of Finite Average Latency
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought
Contains libraries for use in making provably-private applications.
[ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents