inferentia

Here are 9 public repositories matching this topic...

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

amd cuda inference pytorch transformer llama gpt rocm model-serving tpu hpu mlops xpu llm inferentia llmops llm-serving trainium

Updated Feb 18, 2025
Python

aphrodite-engine / aphrodite-engine

Large-scale LLM inference engine

machine-learning cuda intel api-rest lora rocm inference-engine tpu inferentia speculative-decoding

Updated Feb 18, 2025
C++

aws-samples / foundation-model-benchmarking-tool

Foundation model benchmarking tool. Run any model on any AWS platform and benchmark for performance across instance type and serving stack options.

benchmarking benchmark p5 bedrock evaluation-metrics sagemaker g6 p4d g5 foundation-models inferentia generative-ai llama2 deepseek trainium llama3 g6e deepseek-r1

Updated Feb 14, 2025
Jupyter Notebook

aws-solutions-library-samples / guidance-for-machine-learning-inference-on-aws

This Guidance demonstrates how to deploy a machine learning inference architecture on Amazon Elastic Kubernetes Service (Amazon EKS). It addresses the basic implementation requirements as well as ways you can pack thousands of unique PyTorch deep learning (DL) models into a scalable architecture and evaluate performance.

ml eks-cluster mlops-workflow do-framework inferentia graviton3

Updated Jan 30, 2025
Shell

aws-samples / aws-inferentia-huggingface-workshop

CMP314 Optimizing NLP models with Amazon EC2 Inf1 instances in Amazon SageMaker

nlp sagemaker inferentia

Updated Dec 20, 2023
Jupyter Notebook

aws-samples / awsome-fmops

Collection of best practices, reference architectures, examples, and utilities for foundation model development and deployment on AWS.

kubernetes gpu terraform pytorch eks kserve karpenter inferentia generative-ai llm-training llm-inference

Updated Nov 30, 2024
HCL

daekeun-ml / aws-inferentia

This repository provides an easy, hands-on way to get started with AWS Inferentia. A demonstration of this hands-on lab can be seen in the AWS Innovate 2023 - AIML Edition session.

inferentia

Updated Mar 3, 2023
Jupyter Notebook

DarkSector / inf1-sentence-transformers

Sentence Transformers on EC2 Inf1

aws inferentia aws-neuron

Updated Nov 1, 2023
Jupyter Notebook

windson / inferentia-deployments

Deploy Large Models on AWS Inferentia (Inf2) instances.

aws lmi inf2 large-model llm inferentia large-language-model large-model-inference aws-inferentia inferentia-2

Updated Dec 28, 2023
Jupyter Notebook
