
added SVG for Groq model providers #1470

Merged
2 commits merged into infiniflow:main on Jul 12, 2024

Conversation

@paresh2806 (Contributor)

#1432 #1447

This PR adds support for the Groq LLM provider.

Groq is an AI solutions company delivering ultra-low-latency inference with the first-ever LPU™ Inference Engine. The Groq API enables developers to integrate state-of-the-art LLMs, such as Llama 2 and llama3-70b-8192, into low-latency applications with the request limits listed below. Learn more at [groq.com](https://groq.com/).
Supported Models

| ID                 | Requests per Minute | Requests per Day | Tokens per Minute |
|--------------------|---------------------|------------------|-------------------|
| gemma-7b-it        | 30                  | 14,400           | 15,000            |
| gemma2-9b-it       | 30                  | 14,400           | 15,000            |
| llama3-70b-8192    | 30                  | 14,400           | 6,000             |
| llama3-8b-8192     | 30                  | 14,400           | 30,000            |
| mixtral-8x7b-32768 | 30                  | 14,400           | 5,000             |
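
For reference, here is a minimal sketch of a chat call against one of the models in the table above. It assumes the official `groq` Python SDK (which mirrors the OpenAI client interface) and an API key in the `GROQ_API_KEY` environment variable; it is an illustration of the Groq API, not the exact code added in this PR.

```python
import os

from groq import Groq  # pip install groq

# Assumes GROQ_API_KEY is set in the environment.
client = Groq(api_key=os.environ["GROQ_API_KEY"])

# One of the model IDs from the table above.
response = client.chat.completions.create(
    model="llama3-70b-8192",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize what an LPU is in one sentence."},
    ],
    temperature=0.3,
    max_tokens=256,
)

print(response.choices[0].message.content)
```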

Review comments (all resolved):

- rag/llm/__init__.py
- rag/llm/chat_model.py
- api/db/init_data.py (5 threads)
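
The api/db/init_data.py changes presumably register Groq as a model provider and enumerate its chat models. A hypothetical sketch of what such seed data could look like; the field names here are illustrative assumptions, not RAGFlow's actual schema:

```python
# Hypothetical seed data for the Groq provider; field names are
# illustrative, not necessarily RAGFlow's actual schema.
GROQ_FACTORY = {
    "name": "Groq",
    "logo": "",          # the SVG added by this PR would be referenced by the UI
    "tags": "LLM,TEXT",
    "status": "1",
}

# Chat models from the supported-models table above.
GROQ_MODELS = [
    {"llm_name": "gemma-7b-it",        "max_tokens": 8192,  "model_type": "chat"},
    {"llm_name": "llama3-70b-8192",    "max_tokens": 8192,  "model_type": "chat"},
    {"llm_name": "llama3-8b-8192",     "max_tokens": 8192,  "model_type": "chat"},
    {"llm_name": "mixtral-8x7b-32768", "max_tokens": 32768, "model_type": "chat"},
]
```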
Added suggested changes

Co-authored-by: Kevin Hu <[email protected]>
@KevinHuSh KevinHuSh merged commit ddeac9a into infiniflow:main Jul 12, 2024
1 check passed
@lionkingc

Please add the newest models:

- Model ID: llama-3.1-70b-versatile
- Model ID: llama-3.1-8b-instant

Halfknow pushed a commit to Halfknow/ragflow that referenced this pull request Nov 11, 2024