This repository contains the Flask application for a machine learning model serving API. It provides endpoints for generating text, simulating chat conversations, executing tools, and producing grounded responses, each driven by its own set of input parameters.
- Generate text with a given prompt and additional parameters.
- Simulate chat conversations and generate appropriate responses.
- Execute tools with specific arguments in a conversational context.
- Retrieve grounded responses using provided documents and conversation history.
To run this application, you need Python 3 and pip installed on your system. Follow these steps to set up the server:
- Clone the repository:
  git clone <repository-url>
  cd <repository-folder>
- Install dependencies:
  pip install -r requirements.txt
- Run the server:
  python server.py --port 5000 --model 'mlx-community/c4ai-command-r-v01-4bit' --debug
Replace <repository-url> and <repository-folder> with the actual URL and folder name of your cloned repository.
Generates text based on the provided prompt.
Request body:
{
  "prompt": "Enter your prompt here",
  "temperature": 0.2,
  "max_tokens": 131072
}
Response:
{
  "generated_text": "Generated text will be here."
}
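As a minimal sketch of calling this endpoint from Python, the helpers below build and send the JSON request using only the standard library. The base URL assumes the server was started with `--port 5000`, and the route name `/generate` is a hypothetical placeholder; check `server.py` for the actual path.

```python
import json
import urllib.request

# Assumptions: the port matches the --port flag used at startup, and
# "/generate" is a hypothetical route name, not confirmed by this README.
BASE_URL = "http://localhost:5000"
GENERATE_PATH = "/generate"


def build_json_request(url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(url: str, payload: dict, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_json_request(url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `post_json(BASE_URL + GENERATE_PATH, {"prompt": "Enter your prompt here", "temperature": 0.2, "max_tokens": 256})` should return a dictionary whose `generated_text` key holds the model output, assuming the response shape documented here.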
Simulates a chat conversation and generates a response.
Request body:
{
  "conversation": [
    {"role": "user", "content": "User's message"},
    {"role": "assistant", "content": "Assistant's response"}
  ],
  "temperature": 0.2,
  "max_tokens": 131072
}
Response:
{
  "generated_text": "Generated chat response will be here."
}
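For multi-turn use, the client has to carry the conversation history and resend it on every call. The sketch below shows one way to do that; the helper names are hypothetical, the restriction to the `user` and `assistant` roles is an assumption drawn from the example payload, and the default `max_tokens` of 512 is an arbitrary illustrative value.

```python
from typing import Dict, List

Message = Dict[str, str]

# Assumption: only the two roles that appear in the example payload are accepted.
ALLOWED_ROLES = {"user", "assistant"}


def add_turn(conversation: List[Message], role: str, content: str) -> List[Message]:
    """Return a new conversation list with one more message appended."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unsupported role: {role!r}")
    return conversation + [{"role": role, "content": content}]


def chat_payload(
    conversation: List[Message],
    temperature: float = 0.2,
    max_tokens: int = 512,  # illustrative default, not taken from the server
) -> dict:
    """Assemble the request body in the shape of the example payload."""
    return {
        "conversation": conversation,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
```

After each reply, appending the returned `generated_text` with `add_turn(history, "assistant", reply)` keeps the history in step with what the model has already said.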
Runs a specified tool within the conversation context.
Request body:
{
  "conversation": [
    {"role": "user", "content": "User's message"},
    {"role": "assistant", "content": "Assistant's response"}
  ],
  "tools": [
    {
      "name": "internet_search",
      "description": "Searches the internet for the given query.",
      "parameter_definitions": {
        "query": {
          "description": "Query to search the internet with",
          "type": "str",
          "required": true
        }
      }
    }
  ]
}
Response:
{
  "tool_response": "Generated tool response will be here."
}
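Tool definitions are nested enough that building them by hand invites typos. The following sketch assembles the same structure as the example payload from small helpers; `make_parameter`, `make_tool`, and `tool_payload` are hypothetical names introduced here for illustration and are not part of the repository.

```python
import json
from typing import Dict, List


def make_parameter(description: str, type_name: str, required: bool = True) -> dict:
    """Describe one tool argument; type_name is a Python type name such as "str"."""
    return {"description": description, "type": type_name, "required": required}


def make_tool(name: str, description: str, parameters: Dict[str, dict]) -> dict:
    """Bundle a tool name, its description, and its parameter definitions."""
    return {
        "name": name,
        "description": description,
        "parameter_definitions": dict(parameters),
    }


def tool_payload(conversation: List[dict], tools: List[dict]) -> dict:
    """Assemble the request body in the shape of the example payload."""
    return {"conversation": conversation, "tools": tools}


search_tool = make_tool(
    "internet_search",
    "Searches the internet for the given query.",
    {"query": make_parameter("Query to search the internet with", "str")},
)
# json.dumps converts Python's True into the JSON literal true.
body = json.dumps(tool_payload([{"role": "user", "content": "User's message"}], [search_tool]))
```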
Generates grounded responses based on the conversation and supplemental documents.
Request body:
{
  "conversation": [
    {"role": "user", "content": "User's message"},
    {"role": "assistant", "content": "Assistant's response"}
  ],
  "documents": [
    { "title": "Document Title", "text": "Some relevant information." }
  ],
  "citation_mode": "accurate"
}
Response:
{
  "rag_response": "Generated grounded response will be here."
}
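When the supporting material already lives in a title-to-text mapping, it can be converted into the `documents` list with a few lines of Python. This is only a sketch: the helper names are hypothetical, and both the skipping of empty texts and the requirement of at least one document are assumptions rather than documented server behavior.

```python
from typing import Dict, List, Mapping


def make_documents(sources: Mapping[str, str]) -> List[Dict[str, str]]:
    """Turn a title -> text mapping into a list of document objects, dropping empty texts."""
    return [
        {"title": title, "text": text.strip()}
        for title, text in sources.items()
        if text.strip()
    ]


def rag_payload(
    conversation: List[dict],
    documents: List[Dict[str, str]],
    citation_mode: str = "accurate",  # the only mode shown in this README
) -> dict:
    """Assemble the request body in the shape of the example payload."""
    if not documents:
        # Assumption: a grounded request without documents is treated as a caller error.
        raise ValueError("at least one document is required")
    return {
        "conversation": conversation,
        "documents": documents,
        "citation_mode": citation_mode,
    }
```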
- Ensure that the Flask server is running as instructed above.
- Use tools such as curl or Postman to send HTTP POST requests to the desired endpoint with the required JSON payload.
- Make sure to validate input parameters and catch exceptions as per the example code.
- Always check for the latest updates and documentation.
Remember to replace placeholder texts with actual values suited to your environment and use case.
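The advice on validating input and catching exceptions can be illustrated with a framework-independent sketch. It is not the repository's own code: the function names, the specific bounds, and the `{"error": ...}` response shape with 400 and 500 status codes are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple


def validate_generate_payload(payload: Any) -> List[str]:
    """Return a list of human-readable problems; an empty list means the payload is valid."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    errors = []
    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        errors.append("'prompt' must be a non-empty string")
    temperature = payload.get("temperature", 0.2)
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(temperature, bool) or not isinstance(temperature, (int, float)) or temperature < 0:
        errors.append("'temperature' must be a non-negative number")
    max_tokens = payload.get("max_tokens", 1)
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens < 1:
        errors.append("'max_tokens' must be a positive integer")
    return errors


def handle_generate(payload: Any, generate: Callable[[dict], str]) -> Tuple[Dict[str, Any], int]:
    """Validate first, then call the model function, mapping failures to status codes."""
    errors = validate_generate_payload(payload)
    if errors:
        return {"error": errors}, 400
    try:
        return {"generated_text": generate(payload)}, 200
    except Exception as exc:  # broad by design: any model failure becomes a 500
        return {"error": str(exc)}, 500
```

Keeping validation separate from the handler makes both halves easy to unit-test without starting the server or loading a model.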