forked from opea-project/GenAIExamples
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Text to SQL microservice (opea-project#736)
* Initial Version of TextToSQL Signed-off-by: Yogesh <[email protected]> * Updated Code to add more metadata in the reply Signed-off-by: Yogesh <[email protected]> * Updated DockerFile Signed-off-by: Yogesh <[email protected]> * UpdatedREADME Signed-off-by: Yogesh <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fixed precommit CIs Signed-off-by: Yogesh <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Updated yaml files Signed-off-by: Yogesh <[email protected]> * Updated tests to use the right directory Signed-off-by: Yogesh <[email protected]> * Updated tests to use the right env Signed-off-by: Yogesh <[email protected]> * Updates as per review comments | fixed README and requirements.txt Signed-off-by: Yogesh <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Added changed as per review comments Signed-off-by: Yogesh <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Yogesh <[email protected]> Co-authored-by: Yogesh <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
- Loading branch information
1 parent
e7fdf53
commit 827e3d4
Showing
10 changed files
with
24,625 additions
and
1 deletion.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# Copyright (C) 2024 Intel Corporation | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
# this file should be run in the root of the repo | ||
services: | ||
texttosql-langchain: | ||
build: | ||
dockerfile: comps/texttosql/langchain/Dockerfile | ||
image: ${REGISTRY:-opea}/texttosql:${TAG:-latest} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
# Copyright (C) 2024 Intel Corporation | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
FROM python:3.11-slim | ||
|
||
ENV LANG=C.UTF-8 | ||
ARG ARCH=cpu | ||
|
||
RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \ | ||
build-essential \ | ||
libgl1-mesa-glx \ | ||
libjemalloc-dev | ||
|
||
RUN useradd -m -s /bin/bash user && \ | ||
mkdir -p /home/user && \ | ||
chown -R user /home/user/ | ||
|
||
USER user | ||
|
||
COPY comps /home/user/comps | ||
|
||
RUN pip install --no-cache-dir --upgrade pip setuptools && \ | ||
if [ ${ARCH} = "cpu" ]; then \ | ||
pip install --no-cache-dir --extra-index-url https://download.pytorch.org/whl/cpu -r /home/user/comps/texttosql/langchain/requirements.txt; \ | ||
else \ | ||
pip install --no-cache-dir -r /home/user/comps/texttosql/langchain/requirements.txt; \ | ||
fi | ||
|
||
ENV PYTHONPATH=$PYTHONPATH:/home/user | ||
|
||
WORKDIR /home/user/comps/texttosql/langchain/ | ||
|
||
ENTRYPOINT ["python", "main.py"] |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,134 @@ | ||
# Text-to-SQL Microservice | ||
|
||
In today's data-driven world, the ability to efficiently extract insights from databases is crucial. However, querying databases often requires specialized knowledge of SQL and database schemas, which can be a barrier for non-technical users. This is where the Text-to-SQL microservice comes into play, leveraging the power of LLMs and agentic frameworks to bridge the gap between human language and database queries. This microservice is built on Langchain/Langgraph frameworks. | ||
|
||
The microservice enables a wide range of use cases, making it a versatile tool for businesses, researchers, and individuals alike. Users can generate queries based on natural language questions, enabling them to quickly retrieve relevant data from their databases. Additionally, the service can be integrated into chatbots, allowing for natural language interactions and providing accurate responses based on the underlying data. Furthermore, it can be utilized to build custom dashboards, enabling users to visualize and analyze insights based on their specific requirements, all through the power of natural language. | ||
|
||
## 🚀1. Start Microservice with Python(Option 1) | ||
|
||
### 1.1 Install Requirements | ||
|
||
```bash | ||
pip install -r requirements.txt | ||
``` | ||
|
||
### 1.2 Start PostgresDB Service | ||
|
||
We will use [Chinook](https://github.com/lerocha/chinook-database) sample database as a default to test the Text-to-SQL microservice. Chinook database is a sample database ideal for demos and testing ORM tools targeting single and multiple database servers. | ||
|
||
```bash | ||
export POSTGRES_USER=postgres | ||
export POSTGRES_PASSWORD=testpwd | ||
export POSTGRES_DB=chinook | ||
|
||
cd comps/texttosql/langchain | ||
|
||
docker run --name postgres-db --ipc=host -e POSTGRES_USER=${POSTGRES_USER} -e POSTGRES_HOST_AUTH_METHOD=trust -e POSTGRES_DB=${POSTGRES_DB} -e POSTGRES_PASSWORD=${POSTGRES_PASSWORD} -p 5442:5432 -d -v ./chinook.sql:/docker-entrypoint-initdb.d/chinook.sql postgres:latest | ||
``` | ||
|
||
### 1.3 Start TGI Service | ||
|
||
```bash | ||
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN} | ||
export LLM_MODEL_ID="mistralai/Mistral-7B-Instruct-v0.3" | ||
export TGI_PORT=8008 | ||
|
||
docker run -d --name="texttosql-tgi-endpoint" --ipc=host -p $TGI_PORT:80 -v ./data:/data --shm-size 1g -e HF_TOKEN=${HUGGINGFACEHUB_API_TOKEN} -e model=${LLM_MODEL_ID} ghcr.io/huggingface/text-generation-inference:2.1.0 --model-id $LLM_MODEL_ID | ||
``` | ||
|
||
### 1.4 Verify the TGI Service | ||
|
||
```bash | ||
|
||
export your_ip=$(hostname -I | awk '{print $1}') | ||
curl http://${your_ip}:${TGI_PORT}/generate \ | ||
-X POST \ | ||
-d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \ | ||
-H 'Content-Type: application/json' | ||
``` | ||
|
||
### 1.5 Setup Environment Variables | ||
|
||
```bash | ||
export TGI_LLM_ENDPOINT="http://${your_ip}:${TGI_PORT}" | ||
``` | ||
|
||
### 1.6 Start Text-to-SQL Microservice with Python Script | ||
|
||
Start Text-to-SQL microservice with below command. | ||
|
||
```bash | ||
python3 main.py | ||
``` | ||
|
||
## 🚀2. Start Microservice with Docker (Option 2) | ||
|
||
### 2.1 Start PostGreSQL Database Service | ||
|
||
Please refer to 1.2. | ||
|
||
### 2.2 Start TGI Service | ||
|
||
Please refer to 1.3. | ||
|
||
### 2.3 Setup Environment Variables | ||
|
||
```bash | ||
export TGI_LLM_ENDPOINT="http://${your_ip}:${TGI_PORT}" | ||
``` | ||
|
||
### 2.4 Build Docker Image | ||
|
||
```bash | ||
cd GenAIComps/ # back to GenAIComps/ folder | ||
docker build -t opea/texttosql:latest -f comps/texttosql/langchain/Dockerfile . | ||
``` | ||
|
||
#### 2.5 Run Docker with CLI (Option A) | ||
|
||
```bash | ||
export TGI_LLM_ENDPOINT="http://${your_ip}:${TGI_PORT}" | ||
|
||
docker run --runtime=runc --name="comps-langchain-texttosql" -p 9090:8080 --ipc=host -e llm_endpoint_url=${TGI_LLM_ENDPOINT} opea/texttosql:latest | ||
|
||
``` | ||
|
||
#### 2.6 Run via docker compose. (Option B) | ||
|
||
Set Environment Variables. | ||
|
||
```bash | ||
export TGI_LLM_ENDPOINT=http://${your_ip}:${TGI_PORT} | ||
export HF_TOKEN=${HUGGINGFACEHUB_API_TOKEN} | ||
export LLM_MODEL_ID="mistralai/Mistral-7B-Instruct-v0.3" | ||
export POSTGRES_USER=postgres | ||
export POSTGRES_PASSWORD=testpwd | ||
export POSTGRES_DB=chinook | ||
``` | ||
|
||
Start the services. | ||
|
||
```bash | ||
docker compose -f docker_compose_texttosql.yaml up | ||
``` | ||
|
||
## 🚀3. Consume Microservice | ||
|
||
Once Text-to-SQL microservice is started, user can use below command | ||
|
||
- Test the Database connection | ||
|
||
```bash | ||
curl --location http://${your_ip}:9090/v1/postgres/health \ | ||
--header 'Content-Type: application/json' \ | ||
--data '{"user": "'${POSTGRES_USER}'","password": "'${POSTGRES_PASSWORD}'","host": "'${your_ip}'", "port": "5442", "database": "'${POSTGRES_DB}'"}' | ||
``` | ||
|
||
- Invoke the microservice. | ||
|
||
```bash | ||
curl http://${your_ip}:9090/v1/texttosql\ | ||
-X POST \ | ||
-d '{"input_text": "Find the total number of Albums.","conn_str": {"user": "'${POSTGRES_USER}'","password": "'${POSTGRES_PASSWORD}'","host": "'${your_ip}'", "port": "5442", "database": "'${POSTGRES_DB}'"}}' \ | ||
-H 'Content-Type: application/json' | ||
``` |
Oops, something went wrong.