Refine Guardrails README and update model (opea-project#393)
* Refine Guardrails README and update model

Signed-off-by: lvliang-intel <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update model

Signed-off-by: lvliang-intel <[email protected]>

* revert back to guard-2 model

Signed-off-by: lvliang-intel <[email protected]>

* update readme

Signed-off-by: lvliang-intel <[email protected]>

* fix ci issue

Signed-off-by: lvliang-intel <[email protected]>

---------

Signed-off-by: lvliang-intel <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
lvliang-intel and pre-commit-ci[bot] authored Aug 6, 2024
1 parent c1887ed commit 7749ce3
Showing 9 changed files with 138 additions and 130 deletions.
126 changes: 7 additions & 119 deletions comps/guardrails/README.md
@@ -1,122 +1,10 @@
# Guardrails Microservice
# Trust and Safety with LLM

To fortify AI initiatives in production, this microservice introduces guardrails that encapsulate LLMs and enforce responsible behavior. With it, you can secure model inputs and outputs, accelerating your path to production and democratizing AI within your organization by building trustworthy, safe, and secure LLM-based applications.
The Guardrails service enhances the security of LLM-based applications by offering a suite of microservices designed to ensure trustworthiness, safety, and security.

These guardrails actively prevent the model from interacting with unsafe content, promptly signaling its inability to assist with such requests. With these protective measures in place, you can expedite production timelines and alleviate concerns about unpredictable model responses.
| MicroService | Description |
| ------------------------------------------ | ------------------------------------------------------------------------------------------ |
| [Llama Guard](./llama_guard/README.md) | Provides guardrails for inputs and outputs to ensure safe interactions |
| [PII Detection](./pii_detection/README.md) | Detects Personally Identifiable Information (PII) and Business Sensitive Information (BSI) |

The Guardrails Microservice now offers two primary types of guardrails:

- Input Guardrails: These are applied to user inputs. An input guardrail can reject the input, halting further processing.
- Output Guardrails: These are applied to outputs generated by the LLM. An output guardrail can reject the output, preventing it from being returned to the user.

We offer content moderation support utilizing Meta's [Llama Guard](https://huggingface.co/meta-llama/LlamaGuard-7b) model.

Any content detected in the following categories is flagged as unsafe:

- Violence and Hate
- Sexual Content
- Criminal Planning
- Guns and Illegal Weapons
- Regulated or Controlled Substances
- Suicide & Self Harm

# 🚀1. Start Microservice with Python (Option 1)

To start the Guardrails microservice, first install the required Python packages.

## 1.1 Install Requirements

```bash
pip install -r requirements.txt
```

## 1.2 Start TGI Gaudi Service

```bash
export HF_TOKEN=${your_hf_api_token}
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=${your_langchain_api_key}
export LANGCHAIN_PROJECT="opea/guardrails"
volume=$PWD/data
model_id="meta-llama/Meta-Llama-Guard-2-8B"
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.1
docker run -p 8088:80 -v $volume:/data --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host -e HTTPS_PROXY=$https_proxy -e HTTP_PROXY=$http_proxy -e HF_TOKEN=$HF_TOKEN ghcr.io/huggingface/tgi-gaudi:2.0.1 --model-id $model_id --max-input-length 1024 --max-total-tokens 2048
```

## 1.3 Verify the TGI Gaudi Service

```bash
curl 127.0.0.1:8088/generate \
-X POST \
-d '{"inputs":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
-H 'Content-Type: application/json'
```

## 1.4 Start Guardrails Service

Optional: If the Guardrails model you deployed with the TGI Gaudi service in [section 1.2](#12-start-tgi-gaudi-service) differs from the default (`meta-llama/LlamaGuard-7b`), set the environment variable `SAFETY_GUARD_MODEL_ID` to the deployed model's ID. For example, the following tells the Guardrails service that the deployed model is Llama Guard 2:

```bash
export SAFETY_GUARD_MODEL_ID="meta-llama/Meta-Llama-Guard-2-8B"
```

```bash
export SAFETY_GUARD_ENDPOINT="http://${your_ip}:8088"
python langchain/guardrails_tgi_gaudi.py
```

# 🚀2. Start Microservice with Docker (Option 2)

If you start the Guardrails microservice with Docker, the `docker_compose_guardrails.yaml` file will automatically start a TGI Gaudi service as well.

## 2.1 Setup Environment Variables

To start the TGI and LLM services, set up the following environment variables first.

```bash
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export SAFETY_GUARD_ENDPOINT="http://${your_ip}:8088"
export LLM_MODEL_ID=${your_hf_llm_model}
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=${your_langchain_api_key}
export LANGCHAIN_PROJECT="opea/gen-ai-comps:gaurdrails"
```

## 2.2 Build Docker Image

```bash
cd ../../
docker build -t opea/guardrails-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/guardrails/langchain/docker/Dockerfile .
```

## 2.3 Run Docker with CLI

```bash
docker run -d --name="guardrails-tgi-server" -p 9090:9090 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e SAFETY_GUARD_ENDPOINT=$SAFETY_GUARD_ENDPOINT -e HUGGINGFACEHUB_API_TOKEN=$HUGGINGFACEHUB_API_TOKEN opea/guardrails-tgi:latest
```

## 2.4 Run Docker with Docker Compose

```bash
cd langchain/docker
docker compose -f docker_compose_guardrails.yaml up -d
```

# 🚀3. Consume Guardrails Service

## 3.1 Check Service Status

```bash
curl http://localhost:9090/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
```

## 3.2 Consume Guardrails Service

```bash
curl http://localhost:9090/v1/guardrails \
-X POST \
-d '{"text":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
-H 'Content-Type: application/json'
```
Additional safety-related microservices will be available soon.
Empty file.
119 changes: 119 additions & 0 deletions comps/guardrails/llama_guard/README.md
@@ -0,0 +1,119 @@
# Guardrails Microservice

To fortify AI initiatives in production, this microservice introduces guardrails that encapsulate LLMs and enforce responsible behavior. With it, you can secure model inputs and outputs, accelerating your path to production and democratizing AI within your organization by building trustworthy, safe, and secure LLM-based applications.

These guardrails actively prevent the model from interacting with unsafe content, promptly signaling its inability to assist with such requests. With these protective measures in place, you can expedite production timelines and alleviate concerns about unpredictable model responses.

The Guardrails Microservice now offers two primary types of guardrails:

- Input Guardrails: These are applied to user inputs. An input guardrail can reject the input, halting further processing.
- Output Guardrails: These are applied to outputs generated by the LLM. An output guardrail can reject the output, preventing it from being returned to the user.
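For illustration, the following minimal sketch (not part of this repository) shows how a client could chain both guardrail types around an LLM call using the `/v1/guardrails` endpoint described in section 3. The port and the `text` request field match the examples below; the helper names and the response field are assumptions.

```python
# Hypothetical client-side sketch: gate an LLM call with input and output guardrails.
import requests

GUARDRAILS_URL = "http://localhost:9090/v1/guardrails"  # service started in section 2


def moderate(text: str) -> str:
    """Send text through the guardrails microservice; returns the vetted text
    (or the service's refusal message if the content is deemed unsafe)."""
    resp = requests.post(GUARDRAILS_URL, json={"text": text}, timeout=60)
    resp.raise_for_status()
    return resp.json()["text"]  # assumed TextDoc-style response payload


def guarded_generate(user_input: str, llm_generate) -> str:
    checked = moderate(user_input)   # input guardrail
    output = llm_generate(checked)   # llm_generate is any LLM callable
    return moderate(output)          # output guardrail
```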

We offer content moderation support utilizing Meta's [Llama Guard](https://huggingface.co/meta-llama/Meta-Llama-Guard-2-8B) model.

Any content detected in the following categories is flagged as unsafe:

- Violence and Hate
- Sexual Content
- Criminal Planning
- Guns and Illegal Weapons
- Regulated or Controlled Substances
- Suicide & Self Harm
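Under the hood, Llama Guard answers a moderation prompt with a short verdict, typically `safe`, or `unsafe` followed by the violated category code(s); the microservice inspects this verdict to decide whether to pass the original text through.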

# 🚀1. Start Microservice with Python (Option 1)

To start the Guardrails microservice, first install the required Python packages.

## 1.1 Install Requirements

```bash
pip install -r requirements.txt
```

## 1.2 Start TGI Gaudi Service

```bash
export HF_TOKEN=${your_hf_api_token}
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=${your_langchain_api_key}
export LANGCHAIN_PROJECT="opea/guardrails"
volume=$PWD/data
model_id="meta-llama/Meta-Llama-Guard-2-8B"
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.1
docker run -p 8088:80 -v $volume:/data --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host -e HTTPS_PROXY=$https_proxy -e HTTP_PROXY=$http_proxy -e HF_TOKEN=$HF_TOKEN ghcr.io/huggingface/tgi-gaudi:2.0.1 --model-id $model_id --max-input-length 1024 --max-total-tokens 2048
```
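This command publishes TGI's internal port 80 on host port 8088 and caps requests at 1024 input tokens within a 2048-token total, limits that the verification call below stays well inside.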

## 1.3 Verify the TGI Gaudi Service

```bash
curl 127.0.0.1:8088/generate \
-X POST \
-d '{"inputs":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
-H 'Content-Type: application/json'
```
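If the TGI service is healthy, it responds with a JSON object containing a `generated_text` field. Note that this raw `/generate` call does not apply Llama Guard's moderation prompt template, so the completion itself is model-dependent; the call simply confirms the endpoint is serving.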

## 1.4 Start Guardrails Service

Optional: If the Guardrails model you deployed with the TGI Gaudi service in [section 1.2](#12-start-tgi-gaudi-service) differs from the default (`meta-llama/Meta-Llama-Guard-2-8B`), set the environment variable `SAFETY_GUARD_MODEL_ID` to the deployed model's ID, for example:

```bash
export SAFETY_GUARD_MODEL_ID="meta-llama/Meta-Llama-Guard-2-8B"
```

```bash
export SAFETY_GUARD_ENDPOINT="http://${your_ip}:8088"
python langchain/guardrails_tgi.py
```

# 🚀2. Start Microservice with Docker (Option 2)

If you start the Guardrails microservice with Docker, the `docker_compose_guardrails.yaml` file will automatically start a TGI Gaudi service as well.

## 2.1 Setup Environment Variables

To start the TGI and LLM services, set up the following environment variables first.

```bash
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export SAFETY_GUARD_ENDPOINT="http://${your_ip}:8088"
export LLM_MODEL_ID=${your_hf_llm_model}
```

## 2.2 Build Docker Image

```bash
cd ../../
docker build -t opea/guardrails-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/guardrails/llama_guard/docker/Dockerfile .
```

## 2.3 Run Docker with CLI

```bash
docker run -d --name="guardrails-tgi-server" -p 9090:9090 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e SAFETY_GUARD_ENDPOINT=$SAFETY_GUARD_ENDPOINT -e HUGGINGFACEHUB_API_TOKEN=$HUGGINGFACEHUB_API_TOKEN opea/guardrails-tgi:latest
```

## 2.4 Run Docker with Docker Compose

```bash
cd langchain/docker
docker compose -f docker_compose_guardrails.yaml up -d
```

# 🚀3. Consume Guardrails Service

## 3.1 Check Service Status

```bash
curl http://localhost:9090/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
```

## 3.2 Consume Guardrails Service

```bash
curl http://localhost:9090/v1/guardrails \
-X POST \
-d '{"text":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
-H 'Content-Type: application/json'
```
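The response is a TextDoc-style JSON object (`id` and `text` fields). For benign input the text is passed through; for a request like the one above, the service instead returns a short message flagging the violated policy. The exact wording depends on the model and service version.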
File renamed without changes.
@@ -22,10 +22,10 @@ COPY comps /home/user/comps

RUN pip install --no-cache-dir --upgrade pip && \
if [ ${ARCH} = "cpu" ]; then pip install torch --index-url https://download.pytorch.org/whl/cpu; fi && \
pip install --no-cache-dir -r /home/user/comps/guardrails/requirements.txt
pip install --no-cache-dir -r /home/user/comps/guardrails/llama_guard/requirements.txt

ENV PYTHONPATH=$PYTHONPATH:/home/user

WORKDIR /home/user/comps/guardrails/langchain
WORKDIR /home/user/comps/guardrails/llama_guard/

ENTRYPOINT ["python", "guardrails_tgi_gaudi.py"]
ENTRYPOINT ["python", "guardrails_tgi.py"]
@@ -5,7 +5,7 @@ version: "3.8"

services:
tgi_gaudi_service:
image: ghcr.io/huggingface/tgi-gaudi:1.2.1
image: ghcr.io/huggingface/tgi-gaudi:2.0.1
container_name: tgi-service
ports:
- "8088:80"
@@ -14,9 +14,9 @@ services
environment:
HF_TOKEN: ${HF_TOKEN}
shm_size: 1g
command: --model-id ${LLM_MODEL_ID}
command: --model-id ${LLM_MODEL_ID} --max-input-tokens 1024 --max-total-tokens 2048
guardrails:
image: opea/gen-ai-comps:guardrails-tgi-gaudi-server
image: opea/guardrails-tgi:latest
container_name: guardrails-tgi-gaudi-server
ports:
- "9090:9090"
@@ -54,7 +54,7 @@ def get_tgi_service_model_id(endpoint_url, default=DEFAULT_MODEL):


@register_microservice(
name="opea_service@guardrails_tgi_gaudi",
name="opea_service@guardrails_tgi",
service_type=ServiceType.GUARDRAIL,
endpoint="/v1/guardrails",
host="0.0.0.0",
@@ -94,4 +94,4 @@ def safety_guard(input: TextDoc) -> TextDoc:
# chat engine for server-side prompt templating
llm_engine_hf = ChatHuggingFace(llm=llm_guard, model_id=safety_guard_model)
print("guardrails - router] LLM initialized.")
opea_microservices["opea_service@guardrails_tgi_gaudi"].start()
opea_microservices["opea_service@guardrails_tgi"].start()
File renamed without changes.
@@ -11,18 +11,19 @@ function build_docker_images() {
echo "Start building docker images for microservice"
cd $WORKPATH
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.1
docker build --no-cache -t opea/guardrails-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/guardrails/langchain/docker/Dockerfile .
docker build --no-cache -t opea/guardrails-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/guardrails/llama_guard/docker/Dockerfile .
echo "Docker images built"
}

function start_service() {
echo "Starting microservice"
export model_id="meta-llama/Meta-Llama-Guard-2-8B"
export SAFETY_GUARD_ENDPOINT=http://${ip_address}:8088
export SAFETY_GUARD_MODEL_ID="meta-llama/Meta-Llama-Guard-2-8B"
export SAFETY_GUARD_ENDPOINT=http://${ip_address}:8088/v1/chat/completions

docker run -d --name="test-guardrails-langchain-tgi-server" -p 8088:80 --runtime=habana -e HF_TOKEN=$HF_TOKEN -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host -e HTTPS_PROXY=$https_proxy -e HTTP_PROXY=$https_proxy ghcr.io/huggingface/tgi-gaudi:2.0.1 --model-id $model_id --max-input-length 1024 --max-total-tokens 2048
sleep 4m
docker run -d --name="test-guardrails-langchain-service" -p 9090:9090 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e SAFETY_GUARD_ENDPOINT=$SAFETY_GUARD_ENDPOINT -e HUGGINGFACEHUB_API_TOKEN=$HF_TOKEN opea/guardrails-tgi:latest
docker run -d --name="test-guardrails-langchain-service" -p 9090:9090 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e SAFETY_GUARD_MODEL_ID=$SAFETY_GUARD_MODEL_ID -e SAFETY_GUARD_ENDPOINT=$SAFETY_GUARD_ENDPOINT -e HUGGINGFACEHUB_API_TOKEN=$HF_TOKEN opea/guardrails-tgi:latest
sleep 10s

echo "Microservice started"
