Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add knowledge graph components #171

Merged
merged 43 commits into from
Jun 20, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
43 commits
Select commit Hold shift + click to select a range
f84588d
enable ragas (#129)
XuhuiRen Jun 8, 2024
986fe6b
Fix RAG performance issues (#132)
lvliang-intel Jun 8, 2024
525cdad
add microservice level perf statistics (#135)
Spycsh Jun 11, 2024
47eff21
Add More Contents to the Table of MicroService (#141)
zehao-intel Jun 11, 2024
2206d77
Use common security content for OPEA projects (#151)
chensuyue Jun 11, 2024
5218852
Enable vLLM Gaudi support for LLM service based on officially habana …
tianyil1 Jun 12, 2024
d2cf359
add knowledge graph
XinyaoWa Jun 13, 2024
794d8b5
knowledge graph microservice update
XinyaoWa Jun 17, 2024
31695be
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 13, 2024
c95f1a4
Support Dataprep Microservice with Llama Index (#154)
letonghan Jun 12, 2024
7d51fb0
Support Embedding Microservice with Llama Index (#150)
letonghan Jun 12, 2024
6959256
Support Ollama microservice (#142)
lvliang-intel Jun 12, 2024
b94aa12
Fix dataprep microservice path issue (#163)
lvliang-intel Jun 13, 2024
2516eb1
update CI to support dataprep_redis path level change (#155)
chensuyue Jun 13, 2024
bede595
Add Gateway for Translation (#169)
zehao-intel Jun 13, 2024
22c195b
Update LLM readme (#172)
lvliang-intel Jun 13, 2024
bfc0bc1
add milvus microservice (#158)
jinjunzh Jun 13, 2024
bec961d
enable python coverage (#149)
chensuyue Jun 13, 2024
44aa82f
Add Ray version for multi file process (#119)
xuechendi Jun 14, 2024
bdc9704
Add codecov (#178)
XuehaoSun Jun 14, 2024
fef866a
Rename lm-eval folder to utils/lm-eval (#179)
changwangss Jun 14, 2024
6f3e8d9
Support rerank and retrieval of RAG OPT (#164)
XinyuYe-Intel Jun 14, 2024
f694dff
Support vLLM XFT LLM microservice (#174)
lvliang-intel Jun 14, 2024
efbb034
Update Dataprep Microservice README (#173)
letonghan Jun 14, 2024
aaaa892
fixed milvus port conflict issues during deployment (#183)
jinjunzh Jun 17, 2024
bcba113
remove
XinyaoWa Jun 17, 2024
0379bcf
remove hard address
XinyaoWa Jun 17, 2024
a6c47ad
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 17, 2024
58ae6a6
update readme
XinyaoWa Jun 18, 2024
afab31c
add example data and ingestion
XinyaoWa Jun 18, 2024
7c6f971
fix typ
XinyaoWa Jun 18, 2024
d9ba9d1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 18, 2024
4fca625
Fix dataprep timeout issue (#203)
lvliang-intel Jun 18, 2024
a7a3104
Add a new embedding MosecEmbedding (#182)
miaojinc Jun 18, 2024
f83ba94
expand timeout for microservice test (#208)
chensuyue Jun 18, 2024
f10f99c
fix typo
XinyaoWa Jun 18, 2024
282e9ea
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 18, 2024
460407f
fix requirement
XinyaoWa Jun 18, 2024
585f51b
Merge branch 'main' into kg
XinyaoWa Jun 18, 2024
796c480
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 18, 2024
0f906d1
Merge branch 'main' into kg
XinyaoWa Jun 20, 2024
bd58085
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 20, 2024
d8266ea
Merge branch 'main' into kg
XinyaoWa Jun 20, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions comps/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
TextDoc,
RAGASParams,
RAGASScores,
GraphDoc,
LVMDoc,
)

Expand Down
3 changes: 3 additions & 0 deletions comps/cores/mega/constants.py
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ class ServiceType(Enum):
UNDEFINED = 10
RAGAS = 11
LVM = 12
KNOWLEDGE_GRAPH = 13


class MegaServiceEndpoint(Enum):
Expand All @@ -50,6 +51,8 @@ class MegaServiceEndpoint(Enum):
RERANKING = "/v1/reranking"
GUARDRAILS = "/v1/guardrails"
RAGAS = "/v1/ragas"
GRAPHS = "/v1/graphs"

# COMMON
LIST_SERVICE = "/v1/list_service"
LIST_PARAMETERS = "/v1/list_parameters"
Expand Down
13 changes: 13 additions & 0 deletions comps/cores/proto/docarray.py
Original file line number Diff line number Diff line change
Expand Up @@ -104,6 +104,19 @@ class RAGASScores(BaseDoc):
context_precision: float


class GraphDoc(BaseDoc):
text: str
strtype: Optional[str] = Field(
description="type of input query, can be 'query', 'cypher', 'rag'",
default="query",
)
max_new_tokens: Optional[int] = Field(default=1024)
rag_index_name: Optional[str] = Field(default="rag")
rag_node_label: Optional[str] = Field(default="Task")
rag_text_node_properties: Optional[list] = Field(default=["name", "description", "status"])
rag_embedding_node_property: Optional[str] = Field(default="embedding")


class LVMDoc(BaseDoc):
image: str
prompt: str
Expand Down
146 changes: 146 additions & 0 deletions comps/knowledgegraphs/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,146 @@
# Knowledge Graph Microservice

This microservice, designed for efficiently handling and retrieving informantion from knowledge graph. The microservice integrates text retriever, knowledge graph quick search and LLM agent, which can be combined to enhance question answering.

The service contains three modes:

- "cypher": Query knowledge graph directly with cypher
- "rag": Apply similarity search on embeddings of knowledge graph
- "query": An LLM agent will automatically choose tools (RAG or CypherChain) to enhance the question answering

Here is the overall workflow:

![Workflow](doc/workflow.png)

A prerequisite for using this microservice is that users must have a knowledge gragh database already running, and currently we have support [Neo4J](https://neo4j.com/) for quick deployment. Users need to set the graph service's endpoint into an environment variable and microservie utilizes it for data injestion and retrieve. If user want to use "rag" and "query" mode, still need a LLM text generation service (etc., TGI, vLLM and Ray) already running.

Overall, this microservice provides efficient support for applications related with graph dataset, especially for answering multi-part questions, or any other conditions including comples relationship between entities.

# 🚀1. Start Microservice with Docker

## 1.1 Setup Environment Variables

```bash
export NEO4J_ENDPOINT="neo4j://${your_ip}:7687"
export NEO4J_USERNAME="neo4j"
export NEO4J_PASSWORD=${define_a_password}
export HUGGINGFACEHUB_API_TOKEN=${your_huggingface_api_token}
export LLM_ENDPOINT="http://${your_ip}:8080"
export LLM_MODEL="meta-llama/Llama-2-7b-hf"
export AGENT_LLM="HuggingFaceH4/zephyr-7b-beta"
```

## 1.2 Start Neo4j Service

```bash
docker pull neo4j

docker run --rm \
--publish=7474:7474 --publish=7687:7687 \
--env NEO4J_AUTH=$NEO4J_USER/$NEO4J_PASSWORD \
--volume=$PWD/neo4j_data:"/data" \
--env='NEO4JLABS_PLUGINS=["apoc"]' \
neo4j
```

## 1.3 Start LLM Service for "rag"/"query" mode

You can start any LLM microserve, here we take TGI as an example.

```bash
docker run -p 8080:80 \
-v $PWD/llm_data:/data --runtime=habana \
-e HABANA_VISIBLE_DEVICES=all \
-e OMPI_MCA_btl_vader_single_copy_mechanism=none \
-e HUGGING_FACE_HUB_TOKEN=$HUGGINGFACEHUB_API_TOKEN \
--cap-add=sys_nice \
--ipc=host \
ghcr.io/huggingface/tgi-gaudi:2.0.0 \
--model-id $LLM_MODEL \
--max-input-tokens 1024 \
--max-total-tokens 2048
```

Verify LLM service.

```bash
curl $LLM_ENDPOINT/generate \
-X POST \
-d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":32}}' \
-H 'Content-Type: application/json'
```

## 1.4 Start Microservice

```bash
cd ../..
docker build -t opea/knowledge_graphs:latest \
--build-arg https_proxy=$https_proxy \
--build-arg http_proxy=$http_proxy \
-f comps/knowledgegraphs/langchain/docker/Dockerfile .

docker run --rm \
--name="knowledge-graph-server" \
-p 8060:8060 \
--ipc=host \
-e http_proxy=$http_proxy \
-e https_proxy=$https_proxy \
-e NEO4J_ENDPOINT=$NEO4J_ENDPOINT \
-e NEO4J_USERNAME=$NEO4J_USERNAME \
-e NEO4J_PASSWORD=$NEO4J_PASSWORD \
-e HUGGINGFACEHUB_API_TOKEN=$HUGGINGFACEHUB_API_TOKEN \
-e LLM_ENDPOINT=$LLM_ENDPOINT \
opea/knowledge_graphs:latest
```

# 🚀2. Consume Knowledge Graph Service

## 2.1 Cypher mode

```bash
curl http://${your_ip}:8060/v1/graphs \
-X POST \
-d "{\"text\":\"MATCH (t:Task {status:'open'}) RETURN count(*)\",\"strtype\":\"cypher\"}" \
-H 'Content-Type: application/json'
```

Example output:
![Cypher Output](doc/output_cypher.png)

## 2.2 Rag mode

```bash
curl http://${your_ip}:8060/v1/graphs \
-X POST \
-d "{\"text\":\"How many open tickets there are?\",\"strtype\":\"rag\", \"max_new_tokens\":128}" \
-H 'Content-Type: application/json'
```

Example output:
![Cypher Output](doc/output_rag.png)

## 2.3 Query mode

First example:

```bash
curl http://${your_ip}:8060/v1/graphs \
-X POST \
-d "{\"text\":\"Which tasks have optimization in their description?\",\"strtype\":\"query\"}" \
-H 'Content-Type: application/json'
```

Example output:
![Cypher Output](doc/output_query1.png)

Second example:

```bash
curl http://${your_ip}:8060/v1/graphs \
-X POST \
-d "{\"text\":\"Which team is assigned to maintain PaymentService?\",\"strtype\":\"query\"}" \
-H 'Content-Type: application/json'
```

Example output:
![Cypher Output](doc/output_query2.png)
2 changes: 2 additions & 0 deletions comps/knowledgegraphs/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
7 changes: 7 additions & 0 deletions comps/knowledgegraphs/build_docker.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

docker build -t opea/knowledge_graphs:latest \
--build-arg https_proxy=$https_proxy \
--build-arg http_proxy=$http_proxy \
-f comps/knowledgegraphs/langchain/docker/Dockerfile .
Binary file added comps/knowledgegraphs/doc/output_cypher.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added comps/knowledgegraphs/doc/output_query1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added comps/knowledgegraphs/doc/output_query2.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added comps/knowledgegraphs/doc/output_rag.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added comps/knowledgegraphs/doc/workflow.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Empty file.
2 changes: 2 additions & 0 deletions comps/knowledgegraphs/langchain/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
3 changes: 3 additions & 0 deletions comps/knowledgegraphs/langchain/data/microservices.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
{
"query": "MERGE (catalog:Microservice {name: 'CatalogService', technology: 'Java'}) MERGE (order:Microservice {name: 'OrderService', technology: 'Python'}) MERGE (user:Microservice {name: 'UserService', technology: 'Go'}) MERGE (payment:Microservice {name: 'PaymentService', technology: 'Node.js'}) MERGE (inventory:Microservice {name: 'InventoryService', technology: 'Java'}) MERGE (shipping:Microservice {name: 'ShippingService', technology: 'Python'}) MERGE (review:Microservice {name: 'ReviewService', technology: 'Go'}) MERGE (recommendation:Microservice {name: 'RecommendationService', technology: 'Node.js'}) MERGE (auth:Microservice {name: 'AuthService', technology: 'Node.js'}) MERGE (db:Microservice {name: 'Database', technology: 'SQL'}) MERGE (cache:Microservice {name: 'Cache', technology: 'In-memory'}) MERGE (mq:Microservice {name: 'MessageQueue', technology: 'Pub-Sub'}) MERGE (api:Microservice {name: 'ExternalAPI', technology: 'REST'}) MERGE (bugFixCatalog:Task {name: 'BugFix', description: 'Address and resolve a critical bug impacting the CatalogService, affecting the user interface and experience, and hampering the overall performance and responsiveness of the service.', status: 'open'}) MERGE (featureAddOrder:Task {name: 'FeatureAdd', description: 'Implement a new feature in OrderService to facilitate bulk orders, ensuring the features seamless integration with existing functionalities and maintaining the overall stability and performance of the service.', status: 'in progress'}) MERGE (refactorUser:Task {name: 'Refactor', description: 'Refactor the UserService codebase to enhance its readability, maintainability, and scalability, focusing primarily on modularization and optimization of existing functionalities.', status: 'completed'}) MERGE (optimizePayment:Task {name: 'Optimize', description: 'Optimize PaymentService by refining the transaction processing logic, reducing the service’s latency, and improving its reliability and efficiency in handling transactions.', status: 'open'}) MERGE (updateInventory:Task {name: 'Update', description: 'Update InventoryService to include real-time stock updates, ensuring accurate reflection of the inventory levels and aiding in the efficient management of stock.', status: 'in progress'}) MERGE (enhanceShipping:Task {name: 'Enhance', description: 'Enhance the ShippingService by integrating a new shipping partner API, thereby expanding the shipping options available to the customers and improving the overall delivery experience.', status: 'completed'}) MERGE (reviewFix:Task {name: 'ReviewFix', description: 'Rectify a recurring issue in ReviewService affecting the retrieval of user reviews, by refining the service’s logic and improving its efficiency in handling and displaying user reviews.', status: 'open'}) MERGE (recommendationFeature:Task {name: 'RecommendationFeature', description: 'Add a new feature to RecommendationService to provide more personalized and accurate product recommendations to the users, leveraging user behavior and preference data.', status: 'in progress'}) MERGE (optimizeAuth:Task {name: 'Optimize', description: 'Enhance AuthService’s performance and security by optimizing the authentication mechanisms and implementing additional security measures to safeguard user information.', status: 'open'}) MERGE (newTask:Task {name: 'ImproveSecurity', description: 'Enhance the security of microservices by implementing advanced encryption and securing endpoints.', status: 'open'}) MERGE (teamA:Team {name: 'TeamA'}) MERGE (teamB:Team {name: 'TeamB'}) MERGE (teamC:Team {name: 'TeamC'}) MERGE (teamD:Team {name: 'TeamD'}) MERGE (alice:Person {name: 'Alice'}) MERGE (bob:Person {name: 'Bob'}) MERGE (charlie:Person {name: 'Charlie'}) MERGE (diana:Person {name: 'Diana'}) MERGE (eva:Person {name: 'Eva'}) MERGE (frank:Person {name: 'Frank'}) MERGE (catalog)-[:DEPENDS_ON]->(db) MERGE (order)-[:DEPENDS_ON]->(db) MERGE (user)-[:DEPENDS_ON]->(db) MERGE (payment)-[:DEPENDS_ON]->(db) MERGE (inventory)-[:DEPENDS_ON]->(db) MERGE (shipping)-[:DEPENDS_ON]->(mq) MERGE (review)-[:DEPENDS_ON]->(cache) MERGE (recommendation)-[:DEPENDS_ON]->(api) MERGE (auth)-[:DEPENDS_ON]->(db) MERGE (order)-[:DEPENDS_ON]->(inventory) MERGE (order)-[:DEPENDS_ON]->(shipping) MERGE (order)-[:DEPENDS_ON]->(payment) MERGE (catalog)-[:DEPENDS_ON]->(review) MERGE (catalog)-[:DEPENDS_ON]->(recommendation) MERGE (user)-[:DEPENDS_ON]->(auth) MERGE (payment)-[:DEPENDS_ON]->(auth) MERGE (shipping)-[:DEPENDS_ON]->(auth) MERGE (catalog)-[:MAINTAINED_BY]->(teamA) MERGE (order)-[:MAINTAINED_BY]->(teamB) MERGE (user)-[:MAINTAINED_BY]->(teamC) MERGE (payment)-[:MAINTAINED_BY]->(teamD) MERGE (inventory)-[:MAINTAINED_BY]->(teamA) MERGE (shipping)-[:MAINTAINED_BY]->(teamB) MERGE (review)-[:MAINTAINED_BY]->(teamC) MERGE (recommendation)-[:MAINTAINED_BY]->(teamD) MERGE (auth)-[:MAINTAINED_BY]->(teamA) MERGE (bugFixCatalog)-[:ASSIGNED_TO]->(teamA) MERGE (featureAddOrder)-[:ASSIGNED_TO]->(teamB) MERGE (refactorUser)-[:ASSIGNED_TO]->(teamC) MERGE (optimizePayment)-[:ASSIGNED_TO]->(teamD) MERGE (updateInventory)-[:ASSIGNED_TO]->(teamA) MERGE (enhanceShipping)-[:ASSIGNED_TO]->(teamB) MERGE (reviewFix)-[:ASSIGNED_TO]->(teamC) MERGE (recommendationFeature)-[:ASSIGNED_TO]->(teamD) MERGE (optimizeAuth)-[:ASSIGNED_TO]->(teamA) MERGE (bugFixCatalog)-[:LINKED_TO]->(catalog) MERGE (featureAddOrder)-[:LINKED_TO]->(order) MERGE (refactorUser)-[:LINKED_TO]->(user) MERGE (optimizePayment)-[:LINKED_TO]->(payment) MERGE (updateInventory)-[:LINKED_TO]->(inventory) MERGE (enhanceShipping)-[:LINKED_TO]->(shipping) MERGE (reviewFix)-[:LINKED_TO]->(review) MERGE (recommendationFeature)-[:LINKED_TO]->(recommendation) MERGE (optimizeAuth)-[:LINKED_TO]->(auth) MERGE (alice)-[:PART_OF]->(teamA) MERGE (bob)-[:PART_OF]->(teamB) MERGE (charlie)-[:PART_OF]->(teamC) MERGE (diana)-[:PART_OF]->(teamD) MERGE (eva)-[:PART_OF]->(teamA) MERGE (frank)-[:PART_OF]->(teamB) MERGE (newTask)-[:LINKED_TO]->(auth) MERGE (newTask)-[:ASSIGNED_TO]->(teamA)"
}
27 changes: 27 additions & 0 deletions comps/knowledgegraphs/langchain/docker/Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

FROM langchain/langchain:latest

RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \
libgl1-mesa-glx \
libjemalloc-dev \
vim

RUN useradd -m -s /bin/bash user && \
mkdir -p /home/user && \
chown -R user /home/user/

COPY comps /home/user/comps

USER user

RUN pip install --no-cache-dir --upgrade pip && \
pip install --no-cache-dir -r /home/user/comps/knowledgegraphs/requirements.txt

ENV PYTHONPATH=$PYTHONPATH:/home/user

WORKDIR /home/user/comps/knowledgegraphs/langchain

ENTRYPOINT ["python", "knowledge_graph.py"]
21 changes: 21 additions & 0 deletions comps/knowledgegraphs/langchain/ingest.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import json
import os

from langchain_community.graphs import Neo4jGraph

neo4j_endpoint = os.getenv("NEO4J_ENDPOINT", "neo4j://localhost:7687")
neo4j_username = os.getenv("NEO4J_USERNAME", "neo4j")
neo4j_password = os.getenv("NEO4J_PASSWORD", "neo4j")
graph = Neo4jGraph(url=neo4j_endpoint, username=neo4j_username, password=neo4j_password)

# remove all nodes
graph.query("MATCH (n) DETACH DELETE n")

# ingest
import_query = json.load(open("data/microservices.json", "r"))["query"]
graph.query(import_query)
print("Total nodes: ", graph.query("MATCH (n) RETURN count(n)"))
print("Total edges: ", graph.query("MATCH ()-->() RETURN count(*)"))
Loading
Loading