Commit

Add RagAgentDocGrader to agent comp (opea-project#480)
* set hf_hub to 0.24.0

* add docgrader to agent strategy openai llm code passed

* add nonstreaming output for agent

* add react langgraph and tests

* fix react langchain bug

Signed-off-by: minmin-intel <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix test script

Signed-off-by: minmin-intel <[email protected]>

* fix bug in test script

Signed-off-by: minmin-intel <[email protected]>

* update readme and rm old agentic-rag strategy

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update test and docgrader readme

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix bug in test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update rag agent strategy name and update readme

Signed-off-by: minmin-intel <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update test

Signed-off-by: minmin-intel <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: minmin-intel <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
minmin-intel and pre-commit-ci[bot] authored Aug 19, 2024
1 parent 61dba72 commit 368c833
Showing 16 changed files with 412 additions and 191 deletions.
62 changes: 49 additions & 13 deletions comps/agent/langchain/README.md
@@ -1,39 +1,70 @@

# Agent Microservice

## 1. Overview

This agent microservice is built on the Langchain/Langgraph frameworks. Agents integrate the reasoning capabilities of large language models (LLMs) with the ability to take actionable steps, creating a more sophisticated system that can understand and process information, evaluate situations, take appropriate actions, communicate responses, and track ongoing situations.

### 1.1 Supported agent types

We currently support the following types of agents:

1. ReAct: use `react_langchain` or `react_langgraph` as the strategy. First introduced in this seminal [paper](https://arxiv.org/abs/2210.03629), the ReAct agent engages in "reason-act-observe" cycles to solve problems. Please refer to this [doc](https://python.langchain.com/v0.2/docs/how_to/migrate_agent/) to understand the differences between the langchain and langgraph versions of ReAct agents.
2. RAG agent: use `rag_agent` as the strategy. This agent is specifically designed to improve RAG performance. It can rephrase the query, check the relevancy of the retrieved context, and iterate if the context is not relevant.
3. Plan and execute: use `plan_execute` as the strategy. This type of agent first makes a step-by-step plan for a user request, then executes the plan sequentially (parallel execution is to be implemented in the future). If the execution results solve the problem, the agent outputs an answer; otherwise, it replans and executes again.

For advanced developers who want to implement their own agent strategies, please refer to [Section 5](#5-customize-agent-strategy) below.

### 1.2 LLM engine

Agents use an LLM for reasoning and planning. We support two options for the LLM engine:

1. Open-source LLMs served with TGI-gaudi. To use open-source LLMs, follow the instructions in [Section 2](#222-start-microservices) below. Note: we recommend using state-of-the-art LLMs, such as llama3.1-70B-instruct, to get a higher success rate.
2. OpenAI LLMs via API calls. To use OpenAI LLMs, specify `llm_engine=openai` and `export OPENAI_API_KEY=<your-openai-key>`. A sketch of how the two options might be wired is shown below.
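As a rough illustration only (`setup_llm` and its argument names are hypothetical, not from this repository), the two engine options might be wired like this, using the `langchain-openai` and `langchain-huggingface` packages already listed in `requirements.txt`:

```python
# Hypothetical sketch only -- the actual wiring lives in the microservice source.
from langchain_huggingface import HuggingFaceEndpoint
from langchain_openai import ChatOpenAI


def setup_llm(llm_engine: str, model: str, llm_endpoint_url: str):
    if llm_engine == "openai":
        # Reads OPENAI_API_KEY from the environment.
        return ChatOpenAI(model=model, temperature=0)
    # Default: an open-source model served by TGI / TGI-gaudi.
    return HuggingFaceEndpoint(endpoint_url=llm_endpoint_url, max_new_tokens=1024)
```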

### 1.3 Tools

Tools are registered with a yaml file. We support the following types of tools:

1. Endpoints: the user provides a URL.
2. User-defined python functions. These are usually used to wrap endpoints with a POST request or with simple pre/post-processing.
3. Langchain tool modules.

Examples of how to register tools can be found in [Section 4](#-4-provide-your-own-tools) below; a sketch of a user-defined tool function follows.
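As an illustration of a type-2 tool (the function name, endpoint URL, and response shape below are hypothetical, not taken from this repository):

```python
# Hypothetical user-defined tool function wrapping a retrieval endpoint.
import requests


def search_knowledge_base(query: str) -> str:
    """POST the query to a (hypothetical) retrieval endpoint and post-process."""
    response = requests.post(
        "http://localhost:8889/v1/retrievaltool",  # illustrative URL
        json={"text": query},
        timeout=60,
    )
    response.raise_for_status()
    # Simple post-processing: return only the text field the agent needs.
    return response.json().get("text", "")
```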

### 1.4 Agent APIs

Currently we have implemented an OpenAI chat-completion-compatible API for agents. We are working to support the OpenAI Assistants APIs.
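As an illustration of calling this API (the port matches the docker command in Section 2.2.2, but the exact request path and payload schema are assumptions; see the validation script in Section 3 for the authoritative example):

```python
# Hypothetical client call; the endpoint path and payload are assumptions.
import requests

url = "http://localhost:9090/v1/chat/completions"
payload = {"query": "What is OPEA?"}

# In streaming mode the service emits server-sent events ("data: ..." lines).
with requests.post(url, json=payload, stream=True, timeout=300) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if line:
            print(line)
```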

# 🚀 2. Start Agent Microservice

## 2.1 Option 1: with Python

### 2.1.1 Install Requirements

```bash
cd comps/agent/langchain/
pip install -r requirements.txt
```

### 2.1.2 Start Microservice with Python Script

```bash
cd comps/agent/langchain/
python agent.py
```

## 2.2 Option 2: Start Microservice with Docker

### 2.2.1 Build Microservices

```bash
cd GenAIComps/ # back to GenAIComps/ folder
docker build -t opea/comps-agent-langchain:latest -f comps/agent/langchain/docker/Dockerfile .
```

### 2.2.2 Start Microservices

```bash
export ip_address=$(hostname -I | awk '{print $1}')
export model=mistralai/Mistral-7B-Instruct-v0.3
export HF_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
```

@@ -53,10 +84,10 @@ docker logs comps-langchain-agent-endpoint
> debug mode
>
> ```bash
> docker run --rm --runtime=runc --name="comps-langchain-agent-endpoint" -v ./comps/agent/langchain/:/home/user/comps/agent/langchain/ -p 9090:9090 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN} -e model=${model} -e ip_address=${ip_address} -e strategy=react -e llm_endpoint_url=http://${ip_address}:8080 -e llm_engine=tgi -e recursion_limit=5 -e require_human_feedback=false -e tools=/home/user/comps/agent/langchain/tools/custom_tools.yaml opea/comps-agent-langchain:latest
> ```
# 🚀 3. Validate Microservice

Once the microservice starts, use the script below to invoke it.
@@ -73,7 +104,7 @@

```
data: [DONE]
```
# 🚀 4. Provide your own tools

- Define tools

@@ -148,3 +179,8 @@

```
data: 'The weather information in Austin is not available from the Open Platform

data: [DONE]
```

# 5. Customize agent strategy

For advanced developers who want to implement their own agent strategies: add a separate folder in `src/strategy`, implement your agent by inheriting the `BaseAgent` class, and register your strategy in `src/agent.py`. The architecture of this agent microservice is shown in the diagram below as a reference.

![Architecture Overview](agent_arch.jpg)
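A minimal sketch of what such a strategy might look like (the folder, class name, and method bodies are hypothetical; the `BaseAgent` interface is inferred from `src/strategy/base_agent.py` shown later on this page):

```python
# Hypothetical src/strategy/mystrategy/planner.py -- skeleton of a custom strategy.
from ..base_agent import BaseAgent


class MyAgent(BaseAgent):
    def __init__(self, args):
        super().__init__(args)
        self.app = self.compile()  # build and compile the execution graph

    def compile(self):
        # Wire up the nodes/edges of your agent here (e.g., with langgraph).
        raise NotImplementedError

    async def non_streaming_run(self, query, config):
        # Run the graph to completion and return the final answer text.
        raise NotImplementedError
```

You would then register the new strategy with an extra `elif` branch in `instantiate_agent` in `src/agent.py` (shown below).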
12 changes: 9 additions & 3 deletions comps/agent/langchain/agent.py
```diff
@@ -12,7 +12,7 @@
 comps_path = os.path.join(cur_path, "../../../")
 sys.path.append(comps_path)

-from comps import LLMParamsDoc, ServiceType, opea_microservices, register_microservice
+from comps import GeneratedDoc, LLMParamsDoc, ServiceType, opea_microservices, register_microservice
 from comps.agent.langchain.src.agent import instantiate_agent
 from comps.agent.langchain.src.utils import get_args

@@ -27,20 +27,26 @@
     port=args.port,
     input_datatype=LLMParamsDoc,
 )
-def llm_generate(input: LLMParamsDoc):
+async def llm_generate(input: LLMParamsDoc):
     # 1. initialize the agent
     print("args: ", args)
     input.streaming = args.streaming
     config = {"recursion_limit": args.recursion_limit}
     agent_inst = instantiate_agent(args, args.strategy)
     print(type(agent_inst))

     # 2. prepare the input for the agent
     if input.streaming:
+        print("-----------STREAMING-------------")
         return StreamingResponse(agent_inst.stream_generator(input.query, config), media_type="text/event-stream")

     else:
-        # TODO: add support for non-streaming mode
-        return StreamingResponse(agent_inst.stream_generator(input.query, config), media_type="text/event-stream")
+        print("-----------NOT STREAMING-------------")
+        response = await agent_inst.non_streaming_run(input.query, config)
+        print("-----------Response-------------")
+        print(response)
+        return GeneratedDoc(text=response, prompt=input.query)


 if __name__ == "__main__":
```
2 changes: 1 addition & 1 deletion comps/agent/langchain/requirements.txt
```diff
@@ -4,7 +4,7 @@ docarray[full]
 #used by tools
 duckduckgo-search
 fastapi
-huggingface_hub
+huggingface_hub==0.24.0
 langchain #==0.1.12
 langchain-huggingface
 langchain-openai
```
19 changes: 11 additions & 8 deletions comps/agent/langchain/src/agent.py
```diff
@@ -2,20 +2,23 @@
 # SPDX-License-Identifier: Apache-2.0


-def instantiate_agent(args, strategy="react"):
-    if strategy == "react":
+def instantiate_agent(args, strategy="react_langchain"):
+    if strategy == "react_langchain":
         from .strategy.react import ReActAgentwithLangchain

         return ReActAgentwithLangchain(args)
+    elif strategy == "react_langgraph":
+        from .strategy.react import ReActAgentwithLanggraph
+
+        return ReActAgentwithLanggraph(args)
     elif strategy == "plan_execute":
         from .strategy.planexec import PlanExecuteAgentWithLangGraph

         return PlanExecuteAgentWithLangGraph(args)
-    elif strategy == "agentic_rag":
-        from .strategy.agentic_rag import RAGAgentwithLanggraph
-
-        return RAGAgentwithLanggraph(args)
-    else:
-        from .strategy.base_agent import BaseAgent, BaseAgentState
-
-        return BaseAgent(args)
+    elif strategy == "rag_agent":
+        from .strategy.ragagent import RAGAgent
+
+        return RAGAgent(args)
+    else:
+        raise ValueError(f"Agent strategy: {strategy} not supported!")
```
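A hypothetical usage sketch of this factory (the `get_args()` return shape is assumed, not confirmed by this diff):

```python
# Hypothetical usage of instantiate_agent; argument parsing shape assumed.
from comps.agent.langchain.src.agent import instantiate_agent
from comps.agent.langchain.src.utils import get_args

args = get_args()  # assumed to return the parsed service arguments
agent = instantiate_agent(args, strategy="rag_agent")
```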
25 changes: 0 additions & 25 deletions comps/agent/langchain/src/strategy/agentic_rag/README.md

This file was deleted.

17 changes: 0 additions & 17 deletions comps/agent/langchain/src/strategy/agentic_rag/prompt.py

This file was deleted.

3 changes: 3 additions & 0 deletions comps/agent/langchain/src/strategy/base_agent.py
```diff
@@ -17,3 +17,6 @@ def compile(self):
     def execute(self, state: dict):
         pass
+
+    def non_streaming_run(self, query, config):
+        raise NotImplementedError
```
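Concrete strategies are expected to override `non_streaming_run`. A minimal sketch of such an override (the `self.app` attribute and the state shape are assumptions, patterned on typical langgraph usage):

```python
# Hypothetical override in a concrete (e.g., langgraph-based) strategy.
from .base_agent import BaseAgent


class MyLanggraphAgent(BaseAgent):
    async def non_streaming_run(self, query, config):
        initial_state = {"messages": [("user", query)]}
        # Assumes the strategy compiled its graph into self.app.
        final_state = await self.app.ainvoke(initial_state, config=config)
        # Return the content of the last message as the final answer text.
        return final_state["messages"][-1].content
```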
31 changes: 31 additions & 0 deletions comps/agent/langchain/src/strategy/ragagent/README.md
@@ -0,0 +1,31 @@
# RAG Agent

This agent is specifically designed to improve answer quality over conventional RAG.
This agent strategy includes the steps listed below:

1. QueryWriter
   This is an LLM with tool-calling capability. It decides whether tool calls are needed to answer the user query, or whether it can answer with the LLM's parametric knowledge.

   - Yes: rephrase the query in the form of a tool call to the Retriever tool, and send the rephrased query to 'Retriever'. The rephrasing is important, as user queries may not be clear, and simply using the raw user query may not retrieve relevant documents.
   - No: complete the query with the final answer.

2. Retriever:

   - Get related documents from a retrieval tool, then send the documents to 'DocumentGrader'. Note: the retrieval tool here is defined broadly; it can be a text retriever over a proprietary knowledge base, a web search API, a knowledge graph API, a SQL database API, etc.

3. DocumentGrader
   Judges the relevance of the retrieved info with respect to the user query.

   - Yes: go to TextGenerator.
   - No: go back to QueryWriter to rewrite the query.

4. TextGenerator

   - Generate an answer based on the query and the last retrieved context.
   - After generation, go to END.

A sketch of this graph wiring is shown after the notes below.

Note:

- Currently the performance of this RAG agent has been tested and validated with only one retrieval tool. If you want to use multiple retrieval tools, we recommend a hierarchical multi-agent system, where a supervisor agent dispatches requests to multiple worker RAG agents and each worker RAG agent uses one type of retrieval tool.
- The maximum number of retrieval rounds is set at 3.
- You can specify a small `recursion_limit` to stop early, or a larger `recursion_limit` to make full use of the 3 retrieval rounds.
- The TextGenerator only looks at the most recently retrieved docs.
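As a rough illustration of this workflow (a hand-written sketch, not the actual `planner.py`; the node functions are stubs and all names are assumptions):

```python
# Hypothetical langgraph wiring of the QueryWriter -> Retriever ->
# DocumentGrader -> TextGenerator workflow described above.
from typing import List

from langgraph.graph import END, StateGraph
from typing_extensions import TypedDict


class AgentState(TypedDict):
    query: str
    docs: List[str]
    answer: str


def query_writer(state: AgentState) -> AgentState:
    # LLM with tool-calling: rephrase the query as a retriever call, or answer.
    return state


def retriever(state: AgentState) -> AgentState:
    # Call the retrieval tool and attach the retrieved docs to the state.
    return state


def text_generator(state: AgentState) -> AgentState:
    # Generate the final answer from the query and the last retrieved context.
    return state


def should_retrieve(state: AgentState) -> str:
    return "retrieve"  # or "end" if the LLM answered from parametric knowledge


def grade_documents(state: AgentState) -> str:
    # DocumentGrader: judge the relevance of the retrieved docs to the query.
    return "generate"  # or "rewrite" to loop back to the query writer


workflow = StateGraph(AgentState)
workflow.add_node("query_writer", query_writer)
workflow.add_node("retriever", retriever)
workflow.add_node("text_generator", text_generator)
workflow.set_entry_point("query_writer")
workflow.add_conditional_edges("query_writer", should_retrieve, {"retrieve": "retriever", "end": END})
workflow.add_conditional_edges("retriever", grade_documents, {"generate": "text_generator", "rewrite": "query_writer"})
workflow.add_edge("text_generator", END)
app = workflow.compile()
```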
```diff
@@ -1,4 +1,4 @@
 # Copyright (C) 2024 Intel Corporation
 # SPDX-License-Identifier: Apache-2.0

-from .planner import RAGAgentwithLanggraph
+from .planner import RAGAgent
```
