V.0.1.0 (#3)
* Update .env example & next.config

* Update README

* Vector Search UI & Skeletal Code to send request to backend

* Update themes to use system preference

* Update <main> container to be reusable

* Update avatar and chat api endpoint

* Update Header to be Mobile Responsive

* Update README.md

* Update README.md

* Create LICENSE

* Update README.md

* Update README.md

* Update README.md

* Minor UI Updates

* Add Backend API Status page and Status icon to Header

* Add Query Function Page

* Add Healthcheck Status API

* Add Query API

* Update Imports

* Update API Routes

* Update Indexing Function

* Update Search Frontend & API

* Update Dependencies

* Simple healthcheck endpoint test

* Update Imports & Dependencies

* Update Page

* Update README.md

* Update README.md

* Update Search Functions

* Improved UI for table, added toast messages

* Fixed Z depth of sign-in page

* Moved 'Header' & 'Main' components to layout
xKhronoz authored Jan 25, 2024
1 parent 38ae0cb commit caae15f
Showing 41 changed files with 1,922 additions and 1,204 deletions.
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2024 Digital Built Environment

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
22 changes: 12 additions & 10 deletions README.md
@@ -8,15 +8,15 @@
<div align="center">

[![Status](https://img.shields.io/badge/status-active-success.svg)]()
[![GitHub Issues](https://img.shields.io/github/issues/digitalbuiltenvironment/Smart-Retrieval.svg)](https://github.com/digitalbuiltenvironment/Smart-Retrieval)
[![GitHub Issues](https://img.shields.io/github/issues/digitalbuiltenvironment/Smart-Retrieval.svg)](https://github.com/digitalbuiltenvironment/Smart-Retrieval/issues)
[![GitHub Pull Requests](https://img.shields.io/github/issues-pr/digitalbuiltenvironment/Smart-Retrieval.svg)](https://github.com/digitalbuiltenvironment/Smart-Retrieval/pulls)
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](/LICENSE)

</div>

---

<p align="center"> A Large Language Model (LLM) powered chatbot for information retrieval.
<p align="center"> A Large Language Model (LLM) powered platform for information retrieval.
<br>
</p>

@@ -26,14 +26,14 @@
- [Getting Started](#getting_started)
- [Deployment](#deployment)
- [Built Using](#built_using)
- [TODO](../TODO.md)
- [Contributing](../CONTRIBUTING.md)
- [Authors](#authors)
- [Acknowledgments](#acknowledgement)

## 🧐 About <a name = "about"></a>

Write about 1-2 paragraphs describing the purpose of your project.
Smart Retrieval is a platform for efficient and streamlined information retrieval, especially in the realm of legal and compliance documents.
With the power of open-source Large Language Models (LLMs) and Retrieval Augmented Generation (RAG), it aims to enhance user experience at JTC by addressing key challenges such as manual search inefficiencies and rigid file naming conventions, revolutionizing the way JTC employees access and comprehend crucial documents.

Project files bootstrapped with [`create-llama`](https://github.com/run-llama/LlamaIndexTS/tree/main/packages/create-llama).

@@ -50,18 +50,20 @@ These instructions will get you a copy of the project up and running on your loc

## 🚀 Deployment <a name = "deployment"></a>

Add additional notes about how to deploy this on a live system.
How to deploy this on a live system.

## ⛏️ Built Using <a name = "built_using"></a>

- [MongoDB](https://www.mongodb.com/) - Database
- [Express](https://expressjs.com/) - Server Framework
- [VueJs](https://vuejs.org/) - Web Framework
- [NodeJs](https://nodejs.org/en/) - Server Environment
- [NextJs](https://nextjs.org/) - Frontend Web Framework
- [Vercel AI](https://vercel.com/ai) - AI SDK library for building AI-powered streaming text and chat UIs.
- [NodeJs](https://nodejs.org/en/) - Frontend Server Environment
- [Python](https://python.org/) - Backend Server Environment
- [FastAPI](https://fastapi.tiangolo.com/) - Backend API Web Framework
- [LlamaIndex](https://www.llamaindex.ai/) - Data Framework for LLM

## ✍️ Authors <a name = "authors"></a>

- [@xkhronoz](https://github.com/xkhronoz) - Idea & Initial work
- [@xkhronoz](https://github.com/xkhronoz) - Initial work

See also the list of [contributors](https://github.com/digitalbuiltenvironment/Smart-Retrieval/contributors) who participated in this project.

45 changes: 36 additions & 9 deletions backend/README.md
@@ -1,23 +1,50 @@
This is a [LlamaIndex](https://www.llamaindex.ai/) backend using [FastAPI](https://fastapi.tiangolo.com/) bootstrapped with [`create-llama`](https://github.com/run-llama/LlamaIndexTS/tree/main/packages/create-llama).
# Smart Retrieval Backend

The backend is built using Python & [FastAPI](https://fastapi.tiangolo.com/) bootstrapped with [`create-llama`](https://github.com/run-llama/LlamaIndexTS/tree/main/packages/create-llama).

## Requirements

1. Python >= 3.11
2. Poetry (to manage dependencies)
   - `pipx install poetry`

## Getting Started

First, setup the environment:
First, set up the `pyproject.toml` file to install the correct version of PyTorch (CPU or GPU):

Comment/uncomment the following block depending on your system. Use the CPU version only if you do not have a supported CUDA device.

```toml
# For CPU version: Windows, Linux and macOS (arm64)
torch = [
    { url = "https://download.pytorch.org/whl/cpu/torch-2.1.1%2Bcpu-cp311-cp311-win_amd64.whl", markers = "sys_platform == 'win32'" },
    { url = "https://download.pytorch.org/whl/cpu/torch-2.1.1%2Bcpu-cp311-cp311-linux_x86_64.whl", markers = "sys_platform == 'linux'" },
    { url = "https://download.pytorch.org/whl/cpu/torch-2.1.1-cp311-none-macosx_11_0_arm64.whl", markers = "sys_platform == 'darwin'" },
]
## For GPU version: Windows, Linux and macOS (arm64)
# torch = [
#     { url = "https://download.pytorch.org/whl/cu121/torch-2.1.1%2Bcu121-cp311-cp311-win_amd64.whl", markers = "sys_platform == 'win32'" },
#     { url = "https://download.pytorch.org/whl/cu121/torch-2.1.1%2Bcu121-cp311-cp311-linux_x86_64.whl", markers = "sys_platform == 'linux'" },
#     { url = "https://download.pytorch.org/whl/cu121/torch-2.1.1-cp311-none-macosx_11_0_arm64.whl", markers = "sys_platform == 'darwin'" },
# ]
```
Second, set up the environment:
```bash
poetry install
poetry shell
```
Second, run the development server:
Third, run the development server:
```
python main.py
```bash
python run.py
```
Then call the API endpoint `/api/chat` to see the result:
```
```bash
curl --location 'localhost:8000/api/chat' \
--header 'Content-Type: application/json' \
--data '{ "messages": [{ "role": "user", "content": "Hello" }] }'
@@ -29,7 +56,7 @@ Open [http://localhost:8000/docs](http://localhost:8000/docs) with your browser
The API allows CORS for all origins to simplify development. You can change this behavior by setting the `ENVIRONMENT` environment variable to `prod`:
```
```bash
ENVIRONMENT=prod uvicorn main:app
```
@@ -38,5 +65,5 @@ ENVIRONMENT=prod uvicorn main:app
To learn more about LlamaIndex, take a look at the following resources:
- [LlamaIndex Documentation](https://docs.llamaindex.ai) - learn about LlamaIndex.

You can check out [the LlamaIndex GitHub repository](https://github.com/run-llama/llama_index) - your feedback and contributions are welcome!
- [LlamaIndexTS Documentation](https://ts.llamaindex.ai) - learn about LlamaIndexTS (Typescript features).
- [FastAPI Documentation](https://fastapi.tiangolo.com/) - learn about FastAPI.
14 changes: 13 additions & 1 deletion backend/backend/app/api/routers/chat.py
@@ -7,13 +7,24 @@
from fastapi.responses import StreamingResponse
from fastapi.websockets import WebSocketDisconnect
from llama_index import VectorStoreIndex
from llama_index.llms.base import ChatMessage, MessageRole
from llama_index.llms.base import ChatMessage
from llama_index.llms.types import MessageRole
from llama_index.memory import ChatMemoryBuffer
from llama_index.prompts import PromptTemplate
from pydantic import BaseModel

chat_router = r = APIRouter()

"""
This router is for chatbot functionality which consist of chat memory and chat engine.
The chat memory is used to store the chat history and the chat engine is used to query the chat memory and context.
Chat engine is a wrapper around the query engine and it is used to query the chat memory and context.
Chat engine also does the following:
1. Condense the question based on the chat history
2. Add context to the question
3. Answer the question
"""


class _Message(BaseModel):
    role: MessageRole
@@ -24,6 +35,7 @@ class _ChatData(BaseModel):
    messages: List[_Message]


# custom prompt template to be used by chat engine
custom_prompt = PromptTemplate(
    """\
Given a conversation (between Human and Assistant) and a follow up message from Human, \
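The chat engine this docstring describes can be assembled directly from the index. Below is a minimal sketch with LlamaIndex, assuming an existing `index` and the `custom_prompt` above; the helper name and `token_limit` are illustrative, not part of this commit (the rest of chat.py is elided in this diff):

```python
from llama_index import VectorStoreIndex
from llama_index.memory import ChatMemoryBuffer
from llama_index.prompts import PromptTemplate


def build_chat_engine(index: VectorStoreIndex, custom_prompt: PromptTemplate):
    # buffer the running conversation history; token_limit is illustrative
    memory = ChatMemoryBuffer.from_defaults(token_limit=1500)
    # condense_question mode: condense the follow-up question using the chat
    # history, retrieve context from the index, then answer the question
    return index.as_chat_engine(
        chat_mode="condense_question",
        memory=memory,
        condense_question_prompt=custom_prompt,
    )
```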
32 changes: 32 additions & 0 deletions backend/backend/app/api/routers/healthcheck.py
@@ -0,0 +1,32 @@
from fastapi import APIRouter, Request

healthcheck_router = r = APIRouter()

"""
This router is for healthcheck functionality.
"""


@r.get("")
async def healthcheck(
    request: Request,
    # index: VectorStoreIndex = Depends(get_index),
):
    results = {}
    # check if index is ready
    # if index:
    #     results["index"] = True
    # else:
    #     results["index"] = False

    # TODO: check if other services are ready

    # logger.info(f"Healthcheck: {results}")

    results = {"status": "OK"}
    return results


# Simple test to check if the healthcheck endpoint is working
def test_healthcheck():
    import asyncio

    # healthcheck() is async and ignores its request argument, so None suffices here
    assert asyncio.run(healthcheck(None)) == {"status": "OK"}
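For an end-to-end check, the router can also be exercised over HTTP. Here is a minimal sketch using FastAPI's `TestClient`; the app wiring and the `/api/healthcheck` prefix are assumptions for illustration (the real app mounts the router in its main module):

```python
from fastapi import FastAPI
from fastapi.testclient import TestClient

# Illustrative wiring: the /api/healthcheck prefix is an assumption; the real
# application mounts healthcheck_router wherever its main module chooses.
app = FastAPI()
app.include_router(healthcheck_router, prefix="/api/healthcheck")


def test_healthcheck_endpoint():
    client = TestClient(app)
    response = client.get("/api/healthcheck")
    assert response.status_code == 200
    assert response.json() == {"status": "OK"}
```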
72 changes: 72 additions & 0 deletions backend/backend/app/api/routers/query.py
@@ -0,0 +1,72 @@
import logging
from typing import List

from app.utils.index import get_index
from app.utils.json import json_to_model
from fastapi import APIRouter, Depends, HTTPException, Request, status
from fastapi.responses import StreamingResponse
from fastapi.websockets import WebSocketDisconnect
from llama_index import VectorStoreIndex
from llama_index.llms.types import MessageRole
from pydantic import BaseModel

query_router = r = APIRouter()

"""
This router is for query functionality, which consists of a query engine.
The query engine is used to query the index.
No chat memory is used here, so every query is independent of the others.
"""


class _Message(BaseModel):
    role: MessageRole
    content: str


class _ChatData(BaseModel):
    messages: List[_Message]


@r.get("")
async def search(
    request: Request,
    # Note: To support clients sending a JSON object using content-type "text/plain",
    # we need to use Depends(json_to_model(_ChatData)) here
    data: _ChatData = Depends(json_to_model(_ChatData)),
    index: VectorStoreIndex = Depends(get_index),
):
    # check preconditions and get the last message, which is the query
    if len(data.messages) == 0:
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail="No query provided",
        )
    query = data.messages.pop()
    if query.role != MessageRole.USER:
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail="Last message must be from user",
        )
    logger = logging.getLogger("uvicorn")
    logger.info(f"Query: {query}")

    # Query index: pass the message text, as the query engine expects a string
    query_engine = index.as_query_engine(streaming=True, similarity_top_k=1)
    response = query_engine.query(query.content)

    # stream response
    async def event_generator():
        try:
            logger = logging.getLogger("uvicorn")
            for token in response.response_gen:
                # If client closes connection, stop sending events
                if await request.is_disconnected():
                    logger.info("Client disconnected, closing stream")
                    break
                yield token
        except WebSocketDisconnect:
            # WebSocket was disconnected, gracefully handle it
            logger.info("Client disconnected, closing stream")

    return StreamingResponse(event_generator(), media_type="text/plain")
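A client consumes this endpoint as a plain-text token stream. Here is a minimal sketch with `httpx`; the `/api/query` path and local address are assumptions (the real prefix is set where the app includes `query_router`):

```python
import httpx

# Illustrative client: the /api/query path and localhost:8000 address are
# assumptions; the real prefix is set where the app includes query_router.
payload = {"messages": [{"role": "user", "content": "Hello"}]}

with httpx.stream(
    "GET", "http://localhost:8000/api/query", json=payload, timeout=60.0
) as response:
    response.raise_for_status()
    # tokens arrive as plain-text chunks while the answer is generated
    for chunk in response.iter_text():
        print(chunk, end="", flush=True)
```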
79 changes: 79 additions & 0 deletions backend/backend/app/api/routers/search.py
@@ -0,0 +1,79 @@
import logging
import re

from app.utils.index import get_index
from fastapi import APIRouter, Depends, HTTPException, Request, status
from llama_index import VectorStoreIndex
from llama_index.postprocessor import SimilarityPostprocessor
from llama_index.retrievers import VectorIndexRetriever

search_router = r = APIRouter()

"""
This router is for search functionality, which consists of a retriever over the index.
It is similar to the query router, except that it does not return a formulated response;
instead it returns the relevant source passages from the index.
"""


@r.get("")
async def search(
    request: Request,
    index: VectorStoreIndex = Depends(get_index),
):
    query = request.query_params.get("query")
    logger = logging.getLogger("uvicorn")
    logger.info(f"Search: {query}")
    if query is None:
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail="No search info provided",
        )

    # configure retriever
    retriever = VectorIndexRetriever(
        index=index,
        similarity_top_k=10,
    )
    # similarity postprocessor: filter nodes below 0.45 similarity score
    node_postprocessor = SimilarityPostprocessor(similarity_cutoff=0.45)

    # retrieve results
    query_results = retriever.retrieve(query)

    query_results_scores = [result.get_score() for result in query_results]

    logger.info(f"Search results similarity scores: {query_results_scores}")

    # postprocess results
    filtered_results = node_postprocessor.postprocess_nodes(query_results)

    filtered_results_scores = [result.get_score() for result in filtered_results]

    logger.info(f"Filtered search results similarity scores: {filtered_results_scores}")

    response = []
    # enumerate from 1 to assign result ids without shadowing the id() builtin
    for result_id, node in enumerate(filtered_results, start=1):
        node_dict = node.to_dict()["node"]
        logger.debug(f"Node dict: {node_dict}")
        node_metadata = node_dict["metadata"]
        logger.debug(f"Node metadata: {node_metadata}")
        data = {}
        data["id"] = result_id
        data["file_name"] = node_metadata["file_name"]
        data["page_no"] = node_metadata["page_label"]
        # remove leading and trailing underscores left over from document parsing
        cleaned_text = re.sub("^_+ | _+$", "", node_dict["text"])
        data["text"] = cleaned_text
        # round similarity score to 2 decimal places
        data["similarity_score"] = round(node.get_score(), 2)
        response.append(data)

    # TODO: do a reranking of the results and return them?
    # TODO: do a highlighting of the results in the relevant documents and return them?
    return response
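Unlike the query endpoint, this one returns a JSON list of scored passages rather than a generated answer. A minimal client sketch with `httpx` follows; the `/api/search` path, local address, and sample query are assumptions:

```python
import httpx

# Illustrative client: /api/search and localhost:8000 are assumptions; the
# real prefix is set where the app includes search_router.
response = httpx.get(
    "http://localhost:8000/api/search",
    params={"query": "example search terms"},
    timeout=60.0,
)
response.raise_for_status()
for hit in response.json():
    # each hit carries id, file_name, page_no, text and similarity_score
    print(f"{hit['id']}. {hit['file_name']} (p. {hit['page_no']}, score {hit['similarity_score']})")
```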
