V.0.1.0 (#3)
* Update .env example & next.config

* Update README

* Vector Search UI & Skeletal Code to send request to backend

* Update themes to use system preference

* Update <main> container to be reusable

* Update avatar and chat api endpoint

* Update Header to be Mobile Responsive

* Update README.md

* Update README.md

* Create LICENSE

* Update README.md

* Update README.md

* Update README.md

* Minor UI Updates

* Add Backend API Status page and Status icon to Header

* Add Query Function Page

* Add Healthcheck Status API

* Add Query API

* Update Imports

* Update API Routes

* Update Indexing Function

* Update Search Frontend & API

* Update Dependencies

* Simple healthcheck endpoint test

* Update Imports & Dependencies

* Update Page

* Update README.md

* Update README.md

* Update Search Functions

* Improved UI for table, added toast messages

* Fixed Z depth of sign-in page

* Moved 'Header' & 'Main' components to layout
xKhronoz authored Jan 25, 2024
1 parent 38ae0cb commit caae15f
Showing 41 changed files with 1,922 additions and 1,204 deletions.
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2024 Digital Built Environment

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
22 changes: 12 additions & 10 deletions README.md
@@ -8,15 +8,15 @@
<div align="center">

[![Status](https://img.shields.io/badge/status-active-success.svg)]()
[![GitHub Issues](https://img.shields.io/github/issues/digitalbuiltenvironment/Smart-Retrieval.svg)](https://github.com/digitalbuiltenvironment/Smart-Retrieval)
[![GitHub Issues](https://img.shields.io/github/issues/digitalbuiltenvironment/Smart-Retrieval.svg)](https://github.com/digitalbuiltenvironment/Smart-Retrieval/issues)
[![GitHub Pull Requests](https://img.shields.io/github/issues-pr/digitalbuiltenvironment/Smart-Retrieval.svg)](https://github.com/digitalbuiltenvironment/Smart-Retrieval/pulls)
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](/LICENSE)

</div>

---

<p align="center"> A Large Language Model (LLM) powered chatbot for information retrieval.
<p align="center"> A Large Language Model (LLM) powered platform for information retrieval.
<br>
</p>

@@ -26,14 +26,14 @@
- [Getting Started](#getting_started)
- [Deployment](#deployment)
- [Built Using](#built_using)
- [TODO](../TODO.md)
- [Contributing](../CONTRIBUTING.md)
- [Authors](#authors)
- [Acknowledgments](#acknowledgement)

## 🧐 About <a name = "about"></a>

Write about 1-2 paragraphs describing the purpose of your project.
Smart Retrieval is a platform for efficient and streamlined information retrieval, especially in the realm of legal and compliance documents.
With the power of open-source Large Language Models (LLMs) and Retrieval Augmented Generation (RAG), it aims to enhance user experience at JTC by addressing key challenges such as manual search inefficiencies and rigid file naming conventions, revolutionizing the way JTC employees access and comprehend crucial documents.

Project files bootstrapped with [`create-llama`](https://github.com/run-llama/LlamaIndexTS/tree/main/packages/create-llama).

@@ -50,18 +50,20 @@ These instructions will get you a copy of the project up and running on your loc

## 🚀 Deployment <a name = "deployment"></a>

Add additional notes about how to deploy this on a live system.
How to deploy this on a live system.

## ⛏️ Built Using <a name = "built_using"></a>

- [MongoDB](https://www.mongodb.com/) - Database
- [Express](https://expressjs.com/) - Server Framework
- [VueJs](https://vuejs.org/) - Web Framework
- [NodeJs](https://nodejs.org/en/) - Server Environment
- [NextJs](https://nextjs.org/) - Frontend Web Framework
- [Vercel AI](https://vercel.com/ai) - AI SDK library for building AI-powered streaming text and chat UIs.
- [NodeJs](https://nodejs.org/en/) - Frontend Server Environment
- [Python](https://python.org/) - Backend Server Environment
- [FastAPI](https://fastapi.tiangolo.com/) - Backend API Web Framework
- [LlamaIndex](https://www.llamaindex.ai/) - Data Framework for LLM

## ✍️ Authors <a name = "authors"></a>

- [@xkhronoz](https://github.com/xkhronoz) - Idea & Initial work
- [@xkhronoz](https://github.com/xkhronoz) - Initial work

See also the list of [contributors](https://github.com/digitalbuiltenvironment/Smart-Retrieval/contributors) who participated in this project.

45 changes: 36 additions & 9 deletions backend/README.md
@@ -1,23 +1,50 @@
This is a [LlamaIndex](https://www.llamaindex.ai/) backend using [FastAPI](https://fastapi.tiangolo.com/) bootstrapped with [`create-llama`](https://github.com/run-llama/LlamaIndexTS/tree/main/packages/create-llama).
# Smart Retrieval Backend

The backend is built using Python & [FastAPI](https://fastapi.tiangolo.com/) bootstrapped with [`create-llama`](https://github.com/run-llama/LlamaIndexTS/tree/main/packages/create-llama).

## Requirements

1. Python >= 3.11
2. Poetry (to manage dependencies)
   - `pipx install poetry`

## Getting Started

First, setup the environment:
First, set up the `pyproject.toml` file to install the correct version of PyTorch (CPU or GPU):

Comment/uncomment the following block depending on your system. Use the CPU version only if you do not have a supported CUDA device.

```toml
# For CPU version: Windows, Linux and macOS (arm64)
torch = [
    { url = "https://download.pytorch.org/whl/cpu/torch-2.1.1%2Bcpu-cp311-cp311-win_amd64.whl", markers = "sys_platform == 'win32'" },
    { url = "https://download.pytorch.org/whl/cpu/torch-2.1.1%2Bcpu-cp311-cp311-linux_x86_64.whl", markers = "sys_platform == 'linux'" },
    { url = "https://download.pytorch.org/whl/cpu/torch-2.1.1-cp311-none-macosx_11_0_arm64.whl", markers = "sys_platform == 'darwin'" },
]
## For GPU version: Windows, Linux and macOS (arm64)
# torch = [
#     { url = "https://download.pytorch.org/whl/cu121/torch-2.1.1%2Bcu121-cp311-cp311-win_amd64.whl", markers = "sys_platform == 'win32'" },
#     { url = "https://download.pytorch.org/whl/cu121/torch-2.1.1%2Bcu121-cp311-cp311-linux_x86_64.whl", markers = "sys_platform == 'linux'" },
#     { url = "https://download.pytorch.org/whl/cu121/torch-2.1.1-cp311-none-macosx_11_0_arm64.whl", markers = "sys_platform == 'darwin'" },
# ]
```
Second, set up the environment:
```bash
poetry install
poetry shell
```
Second, run the development server:
Third, run the development server:
```
python main.py
```bash
python run.py
```
Then call the API endpoint `/api/chat` to see the result:
```
```bash
curl --location 'localhost:8000/api/chat' \
--header 'Content-Type: application/json' \
--data '{ "messages": [{ "role": "user", "content": "Hello" }] }'
@@ -29,7 +56,7 @@ Open [http://localhost:8000/docs](http://localhost:8000/docs) with your browser
The API allows CORS for all origins to simplify development. You can change this behavior by setting the `ENVIRONMENT` environment variable to `prod`:
```
```bash
ENVIRONMENT=prod uvicorn main:app
```
@@ -38,5 +65,5 @@ ENVIRONMENT=prod uvicorn main:app
To learn more about LlamaIndex, take a look at the following resources:
- [LlamaIndex Documentation](https://docs.llamaindex.ai) - learn about LlamaIndex.

You can check out [the LlamaIndex GitHub repository](https://github.com/run-llama/llama_index) - your feedback and contributions are welcome!
- [LlamaIndexTS Documentation](https://ts.llamaindex.ai) - learn about LlamaIndexTS (Typescript features).
- [FastAPI Documentation](https://fastapi.tiangolo.com/) - learn about FastAPI.
14 changes: 13 additions & 1 deletion backend/backend/app/api/routers/chat.py
@@ -7,13 +7,24 @@
from fastapi.responses import StreamingResponse
from fastapi.websockets import WebSocketDisconnect
from llama_index import VectorStoreIndex
from llama_index.llms.base import ChatMessage, MessageRole
from llama_index.llms.base import ChatMessage
from llama_index.llms.types import MessageRole
from llama_index.memory import ChatMemoryBuffer
from llama_index.prompts import PromptTemplate
from pydantic import BaseModel

chat_router = r = APIRouter()

"""
This router is for chatbot functionality which consist of chat memory and chat engine.
The chat memory is used to store the chat history and the chat engine is used to query the chat memory and context.
Chat engine is a wrapper around the query engine and it is used to query the chat memory and context.
Chat engine also does the following:
1. Condense the question based on the chat history
2. Add context to the question
3. Answer the question
"""


class _Message(BaseModel):
    role: MessageRole
@@ -24,6 +35,7 @@ class _ChatData(BaseModel):
    messages: List[_Message]


# custom prompt template to be used by chat engine
custom_prompt = PromptTemplate(
    """\
Given a conversation (between Human and Assistant) and a follow up message from Human, \
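The chat engine this docstring describes can be assembled directly from the index. Below is a minimal sketch with LlamaIndex, assuming an existing `index` and the `custom_prompt` above; the helper name and `token_limit` are illustrative, not part of this commit (the rest of chat.py is elided in this diff):

```python
from llama_index import VectorStoreIndex
from llama_index.memory import ChatMemoryBuffer
from llama_index.prompts import PromptTemplate


def build_chat_engine(index: VectorStoreIndex, custom_prompt: PromptTemplate):
    # buffer the running conversation history; token_limit is illustrative
    memory = ChatMemoryBuffer.from_defaults(token_limit=1500)
    # condense_question mode: condense the follow-up question using the chat
    # history, retrieve context from the index, then answer the question
    return index.as_chat_engine(
        chat_mode="condense_question",
        memory=memory,
        condense_question_prompt=custom_prompt,
    )
```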
32 changes: 32 additions & 0 deletions backend/backend/app/api/routers/healthcheck.py
@@ -0,0 +1,32 @@
from fastapi import APIRouter, Request

healthcheck_router = r = APIRouter()

"""
This router is for healthcheck functionality.
"""


@r.get("")
async def healthcheck(
    request: Request,
    # index: VectorStoreIndex = Depends(get_index),
):
    results = {}
    # check if index is ready
    # if index:
    #     results["index"] = True
    # else:
    #     results["index"] = False

    # TODO: check if other services are ready

    # logger.info(f"Healthcheck: {results}")

    results = {"status": "OK"}
    return results


# Simple test to check if the healthcheck endpoint is working
def test_healthcheck():
    import asyncio

    # healthcheck() is async and ignores its request argument, so None suffices here
    assert asyncio.run(healthcheck(None)) == {"status": "OK"}
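For an end-to-end check, the router can also be exercised over HTTP. Here is a minimal sketch using FastAPI's `TestClient`; the app wiring and the `/api/healthcheck` prefix are assumptions for illustration (the real app mounts the router in its main module):

```python
from fastapi import FastAPI
from fastapi.testclient import TestClient

# Illustrative wiring: the /api/healthcheck prefix is an assumption; the real
# application mounts healthcheck_router wherever its main module chooses.
app = FastAPI()
app.include_router(healthcheck_router, prefix="/api/healthcheck")


def test_healthcheck_endpoint():
    client = TestClient(app)
    response = client.get("/api/healthcheck")
    assert response.status_code == 200
    assert response.json() == {"status": "OK"}
```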
72 changes: 72 additions & 0 deletions backend/backend/app/api/routers/query.py
@@ -0,0 +1,72 @@
import logging
from typing import List

from app.utils.index import get_index
from app.utils.json import json_to_model
from fastapi import APIRouter, Depends, HTTPException, Request, status
from fastapi.responses import StreamingResponse
from fastapi.websockets import WebSocketDisconnect
from llama_index import VectorStoreIndex
from llama_index.llms.types import MessageRole
from pydantic import BaseModel

query_router = r = APIRouter()

"""
This router is for query functionality, which consists of a query engine.
The query engine is used to query the index.
No chat memory is used here, so every query is independent of the others.
"""


class _Message(BaseModel):
    role: MessageRole
    content: str


class _ChatData(BaseModel):
    messages: List[_Message]


@r.get("")
async def search(
    request: Request,
    # Note: To support clients sending a JSON object using content-type "text/plain",
    # we need to use Depends(json_to_model(_ChatData)) here
    data: _ChatData = Depends(json_to_model(_ChatData)),
    index: VectorStoreIndex = Depends(get_index),
):
    # check preconditions and get the last message, which is the query
    if len(data.messages) == 0:
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail="No query provided",
        )
    query = data.messages.pop()
    if query.role != MessageRole.USER:
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail="Last message must be from user",
        )
    logger = logging.getLogger("uvicorn")
    logger.info(f"Query: {query}")

    # Query index: pass the message text, as the query engine expects a string
    query_engine = index.as_query_engine(streaming=True, similarity_top_k=1)
    response = query_engine.query(query.content)

    # stream response
    async def event_generator():
        try:
            logger = logging.getLogger("uvicorn")
            for token in response.response_gen:
                # If client closes connection, stop sending events
                if await request.is_disconnected():
                    logger.info("Client disconnected, closing stream")
                    break
                yield token
        except WebSocketDisconnect:
            # WebSocket was disconnected, gracefully handle it
            logger.info("Client disconnected, closing stream")

    return StreamingResponse(event_generator(), media_type="text/plain")
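A client consumes this endpoint as a plain-text token stream. Here is a minimal sketch with `httpx`; the `/api/query` path and local address are assumptions (the real prefix is set where the app includes `query_router`):

```python
import httpx

# Illustrative client: the /api/query path and localhost:8000 address are
# assumptions; the real prefix is set where the app includes query_router.
payload = {"messages": [{"role": "user", "content": "Hello"}]}

with httpx.stream(
    "GET", "http://localhost:8000/api/query", json=payload, timeout=60.0
) as response:
    response.raise_for_status()
    # tokens arrive as plain-text chunks while the answer is generated
    for chunk in response.iter_text():
        print(chunk, end="", flush=True)
```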
79 changes: 79 additions & 0 deletions backend/backend/app/api/routers/search.py
@@ -0,0 +1,79 @@
import logging
import re

from app.utils.index import get_index
from fastapi import APIRouter, Depends, HTTPException, Request, status
from llama_index import VectorStoreIndex
from llama_index.postprocessor import SimilarityPostprocessor
from llama_index.retrievers import VectorIndexRetriever

search_router = r = APIRouter()

"""
This router is for search functionality, which consists of a retriever over the index.
It is similar to the query router, except that it does not return a formulated response;
instead it returns the relevant source passages from the index.
"""


@r.get("")
async def search(
    request: Request,
    index: VectorStoreIndex = Depends(get_index),
):
    query = request.query_params.get("query")
    logger = logging.getLogger("uvicorn")
    logger.info(f"Search: {query}")
    if query is None:
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail="No search info provided",
        )

    # configure retriever
    retriever = VectorIndexRetriever(
        index=index,
        similarity_top_k=10,
    )
    # similarity postprocessor: filter nodes below 0.45 similarity score
    node_postprocessor = SimilarityPostprocessor(similarity_cutoff=0.45)

    # retrieve results
    query_results = retriever.retrieve(query)

    query_results_scores = [result.get_score() for result in query_results]

    logger.info(f"Search results similarity scores: {query_results_scores}")

    # postprocess results
    filtered_results = node_postprocessor.postprocess_nodes(query_results)

    filtered_results_scores = [result.get_score() for result in filtered_results]

    logger.info(f"Filtered search results similarity scores: {filtered_results_scores}")

    response = []
    # enumerate from 1 to assign result ids without shadowing the id() builtin
    for result_id, node in enumerate(filtered_results, start=1):
        node_dict = node.to_dict()["node"]
        logger.debug(f"Node dict: {node_dict}")
        node_metadata = node_dict["metadata"]
        logger.debug(f"Node metadata: {node_metadata}")
        data = {}
        data["id"] = result_id
        data["file_name"] = node_metadata["file_name"]
        data["page_no"] = node_metadata["page_label"]
        # remove leading and trailing underscores left over from document parsing
        cleaned_text = re.sub("^_+ | _+$", "", node_dict["text"])
        data["text"] = cleaned_text
        # round similarity score to 2 decimal places
        data["similarity_score"] = round(node.get_score(), 2)
        response.append(data)

    # TODO: do a reranking of the results and return them?
    # TODO: do a highlighting of the results in the relevant documents and return them?
    return response
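Unlike the query endpoint, this one returns a JSON list of scored passages rather than a generated answer. A minimal client sketch with `httpx` follows; the `/api/search` path, local address, and sample query are assumptions:

```python
import httpx

# Illustrative client: /api/search and localhost:8000 are assumptions; the
# real prefix is set where the app includes search_router.
response = httpx.get(
    "http://localhost:8000/api/search",
    params={"query": "example search terms"},
    timeout=60.0,
)
response.raise_for_status()
for hit in response.json():
    # each hit carries id, file_name, page_no, text and similarity_score
    print(f"{hit['id']}. {hit['file_name']} (p. {hit['page_no']}, score {hit['similarity_score']})")
```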
