Showing 27 changed files with 5,310 additions and 78 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,8 +1,15 @@ | ||
FROM python:3.10-bullseye | ||
|
||
WORKDIR /app | ||
|
||
COPY backend backend | ||
|
||
RUN pip install --no-cache-dir --upgrade pip \ | ||
&& pip install --no-cache-dir -r backend/requirements.in | ||
COPY frontend/ . | ||
|
||
RUN apt-get update && apt-get install -y | ||
|
||
RUN curl -fsSL https://deb.nodesource.com/setup_18.x | bash - && apt-get install -y nodejs | ||
|
||
RUN npm install && npm run build | ||
|
||
ENTRYPOINT ["uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8080"] | ||
RUN pip install --no-cache-dir --upgrade pip && pip install --no-cache-dir -r backend/requirements.in |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,19 @@ | ||
# Question the Docs | ||
# Question the Docs :book: | ||
|
||
FARM stack tutorial here: https://www.mongodb.com/developer/languages/python/farm-stack-fastapi-react-mongodb/ | ||
This app introduces the FARMS stack - FastAPI, React, MongoDB and SuperDuperDB. Full details on the FARM stack are available [here](https://www.mongodb.com/developer/languages/python/farm-stack-fastapi-react-mongodb/). | ||
|
||
FARM stack repo here: https://github.com/mongodb-developer/FARM-Intro | ||
## Frontend :art: | ||
|
||
The frontend has been developed with Node.js version 18.17.1. The packages can be installed with `npm install --prefix frontend/` and the app run with `npm run dev --prefix frontend`. | ||
|
||
## Backend :computer: | ||
|
||
The backend has been developed with CPython 3.8. To begin, create a GitHub personal access token (PAT) and set it as an environment variable (`GITHUB_TOKEN`) in your local environment. This token is required for interacting with the GitHub API. See `backend/ai/utils/github.py` for details. | ||
|
||
Next, set up an account with MongoDB Atlas and configure a cluster that the app can access. Set the URI for this cluster as an environment variable (`mongo_uri`). If all goes well, you should end up with something like `mongo_uri="mongodb+srv://<USER>:<PASSWORD>@<CLUSTER>.qwekqo3.mongodb.net/<DB>?retryWrites=true&w=majority"`. Please contact Timo if there are any issues at this stage. | ||
|
||
Finally, create an OpenAI account, get an API key and set it as an environment variable (`OPENAI_API_KEY`). | ||
|
||
After you have set these environment variables, install the Python dependencies listed in `backend/requirements.in` and start the web server (e.g. `uvicorn backend.main:app --host 0.0.0.0 --port 8000 --reload`). | ||
|
||
Good luck! :rocket: |
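The README above asks for three environment variables before the backend is started. A minimal sketch of a pre-flight check is shown here; the helper `missing_env_vars` is hypothetical and not part of this commit, and it assumes the variable names are used exactly as the README spells them.

```python
import os

# Names taken from the README; the helper itself is a hypothetical illustration.
REQUIRED_ENV_VARS = ("GITHUB_TOKEN", "mongo_uri", "OPENAI_API_KEY")


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
example_env = {"GITHUB_TOKEN": "example-token", "mongo_uri": ""}
print(missing_env_vars(example_env))  # ['mongo_uri', 'OPENAI_API_KEY']
```

Calling `missing_env_vars()` with no argument checks the real process environment, which could be done once before launching `uvicorn`.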
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,23 +1,24 @@ | ||
from backend.ai.utils.github import get_repo_details, save_github_md_files_locally | ||
from backend.ai.utils.github import save_github_md_files_locally | ||
from backend.ai.utils.text import chunk_file_contents | ||
from backend.config import settings | ||
|
||
from superduperdb.container.document import Document | ||
from superduperdb.db.mongodb.query import Collection | ||
|
||
|
||
def _create_ai_text_artifacts(repo): | ||
files = save_github_md_files_locally(repo) | ||
# Chunked text is more suitable input for the AI models | ||
ai_text_artifacts = chunk_file_contents(files) | ||
return ai_text_artifacts | ||
|
||
|
||
def load_ai_artifacts(db): | ||
for repo_url in settings.default_repos: | ||
details = get_repo_details(repo_url) | ||
repo = details['repo'] | ||
for repo in settings.default_repos: | ||
# Skip if already exists in database | ||
if repo in db.show('vector_index'): | ||
continue | ||
artifacts = _create_ai_text_artifacts(details) | ||
|
||
artifacts = _create_ai_text_artifacts(repo) | ||
documents = [Document({settings.vector_embedding_key: v}) for v in artifacts] | ||
db.execute(Collection(name=repo).insert_many(documents)) | ||
|
||
|
||
def _create_ai_text_artifacts(repo_details): | ||
files = save_github_md_files_locally(repo_details) | ||
ai_text_artifacts = chunk_file_contents(files) | ||
return ai_text_artifacts |
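The skip-if-already-indexed loop in `load_ai_artifacts` can be tried out without a database. The sketch below is an illustration only: `FakeDB`, `load_artifacts` and `make_artifacts` are hypothetical stand-ins, not the SuperDuperDB API, and it assumes that vector index names match the repo names as the diff implies.

```python
class FakeDB:
    """Hypothetical in-memory stand-in; this is not the SuperDuperDB API."""

    def __init__(self, existing_indexes=()):
        self.indexes = list(existing_indexes)
        self.collections = {}

    def show(self, kind):
        # Mirrors the shape of the db.show('vector_index') call in the diff.
        return list(self.indexes) if kind == "vector_index" else []

    def insert_many(self, name, documents):
        self.collections.setdefault(name, []).extend(documents)


def load_artifacts(db, repos, make_artifacts, key="text"):
    """Insert chunked text for each repo that has no vector index yet."""
    loaded = []
    for repo in repos:
        if repo in db.show("vector_index"):
            continue  # already indexed, nothing to do
        documents = [{key: chunk} for chunk in make_artifacts(repo)]
        db.insert_many(repo, documents)
        loaded.append(repo)
    return loaded


db = FakeDB(existing_indexes=["langchain"])
repos = ["superduperdb", "langchain", "fastchat"]
loaded = load_artifacts(db, repos, lambda repo: [f"{repo} chunk 1", f"{repo} chunk 2"])
print(loaded)  # ['superduperdb', 'fastchat']
```

Passing the chunking step in as `make_artifacts` keeps the loop testable without network access to GitHub.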
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,31 +1,38 @@ | ||
import typing as t | ||
|
||
from pydantic import BaseSettings | ||
|
||
|
||
class FastAPISettings(BaseSettings): | ||
mongo_uri: str = 'mongodb://localhost:27017/' | ||
mongo_db_name: str = 'documentation' | ||
mongo_collection_name: str = "docs" | ||
port: int = 8000 | ||
host: str = "0.0.0.0" | ||
debug_mode: bool = False | ||
|
||
|
||
class AISettings(FastAPISettings): | ||
# Model details | ||
vector_index_name: str = 'documentation_index' | ||
vector_embedding_model: str = 'text-embedding-ada-002' | ||
vector_embedding_key: str = 'text' | ||
qa_model: str = 'gpt-3.5-turbo' | ||
doc_file_levels: int = 3 | ||
doc_file_ext: str = 'md' | ||
default_repos: list = [ | ||
'https://github.com/SuperDuperDB/superduperdb/tree/main', | ||
'https://github.com/langchain-ai/langchain/tree/master', | ||
'https://github.com/lm-sys/FastChat/tree/main' | ||
default_repos: t.List[str] = [ | ||
'superduperdb', | ||
'langchain', | ||
'fastchat', | ||
] | ||
|
||
# Query configuration | ||
nearest_to_query: int = 5 | ||
|
||
PROMPT: str = '''Use the following descriptions and code-snippets to answer the question. | ||
Do NOT use any information you have learned about other python packages. | ||
ONLY base your answer on the code-snippets retrieved: | ||
{context} | ||
Here's the question: | ||
''' | ||
|
||
|
||
settings = AISettings() |
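The `PROMPT` template contains a `{context}` placeholder and `nearest_to_query` limits how many results are retrieved. How the backend combines the two is not shown in this diff, so the following is only a sketch under that assumption; `build_prompt` is a hypothetical helper.

```python
# Template text copied from the settings above.
PROMPT = '''Use the following descriptions and code-snippets to answer the question.
Do NOT use any information you have learned about other python packages.
ONLY base your answer on the code-snippets retrieved:
{context}
Here's the question:
'''


def build_prompt(snippets, question, nearest_to_query=5):
    """Fill the template with at most `nearest_to_query` snippets (hypothetical helper)."""
    context = "\n".join(snippets[:nearest_to_query])
    return PROMPT.format(context=context) + question


snippets = [f"snippet {i}" for i in range(7)]
print(build_prompt(snippets, "How do I connect to MongoDB?"))
```

Only the first five snippets appear in the printed prompt, and the question follows directly after the line "Here's the question:".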
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,17 +1,24 @@ | ||
# fly.toml app configuration file generated for question-the-doc on 2023-08-16T14:04:57+02:00 | ||
# fly.toml app configuration file generated for question-the-docs on 2023-08-18T15:45:29+02:00 | ||
# | ||
# See https://fly.io/docs/reference/configuration/ for information about how to use this file. | ||
# | ||
|
||
app = "question-the-doc" | ||
primary_region = "cdg" | ||
app = "question-the-docs" | ||
primary_region = "ams" | ||
|
||
[build] | ||
|
||
[http_service] | ||
internal_port = 8080 | ||
force_https = true | ||
auto_stop_machines = true | ||
[processes] | ||
worker = "uvicorn backend.main:app --host 0.0.0.0 --port 8000" | ||
|
||
[[services]] | ||
protocol = "" | ||
internal_port = 8000 | ||
auto_stop_machines = false | ||
auto_start_machines = true | ||
min_machines_running = 0 | ||
processes = ["app"] | ||
min_machines_running = 1 | ||
processes = ["worker"] | ||
|
||
[[statics]] | ||
guest_path = "/app/build" | ||
url_prefix = "/" |
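In the new `fly.toml`, the `worker` process starts `uvicorn` with `--port 8000` and the service declares `internal_port = 8000`, so the two values need to agree. A small sketch of checking this is shown below; `port_from_command` is a hypothetical helper, and the values are copied from the diff by hand rather than parsed from the file.

```python
import shlex


def port_from_command(command):
    """Return the integer after --port in a command line, or None if absent."""
    args = shlex.split(command)
    if "--port" in args:
        idx = args.index("--port")
        if idx + 1 < len(args):
            return int(args[idx + 1])
    return None


# Values copied from the [processes] and [[services]] sections above.
worker_command = "uvicorn backend.main:app --host 0.0.0.0 --port 8000"
internal_port = 8000

print(port_from_command(worker_command) == internal_port)  # True
```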