RAG app example #118
base: main
Conversation
This reverts commit 7b00a1c.
@@ -0,0 +1,181 @@
import argparse
I think we should simply call this directory DocQA, with the organization:
- DocQA
  - app.py
  - README.md
  - scripts/
  - data/

Could we also avoid prefixing files with 01_, 02_, etc.?
### How to run the pipeline:

![RAG_workflow](./RAG_workflow.jpg)
any chance we can simplify this diagram a lot? actually, I think a simpler inline Mermaid diagram which shows just the basic high-level flow would be more useful. docker, etc. should be completely avoided.
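For example, something along these lines (an illustrative sketch of the flow only, not necessarily the final diagram; node names follow the components mentioned elsewhere in this thread):

```mermaid
flowchart LR
    docs[Documents] --> ingest[Ingestion and chunking]
    ingest --> db[(ChromaDB memory bank)]
    user((User)) --> ui[Gradio UI]
    ui --> agent[llama-stack agent]
    agent -->|retrieve chunks| db
    agent -->|Llama inference| answer[Answer]
    answer --> ui
```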
}
```

2. Inside the docker folder, `run_RAG.sh` is the main script: it creates the `.env` file for `compose.yaml` and then starts the `docker compose` process to launch all of the pipeline containers. `compose.yaml` is the main Compose file that specifies all of the mount options and container configs; change the mounts if needed.
these details are unnecessary I think. these scripts are very simple and self-documenting in a way.
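fwiw, the whole launcher amounts to roughly the following (an illustrative sketch, not the PR's exact script; the variable names and paths are placeholders):

```bash
#!/usr/bin/env bash
# Illustrative sketch of what a launcher like run_RAG.sh does.
# DOCS_DIR and MODEL are placeholder names, not the PR's actual variables.
set -euo pipefail

# 1. Write the .env file that docker compose reads alongside compose.yaml.
cat > .env <<EOF
DOCS_DIR=${DOCS_DIR:-$PWD/data}
MODEL=${MODEL:-llama3.2}
EOF

# 2. Bring up every service defined in compose.yaml (Ollama, ChromaDB, llama-stack).
docker compose -f compose.yaml up --build
```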
examples/E2E-RAG-App/README.md
4. The ChromaDB container will also start. It hosts the Chroma database that llama-stack interacts with.
5. Lastly, the llama-stack container will start. `llama_stack_start.sh` controls the container's startup behavior; change it if needed. It first runs the ingestion pipeline to convert all the documents into MarkDown files. Then it runs the llama-stack server based on the `llama_stack_run.yaml` config. Once the server is ready, it runs `gradio_interface.py`, which inserts document chunks into the memory bank and starts the UI for user interaction.
we have scripts within scripts, is it possible to inline some of them perhaps?
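For illustration, the whole startup sequence could be inlined into a single script along these lines (a sketch only; apart from `llama_stack_run.yaml` and `gradio_interface.py`, the paths, port, and ingestion entry point are assumptions, not the PR's actual names):

```bash
#!/usr/bin/env bash
# Sketch of the startup behavior described above, inlined into one script.
set -euo pipefail

# 1. Ingestion: convert the source documents into MarkDown files.
#    (scripts/ingest.py and the data paths are placeholders.)
python scripts/ingest.py --input /data/docs --output /data/markdown

# 2. Start the llama-stack server from its run config, in the background.
llama stack run llama_stack_run.yaml &

# 3. Wait until the server accepts connections (the port is an assumption).
until (exec 3<>/dev/tcp/localhost/5000) 2>/dev/null; do sleep 2; done

# 4. Insert document chunks into the memory bank and launch the Gradio UI.
python gradio_interface.py
```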
],
"query_generator_config": {"type": "default", "sep": " "},
"max_tokens_in_context": 300,
"max_chunks": 5,
I think we should just elide this and leave it to the default, because we don't want people thinking about all these pieces when they first look at the stack (and even later, if we do a good job).
Sounds good, will leave out `"query_generator_config": {"type": "default", "sep": " "}`, but since we are running Ollama locally, we will need to keep `max_tokens_in_context` and `max_chunks` for it to run at reasonable speed.
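So the memory tool config in the example would shrink to roughly this (a sketch of the agreed change; the surrounding structure mirrors the snippet above, and the bank entry is illustrative):

```python
# Memory tool config after the review: query_generator_config is left to its
# default, keeping only the two limits needed for reasonable local Ollama speed.
memory_tool_config = {
    "type": "memory",
    "memory_bank_configs": [
        {"bank_id": "rag_bank", "type": "vector"},  # illustrative bank entry
    ],
    "max_tokens_in_context": 300,  # keep the prompt small for local Ollama
    "max_chunks": 5,               # cap the number of retrieved chunks
}
```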
… a ragservice template
What does this PR do?
Creating an E2E RAG example that can do retrieval on documents and answer user questions. Components included:
- Inference (with llama-stack)
- Memory (with llama-stack)
- Agent (with llama-stack)
- Frontend (with Gradio)
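As a rough illustration of how these pieces connect (a sketch only; `answer_with_rag` is a hypothetical stand-in for the llama-stack agent call, not the PR's actual code):

```python
import gradio as gr

def answer_with_rag(message: str, history: list) -> str:
    # Hypothetical stand-in: in the real app this would query the llama-stack
    # agent, which retrieves chunks from the ChromaDB memory bank and runs
    # Llama inference to produce a grounded answer.
    return f"(answer to: {message})"

# Minimal Gradio chat frontend over the RAG backend.
demo = gr.ChatInterface(fn=answer_with_rag, title="Document Q&A")

if __name__ == "__main__":
    demo.launch()
```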
Feature/Issue validation/testing/test plan
1120.mov
Thanks for contributing 🎉!