
RAG app example #118

Open · wants to merge 94 commits into main
Conversation

heyjustinai (Member) commented Nov 18, 2024

What does this PR do?

Creating an E2E RAG example that can do retrieval over documents and answer user questions. Components included (a rough sketch of how they fit together follows the list):

Inference (with llama-stack)
Memory (with llama-stack)
Agent (with llama-stack)
Frontend (with Gradio)
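
For orientation, here is a rough sketch of how these pieces might be wired together with the llama-stack client. The model id, memory-bank id, and the exact method and parameter names are assumptions for illustration; they may differ from the PR's actual code and from the llama-stack version in use.

```python
# Hypothetical wiring of the four components above (inference, memory,
# agent, Gradio frontend). Identifiers and method names are assumptions,
# not the PR's actual code.
from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent
from llama_stack_client.lib.agents.event_logger import EventLogger

client = LlamaStackClient(base_url="http://localhost:5000")  # llama-stack server

agent = Agent(
    client,
    agent_config={
        "model": "Llama3.2-3B-Instruct",  # assumed model id
        "instructions": "Answer questions using the retrieved document chunks.",
        "tools": [
            {
                "type": "memory",  # RAG over a pre-populated memory bank
                "memory_bank_configs": [{"bank_id": "docqa_bank", "type": "vector"}],
                "max_tokens_in_context": 300,
                "max_chunks": 5,
            }
        ],
        "enable_session_persistence": False,
    },
)
session_id = agent.create_session("docqa-session")

# The Gradio frontend would call something like this for each user question.
response = agent.create_turn(
    messages=[{"role": "user", "content": "What does the document say about pricing?"}],
    session_id=session_id,
)
for log in EventLogger().log(response):
    log.print()
```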

Feature/Issue validation/testing/test plan

1120.mov

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a Github issue? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Thanks for contributing 🎉!

@heyjustinai heyjustinai marked this pull request as ready for review November 20, 2024 21:11
@heyjustinai heyjustinai changed the title [WIP] Rag app RAG app example Nov 20, 2024
examples/agents/rag_with_memory_bank.py (outdated)
@@ -0,0 +1,181 @@
import argparse
Contributor:

I think we should simply call this directory DocQA with the organization:

  • DocQA
    • app.py
    • README.md
    • scripts/
    • data/

Could we also avoid prefixing files with 01_, 02_, etc.?


### How to run the pipeline:

![RAG_workflow](./RAG_workflow.jpg)
Contributor:

Any chance we can simplify this diagram a lot? Actually, I think a simpler inline Mermaid diagram that shows just the basic high-level flow would be more useful. Docker, etc. should be completely avoided.
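
For illustration, a minimal inline Mermaid sketch of the high-level flow might look something like this; the stage names are assumptions based on the components described in this PR, not the author's actual diagram:

```mermaid
flowchart LR
    Docs[Documents] --> Ingest[Ingestion to Markdown]
    Ingest --> Bank[(Memory bank / ChromaDB)]
    User((User)) --> UI[Gradio UI]
    UI --> Agent[llama-stack agent]
    Agent -->|retrieve chunks| Bank
    Agent -->|generate answer| Inference[llama-stack inference]
    Agent --> UI
```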


2. Inside the docker folder, `run_RAG.sh` is the main script: it creates the `.env` file for `compose.yaml` and then starts the `docker compose` process to launch all the pipelines in our containers. `compose.yaml` is the main Docker YAML file that specifies all the mount options and container configs; change the mounts if needed.
Contributor:

These details are unnecessary, I think. These scripts are very simple and self-documenting in a way.


4. The ChromaDB container will also start. It hosts the Chroma database that llama-stack interacts with.

5. Lastly, the llama-stack container will start. `llama_stack_start.sh` controls the container's startup behavior; change it if needed. It first runs the ingestion pipeline to convert all the documents into Markdown files, then starts the llama-stack server based on the `llama_stack_run.yaml` config. Once the server is ready, it runs `gradio_interface.py`, which inserts document chunks into the memory_bank and starts the UI for user interaction (a rough sketch of that insertion step follows).
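
For reference, the chunk-insertion step performed by `gradio_interface.py` might look roughly like the following. This is a sketch assuming the llama-stack-client memory API; the bank id, embedding model, and chunking parameters are placeholders, and the exact method names may differ between llama-stack versions.

```python
# Hypothetical sketch of registering a memory bank and inserting the
# ingested Markdown documents into it via llama-stack. All names here
# are placeholders, not the PR's actual code.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5000")

client.memory_banks.register(
    memory_bank={
        "identifier": "docqa_bank",
        "embedding_model": "all-MiniLM-L6-v2",
        "chunk_size_in_tokens": 512,
        "overlap_size_in_tokens": 64,
    }
)

# Markdown produced by the ingestion pipeline (placeholder content).
markdown_texts = ["# Doc 1\nFirst converted document...", "# Doc 2\nSecond one..."]

documents = [
    {
        "document_id": f"doc-{i}",
        "content": text,
        "mime_type": "text/markdown",
        "metadata": {},
    }
    for i, text in enumerate(markdown_texts)
]
client.memory.insert(bank_id="docqa_bank", documents=documents)
```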
Contributor:

We have scripts within scripts; is it possible to inline some of them, perhaps?

examples/E2E-RAG-App/gradio_interface.py (outdated)
],
"query_generator_config": {"type": "default", "sep": " "},
"max_tokens_in_context": 300,
"max_chunks": 5,
Contributor:

I think we should just elide this and leave it at the default, because we don't want people to be thinking about all these pieces when they first look at the stack (and even later, if we do a good job).

Member Author:

Sounds good, I will leave out `"query_generator_config": {"type": "default", "sep": " "}`.

But since we are running Ollama locally, we will need to keep `max_tokens_in_context` and `max_chunks` for it to run at a reasonable speed.
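
Concretely, the trimmed config being agreed on here might look like the following; only `max_tokens_in_context` and `max_chunks` come from the snippet under review, and the surrounding fields are assumptions.

```python
# Hypothetical trimmed memory-tool config reflecting the discussion:
# query_generator_config is omitted so the server-side default applies,
# while the context limits are kept so a local Ollama model stays responsive.
memory_tool_config = {
    "type": "memory",
    "memory_bank_configs": [{"bank_id": "docqa_bank", "type": "vector"}],  # assumed bank id
    # "query_generator_config": {"type": "default", "sep": " "},  # dropped -> default
    "max_tokens_in_context": 300,
    "max_chunks": 5,
}
```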

examples/E2E-RAG-App/gradio_interface.py (outdated)
examples/E2E-RAG-App/gradio_interface.py (outdated)
requirements.txt (outdated)
Labels: CLA Signed
7 participants