Rag fusion rw 002 vector database #3
update from main
Description: This commit introduces Chroma, a powerful vector database, to enhance our search functionality. The 'vector_search()' function now performs actual Chroma vector searches, replacing the previous mock database. For each document retrieved from Chroma, random scores are assigned, maintaining our existing scoring mechanism. This integration improves the accuracy and relevance of search results, offering a more robust search experience.
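The behaviour described above could look roughly like the sketch below. It is not the code from this PR: it assumes `vector_search()` receives a collection object exposing Chroma's `query(query_texts=..., n_results=...)` call, and the 0.7 to 0.9 score range and the `FakeCollection` stand-in are hypothetical, included only so the snippet runs without a database.

```python
import random
from typing import Any


def vector_search(collection: Any, query: str, n_results: int = 5) -> dict[str, float]:
    """Query a Chroma-style collection and return {doc_id: score}.

    The scores are random placeholders (an assumed 0.7-0.9 range);
    they do not reflect similarity to the query.
    """
    results = collection.query(query_texts=[query], n_results=n_results)
    # Chroma returns one inner list per query text; exactly one was sent.
    doc_ids = results["ids"][0]
    scores = {doc_id: round(random.uniform(0.7, 0.9), 2) for doc_id in doc_ids}
    # Sort by score, highest first, so dict order reflects the ranking.
    return dict(sorted(scores.items(), key=lambda item: item[1], reverse=True))


class FakeCollection:
    """Hypothetical stand-in for a Chroma collection, for illustration only."""

    def query(self, query_texts, n_results):
        ids = [f"doc{i}" for i in range(1, n_results + 1)]
        return {"ids": [ids], "documents": [[f"text of {i}" for i in ids]]}


if __name__ == "__main__":
    print(vector_search(FakeCollection(), "impact of climate change", n_results=3))
```

With a real database, the stand-in would be replaced by a collection obtained from a `chromadb` client. Because the scores are random, the resulting order says nothing about relevance.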
Removed comments and line spacing.
Yes, I want to query data from a financial balance sheet and a P/L sheet.
Hey @richardwhiteii
Hi @richardwhiteii and @Navanit-git First off, a huge thanks to both of you for your dedication and hard work on the RAG Fusion project. However, I'm a bit concerned about the added complexity, especially considering beginners who might be using this project as a stepping stone in their learning journey.
To make this more accessible, I propose:
I'd love to hear your thoughts on these suggestions. Thanks again for your invaluable contribution, and I eagerly await your perspective on making the project more beginner-friendly. Cheers,
This is a really interesting idea!
I understand. I can bounce some updates your way, and you can let me know what you think, to make sure I'm going in the right direction.
I made some updates: specifically, I removed the logging and added docstrings and comments. Let me know your thoughts. How do you envision the beginner-focused branch looking?
Implement vector search using Chroma DB. This was the first vector database I found that I could quickly understand.
I expect the choice of Chroma is notional and that the project will later support any vector database.
This migrates vector search from random mock data to using the Chroma database. Document text and metadata are retrieved from Chroma and passed through the pipeline. Additional logging provides visibility into the process. Reciprocal rank fusion is updated to work with the Chroma results structure.
This update improves the backend search functionality by using a real vector database, while preserving the existing pipeline structure.
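For readers new to the fusion step, here is a minimal sketch of reciprocal rank fusion over a Chroma-style results dictionary. It is not the implementation from this PR. It assumes one query call with several query texts, so that `results["ids"]` holds one ranked list of ids per query, and it uses the commonly cited constant `k = 60`. The function name and the sample ids are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(results: dict, k: int = 60) -> dict[str, float]:
    """Fuse several ranked id lists into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank is its 1-based position in that list.
    """
    fused: dict[str, float] = defaultdict(float)
    for ranking in results["ids"]:  # one ranked list per query text
        for position, doc_id in enumerate(ranking):
            fused[doc_id] += 1.0 / (k + position + 1)
    # Highest fused score first.
    return dict(sorted(fused.items(), key=lambda item: item[1], reverse=True))


if __name__ == "__main__":
    # Hypothetical results for two generated queries.
    sample = {"ids": [["a", "b", "c"], ["b", "a", "d"]]}
    for doc_id, score in reciprocal_rank_fusion(sample).items():
        print(doc_id, round(score, 5))
```

Only the order of ids within each list matters here, so the fusion step does not depend on how the per-document scores were produced.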
TODO:
Better understand vector search to remove "random"
Remove logging
Refactor the functions now that they are larger.
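One possible direction for the first TODO item, offered as a sketch rather than a decided design: Chroma query results include a `distances` entry alongside `ids`, where a smaller distance means a closer match. The mapping `1 / (1 + distance)` below is an assumption chosen for simplicity (it presumes non-negative distances), and the function name is hypothetical.

```python
def scores_from_distances(results: dict) -> dict[str, float]:
    """Turn Chroma-style distances into scores where higher is better.

    Uses 1 / (1 + distance), an assumed mapping that expects
    non-negative distances. Reads the first query's results only.
    """
    doc_ids = results["ids"][0]
    distances = results["distances"][0]
    scores = {doc_id: 1.0 / (1.0 + dist) for doc_id, dist in zip(doc_ids, distances)}
    # Highest score (smallest distance) first.
    return dict(sorted(scores.items(), key=lambda item: item[1], reverse=True))


if __name__ == "__main__":
    # Hypothetical results for a single query.
    sample = {"ids": [["b", "a"]], "distances": [[1.0, 0.0]]}
    print(scores_from_distances(sample))
```

A function like this could stand in for the random score assignment, since its output has the same `{doc_id: score}` shape.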