[Usage]: RAG system #5502
Comments
Assuming that the retrieval step has been done externally, you can apply your own template to the result and pass the formatted string to the model via `LLM.generate()`.
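A minimal sketch of that flow, assuming retrieval has already happened elsewhere. `build_rag_prompt` is a hypothetical helper (not part of vLLM); the actual vLLM call is shown commented out since it needs a GPU and a downloaded model:

```python
# Sketch: format externally retrieved docs + the question into one prompt
# string, then hand that string to vLLM. `docs` and `question` stand in
# for whatever your retriever returns -- both names are placeholders.

def build_rag_prompt(question: str, docs: list[str]) -> str:
    """Join retrieved documents into a context block and append the question."""
    context = "\n\n".join(docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# The formatted string is then passed to vLLM directly, e.g.:
#
#   from vllm import LLM, SamplingParams
#   llm = LLM(model="meta-llama/Llama-2-7b-chat-hf")  # any supported model
#   params = SamplingParams(temperature=0.2, max_tokens=256)
#   outputs = llm.generate([build_rag_prompt(question, docs)], params)
#   answer = outputs[0].outputs[0].text
```

The prompt wording above is just one option; any template that interleaves the context and the question works, since vLLM treats the whole thing as plain input text.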
So I should get the relevant context from the retriever and pass it to `LLM.generate()`? Then where should I pass the question? And how should I define a prompt and pass it to the LLM?
You should use a library that's dedicated to RAG to perform the retrieval based on the question. vLLM can only handle the model generation part.
```python
from langchain.llms import VLLM

def create_query_engine(vectorstore_path):
    ...

def get_relevant_docs(question, retriever):
    ...

def get_query_response(query: str, query_engine):
    ...

def chat():
    ...

if __name__ == '__main__':
    ...
```
It appears that you're using the LangChain integration. In that case, you should probably ask over at their repo, since you're asking about how to use the integration itself rather than how to use vLLM directly. (The LangChain integration is not part of this repo.)
I know it is not part of this repo, but if I could replace that LangChain part with this repo and it would work, that would be totally perfect.
So, can it be done with vLLM or not?
As mentioned before, vLLM can only handle the model generation part. If you're not using the LangChain integration, then you have to write your own code to link together the different components of RAG. |
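One way to link those components yourself, sketched with a toy word-overlap retriever standing in for a real vector store (all function names here are hypothetical, not vLLM or LangChain API). The vLLM generation step is left as a comment so the sketch runs without a GPU:

```python
# Hand-rolled RAG pipeline sketch: retrieval + prompt assembly are plain
# Python; only the final generation step uses vLLM (commented out below).

def retrieve(question: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the question; return the top k.

    A real system would use embeddings + a vector store instead.
    """
    q_words = set(question.lower().split())
    ranked = sorted(
        corpus,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def answer(question: str, corpus: list[str]) -> str:
    """Retrieve context, build the prompt, and (in a real run) call vLLM."""
    docs = retrieve(question, corpus)
    prompt = (
        "Context:\n" + "\n".join(docs) + f"\n\nQuestion: {question}\nAnswer:"
    )
    # Real generation step would be, roughly:
    #   outputs = llm.generate([prompt], sampling_params)
    #   return outputs[0].outputs[0].text
    return prompt  # return the assembled prompt so this sketch runs anywhere
```

The point of the sketch is the wiring: retrieval and prompt construction live entirely outside vLLM, and vLLM only ever sees the final string.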
Your current environment
How would you like to use vllm
I want to run a RAG system using vLLM. Is this supported? I want to use vLLM to load an LLM, pass the relevant docs to it, and get the answer from it. I can't figure out how to define a prompt template with the context and question. Can someone help me with this?