Reranker Models are awfully slow on macOS #2925
AlphaMoury asked this question in Q&A · Unanswered · 1 comment, 1 reply
-
Rerank models are slow because they need to generate embeddings for nearly a hundred chunks and then compute the similarity between each chunk and the query.
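To make the cost concrete, here is a minimal sketch of that per-chunk scoring loop. It is not RagFlow's code: `embed`, `cosine`, and `rerank` are hypothetical names, and the bag-of-words "embedding" is only a toy stand-in for a real model forward pass. The point is that the expensive call runs once per candidate chunk, so latency grows roughly linearly with the number of chunks.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a model forward pass: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(query: str, chunks: list[str]) -> list[tuple[float, str]]:
    """Score every chunk against the query and sort best-first."""
    q = embed(query)
    # One embed() call per chunk: with a real model this is the slow part.
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


chunks = [
    "the reranker scores each chunk against the query",
    "macOS uses Metal for GPU acceleration",
    "unrelated text about cooking pasta",
]
ranked = rerank("reranker query scores", chunks)
for score, chunk in ranked:
    print(f"{score:.3f}  {chunk}")
```

With a real reranker each `embed` call is a neural network inference, so a hundred chunks means roughly a hundred inferences per query unless they are batched.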
-
I have updated to the latest version of RagFlow on macOS. However, when I run tests on small PDF documents, even when using the ChatGPT API, answers take a long time to come back whenever a reranker model is enabled.
This happens in retrieval testing and in the Chat module as well.
Is there any reason for it?
Maybe the reranker is not taking the Metal plugin into account, so inference takes longer?
In previous versions of RagFlow, the macOS version ran correctly except on long documents, where the task executor would get stuck after a thousand or more pages had been processed.
Are there any pointers to where in the source code I could look so that MPS acceleration is used correctly?
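For reference, this is the kind of check I have in mind, as a sketch only. It assumes the reranker runs on PyTorch, which is not confirmed in this thread, and `pick_device` is a hypothetical helper, not a RagFlow function. It reports whether PyTorch can see the Metal (MPS) backend in the environment where the code is executed.

```python
import importlib.util


def pick_device() -> str:
    """Return 'mps', 'cuda', or 'cpu' based on what PyTorch reports.

    Falls back to 'cpu' when PyTorch is not installed at all.
    """
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # no PyTorch in this environment

    import torch

    # torch.backends.mps is missing on older PyTorch builds, so guard the lookup.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

If this prints `cpu` in the environment where the reranker actually runs, the model cannot be using Metal there, whatever the host machine supports.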