[Question] CUDA use in LLamaSharp #545
Replies: 7 comments
-
@vvdb-architecture I've noticed something similar but I could not reproduce it. I would report it to the LLamaSharp project; they will probably ask for logs.
-
If you add a call to …, the console should contain some useful information. For instance, you can run the code here: https://github.com/microsoft/kernel-memory/tree/llamatest
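The exact call referenced above is lost from this thread. As a hypothetical illustration of the kind of diagnostic output meant here, LLamaSharp's `NativeLibraryConfig` can log which native backend it selects; a minimal sketch, assuming the `WithLogs` method present in LLamaSharp versions of that period (names vary across releases):

```csharp
using LLama.Native;

// Sketch only: enable LLamaSharp's native-library loading logs so the console shows
// which backend (CPU or CUDA) is actually picked. Must run before any model is loaded.
// The WithLogs overload shown here is an assumption; check your LLamaSharp version.
NativeLibraryConfig.Instance.WithLogs(true);
```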
-
It seems that in LLamaSharp the CPU and CUDA back-ends can't be installed at the same time. I would suggest that the maintainers of Kernel Memory either add a comment in the …
-
Considering that the service is also packaged as a Docker image, even if we add a comment, the Docker image will still ship all the LLamaSharp packages and the issue will persist. We could opt for Ollama or LM Studio to support Llama models, maybe removing LLamaSharp.
-
It's intended that they should be installable at the same time now. If multiple back-ends are installed, LLamaSharp does runtime feature detection to work out which backend is best to use. There seems to be a bug in that right now though :(
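Until that is fixed, a possible workaround is to request the CUDA backend explicitly before any model is loaded, instead of relying on detection. A sketch assuming LLamaSharp's `NativeLibraryConfig` API (method names differ slightly between versions):

```csharp
using LLama.Native;

// Sketch: prefer the CUDA backend over runtime auto-detection.
// This must execute before any LLamaSharp weights/context are created.
// WithCuda/WithAutoFallback are assumptions based on recent LLamaSharp versions.
NativeLibraryConfig.Instance
    .WithCuda(true)           // try to load the CUDA backend first
    .WithAutoFallback(true);  // fall back to the CPU backend if CUDA cannot be loaded
```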
-
The runtime detection was available last year too, but it never worked in my tests; the runtime always used the CPU. It might be related to the way assemblies are loaded and persist in memory, just guessing.
-
Update: KM v0.72 now includes an Ollama connector, making it much easier to work with local models. Example here: https://github.com/microsoft/kernel-memory/blob/main/examples/212-dotnet-ollama/Program.cs
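For reference, a minimal sketch of what wiring KM to Ollama looks like; the extension method and config property names below are assumptions based on that example, so check the linked Program.cs for the exact API of your KM version:

```csharp
using Microsoft.KernelMemory;

// Sketch: serverless Kernel Memory backed by a local Ollama server.
// Model names, endpoint, and config property names are illustrative assumptions.
var memory = new KernelMemoryBuilder()
    .WithOllamaTextGeneration(new OllamaConfig
    {
        Endpoint = "http://localhost:11434",
        TextModel = new OllamaModelConfig("phi3:medium-128k"),
    })
    .WithOllamaTextEmbeddingGeneration(new OllamaConfig
    {
        Endpoint = "http://localhost:11434",
        EmbeddingModel = new OllamaModelConfig("nomic-embed-text"),
    })
    .Build<MemoryServerless>();

await memory.ImportTextAsync("Kernel Memory can use local models through the Ollama connector.");
var answer = await memory.AskAsync("How can Kernel Memory run local models?");
Console.WriteLine(answer.Result);
```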
-
Context / Scenario
I'm using Kernel Memory with LLamaSharp. Despite having an RTX 3080 and the latest CUDA drivers installed, CUDA is not used.
Question
Not sure if this is a bug or I'm missing something, so here's a question instead:
The LlamaSharp.csproj contains references to both back-end packages (see the sketch below). I found out that if both `Cpu` and `Cuda12` back-ends are referenced, only the CPU is being used even if the CUDA DLL is loaded. If I remove the reference to `LLamaSharp.Backend.Cpu`, then the CUDA back-end will start to be used. It might be a "latest version thing", I don't know. But here you are.