Vulkan: memory management issue with ggml update #481
Comments
(dumb question if you already know this, but are you using
I tried, but with the API changes it was annoying to fix things at every bisect step. I also tried reverting Vulkan-related commits one by one, but I couldn't easily identify the culprit that way either.
Good. Yeah, it's annoying to also change the sd.cpp code. But it still works. :)
I gave up trying to figure out the commit that introduced this behavior (the root cause is probably in the AMD drivers anyway).
When available VRAM becomes low, it looks like the Vulkan backend now allocates the compute buffer in shared memory, which causes very significant slowdowns, even if there is actually enough VRAM available. The older version of GGML used before c3eeb66 didn't have this issue.
I've had no luck finding the commit that introduced this behavior in ggml so far.
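To make the suspected behavior concrete, here is a minimal sketch of a "prefer VRAM, fall back to shared memory" heap choice. It is not ggml's actual code: `Heap`, `pick_heap`, and all the sizes below are hypothetical, and it only assumes that the backend compares the requested size against a per-heap budget reported by the driver.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class Heap:
    """Hypothetical stand-in for a Vulkan memory heap."""
    name: str
    device_local: bool  # True for VRAM, False for shared (host) memory
    budget: int         # bytes the driver reports as usable by this process
    usage: int          # bytes this process has already allocated

    @property
    def free(self) -> int:
        return max(self.budget - self.usage, 0)


def pick_heap(heaps, size):
    """Return the first device-local heap with room, else any heap with room."""
    for want_local in (True, False):
        for heap in heaps:
            if heap.device_local == want_local and heap.free >= size:
                return heap
    raise MemoryError(f"no heap can hold {size} bytes")


# Made-up numbers for illustration only, not measurements from this issue.
heaps = [
    Heap("VRAM", device_local=True, budget=8 * GIB, usage=6 * GIB),
    Heap("shared", device_local=False, budget=16 * GIB, usage=1 * GIB),
]

small = pick_heap(heaps, 1 * GIB)  # fits in the 2 GiB left in the VRAM budget
large = pick_heap(heaps, 3 * GIB)  # exceeds the VRAM budget, so it falls back
```

With this kind of logic, a compute buffer would land in shared memory whenever the reported budget is smaller than the request, even if the card physically has enough free VRAM. That would match the slowdown described above if the driver under-reports the budget, but that is an assumption, not something confirmed here.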
Example when generating an 896 x 896 image with Flux Schnell Q3_k, with an idle VRAM usage of 1.2 GB (Chrome and VS Code are open in the background):
Relevant logs (identical between the two runs):