
vulkan: Handle GPUs with less shared memory #10468

Merged (1 commit) on Nov 27, 2024

Conversation

jeffbolznv (Collaborator) commented Nov 23, 2024

There have been reports of shader compilation failures on systems with <= 32KB of shared memory (e.g. #10037). This change makes the large tile size fall back to a smaller size when necessary, and makes mul_mat_id fall back to the CPU if only 16KB of shared memory is available.

I don't have a real system with these smaller shared memory sizes, but I did force a smaller size to be reported and verified that no validation layer errors occurred.

Fixes #10037.
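The fallback described above can be sketched roughly as follows. This is an illustrative sketch only, not the actual ggml-vulkan code: the type, function, and constant names (`TileSize`, `pick_tile_size`, `mul_mat_id_supported`, the per-tile byte counts) are hypothetical, and the real shader footprints depend on the pipeline configuration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tile configurations for the matmul shaders.
enum class TileSize { Large, Medium, Small };

// Illustrative shared-memory footprints (bytes) for each configuration;
// the real values depend on the shader and quantization format.
constexpr uint32_t LARGE_TILE_SMEM  = 49152; // 48KB
constexpr uint32_t MEDIUM_TILE_SMEM = 32768; // 32KB
constexpr uint32_t SMALL_TILE_SMEM  = 16384; // 16KB

// Pick the largest tile configuration that fits within the device's
// reported shared-memory limit (maxComputeSharedMemorySize in Vulkan).
TileSize pick_tile_size(uint32_t max_shared_mem) {
    if (max_shared_mem >= LARGE_TILE_SMEM)  return TileSize::Large;
    if (max_shared_mem >= MEDIUM_TILE_SMEM) return TileSize::Medium;
    return TileSize::Small;
}

// In this sketch, mul_mat_id needs more than 16KB of shared memory,
// so devices at or below that limit fall back to the CPU path.
bool mul_mat_id_supported(uint32_t max_shared_mem) {
    return max_shared_mem > 16384;
}
```

With this structure, a device reporting 32KB gets the medium tile size, and a device reporting only 16KB gets the small tiles for regular matmul while mul_mat_id is routed to the CPU.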

@0cc4m 0cc4m merged commit 5b3466b into ggml-org:master Nov 27, 2024
54 checks passed
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Dec 20, 2024
Successfully merging this pull request may close these issues.

Bug: Vulkan backend freezes during its execution