V0.19 seems taking toooo long on preparing BindGroupLayout
#5196
Comments
@cwfitzgerald looks like it might have been caused by the bind group layout dedup refactor?
Double checked the trace - I think this is actually arcanization. Based on the trace provided,
For those hitting this problem, are you calling get_bind_group_layout every frame? To be clear, this is a bug on our side, but reducing calls to get_bind_group_layout should reduce the problem.
@cwfitzgerald, we are indeed calling We are still waiting to have many
If you cache the bind group layouts alongside the compute pipeline, the problem should mostly go away. Currently (with the bugs) I believe performance is ~O(n^2), where n is the number of calls to get_bind_group_layout.
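For anyone unsure what "cache alongside the pipeline" might look like, here is a minimal sketch of the pattern. It deliberately avoids depending on wgpu: `FakePipeline` and `FakeLayout` are hypothetical stand-ins for `wgpu::ComputePipeline` and `wgpu::BindGroupLayout`, and `CachedPipeline` and `demo` are names made up for illustration. Only the shape of `get_bind_group_layout(index)` is taken from the real API. The idea is to fetch each layout once when the pipeline is created and hand out references afterwards.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for `wgpu::ComputePipeline`.
/// It counts how many times a layout lookup is performed.
struct FakePipeline {
    lookups: Cell<u32>,
}

/// Hypothetical stand-in for `wgpu::BindGroupLayout`.
#[derive(Debug, PartialEq)]
struct FakeLayout {
    group: u32,
}

impl FakePipeline {
    /// Mirrors the shape of `ComputePipeline::get_bind_group_layout(index)`.
    fn get_bind_group_layout(&self, index: u32) -> FakeLayout {
        self.lookups.set(self.lookups.get() + 1);
        FakeLayout { group: index }
    }
}

/// A pipeline stored together with its layouts, which are fetched
/// exactly once at construction time.
struct CachedPipeline {
    pipeline: FakePipeline,
    layouts: Vec<FakeLayout>,
}

impl CachedPipeline {
    fn new(pipeline: FakePipeline, group_count: u32) -> Self {
        let layouts: Vec<FakeLayout> = (0..group_count)
            .map(|i| pipeline.get_bind_group_layout(i))
            .collect();
        Self { pipeline, layouts }
    }

    /// Returns the cached layout without touching the pipeline again.
    fn layout(&self, index: u32) -> &FakeLayout {
        &self.layouts[index as usize]
    }
}

/// Simulates 1000 frames that each need both layouts and returns
/// the total number of lookups performed on the pipeline.
fn demo() -> u32 {
    let cached = CachedPipeline::new(FakePipeline { lookups: Cell::new(0) }, 2);
    for _frame in 0..1000 {
        let _l0 = cached.layout(0);
        let _l1 = cached.layout(1);
    }
    cached.pipeline.lookups.get()
}
```

With this structure the number of lookups depends on the number of pipelines and bind groups, not on the number of frames, so the n in the estimate above stays small.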
Just to clarify, instead of calling I actually tried it, and it didn't impact the performance significantly: https://github.com/tracel-ai/burn/blob/57cc3ffe60f8526a218404d433373128c3b24f17/burn-wgpu/src/compute/server.rs#L344
Yeah, that is what I meant - that's a bit unexpected. Could you try using
This should be worked around in 0.19.2, and a full fix should land in 0.20. The leak still exists, but you shouldn't notice.
Description
I tried to upgrade web-rwkv, an LLM inference backend using compute shaders, to v0.19. However, after upgrading, I find that when running the model it gets slower and slower, and most of the time the GPU is idle. I suspect that internally the CPU side is waiting on something. This does not happen in v0.18.
Repro steps
Try to upgrade web-rwkv to v0.19 without touching anything other than adding .into_iter() when selecting adapters, and run the model via
Expected vs observed behavior
Extra materials
I made a flamegraph (attached) and found that, compared to v0.18, v0.19 spends a lot of time in wgpu::ComputePipeline::get_bind_group_layout.
v0.18
v0.19
Platform