Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wgpu alignment validation errror on GpuProfiler::resolve_queries #79

Closed
EriKWDev opened this issue Sep 5, 2024 · 14 comments · Fixed by #80
Closed

wgpu alignment validation errror on GpuProfiler::resolve_queries #79

EriKWDev opened this issue Sep 5, 2024 · 14 comments · Fixed by #80
Labels
bug Something isn't working

Comments

@EriKWDev
Copy link

EriKWDev commented Sep 5, 2024

wgpu error: Validation Error

Caused by:
  In CommandEncoder::resolve_query_set
    Error encountered while trying to resolve a query
      Resolve buffer offset has to be aligned to `QUERY_RESOLVE_BUFFER_ALIGNMENT

have not had this issue before but happened now when I was integrating the profiler into our company's much larger renderer

@EriKWDev
Copy link
Author

EriKWDev commented Sep 5, 2024

in resolve_queries, the resolve_query_set function calculates an offset:

encoder.resolve_query_set(
    &query_pool.query_set,
    num_resolved_queries..num_used_queries,
    &query_pool.resolve_buffer,
    (num_resolved_queries * wgpu::QUERY_SIZE) as u64,
);

This offset is, to my understsanding, not guaranteed to be aligned to wgpu::QUERY_RESOLVE_BUFFER_ALIGNMENT which it has to be according to the wgpu docs

It seems that the resolve_buffer's size has to include some padding and this offset calculation needs to round to the nearest larger multiple of QUERY_RESOLVE_BUFFER_ALIGNMENT.

I solved this in a fork by a) allocating a larger resolve_buffer and b) making sure this alignment is calculated accordingly here: https://github.com/bitwise-git/wgpu-profiler

It might be incorrect though

@Wumpf
Copy link
Owner

Wumpf commented Sep 8, 2024

very surprising that this didn't come up before! Thank you so much for reporting

I had a bit of a look at your commit. This should generally work but I'd solve it a little bit differently - adjust MIN_CAPACITY and use Rust's builtin next_multiple_of
I'll bring a up PR shortly and put you as reviewer - would be great if you could confirm in your environment that the proposed fix works as well

@Wumpf
Copy link
Owner

Wumpf commented Sep 8, 2024

actually I think this approach doesn't quite work because of how the resolve buffer fills up, but looking at this again I think this isn't done all that well to begin with - the resolve buffer is right away copied to a read buffer, so why even bother offsetting 💡
PR with new suggestion soon (:

@EriKWDev
Copy link
Author

EriKWDev commented Sep 8, 2024

Good idea :) will check it out once it lands. Great work overall btw! When I implemented it in a few more compute and renderpasses the alignment issue seemed to go away for that specific set of inputs at least so got it working in our engine as is, but a real solution would be good!

@Wumpf
Copy link
Owner

Wumpf commented Sep 8, 2024

Thank you!
It's up now! I was even able to repro the issue in a unit test, but would be still great if you could have a look and verify :)
#80

@Wumpf Wumpf added the bug Something isn't working label Sep 8, 2024
@Wumpf
Copy link
Owner

Wumpf commented Sep 11, 2024

will check it out once it lands

Ah, I was hoping you could try the PR branch already? 🥺
Well, no worries if not. I'm somewhat confident the change is an improvement regardless

@EriKWDev
Copy link
Author

Great! I will give it a try today if I find time in the afternoon :)

@EriKWDev
Copy link
Author

all I can say is that this PR seems to work with our latest version of the engine! Made a little egui widget for it but looking forward to potential puffin integration #59!

image

image

@Wumpf
Copy link
Owner

Wumpf commented Sep 12, 2024

awesome, thanks for having a look!

@Wumpf Wumpf closed this as completed in #80 Sep 12, 2024
@DJMcNab
Copy link

DJMcNab commented Sep 12, 2024

We have also had this reported at linebender/vello#678, and would ideally like to see this be in a release.

@Wumpf
Copy link
Owner

Wumpf commented Sep 12, 2024

@DJMcNab
Copy link

DJMcNab commented Sep 12, 2024

Thanks!

Unfortunately, this seems to still be broken.
I'm now getting:

[2024-09-12T10:10:03.568Z ERROR wgpu::backend::wgpu_core] Handling wgpu errors as fatal by default
thread 'main' panicked at /home/djmcnab/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wgpu-22.1.0/src/backend/wgpu_core.rs:3411:5:
wgpu error: Validation Error

Caused by:
  In CommandEncoder::copy_buffer_to_buffer
    Copy error
      Copy of 256..544 would end up overrunning the bounds of the Destination buffer of size 288


stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::ops::function::Fn::call
   3: wgpu::backend::wgpu_core::ContextWgpuCore::handle_error
   4: <wgpu::backend::wgpu_core::ContextWgpuCore as wgpu::context::Context>::command_encoder_copy_buffer_to_buffer
   5: <T as wgpu::context::DynContext>::command_encoder_copy_buffer_to_buffer
   6: wgpu::CommandEncoder::copy_buffer_to_buffer
   7: wgpu_profiler::profiler::GpuProfiler::resolve_queries
   8: vello::wgpu_engine::WgpuEngine::run_recording
   9: vello::Renderer::render_to_surface
  10: <with_winit::VelloApp as winit::application::ApplicationHandler<with_winit::UserEvent>>::window_event
  11: core::ops::function::impls::<impl core::ops::function::FnMut<A> for &mut F>::call_mut
  12: winit::platform_impl::platform::wayland::event_loop::EventLoop<T>::pump_events
  13: winit::platform_impl::platform::wayland::event_loop::EventLoop<T>::run_on_demand
  14: with_winit::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

@Wumpf
Copy link
Owner

Wumpf commented Sep 12, 2024

@DJMcNab can you create an new bug for this, ideally with a way to repro it if possible?

@Wumpf
Copy link
Owner

Wumpf commented Sep 12, 2024

sounds like something super silly is still in there that isn't caught by the unit tests :/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants