Early Resource Deletion in Poll Tests #3160

Closed
1 of 3 tasks
cwfitzgerald opened this issue Nov 1, 2022 · 6 comments
Labels: area: correctness (We're behaving incorrectly); area: tests (Improvements or issues with our test suite); help required (We need community help to make this happen.); type: bug (Something isn't working)

Comments

@cwfitzgerald
Member

cwfitzgerald commented Nov 1, 2022

Poll tests (https://github.com/gfx-rs/wgpu/blob/master/wgpu/tests/poll.rs) are currently completely disabled due to bugs in resource tracking. I believe the problem stems from the resources being completely dropped by the time the device is maintained.

  • These tests should be modified to not hit the bug (to let the tests pass)
  • Separate tests should be added which exploit the bug.
  • The bug should be fixed.
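
The suspected failure mode can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyDevice`, `Resource`, `submit`, and `maintain` are hypothetical names, not wgpu or wgpu-core APIs, and the real tracker is far more involved. The model assumes in-flight work does not own the resources it uses, so dropping the user handle before the device is maintained leaves the work pointing at something that is already gone.

```rust
use std::rc::{Rc, Weak};

/// Hypothetical stand-in for a GPU resource such as a buffer.
struct Resource {
    label: &'static str,
}

/// Toy device whose in-flight work does NOT keep its resources alive.
/// This is an assumption made to model the bug, not wgpu-core's design.
struct ToyDevice {
    in_flight: Vec<Weak<Resource>>,
}

impl ToyDevice {
    fn new() -> Self {
        ToyDevice { in_flight: Vec::new() }
    }

    /// Record that submitted work uses `resource` without owning it.
    fn submit(&mut self, resource: &Rc<Resource>) {
        self.in_flight.push(Rc::downgrade(resource));
    }

    /// Model of maintaining the device: Ok with the labels of all tracked
    /// resources if they are still alive, Err with the number already freed.
    fn maintain(&mut self) -> Result<Vec<&'static str>, usize> {
        let total = self.in_flight.len();
        let alive: Vec<&'static str> = self
            .in_flight
            .drain(..)
            .filter_map(|weak| weak.upgrade())
            .map(|resource| resource.label)
            .collect();
        if alive.len() == total {
            Ok(alive)
        } else {
            Err(total - alive.len())
        }
    }
}

/// The pattern the poll tests are believed to hit: handle dropped first.
fn drop_before_maintain() -> Result<Vec<&'static str>, usize> {
    let mut device = ToyDevice::new();
    let buffer = Rc::new(Resource { label: "buffer" });
    device.submit(&buffer);
    drop(buffer);
    device.maintain()
}

/// The pattern that avoids the problem: handle kept until after maintain.
fn drop_after_maintain() -> Result<Vec<&'static str>, usize> {
    let mut device = ToyDevice::new();
    let buffer = Rc::new(Resource { label: "buffer" });
    device.submit(&buffer);
    let result = device.maintain();
    drop(buffer);
    result
}
```

In this model, the first task in the list corresponds to switching the tests from the `drop_before_maintain` pattern to the `drop_after_maintain` pattern.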
cwfitzgerald added the type: bug, help required, area: correctness, and area: tests labels on Nov 1, 2022
@teoxoy
Member

teoxoy commented Nov 10, 2022

I ran the poll tests locally and this is what I'm seeing:

  • DX12 on Windows 11 with Intel iGPU, Nvidia dGPU and WARP errors with

    [2022-11-10T11:04:34Z ERROR wgpu_hal::auxil::dxgi::exception] ID3D12Resource2::<final-release>: CORRUPTION: An ID3D12Resource object (0x00000204A3036F50:'Unnamed Object') is referenced by GPU operations in-flight on Command Queue (0x000002049A790500:'Unnamed ID3D12CommandQueue Object').  It is not safe to final-release objects that may have GPU operations pending.  This can result in application instability. [ EXECUTION ERROR #921: OBJECT_DELETED_WHILE_STILL_IN_USE]
    
  • Vulkan on Windows 11 with Intel iGPU and Nvidia dGPU passes

  • Vulkan on Linux with llvmpipe and swiftshader segfaults

  • OpenGL on Linux with llvmpipe and Intel iGPU passes


In #3174 the CI seems to have the same issues (segfaults in linux/vulkan and errors with OBJECT_DELETED_WHILE_STILL_IN_USE on windows/dx12). So, finding and fixing the underlying bug might also fix that test case.

@jimblandy
Member

It would be nice to see if this still occurs after arcanization.

@cwfitzgerald
Member Author

I tested it and it does.

@teoxoy
Member

teoxoy commented Jul 16, 2024

#3873 seems to have removed all the .skip(FailureCase::always()) from the poll tests.

I tried running the poll tests on the parent commit and they failed.

I can't tell why they pass with #3873; I don't see any changes in wgpu-core.

@cwfitzgerald do you know why they are working now?

@teoxoy
Member

teoxoy commented Jul 25, 2024

Ah, I see what happened: #3873 added a DummyWorkData struct that holds the resources used by the CommandBuffer, so that they are not dropped before the CommandBuffer.
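
The effect of bundling the resources with the command buffer can be sketched with plain Rust drop semantics. This is an illustration only: `Tracked`, `WorkData`, and the field names are hypothetical stand-ins rather than the actual DummyWorkData definition, and "submit" and "poll" are just log entries, not real queue or device calls.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type DropLog = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical stand-in for a GPU object; records its name when dropped.
struct Tracked {
    name: &'static str,
    log: DropLog,
}

impl Tracked {
    fn new(name: &'static str, log: &DropLog) -> Self {
        Tracked { name, log: Rc::clone(log) }
    }
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Bundle that keeps the resources alive as long as the command buffer.
struct WorkData {
    cmd_buf: Tracked,
    _bind_group: Tracked,
    _buffer: Tracked,
}

/// Helper that returns only the command buffer: its local resources are
/// dropped when the function returns, before any submit or poll happens.
fn cmd_buf_only(log: &DropLog) -> Tracked {
    let _buffer = Tracked::new("buffer", log);
    let _bind_group = Tracked::new("bind group", log);
    Tracked::new("command buffer", log)
}

/// Helper that returns the whole bundle, so nothing is dropped early.
fn bundled(log: &DropLog) -> WorkData {
    WorkData {
        cmd_buf: Tracked::new("command buffer", log),
        _bind_group: Tracked::new("bind group", log),
        _buffer: Tracked::new("buffer", log),
    }
}

/// Returns the order of events when only the command buffer is returned.
fn early_drop_order() -> Vec<&'static str> {
    let log: DropLog = Rc::new(RefCell::new(Vec::new()));
    let cmd_buf = cmd_buf_only(&log);
    log.borrow_mut().push("submit");
    log.borrow_mut().push("poll");
    drop(cmd_buf);
    let events = log.borrow().clone();
    events
}

/// Returns the order of events when the resources travel with the bundle.
fn bundled_drop_order() -> Vec<&'static str> {
    let log: DropLog = Rc::new(RefCell::new(Vec::new()));
    let work = bundled(&log);
    log.borrow_mut().push("submit");
    log.borrow_mut().push("poll");
    drop(work);
    let events = log.borrow().clone();
    events
}
```

In the first variant the buffer and bind group are gone before "submit" is logged; in the second they outlive both "submit" and "poll", which matches the behaviour described above.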

@teoxoy
Member

teoxoy commented Jul 25, 2024

I bisected the fix to aade481 (#4894); more specifically, to the removal of this code:

//Releasing safely unused resources to decrement refcount
bind_group.used_buffer_ranges.write().clear();
bind_group.used_texture_ranges.write().clear();
bind_group.dynamic_binding_info.write().clear();
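
Why clearing those collections matters can be shown with a small reference-counting model. This is a sketch, not wgpu-core code: `Buffer`, `BindGroup`, and `used_buffers` are hypothetical, the field is assumed to hold strong `Arc` references, and `std::sync::RwLock` is used here (hence the extra `unwrap()`), which may differ from the lock type in the original snippet.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for a GPU buffer.
struct Buffer;

/// Hypothetical bind group that holds strong references to what it uses.
struct BindGroup {
    used_buffers: RwLock<Vec<Arc<Buffer>>>,
}

/// Strong counts on the buffer before and after clearing the bind group's list.
fn refcounts_around_clear() -> (usize, usize) {
    let buffer = Arc::new(Buffer);
    let bind_group = BindGroup {
        used_buffers: RwLock::new(vec![Arc::clone(&buffer)]),
    };
    let before = Arc::strong_count(&buffer);
    // Same shape as the removed code: clearing drops the strong references.
    bind_group.used_buffers.write().unwrap().clear();
    let after = Arc::strong_count(&buffer);
    (before, after)
}

/// If the user handle is already gone, the clear releases the last reference.
/// Returns (alive before clear, alive after clear).
fn liveness_around_clear() -> (bool, bool) {
    let buffer = Arc::new(Buffer);
    let observer = Arc::downgrade(&buffer);
    let bind_group = BindGroup {
        used_buffers: RwLock::new(vec![Arc::clone(&buffer)]),
    };
    drop(buffer); // the test dropped its handle early
    let alive_before = observer.upgrade().is_some();
    bind_group.used_buffers.write().unwrap().clear();
    let alive_after = observer.upgrade().is_some();
    (alive_before, alive_after)
}
```

Under these assumptions, once the user handle has been dropped the bind group's reference is the only thing keeping the buffer alive, so clearing the list frees it even if work that uses it is still pending.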
