Fix for DX12 bug not released, affecting downstream users #2592

Closed
eduardvercaemer opened this issue Apr 13, 2022 · 9 comments

@eduardvercaemer

Description
A bug in DX12 causes wgpu 0.12.0 to report an error and exit. The fix is in the latest commit but has not been released yet, which makes downstream projects such as Bevy unusable for affected users.

Patching against the latest branch is not an option either, because it introduces breaking changes.

I would like to know if there is a way of releasing the fix without introducing breaking changes.

Repro steps

  1. git clone wgpu
  2. checkout release 0.12.0
  3. cargo run --example boids

Expected vs observed behavior
The example is expected to run. Instead, it fails with this error:

[2022-04-13T03:54:26Z INFO  wgpu_core::instance] Adapter Dx12 AdapterInfo { name: "NVIDIA GeForce GTX 1650 Ti", vendor: 4318, device: 8085, device_type: DiscreteGpu, backend: Dx12 }
Using NVIDIA GeForce GTX 1650 Ti (Dx12)
[2022-04-13T03:54:26Z ERROR wgpu_hal::dx12::instance] ID3D12CommandQueue::Present: Resource state (0x800: D3D12_RESOURCE_STATE_COPY_SOURCE) (promoted from COMMON state) of resource (0x00000172B78A4540:'Unnamed ID3D12Resource Object') (subresource: 0) must be in COMMON state when transitioning to use in a different Command List type, because resource state on previous Command List type : D3D12_COMMAND_LIST_TYPE_COPY, is actually incompatible and different from that on the next Command List type : D3D12_COMMAND_LIST_TYPE_DIRECT. [ RESOURCE_MANIPULATION ERROR #990: RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE]
error: process didn't exit successfully: `target\debug\examples\boids.exe` (exit code: 1)

Extra materials
I bisected the fix to commit 4d7f6eb: from this commit onward, the expected behavior occurs.

This is the full output of the example when run on 4d7f6eb:

[2022-04-13T04:07:02Z INFO  wgpu_core::instance] Adapter Dx12 AdapterInfo { name: "NVIDIA GeForce GTX 1650 Ti", vendor: 4318, device: 8085, device_type: DiscreteGpu, backend: Dx12 }
Using NVIDIA GeForce GTX 1650 Ti (Dx12)
[2022-04-13T04:07:02Z ERROR wgpu_hal::dx12::instance] ID3D12CommandQueue::Present: Resource state (0x800: D3D12_RESOURCE_STATE_COPY_SOURCE) (promoted from COMMON state) of resource (0x00000297E79D2420:'Unnamed ID3D12Resource Object') (subresource: 0) must be in COMMON state when transitioning to use in a different Command List type, because resource state on previous Command List type : D3D12_COMMAND_LIST_TYPE_COPY, is actually incompatible and different from that on the next Command List type : D3D12_COMMAND_LIST_TYPE_DIRECT. [ RESOURCE_MANIPULATION ERROR #990: RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE]
[2022-04-13T04:07:02Z ERROR wgpu_hal::dx12::instance] ID3D12CommandQueue::Present: Resource state (0x800: D3D12_RESOURCE_STATE_COPY_SOURCE) (promoted from COMMON state) of resource (0x00000297E79D2420:'Unnamed ID3D12Resource Object') (subresource: 0) must be in COMMON state when transitioning to use in a different Command List type, because resource state on previous Command List type : D3D12_COMMAND_LIST_TYPE_COPY, is actually incompatible and different from that on the next Command List type : D3D12_COMMAND_LIST_TYPE_DIRECT. [ RESOURCE_MANIPULATION ERROR #990: RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE]
[2022-04-13T04:07:02Z ERROR wgpu_hal::dx12::instance] ID3D12CommandQueue::Present: Resource state (0x800: D3D12_RESOURCE_STATE_COPY_SOURCE) (promoted from COMMON state) of resource (0x00000297E79D1F90:'Unnamed ID3D12Resource Object') (subresource: 0) must be in COMMON state when transitioning to use in a different Command List type, because resource state on previous Command List type : D3D12_COMMAND_LIST_TYPE_COPY, is actually incompatible and different from that on the next Command List type : D3D12_COMMAND_LIST_TYPE_DIRECT. [ RESOURCE_MANIPULATION ERROR #990: RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE]
[2022-04-13T04:07:02Z ERROR wgpu_hal::dx12::instance] ID3D12CommandQueue::Present: Resource state (0x800: D3D12_RESOURCE_STATE_COPY_SOURCE) (promoted from COMMON state) of resource (0x00000297E79D2420:'Unnamed ID3D12Resource Object') (subresource: 0) must be in COMMON state when transitioning to use in a different Command List type, because resource state on previous Command List type : D3D12_COMMAND_LIST_TYPE_COPY, is actually incompatible and different from that on the next Command List type : D3D12_COMMAND_LIST_TYPE_DIRECT. [ RESOURCE_MANIPULATION ERROR #990: RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE]
[2022-04-13T04:07:03Z ERROR wgpu_hal::dx12::instance] ID3D12CommandQueue::Present: Resource state (0x800: D3D12_RESOURCE_STATE_COPY_SOURCE) (promoted from COMMON state) of resource (0x00000297E79D1F90:'Unnamed ID3D12Resource Object') (subresource: 0) must be in COMMON state when transitioning to use in a different Command List type, because resource state on previous Command List type : D3D12_COMMAND_LIST_TYPE_COPY, is actually incompatible and different from that on the next Command List type : D3D12_COMMAND_LIST_TYPE_DIRECT. [ RESOURCE_MANIPULATION ERROR #990: RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE]
[2022-04-13T04:07:03Z ERROR wgpu_hal::dx12::instance] ID3D12CommandQueue::Present: Resource state (0x800: D3D12_RESOURCE_STATE_COPY_SOURCE) (promoted from COMMON state) of resource (0x00000297E79D2420:'Unnamed ID3D12Resource Object') (subresource: 0) must be in COMMON state when transitioning to use in a different Command List type, because resource state on previous Command List type : D3D12_COMMAND_LIST_TYPE_COPY, is actually incompatible and different from that on the next Command List type : D3D12_COMMAND_LIST_TYPE_DIRECT. [ RESOURCE_MANIPULATION ERROR #990: RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE]
[2022-04-13T04:07:03Z ERROR wgpu_hal::dx12::instance] ID3D12CommandQueue::Present: Resource state (0x800: D3D12_RESOURCE_STATE_COPY_SOURCE) (promoted from COMMON state) of resource (0x00000297E79D2420:'Unnamed ID3D12Resource Object') (subresource: 0) must be in COMMON state when transitioning to use in a different Command List type, because resource state on previous Command List type : D3D12_COMMAND_LIST_TYPE_COPY, is actually incompatible and different from that on the next Command List type : D3D12_COMMAND_LIST_TYPE_DIRECT. [ RESOURCE_MANIPULATION ERROR #990: RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE]
[2022-04-13T04:07:04Z ERROR wgpu_hal::dx12::instance] ID3D12CommandQueue::Present: Resource state (0x800: D3D12_RESOURCE_STATE_COPY_SOURCE) (promoted from COMMON state) of resource (0x00000297E79D1670:'Unnamed ID3D12Resource Object') (subresource: 0) must be in COMMON state when transitioning to use in a different Command List type, because resource state on previous Command List type : D3D12_COMMAND_LIST_TYPE_COPY, is actually incompatible and different from that on the next Command List type : D3D12_COMMAND_LIST_TYPE_DIRECT. [ RESOURCE_MANIPULATION ERROR #990: RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE]
[2022-04-13T04:07:04Z ERROR wgpu_hal::dx12::instance] ID3D12CommandQueue::Present: Resource state (0x800: D3D12_RESOURCE_STATE_COPY_SOURCE) (promoted from COMMON state) of resource (0x00000297E79D2420:'Unnamed ID3D12Resource Object') (subresource: 0) must be in COMMON state when transitioning to use in a different Command List type, because resource state on previous Command List type : D3D12_COMMAND_LIST_TYPE_COPY, is actually incompatible and different from that on the next Command List type : D3D12_COMMAND_LIST_TYPE_DIRECT. [ RESOURCE_MANIPULATION ERROR #990: RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE]
[2022-04-13T04:07:04Z ERROR wgpu_hal::dx12::instance] ID3D12CommandQueue::Present: Resource state (0x800: D3D12_RESOURCE_STATE_COPY_SOURCE) (promoted from COMMON state) of resource (0x00000297E79D1F90:'Unnamed ID3D12Resource Object') (subresource: 0) must be in COMMON state when transitioning to use in a different Command List type, because resource state on previous Command List type : D3D12_COMMAND_LIST_TYPE_COPY, is actually incompatible and different from that on the next Command List type : D3D12_COMMAND_LIST_TYPE_DIRECT. [ RESOURCE_MANIPULATION ERROR #990: RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE]
[2022-04-13T04:07:04Z ERROR wgpu_hal::dx12::instance] ID3D12CommandQueue::Present: Resource state (0x800: D3D12_RESOURCE_STATE_COPY_SOURCE) (promoted from COMMON state) of resource (0x00000297E79D1670:'Unnamed ID3D12Resource Object') (subresource: 0) must be in COMMON state when transitioning to use in a different Command List type, because resource state on previous Command List type : D3D12_COMMAND_LIST_TYPE_COPY, is actually incompatible and different from that on the next Command List type : D3D12_COMMAND_LIST_TYPE_DIRECT. [ RESOURCE_MANIPULATION ERROR #990: RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE]
Avg frame time 17.209745ms
[2022-04-13T04:07:04Z ERROR wgpu_hal::dx12::instance] ID3D12CommandQueue::Present: Resource state (0x800: D3D12_RESOURCE_STATE_COPY_SOURCE) (promoted from COMMON state) of resource (0x00000297E79D1670:'Unnamed ID3D12Resource Object') (subresource: 0) must be in COMMON state when transitioning to use in a different Command List type, because resource state on previous Command List type : D3D12_COMMAND_LIST_TYPE_COPY, is actually incompatible and different from that on the next Command List type : D3D12_COMMAND_LIST_TYPE_DIRECT. [ RESOURCE_MANIPULATION ERROR #990: RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE]
[2022-04-13T04:07:05Z ERROR wgpu_hal::dx12::instance] ID3D12CommandQueue::Present: Resource state (0x800: D3D12_RESOURCE_STATE_COPY_SOURCE) (promoted from COMMON state) of resource (0x00000297E79D1670:'Unnamed ID3D12Resource Object') (subresource: 0) must be in COMMON state when transitioning to use in a different Command List type, because resource state on previous Command List type : D3D12_COMMAND_LIST_TYPE_COPY, is actually incompatible and different from that on the next Command List type : D3D12_COMMAND_LIST_TYPE_DIRECT. [ RESOURCE_MANIPULATION ERROR #990: RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE]
Avg frame time 16.882772ms
Avg frame time 16.918545ms
...

Essentially, the error is now handled as non-fatal, so execution continues and the program works as expected.

This is related to the underlying DX12 bug: https://stackoverflow.com/questions/69805245/directx-12-application-is-crashing-in-windows-11

Platform
OS: Windows 11
Integrated GC: AMD Radeon(TM) Graphics
Discrete GC: NVIDIA GeForce GTX 1650 Ti

Environment
RUST_LOG = error,wgpu_core::instance=info
WGPU_POWER_PREF = high
WGPU_BACKEND = DX12

@eduardvercaemer
Author

It looks like cherry-picking 4d7f6eb on top of v0.12 gives the expected behavior.

@cwfitzgerald
Member

I think we would be okay semver-wise to cherry-pick the wgpu-hal change and release it as a patch.

@eduardvercaemer
Author

This is a similar but separate issue: Vulkan also gives me problems in v0.12, and patching with d45e6b4 fixes it. (If you want, I can create a new issue for this.)

I feel both of these commits could be released as patches without breaking anything.

@cwfitzgerald
Member

cwfitzgerald commented Apr 14, 2022

I believe the ash version is technically a public dependency because of https://github.com/gfx-rs/wgpu/blob/master/wgpu-hal/src/vulkan/adapter.rs#L916 and other "from raw" APIs, so that patch would be a breaking change. It's also an MSRV bump, which we consider breaking.

If you'd be up for making a PR for the DX12 fix into the v0.12 branch, that would be much appreciated!
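
To illustrate the public-dependency point, here is a minimal sketch of why a type from a dependency that appears in a "from raw" style signature ties callers to that dependency's version. Everything in it is hypothetical: `dep_old` and `dep_new` stand in for two semver-incompatible releases of a dependency, and `hal::adapter_from_raw` is an invented function, not wgpu-hal's or ash's actual API.

```rust
// Hypothetical stand-in for the dependency release the library was built against.
mod dep_old {
    pub struct RawHandle(pub u64);
}

// Hypothetical stand-in for a newer, semver-incompatible release.
// The type has the same name and layout but is a distinct type to the compiler.
#[allow(dead_code)]
mod dep_new {
    pub struct RawHandle(pub u64);
}

// Hypothetical library whose public API accepts the dependency's type directly.
mod hal {
    use super::dep_old::RawHandle;

    pub struct Adapter {
        pub raw: u64,
    }

    // Because `RawHandle` appears in this signature, the dependency's version
    // is effectively part of this library's public API.
    pub fn adapter_from_raw(handle: RawHandle) -> Adapter {
        Adapter { raw: handle.0 }
    }
}

fn main() {
    // A caller using the same dependency release as the library compiles fine.
    let adapter = hal::adapter_from_raw(dep_old::RawHandle(7));
    println!("adapter raw handle: {}", adapter.raw);

    // A caller on the other release would fail to compile with a type mismatch:
    // let adapter = hal::adapter_from_raw(dep_new::RawHandle(7));
}
```

If the library switched its signature to `dep_new::RawHandle`, existing callers that pass `dep_old::RawHandle` would stop compiling, which is why such a bump is treated as breaking.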

@kvark
Member

kvark commented Apr 16, 2022

Uh, I'm not quite convinced that porting 4d7f6eb to 0.12 is the solution here. It just makes the error non-fatal. Not just this error - any validation error. wgpu's goal is to guarantee safety. If a validation error happens, it means we have a logical error to fix. We can't just ignore it and continue like nothing happened.

What needs to happen here is for somebody to look at the root cause and fix the actual issue. Alternatively, if Bevy is doing something unexpected, the Bevy code could be changed as a workaround.

@cwfitzgerald
Member

I think the idea is that this isn't actually a bug in wgpu, this is an acknowledged issue in the d3d12 and dxgi debug layers.

@eduardvercaemer
Author

Yes, this should be a bug in D3D12 and not actually in wgpu, so I think the fix really is to ignore it. This link, https://stackoverflow.com/questions/69805245/directx-12-application-is-crashing-in-windows-11, mentions some solutions, but I don't know enough D3D12 to know what implementing them would involve.

@kvark
Member

kvark commented Apr 17, 2022

Oh interesting. I suppose it's good news.
In this case, we should suppress only D3D12_MESSAGE_ID_RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE, not all errors from the validation layer.
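
A minimal sketch of that idea, assuming a simplified model of debug-layer messages: one known false-positive ID is ignored, while every other error stays fatal. The `DebugMessage`, `Severity`, `Action`, and `classify` names are hypothetical and are not wgpu-hal code. The value 990 is taken from the "ERROR #990" text in the log above, and treating it as the numeric message ID is an assumption.

```rust
/// ID as printed in the log above ("ERROR #990:
/// RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE"). Assumed to be the message ID.
const RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE: u32 = 990;

/// IDs treated as known false positives from the debug layer.
const SUPPRESSED_IDS: &[u32] = &[RESOURCE_BARRIER_MISMATCHING_COMMAND_LIST_TYPE];

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Info,
    Warning,
    Error,
}

/// Hypothetical, simplified view of one debug-layer message.
#[derive(Debug, Clone, Copy)]
struct DebugMessage {
    id: u32,
    severity: Severity,
}

#[derive(Debug, PartialEq, Eq)]
enum Action {
    Ignore,
    Log,
    Abort,
}

/// Decide what to do with a message: suppress only the listed IDs,
/// keep all other errors fatal, and log everything else.
fn classify(msg: DebugMessage) -> Action {
    if SUPPRESSED_IDS.contains(&msg.id) {
        return Action::Ignore;
    }
    match msg.severity {
        Severity::Error => Action::Abort,
        Severity::Warning | Severity::Info => Action::Log,
    }
}

fn main() {
    let known_false_positive = DebugMessage { id: 990, severity: Severity::Error };
    let other_error = DebugMessage { id: 1, severity: Severity::Error };

    println!("{:?}", classify(known_false_positive)); // prints "Ignore"
    println!("{:?}", classify(other_error)); // prints "Abort"
}
```

Keeping the suppression list explicit means a new validation error still surfaces as a failure instead of being silently downgraded.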

@cwfitzgerald
Member

cwfitzgerald commented Apr 21, 2022

wgpu-hal 0.12.5 is out.
