Disable streams for the DML EP #19481

Merged
PatriceVignola merged 2 commits into main from user/pavignol/disable-streams-dml-ep
Feb 10, 2024

PatriceVignola
Contributor

There's currently a bug in the allocation planner: when buffers are reused and more than one stream is used, it is possible (although rare) for a buffer's reference count to reach 0 while the buffer is still in use. Since DML doesn't benefit from multiple streams, disabling them is the safest option for now.
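To illustrate the kind of hazard described above, here is a minimal toy sketch. It is not ONNX Runtime's allocation planner: `Buffer`, `BufferPool`, and `run` are hypothetical names made up for this example, and the "streams" are only simulated by the order of operations. It assumes a planner that assigns each buffer a reference count equal to its number of consumers, and shows what happens if that count comes out one too low.

```python
class Buffer:
    """A reusable block of memory with a planner-assigned reference count."""

    def __init__(self, ref_count):
        self.ref_count = ref_count
        self.data = None


class BufferPool:
    """Hands out buffers and recycles any whose reference count reaches 0."""

    def __init__(self):
        self.free = []

    def acquire(self, ref_count):
        # Prefer recycling a freed buffer over allocating a new one.
        if self.free:
            buf = self.free.pop()
            buf.ref_count = ref_count
            return buf
        return Buffer(ref_count)

    def release(self, buf):
        buf.ref_count -= 1
        if buf.ref_count == 0:
            self.free.append(buf)


def run(planned_consumers):
    """Simulate two consumers of one tensor; return what the second one reads.

    planned_consumers is the reference count the (hypothetical) planner
    assigned. The tensor really has 2 consumers, so any value below 2
    models an undercount.
    """
    pool = BufferPool()

    # Producer writes tensor "A".
    tensor_a = pool.acquire(planned_consumers)
    tensor_a.data = "A"

    # Consumer on simulated stream 0 reads the tensor and releases it.
    assert tensor_a.data == "A"
    pool.release(tensor_a)

    # An unrelated node asks for an output buffer and writes "B" into it.
    tensor_b = pool.acquire(1)
    tensor_b.data = "B"

    # Consumer on simulated stream 1 runs late and reads the tensor.
    return tensor_a.data
```

In this toy model, `run(2)` returns `"A"` because the buffer stays alive until both consumers are done, while `run(1)` returns `"B"` because the buffer was recycled and overwritten before the second consumer read it.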

This is a high-priority issue that we need to fix for 1.17.1 since it breaks Stable Diffusion. Identifying and implementing the perfect fix for the underlying issue would be too risky for a patch release, especially given the limited time that we have.

#19480

@PatriceVignola
Contributor Author

/azp run Linux GPU CI Pipeline (Linux_Test)


No pipelines are associated with this pull request.

@PatriceVignola
Contributor Author

@snnn I keep getting CUDA failure 100: no CUDA-capable device is detected ; GPU=0 ; hostname=12a729cd64de ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider_info.cc ; line=138 ; expr=cudaGetDeviceCount(&num_devices); on the Linux build.

@PatriceVignola
Contributor Author

/azp run Linux GPU CI Pipeline


Azure Pipelines successfully started running 1 pipeline(s).

@PatriceVignola
Contributor Author

/azp run Linux GPU CI Pipeline


Azure Pipelines successfully started running 1 pipeline(s).

@PatriceVignola PatriceVignola merged commit 1182b55 into main Feb 10, 2024
94 checks passed
@PatriceVignola PatriceVignola deleted the user/pavignol/disable-streams-dml-ep branch February 10, 2024 08:34
YUNQIUGUO pushed a commit that referenced this pull request Feb 11, 2024
Contributor

@fdwr fdwr left a comment

Thank you Pat for restoring SD.

3 participants