
[Community Pipelines] Accelerate inference of stable diffusion by IPEX on CPU #3105

Merged: 15 commits into huggingface:main on May 23, 2023

Conversation

yingjie-han
Contributor

@yingjie-han yingjie-han commented Apr 14, 2023

This diffusion pipeline aims to speed up inference of Stable Diffusion on Intel Xeon CPUs on Linux. It achieves about a 1.5x speedup with BFloat16 on 4th-generation Intel Xeon CPUs, code-named Sapphire Rapids.
It is recommended to run on PyTorch/IPEX v2.0 to get the best performance boost.
- For PyTorch/IPEX v2.0, it benefits from the MHA optimization with Flash Attention and TorchScript-mode optimization in IPEX.
- For PyTorch/IPEX v1.13, it benefits from TorchScript-mode optimization in IPEX.
The following table shows test results on an Intel® Xeon® Platinum 8480 processor (56 cores):
[image: benchmark results table]
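For context, the community pipeline added by this PR is used roughly as follows. This is a sketch based on the `stable_diffusion_ipex` community pipeline and its `prepare_for_ipex` helper as merged; it requires `intel_extension_for_pytorch` and Intel Xeon hardware on Linux, so it is not runnable everywhere, and exact method signatures may differ across versions:

```python
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401  (registers IPEX optimizations on import)
from diffusers import DiffusionPipeline

prompt = "sailing ship in storm by Rembrandt"

# Load the community pipeline on top of a Stable Diffusion checkpoint.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    custom_pipeline="stable_diffusion_ipex",
)

# Trace and optimize the submodels for the target dtype and image shape.
# For BFloat16 (Cooper Lake / Sapphire Rapids):
pipe.prepare_for_ipex(prompt, dtype=torch.bfloat16, height=512, width=512)

# BF16 inference runs under CPU autocast.
with torch.cpu.amp.autocast(enabled=True, dtype=torch.bfloat16):
    image = pipe(prompt, num_inference_steps=20, height=512, width=512).images[0]
image.save("ship.png")
```

For FP32, call `pipe.prepare_for_ipex(prompt, dtype=torch.float32, height=512, width=512)` instead and run inference without the autocast context.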

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Apr 14, 2023

The documentation is not available anymore as the PR was closed or merged.

@yingjie-han
Contributor Author

yingjie-han commented Apr 18, 2023

Hi @patrickvonplaten, could you help review it? Thanks!

@yingjie-han yingjie-han marked this pull request as draft April 18, 2023 08:23
@yingjie-han yingjie-han marked this pull request as ready for review April 18, 2023 08:24
@yingjie-han yingjie-han mentioned this pull request Apr 24, 2023
6 tasks
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label May 14, 2023
@yingjie-han yingjie-han changed the title Add community pipeline to accelerate inference of stable diffusion by IPEX on Intel CPUs [Community Pipelines]Accelerate inference of stable diffusion by IPEX on CPU May 16, 2023
@yingjie-han yingjie-han reopened this May 16, 2023
@yingjie-han
Contributor Author

Hi @williamberman @patrickvonplaten, is there any problem with this PR? Could you help me review it? It has been open for some time without a review, and I'm not sure what I can do to move it along.

@yingjie-han
Contributor Author

> This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
>
> Please note that issues that do not follow the contributing guidelines are likely to be ignored.

bumping

@pcuenca pcuenca removed the stale Issues that haven't received updates label May 16, 2023
Member

@pcuenca pcuenca left a comment

It's hard for me to understand the benefits of this optimization given that the test results were obtained on a Xeon Platinum CPU. I'd suggest explaining the compatibility and limitations of this method:

  • Does this work on any Intel CPU? Or is it intended for server hardware? Does it work on Windows?
  • What versions of PyTorch does it require?
  • How does performance compare against PyTorch 2 (with and without torch.compile())?

(Review comments on examples/community/README.md and examples/community/stable_diffusion_ipex.py, all resolved)
Contributor

@patrickvonplaten patrickvonplaten left a comment

@pcuenca feel free to merge if things look good to you

Contributor

@williamberman williamberman left a comment

Pedro's suggestions make sense, happy to merge after they've been addressed :) Sorry for the delay here @yingjie-han

@yingjie-han
Contributor Author

> It's hard for me to understand the benefits of this optimization given that the test results were obtained on a Xeon Platinum CPU. I'd suggest explaining the compatibility and limitations of this method:
>
>   • Does this work on any Intel CPU? Or is it intended for server hardware? Does it work on Windows?
>   • What versions of PyTorch does it require?
>   • How does performance compare against PyTorch 2 (with and without torch.compile())?

@pcuenca Thanks very much for your review and suggestions. They make sense.
Here are the answers to your questions:
• Does this work on any Intel CPU? Or is it intended for server hardware? Does it work on Windows?
-- This pipeline is intended for Intel Xeon CPUs on Linux, not only Xeon Platinum. For FP32, it should benefit every generation of Xeon (e.g., Skylake / Cascade Lake / Ice Lake), while BF16 works on Cooper Lake and Sapphire Rapids. Since it relies on Intel Extension for PyTorch, this pipeline only supports server CPUs and does not work on Windows.
• What versions of PyTorch does it require?
-- It is recommended to use PyTorch/IPEX 2.0 to get the best performance boost from the Flash Attention optimization. It also works on PyTorch/IPEX 1.13.
• How does performance compare against PyTorch 2 (with and without torch.compile())?
-- PyTorch 2.0's torch.compile() does not support the BF16 data type. For FP32, this IPEX-optimized pipeline gets better performance than PyTorch 2.0 with torch.compile(). I updated the test data in the table above; please check it.
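Since BF16 only pays off on CPUs that expose native BF16 instructions (Cooper Lake's avx512_bf16, Sapphire Rapids' AMX), a quick capability check can help decide which dtype to request. This is a hypothetical helper, not part of the PR; it reads the Linux CPU flags from /proc/cpuinfo:

```python
import os


def supports_native_bf16(cpuinfo_path: str = "/proc/cpuinfo") -> bool:
    """Return True if the CPU flags advertise native BF16 instructions.

    Cooper Lake exposes avx512_bf16; Sapphire Rapids exposes amx_bf16.
    On non-Linux systems /proc/cpuinfo does not exist, so we return False.
    """
    if not os.path.exists(cpuinfo_path):
        return False
    with open(cpuinfo_path) as f:
        flags = f.read()
    return any(flag in flags for flag in ("avx512_bf16", "amx_bf16"))


print("Native BF16 support:", supports_native_bf16())
```

On a machine without these flags one would fall back to FP32 (e.g., pass `dtype=torch.float32` to the pipeline's preparation step).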

Member

@pcuenca pcuenca left a comment

Thanks a lot for iterating here! 🙌

I think the purpose of the pipeline is clearer now. I just suggested a couple of minor text modifications and then we are ready to merge.

Awesome contribution @yingjie-han, thanks a lot for your patience!

(Review comments on examples/community/README.md, all resolved)
@yingjie-han
Contributor Author

@pcuenca Thanks a lot for your review and suggestions. The modifications are committed; it's ready to merge.

@pcuenca
Member

pcuenca commented May 23, 2023

Fixed the conflicts, merging now. Thanks again, @yingjie-han!

@pcuenca pcuenca merged commit edc6505 into huggingface:main May 23, 2023
@yingjie-han yingjie-han deleted the ipex_pipeline branch June 6, 2023 07:32
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
… on CPU (huggingface#3105)

* add stable_diffusion_ipex community pipeline

* Update readme.md

* reformat

* reformat

* Update examples/community/README.md

Co-authored-by: Pedro Cuenca <[email protected]>

* Update examples/community/README.md

Co-authored-by: Pedro Cuenca <[email protected]>

* Update examples/community/README.md

Co-authored-by: Pedro Cuenca <[email protected]>

* Update examples/community/README.md

Co-authored-by: Pedro Cuenca <[email protected]>

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <[email protected]>

* Update README.md

* Update README.md

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <[email protected]>

* style

---------

Co-authored-by: Pedro Cuenca <[email protected]>