[Community Pipelines] Accelerate inference of stable diffusion by IPEX on CPU #3105
Conversation
The documentation is not available anymore as the PR was closed or merged.
Hi @patrickvonplaten, could you help review it? Thanks!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hi @williamberman @patrickvonplaten, is there any problem with this PR? Could you help me review it? It has been open for some time with no review, and I'm not sure what I can do to move it forward.
bumping |
It's hard for me to understand the benefits of this optimization given that the test results were obtained on a Xeon Platinum CPU. I'd suggest explaining the compatibility and limitations of this method:
- Does this work on any Intel CPU? Or is it intended for server hardware? Does it work on Windows?
- What versions of PyTorch does it require?
- How does performance compare against PyTorch 2 (with and without torch.compile())?
@pcuenca feel free to merge if things look good to you
Pedro's suggestions make sense, happy to merge after they've been addressed :) Sorry for the delay here @yingjie-han
@pcuenca Thanks very much for your review and suggestions. They make sense.
Thanks a lot for iterating here! 🙌
I think the purpose of the pipeline is clearer now. I just suggested a couple of minor text modifications and then we are ready to merge.
Awesome contribution @yingjie-han, thanks a lot for your patience!
@pcuenca Thanks a lot for your review and suggestions. The modifications are committed, and it's ready to merge.
Fixed the conflicts, merging now. Thanks again, @yingjie-han!
… on CPU (huggingface#3105)

* add stable_diffusion_ipex community pipeline
* Update readme.md
* reformat
* reformat
* Update examples/community/README.md
* Update examples/community/README.md
* Update examples/community/README.md
* Update examples/community/README.md
* Apply suggestions from code review
* Update README.md
* Update README.md
* Apply suggestions from code review
* style

Co-authored-by: Pedro Cuenca <[email protected]>
This diffusion pipeline aims to speed up inference of Stable Diffusion on Intel Xeon CPUs on Linux. It achieves about a 1.5x performance acceleration with BFloat16 on fourth-generation Intel Xeon CPUs, code-named Sapphire Rapids.
![Benchmark results for the IPEX-accelerated pipeline](https://private-user-images.githubusercontent.com/96510654/239459274-7123dea5-5a66-4b48-a3cd-a04eb3b0d728.png)
It is recommended to run on PyTorch/IPEX v2.0 to get the best performance boost:

- For PyTorch/IPEX v2.0, it benefits from the MHA optimization with Flash Attention and TorchScript mode optimization in IPEX.
- For PyTorch/IPEX v1.13, it benefits from the TorchScript mode optimization in IPEX.
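For context, here is a minimal usage sketch along the lines of the community pipeline's README. The exact `prepare_for_ipex` signature and the model ID are assumptions; see the merged README under `examples/community` for the authoritative API.

```python
import torch
from diffusers import DiffusionPipeline

# Assumes intel_extension_for_pytorch is installed:
#   python -m pip install intel_extension_for_pytorch
prompt = "a photo of an astronaut riding a horse on mars"

# Load Stable Diffusion with the community pipeline class.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed model ID for illustration
    custom_pipeline="stable_diffusion_ipex",
)

# One-time IPEX optimization pass; height/width should be consistent
# with the values used in the inference call below.
pipe.prepare_for_ipex(prompt, dtype=torch.bfloat16, height=512, width=512)

# Run the optimized BFloat16 path under CPU autocast.
with torch.cpu.amp.autocast(enabled=True, dtype=torch.bfloat16):
    image = pipe(prompt, num_inference_steps=20, height=512, width=512).images[0]
image.save("astronaut.png")
```

For Float32, the same flow applies with `dtype=torch.float32` and no autocast context.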
The following tables show the test results on an Intel® Xeon® Platinum 8480 Processor (56 cores):