Improve Decode Shape Performance for AMD FP8 #2658

jwfromm · 2024-06-01T00:21:58Z

Summary: Add tuning config for decode workloads. Improves performance substantially for shapes with small M.

Differential Revision: D58031289
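
For readers unfamiliar with shape-based tuning, the idea in the summary can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `TileConfig`, `select_config`, the tile sizes, and the `DECODE_M_THRESHOLD` cutoff are all hypothetical names and values chosen only to show how a GEMM dispatcher might pick a separate config when M is small, as in decode workloads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TileConfig:
    """Hypothetical kernel tile sizes; not taken from the FBGEMM source."""
    block_m: int
    block_n: int
    block_k: int


# Illustrative values only; real tuned values come from benchmarking.
DECODE_CONFIG = TileConfig(block_m=16, block_n=64, block_k=256)
DEFAULT_CONFIG = TileConfig(block_m=256, block_n=128, block_k=64)

# Assumed cutoff for what counts as "small M"; the PR does not state one.
DECODE_M_THRESHOLD = 16


def select_config(m: int, n: int, k: int) -> TileConfig:
    """Pick a tile config for an (M, K) x (K, N) GEMM based on M."""
    if m <= 0 or n <= 0 or k <= 0:
        raise ValueError("GEMM dimensions must be positive")
    # Decode steps process few tokens at a time, so M is small and a
    # large M tile would leave most of each tile unused.
    if m <= DECODE_M_THRESHOLD:
        return DECODE_CONFIG
    return DEFAULT_CONFIG
```

Under these assumptions, `select_config(1, 4096, 4096)` returns the decode config, while a prefill-sized call such as `select_config(2048, 4096, 4096)` falls through to the default.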

facebook-github-bot · 2024-06-01T00:22:07Z

This pull request was exported from Phabricator. Differential Revision: D58031289

netlify · 2024-06-01T00:22:14Z

✅ Deploy Preview for pytorch-fbgemm-docs ready!

Name	Link
🔨 Latest commit	`0e1553d`
🔍 Latest deploy log	https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/665e5a51e6e5b20008b0e099
😎 Deploy Preview	https://deploy-preview-2658--pytorch-fbgemm-docs.netlify.app

facebook-github-bot · 2024-06-03T23:49:55Z

This pull request was exported from Phabricator. Differential Revision: D58031289

facebook-github-bot · 2024-06-03T23:50:35Z

This pull request was exported from Phabricator. Differential Revision: D58031289

Summary: Pull Request resolved: pytorch#2658 Add tuning config for decode workloads. Improves performance substantially for shapes with small M. Differential Revision: D58031289

facebook-github-bot · 2024-06-04T00:05:05Z

This pull request was exported from Phabricator. Differential Revision: D58031289

facebook-github-bot · 2024-06-04T22:56:41Z

This pull request has been merged in 849c1ff.

facebook-github-bot added the cla signed label Jun 1, 2024

facebook-github-bot added the fb-exported label Jun 1, 2024

jwfromm force-pushed the export-D58031289 branch from b9417b0 to 8e07ad2 Compare June 3, 2024 23:49

jwfromm force-pushed the export-D58031289 branch from 8e07ad2 to 0338a93 Compare June 3, 2024 23:50

Improve Decode Shape Performance for AMD FP8 (pytorch#2658)

0e1553d

Summary: Pull Request resolved: pytorch#2658 Add tuning config for decode workloads. Improves performance substantially for shapes with small M. Differential Revision: D58031289

jwfromm force-pushed the export-D58031289 branch from 0338a93 to 0e1553d Compare June 4, 2024 00:05

facebook-github-bot closed this in 849c1ff Jun 4, 2024

facebook-github-bot added the Merged label Jun 4, 2024
