AVX-512 throughput improvement opportunities #83946
Comments
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
Link: #83648
Related #83109
Assigning to @kunalspathak. Please feel free to reassign.
The LSRA TP improvements mentioned in #83648 (comment) and #83648 (comment) are for improving the
The PR to enable EVEX support by default introduced some JIT throughput regressions. The comments in that PR analyzed the cause of these regressions and identified possible follow-up investigations and improvements.
This issue tracks recovering some of the throughput (TP) regressions by investigating the proposed improvements or mitigations.
For example, LSRA has a number of places that loop over the register set. With AVX-512 available, there are an additional 16 SIMD registers and 8 opmask (k) registers, so these loops iterate more.
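To make the cost concrete, here is a minimal sketch of how a loop over every register index scales with the size of the register file. It is an illustration only, not the JIT's actual code: the names `RegMask`, `visitAllRegisters`, `visitSetRegisters`, and the two register counts are hypothetical, and the base count of 32 is an assumption (the issue only states that AVX-512 adds 16 SIMD and 8 opmask registers). The second function shows one generic mitigation, visiting only the set bits of a mask; it is not necessarily the change proposed in the linked PR comments.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register counts, for illustration only.
constexpr unsigned kRegCountBase   = 32;                     // assumed baseline
constexpr unsigned kRegCountAvx512 = kRegCountBase + 16 + 8; // +16 SIMD, +8 opmask (k)

using RegMask = std::uint64_t; // one bit per register (hypothetical layout)

struct VisitResult
{
    unsigned visited;    // registers actually present in the mask
    unsigned iterations; // loop trips taken to find them
};

// Walks every register index and tests membership.
// The trip count equals regCount, no matter how few bits are set.
VisitResult visitAllRegisters(RegMask mask, unsigned regCount)
{
    assert(regCount <= 64);
    VisitResult result{0, 0};
    for (unsigned reg = 0; reg < regCount; ++reg)
    {
        ++result.iterations;
        if ((mask & (RegMask{1} << reg)) != 0)
        {
            ++result.visited;
        }
    }
    return result;
}

// Visits only the set bits of the mask.
// The trip count equals the number of registers in the mask.
VisitResult visitSetRegisters(RegMask mask)
{
    VisitResult result{0, 0};
    while (mask != 0)
    {
        ++result.iterations;
        ++result.visited;
        mask &= mask - 1; // clear the lowest set bit
    }
    return result;
}
```

For a mask with three registers set, `visitAllRegisters` takes 32 trips with the assumed baseline count and 56 with the AVX-512 count, while `visitSetRegisters` takes 3 trips in both cases.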