Fix unrolling for LocateLastFoundChar and LocateLastFoundByte #46977
Despite what the comments say, the pattern employed by these methods is not recognized by the Jit, because the loop will not have the `LPFLG_SIMD_LIMIT` flag set on it. That flag is only set for loops whose limits have `GTF_ICON_SIMD_COUNT` set on them, which in turn is only set for `Vector<T>.Count` nodes during importation. This can be confirmed by looking at the assembly:
This PR fixes the issue by introducing a new loop counter that satisfies the Jit's requirements. I have also collected a small benchmark (source can be found here: https://gist.github.com/SingleAccretion/d17ef1b0e885b50fffed47c82afdc29e). This is only for illustrative/sanity checking purposes, as:
Notes:
- The `i--` at the end of the loop body is deliberate: `for (int j = 0; j < Vector<ulong>.Count; j++, i--)` breaks the recognition of the pattern (`for (int j = 0; j < Vector<ulong>.Count; i--, j++)` works).
- The `Byte` and `Char` versions differ only in their last line. The methods could be folded into one without performance loss. However, this would disrupt the nice "Char/Byte" pattern that the source files have. My personal opinion is that it's not worth it.
- A search of `source.dot.net` for places where `Vector<T>.Count` is used suggests that this problematic pattern is not widespread (only the fixed methods show up). The unrolling today is laser-focused on the `for (int i = 0; i < Vector<T>.Count; i++)` case, and I think it makes sense to leave it that way until more significant investment in the area is determined to be needed.
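
For illustration, the loop shape described in the notes can be sketched roughly as follows. This is a minimal sketch and not the exact code in this PR: the names `LastFoundSketch` and `LocateLastNonZeroByte` are hypothetical, a little-endian target is assumed, and the `-1` return for an all-zero vector is added only so the sketch is total. Whether the Jit actually unrolls the loop depends on the runtime version in use.

```csharp
using System;
using System.Numerics;

// Hypothetical sketch; names are illustrative and not taken from the PR.
public static class LastFoundSketch
{
    // Returns the index of the last non-zero byte in `match`, or -1 if there is none.
    // Assumes a little-endian target.
    public static int LocateLastNonZeroByte(Vector<byte> match)
    {
        Vector<ulong> lanes = Vector.AsVectorUInt64(match);
        ulong candidate = 0;
        int i = Vector<ulong>.Count - 1;

        // The loop limit is Vector<ulong>.Count and the counter j runs upwards from 0,
        // which is the shape the PR text says the Jit recognizes. The real index i
        // is decremented at the end of the loop body, not in the loop header.
        for (int j = 0; j < Vector<ulong>.Count; j++)
        {
            candidate = lanes[i];
            if (candidate != 0)
            {
                break;
            }

            i--;
        }

        if (candidate == 0)
        {
            return -1; // no non-zero byte anywhere in the vector
        }

        // Log2 gives the highest set bit; shifting by 3 converts a bit index to a byte index.
        return i * 8 + (BitOperations.Log2(candidate) >> 3);
    }
}
```

Calling `LastFoundSketch.LocateLastNonZeroByte(new Vector<byte>(data))` on a `byte[]` of length `Vector<byte>.Count` scans the 64-bit lanes from the highest to the lowest and stops at the first lane that contains a set bit.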