Merge pull request #3825 from Sonicadvance1/scale_64bit_gather
AVX128: Prescale addresses in gathers if possible
Sonicadvance1 authored Jul 6, 2024
2 parents bbf8dde + 6e8ca3b commit 47d077f
Showing 3 changed files with 181 additions and 392 deletions.
13 changes: 13 additions & 0 deletions FEXCore/Source/Interface/Core/OpcodeDispatcher/AVX_128.cpp
@@ -2603,6 +2603,19 @@ OpDispatchBuilder::RefPair OpDispatchBuilder::AVX128_VPGatherImpl(OpSize Size, O
BaseAddr = Invalid();
}

if (ElementLoadSize == OpSize::i64Bit && AddrElementSize == OpSize::i64Bit && (VSIB.Scale == 2 || VSIB.Scale == 4) &&
CTX->HostFeatures.SupportsSVE128) {
// SVE gather instructions don't support scaling their vector elements by anything other than 1 or the address element size.
// Pre-scale 64-bit addresses when the scale doesn't match, in order to hit SVE code paths more frequently.
// Only hit this path if the host supports SVE. Otherwise it's a degradation for the ASIMD codepath.
VSIB.Low = _VShlI(OpSize::i128Bit, OpSize::i64Bit, VSIB.Low, FEXCore::ilog2(VSIB.Scale));
if (!Is128Bit) {
VSIB.High = _VShlI(OpSize::i128Bit, OpSize::i64Bit, VSIB.High, FEXCore::ilog2(VSIB.Scale));
}
///< Set the scale to one now that it has been prescaled.
VSIB.Scale = 1;
}

RefPair Result {};
///< Calculate the low-half.
Result.Low = _VLoadVectorGatherMasked(OpSize::i128Bit, ElementLoadSize, Dest.Low, Mask.Low, BaseAddr, VSIB.Low, VSIB.High,
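
The prescale above relies on a simple identity: for a power-of-two scale, `base + index * scale` equals `base + (index << log2(scale)) * 1`, so the gather can be issued with a scale of 1 once the indices have been shifted. With 64-bit elements, scales 1 and 8 already match what the comment says SVE accepts, which leaves 2 and 4 as the cases worth prescaling. The following is a minimal scalar sketch of that identity, not FEX code: `Indices`, `Log2Scale`, `ScaledAddresses`, and `Prescale` are hypothetical names introduced here for illustration, and the two-lane array only stands in for one 128-bit half of the VSIB index register.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for two 64-bit index lanes (one 128-bit half); not a FEX type.
using Indices = std::array<uint64_t, 2>;

// Integer log2 for power-of-two scales (1, 2, 4, 8).
// Hypothetical helper; it plays the role FEXCore::ilog2 plays in the diff.
inline unsigned Log2Scale(uint64_t Scale) {
  assert(Scale != 0 && (Scale & (Scale - 1)) == 0);
  unsigned Shift = 0;
  while ((Scale >> Shift) != 1) {
    ++Shift;
  }
  return Shift;
}

// Address each lane would load from when the scale is applied at gather time.
inline Indices ScaledAddresses(uint64_t Base, const Indices& Index, uint64_t Scale) {
  return {Base + Index[0] * Scale, Base + Index[1] * Scale};
}

// Models the prescale step: shift every index left by log2(Scale),
// after which the gather can use a scale of 1.
inline Indices Prescale(const Indices& Index, uint64_t Scale) {
  const unsigned Shift = Log2Scale(Scale);
  return {Index[0] << Shift, Index[1] << Shift};
}
```

Because unsigned 64-bit multiplication and left shift both wrap modulo 2^64, `ScaledAddresses(Base, Index, Scale)` and `ScaledAddresses(Base, Prescale(Index, Scale), 1)` agree for every index value in this model, including indices that encode negative offsets.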