[RISCV] Shrink vslidedown when lowering fixed extract_subvector #65598
Merged
Conversation
preames reviewed Sep 7, 2023
preames reviewed Sep 7, 2023
preames approved these changes Sep 11, 2023
LGTM
As noted in llvm#65392 (comment), when lowering an extract of a fixed-length vector from another vector, we don't need to perform the vslidedown on the full vector type. Instead we can extract the smallest subregister that contains the subvector to be extracted and perform the vslidedown with a smaller LMUL. E.g. with +Zvl128b:

v2i64 = extract_subvector nxv4i64, 2

is currently lowered as

vsetivli zero, 2, e64, m4, ta, ma
vslidedown.vi v8, v8, 2

This patch shrinks the vslidedown to LMUL=2:

vsetivli zero, 2, e64, m2, ta, ma
vslidedown.vi v8, v8, 2

This works because we know that there are at least 128*2=256 bits in v8 at LMUL=2, and we only need the first 256 bits to extract a v2i64 at index 2.

lowerEXTRACT_VECTOR_ELT already has this logic, so this patch extracts it out and reuses it.

I've split this out into a separate PR rather than include it in llvm#65392, with the hope that we'll be able to generalize it later.
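The LMUL-shrinking arithmetic in the description can be sketched outside of LLVM. This is a hedged illustration in Python, not the patch's actual C++ code; the function name and structure are invented for clarity:

```python
# Illustrative sketch (not LLVM's C++): find the smallest power-of-two
# LMUL whose register group is still guaranteed to contain the subvector,
# given the minimum VLEN in bits implied by an extension like +Zvl128b.

def min_lmul_for_extract(vlen_min_bits, elt_bits, index, num_elts):
    # The extract only reads source bits below (index + num_elts) * elt_bits,
    # so any LMUL whose register group holds at least that many bits works.
    needed_bits = (index + num_elts) * elt_bits
    lmul = 1
    while vlen_min_bits * lmul < needed_bits:
        lmul *= 2
    return lmul

# The example from the patch: extract v2i64 at index 2 with +Zvl128b.
# (2 + 2) * 64 = 256 bits are needed, and 128 * 2 = 256, so m2 suffices.
print(min_lmul_for_extract(128, 64, 2, 2))  # prints 2 (m2 instead of m4)
```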
adaa712 to 07e5fed
lukel97 added a commit to lukel97/llvm-project that referenced this pull request Sep 18, 2023
If we know the VL and offset of a vslidedown_vl, we can work out the minimum number of registers it's going to operate across. We can reuse the logic from extract_vector_elt to perform it in a smaller type and reduce the LMUL.

The aim is to generalize llvm#65598 and hopefully extend this to vslideup_vl too, so that we can get the same optimisation for insert_subvector and insert_vector_elt.

One observation from adding this is that the vslide*_vl nodes all take a mask operand, but currently anything other than vmset_vl will fail to select, as all the patterns expect true_mask. So we need to create a new vmset_vl instead of using extract_subvector on the existing vmset_vl.
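The "minimum number of registers" computation described above comes down to the same kind of bound. Again a hedged Python illustration with invented names, not the actual DAG-lowering code:

```python
# Illustrative sketch: a vslidedown with a given VL and offset reads source
# elements only up to index (offset + vl - 1), so at the minimum VLEN it can
# touch at most a bounded number of vector registers.

def registers_slid_across(vlen_min_bits, sew_bits, offset, vl):
    highest_bit_read = (offset + vl) * sew_bits
    # Ceiling division: how many whole registers cover that many bits.
    return -(-highest_bit_read // vlen_min_bits)

# With VLEN >= 128, SEW = 64, offset = 2 and VL = 2, the slide only needs
# the first two registers of the group, so it can run at LMUL = 2.
print(registers_slid_across(128, 64, 2, 2))  # prints 2
```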
ZijunZhaoCCK pushed a commit to ZijunZhaoCCK/llvm-project that referenced this pull request Sep 19, 2023
…#65598)

As noted in llvm#65392 (comment), when lowering an extract of a fixed-length vector from another vector, we don't need to perform the vslidedown on the full vector type. Instead we can extract the smallest subregister that contains the subvector to be extracted and perform the vslidedown with a smaller LMUL. E.g. with +Zvl128b:

v2i64 = extract_subvector nxv4i64, 2

is currently lowered as

vsetivli zero, 2, e64, m4, ta, ma
vslidedown.vi v8, v8, 2

This patch shrinks the vslidedown to LMUL=2:

vsetivli zero, 2, e64, m2, ta, ma
vslidedown.vi v8, v8, 2

This works because we know that there are at least 128*2=256 bits in v8 at LMUL=2, and we only need the first 256 bits to extract a v2i64 at index 2.

lowerEXTRACT_VECTOR_ELT already has this logic, so this patch extracts it out and reuses it.

I've split this out into a separate PR rather than include it in llvm#65392, with the hope that we'll be able to generalize it later.

This patch refactors extract_subvector lowering to lower to extract_subreg directly, and to shortcut whenever the index is 0 when extracting a scalable vector. This doesn't change any of the existing behaviour, but makes an upcoming patch that extends the scalable path slightly easier to read.
lukel97 added a commit to lukel97/llvm-project that referenced this pull request Sep 21, 2023
Similar to llvm#65598, if we're using a vslideup to insert a fixed length vector into another vector, then we can work out the minimum number of registers it will need to slide up across given the minimum VLEN, and shrink the type operated on to reduce LMUL accordingly.

This is somewhat dependent on llvm#65916, since it introduces a subregister copy that triggers a crash with -early-live-intervals in one of the tests.
lukel97 added a commit that referenced this pull request Sep 21, 2023
…65997)

Similar to #65598, if we're using a vslideup to insert a fixed length vector into another vector, then we can work out the minimum number of registers it will need to slide up across given the minimum VLEN, and shrink the type operated on to reduce LMUL accordingly.

This is somewhat dependent on #66211, since it introduces a subregister copy that triggers a crash with -early-live-intervals in one of the tests.

Stacked upon #66211