[FEATURE] Adds missing extract implementations for AVX512. #2926

rrahn · 2022-01-07T14:47:47Z

This PR adds the missing implementations for the family of extract functions specific to the AVX512 instruction set, as they are much faster than the default implementation.

A bit of background:
These functions extract only a part of the given simd vector: a half (one of 2x256 bit), a quarter (one of 4x128 bit), or an eighth (one of 8x64 bit). They are used in combination with the upcast function in order to sign/zero extend a simd vector to a set of simd vectors with larger operand types, e.g. one vint_8x64_t vector to 2 vint_16x32_t vectors. Here 8 (16) is the bit-size of the operands and 64 (32) is the number of operands that fit into a single __m512i vector type.
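To illustrate the extract-then-upcast pattern described above, here is a scalar reference model. It is only a sketch of the semantics, not the intrinsics-based code from this PR: `vec_8x64`, `vec_16x32`, `extract_part` and `upcast_signed` are hypothetical names, and the model zeroes the unused upper bytes, which a real implementation may leave unspecified.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for a 512-bit vector with 64 x 8-bit / 32 x 16-bit operands.
using vec_8x64 = std::array<std::int8_t, 64>;
using vec_16x32 = std::array<std::int16_t, 32>;

// Copies part `index` out of `parts` equal parts into the lowest bytes of the result.
// parts = 2 models extract_half, 4 extract_quarter, 8 extract_eighth.
template <std::size_t parts, std::size_t index>
vec_8x64 extract_part(vec_8x64 const & src)
{
    static_assert(index < parts, "index must be smaller than the number of parts");
    constexpr std::size_t width = 64 / parts; // operands per part
    vec_8x64 dst{};                           // upper bytes are zeroed in this model
    for (std::size_t i = 0; i < width; ++i)
        dst[i] = src[index * width + i];
    return dst;
}

// Sign-extends the lowest 32 operands of `src` from 8 bit to 16 bit.
inline vec_16x32 upcast_signed(vec_8x64 const & src)
{
    vec_16x32 dst{};
    for (std::size_t i = 0; i < dst.size(); ++i)
        dst[i] = static_cast<std::int16_t>(src[i]);
    return dst;
}

// One 8x64 vector becomes two 16x32 vectors: lower half and upper half.
inline void usage_example()
{
    vec_8x64 src{};
    for (std::size_t i = 0; i < src.size(); ++i)
        src[i] = static_cast<std::int8_t>(static_cast<int>(i) - 32); // values -32 ... 31

    vec_16x32 const lo = upcast_signed(extract_part<2, 0>(src)); // operands 0..31
    vec_16x32 const hi = upcast_signed(extract_part<2, 1>(src)); // operands 32..63
    assert(lo[0] == -32 && lo[31] == -1);
    assert(hi[0] == 0 && hi[31] == 31);
}
```

Going from 8-bit to 32-bit operands would use four quarters in the same way, and 8-bit to 64-bit operands eight eighths.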

The extracted bytes are placed in the lower bits of the target simd vector. There is no dedicated intrinsic to extract only 64 bits, so I had to emulate it with the intrinsic that extracts 128 bits plus an additional shuffle operation for the odd indices. This shuffle operation swaps the higher 64 bits with the lower 64 bits in each of the four 128-bit lanes of the source simd vector. When upcasting this vector, only the first 64 bits are considered. (Note that this is only needed when going from 8-bit operands to 64-bit operands.)
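The emulation of the 64-bit extract can be sketched with a scalar model that views the 512-bit vector as eight 64-bit words. All names below (`vec_64x8`, `swap_halves_per_lane`, `extract_128`, `extract_eighth_model`) are hypothetical and only mirror the steps described above; the actual code uses AVX512 intrinsics instead of loops.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a __m512i viewed as eight 64-bit words (word 0 = lowest bits).
using vec_64x8 = std::array<std::uint64_t, 8>;

// Models the shuffle: swaps the high and low 64-bit word within each 128-bit lane.
inline vec_64x8 swap_halves_per_lane(vec_64x8 const & src)
{
    vec_64x8 dst{};
    for (std::size_t lane = 0; lane < 4; ++lane)
    {
        dst[2 * lane] = src[2 * lane + 1];
        dst[2 * lane + 1] = src[2 * lane];
    }
    return dst;
}

// Models the 128-bit extract: lane `lane` ends up in the lowest 128 bits.
template <std::size_t lane>
vec_64x8 extract_128(vec_64x8 const & src)
{
    static_assert(lane < 4, "a 512-bit vector has four 128-bit lanes");
    vec_64x8 dst{};
    dst[0] = src[2 * lane];
    dst[1] = src[2 * lane + 1];
    return dst;
}

// Eighth `index` in word 0: odd indices need the swap first, even indices do not.
template <std::size_t index>
vec_64x8 extract_eighth_model(vec_64x8 const & src)
{
    static_assert(index < 8, "a 512-bit vector has eight 64-bit parts");
    return extract_128<index / 2>((index % 2 == 1) ? swap_halves_per_lane(src) : src);
}

inline void usage_example()
{
    vec_64x8 const src{{10, 11, 12, 13, 14, 15, 16, 17}};
    vec_64x8 const odd = extract_eighth_model<3>(src);
    vec_64x8 const even = extract_eighth_model<4>(src);
    assert(odd[0] == 13);  // the requested eighth sits in the lowest 64 bits
    assert(odd[1] == 12);  // leftover half of the lane, ignored by the upcast
    assert(even[0] == 14);
}
```

Word 1 of the result still holds the other half of the lane, which is harmless as long as the consumer only reads the first 64 bits.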

vercel · 2022-01-07T14:47:52Z

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/seqan/seqan3/35cna2ZNEWZn9HFawYVQjC8ckpJn
✅ Preview: https://seqan3-git-fork-rrahn-feature-avx512extract-seqan.vercel.app

codecov · 2022-01-07T15:01:22Z

Codecov Report

Merging #2926 (c7b45d8) into master (0f5fb9d) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #2926   +/-   ##
=======================================
  Coverage   98.28%   98.28%           
=======================================
  Files         267      267           
  Lines       11462    11462           
=======================================
  Hits        11265    11265           
  Misses        197      197

Impacted Files	Coverage Δ
include/seqan3/utility/simd/algorithm.hpp	`100.00% <ø> (ø)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0f5fb9d...c7b45d8.

include/seqan3/utility/simd/algorithm.hpp

smehringer

What about tests for the new functions?

smehringer · 2022-01-24T07:24:30Z

include/seqan3/utility/simd/detail/simd_algorithm_avx512.hpp

+#if defined(__AVX512DQ__)
 template <uint8_t index, simd::simd_concept simd_t>
-constexpr simd_t extract_quarter_avx512(simd_t const & src);
+constexpr simd_t extract_quarter_avx512(simd_t const & src)


Why are extract_quarter/eighth_avx512 wrapped in #if defined(__AVX512DQ__) but extract_half_avx512 isn't?

rrahn · 2022-01-24

The tests existed already (for SSE and AVX2). I ran the tests on icebear with AVX512 support to check the results.

The additional check is there because extract quarter requires an intrinsic that is only available in the AVX-512 DQ subset. Not all AVX512 platforms provide this additional intrinsics subset, even though they support other AVX512 intrinsics. Extract half doesn't need the check because it only uses AVX512-F, the foundation intrinsics set available on every platform that supports AVX512.
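A minimal sketch of such a guard is shown below. It assumes that the DQ-only intrinsic in question is `_mm512_extracti64x2_epi64`, which this conversation does not confirm. `bytes64`, `bytes16` and `quarter_of` are hypothetical names, and the portable branch is only there so that the sketch compiles without AVX512 flags.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__AVX512F__)
#    include <immintrin.h>
#endif

using bytes64 = std::array<std::uint8_t, 64>; // hypothetical view of a 512-bit vector
using bytes16 = std::array<std::uint8_t, 16>; // hypothetical view of a 128-bit quarter

// Returns the 16 bytes of 128-bit quarter `index` of `src`.
template <int index>
bytes16 quarter_of(bytes64 const & src)
{
    static_assert(index >= 0 && index < 4, "index must be in [0, 4)");
    bytes16 out{};
#if defined(__AVX512F__) && defined(__AVX512DQ__)
    // DQ path (assumption): this extract intrinsic belongs to the AVX512DQ subset.
    __m512i const v = _mm512_loadu_si512(src.data());
    __m128i const q = _mm512_extracti64x2_epi64(v, index);
    _mm_storeu_si128(reinterpret_cast<__m128i *>(out.data()), q);
#else
    // Portable fallback, used when the DQ subset is not enabled at compile time.
    std::memcpy(out.data(), src.data() + 16 * index, out.size());
#endif
    return out;
}

inline void usage_example()
{
    bytes64 src{};
    for (std::size_t i = 0; i < src.size(); ++i)
        src[i] = static_cast<std::uint8_t>(i);

    bytes16 const q2 = quarter_of<2>(src); // bytes 32..47 in both code paths
    assert(q2[0] == 32 && q2[15] == 47);
}
```

Both branches return the same bytes, so callers do not need to know which instruction subset was enabled at compile time.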

[FEATURE] Adds missing extract implementations for AVX512.

c7b45d8

rrahn requested review from a team and remyschwab and removed request for a team January 7, 2022 14:47

rrahn commented Jan 20, 2022

remyschwab reviewed Jan 20, 2022

remyschwab approved these changes Jan 20, 2022

rrahn requested review from a team and smehringer and removed request for a team January 20, 2022 14:39

smehringer reviewed Jan 24, 2022

smehringer approved these changes Jan 25, 2022

smehringer merged commit e1728f5 into seqan:master Jan 25, 2022
