Michael R. Crusoe edited this page Sep 4, 2020 · 2 revisions

For the most part, SIMDe tries to stick to the official APIs. However, sometimes functions that would be useful to us are missing, so we write our own.

Non-standard extensions have an "x" prefix before the function name; e.g., simde_x_mm_set_pu8. Below is a list of all non-standard extensions implemented by SIMDe, as well as a description of what they do and why they exist.

MMX

simde_x_mm_set_pu8

simde__m64
simde_x_mm_set_pu8(uint8_t e7, uint8_t e6, uint8_t e5, uint8_t e4,
                   uint8_t e3, uint8_t e2, uint8_t e1, uint8_t e0);

Acts like _mm_set_pi8, but takes unsigned 8-bit integers instead of signed integers.

This function makes it easy to load 8-bit unsigned integers, especially values greater than 2^7 - 1, while avoiding warnings such as clang's -Wconstant-conversion.
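The difference can be sketched in plain C without any SIMD headers. Everything below is a hypothetical model written for illustration, not SIMDe's implementation: model_m64, model_set_pi8, model_set_pu8 and model_setters_agree are made-up names, and the only behavior assumed from the real API is that e0 lands in the lowest lane, as with _mm_set_pi8.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 64-bit vector: eight byte lanes, lane[0] lowest. */
typedef struct { uint8_t lane[8]; } model_m64;

/* Models a signed setter in the style of _mm_set_pi8: e0 lands in lane 0. */
static model_m64
model_set_pi8(int8_t e7, int8_t e6, int8_t e5, int8_t e4,
              int8_t e3, int8_t e2, int8_t e1, int8_t e0) {
  const int8_t src[8] = { e0, e1, e2, e3, e4, e5, e6, e7 };
  model_m64 r;
  memcpy(r.lane, src, sizeof(src)); /* keep the raw two's-complement bits */
  return r;
}

/* Models an unsigned setter in the style of simde_x_mm_set_pu8. */
static model_m64
model_set_pu8(uint8_t e7, uint8_t e6, uint8_t e5, uint8_t e4,
              uint8_t e3, uint8_t e2, uint8_t e1, uint8_t e0) {
  model_m64 r = { { e0, e1, e2, e3, e4, e5, e6, e7 } };
  return r;
}

/* Returns 1 if both setters produce identical bits for the same byte pattern. */
static int
model_setters_agree(void) {
  /* Signed form: 255, 128 and 200 have to be spelled -1, -128 and -56. */
  model_m64 s = model_set_pi8(-1, -128, 0, 1, 2, 3, 4, -56);
  /* Unsigned form: the same bytes written directly. */
  model_m64 u = model_set_pu8(255, 128, 0, 1, 2, 3, 4, 200);
  return memcmp(s.lane, u.lane, sizeof(s.lane)) == 0;
}
```

Both calls yield the same bits. The unsigned form simply lets the caller write 255 or 200 as-is, rather than converting them to negative values by hand or passing out-of-range constants to a signed parameter, which is what triggers the warning.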

simde_x_mm_set_pu16

simde__m64
simde_x_mm_set_pu16(uint16_t e3, uint16_t e2, uint16_t e1, uint16_t e0);

Same as simde_x_mm_set_pu8, but for 16-bit integers instead of 8-bit; it acts like _mm_set_pi16 with unsigned arguments.

SSE

simde_x_mm_not_ps

simde_x_mm_select_ps

simde_x_mm_abs_ps

simde_x_mm_copysign_ps

simde_x_mm_xorsign_ps

simde_x_mm_negate_ps

simde_x_mm_setone_ps

SSE2

simde_x_mm_abs_pd

simde_x_mm_not_pd

simde_x_mm_select_pd

simde_x_mm_copysign_pd

simde_x_mm_xorsign_pd

simde_x_mm_loadu_epi8

simde_x_mm_loadu_epi16

simde_x_mm_loadu_epi32

simde_x_mm_loadu_epi64

simde_x_mm_mul_epi64

simde_x_mm_mod_epi64

simde_x_mm_set_epu8

simde_x_mm_set_epu16

simde_x_mm_set_epu32

simde_x_mm_set_epu64x

simde_x_mm_set1_epu8

simde_x_mm_set1_epu16

simde_x_mm_set1_epu32

simde_x_mm_set1_epu64

simde_x_mm_setone_pd

simde_x_mm_setone_si128

simde_x_mm_sub_epu32

simde_x_mm_negate_pd

simde_x_mm_not_si128

simde_x_mm_deinterleaveeven_epi16

simde_x_mm_deinterleaveodd_epi16

simde_x_mm_deinterleaveeven_epi32

simde_x_mm_deinterleaveodd_epi32

simde_x_mm_deinterleaveeven_ps

simde_x_mm_deinterleaveodd_ps

simde_x_mm_deinterleaveeven_pd

simde_x_mm_deinterleaveodd_pd

SSE4.1

simde_x_mm_blendv_epi16

simde_x_mm_blendv_epi32

simde_x_mm_blendv_epi64

simde_x_kadd_f32

simde_x_kadd_f64

simde_x_mm_mullo_epu32

AVX

simde_x_mm256_not_ps

simde_x_mm256_select_ps

simde_x_mm256_not_pd

simde_x_mm256_select_pd

simde_x_mm256_setone_si256

simde_x_mm256_setone_ps

simde_x_mm256_setone_pd

simde_x_mm256_set_epu8

simde_x_mm256_set_epu16

simde_x_mm256_set_epu32

simde_x_mm256_set_epu64x

simde_x_mm256_deinterleaveeven_epi16

simde_x_mm256_deinterleaveodd_epi16

simde_x_mm256_deinterleaveeven_epi32

simde_x_mm256_deinterleaveodd_epi32

simde_x_mm256_deinterleaveeven_ps

simde_x_mm256_deinterleaveodd_ps

simde_x_mm256_deinterleaveeven_pd

simde_x_mm256_deinterleaveodd_pd

simde_x_mm256_abs_ps

simde_x_mm256_abs_pd

simde_x_mm256_copysign_ps

simde_x_mm256_copysign_pd

simde_x_mm256_loadu_epi8

simde_x_mm256_loadu_epi16

simde_x_mm256_loadu_epi32

simde_x_mm256_loadu_epi64

simde_x_mm256_xorsign_ps

simde_x_mm256_xorsign_pd

simde_x_mm256_negate_ps

simde_x_mm256_negate_pd

AVX2

simde_x_mm256_mullo_epu32

simde_x_mm256_sub_epu32

simde_x_mm256_test_all_ones

AVX512 copysign

simde_x_mm512_copysign_ps

simde_x_mm512_copysign_pd

AVX512 lzcnt

simde_x_clz32

simde_x_clz64

AVX512 negate

simde_x_mm512_negate_ps

simde_x_mm512_negate_pd

AVX512 set

simde_x_mm512_set_epu8

simde_x_mm512_set_epu16

simde_x_mm512_set_epu32

simde_x_mm512_set_epu64

simde_x_mm512_set_m128i

simde_x_mm512_set_m256i

AVX512 set1

simde_x_mm512_set1_epu8

simde_x_mm512_set1_epu16

simde_x_mm512_set1_epu32

simde_x_mm512_set1_epu64

AVX512 setone

simde_x_mm512_setone_si512

simde_x_mm512_setone_epi32

simde_x_mm512_setone_ps

simde_x_mm512_setone_pd

AVX512 xorsign

simde_x_mm512_xorsign_ps

simde_x_mm512_xorsign_pd

GFNI

simde_x_mm_gf2p8matrix_multiply_epi64_epi8

simde_x_mm256_gf2p8matrix_multiply_epi64_epi8

simde_x_mm512_gf2p8matrix_multiply_epi64_epi8

simde_x_mm_gf2p8inverse_epi8

simde_x_mm256_gf2p8inverse_epi8

simde_x_mm512_gf2p8inverse_epi8

simde_x_mm_gf2p8matrix_multiply_inverse_epi64_epi8

simde_x_mm256_gf2p8matrix_multiply_inverse_epi64_epi8

simde_x_mm512_gf2p8matrix_multiply_inverse_epi64_epi8

SVML

simde_x_mm_deg2rad_ps

simde_x_mm_deg2rad_pd

simde_x_mm256_deg2rad_ps

simde_x_mm256_deg2rad_pd

simde_x_mm512_deg2rad_ps

simde_x_mm512_deg2rad_pd

NEON

simde_x_vmax_s64

simde_x_vmax_u64

simde_x_vmaxq_s64

simde_x_vmaxq_u64

simde_x_vmin_s64

simde_x_vmin_u64

simde_x_vminq_s64

simde_x_vminq_u64
