Extensions - simd-everywhere/simde GitHub Wiki

For the most part, SIMDe tries to stick to the official APIs. However, sometimes a function that would be useful to us is missing, so we write our own.

Non-standard extensions have an "x" between the simde_ prefix and the function name; e.g., simde_x_mm_set_pu8. Below is a list of all non-standard extensions implemented by SIMDe, with a description of what they do and why they exist where one has been written.

MMX

simde_x_mm_set_pu8

simde__m64
simde_x_mm_set_pu8(uint8_t e7, uint8_t e6, uint8_t e5, uint8_t e4,
                   uint8_t e3, uint8_t e2, uint8_t e1, uint8_t e0);

Acts like _mm_set_pi8, but takes unsigned 8-bit integers instead of signed integers.

This function makes it easy to load 8-bit unsigned integers, especially values greater than 2^7 - 1, while avoiding warnings such as clang's -Wconstant-conversion.
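The difference between the signed and unsigned setters can be sketched in plain C. This is only an illustrative model, not SIMDe's implementation: the type model_m64 and the functions model_set_pi8, model_set_pu8, and model_demo are hypothetical names invented for this example. It assumes the usual _mm_set_pi8 argument order, where the last argument (e0) lands in the lowest lane.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for simde__m64: eight byte lanes, lane 0 is lowest. */
typedef struct { uint8_t u8[8]; } model_m64;

/* Model of a signed setter in _mm_set_pi8 argument order (e0 -> lane 0).
 * Callers must express bytes above 127 as negative values or casts. */
static model_m64 model_set_pi8(int8_t e7, int8_t e6, int8_t e5, int8_t e4,
                               int8_t e3, int8_t e2, int8_t e1, int8_t e0) {
  model_m64 r = { { (uint8_t) e0, (uint8_t) e1, (uint8_t) e2, (uint8_t) e3,
                    (uint8_t) e4, (uint8_t) e5, (uint8_t) e6, (uint8_t) e7 } };
  return r;
}

/* Model of an unsigned setter with the same lane order: values such as
 * 200 or 255 can be written directly, with no narrowing to a signed type. */
static model_m64 model_set_pu8(uint8_t e7, uint8_t e6, uint8_t e5, uint8_t e4,
                               uint8_t e3, uint8_t e2, uint8_t e1, uint8_t e0) {
  model_m64 r = { { e0, e1, e2, e3, e4, e5, e6, e7 } };
  return r;
}

/* Returns 1 when both setters produce identical bytes for matching inputs. */
static int model_demo(void) {
  model_m64 u = model_set_pu8(255, 200, 128, 127, 3, 2, 1, 0);
  /* The same bit patterns, spelled the way a signed API requires. */
  model_m64 s = model_set_pi8(-1, -56, -128, 127, 3, 2, 1, 0);

  assert(u.u8[0] == 0);   /* last argument is the lowest lane   */
  assert(u.u8[7] == 255); /* first argument is the highest lane */
  return memcmp(u.u8, s.u8, sizeof u.u8) == 0;
}
```

With SIMDe itself, the corresponding call would be simde_x_mm_set_pu8(255, 200, 128, 127, 3, 2, 1, 0), using the prototype shown above. Passing those same literals to a signed setter is the pattern that can trigger the conversion warnings this extension is meant to avoid.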

simde_x_mm_set_pu16

simde__m64
simde_x_mm_set_pu16(uint16_t e3, uint16_t e2, uint16_t e1, uint16_t e0);

Same as simde_x_mm_set_pu8, but for 16-bit unsigned integers instead of 8-bit.

SSE

simde_x_mm_not_ps

simde_x_mm_select_ps

simde_x_mm_abs_ps

simde_x_mm_copysign_ps

simde_x_mm_xorsign_ps

simde_x_mm_negate_ps

simde_x_mm_setone_ps

SSE2

simde_x_mm_abs_pd

simde_x_mm_not_pd

simde_x_mm_select_pd

simde_x_mm_copysign_pd

simde_x_mm_xorsign_pd

simde_x_mm_loadu_epi8

simde_x_mm_loadu_epi16

simde_x_mm_loadu_epi32

simde_x_mm_loadu_epi64

simde_x_mm_mul_epi64

simde_x_mm_mod_epi64

simde_x_mm_set_epu8

simde_x_mm_set_epu16

simde_x_mm_set_epu32

simde_x_mm_set_epu64x

simde_x_mm_set1_epu8

simde_x_mm_set1_epu16

simde_x_mm_set1_epu32

simde_x_mm_set1_epu64

simde_x_mm_setone_pd

simde_x_mm_setone_si128

simde_x_mm_sub_epu32

simde_x_mm_negate_pd

simde_x_mm_not_si128

simde_x_mm_deinterleaveeven_epi16

simde_x_mm_deinterleaveodd_epi16

simde_x_mm_deinterleaveeven_epi32

simde_x_mm_deinterleaveodd_epi32

simde_x_mm_deinterleaveeven_ps

simde_x_mm_deinterleaveodd_ps

simde_x_mm_deinterleaveeven_pd

simde_x_mm_deinterleaveodd_pd

SSE4.1

simde_x_mm_blendv_epi16

simde_x_mm_blendv_epi32

simde_x_mm_blendv_epi64

simde_x_kadd_f32

simde_x_kadd_f64

simde_x_mm_mullo_epu32

AVX

simde_x_mm256_not_ps

simde_x_mm256_select_ps

simde_x_mm256_not_pd

simde_x_mm256_select_pd

simde_x_mm256_setone_si256

simde_x_mm256_setone_ps

simde_x_mm256_setone_pd

simde_x_mm256_set_epu8

simde_x_mm256_set_epu16

simde_x_mm256_set_epu32

simde_x_mm256_set_epu64x

simde_x_mm256_deinterleaveeven_epi16

simde_x_mm256_deinterleaveodd_epi16

simde_x_mm256_deinterleaveeven_epi32

simde_x_mm256_deinterleaveodd_epi32

simde_x_mm256_deinterleaveeven_ps

simde_x_mm256_deinterleaveodd_ps

simde_x_mm256_deinterleaveeven_pd

simde_x_mm256_deinterleaveodd_pd

simde_x_mm256_abs_ps

simde_x_mm256_abs_pd

simde_x_mm256_copysign_ps

simde_x_mm256_copysign_pd

simde_x_mm256_loadu_epi8

simde_x_mm256_loadu_epi16

simde_x_mm256_loadu_epi32

simde_x_mm256_loadu_epi64

simde_x_mm256_xorsign_ps

simde_x_mm256_xorsign_pd

simde_x_mm256_negate_ps

simde_x_mm256_negate_pd

AVX2

simde_x_mm256_mullo_epu32

simde_x_mm256_sub_epu32

simde_x_mm256_test_all_ones

AVX512 copysign

simde_x_mm512_copysign_ps

simde_x_mm512_copysign_pd

AVX512 lzcnt

simde_x_clz32

simde_x_clz64

AVX512 negate

simde_x_mm512_negate_ps

simde_x_mm512_negate_pd

AVX512 set

simde_x_mm512_set_epu8

simde_x_mm512_set_epu16

simde_x_mm512_set_epu32

simde_x_mm512_set_epu64

simde_x_mm512_set_m128i

simde_x_mm512_set_m256i

AVX512 set1

simde_x_mm512_set1_epu8

simde_x_mm512_set1_epu16

simde_x_mm512_set1_epu32

simde_x_mm512_set1_epu64

AVX512 setone

simde_x_mm512_setone_si512

simde_x_mm512_setone_epi32

simde_x_mm512_setone_ps

simde_x_mm512_setone_pd

AVX512 xorsign

simde_x_mm512_xorsign_ps

simde_x_mm512_xorsign_pd

GFNI

simde_x_mm_gf2p8matrix_multiply_epi64_epi8

simde_x_mm256_gf2p8matrix_multiply_epi64_epi8

simde_x_mm512_gf2p8matrix_multiply_epi64_epi8

simde_x_mm_gf2p8inverse_epi8

simde_x_mm256_gf2p8inverse_epi8

simde_x_mm512_gf2p8inverse_epi8

simde_x_mm_gf2p8matrix_multiply_inverse_epi64_epi8

simde_x_mm256_gf2p8matrix_multiply_inverse_epi64_epi8

simde_x_mm512_gf2p8matrix_multiply_inverse_epi64_epi8

SVML

simde_x_mm_deg2rad_ps

simde_x_mm_deg2rad_pd

simde_x_mm256_deg2rad_ps

simde_x_mm256_deg2rad_pd

simde_x_mm512_deg2rad_ps

simde_x_mm512_deg2rad_pd

NEON

simde_x_vmax_s64

simde_x_vmax_u64

simde_x_vmaxq_s64

simde_x_vmaxq_u64

simde_x_vmin_s64

simde_x_vmin_u64

simde_x_vminq_s64

simde_x_vminq_u64