Extensions - simd-everywhere/simde GitHub Wiki
For the most part, SIMDe tries to stick to the official APIs. However, sometimes functions that would be useful to us are missing, so we write our own.
Non-standard extensions have an "x" prefix before the function name; e.g., simde_x_mm_set_pu8. Below is a list of all non-standard extensions implemented by SIMDe, as well as a description of what they do and why they exist.
MMX
simde_x_mm_set_pu8
simde__m64
simde_x_mm_set_pu8(uint8_t e7, uint8_t e6, uint8_t e5, uint8_t e4,
uint8_t e3, uint8_t e2, uint8_t e1, uint8_t e0);
simde__m64
simde_x_mm_set_pu16(uint16_t e3, uint16_t e2, uint16_t e1, uint16_t e0);
Acts like _mm_set_pi8 and _mm_set_pi16, but with unsigned 8-bit or 16-bit integers instead of signed integers.
This function makes it easy to load 8-bit unsigned integers, especially values greater than 2^7 - 1, while avoiding warnings such as clang's -Wconstant-conversion.
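To make the argument order concrete, here is a minimal scalar sketch of the lane layout, assuming the same convention as _mm_set_pi8 (the last argument, e0, lands in the least-significant byte). The names model_set_pu8 and model_get_lane_u8 are hypothetical helpers invented for this illustration; they are not part of SIMDe and do not reflect how SIMDe implements the function.

```c
#include <stdint.h>

/* Hypothetical scalar model of a 64-bit vector with eight 8-bit lanes.
 * Lane 0 is the least-significant byte, mirroring the argument order of
 * _mm_set_pi8 (e7 first, e0 last). This is NOT SIMDe's implementation. */
static uint64_t model_set_pu8(uint8_t e7, uint8_t e6, uint8_t e5, uint8_t e4,
                              uint8_t e3, uint8_t e2, uint8_t e1, uint8_t e0) {
  const uint8_t lanes[8] = { e0, e1, e2, e3, e4, e5, e6, e7 };
  uint64_t r = 0;
  for (int i = 0; i < 8; i++) {
    /* Parameters are already uint8_t, so values such as 200 or 255 are
     * passed without any signed narrowing. */
    r |= (uint64_t) lanes[i] << (8 * i);
  }
  return r;
}

/* Hypothetical helper: read back one 8-bit lane (0 = lowest byte). */
static uint8_t model_get_lane_u8(uint64_t v, int lane) {
  return (uint8_t) (v >> (8 * lane));
}
```

With this model, model_set_pu8(255, 200, 128, 127, 3, 2, 1, 0) places 255 in lane 7 and 0 in lane 0. Passing the same constants to a function taking signed 8-bit parameters is what tends to trigger the conversion warning mentioned above, because 128, 200 and 255 do not fit in a signed 8-bit integer.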
simde_x_mm_set_pu16
simde__m64
simde_x_mm_set_pu16(uint16_t e3, uint16_t e2, uint16_t e1, uint16_t e0);
Same as simde_x_mm_set_pu8, but for 16-bit instead of 8-bit.
SSE
simde_x_mm_not_ps
simde_x_mm_select_ps
simde_x_mm_abs_ps
simde_x_mm_copysign_ps
simde_x_mm_xorsign_ps
simde_x_mm_negate_ps
simde_x_mm_setone_ps
SSE2
simde_x_mm_abs_pd
simde_x_mm_not_pd
simde_x_mm_select_pd
simde_x_mm_copysign_pd
simde_x_mm_xorsign_pd
simde_x_mm_loadu_epi8
simde_x_mm_loadu_epi16
simde_x_mm_loadu_epi32
simde_x_mm_loadu_epi64
simde_x_mm_mul_epi64
simde_x_mm_mod_epi64
simde_x_mm_set_epu8
simde_x_mm_set_epu16
simde_x_mm_set_epu32
simde_x_mm_set_epu64x
simde_x_mm_set1_epu8
simde_x_mm_set1_epu16
simde_x_mm_set1_epu32
simde_x_mm_set1_epu64
simde_x_mm_setone_pd
simde_x_mm_setone_si128
simde_x_mm_sub_epu32
simde_x_mm_negate_pd
simde_x_mm_not_si128
simde_x_mm_deinterleaveeven_epi16
simde_x_mm_deinterleaveodd_epi16
simde_x_mm_deinterleaveeven_epi32
simde_x_mm_deinterleaveodd_epi32
simde_x_mm_deinterleaveeven_ps
simde_x_mm_deinterleaveodd_ps
simde_x_mm_deinterleaveeven_pd
simde_x_mm_deinterleaveodd_pd
SSE4.1
simde_x_mm_blendv_epi16
simde_x_mm_blendv_epi32
simde_x_mm_blendv_epi64
simde_x_kadd_f32
simde_x_kadd_f64
simde_x_mm_mullo_epu32
AVX
simde_x_mm256_not_ps
simde_x_mm256_select_ps
simde_x_mm256_not_pd
simde_x_mm256_select_pd
simde_x_mm256_setone_si256
simde_x_mm256_setone_ps
simde_x_mm256_setone_pd
simde_x_mm256_set_epu8
simde_x_mm256_set_epu16
simde_x_mm256_set_epu32
simde_x_mm256_set_epu64x
simde_x_mm256_deinterleaveeven_epi16
simde_x_mm256_deinterleaveodd_epi16
simde_x_mm256_deinterleaveeven_epi32
simde_x_mm256_deinterleaveodd_epi32
simde_x_mm256_deinterleaveeven_ps
simde_x_mm256_deinterleaveodd_ps
simde_x_mm256_deinterleaveeven_pd
simde_x_mm256_deinterleaveodd_pd
simde_x_mm256_abs_ps
simde_x_mm256_abs_pd
simde_x_mm256_copysign_ps
simde_x_mm256_copysign_pd
simde_x_mm256_loadu_epi8
simde_x_mm256_loadu_epi16
simde_x_mm256_loadu_epi32
simde_x_mm256_loadu_epi64
simde_x_mm256_xorsign_ps
simde_x_mm256_xorsign_pd
simde_x_mm256_negate_ps
simde_x_mm256_negate_pd
AVX2
simde_x_mm256_mullo_epu32
simde_x_mm256_sub_epu32
simde_x_mm256_test_all_ones
AVX512 copysign
simde_x_mm512_copysign_ps
simde_x_mm512_copysign_pd
AVX512 lzcnt
simde_x_clz32
simde_x_clz64
AVX512 negate
simde_x_mm512_negate_ps
simde_x_mm512_negate_pd
AVX512 set
simde_x_mm512_set_epu8
simde_x_mm512_set_epu16
simde_x_mm512_set_epu32
simde_x_mm512_set_epu64
simde_x_mm512_set_m128i
simde_x_mm512_set_m256i
AVX512 set1
simde_x_mm512_set1_epu8
simde_x_mm512_set1_epu16
simde_x_mm512_set1_epu32
simde_x_mm512_set1_epu64
AVX512 setone
simde_x_mm512_setone_si512
simde_x_mm512_setone_epi32
simde_x_mm512_setone_ps
simde_x_mm512_setone_pd
AVX512 xorsign
simde_x_mm512_xorsign_ps
simde_x_mm512_xorsign_pd
GFNI
simde_x_mm_gf2p8matrix_multiply_epi64_epi8
simde_x_mm256_gf2p8matrix_multiply_epi64_epi8
simde_x_mm512_gf2p8matrix_multiply_epi64_epi8
simde_x_mm_gf2p8inverse_epi8
simde_x_mm256_gf2p8inverse_epi8
simde_x_mm512_gf2p8inverse_epi8
simde_x_mm_gf2p8matrix_multiply_inverse_epi64_epi8
simde_x_mm256_gf2p8matrix_multiply_inverse_epi64_epi8
simde_x_mm512_gf2p8matrix_multiply_inverse_epi64_epi8
SVML
simde_x_mm_deg2rad_ps
simde_x_mm_deg2rad_pd
simde_x_mm256_deg2rad_ps
simde_x_mm256_deg2rad_pd
simde_x_mm512_deg2rad_ps
simde_x_mm512_deg2rad_pd
NEON
simde_x_vmax_s64
simde_x_vmax_u64
simde_x_vmaxq_s64
simde_x_vmaxq_u64
simde_x_vmin_s64
simde_x_vmin_u64
simde_x_vminq_s64
simde_x_vminq_u64