Compiler Bugs - simd-everywhere/simde GitHub Wiki

SIMDe has proven to be rather good at finding compiler bugs.

It is generally possible for us to work around these issues, usually by using the preprocessor to choose a different code path. When we find bugs in the current versions of compilers we try to report them so they can be fixed; the GCC and Emscripten developers in particular have been very responsive and helpful.
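As a rough illustration of the preprocessor approach described above, the following sketch selects between two implementations based on the compiler version. All `EXAMPLE_*` names, the `example_sum4` function, and the "GCC older than 9.4" condition are hypothetical and exist only for this example; they are not SIMDe's actual macros or a real bug description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true when compiling with real GCC older than
 * major.minor. Clang and ICC also define __GNUC__, so they are excluded. */
#if defined(__GNUC__) && !defined(__clang__) && !defined(__INTEL_COMPILER)
#  define EXAMPLE_GCC_OLDER_THAN(major, minor) \
     ((__GNUC__ < (major)) || ((__GNUC__ == (major)) && (__GNUC_MINOR__ < (minor))))
#else
#  define EXAMPLE_GCC_OLDER_THAN(major, minor) 0
#endif

/* Hypothetical bug macro: assume, for illustration only, that GCC
 * releases before 9.4 miscompile the "fast" path below. */
#if EXAMPLE_GCC_OLDER_THAN(9, 4)
#  define EXAMPLE_BUG_FAST_PATH 1
#endif

/* Sums four 32-bit integers, choosing the code path at compile time. */
static int32_t example_sum4(const int32_t v[4]) {
  assert(v != NULL);
#if defined(EXAMPLE_BUG_FAST_PATH)
  /* Work-around path: a plain loop the affected compiler handles correctly. */
  int32_t r = 0;
  for (size_t i = 0; i < 4; i++) {
    r += v[i];
  }
  return r;
#else
  /* Preferred path on compilers without the (hypothetical) bug. */
  return (v[0] + v[1]) + (v[2] + v[3]);
#endif
}
```

Both paths return the same result, so callers never need to know which one was compiled in; only the bug macro has to change when a fixed compiler release appears.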

Bugs We Can't Work Around

Unfortunately, sometimes working around a bug simply isn't feasible. What follows is a list of bugs we can't currently work around.

Clang

  • Clang 3.8 hits an internal compiler error when trying to auto-vectorize some functions using AVX-512VL. This isn't usually a problem outside of our test suite, since when AVX-512VL is enabled SIMDe will typically use the AVX-512VL intrinsics instead of auto-vectorizing. To avoid it, do one of the following: upgrade to Clang 3.9, don't use -DSIMDE_NO_NATIVE (which is really only for testing), or don't auto-vectorize to AVX-512VL (don't pass -mavx512vl, or add -mno-avx512vl).

GCC

  • All GCC 7.x releases hit an internal compiler error (a floating point exception) when compiling many AVX-512 functions if optimization is enabled (-O1 or higher), debugging is enabled (-g), and AVX is not enabled. For details, see issue #278. This only affects applications which use the SIMDe versions of certain AVX-512 instructions.
  • GCC 4.7 through 5 don't work reliably with AVX-512 support, as many functions will generate internal compiler errors. You can still use these compilers with SIMDe; just don't pass -mavx512f (or otherwise enable it, such as with -march=native on an AVX-512 machine), or pass -mno-avx512f.
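A project that wants to catch the second GCC case at build time could add a guard along these lines. This is only a sketch: the `EXAMPLE_*` names are hypothetical, the check looks only at the GCC major version (so it also matches 4.x releases older than 4.7), and the GCC 7.x case is not covered because there is no predefined macro that reports whether -g was passed. `__GNUC__` and `__AVX512F__` are the compiler's own predefined macros.

```c
#include <assert.h>

/* Hypothetical predicate: 1 when the given GCC major version is 4 or 5
 * and AVX-512F is enabled, 0 otherwise. */
#define EXAMPLE_GCC_AVX512F_UNRELIABLE(gcc_major, avx512f_enabled) \
  (((gcc_major) >= 4) && ((gcc_major) <= 5) && ((avx512f_enabled) != 0))

/* Real GCC only; Clang and ICC also define __GNUC__. */
#if defined(__GNUC__) && !defined(__clang__) && !defined(__INTEL_COMPILER)
#  define EXAMPLE_GCC_MAJOR __GNUC__
#else
#  define EXAMPLE_GCC_MAJOR 0
#endif

/* __AVX512F__ is defined by the compiler when AVX-512F is enabled,
 * for example through -mavx512f or -march=native on an AVX-512 machine. */
#if defined(__AVX512F__)
#  define EXAMPLE_AVX512F_ENABLED 1
#else
#  define EXAMPLE_AVX512F_ENABLED 0
#endif

/* Stop the build early with a readable message instead of an ICE. */
#if EXAMPLE_GCC_AVX512F_UNRELIABLE(EXAMPLE_GCC_MAJOR, EXAMPLE_AVX512F_ENABLED)
#  error "This GCC release is unreliable with AVX-512F; rebuild with -mno-avx512f."
#endif

/* Sanity check of the predicate with explicit inputs. */
static void example_self_check(void) {
  assert(EXAMPLE_GCC_AVX512F_UNRELIABLE(5, 1) == 1); /* GCC 5 with AVX-512F */
  assert(EXAMPLE_GCC_AVX512F_UNRELIABLE(5, 0) == 0); /* GCC 5 without it */
  assert(EXAMPLE_GCC_AVX512F_UNRELIABLE(9, 1) == 0); /* newer GCC */
  assert(EXAMPLE_GCC_AVX512F_UNRELIABLE(0, 1) == 0); /* not GCC */
}
```

Failing with `#error` is a design choice for this sketch; a project could just as well fall back to a non-AVX-512 code path.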

PGI

  • PGI doesn't work in C++ mode. PGI is tracking the issue internally as TPR #28482. For details, see their community forum.

Bugs Discovered By SIMDe

As mentioned in the introduction, we seem to find a fair number of compiler bugs. What follows is an incomplete list of some of the issues we've found in compilers which have been fixed, and/or for which there is a work-around in SIMDe.

Compiler Bug Present In Fixed In Type Description
GCC 95227 Spurious warning vec_extract doesn't mark input as used in C++ mode
GCC 95239 10 11 12 10.5 12.3 13 Spurious warning Unable to ignore -Wattribute-warning in macro
GCC 95421 API missing / incorrect [AArch64] Missing NEON functions documented on ARM's web site
GCC 95471 API missing / incorrect [ARMv8] vrndvq_f32 defined even when not supported by ARMv8
GCC 95782 ICE [ppc64le] ICE in _cpp_pop_context
GCC 97016 API missing / incorrect _MM_CMPINT_ENUM type is missing
GCC 95483 11 API missing / incorrect [i386] Missing SIMD functions.
GCC 95399 API missing / incorrect [ARM] 32/64-bit vcvtnq_* functions are missing
GCC 96313 8.5 API missing / incorrect [AArch64] vqmovun* return types should be unsigned.
GCC 94482 8 / 9 8.5/9.4/10 Incorrect optimization [8/9 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result.
GCC 93557 9.3/10.1 Spurious warning __builtin_convertvector doesn't mark input as used.
GCC 94385 9.4/10.1 ICE [10 Regression] Internal compiler error for __builtin_convertvector + statement expr. Latent bug in GCC 9.
GCC 94488 8.5/9.4/10.1 ICE [AArch64] ICE on right shift of V2DImode by DImode shift.
GCC 96174 9.4/10.2 API missing / incorrect AVX-512 functions missing when compiled without optimization.
GCC 98428 11 11.1 ICE ICE with omp simd loop + optimization
GCC 98521 10.2.1 10.3 API missing / incorrect [x86] _mm256_cmov_si256 XOP function is missing.
GCC 97248 ICE [mips] unrecognizable insn when left shifting uint64 vector by scalar with MSA
GCC 99754 11 12 11.4 12.1 API missing / incorrect [sse2] new _mm_loadu_si16 and _mm_loadu_si32 implemented incorrectly
GCC 100760 10.2 12.1 ICE [mips + msa] ICE: maximum number of generated reload insns per insn achieved
GCC 100761 10.2 12.1 ICE [mips+msa] ICE when using __builtin_convertvector to convert from u8x8 to u8x16
GCC 100762 10.2 12.1 ICE [mips+msa] ICE when comparing 64 bit vectors
GCC 100927 10.2 Incorrect optimization [sse2] floating point to integer conversion functions incorrect results w/ NaN constants + optimization
GCC 105339 9, 10, 11, 12 9.5, 10.4, 11.4, 12.1 API missing / incorrect [x86] missing AVX-512F scalef functions when optimization is disabled
GCC 101714 10.2.1 12.1 Implementation incorrect [POWER] vec_min / vec_max handles NaN incorrectly when evaluated at compile time
GCC 101985 9, 10, 11 12.0 Implementation incorrect [POWER] vec_cpsgn parameter order was reversed
GCC 114075 14.0 14.1 Miscompilation [s390x] Do not emulate vectors containing floats
GCC 113065 10.2 ICE [ARM32] shift of 128-bit NEON types
GCC 118476 13.1.0, 13.3.0, 14.1.0, 14.2.0, 9.1.0 12.5.0, 13.4.0, 14.3.0, 15.0, 8.5.0 ICE [i586] invalid 'PHI' argument
Clang BZ45541 GH44886 15.0 Incorrect optimization [AArch64] Incorrect result for vector conversion with -O2
Clang 45931 11 API missing / incorrect Many AVX-512 functions take an int instead of unsigned int.
Clang BZ45959 GH45304 Spurious warning SIMD & reduction on signed types emits sign-conversion diagnostic
Clang BZ46770 GH46114 12 API missing / incorrect [ppc] vec_sel variants missing
Clang 46844 GH46188 12 (AArch64-only) Incorrect optimization [AArch64] incorrect results from vcvt* functions with negative inputs when optimization is enabled.
Clang 46840 12 API missing / incorrect [AArch64] vqmovun* return types should be unsigned.
Clang 32827 5 API missing / incorrect _mm_set_pd1 missing from emmintrin.h
Clang 44589 Spurious warning _mm_extract_pi16 and _mm_insert_pi16 warn with -Wvector-conversion. Fixed in clang 11.0
Clang BZ48257 GH47601 Spurious warning vget_lane_p64 triggers -Wvector-conversion diagnostic
Clang BZ48673 GH48017 API missing / incorrect [x86] _mm_frcz_ss and _mm_frcz_sd should take two parameters
Clang 48718 12 API missing / incorrect NEON scalar comparison functions should return unsigned values
Clang 49716 7+ 13 ICE clang segfault at -O2 in C mode
Clang BZ50893 GH50237 11.0 13.0 ICE Compiler error when converting from vector of 2x32 to 2x64-bit int on POWER7
Clang BZ50901 GH50245 11.0 13.0 ICE [power7] error in backend: Cannot select v16i8 = PPCISD::SCALAR_TO_VECTOR_PERMUTED
Clang BZ50905 GH50249 11.0 trunk Perf slow code for absolute value of int8 x 16 vector on POWER9 at -O3
Clang BZ50932 GH50276 14.0 API missing / incorrect [POWER] vec_bperm missing documented signatures
Clang 51992 14-dev 14-dev ICE Regression [VectorCombine] ScalarizationResult destructor assertion due to SafeWithFreeze and scalarizeLoadExtract
Clang 71362 Incorrect Implementation NEON intrinsic compilation error occurs when using -fno-lax-vector-conversions
Clang 71365 Incorrect Implementation Wrong return type in some ARM NEON intrinsics
Clang 71751 15 trunk API missing / incorrect Wrong return type of NEON intrinsic vqrshrunh_n_s16 in arm_neon.h
Clang 71763 trunk Incorrect Implementation Wrong result of NEON intrinsic vld2q_dup_p16 with -march=armv7-a
Emscripten 10425 1.39.7 1.39.11 ICE LLVM assertion failure when casting a SIMD type with -O3
Emscripten 11315 ICE Crash in WebAssembly Register Stackify with tot
Emscripten 11176 1.39.16-dev 1.39.17 ICE "Not a vector MVT!" clang error with -s SIMD=1 on tot
Emscripten 10563 1.39.17 ICE emcc crashes with SIMD without optimization
Emscripten 10651 1.39.9 1.39.10 API missing / incorrect emscripten errors in wasm_simd128.h
Emscripten 14629 2.0.27 ICE UNREACHABLE executed at …/binaryen/src/wasm-interpreter.h:503!
Emscripten 19179 3.1.34 3.1.43 API incorrect Compiler advertises __builtin_roundeven{f,} but it is not implemented
ICC ??? Incorrect optimization ICC generates incorrect code for signed absolute difference
ICC ??? False advertisement of feature __builtin_expect_with_probability unsupported but __has_builtin claims otherwise
ICC ??? False advertisement of feature fallthrough attribute unsupported, contrary to icc's claims
ICC ??? ICE Internal error: 04010002_1671
NVC 30104 21.3 ICE ICE from % operator on vector extensions
NVC 30107 21.3 ICE Using _mm_cvtpd_epi32 results in compiler error
NVC 30106 21.3 API missing / incorrect Undefined reference to '__builtin_ia32_palignr256' when calling _mm256_alignr_epi8

Note: some of the version number information may be incorrect or missing. Sorry, it's a recent addition to the table.
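The ICC rows above that describe falsely advertised features suggest why a feature test alone is sometimes not enough. A minimal sketch of guarding against that situation, assuming a compiler that reports `__builtin_expect_with_probability` through `__has_builtin` without implementing it, might look like the following. The `EXAMPLE_*` macros and `example_count_nonneg` are hypothetical names for this illustration, not SIMDe's own definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Wrap __has_builtin so compilers that lack it simply report "no". */
#if defined(__has_builtin)
#  define EXAMPLE_HAS_BUILTIN(x) __has_builtin(x)
#else
#  define EXAMPLE_HAS_BUILTIN(x) 0
#endif

/* Trust the feature test only when the compiler is not one known to
 * advertise the builtin falsely (here: classic ICC, which defines
 * __INTEL_COMPILER). */
#if EXAMPLE_HAS_BUILTIN(__builtin_expect_with_probability) && !defined(__INTEL_COMPILER)
#  define EXAMPLE_LIKELY_P(expr, p) __builtin_expect_with_probability(!!(expr), 1, (p))
#else
#  define EXAMPLE_LIKELY_P(expr, p) (!!(expr))
#endif

/* Counts the non-negative entries of v; the hint says most entries are. */
static size_t example_count_nonneg(const int *v, size_t n) {
  assert(v != NULL || n == 0);
  size_t count = 0;
  for (size_t i = 0; i < n; i++) {
    if (EXAMPLE_LIKELY_P(v[i] >= 0, 0.9)) {
      count++;
    }
  }
  return count;
}
```

The fallback drops the probability hint entirely, so behavior is identical on every compiler and only the optimizer's branch layout may differ.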