ARM NEON tips - yszheda/wiki GitHub Wiki
- https://community.arm.com/tags/neon
- An Introduction to ARM NEON
- https://stackoverflow.com/questions/2851421/is-there-a-good-reference-for-arm-neon-intrinsics
- https://stackoverflow.com/questions/28547697/coding-for-arm-neon-how-to-start
- 10 tips for the Android NDK
- A first look at ARM NEON programming: a detailed walkthrough of a simple BGR888-to-YUV444 example
- On Android NDK, `#include <cpu-features.h>` and test `(android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM) && (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON)`. Note that this check is for 32-bit ARM; 64-bit ARM uses different flags, but the idea is the same. See the NDK sources and docs for details.
- On Linux, if available, `#include <sys/auxv.h>` and `#include <asm/hwcap.h>`, then test `getauxval(AT_HWCAP) & HWCAP_NEON`.
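The two run-time checks above can be combined into one helper. This is a minimal sketch, not a definitive implementation: the name `has_neon` is hypothetical, the AArch64 branch assumes the `HWCAP_ASIMD` flag from the arm64 `<asm/hwcap.h>`, and on any other platform the function simply reports 0.

```c
#if defined(__ANDROID__) && defined(__arm__)
#include <cpu-features.h>   /* NDK cpufeatures library, 32-bit ARM */
#elif defined(__linux__) && (defined(__arm__) || defined(__aarch64__))
#include <sys/auxv.h>       /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>      /* HWCAP_NEON (arm) or HWCAP_ASIMD (arm64) */
#endif

/* Hypothetical helper: returns 1 if NEON / Advanced SIMD is reported
 * at run time, 0 otherwise (including on non-ARM platforms). */
static int has_neon(void)
{
#if defined(__ANDROID__) && defined(__arm__)
    /* Android NDK, 32-bit ARM: the check quoted in the list above. */
    return android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM &&
           (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON) != 0;
#elif defined(__linux__) && defined(__arm__)
    /* 32-bit ARM Linux: query the hardware capability bits. */
    return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#elif defined(__linux__) && defined(__aarch64__)
    /* 64-bit ARM uses a different flag name (assumption: HWCAP_ASIMD). */
    return (getauxval(AT_HWCAP) & HWCAP_ASIMD) != 0;
#else
    /* Not an ARM Linux/Android target: no NEON to report. */
    return 0;
#endif
}
```

A typical use is to call `has_neon()` once at startup and select between a NEON code path and a plain C fallback based on the result. On Android the first branch also requires linking against the NDK cpufeatures module.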
- https://stackoverflow.com/questions/2616274/how-do-i-reorder-vector-data-using-arm-neon-intrinsics
- https://community.arm.com/processors/b/blog/posts/coding-for-neon---part-5-rearranging-vectors
- How to shuffle bits and Check high bit value using Neon Intrinsics?
- https://stackoverflow.com/questions/37106500/neon-sse-and-interleaving-loads-vs-shuffles
- How to use the multiply and accumulate intrinsics in ARM Cortex-a8?
- https://stackoverflow.com/questions/5717011/neon-optimization-using-intrinsics
- ARM GCC Inline Assembler Cookbook
- gcc: 6.45.2 Extended Asm - Assembler Instructions with C Expression Operands
- GCC inline assembly
- GCC-Inline-Assembly-HOWTO
- cross compile (arm-none-eabi-as) arm assembly error “junk at end of line /” or undefined symbol
- https://stackoverflow.com/questions/31375991/how-to-address-errors-from-gcc-cross-compiler-for-arm7-target
- VIP: Debugging an ARM assembly (Neon extension)
- https://stackoverflow.com/questions/9830732/arm-neon-debugging-for-android-ndk
Use `BRK` to set a breakpoint in assembly code. See the documentation for `BRK`:
- http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0802a/BRK.html
- http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0801c/pge1427897654274.html
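As a rough illustration of the `BRK` tip, the instruction can also be emitted from C through GCC inline assembly. The sketch below makes several assumptions: `debug_checkpoint` is a hypothetical helper name, `BRK` is used only when compiling for AArch64, the 32-bit ARM branch substitutes `BKPT`, and on other targets no instruction is emitted at all.

```c
/* Hypothetical helper, not part of any library.
 * When `enabled` is non-zero, executes a breakpoint instruction so an
 * attached debugger stops here; returns 1 after the instruction.
 * When `enabled` is zero, does nothing and returns 0. */
static int debug_checkpoint(int enabled)
{
    if (!enabled)
        return 0;
#if defined(__aarch64__)
    __asm__ volatile("brk #0");   /* A64 breakpoint instruction */
#elif defined(__arm__)
    __asm__ volatile("bkpt #0");  /* A32/T32 counterpart (assumption) */
#endif
    return 1;
}
```

Without a debugger attached, executing the breakpoint instruction normally terminates the process with a trap signal, so a guard like the `enabled` flag keeps it out of normal runs. Depending on the debugger, you may need to move the program counter past the instruction by hand before continuing.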
- Maximum optimization of element wise multiplication via ARM NEON assembly
- ARM NEON Optimization no faster than C++ Pointer Implementation
- ARM and NEON can work in parallel?
- [VIP!] NEON assembly optimization, seen through a complex dot-product algorithm
- https://stackoverflow.com/questions/31766844/neon-and-arm-assembly-optimization
- https://stackoverflow.com/questions/6383826/how-to-measure-arm-performance
- [VIP] https://stackoverflow.com/questions/3247373/how-to-measure-program-execution-time-in-arm-cortex-a8-processor