Kprintf - tingxingdong/clBLAS-private GitHub Wiki
KPRINTF template renderer
Most of the L2 routines and all of the L1 routines are not dynamically generated. Instead, they are written in a static style that keeps the datatypes generic. This approach attempts to combine the ease of use of a plain kernel .cl file with the ability to specialize the algorithm for each of the datatypes that BLAS supports.
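To illustrate the idea, here is a minimal sketch of datatype specialization by template substitution. It is not the actual kprintf implementation: the placeholder names `%TYPE` and `%MUL`, the `render` function, the `scale` kernel, and the `MUL_REAL`/`MUL_COMPLEX` helper macros (assumed to be defined elsewhere in the kernel source) are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: placeholder names and helpers are hypothetical,
# not the real kprintf specifier set.

# A single statically written kernel with generic datatype placeholders.
KERNEL_TEMPLATE = """\
__kernel void scale(__global %TYPE *x, const %TYPE alpha, const uint n)
{
    const uint i = get_global_id(0);
    if (i < n) {
        x[i] = %MUL(alpha, x[i]);
    }
}
"""

# Per-datatype substitutions. Real types multiply directly; complex types
# need a dedicated multiply, so each maps to a different (assumed) macro.
SPECIALIZATIONS = {
    "float":          {"%TYPE": "float",   "%MUL": "MUL_REAL"},
    "double":         {"%TYPE": "double",  "%MUL": "MUL_REAL"},
    "complex_float":  {"%TYPE": "float2",  "%MUL": "MUL_COMPLEX"},
    "complex_double": {"%TYPE": "double2", "%MUL": "MUL_COMPLEX"},
}


def render(template: str, dtype: str) -> str:
    """Return kernel source specialized for the given datatype."""
    table = SPECIALIZATIONS[dtype]
    source = template
    # Replace longer placeholders first so that one placeholder which is a
    # prefix of another cannot be substituted by mistake.
    for placeholder in sorted(table, key=len, reverse=True):
        source = source.replace(placeholder, table[placeholder])
    return source


if __name__ == "__main__":
    # Show the specialized kernel signature for each datatype.
    for dtype in SPECIALIZATIONS:
        print(render(KERNEL_TEMPLATE, dtype).splitlines()[0])
```

The same template text yields one kernel per datatype, so the algorithm is written once while the type names and arithmetic helpers vary per specialization.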
The documents below capture the rationale, design, and algorithms used to develop the kernels for the kprintf template renderer.
L3 & L2 routines
- TRMV, TRSV, SYMM
- HEMV, GER, GERU, GERC, HER, HER2, SYR, SYR2, HEMM, HERK
- GBMV, TBMV, SBMV, HBMV, TBSV
- HPMV, SPMV, TPMV, TPSV, HPR, SPR
- HER2K