How to tune clBLAS GEMM - tingxingdong/clBLAS-private GitHub Wiki

After you have built clBLAS.so successfully, you may want to re-tune the clBLAS library kernels for new hardware, such as AMD's Navi cards.

Step 1:

cd ~/clBLAS/src/library/blas/AutoGemm/AutoGemmTools

vim ProfileAutoGemm.cpp

Read the file and edit whatever you need, such as the data type, matrix sizes, etc.

Step 2:

cd ~/clBLAS_BUILD/

make AutoGemm_Tools_Profile

Step 3:

cd ./library

./AutoGemm_Tools_Profile

This runs the benchmark. After a while, you will see *.csv and *.txt files generated in the current folder, where the *.txt file is a summary of the optimal kernel configurations. Trim, merge, and simplify the table in the *.txt file.
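The merging part of that cleanup can be sketched in Python. This is only an illustration under assumptions: it presumes the summary rows have already been copied into a Python list in the `[ size, tile, [ tiles ] ]` shape shown in Step 4, that rows are sorted by descending size, and that a row applies to every size at or above its threshold. The function name and the sample rows are made up for the example and are not part of clBLAS.

```python
def merge_adjacent_rows(rows):
    """Collapse runs of consecutive rows that choose identical tiles.

    Assumption: rows are sorted by descending size and each row covers all
    sizes at or above its threshold, so the smallest threshold in a run
    can stand in for the whole run.
    """
    merged = []
    for size, first_tile, tiles in rows:
        if merged and merged[-1][1:] == [first_tile, tiles]:
            merged[-1][0] = size  # extend the previous row down to this size
        else:
            merged.append([size, first_tile, tiles])
    return merged


if __name__ == "__main__":
    # Hypothetical profiler output, not real measurements.
    rows = [
        [800, [64, 64], [[64, 64]]],
        [600, [64, 64], [[64, 64]]],
        [400, [32, 32], [[32, 32]]],
    ]
    for row in merge_adjacent_rows(rows):
        print(row)
```

If the thresholds in your summary have a different meaning, adjust which size is kept when two rows are merged.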

Step 4:

cd ~/clBLAS/src/library/blas/AutoGemm/

vim AutoGemmParameters.py

You will see two big tables whose names end with Hawaii and Fiji. Add the table you edited in the last step, and give it a name ending with Navi, following the pattern used for Hawaii and Fiji. Then enable the table you added in the code to make sure it is actually used.

Note that each row of the table must be changed from a form like

[ 400, [ 32, 32], [ [ 64, 64], [ 48, 48], [ 80, 80] ] ],

To

[ 400, [ 16, 16, 2, 2], [ [ 16, 16, 4, 4], [ 16, 16, 3, 3], [ 16, 16, 5, 5] ] ],
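The conversion above can be automated with a short Python sketch. It rests on a reading of the example rather than on clBLAS documentation: each two-value tile appears to be split into a 16 x 16 work-group plus a per-dimension multiplier (32 = 16 x 2, 64 = 16 x 4, 48 = 16 x 3, 80 = 16 x 5). The constant and function names are hypothetical.

```python
WORKGROUP_DIM = 16  # assumed work-group edge, inferred from the example rows


def to_four_values(tile):
    """Turn a [rows, cols] tile into [16, 16, rows // 16, cols // 16]."""
    rows, cols = tile
    if rows % WORKGROUP_DIM or cols % WORKGROUP_DIM:
        raise ValueError(f"tile {tile} is not a multiple of {WORKGROUP_DIM}")
    return [WORKGROUP_DIM, WORKGROUP_DIM,
            rows // WORKGROUP_DIM, cols // WORKGROUP_DIM]


def convert_row(row):
    """Convert one row [size, tile, [tile, ...]] to the four-value form."""
    size, first_tile, tiles = row
    return [size, to_four_values(first_tile),
            [to_four_values(t) for t in tiles]]


if __name__ == "__main__":
    profiled = [400, [32, 32], [[64, 64], [48, 48], [80, 80]]]
    print(convert_row(profiled))
    # prints [400, [16, 16, 2, 2], [[16, 16, 4, 4], [16, 16, 3, 3], [16, 16, 5, 5]]]
```

Tiles that are not multiples of 16 raise an error instead of being rounded, so such rows can be checked by hand.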

Step 5:

python AutoGemm.py -h

Edit AutoGemm.py and add "Navi" beside "Fiji", so that Navi is accepted as an architecture.

python AutoGemm.py --opencl-compiler-version 1.2 --architecture Navi

New kernels tuned for Navi will be added to AutoGemm.

Step 6:

Rebuild clBLAS.so, and you are done. The optimal settings for Navi should now be in place.

Step 7 (optional):

If you want to add Navi as a CMake option, like Fiji and Hawaii:

cd ~/clBLAS && grep -r Fiji *

Then add Navi wherever Fiji appears.
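If you prefer a checklist of files over raw grep output, the same search can be sketched in Python. This is a convenience sketch, not part of clBLAS: the function name is hypothetical, and it only reports which files mention the string, not the matching lines.

```python
import os


def files_mentioning(root, needle="Fiji"):
    """Return sorted paths (relative to root) of files whose text contains needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="ignore" lets binary files be scanned without crashing
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    if needle in handle.read():
                        hits.append(os.path.relpath(path, root))
            except OSError:
                continue  # unreadable file, skip it
    return sorted(hits)


if __name__ == "__main__":
    for path in files_mentioning(os.path.expanduser("~/clBLAS")):
        print(path)
```

Each printed path is a file where a Navi entry probably needs to be added next to the Fiji one.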