How to tune clBLAS GEMM
After you have built clBLAS.so successfully, you may want to re-tune the clBLAS kernels for new hardware, such as AMD's Navi cards.
Step 1:
cd ~/clBLAS/src/library/blas/AutoGemm/AutoGemmTools
vim ProfileAutoGemm.cpp
Read the file and edit whatever you need, such as the data type and the matrix sizes.
Step 2:
cd ~/clBLAS_BUILD/
make AutoGemm_Tools_Profile
Step 3:
cd ./library
./AutoGemm_Tools_Profile
Benchmarking takes a while. When it finishes, you will see *.csv and *.txt files generated in the current folder. The *.txt file is a summary of the optimal kernel configurations. Trim, merge, and simplify the table in the *.txt file.
Step 4:
cd ~/clBLAS/src/library/blas/AutoGemm/
vim AutoGemmParameters.py
You will see two big tables whose names end with Hawaii and Fiji. Add the table you edited in the last step, giving it a name that ends with Navi, following the Hawaii and Fiji pattern. Then enable the table you added in the code so that it is actually used.
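Note that each table row must be changed from the form
[ 400, [ 32, 32], [ [ 64, 64], [ 48, 48], [ 80, 80] ] ],
to the form
[ 400, [ 16, 16, 2, 2], [ [ 16, 16, 4, 4], [ 16, 16, 3, 3], [ 16, 16, 5, 5] ] ],
In this example, each tile size is split into a 16 x 16 work-group and a micro-tile, so 32 becomes 16 x 2, 64 becomes 16 x 4, and so on.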
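For long tables, this rewrite can be scripted. The following is a minimal Python sketch, assuming every tile uses a 16 x 16 work-group (as in the example rows) and that each tile dimension is a multiple of 16. The function and variable names are hypothetical and are not part of AutoGemm.

```python
# Assumption: every tile uses a 16 x 16 work-group, as in the example rows.
WORK_GROUP = 16


def split_tile(tile, work_group=WORK_GROUP):
    """Turn [rows, cols] into [work_group, work_group, micro_rows, micro_cols]."""
    rows, cols = tile
    if rows % work_group or cols % work_group:
        raise ValueError(f"tile {tile} is not a multiple of {work_group}")
    return [work_group, work_group, rows // work_group, cols // work_group]


def convert_row(row):
    """Convert one [size, first_tile, [tile, ...]] row to the four-number tile form."""
    size, first_tile, tile_list = row
    return [size, split_tile(first_tile), [split_tile(t) for t in tile_list]]


# The example row from this page.
profiler_row = [400, [32, 32], [[64, 64], [48, 48], [80, 80]]]
print(convert_row(profiler_row))
# prints [400, [16, 16, 2, 2], [[16, 16, 4, 4], [16, 16, 3, 3], [16, 16, 5, 5]]]
```

A tile that is not a multiple of 16 raises an error instead of being rounded silently, so such rows can be reviewed by hand.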
Step 5:
python AutoGemm.py -h
Edit AutoGemm.py and add "Navi" beside "Fiji", then run:
python AutoGemm.py --opencl-compiler-version 1.2 --architecture Navi
The new kernels tuned for Navi will be added to AutoGemm.
Step 6:
Rebuild clBLAS.so, and you are done. The optimal settings for Navi should now be in place.
Step 7 (optional):
If you want to add Navi as a CMake option, like Fiji and Hawaii:
cd ~/clBLAS && grep -r Fiji *
Then add Navi wherever Fiji appears.
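If you prefer to collect the matches programmatically, the search in this step can be sketched in Python. This is only an illustration of the same recursive search; the function name `find_mentions` is hypothetical, and the `~/clBLAS` path is the checkout location assumed throughout this page.

```python
import os


def find_mentions(root, needle="Fiji"):
    """Return (path, line_number, line) for every line under root that contains needle."""
    hits = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                # errors="ignore" lets the scan pass over binary files.
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for number, line in enumerate(handle, start=1):
                        if needle in line:
                            hits.append((path, number, line.rstrip()))
            except OSError:
                continue  # skip files that cannot be read
    return hits


if __name__ == "__main__":
    # Each printed line is a place where a Navi entry may be needed.
    for path, number, line in find_mentions(os.path.expanduser("~/clBLAS")):
        print(f"{path}:{number}: {line}")
```

Each reported location still needs a manual decision about how to add Navi next to Fiji.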