AMD Hackathon Notes - dawiedotcom/IMEX_SfloW2D_v2 GitHub Wiki
Current instructions are as follows and might change soon:
Load the AMD flang compiler:
$ module load amdflang-new/rocm-afar-6.1.0
$ which amdflang
> /opt/rocmplus-6.4.0/rocm-afar-6.1.0/bin/amdflang
$ export HSA_XNACK=1
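The environment set up so far can be sanity-checked before building. The following is a minimal sketch, not part of the project: the function name `check_build_env` and its messages are made up for illustration, and it only checks that a compiler is on the PATH and that HSA_XNACK is 1.

```shell
# Pre-flight check: is the compiler on PATH and is HSA_XNACK set to 1?
# The function name and messages are illustrative, not part of the project.
check_build_env() {
    compiler="${1:-amdflang}"   # compiler name to look for; defaults to amdflang
    if ! command -v "$compiler" >/dev/null 2>&1; then
        echo "compiler not found on PATH: $compiler" >&2
        return 1
    fi
    if [ "${HSA_XNACK:-}" != "1" ]; then
        echo "HSA_XNACK is not set to 1" >&2
        return 1
    fi
    echo "ok: $(command -v "$compiler"), HSA_XNACK=$HSA_XNACK"
}
```

Running `check_build_env` with no argument looks for amdflang; a non-zero return status means one of the two checks failed.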
$ git clone https://github.com/jklebes/IMEX_SfloW2D_v2 --branch=amd
$ cd IMEX_SfloW2D_v2
$ autoreconf
$ ./configure --prefix=$PWD
$ make FCFLAGS='-fopenmp' FC=amdflang LDFLAGS='' install
Some of the examples require Python packages for pre and post-processing:
$ python -m venv .venv
$ . .venv/bin/activate
(.venv) $ pip install numpy pandas netcdf4 matplotlib
To run the Etna example, follow the same instructions as in [/EXAMPLES/EXAMPLE_ETNA/README.txt], starting from the EXAMPLES/EXAMPLE_ETNA directory:
(.venv) $ cd DEM
(.venv) $ unzip *.zip
(.venv) $ cd ..
(.venv) $ python create_input_ellipsoid.py
(.venv) $ ../../bin/IMEX_SfloW2D
> ...
> Time taken by iterations is 461.651952007 seconds
> Elapsed real time = 483.532 seconds
> ...
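When comparing runs, it can help to pull the two timing lines out of the program output and see how much time falls outside the iteration loop. The sketch below assumes the output uses the line formats shown above; the function name `summarise_timings` is hypothetical.

```shell
# Summarise the two timing lines printed by IMEX_SfloW2D.
# Assumes the line formats shown above; reads the program output on stdin.
summarise_timings() {
    awk '
        /Time taken by iterations is/ { iter  = $(NF-1) }  # value before "seconds"
        /Elapsed real time =/         { total = $(NF-1) }
        END {
            if (iter == "" || total == "") exit 1           # a timing line was missing
            printf "iterations=%.3f elapsed=%.3f other=%.3f\n", iter, total, total - iter
        }
    '
}
```

For example, `../../bin/IMEX_SfloW2D | summarise_timings` would print a single summary line. For the run above, the time spent outside the iterations comes to about 21.9 seconds.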
I place the following in src/Makefile.amd and invoke it with make -f Makefile.amd from src/:
TARGET=IMEX_SfloW2D
FCFLAGS=-fopenmp -fopenmp-force-usm --offload-arch=gfx942
LDFLAGS=-lflang_rt.hostdevice
SRC= parameters_2d.f90 \
complexify.f90 \
geometry_2d.f90 \
constitutive_2d.f90 \
solver_2d.f90 \
init_2d.f90 \
inpout_2d.f90 \
IMEX_SfloW2D.f90
OBJ=$(patsubst %.f90, %.o, $(SRC))
$(info $(SRC))
$(info $(OBJ))
all : $(TARGET)
%.o : %.f90
	$(FC) $(FCFLAGS) -c -o $@ $<
$(TARGET) : $(OBJ)
	$(FC) $(FCFLAGS) $(LDFLAGS) -o $@ $(OBJ)
.PHONY: clean
clean:
	rm -rf $(TARGET) *.o *.mod
Based on the tutorial in [1].
- Added roctx.f90, which provides the roctx module, to the IMEX project's src/ directory.
- Added the following to a .f90 file to include the module:
USE roctx, ONLY: roctxpop, roctxpush
USE ISO_C_BINDING, ONLY: c_null_char
- Instrumented sections of the code with:
CALL roctxpush("some label" // c_null_char)
! ...
CALL roctxpop("some label" // c_null_char)
- Added the following link flags:
LDFLAGS+=-L${ROCM_PATH}/lib -lrocprofiler-sdk-roctx
- Compiled the code and ran the profiler (from one of the example directories) with:
rocprofv3 --sys-trace --marker-trace --output-format pftrace -- <bin_name>
This produced .pftrace files that can be visualised in ui.perfetto.dev.
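After a profiling run, the trace files need to be located before they can be opened in Perfetto. The sketch below assumes the traces carry the .pftrace extension and makes no assumption about the profiler's output directory layout, so it searches recursively. The function name `list_traces` is hypothetical.

```shell
# List Perfetto trace files under a directory (default: current directory),
# sorted by path. Searching recursively avoids assuming the output layout.
list_traces() {
    find "${1:-.}" -type f -name '*.pftrace' | sort
}
```

Calling `list_traces` from the example directory prints one path per line; each listed file can then be loaded into ui.perfetto.dev.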