Quick Start: 1D Shock Tube - hyschive/gamer-fork GitHub Wiki
This page includes three demos: CPU-only without OpenMP, CPU-only with OpenMP, and GPU with OpenMP.
1. Move to the source directory.
cd src
2. Generate the Makefile with configure.py.
cp ../example/test_problem/Hydro/Riemann/generate_make.sh ./
sh generate_make.sh --openmp=false
Execution results
... ...
========================================
Makefile is created.
========================================
Note
We have set --mpi=false, --gpu=false, and --openmp=false to run in a CPU-only mode without OpenMP and MPI. See Option List for a complete list of all available options of configure.py.
3. Compile the code.
make clean
make
Execution results
... ... Compiling GAMER --> Successful!
4. Create a working directory.
cd ../bin
mkdir shocktube
cd shocktube
5. Copy the GAMER executable and example files of the test problem to the working directory.
cp -r ../../example/test_problem/Hydro/Riemann/* .
cp ../gamer .
ls
Execution results
clean.sh gamer Input__Flag_Lohner Input__Parameter Input__TestProb plot__hydro_dens.gpt plot__mhd.gpt README ReferenceSolution
6. Run the code. It will display ~ GAME OVER ~ if it succeeds.
./gamer
Execution results
... ...
Time: 9.0000000e-02 -> 9.3057198e-02, Step: 29 -> 30, dt_base: 3.0571979e-03
Time: 9.3057198e-02 -> 9.6446514e-02, Step: 30 -> 31, dt_base: 3.3893166e-03
Time: 9.6446514e-02 -> 9.9834923e-02, Step: 31 -> 32, dt_base: 3.3884083e-03
Time: 9.9834923e-02 -> 1.0000000e-01, Step: 32 -> 33, dt_base: 1.6507717e-04
Output_DumpData_Part (DumpID = 10) ...
Output_DumpData_Part (DumpID = 10) ... done
End_GAMER ...
End_MemFree ... done
End_GAMER ... done
~ GAME OVER ~
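When the run is launched from a script, the completion banner can be checked automatically. The following is a minimal sketch, assuming the banner is written to standard output or standard error exactly as shown above. The function name run_ok and the log file name gamer.log are hypothetical and not part of GAMER.

```shell
# Return success (exit status 0) if the given log contains the completion banner.
run_ok() {
    grep -q '~ GAME OVER ~' "$1"
}

# Run only when the executable is present in the current directory.
if [ -x ./gamer ]; then
    # gamer.log is a hypothetical file name chosen for this sketch.
    ./gamer 2>&1 | tee gamer.log
    if run_ok gamer.log; then
        echo "run finished normally"
    else
        echo "completion banner not found" >&2
    fi
fi
```

Since gamer.log is not one of the files produced by GAMER itself, it may need to be removed by hand before a rerun.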
7. The code will generate several log files Record__* and a series of 1D text data files Xline_y0.000_z0.000_*.
ls
Execution results
clean.sh plot__mhd.gpt Record__PatchCount Xline_y0.000_z0.000_000001 Xline_y0.000_z0.000_000007
gamer README Record__Performance Xline_y0.000_z0.000_000002 Xline_y0.000_z0.000_000008
Input__Flag_Lohner Record__Dump Record__TimeStep Xline_y0.000_z0.000_000003 Xline_y0.000_z0.000_000009
Input__Parameter Record__MemInfo Record__Timing Xline_y0.000_z0.000_000004 Xline_y0.000_z0.000_000010
Input__TestProb Record__NCorrUnphy ReferenceSolution Xline_y0.000_z0.000_000005
plot__hydro_dens.gpt Record__Note Xline_y0.000_z0.000_000000 Xline_y0.000_z0.000_000006
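To pick the most recent data file without reading the listing by eye, a small helper can rely on the zero-padded dump IDs. This is only a sketch under the assumption that all data files follow the Xline_y0.000_z0.000_* naming shown above. The function name latest_dump is hypothetical.

```shell
# Print the data file with the highest dump ID in the given directory
# (default: current directory). Returns non-zero if none is found.
latest_dump() {
    last=""
    # Glob results are sorted, and zero-padded IDs sort correctly as strings.
    for f in "${1:-.}"/Xline_y0.000_z0.000_*; do
        [ -e "$f" ] && last=$f
    done
    [ -n "$last" ] && printf '%s\n' "$last"
}

latest_dump . || echo "no data files found"
```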
8. Plot a 1D text data file. You can use the sample gnuplot script plot__hydro_dens.gpt (replace display with another image viewer if necessary).
gnuplot plot__hydro_dens.gpt
display Fig__Riemann_Density_000010.png
Execution results: a plot of the density profile, saved as Fig__Riemann_Density_000010.png.
9. Check the performance. See Log Files for more detailed timing analysis.
tail -n 3 Record__Note
Execution results
Total Processing Time : 75.954923 s
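For scripted comparisons between runs, the timing can be pulled out of the log as a bare number. The sketch below assumes the line format shown above ("Total Processing Time : <seconds> s"). The function name total_time is hypothetical.

```shell
# Print the total processing time in seconds recorded in a GAMER log file.
total_time() {
    # Keep the last match in case the log holds more than one such line.
    awk '/Total Processing Time/ { t = $(NF - 1) }
         END { if (t != "") print t }' "$1"
}

if [ -f Record__Note ]; then
    total_time Record__Note
fi
```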
Next, we enable OpenMP for the same test problem. Repeat the steps above with the following modifications.
1. Regenerate the Makefile with configure.py and recompile gamer.
sh generate_make.sh --openmp=true
make clean
make -j4
Caution
Remember to copy the new executable to bin/shocktube.
2. Set the number of OpenMP threads by editing the runtime parameter OMP_NTHREAD in the input file Input__Parameter. The following example uses 4 threads.
OMP_NTHREAD 4 # number of OpenMP threads (<=0=auto) [-1]
Caution
All input files Input__* must be put in the same directory as the executable gamer.
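Editing the parameter by hand works fine. For repeated runs with different thread counts, the change can also be scripted. The following sketch assumes each parameter occupies one line of the form "NAME VALUE # comment", as in the OMP_NTHREAD line above. The helper set_param is hypothetical and not part of GAMER.

```shell
# Replace the value of parameter NAME in FILE, keeping spacing and comment.
# Usage: set_param FILE NAME VALUE
set_param() {
    file=$1 name=$2 value=$3
    tmp=$(mktemp) || return 1
    awk -v n="$name" -v v="$value" '
        $1 == n && NF >= 2 {
            match($0, /^[ \t]*[^ \t]+[ \t]+/)   # leading name and spaces
            head = substr($0, 1, RLENGTH)
            rest = substr($0, RLENGTH + 1)
            sub(/^[^ \t]+/, v, rest)            # replace only the value token
            $0 = head rest
        }
        { print }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps file permissions
    status=$?
    rm -f "$tmp"
    return $status
}

if [ -f Input__Parameter ]; then
    set_param Input__Parameter OMP_NTHREAD 4
fi
```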
3. Remove all old log and data files, if necessary. You can use the helper script clean.sh.
sh clean.sh
ls
Execution results
clean.sh gamer Input__Flag_Lohner Input__Parameter Input__TestProb plot__hydro_dens.gpt plot__mhd.gpt README ReferenceSolution
4. Run the code in the same way as without OpenMP.
./gamer
5. Validate the OpenMP settings by searching for the keyword "OpenMP" in the log file Record__Note. You should see something like
OpenMP Diagnosis
***********************************************************************************
OMP__SCHEDULE DYNAMIC
OMP__SCHEDULE_CHUNK_SIZE 1
OMP__NESTED OFF

CPU core IDs of all OpenMP threads (tid == thread ID):
------------------------------------------------------------------------
 Rank        Host  NThread  tid-00  tid-01  tid-02  tid-03
    0    golub123        4       2       5       4       7
***********************************************************************************
Check the following things:
- The number under NThread is the same as the runtime parameter OMP_NTHREAD you just set
- Different threads use different CPU cores
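Both checks can be expressed as a short script over one row of the diagnosis table. This is a sketch that assumes the row layout shown above (rank, host name, NThread, then one core ID per thread). The function name check_omp_row is hypothetical.

```shell
# Read one "Rank Host NThread tid-00 tid-01 ..." row from stdin and verify
# that (1) NThread and the number of listed core IDs both equal the expected
# thread count and (2) no core ID appears twice.
# Usage: check_omp_row EXPECTED_NTHREAD
check_omp_row() {
    awk -v want="$1" '
        {
            if ($3 != want || NF - 3 != want) bad = 1
            for (i = 4; i <= NF; i++) if (seen[$i]++) bad = 1
        }
        END { exit ((bad || NR == 0) ? 1 : 0) }
    '
}

# Row taken from the example diagnosis above.
if echo "0 golub123 4 2 5 4 7" | check_omp_row 4; then
    echo "OpenMP thread placement looks consistent"
else
    echo "thread count mismatch or shared CPU core" >&2
fi
```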
6. Check the performance. It should be about OMP_NTHREAD times faster than the case without OpenMP.
tail -n 3 Record__Note
Execution results
Total Processing Time : 20.460586 s
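To compare the two runs quantitatively, the speedup and the parallel efficiency (speedup divided by the number of threads) can be computed from the two timings. The sketch below uses the timings reported on this page. The function names speedup and efficiency are hypothetical.

```shell
# Speedup of a new run relative to a baseline run.
# Usage: speedup BASELINE_SECONDS NEW_SECONDS
speedup() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.2f\n", base / new }'
}

# Parallel efficiency: speedup divided by the number of threads.
# Usage: efficiency BASELINE_SECONDS NEW_SECONDS NTHREAD
efficiency() {
    awk -v base="$1" -v new="$2" -v n="$3" \
        'BEGIN { printf "%.2f\n", base / new / n }'
}

speedup 75.954923 20.460586        # prints 3.71
efficiency 75.954923 20.460586 4   # prints 0.93
```

A speedup somewhat below OMP_NTHREAD, as here, is expected because not every part of the run scales perfectly with the thread count.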
To enable both GPU and OpenMP, repeat the steps in CPU-only with OpenMP with the following modifications.
1. Regenerate the Makefile with configure.py and recompile gamer.
sh generate_make.sh --machine=YOUR_MACHINE --openmp=true --gpu=true
make clean
make -j4
Caution
- Please make sure that GPU_COMPUTE_CAPABILITY is set properly in your machine configuration file
- Remember to copy the new executable to bin/shocktube
2. Remove all old log and data files, if necessary. Run the code with the new executable.
sh clean.sh
./gamer
3. Validate the GPU settings by searching for the keyword "Device Diagnosis" in the log file Record__Note. You should see something like
Device Diagnosis
***********************************************************************************
MPI_Rank = 0, hostname = golub123, PID = 47842

CPU Info :
CPU Type : Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
CPU MHz : 2499.982
Cache Size : 25600 KB
CPU Cores : 10
Total Memory : 63.0 GB

GPU Info :
Number of GPUs : 2
GPU ID : 0
GPU Name : Tesla K40m
CUDA Driver Version : 8.0
CUDA Runtime Version : 7.0
CUDA Major Revision Number : 3
CUDA Minor Revision Number : 5
Clock Rate : 0.745000 GHz
Global Memory Size : 11439 MB
Constant Memory Size : 64 KB
Shared Memory Size per Block : 48 KB
Number of Registers per Block : 65536
Warp Size : 32
Number of Multiprocessors: : 15
Number of Cores per Multiprocessor: 192
Total Number of Cores: : 2880
Max Number of Threads per Block : 1024
Max Size of the Block X-Dimension : 1024
Max Size of the Grid X-Dimension : 2147483647
Concurrent Copy and Execution : Yes
Concurrent Up/Downstream Copies : Yes
Concurrent Kernel Execution : Yes
GPU has ECC Support Enabled : Yes
***********************************************************************************
This example shows that we are running on the computing node golub123 with 2 GPUs, and we are using the one with GPU ID = 0, a Tesla K40m GPU.
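Individual entries of the diagnosis can also be read out in a script, for example to record which GPU a run used. This is a sketch assuming each entry has the "Label : Value" form shown above. The function name diag_field is hypothetical.

```shell
# Print the value of the first "Label : Value" entry whose label matches.
# Usage: diag_field FILE "LABEL"
diag_field() {
    awk -F ':' -v key="$2" '
        {
            label = $1; value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", label)   # trim surrounding blanks
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (label == key) { print value; exit }
        }
    ' "$1"
}

if [ -f Record__Note ]; then
    echo "GPU ID:   $(diag_field Record__Note 'GPU ID')"
    echo "GPU Name: $(diag_field Record__Note 'GPU Name')"
fi
```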
4. Check the performance. It should be noticeably faster than the case without GPU.
tail -n 3 Record__Note
Execution results
Total Processing Time : 9.417532 s