Quick Start: 1D Shock Tube - hyschive/gamer-fork GitHub Wiki
This page includes three demos: a CPU-only run, a CPU-only run with OpenMP, and a run with both GPU and OpenMP.
1. Move to the source directory.
cd src
2. Generate Makefile by configure.py.
cp ../example/test_problem/Hydro/Riemann/generate_make.sh ./
sh generate_make.sh --openmp=false
Execution results
... ...
========================================
Makefile is created.
========================================
Note
We have set --mpi=false, --gpu=false, and --openmp=false to run in a CPU-only mode without OpenMP and MPI. See Option List for a complete list of all available options of configure.py.
3. Compile the code.
make clean
make
Execution results
... ...
Compiling GAMER --> Successful!
4. Create a working directory.
cd ../bin
mkdir shocktube
cd shocktube
5. Copy the GAMER executable and example files of the test problem to the working directory.
cp -r ../../example/test_problem/Hydro/Riemann/* .
cp ../gamer .
ls
Execution results
clean.sh gamer Input__Flag_Lohner Input__Parameter Input__TestProb plot__hydro_dens.gpt plot__mhd.gpt README ReferenceSolution
6. Run the code. It will display ~ GAME OVER ~ if it succeeds.
./gamer
Execution results
... ...
Time: 9.0000000e-02 -> 9.3057198e-02, Step: 29 -> 30, dt_base: 3.0571979e-03
Time: 9.3057198e-02 -> 9.6446514e-02, Step: 30 -> 31, dt_base: 3.3893166e-03
Time: 9.6446514e-02 -> 9.9834923e-02, Step: 31 -> 32, dt_base: 3.3884083e-03
Time: 9.9834923e-02 -> 1.0000000e-01, Step: 32 -> 33, dt_base: 1.6507717e-04
Output_DumpData_Part (DumpID = 10) ...
Output_DumpData_Part (DumpID = 10) ... done
End_GAMER ...
End_MemFree ... done
End_GAMER ... done
~ GAME OVER ~
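When the run is launched from a script, success can be tested by searching the captured output for the final banner shown in the log above. The following is a minimal sketch; it assumes the output of ./gamer was redirected to a file, and the file name run.log and the function name gamer_succeeded are hypothetical.

```shell
# Return success (exit status 0) if a captured GAMER log contains the
# completion banner. "run.log" is a hypothetical file name used only for
# illustration; it is not created by GAMER itself.
gamer_succeeded() {
    grep -q '~ GAME OVER ~' "${1:-run.log}"
}
```

For example, after `./gamer > run.log 2>&1`, the call `gamer_succeeded run.log && echo "run finished"` prints the message only when the banner is present.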
7. The code will generate several log files Record__* and a series of 1D text data files Xline_y0.000_z0.000_*.
ls
Execution results
clean.sh              plot__mhd.gpt       Record__PatchCount   Xline_y0.000_z0.000_000001  Xline_y0.000_z0.000_000007
gamer                 README              Record__Performance  Xline_y0.000_z0.000_000002  Xline_y0.000_z0.000_000008
Input__Flag_Lohner    Record__Dump        Record__TimeStep     Xline_y0.000_z0.000_000003  Xline_y0.000_z0.000_000009
Input__Parameter      Record__MemInfo     Record__Timing       Xline_y0.000_z0.000_000004  Xline_y0.000_z0.000_000010
Input__TestProb       Record__NCorrUnphy  ReferenceSolution    Xline_y0.000_z0.000_000005
plot__hydro_dens.gpt  Record__Note        Xline_y0.000_z0.000_000000  Xline_y0.000_z0.000_000006
8. Plot a 1D text data file. You can use the sample gnuplot script plot__hydro_dens.gpt (replace display with another image viewer if necessary).
gnuplot plot__hydro_dens.gpt
display Fig__Riemann_Density_000010.png
Execution results
[Figure: Fig__Riemann_Density_000010.png]
9. Check the performance. See Log Files for more detailed timing analysis.
tail -n 3 Record__Note
Execution results
Total Processing Time : 75.954923 s
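To compare runs later, it can help to pull out just the number from this line. The sketch below assumes the "Total Processing Time : <value> s" format shown above; the function name total_time is a hypothetical helper, not part of GAMER.

```shell
# Print the total processing time (in seconds) recorded in a GAMER log file.
# Assumes the "Total Processing Time : <value> s" line format shown above.
# If the line appears more than once, the last occurrence is printed.
total_time() {
    awk -F: '
        /Total Processing Time/ { split($2, f, " "); t = f[1] }
        END { if (t != "") print t }
    ' "${1:-Record__Note}"
}
```

Calling `total_time Record__Note` in the working directory prints only the numeric value, which can then be stored in a shell variable.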
Next, we enable OpenMP for the same test problem. Repeat the steps above with the following modifications.
1. Re-generate Makefile by configure.py and recompile gamer.
sh generate_make.sh --openmp=true
make clean
make -j4
Caution
Remember to copy the new executable to bin/shocktube.
2. Set the number of OpenMP threads by editing the runtime parameter OMP_NTHREAD in the input file Input__Parameter. The following example uses 4 threads.
OMP_NTHREAD 4 # number of OpenMP threads (<=0=auto) [-1]
Caution
All input files Input__* must be put in the same directory as the executable gamer.
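The edit to OMP_NTHREAD can also be scripted instead of done by hand. The following is a minimal sketch that assumes the parameter line starts with OMP_NTHREAD followed by whitespace and a value, as in the example above; set_omp_nthread is a hypothetical helper name. It writes through a temporary file rather than using in-place editing, since the in-place option of sed differs between platforms.

```shell
# Set OMP_NTHREAD in a GAMER parameter file, keeping the trailing comment.
# Usage: set_omp_nthread <nthreads> [parameter_file]
set_omp_nthread() {
    n=$1
    file=${2:-Input__Parameter}
    tmp=$(mktemp) || return 1
    # Replace only the value field of the line that starts with OMP_NTHREAD;
    # all other lines pass through unchanged.
    sed "s/^\(OMP_NTHREAD[[:space:]][[:space:]]*\)[^[:space:]][^[:space:]]*/\1$n/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For example, `set_omp_nthread 4` run inside bin/shocktube would rewrite the OMP_NTHREAD line of Input__Parameter to use 4 threads.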
3. Remove all old log and data files, if necessary.
You can use the helper script clean.sh.
sh clean.sh
ls
Execution results
clean.sh gamer Input__Flag_Lohner Input__Parameter Input__TestProb plot__hydro_dens.gpt plot__mhd.gpt README ReferenceSolution
4. Run the code in the same way as without OpenMP.
./gamer
5. Validate the OpenMP settings by searching for the keyword "OpenMP" in the log file Record__Note. You should see something like
OpenMP Diagnosis
***********************************************************************************
OMP__SCHEDULE DYNAMIC
OMP__SCHEDULE_CHUNK_SIZE 1
OMP__NESTED OFF
CPU core IDs of all OpenMP threads (tid == thread ID):
------------------------------------------------------------------------
Rank Host NThread tid-00 tid-01 tid-02 tid-03
0 golub123 4 2 5 4 7
***********************************************************************************
Check the following things:
- The number under NThread is the same as the runtime parameter OMP_NTHREAD you just set
- Different threads use different CPU cores
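Both checks can be automated with a short awk script. This is only a sketch: it assumes the table layout shown above (a header row beginning with Rank, Host, NThread, followed by one row per rank), and check_openmp is a hypothetical helper name.

```shell
# Check the "OpenMP Diagnosis" table in a GAMER log against an expected
# thread count. Exits non-zero if NThread differs from the expected value,
# if two threads of the same rank share a core ID, or if no table is found.
# Usage: check_openmp <expected_nthread> [log_file]
check_openmp() {
    awk -v want="$1" '
        # The header row names the columns; each following row is one rank.
        $1 == "Rank" && $3 == "NThread" { in_table = 1; next }
        in_table && $1 ~ /^[0-9]+$/ {
            rows++
            if ($3 != want) {
                printf "rank %s: NThread=%s, expected %s\n", $1, $3, want
                bad = 1
            }
            split("", seen)              # reset the per-rank set of core IDs
            for (i = 4; i <= NF; i++) {
                if ($i in seen) {
                    printf "rank %s: core %s is used by more than one thread\n", $1, $i
                    bad = 1
                }
                seen[$i] = 1
            }
            next
        }
        in_table { in_table = 0 }        # any other line ends the table
        END { if (bad || rows == 0) exit 1 }
    ' "${2:-Record__Note}"
}
```

With the example table above, `check_openmp 4 Record__Note` would print nothing and return success.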
6. Check the performance. Ideally, it is close to OMP_NTHREAD times faster than the case without OpenMP; in this example, 4 threads give a speedup of about 3.7 (75.95 s versus 20.46 s).
tail -n 3 Record__Note
Execution results
Total Processing Time : 20.460586 s
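The comparison with the run without OpenMP is simple arithmetic: the speedup is the ratio of the two total processing times, and dividing it by the thread count gives the parallel efficiency. The sketch below uses the two timings reported on this page; speedup is a hypothetical helper name.

```shell
# Print speedup and parallel efficiency from two wall-clock times (seconds).
# Usage: speedup <reference_time> <parallel_time> <nthreads>
speedup() {
    awk -v t1="$1" -v tn="$2" -v n="$3" 'BEGIN {
        s = t1 / tn
        printf "speedup = %.2f, efficiency = %.0f%%\n", s, 100 * s / n
    }'
}

# Timings from the runs without OpenMP and with 4 OpenMP threads.
speedup 75.954923 20.460586 4    # prints: speedup = 3.71, efficiency = 93%
```

A value somewhat below OMP_NTHREAD is expected in practice, since not every part of a run scales with the number of threads.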
To enable both GPU and OpenMP, repeat the steps in CPU-only with OpenMP with the following modifications.
1. Re-generate Makefile by configure.py and recompile gamer.
sh generate_make.sh --openmp=true --gpu=true
make clean
make -j4
Caution
- Please make sure that GPU_COMPUTE_CAPABILITY is set properly in your machine configuration file
- Remember to copy the new executable to bin/shocktube.
2. Remove all old log and data files, if necessary. Run the code with the new executable.
sh clean.sh
./gamer
3. Validate the GPU settings by searching for the keyword "Device Diagnosis" in the log file Record__Note. You should see something like
Device Diagnosis
***********************************************************************************
MPI_Rank = 0, hostname = golub123, PID = 47842
CPU Info :
CPU Type : Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
CPU MHz : 2499.982
Cache Size : 25600 KB
CPU Cores : 10
Total Memory : 63.0 GB
GPU Info :
Number of GPUs : 2
GPU ID : 0
GPU Name : Tesla K40m
CUDA Driver Version : 8.0
CUDA Runtime Version : 7.0
CUDA Major Revision Number : 3
CUDA Minor Revision Number : 5
Clock Rate : 0.745000 GHz
Global Memory Size : 11439 MB
Constant Memory Size : 64 KB
Shared Memory Size per Block : 48 KB
Number of Registers per Block : 65536
Warp Size : 32
Number of Multiprocessors: : 15
Number of Cores per Multiprocessor: 192
Total Number of Cores: : 2880
Max Number of Threads per Block : 1024
Max Size of the Block X-Dimension : 1024
Max Size of the Grid X-Dimension : 2147483647
Concurrent Copy and Execution : Yes
Concurrent Up/Downstream Copies : Yes
Concurrent Kernel Execution : Yes
GPU has ECC Support Enabled : Yes
***********************************************************************************
This example shows that we are running on the computing node golub123 with 2 GPUs, and we are using the one with GPU ID = 0, a Tesla K40m GPU.
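The three fields discussed here can be picked out of the log without reading the whole block. The following sketch assumes the "key : value" layout of the Device Diagnosis block shown above and only looks at lines after the "Device Diagnosis" keyword; gpu_summary is a hypothetical helper name.

```shell
# Print the GPU count, the GPU ID in use, and the GPU name from the
# "Device Diagnosis" block of a GAMER log, as key=value lines.
# Usage: gpu_summary [log_file]
gpu_summary() {
    awk -F: '
        /Device Diagnosis/ { on = 1 }
        on && /Number of GPUs|GPU ID|GPU Name/ {
            key = $1; val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)    # trim surrounding blanks
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            printf "%s=%s\n", key, val
        }
    ' "${1:-Record__Note}"
}
```

With the example log above, `gpu_summary Record__Note` would print one line each for Number of GPUs, GPU ID, and GPU Name.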
4. Check the performance. It should be noticeably faster than the case without GPU.
tail -n 3 Record__Note
Execution results
Total Processing Time : 9.417532 s