Coarray Fortran - shawfdong/hyades GitHub Wiki
Coarray Fortran (CAF) is now part of the Fortran 2008 standard. Here are two sample CAF programs:
- caf_hello.f90: a "Hello world" program written in Coarray Fortran;
- caf_pi.f90: a program that calculates pi using the Monte Carlo method.
Use the Intel Fortran compiler to compile CAF codes:
$ ifort -coarray caf_hello.f90 -o caf_hello.x
By default, an executable built this way with the Intel compiler creates as many images as there are processor cores on the host when it is run. For example, there are 32 cores on the master node of Hyades:
$ ./caf_hello.x
Hello from image 6 running on hyades.ucsc.edu out of 32
Hello from image 7 running on hyades.ucsc.edu out of 32
Hello from image 14 running on hyades.ucsc.edu out of 32
Hello from image 15 running on hyades.ucsc.edu out of 32
Hello from image 31 running on hyades.ucsc.edu out of 32
Hello from image 13 running on hyades.ucsc.edu out of 32
Hello from image 16 running on hyades.ucsc.edu out of 32
Hello from image 8 running on hyades.ucsc.edu out of 32
Hello from image 28 running on hyades.ucsc.edu out of 32
Hello from image 1 running on hyades.ucsc.edu out of 32
Hello from image 29 running on hyades.ucsc.edu out of 32
Hello from image 24 running on hyades.ucsc.edu out of 32
Hello from image 26 running on hyades.ucsc.edu out of 32
Hello from image 27 running on hyades.ucsc.edu out of 32
Hello from image 12 running on hyades.ucsc.edu out of 32
Hello from image 3 running on hyades.ucsc.edu out of 32
Hello from image 9 running on hyades.ucsc.edu out of 32
Hello from image 10 running on hyades.ucsc.edu out of 32
Hello from image 20 running on hyades.ucsc.edu out of 32
Hello from image 11 running on hyades.ucsc.edu out of 32
Hello from image 23 running on hyades.ucsc.edu out of 32
Hello from image 5 running on hyades.ucsc.edu out of 32
Hello from image 21 running on hyades.ucsc.edu out of 32
Hello from image 4 running on hyades.ucsc.edu out of 32
Hello from image 25 running on hyades.ucsc.edu out of 32
Hello from image 30 running on hyades.ucsc.edu out of 32
Hello from image 32 running on hyades.ucsc.edu out of 32
Hello from image 19 running on hyades.ucsc.edu out of 32
Hello from image 22 running on hyades.ucsc.edu out of 32
Hello from image 17 running on hyades.ucsc.edu out of 32
Hello from image 18 running on hyades.ucsc.edu out of 32
Hello from image 2 running on hyades.ucsc.edu out of 32
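Because the images run concurrently, their output lines arrive in no particular order, as the listing above shows. A minimal sketch of a filter that sorts the lines by image number, assuming the output format printed by caf_hello.x above (the function name sort_by_image is hypothetical and not part of any Intel tool):

```shell
# Hypothetical helper: sort "Hello from image N ..." lines by N.
# Assumes the image number is the 4th whitespace-separated field,
# as in the caf_hello.x output shown above.
# Usage: ./caf_hello.x | sort_by_image
sort_by_image() {
    sort -n -k4,4
}
```

Sorting only changes how the output is displayed; it does not affect the order in which the images execute.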
There are two methods to control the number of images:
1. Set the environment variable FOR_COARRAY_NUM_IMAGES:
$ export FOR_COARRAY_NUM_IMAGES=4
$ ./caf_hello.x
Hello from image 1 running on hyades.ucsc.edu out of 4
Hello from image 2 running on hyades.ucsc.edu out of 4
Hello from image 3 running on hyades.ucsc.edu out of 4
Hello from image 4 running on hyades.ucsc.edu out of 4
2. Compile the application with the -coarray-num-images=N compiler option, where N is the number of images:
$ unset FOR_COARRAY_NUM_IMAGES
$ ifort -coarray -coarray-num-images=2 caf_hello.f90 -o caf_hello.x
$ ./caf_hello.x
Hello from image 1 running on hyades.ucsc.edu out of 2
Hello from image 2 running on hyades.ucsc.edu out of 2
In the latter case, you can still use the environment variable FOR_COARRAY_NUM_IMAGES to override the number of images at run time:
$ export FOR_COARRAY_NUM_IMAGES=4
$ ./caf_hello.x
Hello from image 1 running on hyades.ucsc.edu out of 4
Hello from image 2 running on hyades.ucsc.edu out of 4
Hello from image 3 running on hyades.ucsc.edu out of 4
Hello from image 4 running on hyades.ucsc.edu out of 4
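The listings above indicate that the environment variable takes precedence over the value fixed at compile time. To avoid leaving FOR_COARRAY_NUM_IMAGES exported in the shell, it can be set for a single command only. A small sketch; the wrapper name run_with_images is hypothetical:

```shell
# Hypothetical wrapper: run a command with FOR_COARRAY_NUM_IMAGES set
# for that command only, leaving the calling shell's environment untouched.
# Usage: run_with_images 4 ./caf_hello.x
run_with_images() {
    num_images=$1
    shift
    # env passes the variable to the child process without exporting it here.
    env FOR_COARRAY_NUM_IMAGES="$num_images" "$@"
}
```

This keeps later runs in the same shell on the default (or compiled-in) image count.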
To run CAF programs across multiple nodes (distributed memory), first create a coarray config file (cafconfig.txt) with content like the following:
-machinefile hosts -genvall -genv I_MPI_FABRICS shm:ofa -n 4 ./caf_hello.dist
Then compile the CAF codes:
$ ifort -coarray=distributed -coarray-config-file=cafconfig.txt caf_hello.f90 -o caf_hello.dist
$ ifort -coarray=distributed -coarray-config-file=cafconfig.txt caf_pi.f90 -o caf_pi.dist
Before running the executables, create a host file (hosts) with content like the following:
gpu-1:2
gpu-2:2
Then run the CAF executables:
$ ./caf_hello.dist
Hello from image 1 running on gpu-1.local out of 4
Hello from image 2 running on gpu-1.local out of 4
Hello from image 3 running on gpu-2.local out of 4
Hello from image 4 running on gpu-2.local out of 4
$ ./caf_pi.x
pi = 3.14143000000000
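The host file and the config file can also be generated from a script, so that the total image count stays consistent with the per-host slots. A sketch that assumes the same host names (gpu-1, gpu-2) and the same Intel MPI options as above; the function name write_caf_files is hypothetical:

```shell
# Hypothetical helper: write `hosts` and `cafconfig.txt` into the current
# directory for a given executable, reusing the hosts and options shown above.
# Usage: write_caf_files caf_hello.dist
write_caf_files() {
    exe=$1
    slots=2    # images per host, as in the example host file

    # One "host:slots" entry per line.
    printf '%s\n' "gpu-1:${slots}" "gpu-2:${slots}" > hosts

    # Total images = number of hosts (2) * slots per host.
    total=$((2 * slots))
    printf '%s\n' "-machinefile hosts -genvall -genv I_MPI_FABRICS shm:ofa -n ${total} ./${exe}" > cafconfig.txt
}
```

With slots=2 this reproduces the two files shown above for a 4-image run.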
Note: Intel's CAF implementation seems to be built on top of Intel MPI:
$ ldd caf_hello.dist
linux-vdso.so.1 => (0x00007fff2efff000)
libicaf.so => /opt/intel/composer_xe_2013_sp1.1.106/compiler/lib/intel64/libicaf.so (0x00002b66b4643000)
libm.so.6 => /lib64/libm.so.6 (0x00000030ac000000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00000030ac800000)
libc.so.6 => /lib64/libc.so.6 (0x00000030abc00000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003f1be00000)
libdl.so.2 => /lib64/libdl.so.2 (0x00000030ac400000)
libmpi_mt.so.4 => /opt/intel/impi/4.1.3.045/intel64/lib/libmpi_mt.so.4 (0x00002b66b489a000)
libintlc.so.5 => /opt/intel/composer_xe_2013_sp1.1.106/compiler/lib/intel64/libintlc.so.5 (0x00002b66b4f19000)
/lib64/ld-linux-x86-64.so.2 (0x00000030ab800000)
librt.so.1 => /lib64/librt.so.1 (0x00000030acc00000)
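The same check can be scripted. A minimal sketch that scans ldd output for an MPI library; the function name has_mpi_dep is hypothetical, and matching on the substring libmpi is an assumption based on the library name libmpi_mt.so.4 in the listing above:

```shell
# Hypothetical helper: read `ldd` output on stdin and succeed if any line
# mentions a library whose name contains "libmpi" (e.g. libmpi_mt.so.4).
# Usage: ldd caf_hello.dist | has_mpi_dep && echo "links against an MPI library"
has_mpi_dep() {
    grep -q 'libmpi'
}
```

Reading from standard input keeps the helper independent of where the executable lives, so saved ldd output can be checked as well.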