Aesyle - shawfdong/hyades GitHub Wiki
Aesyle is the one and only Type IIb Compute Node (CN IIb) in our Hyades cluster. We were very fortunate to receive a donation of two (2x) Xeon Phi 5110P processors from Intel in 2013. We've since integrated those 2 Xeon Phi processors into a Dell PowerEdge R720 server, which contains two (2x) 6-core Intel Sandy Bridge Xeon E5-2630L processors at 2.0 GHz, 64 GB memory and one 500GB hard drive.
Many Integrated Core (MIC) Architecture is a coprocessor computer architecture developed by Intel and Xeon Phi is the brand name for all products based on the MIC architecture. Salient features of Intel Xeon Phi Coprocessor 5110P (belonging to the Knights Corner product line) are:
- 60 cores (in-order, dual-issue x86 design)
- 4 threads per core
- Core speed: 1.053 GHz
- 512-bit SIMD vector units (Intel Initial Many Core Instructions, IMCI; not binary-compatible with AVX)
- Double precision peak performance: 1.01 TFLOPS = 1.053 (GHz) x 60 (cores) x 512/64 (SIMD lanes) x 2 (FMA)
- Memory: 8GB GDDR5; bandwidth: 320 GB/s = 5 (GT/s) x 16 (channels) x 4 (B)
- PCI Express 2.0 x16; bandwidth: 5 (GT/s) x 8/10 (8b/10b encoding) x 16 (lanes) / 8 (bits per byte) = 8 GB/s per direction (16 GB/s duplex)
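The peak figures in the list above are simple products, so they are easy to re-check. The following is a minimal sketch that redoes the arithmetic with awk; the function names are made up for this illustration and are not part of any Intel tool.

```shell
#!/bin/sh
# Re-derive the peak figures quoted in the feature list with awk
# (awk is used because plain shell arithmetic is integer-only).
# All function names here are hypothetical helpers for illustration.

# peak_dp_gflops GHZ CORES LANES FLOPS_PER_FMA
peak_dp_gflops() {
    awk -v f="$1" -v c="$2" -v l="$3" -v k="$4" \
        'BEGIN { printf "%.2f\n", f * c * l * k }'
}

# mem_bw_gbs GT_PER_S CHANNELS BYTES_PER_TRANSFER
mem_bw_gbs() {
    awk -v r="$1" -v c="$2" -v b="$3" 'BEGIN { printf "%.0f\n", r * c * b }'
}

# pcie2_bw_gbs LANES: 5 GT/s per lane, 8b/10b encoding, 8 bits per byte
pcie2_bw_gbs() {
    awk -v n="$1" 'BEGIN { printf "%.0f\n", 5 * 8 / 10 * n / 8 }'
}

peak_dp_gflops 1.053 60 8 2   # 512-bit vectors / 64-bit doubles = 8 lanes
mem_bw_gbs 5 16 4
pcie2_bw_gbs 16
```

The three calls print 1010.88 (GFLOPS), 320 (GB/s) and 8 (GB/s per direction), matching the rounded values in the list.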
# mpssinfo
MpssInfo Utility Log
Board
    Vendor ID : 0x8086
    Device ID : 0x2250
    Subsystem ID : 0x2500
    Coprocessor Stepping ID : 3
    PCIe Width : x16
    PCIe Speed : 5 GT/s
    PCIe Max payload size : 256 bytes
    PCIe Max read req size : 512 bytes
    Coprocessor Model : 0x01
    Coprocessor Model Ext : 0x00
    Coprocessor Type : 0x00
    Coprocessor Family : 0x0b
    Coprocessor Family Ext : 0x00
    Coprocessor Stepping : B1
    Board SKU : B1PRQ-5110P/5120D
    ECC Mode : Enabled
    SMC HW Revision : Product 225W Passive CS
Cores
    Total No of Active Cores : 60
    Voltage : 1004000 uV
    Frequency : 1052631 kHz
GDDR
    GDDR Vendor : Elpida
    GDDR Version : 0x1
    GDDR Density : 2048 Mb
    GDDR Size : 7936 MB
    GDDR Technology : GDDR5
    GDDR Speed : 5.000000 GT/s
    GDDR Frequency : 2500000 kHz
    GDDR Voltage : 1501000 uV
The Intel Manycore Platform Software Stack (MPSS)[2] is necessary to run the Intel Xeon Phi Coprocessor. MPSS 3.4.1 was released on October 22, 2014. The kernel version on Aesyle is 2.6.32-358 (RHEL/CentOS 6.4), which is supported by MPSS 3.4.1. Here we document the installation of MPSS 3.4.1 on Aesyle[3].
Download MPSS 3.4.1 and unpack the tar ball:
# cd /scratch/
# wget http://registrationcenter.intel.com/irc_nas/4862/mpss-3.4.1-linux.tar
# tar xvf mpss-3.4.1-linux.tar
Remove previous installation of Intel MPSS:
# cd mpss-3.4.1
# ./uninstall.sh
# rm -rf /var/mpss/*
If not present, generate a pair of SSH keys for root:
# cd ~/.ssh/
# ssh-keygen -t rsa
Install MPSS:
# cp ./modules/*`uname -r`*.rpm .
# yum install *.rpm
Load the mic.ko driver, and then initialize MPSS Default Settings:
# modprobe mic
# micctrl --cleanconfig
# micctrl --initdefaults
Update Flash & SMC:
1. Check the status of the coprocessors:
# micctrl -s
If the status for all of the coprocessors is not ready, reset the coprocessor(s):
# micctrl -rw
2. Run:
# /usr/bin/micflash -update -device all
3. Reboot for all flash and SMC changes to take effect.
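The "check the status, reset if not ready" step can be scripted. The sketch below is a hypothetical helper, not part of MPSS: it reads status lines on standard input and succeeds only if every coprocessor reports the wanted state. The line format it parses (`mic0: ready`) is an assumption about what `micctrl -s` prints and should be checked against the real output first.

```shell
#!/bin/sh
# all_in_state STATE (hypothetical helper): read status lines on stdin and
# succeed only if at least one line was seen and every line reports STATE.
# ASSUMED line format, not taken from this page:
#   mic0: ready
#   mic1: online (mode: linux image: ...)
all_in_state() {
    awk -v want="$1" '
        NF == 0 { next }                      # ignore blank lines
        { seen++; if ($2 != want) bad++ }     # field 2 is the state word
        END { exit !(seen > 0 && bad == 0) }
    '
}

# Intended use on the host (not run here):
#   micctrl -s | all_in_state ready || micctrl -rw
```

Keeping the parser separate from the `micctrl` call makes it possible to try it on captured output before trusting it to trigger a reset.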
After reboot, the MPSS service should start automatically. If not, run:
# chkconfig mpss on
# service mpss start
Let's first decode some jargon[4]:
- HCA: Host Channel Adapter for InfiniBand (IB)
- OpenFabrics Enterprise Distribution (OFED): open-source software for remote direct memory access (RDMA) and kernel bypass applications.
- Symmetric Communication Interface Framework (SCIF)[5]
  - Sockets-like API for communication between processes on the MIC and the host within the same system
  - The SCIF API provides both send-receive semantics and Remote Memory Access (RMA) semantics
- Coprocessor Communication Link (CCL)
  - Enables the MIC to use IB directly and enables processes on the MIC to talk to the HCA
  - An IB proxy through which all privileged operations are staged
  - The proxy resides on the host and makes requests on behalf of the process running on the MIC
  - Data movement calls from the process on the MIC can be made directly to the HCA using PCIe peer-to-peer copies
- IB-SCIF
  - Intel MPSS implementation of IB verbs over the SCIF API
  - Allows processes to use the verbs API over a virtual HCA, as the underlying operations are handled using SCIF
# lspci | grep InfiniBand
41:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
# ibstatus
Infiniband device 'mlx4_0' port 1 status:
    default gid: fe80:0000:0000:0000:0002:c903:002b:89eb
    base lid: 0xd3
    sm lid: 0x1
    state: 4: ACTIVE
    phys state: 5: LinkUp
    rate: 40 Gb/sec (4X QDR)
    link_layer: InfiniBand
RHEL/CentOS 6 includes Infiniband device drivers, verbs and MPI support; but it doesn't track OFED releases – Red Hat takes upstream packages directly, and takes kernel code from upstream kernels. For Xeon Phi to take advantage of InfiniBand, we need to manually compile and install an OFED distribution that supports Intel MPSS – see Chapter 2 of MPSS User's Guide[6].
# yum install libtool flex tcl-devel
1. Download OFED 1.5.4.1:
# wget https://www.openfabrics.org/downloads/OFED/ofed-1.5.4/OFED-1.5.4.1.tgz
# tar xvf OFED-1.5.4.1.tgz
# cd OFED-1.5.4.1
2. Install the OFED stack:
# perl install.pl
During installation, select:
- Option 2 (Install OFED Software)
- Option 4 (Customize)
- ...exclude kernel-ib*, *-debuginfo and dapl* packages...
- ...exclude MPI packages...
- "Install 32-bit packages? [y/N]", answer N
- "Enable ROMIO support [Y/n]", answer Y
- "Enable shared library support [Y/n]", answer Y
- "Enable Checkpoint-Restart support [Y/n]", answer N
# cd /scratch/mpss-3.4.1/
# cp ofed/modules/*`uname -r`*.rpm ofed
# rpm -Uvh ofed/*.rpm
NOTES
- In this step we install the ofed-driver-`uname -r`-3.4.1-1.x86_64 package, which provides enhanced OFED drivers that support Intel MPSS on the host. Those kernel modules are located in /lib/modules/`uname -r`/updates/; while the stock IB kernel modules, provided by the kernel-`uname -r` package, are located in /lib/modules/`uname -r`/kernel/.
- The header files for the enhanced OFED drivers, provided by the ofed-driver-devel-`uname -r`-3.4.1-1.x86_64 package, are installed in /usr/src/ofed-driver/. We'll use those headers, in order to compile Lustre clients for both the host and the Phi coprocessors.
A virtual TCP/IP network connection between the host and the Intel Xeon Phi coprocessor is created over the PCIe bus. By default, the network addresses for the coprocessors are:
- Host side address of first coprocessor (mic0): 172.31.1.254
- IP address of first coprocessor (mic0): 172.31.1.1
- Host side address of second coprocessor (mic1): 172.31.2.254
- IP address of second coprocessor (mic1): 172.31.2.1
- The 2 Phi coprocessors can't talk to each other directly. One consequence is that we won't be able to run an MPI program utilizing both coprocessors.
- We won't be able to mount the home NFS share on the coprocessors – The NFS share is exported to 2 subnets: 10.6.0.0/16 (Private GbE) & 10.7.0.0/16 (Private 10GbE).
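The NFS limitation is a plain subnet-membership question: the default coprocessor addresses fall outside both export subnets. A minimal sketch of that check follows; the helper names are hypothetical, and 10.7.7.20 is the address that this page later assigns to mic0.

```shell
#!/bin/sh
# Hypothetical helpers illustrating why 172.31.x.x cannot mount the share.

# ip_to_int A.B.C.D: print the address as a 32-bit integer
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1                 # split the dotted quad into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_subnet IP NETWORK BITS: succeed if IP lies inside NETWORK/BITS
in_subnet() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# exported_to IP: is IP covered by either NFS export subnet named above?
exported_to() {
    in_subnet "$1" 10.6.0.0 16 || in_subnet "$1" 10.7.0.0 16
}

for addr in 172.31.1.1 172.31.2.1 10.7.7.20; do
    if exported_to "$addr"; then
        echo "$addr: inside an export subnet"
    else
        echo "$addr: outside the export subnets"
    fi
done
```

Only the last address passes, which is why the coprocessors need addresses on the 10.7.0.0/16 network before they can mount /home.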
Let's work out a better network configuration!
Here is the old configuration for em2 (10GbE interface):
# cat /etc/sysconfig/network-scripts/ifcfg-em2
DEVICE=em2
HWADDR=90:b1:1c:45:4d:76
IPADDR=10.7.7.2
NETMASK=255.255.0.0
BOOTPROTO=none
ONBOOT=yes
MTU=9000
We'll create a network bridge with 3 ports, em2, mic0 & mic1, on the host.
# service mpss stop
# umount /home
# umount /trove
# ifdown em2
# micctrl --addbridge=br0 --type=external --ip=10.7.7.2 --netbits=16 --mtu=9000
We still have to manually add em2 to the bridge. Here is the modified configuration for em2:
# cat /etc/sysconfig/network-scripts/ifcfg-em2
DEVICE=em2
TYPE=Ethernet
HWADDR=90:b1:1c:45:4d:76
BOOTPROTO=none
ONBOOT=yes
BRIDGE=br0
MTU=9000
Configure the virtual network interfaces on the Phi coprocessors:
# micctrl --network=static --bridge=br0 --ip=10.7.7.20 mic0
# micctrl --network=static --bridge=br0 --ip=10.7.7.21 mic1
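With more than two coprocessors, the per-card commands follow an obvious pattern. The sketch below only prints the `micctrl` lines, using the same flags as above, so they can be reviewed before being run; `bridge_cmds` and its argument order are hypothetical.

```shell
#!/bin/sh
# bridge_cmds BRIDGE PREFIX FIRST_HOST COUNT (hypothetical helper)
# Print, but do not run, one micctrl line per coprocessor, giving micN the
# address PREFIX.(FIRST_HOST + N).
bridge_cmds() {
    n=0
    while [ "$n" -lt "$4" ]; do
        echo "micctrl --network=static --bridge=$1 --ip=$2.$(( $3 + n )) mic$n"
        n=$(( n + 1 ))
    done
}

# Reproduces the two commands used on Aesyle
bridge_cmds br0 10.7.7 20 2
```

Printing rather than executing is a deliberate choice here: a wrong address on a bridged interface can collide with another node on the 10GbE network.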
Apply the new configurations:
# service network restart
Now that br0 has taken over em2's old IP address, we can remount the NFS shares on the host:
# mount /home
# mount /trove
Add the NFS mount /home to the Phi coprocessors:
# rm -rf /var/mpss/mic?/home/*
# micctrl --addnfs=10.7.7.1:/export/home --dir=/home --option=noatime,nosuid,nolock,soft
which will append the following line to /etc/fstab on the coprocessors:
10.7.7.1:/export/home /home nfs noatime,nosuid,nolock,soft 1 1
Start the Intel MPSS service:
# service mpss start
The NFS share is now mounted on the coprocessors and seems to be working fine!
Modify /etc/mpss/ipoib.conf to look as follows:
ipoib_enabled=yes
mic0_ib0="10.8.7.20 netmask 255.255.0.0"
mic1_ib0="10.8.7.21 netmask 255.255.0.0"
Start Xeon Phi coprocessor specific OFED service on the host:
# chkconfig ofed-mic on
# service ofed-mic start
COMMENTS
- RHEL/CentOS 6 uses /etc/init.d/rdma to load/unload InfiniBand kernel modules, while the ofed-driver-`uname -r`-3.4.1-1.x86_64 package provides /etc/init.d/openibd to perform essentially the same tasks. Only one script is needed. Either one can be disabled, e.g., by running chkconfig --del rdma.
- Here we use CCL-Direct and IPoIB, which currently only works with OFED-1.5.4.1 on the Mellanox mlx4 driver and hardware.
- Since we use CCL-Direct, probably we don't need the ccl-proxy service (mpxyd)?
As of October 2014, the Terascala Lustre Storage runs Lustre server 2.1.5, and almost all nodes in the Hyades cluster run Lustre client 1.8.9. Here we document how to install the latest feature release (2.6.0) of the Lustre client on both the host (Aesyle) and the Phi coprocessors.
Install dependencies:
# yum install libselinux-devel
Unmount the Lustre file system (/pfs):
# service lustre stop
Uninstall Lustre 1.8.9:
# rpm -e --noscripts lustre-modules lustre
Download the source RPM for latest feature release (2.6.0) of Lustre client:
$ wget --no-check-certificate https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el6/client/SRPMS/lustre-client-2.6.0-2.6.32_431.20.3.el6.x86_64.src.rpm
Rebuild RPMs for Lustre client:
$ rpmbuild --rebuild --define "configure_args --with-o2ib=/usr/src/ofed-driver" --without servers lustre-client-2.6.0-2.6.32_431.20.3.el6.x86_64.src.rpm
NOTE We now use Intel MPSS OFED on Aesyle, so Lustre should be compiled against the Intel MPSS OFED headers (located in /usr/src/ofed-driver/). The option --define "configure_args --with-o2ib=/usr/src/ofed-driver" passes --with-o2ib=/usr/src/ofed-driver to configure when building the RPMs.
Install Lustre client on the host:
# cd ~dong/rpmbuild/RPMS/x86_64/
# rpm -Uvh lustre-client-modules-2.6.0-2.6.32_358.el6.x86_64.x86_64.rpm \
    lustre-client-2.6.0-2.6.32_358.el6.x86_64.x86_64.rpm
Some warnings will be printed, but they can be safely ignored:
WARNING: /lib/modules/2.6.32-358.el6.x86_64/kernel/drivers/infiniband/hw/ipath/ib_ipath.ko needs unknown symbol ib_wq
WARNING: /lib/modules/2.6.32-358.el6.x86_64/updates/drivers/infiniband/ulp/srpt/ib_srpt.ko needs unknown symbol scst_unregister
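Note that the source RPM is named after kernel 2.6.32_431.20.3, whereas the rebuilt binary RPMs are named after the running kernel, with the "-" in the kernel release turned into "_" (RPM release strings cannot contain a hyphen). The sketch below predicts the file names to expect after a rebuild; `rebuilt_rpm_name` is a hypothetical helper, and taking the architecture from the last dot-separated component of the kernel release is an assumption that holds for el6 kernels.

```shell
#!/bin/sh
# rebuilt_rpm_name PACKAGE VERSION [KERNEL_RELEASE] (hypothetical helper)
# Print the file name expected from rpmbuild --rebuild for the given kernel
# release (default: the running kernel, from uname -r).
rebuilt_rpm_name() {
    krel=${3:-$(uname -r)}
    arch=${krel##*.}          # ASSUMPTION: release ends in ".<arch>", e.g. x86_64
    echo "$1-$2-$(echo "$krel" | sed 's/-/_/g').$arch.rpm"
}

rebuilt_rpm_name lustre-client-modules 2.6.0 2.6.32-358.el6.x86_64
rebuilt_rpm_name lustre-client 2.6.0 2.6.32-358.el6.x86_64
```

The two lines printed are the file names passed to `rpm -Uvh` above.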
Remount the Lustre file system on the host:
# service lustre start
We mostly follow the guide on how to cross-compile Lustre client for Xeon Phi[7], with some slight variations so as to use the latest MPSS and Lustre releases.
Download Software for Coprocessor OS (k1om):
# cd /scratch/
# wget http://registrationcenter.intel.com/irc_nas/4862/mpss-3.4.1-k1om.tar
Download MPSS source:
# wget http://registrationcenter.intel.com/irc_nas/4862/mpss-src-3.4.1.tar
Unpack the tar balls:
# tar xvf mpss-3.4.1-k1om.tar
# tar xvf mpss-src-3.4.1.tar
Prepare the Linux kernel source code:
# tar xvfj ./mpss-3.4.1/src/linux-2.6.38+mpss3.4.1.tar.bz2
which will create a new directory ./linux-2.6.38+mpss3.4.1 containing the Linux kernel source code.
# rpm2cpio ./mpss-3.4.1/k1om/kernel-dev-2.6.38+mpss3.4.1-1.knightscorner.rpm | cpio -idmv
which will create a new directory ./boot containing the files needed to build new kernel modules for Xeon Phi.
# cp ./boot/config-2.6.38.8+mpss3.4.1 ./linux-2.6.38+mpss3.4.1/.config
# cp ./boot/Module.symvers-2.6.38.8+mpss3.4.1 ./linux-2.6.38+mpss3.4.1/Module.symvers
# cd ./linux-2.6.38+mpss3.4.1/
# make modules_prepare
# cd ..
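Before starting the long Lustre build, it may be worth confirming that the kernel tree really received the two files copied above. The following is a small sanity check, offered as a sketch; `kernel_tree_ready` is a hypothetical helper and checks only for the presence of the files, not their contents.

```shell
#!/bin/sh
# kernel_tree_ready DIR (hypothetical helper): succeed if DIR contains a
# non-empty .config and Module.symvers, the two files copied in above.
kernel_tree_ready() {
    for f in .config Module.symvers; do
        [ -s "$1/$f" ] || { echo "missing or empty: $1/$f" >&2; return 1; }
    done
}

# Intended use (not run here):
#   kernel_tree_ready /scratch/linux-2.6.38+mpss3.4.1 && ./build-phi.sh
```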
Retrieve the Lustre source code:
# git clone git://git.whamcloud.com/fs/lustre-release.git
# cd lustre-release
# git checkout b2_6
Create the build script /scratch/build-phi.sh:
#!/bin/bash
set -e
BUILD_DIR=`readlink -f $PWD`
DEST_DIR=${BUILD_DIR}/lustre-root
MPSS_DIR=${BUILD_DIR}/mpss-3.4.1
SCM_DIR=${BUILD_DIR}/lustre-release
mkdir -p ${DEST_DIR}
export ARCH=k1om
source /opt/mpss/3.4.1/environment-setup-k1om-mpss-linux
export LD=k1om-mpss-linux-ld
cd ${SCM_DIR}
sh autogen.sh
./configure $CONFIGURE_FLAGS \
    --disable-tests --disable-doc --disable-server \
    --with-o2ib=/usr/src/ofed-driver/ \
    --with-linux=${BUILD_DIR}/linux-2.6.38+mpss3.4.1
make
make install DESTDIR=${DEST_DIR}
cd ${DEST_DIR}
mv ./opt/lustre/2.*/k1om-mpss-linux/* .
rm -rf ./opt/lustre
tar cvzf ${BUILD_DIR}/lustre-phi.tar.gz ./
cd ${BUILD_DIR}
Cross-compile Lustre client for Xeon Phi:
# chmod +x build-phi.sh
# ./build-phi.sh
which will create a new lustre-phi.tar.gz tarball.
Test Lustre client on Xeon Phi:
# scp lustre-phi.tar.gz mic0:/
# ssh mic0
[root@mic0]# cd /
[root@mic0]# tar xvzf lustre-phi.tar.gz
[root@mic0]# depmod
[root@mic0]# echo 'options lnet networks=o2ib0(ib0)' >> /etc/modprobe.d/lustre.conf
[root@mic0]# modprobe lnet
[root@mic0]# lctl network up
[root@mic0]# mkdir /pfs
mount -t lustre failed:
# mount -t lustre 10.8.8.142@o2ib0:10.8.8.143@o2ib0:/pfs /pfs
mount: mounting 10.8.8.142@o2ib0:10.8.8.143@o2ib0:/pfs on /pfs failed: Invalid argument
But mount.lustre seems to work fine:
# /sbin/mount.lustre 10.8.8.142@o2ib0:10.8.8.143@o2ib0:/pfs /pfs
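The device argument in both commands is a list of server NIDs joined with ":", followed by ":/" and the file system name. A minimal sketch of how that string is put together follows; `lustre_device` is a hypothetical helper, and reading the two addresses as a primary and a failover server is my interpretation, not something stated on this page.

```shell
#!/bin/sh
# lustre_device FSNAME NID... (hypothetical helper): join the NIDs with ":"
# and append ":/FSNAME", the device syntax used in the mount commands above.
lustre_device() {
    fs=$1; shift
    dev=
    for nid in "$@"; do
        dev=${dev:+$dev:}$nid     # add ":" only between NIDs
    done
    echo "$dev:/$fs"
}

lustre_device pfs 10.8.8.142@o2ib0 10.8.8.143@o2ib0

# Intended use on a coprocessor (not run here):
#   /sbin/mount.lustre "$(lustre_device pfs 10.8.8.142@o2ib0 10.8.8.143@o2ib0)" /pfs
```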
Automate Lustre client on Xeon Phi:
Add Lustre client to the root file system on Xeon Phi:
# tar xvfz lustre-phi.tar.gz -C /var/mpss/common/
# mkdir -p /var/mpss/common/etc/modprobe.d
# # echo 'options lnet networks=o2ib0(ib0)' >> /var/mpss/common/etc/modprobe.d/lustre.conf
# mkdir /var/mpss/common/pfs
# rm -f /var/mpss/common/etc/init.d/lnet
Create an init script for Lustre client on Xeon Phi (/var/mpss/common/etc/init.d/lustre):
#!/bin/sh
# system init for lustre
let err=0
case "$1" in
start)
    echo -n " Starting lustre ... "
    /sbin/mount.lustre 10.8.8.142@o2ib0:10.8.8.143@o2ib0:/pfs /pfs || let err++
    echo "Done."
    ;;
stop)
    echo -n " Stopping lustre ... "
    fuser -k /pfs/ /pfs/*
    umount /pfs &> /dev/null
    /usr/sbin/lustre_rmmod
    echo "Done."
    ;;
restart)
    $0 stop && $0 start || exit 1
    ;;
status)
    mount | grep lustre
    [ $? -ne 0 ] && echo "Lustre is not mounted"
    ;;
esac
exit $err
Modify the init script /etc/init.d/ofed-mic on host:
$ssh $1 /etc/init.d/lustre start &> /dev/null
2. Add the following line to the beginning of function stop_mic():
$ssh $1 /etc/init.d/lustre stop &> /dev/null
3. Replace the following line in start():
ip address add 192.0.2.100/24 dev mic0 label mic0:ib
with
ip address add 192.0.2.100/24 dev br0 label br0:ib
4. Replace the following line in stop():
ip address del 192.0.2.100/24 dev mic0 2>/dev/null
with
ip address del 192.0.2.100/24 dev br0 2>/dev/null
NOTE The last 2 changes are necessary because mic0 is now a port on br0 and can't be assigned an IP address.
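The two `ip address` replacements can also be applied with sed instead of by hand. The sketch below writes the rewritten script to standard output and leaves the input file untouched, so the result can be compared with diff before it is installed; `retarget_ib_alias` is a hypothetical helper, and it assumes the lines in /etc/init.d/ofed-mic read exactly as quoted above.

```shell
#!/bin/sh
# retarget_ib_alias FILE (hypothetical helper): print FILE with the
# 192.0.2.100/24 alias moved from mic0 to br0. Only the two "ip address"
# lines quoted above are touched; FILE itself is not modified.
retarget_ib_alias() {
    sed -e 's|\(ip address add 192\.0\.2\.100/24 dev \)mic0 label mic0:ib|\1br0 label br0:ib|' \
        -e 's|\(ip address del 192\.0\.2\.100/24 dev \)mic0|\1br0|' \
        "$1"
}

# Intended use on the host (not run here):
#   retarget_ib_alias /etc/init.d/ofed-mic > /tmp/ofed-mic.new
#   diff -u /etc/init.d/ofed-mic /tmp/ofed-mic.new
```

Anchoring the patterns on the full `ip address ... 192.0.2.100/24 dev` prefix keeps other mentions of mic0 in the script unchanged.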
Create a symbolic link /opt/intel on the coprocessors, pointing to /pfs/sw/intel, where the Intel compilers and Intel MPI are installed:
# rm -rf /var/mpss/mic?/opt
# cd /var/mpss/common/opt/
# ln -s /pfs/sw/intel
Restart the mpss service:
# service mpss restart
Restart the ofed-mic service:
# service ofed-mic restart
Not sure if we need the ccl-proxy service. Let's start it nonetheless:
# chkconfig --add mpxyd
# service mpxyd start
Voila! InfiniBand and Lustre client now appear to be fully working on both the host and the coprocessors!
When running applications directly on Xeon Phi coprocessors (native mode), we usually take the following steps[8]:
- Compile the application for native execution.
- Build required libraries for native execution.
- Copy the executable and any dependencies, such as runtime libraries, to the target hardware.
- Mount file shares to the target hardware for accessing input data sets and saving output data sets.
- Connect to the target hardware via console, set up the environment, and run the application.
There is a hurdle to overcome, though. The SSH server on the coprocessors' embedded Linux was compiled with PATH=/usr/bin:/bin:/usr/sbin:/sbin, and Bash on the embedded Linux does not source the ~/.bashrc file in non-login, non-interactive SSH sessions:
[aesyle]$ ssh mic0 echo \$PATH
/usr/bin:/bin:/usr/sbin:/sbin
So we can't use ~/.bashrc to set environment variables like PATH and LD_LIBRARY_PATH. To run the sample MPI "Hello world" program in native mode, we would have to do something like the following:
[aesyle]$ ssh mic0 \
    PATH=/usr/bin:/bin:/pfs/sw/intel/impi/4.1.3.045/mic/bin \
    LD_LIBRARY_PATH=/pfs/sw/intel/composer_xe_2013_sp1.1.106/compiler/lib/mic:/pfs/sw/intel/impi/4.1.3.045/mic/lib \
    mpirun -n 60 /pfs/dong/mpi_hello.k1om
This is tiresome! One possible fix is to use ~/.ssh/environment to set environment variables. For this to work, we enable PermitUserEnvironment in /etc/ssh/sshd_config on the embedded Linux (the default is no):
PermitUserEnvironment yes
but then every user would have to modify his or her own ~/.ssh/environment file, which is not ideal; ordinary users want it to just work. OpenSSH does not offer a way to set the environment globally by itself, but we can use the PAM module pam_env.so to achieve the goal. If we append the following line to the default /etc/pam.d/sshd on the coprocessors,
session required pam_env.so readenv=1
SSH sessions will read /etc/environment to set environment variables. This is a better solution, and the one I prefer!
Modify the root file system on the coprocessors:
# mkdir /var/mpss/common/etc/pam.d
Create /var/mpss/common/etc/pam.d/sshd that reads as follows:
#%PAM-1.0
auth include common-auth
account required pam_nologin.so
account include common-account
password include common-password
session optional pam_keyinit.so force revoke
session include common-session
session required pam_loginuid.so
session required pam_limits.so
session required pam_env.so readenv=1
Create /var/mpss/common/etc/environment:
PATH=/usr/bin:/bin:/usr/sbin:/sbin:/opt/intel/impi/4.1.3.045/mic/bin
LD_LIBRARY_PATH=/opt/intel/composer_xe_2013_sp1.1.106/compiler/lib/mic:/opt/intel/impi/4.1.3.045/mic/lib
Restart mpss to apply the new settings to the coprocessors:
# /etc/init.d/ofed-mic stop
# /etc/init.d/mpss restart
# /etc/init.d/ofed-mic start
Now we can run the sample MPI "Hello world" program in native mode, with a much simpler command:
[aesyle]$ ssh mic0 mpirun -n 60 /pfs/dong/mpi_hello.k1om
Now that we've fixed the non-login, non-interactive Bash shell, we'll turn to the interactive login Bash shell. By default, PATH for an interactive login shell is /usr/local/bin:/usr/bin:/bin on the coprocessors. So to run the sample MPI "Hello world" program interactively in native mode, we would have to do something like the following:
[aesyle]$ ssh mic1
[mic1]$ export PATH=$PATH:/pfs/sw/intel/impi/4.1.3.045/mic/bin
[mic1]$ export LD_LIBRARY_PATH=/pfs/sw/intel/composer_xe_2013_sp1.1.106/compiler/lib/mic
[mic1]$ mpirun -n 60 /pfs/dong/mpi_hello.k1om
[mic1]$ exit
which can be easily fixed as well!
Modify the root file system on the coprocessors:
# mkdir /var/mpss/common/etc/profile.d
Create /var/mpss/common/etc/profile.d/intel.sh that reads as follows:
export PATH=/usr/bin:/bin:/usr/sbin:/sbin:/opt/intel/impi/4.1.3.045/mic/bin
export LD_LIBRARY_PATH=/opt/intel/composer_xe_2013_sp1.1.106/compiler/lib/mic:/opt/intel/impi/4.1.3.045/mic/lib
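/etc/environment and /etc/profile.d/intel.sh carry the same two values in slightly different syntax, so they can drift apart when a compiler or MPI version changes. One way to keep them in step is to generate both from a single definition, as in the sketch below; `emit_env`, `PHI_PATH` and `PHI_LIBS` are names invented for this example.

```shell
#!/bin/sh
# Values as used on this page; PHI_PATH and PHI_LIBS are hypothetical names.
PHI_PATH=/usr/bin:/bin:/usr/sbin:/sbin:/opt/intel/impi/4.1.3.045/mic/bin
PHI_LIBS=/opt/intel/composer_xe_2013_sp1.1.106/compiler/lib/mic:/opt/intel/impi/4.1.3.045/mic/lib

# emit_env STYLE (hypothetical helper)
#   pam      plain KEY=VALUE lines, the form used in /etc/environment
#   profile  "export KEY=VALUE" lines, the form used in profile.d/intel.sh
emit_env() {
    case "$1" in
        pam)     prefix= ;;
        profile) prefix='export ' ;;
        *)       echo "usage: emit_env pam|profile" >&2; return 1 ;;
    esac
    echo "${prefix}PATH=$PHI_PATH"
    echo "${prefix}LD_LIBRARY_PATH=$PHI_LIBS"
}

emit_env pam
emit_env profile

# Intended use on the host (not run here):
#   emit_env pam     > /var/mpss/common/etc/environment
#   emit_env profile > /var/mpss/common/etc/profile.d/intel.sh
```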
Restart mpss to apply the new settings to the coprocessors:
# /etc/init.d/ofed-mic stop
# /etc/init.d/mpss restart
# /etc/init.d/ofed-mic start
Now it is much easier to run the sample MPI "Hello world" program interactively in native mode:
[aesyle]$ ssh mic1
[mic1]$ mpirun -n 60 /pfs/dong/mpi_hello.k1om
[mic1]$ exit
In the intel_mic module, we define the environment variable I_MPI_MIC=enable to enable MPI communication between the host and the coprocessors[9].
We set:
PATH=/usr/linux-k1om-4.7/bin:$PATH
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/intel/composer_xe_2013_sp1.1.106/compiler/lib/mic:/opt/intel/impi/4.1.3.045/mic/lib/mic/lib
In Symmetric Execution Mode, mpirun will by default replicate the host's environment variables to the coprocessors. Setting LD_LIBRARY_PATH as such will allow the MPI processes to find the appropriate shared libraries on the coprocessors; thus we can use much shorter commands to run programs in symmetric mode. Otherwise, we would have to do something like the following:
[aesyle]$ mpirun -genv I_MPI_FABRICS shm:tcp \
    -n 2 -host `hostname` /pfs/dong/mpi_hello.x86-64 : \
    -env PATH /pfs/sw/intel/impi/4.1.3.045/mic/bin \
    -env LD_LIBRARY_PATH /pfs/sw/intel/composer_xe_2013_sp1.1.106/compiler/lib/mic:/pfs/sw/intel/impi/4.1.3.045/mic/lib \
    -n 60 -host mic0 /pfs/dong/mpi_hello.k1om : \
    -env PATH /pfs/sw/intel/impi/4.1.3.045/mic/bin \
    -env LD_LIBRARY_PATH /pfs/sw/intel/composer_xe_2013_sp1.1.106/compiler/lib/mic:/pfs/sw/intel/impi/4.1.3.045/mic/lib \
    -n 60 -host mic1 /pfs/dong/mpi_hello.k1om
We also set the following for Offload Execution Mode:
MIC_ENV_PREFIX=MIC
MIC_LD_LIBRARY_PATH=/opt/intel/composer_xe_2013_sp1.1.106/compiler/lib/mic
MIC_KMP_AFFINITY=scatter
"By default, all environment variables defined in the environment of an executing CPU program are replicated to the coprocessor's execution environment when an offload occurs. You can modify this behavior by defining the environment variable MIC_ENV_PREFIX. When you set MIC_ENV_PREFIX to a specific prefix, then not all CPU environment variables are replicated to the coprocessor, but only those environment variables that begin with the value of the MIC_ENV_PREFIX environment variable. The environment variables set on the coprocessor have the prefix value removed. You thus have independent control of OpenMP, Intel Cilk Plus, and other execution environments that use common environment variable names."[10]
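The filtering rule in the quotation can be modelled in a few lines. The sketch below is a simplified illustration only, not the offload runtime's actual behavior: it assumes the prefix is separated from the variable name by an underscore, it skips the control variable itself, and it ignores any special cases the runtime may apply to particular variables. `forwarded_env` is a hypothetical name, and the thread counts in the sample input are made up.

```shell
#!/bin/sh
# forwarded_env PREFIX (hypothetical helper, simplified model): read
# KEY=VALUE lines on stdin and print those whose key starts with "PREFIX_",
# with that prefix removed. PREFIX_ENV_PREFIX itself is skipped.
# ASSUMPTION: an underscore separates the prefix from the variable name.
forwarded_env() {
    awk -F= -v p="${1}_" '
        index($1, p) == 1 && $1 != (p "ENV_PREFIX") {
            print substr($0, length(p) + 1)
        }'
}

# Sample host environment; the OMP_NUM_THREADS values are illustrative only
printf '%s\n' \
    'MIC_ENV_PREFIX=MIC' \
    'MIC_KMP_AFFINITY=scatter' \
    'OMP_NUM_THREADS=12' \
    'MIC_OMP_NUM_THREADS=240' | forwarded_env MIC
```

In this model the host-only OMP_NUM_THREADS=12 is not forwarded, while the MIC_-prefixed settings arrive as KMP_AFFINITY=scatter and OMP_NUM_THREADS=240, which is the "independent control" the quotation describes.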
1. Intel Xeon Phi Coprocessor 5110P (8GB, 1.053 GHz, 60 core)
2. Intel Manycore Platform Software Stack (MPSS)
3. Intel Manycore Platform Software Stack MPSS 3.4.1 README
4. Communication in a HPC cluster with MIC
5. Symmetric Communications Interface (SCIF) User's Guide
6. Intel MPSS User's Guide
7. How to cross-compile Lustre client for Xeon Phi
8. Building a Native Application for Intel Xeon Phi Coprocessors
9. Using the Intel MPI Library on Intel Xeon Phi Coprocessor Systems
10. Setting Environment Variables on the CPU to Modify the Coprocessor's Execution Environment