Matt's Titan notes - openucx/ucx GitHub Wiki

Matt Baker's notes on getting UCX working on Titan

On Titan, your scratch space is separated by project name. For the instructions below to work, either set a project name variable or edit the commands to use your project name directly.

export PROJNAME= # put your project name here; 'ls $MEMBERWORK' should list the possible names
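Every path below depends on `$PROJNAME`, so it can help to check it before going further. The following is only a minimal sketch and is not part of the original notes: `check_projname` is a hypothetical helper name, and it assumes `$MEMBERWORK` is already set by the system.

```shell
# Hypothetical helper (not from the UCX tree): confirm that PROJNAME is set
# and that the matching scratch directory exists under $MEMBERWORK.
# On success it prints the work directory used throughout these notes.
check_projname() {
    if [ -z "${PROJNAME:-}" ]; then
        echo "PROJNAME is not set; run 'ls \$MEMBERWORK' to see candidates" >&2
        return 1
    fi
    if [ ! -d "${MEMBERWORK:-}/${PROJNAME}" ]; then
        echo "no such directory: ${MEMBERWORK:-}/${PROJNAME}" >&2
        return 1
    fi
    echo "${MEMBERWORK}/${PROJNAME}/ucx-work"
}
```

One possible use is `workdir=$(check_projname) && mkdir -p "$workdir"`, which creates the work directory only when the project name resolves.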

Download the sources to $MEMBERWORK/$PROJNAME/ucx-work/ ((TODO: Put source links here))

First, swap the PGI programming environment for the GNU programming environment. PGI may work, but it is currently untested.

module swap PrgEnv-pgi PrgEnv-gnu

Some versions of GCC are broken. Version 4.9.1 is known to work, so swap to that version.

module swap gcc gcc/4.9.1

librte: librte is optional; it is not required for running gtest or local utilities. Tools such as perf_test require it to wire up remote nodes.

./configure --prefix=$MEMBERWORK/$PROJNAME/ucx-work/librte --with-pmi=/opt/cray/pmi/default/ --with-pmi-lib=/opt/cray/pmi/default/lib64/ --with-pmi-include=/opt/cray/pmi/default/include/  
make  
make install

UCX build:

./configure --prefix=$MEMBERWORK/$PROJNAME/ucx-work/ --enable-instrumentation --enable-frame-pointer --enable-stats --enable-memtrack --enable-fault-injection --enable-debug-data --without-verbs --with-rte=$MEMBERWORK/$PROJNAME/ucx-work/librte/ CC=cc CXX=CC LDFLAGS="-dynamic"
make -j 32
# If making gtest
make -C test/gtest -j 32
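Because librte is optional, the `--with-rte` flag only makes sense once librte has been installed. Below is a minimal sketch of assembling the configure arguments conditionally. `ucx_configure_args` is a hypothetical helper, and every flag is copied from the configure command above. It is an assumption, implied by the notes but not verified here, that UCX configures cleanly without `--with-rte`.

```shell
# Hypothetical helper: print the UCX configure arguments, one per line.
# $1 is the install prefix (e.g. $MEMBERWORK/$PROJNAME/ucx-work).
# --with-rte is emitted only if a librte install exists under that prefix.
ucx_configure_args() {
    printf '%s\n' "--prefix=$1/" \
        --enable-instrumentation --enable-frame-pointer --enable-stats \
        --enable-memtrack --enable-fault-injection --enable-debug-data \
        --without-verbs
    if [ -d "$1/librte" ]; then
        printf '%s\n' "--with-rte=$1/librte/"
    fi
    printf '%s\n' CC=cc CXX=CC LDFLAGS=-dynamic
}
```

It could be used as `./configure $(ucx_configure_args "$MEMBERWORK/$PROJNAME/ucx-work")`. The unquoted substitution relies on word splitting, so this only works when the prefix contains no whitespace.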

Running:

qsub -I -A (account) -q debug -l nodes=1,walltime=30:00
(Wait for interactive shell)
cd $MEMBERWORK/$PROJNAME/ucx-work/ucx/
make -C test/gtest test LAUNCHER="aprun -n 1 -cc none" UCX_HANDLE_ERRORS="debug"
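The `aprun` launcher only works from inside a job allocation, so running the test target from a login shell will fail. The sketch below adds a guard around the same command. `in_pbs_job` and `run_gtest` are hypothetical names, and the guard assumes the scheduler exports `PBS_JOBID` inside jobs, as Torque/PBS normally does.

```shell
# Hypothetical guard: succeed only when running inside a batch or interactive job.
# Assumes PBS_JOBID is exported by the scheduler inside jobs.
in_pbs_job() {
    [ -n "${PBS_JOBID:-}" ]
}

# Run the gtest target from the notes above, but refuse to start outside a job.
run_gtest() {
    if ! in_pbs_job; then
        echo "not inside a job; start one with qsub -I first" >&2
        return 1
    fi
    make -C test/gtest test LAUNCHER="aprun -n 1 -cc none" UCX_HANDLE_ERRORS="debug"
}
```

Call `run_gtest` from $MEMBERWORK/$PROJNAME/ucx-work/ucx/ once the interactive shell has started.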