metis - noma/dm-heom GitHub Wiki

Graph Partitioning with METIS

Since graph partitioning in general is NP-hard, it is not feasible to generate an optimal partitioning. Maybe, for the HEOM graph, there is an efficient algorithm we don't know yet. My current intuition is, even an optimal partitioning will cause a lot of network traffic and won't give us a big increase in performance. So a good one is sufficient.

METIS is a popular tool for partitioning graphs and meshes using a set of different heuristic algorithms. We can use it to partition the hierarchy.

Building/Installing METIS

METIS is well documented, an easy to build. Here is a minimal recipe for METIS 5.1.0.:

# we assume there is a 'Downloads' folder in your home
cd $HOME/Downloads

# download Metis 5.1.0
wget http://glaros.dtc.umn.edu/gkhome/fetch/sw/metis/metis-5.1.0.tar.gz

# unpack
tar xf metis-5.1.0.tar.gz
cd metis-5.1.0/

# installation location, we assume there is a 'Software' folder in your home
export PREFIX='$HOME/Software/metis-5.1.0'

# OPTIONAL: Cray-specific, change to GNU compilers
module switch PrgEnv-cray PrgEnv-gnu
module load cmake

# configure, build and install (uses CMake)
make config shared=1 prefix=$PREFIX
make
make install

# modify ~/.bashrc for easy usage
echo -e "\n# METIS 5.1.0" >> $HOME/.bashrc
echo "export LD_LIBRARY_PATH=$PREFIX/lib:\$LD_LIBRARY_PATH" >> $HOME/.bashrc
echo "export PATH=$PREFIX/bin:\$PATH" >> $HOME/.bashrc

# load changes
source ~/.bashrc

METIS should now be installed within $HOME/Software/metis-5.1.0 and directly usable.

Usage

We use gpmetis which has a number of command line options to explore, see the PDF from the downloaded package, or try:

gpmetis -h

The simplest usage is:

gpmetis <graph_file> <number of partitions>

The number of partitions is the number of compute nodes or MPI ranks we want to run with, i.e. what we specify with the -ncommand line option when using mpirunor aprun to start a parallel job. The graph file is a text-file in a certain adjacency-list format that contains the hierarchy.

HEOM provides a util_graph_to_file which takes a population_dynamics configuration file and writes the corresponding hierarchy graph file. By default it will create a file with the configuration file name and append .graph for the output filename, an optional second argument can be used to specify an arbitrary output file name.

./build_folder/util_graph_to_file <population_dynamics config> [output filename]

gpmetis appends .part.<n>, e.g. .part.16to the specified graph file name to generate the output file name.

NOTE: Whenever you change a configuration file in a way that changes the hierarchy graph (e.g. ado_depth), the graph file and partitioning files must be re-generated. Recommendation: suffix configuration filename, e.g. fmo_22bath_d3.cfg to reflect an ado_depth of 3, so graph and partitioning filenames will clearly reflect this parameter too. Some METIS algorithms use random numbers, so generated partitions might not be reproducible without knowing the seed of the random number generator.

Examples

Let's assume you are in /data and want to create a graph partitioning for a 64 node job for a population_dynamics.cfg in that folder. You could use data/examples_fmo/fmo_population_dynamics_77K.cfg to actually execute this example step by step. We also assume a build folder in your repository.

# create the graph file (output: population_dynamics.cfg.graph)
../build/util_graph_to_file population_dynamics.cfg

# create the partitoning using METIS (output: `population_dynamics.graph.part.64`)
gpmetis population_dynamics.graph 64

You could run app_mpi_population_dynamics now (assuming you have the resources allocated, etc. which depends on the specific machine):

  • with mpirun
# we are in data where a suitable ocl_config.cfg is assumed:
mpirun -n 64 ../build/app_mpi_population_dynamics ocl_config.cfg population_dynamics.cfg population_dynamics.cfg.graph.part.64
  • with aprunon a Cray (assuming we are in an interactive job, or job script)
# we are in heom/cpp
aprun -n 64 -N 1 -cc none ../build/app_mpi_population_dynamics ocl_config.cfg population_dynamics.cfg  population_dynamics.cfg.graph.part.64

Generating lots of configurations

Generate graph files for all *.cfg in current directory, change pattern to match your needs.

for f in *.cfg; do ../build/util_graph_to_file $f; done

Generate partitionings for powers of 2 from 2 to 1024 for all graph files *.cfg.graph using METIS, adapt as needed.

RANKS="2 4 8 16 32 64 128 256 512 1024"; for f in *.cfg.graph; do for r in $RANKS; do gpmetis $f $r; done; done 

Generate partitionings for powers of 2 from 2 to 1024 for a single graph file population_dynamics.cfg.graph:

RANKS="2 4 8 16 32 64 128 256 512 1024"; for r in $RANKS; do gpmetis population_dynamics.cfg.graph $r; done
⚠️ **GitHub.com Fallback** ⚠️