Compilation with CUDA support - GeoEnergyLab-EPFL/BigWham GitHub Wiki
As the matrix-vector operation on the hierarchical matrices is memory-bandwidth bounded, an implementation of this operation based on CUDA has been developed to benefit from the high memory bandwidth of GPUs.
To compile BigWham with CUDA support you'll need :
- a BLAS vendor (mainly tested using OpenBlas)
- a C/C++ compiler
- CUDA drivers (mainly tested using cuda 12.6)
- python > 3.10 (mainly tested/developed using python 3.11) if interested in the python binder
1. Setting up the python environment
python -m venv bigwham_env
source bigwham_env/bin/activate
pip install numpy scipy matplotlib
cuda_dev
branch
2. Compiling BigWham on the git checkout cuda_dev
mkdir build
cd build
# Here to build BigWham with Cuda support and OpenBlas as the BLAS vendor :
cmake -DUSE_CUDA=ON -DCMAKE_BUILD_TYPE=Release ..
make -j4
bigwham4py
python module
3. Installing the cd interfaces/python/
pip install .
Using BigWham with CUDA support
Here is a small python script solving a linear system on the BigWham hierarchical matrix with Cuda support enabled, it reads the input mesh from a meshio compatible format (ex : vtk
). The relevant options of the BEMatrix
constructor to enable CUDA are useCuda
, homogeneous_size_pattern
and fixed_rank
, the two first must be set to True
while the fixed_rank
must be a positive integer. The n_GPUs
is optionnal.
import sys
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
from scipy.sparse.linalg import gmres
from bigwham4py import BEMatrix
import meshio
#reading it
print(f"Reading mesh from {sys.argv[1]}")
mesh = meshio.read(sys.argv[1])
coor = mesh.points.astype(np.float64)
conn = mesh.cells_dict['triangle'].astype(np.int64)
dim = coor.shape[1]
num_dof = conn.shape[0]*dim
# Building the hmat
G=1.
nu=0.15
elas_prop = np.array([2*G*(1+nu), nu])
eta=5.0
max_leaf_size = 32
fixed_rank = int(max_leaf_size / 4)
eps_aca = 1.0e-4
num_threads = 64
basekernel = "3DT0-H"
# BigWham hmat with Cuda support for matvec operations
hmat = BEMatrix(
basekernel, coor, conn, elas_prop, max_leaf_size, eta, eps_aca,
homogeneous_size_pattern=True,
fixed_rank=fixed_rank,
useCuda=True,
n_GPUs=1,
n_openMP_threads=num_threads, verbose=True
)
# Traction vector
tinf = np.ones(hmat.shape[0])
# Solve using scipy's GMRES
d,info = gmres(hmat, tinf, M=hmat.H_jacobi_prec(), rtol=1e-6)
# Plot solution
dd = d.reshape((-1, 3))
fig1, ax1 = plt.subplots()
triang=matplotlib.tri.Triangulation(coor[:,0], coor[:,1], triangles=conn, mask=None)
tri=ax1.tripcolor(triang, dd[:,2],cmap = plt.cm.rainbow, alpha = 0.5)
ax1.axis('equal')
plt.colorbar(tri)
plot_file = "bigwham_gmres_solve_GPU.png"
plt.savefig(plot_file)
print(f"Solution plot saved in " + plot_file)