Parallel Processing - shivamvats/notes GitHub Wiki

Cuda

  1. Multi Process Service (MPS): For efficient sharing of a GPU among multiple processes. Typically, if multiple processes share a GPU, the GPU schedules their tasks serially, one after another. MPS allows multiple processes to run on a GPU concurrently. To use it, start the MPS control daemon with nvidia-cuda-mps-control before launching the processes.

    • Commands::

      sudo nvidia-smi -c 3 -i 6,7               # set compute mode to EXCLUSIVE_PROCESS on GPUs 6 and 7
      export CUDA_VISIBLE_DEVICES=6,7
      nvidia-cuda-mps-control -d                # start the MPS control daemon
      export CUDA_VISIBLE_DEVICES=0,1; ./run    # Note that 6,7 are remapped to 0,1
      
    • Note: When CUDA_VISIBLE_DEVICES is set before launching the control daemon, the devices are remapped by the MPS server. This means that if your system has devices 0, 1 and 2, and CUDA_VISIBLE_DEVICES is set to “0,2”, then a client connecting to the server will see the remapped devices: device 0 and device 1. Therefore, keeping CUDA_VISIBLE_DEVICES set to “0,2” when launching the client would lead to an error.

    • For detailed documentation, see mps-doc.

OpenACC

OpenACC is a directive-based framework for parallelization: you annotate loops and regions with compiler directives, and the compiler generates the parallel (e.g. GPU-offloaded) code.

Nvidia's OpenACC bootcamp materials are at nvidia-bootcamp.

Multithreading

Good beginner's reference: https://www.openmp.org/wp-content/uploads/omp-hands-on-SC08.pdf

Multithreading in STL

Use Intel's parallelstl, an implementation of the C++17 parallel algorithms (execution policies): https://github.com/intel/parallelstl
