HPC FRAU - vanhoan310/tips GitHub Wiki

  1. Error: ImportError: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.29' not found
     Fix:
     export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
     export PYTHONNOUSERSITE=1

Details: Good catch, and this is exactly the kind of thing that happens on HPC. Nothing "mysterious" changed in your environment. The error tells us where it went wrong.


🔴 Why the error came back (key insight)

You already proved once that things were correct:

libstdc++.so.6 => .../conda/envs/vrp/lib/.../libstdc++.so.6

But now, when you run:

(vrp) python test.py

you are getting:

ImportError: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.29' not found

👉 That means:

This shell is NOT using Conda's libstdc++. The dynamic linker has fallen back to the system one again.

This usually happens because one of these is true:

  1. You are on a different node (login vs compute)
  2. LD_LIBRARY_PATH is not set in this shell
  3. A module load / module purge reset the environment
  4. You opened a new terminal where the fix was not applied
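
The second cause is quick to rule out from the shell. A minimal sketch, assuming bash; the function name conda_lib_is_first is made up for illustration and is not part of Conda:

```shell
# Hypothetical helper: succeeds only if the first LD_LIBRARY_PATH entry
# is the lib directory of the currently active Conda environment.
conda_lib_is_first() {
    # An empty CONDA_PREFIX means no environment is active in this shell.
    [ -n "${CONDA_PREFIX:-}" ] || return 1
    first="${LD_LIBRARY_PATH:-}"
    # Keep only the text before the first ':' (the first search entry).
    first="${first%%:*}"
    [ "$first" = "$CONDA_PREFIX/lib" ]
}

if conda_lib_is_first; then
    echo "OK: $CONDA_PREFIX/lib is first in LD_LIBRARY_PATH"
else
    echo "WARNING: Conda lib directory is not first in LD_LIBRARY_PATH"
fi
```

If the warning appears in a shell where the fix was applied earlier, one of the other three causes (a different node, a module command, or a new terminal) is the likely culprit.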

✅ Confirm where you are (important)

Run right now:

hostname

If you see something like:

  • login, a0632, frontend → ❌ login node
  • gpuXXX, cnXXX, nodeXXX → ✅ compute node
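
The same check can be scripted. This is only a sketch: the function name node_kind is hypothetical, and the patterns simply mirror the example hostnames above, so they need adjusting to the cluster's real naming scheme.

```shell
# Hypothetical helper: guess the node type from a hostname.
# The patterns are assumptions taken from the examples above.
node_kind() {
    case "$1" in
        login*|frontend*|a0632) echo "login" ;;
        gpu*|cn*|node*)         echo "compute" ;;
        *)                      echo "unknown" ;;
    esac
}

# Fall back to `uname -n` if the hostname command is unavailable.
node_kind "$(hostname 2>/dev/null || uname -n)"

# Slurm sets SLURM_JOB_ID inside an allocation, which is a second hint.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    echo "inside Slurm job $SLURM_JOB_ID"
fi
```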

Running PyTorch Geometric + SciPy on login nodes often breaks exactly like this.


✅ Immediate fix (works right now)

In this same terminal, do:

export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
export PYTHONNOUSERSITE=1

Then test again:

python test.py

🔍 Verify before running Python (recommended habit)

Run this every time something feels off:

ldd $(python - << 'EOF'
import scipy.spatial._ckdtree as m
print(m.__file__)
EOF
) | grep libstdc++

✅ Good:

.../conda/envs/vrp/lib/libstdc++.so.6

❌ Bad:

/lib64/libstdc++.so.6
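
To avoid reading the ldd output by eye, the good/bad decision can be automated. A rough sketch, assuming the usual "name => path (address)" layout of ldd lines; the function name libstdcxx_from_prefix is hypothetical:

```shell
# Hypothetical helper: read `ldd` output on stdin and report whether
# libstdc++ resolves inside the given prefix (normally $CONDA_PREFIX).
# Returns 0 for good, 1 for bad, 2 if libstdc++ is not listed at all.
libstdcxx_from_prefix() {
    prefix="$1"
    # Expected line shape: libstdc++.so.6 => /some/path/libstdc++.so.6 (0x...)
    resolved=$(awk '$1 ~ /^libstdc\+\+\.so/ && $2 == "=>" { print $3; exit }')
    case "$resolved" in
        "")          echo "libstdc++ not found in ldd output"; return 2 ;;
        "$prefix"/*) echo "good: $resolved" ;;
        *)           echo "bad: $resolved"; return 1 ;;
    esac
}

# Usage (so_file is the path printed by the Python snippet above):
#   ldd "$so_file" | libstdcxx_from_prefix "$CONDA_PREFIX"
```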

🛡️ Make the fix permanent (STRONGLY recommended)

Add this to ~/.bashrc, after conda initialize:

# Prevent pip --user pollution
export PYTHONNOUSERSITE=1

# Ensure Conda C++ runtime is used (HPC fix)
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH

Then:

source ~/.bashrc
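
One caveat with the ~/.bashrc lines: $CONDA_PREFIX is expanded at the moment ~/.bashrc runs, so it points at whatever environment is active then (often base, or nothing at all), not necessarily vrp. A guarded variant that is called after each conda activate could look like the sketch below. The function name use_conda_libs is hypothetical, not a Conda command.

```shell
# Hypothetical helper: put the active environment's lib directory at the
# front of LD_LIBRARY_PATH, doing nothing if it is already first.
use_conda_libs() {
    if [ -z "${CONDA_PREFIX:-}" ]; then
        echo "no Conda environment active; LD_LIBRARY_PATH unchanged" >&2
        return 1
    fi
    case "${LD_LIBRARY_PATH:-}" in
        "$CONDA_PREFIX/lib"|"$CONDA_PREFIX/lib":*) ;;   # already first
        # Avoid a trailing ':'; an empty entry means the current directory.
        "") export LD_LIBRARY_PATH="$CONDA_PREFIX/lib" ;;
        *)  export LD_LIBRARY_PATH="$CONDA_PREFIX/lib:$LD_LIBRARY_PATH" ;;
    esac
    # Keep packages installed with pip --user out of sys.path.
    export PYTHONNOUSERSITE=1
}

# Usage:
#   conda activate vrp && use_conda_libs
```

Conda also sources scripts placed in $CONDA_PREFIX/etc/conda/activate.d/ when an environment is activated, which is another place where such exports could live.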

🧠 Why this keeps happening on FRAU / HPC

  • Login nodes and compute nodes often have different loaders
  • module load cuda may reset LD_LIBRARY_PATH
  • Conda's libraries do not always take precedence unless forced
  • SciPy wheels require a newer GLIBCXX than the system provides

This is not a mistake on your side; it's normal HPC behavior.
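
The GLIBCXX point can be checked directly, because libstdc++ stores its supported version tags as plain strings inside the file. A minimal sketch, assuming a grep that supports -a and -o (GNU grep does); the function names are hypothetical:

```shell
# Hypothetical helper: list the GLIBCXX version tags embedded in a
# libstdc++ file. -a reads the binary as text, -o prints only the matches.
glibcxx_versions() {
    grep -a -o 'GLIBCXX_[0-9][0-9.]*' "$1" | sort -u
}

# Succeeds if the library advertises exactly the requested tag.
has_glibcxx() {
    glibcxx_versions "$1" | grep -q -x -F "GLIBCXX_$2"
}

# Compare the system library with the one in the active Conda environment.
for lib in /lib64/libstdc++.so.6 "${CONDA_PREFIX:-/nonexistent}/lib/libstdc++.so.6"; do
    if [ -r "$lib" ]; then
        if has_glibcxx "$lib" 3.4.29; then
            echo "$lib: has GLIBCXX_3.4.29"
        else
            echo "$lib: lacks GLIBCXX_3.4.29"
        fi
    fi
done
```

If the system library lacks the tag and the Conda one has it, the ImportError is explained by which of the two files the loader picks up.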


✅ Best practice workflow (bulletproof)

Interactive GPU work

srun --partition=gpu --gres=gpu:1 --pty bash
module purge
module load cuda/12.1
conda activate vrp
export PYTHONNOUSERSITE=1
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
python test.py

Batch jobs (sbatch)

Always include:

export PYTHONNOUSERSITE=1
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
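
Put together, a batch script might look like the sketch below. The job name and the file name vrp_job.sh are placeholders; the partition, module and environment names are copied from the interactive example and may differ on your cluster. The line that sources conda.sh is an assumption: batch shells often do not read ~/.bashrc, so conda activate may otherwise be undefined.

```shell
# Write a hypothetical Slurm job script; submit it with: sbatch vrp_job.sh
cat > vrp_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=vrp-test
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1

module purge
module load cuda/12.1

# Make `conda activate` available in a non-interactive shell.
source "$(conda info --base)/etc/profile.d/conda.sh"
conda activate vrp

# These must come after activation so CONDA_PREFIX points at vrp.
export PYTHONNOUSERSITE=1
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH

python test.py
EOF
```

The exports are placed after module load and conda activate on purpose, so that neither command can overwrite them afterwards.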

🟢 Final reassurance

  • ❌ This is not a PyTorch Geometric bug
  • ❌ This is not a SciPy bug
  • ❌ This is not a broken install
  • ✅ This is dynamic linker precedence on HPC

Once $CONDA_PREFIX/lib comes first in LD_LIBRARY_PATH and the ldd check above shows Conda's libstdc++, this error should not occur.


  2. Error: OSError: libcudart.so.11.0: cannot open shared object file: No such file or directory
     Fix: use the CPU build of torch to install torch_sparse... (it seems only the last command line below is needed)
     PyTorch version: 2.1.0+cu121
     CUDA version (compiled): 12.1
     cuDNN version: 8902

https://github.com/Lotfollahi-lab/nichecompass-reproducibility/tree/main
conda install cudatoolkit
conda install cudnn
pip install "nichecompass[all]==0.2.0"
pip install torch-geometric
pip install torch_sparse torch_scatter torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.1.0+cpu.html
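
The -f URL in the last command encodes both the torch version and the build variant. The libcudart.so.11.0 error presumably means the extension wheels were built for a different CUDA version than the installed torch (2.1.0+cu121), so the variant in the URL matters. The sketch below only builds the URL string; the function name pyg_wheel_url is hypothetical, the URL pattern is inferred from the command above, and whether an index page exists for a given variant should be checked on data.pyg.org.

```shell
# Hypothetical helper: build the wheel index URL for a torch version
# and a build variant (cpu, cu121, ...), following the pattern above.
pyg_wheel_url() {
    echo "https://data.pyg.org/whl/torch-$1+$2.html"
}

# Split a version string such as 2.1.0+cu121 into its two parts.
torch_version="2.1.0+cu121"     # the value reported above
base="${torch_version%%+*}"     # text before the '+': 2.1.0
variant="${torch_version#*+}"   # text after the '+': cu121

pyg_wheel_url "$base" "$variant"
pyg_wheel_url "$base" cpu       # the CPU index used in the command above
```

In practice the version string could be read with python -c "import torch; print(torch.__version__)" instead of being hard-coded.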