troubleshooting - skoriche/NGIAB-Calibration-DevCon25 GitHub Wiki
Troubleshooting Guide
This page contains solutions to common issues you might encounter during the workshop.
Installation Issues
Docker Permission Issues (Linux)
If you get permission denied errors when running Docker:
sudo usermod -aG docker $USER
# Then log out and back in, or run:
newgrp docker
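To confirm that the group change has taken effect, it can help to list the groups of the current session. The following is a minimal sketch for a POSIX shell; `list_contains` is a helper name introduced here for illustration and is not part of Docker.

```shell
# Return 0 if the space-separated list in $1 contains the exact word in $2.
# (list_contains is a hypothetical helper, not a Docker command.)
list_contains() {
    for item in $1; do
        [ "$item" = "$2" ] && return 0
    done
    return 1
}

# Groups of the current login session. A group added with usermod only
# appears here after logging back in or running newgrp.
session_groups=$(id -nG 2>/dev/null) || true

if list_contains "$session_groups" docker; then
    echo "This session is in the docker group."
else
    echo "This session is not in the docker group yet."
fi
```

If the second message appears after running `usermod`, the session most likely predates the change, so logging out and back in should resolve it.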
UV Command Not Found
If the uvx command is not recognized:
# To add $HOME/.local/bin to your PATH, either restart your shell or run:
source $HOME/.local/bin/env       # sh, bash, zsh
source $HOME/.local/bin/env.fish  # fish
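To see which of the two cases applies (uvx missing versus PATH not updated), a short check like the one below may be useful. It is a sketch for a POSIX shell; `path_contains` is a hypothetical helper introduced for this example.

```shell
# Return 0 if the colon-separated PATH string in $1 contains directory $2.
# (path_contains is a hypothetical helper introduced for this check.)
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if command -v uvx >/dev/null 2>&1; then
    echo "uvx found at: $(command -v uvx)"
elif path_contains "$PATH" "$HOME/.local/bin"; then
    echo "$HOME/.local/bin is on PATH but uvx was not found; uv may not be installed there."
else
    echo "$HOME/.local/bin is not on PATH; source the env file or restart the shell."
fi
```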
Docker Not Starting
Windows/Mac:
- Ensure Docker Desktop is running (check system tray)
- Restart Docker Desktop from the menu
Linux:
sudo systemctl start docker
sudo systemctl enable docker
Calibration Issues
Calibration Fails to Start
- Check Docker is running:
docker ps
If this fails, Docker isn't running properly.
- Verify data structure:
ls -la provo-10154200/ # Should show: calibration/ config/ forcings/ outputs/
- Check error logs:
tail -n 50 provo-10154200/calibration/Output/Calibration_Run/ngen_*/ngen.log
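The data structure check above can also be scripted so that missing folders are named explicitly. This is a minimal sketch, assuming the four subdirectories listed in the "Should show" comment are the complete set; `missing_subdirs` and the `DATA_DIR` override are names introduced here for illustration.

```shell
# Print the name of each expected subdirectory that is missing from $1.
# The four names are taken from the "Should show" comment in this guide.
missing_subdirs() {
    for sub in calibration config forcings outputs; do
        [ -d "$1/$sub" ] || echo "$sub"
    done
}

# DATA_DIR is an optional override (an assumption of this sketch).
data_dir=${DATA_DIR:-provo-10154200}
missing=$(missing_subdirs "$data_dir")

if [ -z "$missing" ]; then
    echo "All expected subdirectories exist in $data_dir"
else
    echo "Missing in $data_dir:" $missing
fi
```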
High Computational Time
- Start with fewer iterations for testing (-i 4 or -i 10)
- Check Docker resource allocation:
- Docker Desktop: Preferences → Resources → Increase CPU/Memory
- Linux: Check available system resources with htop or free -h
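To see how many CPUs and how much memory Docker itself can use, `docker info` can be queried directly. The sketch below assumes Docker is installed and running; `bytes_to_gib` is a hypothetical helper added only to make the memory figure readable.

```shell
# Convert a byte count to GiB with one decimal place.
# (bytes_to_gib is a hypothetical helper for this example.)
bytes_to_gib() {
    awk -v b="$1" 'BEGIN { printf "%.1f\n", b / 1073741824 }'
}

if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
    cpus=$(docker info --format '{{.NCPU}}')
    mem=$(docker info --format '{{.MemTotal}}')
    echo "Docker can use $cpus CPUs and $(bytes_to_gib "$mem") GiB of memory"
else
    echo "Docker is not reachable; start it before checking resources."
fi
```

On Docker Desktop these numbers reflect the limits set under Resources, so they may be lower than what the host machine has.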
Poor Calibration Results
Symptoms: Objective function not improving, parameters not converging
Solutions:
- Increase iterations: Try -i 200 or more
- Check parameter bounds: Edit calibration/ngen_cal_conf.yaml
- Verify observation data quality:
head -20 calibration/obs_hourly_discharge.csv
- Adjust calibration period: Use different start/end dates
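Beyond looking at the first rows, a rough check of observation data quality is to count rows with no value. The sketch below assumes the file is comma-separated with one header row and the observed value in the last column; that layout is an assumption, so adjust it to the actual file. `count_blank_values` and `OBS_FILE` are names introduced here.

```shell
# Count non-empty data rows (header skipped) whose last comma-separated
# field is blank. Assumes the observed value is in the last column.
count_blank_values() {
    awk -F, 'NR > 1 && NF > 0 && $NF ~ /^[ \t\r]*$/ { n++ } END { print n + 0 }' "$1"
}

# OBS_FILE is an optional override (an assumption of this sketch).
obs_file=${OBS_FILE:-calibration/obs_hourly_discharge.csv}

if [ -f "$obs_file" ]; then
    echo "Rows with a blank value in $obs_file: $(count_blank_values "$obs_file")"
else
    echo "Not found: $obs_file (run this from inside the data directory)"
fi
```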
File and Permission Issues
File Permission Errors
NGIAB runs as root and ngiab-cal runs as user 1000:1000 by default, which can cause permission issues.
Fix existing permission issues:
# Fix ownership of data directory
sudo chown -R $USER:$USER provo-10154200/
# Or more specifically for calibration outputs
sudo chown -R $USER:$USER provo-10154200/calibration/Output/
Prevent permission issues:
# When running Docker manually, add user flag
docker run --user $(id -u):$(id -g) ...
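Before changing ownership, it may be worth checking whether any files are actually owned by another user. This is a minimal sketch using `find`; `count_foreign_files` and the `DATA_DIR` override are names introduced for illustration.

```shell
# Count entries under $1 that are not owned by the current user.
# (count_foreign_files is a hypothetical helper for this example.)
count_foreign_files() {
    find "$1" ! -user "$(id -u)" 2>/dev/null | awk 'END { print NR }'
}

# DATA_DIR is an optional override (an assumption of this sketch).
data_dir=${DATA_DIR:-provo-10154200}

if [ -d "$data_dir" ]; then
    n=$(count_foreign_files "$data_dir")
    if [ "$n" -gt 0 ]; then
        echo "$n entries in $data_dir belong to another user; the chown fix applies."
    else
        echo "Everything in $data_dir is owned by the current user."
    fi
else
    echo "Directory not found: $data_dir"
fi
```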
Cannot Access Files
If you can't read/write files in the mounted directories:
- Ensure the path is absolute when mounting volumes in Docker
- Check that the directory exists before running
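Both checks above can be combined by resolving the directory to an absolute path, which fails if the directory does not exist. The sketch below assumes a POSIX shell; `abs_dir` and the `DATA_DIR` override are names introduced here.

```shell
# Print the absolute path of directory $1, or fail if it does not exist.
# (abs_dir is a hypothetical helper for this example.)
abs_dir() {
    (cd "$1" 2>/dev/null && pwd)
}

# DATA_DIR is an optional override (an assumption of this sketch).
if data_dir=$(abs_dir "${DATA_DIR:-provo-10154200}"); then
    echo "Absolute path for the volume mount: $data_dir"
else
    echo "Directory does not exist; create or extract it before starting the container."
fi
```

The printed path can then go on the host side of a `-v` mount. The container side of the mount depends on the image and is not shown here.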
Docker Issues
Docker Memory Issues
Symptoms: Container killed, out of memory errors
Solutions:
- Docker Desktop: Increase memory allocation in Preferences → Resources
- Linux: Check available memory with free -h
- Reduce the simulation period or catchment size
Docker Image Download Fails
Symptoms: Timeout or connection errors when pulling images
Solutions:
- Check internet connection
- Try pulling manually:
docker pull awiciroh/ngiab-cal
docker pull awiciroh/ciroh-ngen-image
- Use a different network or VPN if behind a firewall
Container Exits Immediately
Check the container logs:
docker logs <container_id>
Common causes:
- Missing required files
- Incorrect mount paths
- Configuration syntax errors
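When the logs are empty or unclear, the container's exit code can narrow down the cause. The sketch below reads it with `docker inspect` and turns it into a short hint; `exit_code_hint` and the `CONTAINER_ID` variable are placeholders introduced for this example. Codes above 128 follow the usual shell convention of 128 plus the signal number, so treat the hint as a likely cause rather than a certain one.

```shell
# Turn a container exit code into a short hint.
# (exit_code_hint is a hypothetical helper for this example.)
exit_code_hint() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "exited normally"
    elif [ "$code" -gt 128 ]; then
        echo "likely terminated by signal $((code - 128))"
    else
        echo "exited with error code $code"
    fi
}

# CONTAINER_ID is a placeholder; set it to an ID listed by `docker ps -a`.
if [ -n "${CONTAINER_ID:-}" ] && command -v docker >/dev/null 2>&1; then
    code=$(docker inspect --format '{{.State.ExitCode}}' "$CONTAINER_ID")
    echo "Container $CONTAINER_ID $(exit_code_hint "$code")"
fi
```

An exit code of 137 corresponds to signal 9 (SIGKILL), which is what the kernel sends when a process runs out of memory, so in that case the Docker Memory Issues section is a good next stop.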
Data Issues
Missing or Corrupt Data Files
Symptoms: Errors about missing files, unexpected EOF
Solutions:
- Re-download the data:
wget https://communityhydrofabric.s3.us-east-1.amazonaws.com/example_data/provo-10154200.tar.gz
tar -xzf provo-10154200.tar.gz
- Verify file integrity:
# Check if files exist and have content
ls -lah provo-10154200/forcings/
ls -lah provo-10154200/config/
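An "unexpected EOF" error often points to an interrupted download, which can be checked on the archive itself before extracting it again. This is a minimal sketch using `gzip -t` and `tar -tzf`; `archive_ok` and the `ARCHIVE` override are names introduced for illustration.

```shell
# Return 0 if $1 is a readable gzip-compressed tar archive.
# (archive_ok is a hypothetical helper for this example.)
archive_ok() {
    gzip -t "$1" 2>/dev/null && tar -tzf "$1" >/dev/null 2>&1
}

# ARCHIVE is an optional override (an assumption of this sketch).
archive=${ARCHIVE:-provo-10154200.tar.gz}

if [ ! -f "$archive" ]; then
    echo "Archive not found: $archive"
elif archive_ok "$archive"; then
    echo "Archive looks intact: $archive"
else
    echo "Archive is truncated or corrupt; download it again."
fi
```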
Jetstream VM Issues
Cannot Connect to VM
- Verify IP address is correct
- Try verbose SSH:
ssh -v exouser@IP-ADDRESS
Common Error Messages
"No such file or directory"
- Check you're in the correct directory
- Verify paths in commands are correct
- Ensure data has been downloaded/extracted
"Permission denied"
- See File Permission Errors above
- Check Docker permissions
"Command not found"
- For uvx: See UV Command Not Found
- For docker: Ensure Docker is installed and in PATH
Getting Additional Help
If you're still experiencing issues:
- Check existing solutions: Search GitHub Issues
- Create a new issue: Include:
- Error message (full text)
- Command you ran
- System information (OS, Docker version)
- Steps to reproduce
- Contact instructors:
- During workshop: Ask in person or via chat
- Email: [email protected], [email protected]
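The system information requested for a new issue can be gathered with a few standard commands. The sketch below prints the OS and Docker version in a form that can be pasted into a report; `collect_sysinfo` is a name introduced here for illustration.

```shell
# Print basic system details for a bug report.
# (collect_sysinfo is a hypothetical helper for this example.)
collect_sysinfo() {
    echo "OS: $(uname -sr)"
    if command -v docker >/dev/null 2>&1; then
        echo "Docker: $(docker --version)"
    else
        echo "Docker: not found on PATH"
    fi
}

collect_sysinfo
```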