NOMAD Troubleshooting - aichemy-hub/docs GitHub Wiki

NOMAD Troubleshooting

NOMAD server is not accessible from inside the lab

Enter http://192.160.103.85 into the browser (this link should also be bookmarked). If you can't reach NOMAD, either the server or the proxy may be offline. To tell which, open http://aichemy-nmr.ch.ic.ac.uk on a computer connected to the college network (i.e. not one of the lab computers). If that page loads, the server is running and it is probably the proxy that is offline.
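The same check can be made from a terminal. This is a minimal sketch, assuming curl is installed on the machine you are testing from; the probe function is a hypothetical helper, not part of NOMAD.

```shell
# Print "reachable" or "unreachable" for a URL, giving up after 5 seconds.
# Any HTTP response counts as reachable; only connection failures do not.
probe() {
    if curl --silent --max-time 5 --output /dev/null "$1" 2>/dev/null; then
        echo reachable
    else
        echo unreachable
    fi
}

# On a lab computer:              probe http://192.160.103.85
# On a college-network computer:  probe http://aichemy-nmr.ch.ic.ac.uk
```

If the second probe reports reachable while the first reports unreachable, that points at the proxy rather than the server.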

To verify the proxy is working, log in to serv-10 and run ps aux | grep caddy to see if caddy is running. If it is running, you should see something like /usr/local/bin/caddy run --config /etc/caddy/Caddyfile in the output. If it is not running, run:

/usr/local/bin/caddy run --config /etc/caddy/Caddyfile &

which will launch it in the background.
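The check and the launch can be combined so that caddy is only started when it is missing. This is a sketch under the assumption that the caddy process always shows up in ps aux as "caddy run"; has_caddy and ensure_caddy are hypothetical helper names.

```shell
# Succeeds if the `ps aux` output on stdin contains a "caddy run" process.
# Writing the pattern as [c]addy stops the grep process from matching itself.
has_caddy() {
    grep -q '[c]addy run'
}

# Start caddy in the background only if it is not already running.
ensure_caddy() {
    if ps aux | has_caddy; then
        echo "caddy is already running"
    else
        /usr/local/bin/caddy run --config /etc/caddy/Caddyfile &
        echo "caddy started"
    fi
}
```

Run ensure_caddy on serv-10 after defining both functions.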

NOMAD server is not accessible from outside the lab

The virtual machine running the NOMAD server will reboot from time to time for automatic updates. The Docker containers running the NOMAD server should restart automatically when this happens.

To check that the Docker containers are running, SSH into the server with ssh <your-username>@aichemy-nmr.ch.ic.ac.uk, then run:

> sudo docker container ls

CONTAINER ID   IMAGE                    COMMAND                  CREATED        STATUS         PORTS                                         NAMES
01ce7d6b1728   nomadnmr/server:v3.5.5   "/docker-entrypoint.…"   2 months ago   Up 9 seconds   0.0.0.0:80->80/tcp, [::]:80->80/tcp           nomad-server-1
ad6d58835a13   nomadnmr/api:v3.5.5      "docker-entrypoint.s…"   2 months ago   Up 9 seconds   0.0.0.0:8080->8080/tcp, [::]:8080->8080/tcp   nomad-api-1
70d466e5f2bb   mongo                    "docker-entrypoint.s…"   7 months ago   Up 9 seconds   27017/tcp                                     nomad-mongodb-1

If you don't see the three containers listed above (the IDs will be different), navigate to /nomad, then start Docker and bring the containers back up:

sudo systemctl start docker
sudo docker compose up -d

You can check the status of the Docker daemon by running:

sudo systemctl status docker
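Rather than reading the listing by eye, the container names can be compared against the three expected ones. This is a rough sketch that assumes the names stay as shown in the listing above (nomad-server-1, nomad-api-1, nomad-mongodb-1); missing_containers is a hypothetical helper.

```shell
# Container names taken from the example listing; adjust if yours differ.
expected="nomad-server-1 nomad-api-1 nomad-mongodb-1"

# Reads running container names on stdin (one per line) and prints
# every expected name that is absent. Prints nothing if all are present.
missing_containers() {
    running=$(cat)
    for name in $expected; do
        printf '%s\n' "$running" | grep -qxF -- "$name" || echo "$name"
    done
}

# Usage on the server:
#   sudo docker container ls --format '{{.Names}}' | missing_containers
```

Any name it prints is a container that is not running and needs the restart steps above.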

The storage monitor page is not available

The doku storage and container monitoring page should be available at http://aichemy-nmr.ch.ic.ac.uk:9090/site/.

If it isn't, SSH to the server with ssh <your-username>@aichemy-nmr.ch.ic.ac.uk then run:

sudo docker container ls

There should be an entry for amerkurev/doku with status "Up n days". If the entry is missing or not up, restart the container by navigating to /doku-dashboard/ and running:

sudo docker compose down
sudo docker compose up -d
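The status check and the restart can be scripted together. This is a sketch only, assuming the image name begins with amerkurev/doku as described above; doku_is_up and restart_doku_if_down are hypothetical helper names.

```shell
# Reads lines of the form "<image> <status>" on stdin and succeeds only if
# an amerkurev/doku container reports a status beginning with "Up".
doku_is_up() {
    grep -q '^amerkurev/doku[^ ]* Up '
}

# Restart the doku stack only when its container is not up.
restart_doku_if_down() {
    if sudo docker container ls --format '{{.Image}} {{.Status}}' | doku_is_up; then
        echo "doku is up"
    else
        (cd /doku-dashboard/ && sudo docker compose down && sudo docker compose up -d)
    fi
}
```

The subshell around cd keeps your own working directory unchanged after the restart.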

Problems submitting jobs to an NMR instrument

Check that you are logged in as the default user on IconNMR

If not, log in as the default user.

Check that the server can see the client on the instrument machine

  • First, check that the instrument is connected to the server by going to the NOMAD UI at http://192.160.103.85 and looking at the instrument list, which shows whether each instrument is connected. If the instrument is connected, this is not the issue.
  • If the instrument is not connected, check whether the nomad client is running on the instrument computer. Run sudo systemctl status nomad-client. If the status is "inactive (dead)" or similar, try starting the service again with sudo systemctl start nomad-client.
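The second step can be sketched as a small script for the instrument computer. It relies on systemctl is-active, which prints a one-word state such as active, inactive or failed; needs_start and check_client are hypothetical helper names.

```shell
# Succeeds when the given systemd state means the service should be started.
needs_start() {
    [ "$1" != "active" ]
}

# Start nomad-client only if systemd does not report it as active.
check_client() {
    # is-active exits non-zero for anything but "active", hence the || true.
    state=$(systemctl is-active nomad-client || true)
    if needs_start "$state"; then
        echo "nomad-client is $state, starting it"
        sudo systemctl start nomad-client
    else
        echo "nomad-client is running"
    fi
}
```

After running check_client, reload the instrument list in the NOMAD UI to confirm the instrument now shows as connected.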

Cannot npm install new versions of nomad-client

Check that you have a proxy tunnel open

Make sure you have a terminal window open in which you are running ssh -D 8080 <username>@serv10 to enable the proxy tunnel.
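To confirm the tunnel is actually listening, you can inspect the local listening sockets. This is a sketch that assumes the ss utility is available on the instrument machine and that the tunnel uses port 8080 as in the command above; tunnel_listening is a hypothetical helper.

```shell
# Reads `ss -ltn` output on stdin and succeeds if something is listening
# on local port 8080, the port passed to ssh -D.
tunnel_listening() {
    grep -Eq '(127\.0\.0\.1|\[::1\]):8080[[:space:]]'
}

# Usage:
#   if ss -ltn | tunnel_listening; then echo "tunnel open"; else echo "tunnel closed"; fi
```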

Make sure https-proxy is set in npm config

Run npm config get https-proxy. The output should be socks5h://127.0.0.1:8080, matching the port of the proxy tunnel. If it isn't, run npm config set https-proxy=socks5h://127.0.0.1:8080.
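The check and the fix can be combined into one step. This sketch assumes the proxy should point at the port opened by ssh -D 8080; expected_proxy, proxy_needs_update and fix_proxy are hypothetical names.

```shell
# Assumed target value: the local SOCKS port opened by `ssh -D 8080`.
expected_proxy='socks5h://127.0.0.1:8080'

# Succeeds when the given value differs from the expected proxy setting.
proxy_needs_update() {
    [ "$1" != "$expected_proxy" ]
}

# Set https-proxy only if it is wrong, then print the value now in effect.
fix_proxy() {
    current=$(npm config get https-proxy)
    if proxy_needs_update "$current"; then
        npm config set https-proxy="$expected_proxy"
    fi
    npm config get https-proxy
}
```

Call fix_proxy before retrying npm install.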
