Automated reboot if network connection drops - SkycoinProject/skywire GitHub Wiki


Restart the nodes using an automated script

Currently deprecated and not working.

This guide assumes that you have read and understood readme.md, that Skywire is already installed, and that you can connect via SSH or use the terminal in the web interface.


Introduction

On rare occasions the network connection of a Skywire node drops, and this often goes unnoticed. This guide shows how to set up a check that reboots your node when its network connection drops.

Requirements

  • Skywire already installed
  • SSH connection

Setup

Log in via SSH or open a terminal in the web interface, then create the reboot script:

cd ~
nano check_network.sh

and paste the following content:

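#!/bin/bash
# Send one echo request to 8.8.8.8 and wait at most 1 second for the reply.
if ping -q -c 1 -W 1 8.8.8.8 >/dev/null; then
  echo "IPv4 is up, the network connection is alive"
else
  echo "IPv4 is down, the network connection is broken."
  # Full path, because cron may run with a PATH that does not include /sbin.
  /sbin/reboot
fi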

Save via ctrl+x and y, then make the script executable via chmod +x check_network.sh.

Please note that the network connection is verified only by sending a single ping to the DNS server 8.8.8.8. If 8.8.8.8 itself is unreachable, the node ends up in a constant reboot loop. This can be improved by using fping and sending ping requests to multiple servers.
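The multi-server idea can be sketched with plain ping instead of fping. The following is a minimal sketch, not a tested replacement for the script above: the TARGETS list (8.8.8.8 plus the public resolvers 1.1.1.1 and 9.9.9.9) and the function names probe and network_is_up are assumptions chosen for illustration.

```shell
#!/bin/bash
# Sketch: treat the network as down only if every target fails to answer.
# TARGETS and both function names are illustrative assumptions.
TARGETS="${TARGETS:-8.8.8.8 1.1.1.1 9.9.9.9}"

# Send one echo request with a 1 second timeout; succeeds if a reply arrives.
probe() {
  ping -q -c 1 -W 1 "$1" >/dev/null 2>&1
}

# Return 0 as soon as any target answers, 1 if none of them do.
network_is_up() {
  local target
  for target in ${TARGETS}; do
    if probe "${target}"; then
      return 0
    fi
  done
  return 1
}
```

In check_network.sh, the condition of the if statement could then become network_is_up, so that a reboot is triggered only when all targets are unreachable.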

Next, we need to make a crontab entry so the script gets executed every minute to keep the downtime to a minimum in case the network connection drops off:

crontab -e

Choose nano (1) in case you are requested to choose a text editor.

Scroll down to the end of the file and paste this line

* * * * * /root/check_network.sh

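That's it. Your board now verifies its network connection every minute and reboots itself if it cannot ping the DNS server 8.8.8.8. The path /root/check_network.sh assumes that you are logged in as root; adjust it if your home directory differs.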
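Because the check sends a single ping with a one second timeout every minute, one lost packet is enough to trigger a reboot. One possible mitigation is to count consecutive failures and reboot only after a threshold. The sketch below assumes a counter file in /tmp and a threshold of 3; COUNT_FILE, MAX_FAILURES and record_check are hypothetical names, not part of Skywire.

```shell
#!/bin/bash
# Sketch: count consecutive failed checks and signal a reboot at a threshold.
# COUNT_FILE, MAX_FAILURES and record_check are illustrative assumptions.
COUNT_FILE="${COUNT_FILE:-/tmp/check_network.failures}"
MAX_FAILURES="${MAX_FAILURES:-3}"

# Record the result of one check. Argument: "up" or "down".
# Prints "ok" after a successful check, "wait" while below the threshold,
# and "reboot" once MAX_FAILURES consecutive failures have been seen.
record_check() {
  local failures=0
  if [ "$1" = "up" ]; then
    rm -f "${COUNT_FILE}"
    echo "ok"
    return 0
  fi
  if [ -f "${COUNT_FILE}" ]; then
    failures=$(cat "${COUNT_FILE}")
  fi
  failures=$((failures + 1))
  echo "${failures}" > "${COUNT_FILE}"
  if [ "${failures}" -ge "${MAX_FAILURES}" ]; then
    rm -f "${COUNT_FILE}"
    echo "reboot"
  else
    echo "wait"
  fi
}
```

The check script could call record_check up in its success branch and run the reboot in its failure branch only when record_check down prints reboot. With the assumed threshold of 3 and a check every minute, the node would reboot after roughly three minutes without connectivity.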


Enhancement

This section explains how to combine this guide with the restart guide in the usage section. The result is a script that verifies the network connection of the manager and restarts all other nodes if that connection drops.

Restart.sh script

First, set up the restart.sh script as described in the restart guide. Make sure it is executable and located in your home directory.

RSA key exchange

Please skip this section if you performed the one-time upgrade script!

Next, you need to copy the manager's RSA public key to all secondary nodes. Make sure an RSA key pair is present on the manager in ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub; if not, generate one via ssh-keygen -t rsa.
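The key check can be sketched as a small helper that generates a key pair only when none exists. This is a sketch under assumptions: it uses ssh-keygen -t rsa, which creates a user key pair, with a 4096-bit key and an empty passphrase so that unattended cron jobs can use the key. KEY_FILE and ensure_rsa_key are hypothetical names.

```shell
#!/bin/bash
# Sketch: create an RSA key pair for the current user only if none exists.
# KEY_FILE and ensure_rsa_key are illustrative assumptions.
KEY_FILE="${KEY_FILE:-$HOME/.ssh/id_rsa}"

ensure_rsa_key() {
  if [ -f "${KEY_FILE}" ]; then
    echo "RSA key already present: ${KEY_FILE}"
  else
    mkdir -p "$(dirname "${KEY_FILE}")"
    chmod 700 "$(dirname "${KEY_FILE}")"
    # -N "" sets an empty passphrase; -f selects the output file.
    ssh-keygen -q -t rsa -b 4096 -N "" -f "${KEY_FILE}"
    echo "Generated new RSA key: ${KEY_FILE}"
  fi
}
```

Calling ensure_rsa_key on the manager before distributing the key leaves an existing key untouched. An empty passphrase means anyone with access to the manager's home directory can log in to the secondary nodes, so this trade-off is an assumption you should weigh for your setup.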

Now distribute the manager's RSA public key to the secondary nodes. Create the script with nano rsa_dist.sh, then paste the following content. Adjust HOSTS if your IP addresses differ, and USERNAME if it is not root:

#!/bin/bash
USERNAME=root
HOSTS="192.168.0.3 192.168.0.4 192.168.0.5 192.168.0.6 192.168.0.7 192.168.0.8 192.168.0.9"
for HOST in ${HOSTS}; do
    # One echo request per host, wait at most 5 seconds for the reply.
    if ping -W 5 -c 1 "${HOST}" &> /dev/null; then
        echo "ping received from ${HOST}"
        echo "Exchange RSA key"
        # Append the manager's public key to authorized_keys on the node.
        cat ~/.ssh/id_rsa.pub | ssh "${USERNAME}@${HOST}" "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys"
    else
        echo "No connection to host ${HOST}"
        echo "Could not exchange RSA key, proceeding."
    fi
done

Save via ctrl+x and y. Then make it executable via chmod +x rsa_dist.sh and execute it via ./rsa_dist.sh.

You are now able to log in from the manager to the secondary nodes without entering a password.
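Before relying on this in a cron job, it may be worth confirming that the login really works without a prompt. The sketch below uses the ssh option BatchMode=yes, which makes ssh fail instead of asking for a password, together with ConnectTimeout=5. The function names try_login and check_hosts and the shortened HOSTS list are assumptions for illustration.

```shell
#!/bin/bash
# Sketch: report for each node whether key-based login works.
# try_login, check_hosts and the HOSTS value are illustrative assumptions.
USERNAME=root
HOSTS="192.168.0.3 192.168.0.4"

# BatchMode=yes disables password prompts, so a missing key is a failure.
try_login() {
  ssh -o BatchMode=yes -o ConnectTimeout=5 "${USERNAME}@$1" true
}

# Print one line per host; return 1 if any login failed.
check_hosts() {
  local host failed=0
  for host in ${HOSTS}; do
    if try_login "${host}" 2>/dev/null; then
      echo "OK ${host}"
    else
      echo "FAILED ${host}"
      failed=1
    fi
  done
  return "${failed}"
}
```

Running check_hosts after the key exchange prints one line per node. Any host reported as FAILED would still ask for a password, or is unreachable, and restart.sh would not be able to reach it unattended.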

Reboot.sh script

Create the script that reboots all nodes if the manager's network connection drops off. Open a new file with nano check_network_reboot_all.sh, then paste the following content:

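#!/bin/bash
# Send one echo request to 8.8.8.8 and wait at most 1 second for the reply.
if ping -q -c 1 -W 1 8.8.8.8 >/dev/null; then
  echo "IPv4 is up, the network connection is alive"
else
  echo "IPv4 is down, the network connection is broken."
  # restart.sh is expected in the home directory, whatever the current directory is.
  "$HOME/restart.sh"
fi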

Save via ctrl+x and y, then make the script executable via chmod +x check_network_reboot_all.sh. Proceed by adding check_network_reboot_all.sh to the crontab file of your user via

crontab -e

Choose nano (1) in case you are requested to choose a text editor.

Scroll down to the end of the file and paste this line:

* * * * * /root/check_network_reboot_all.sh

That's it. Your manager now verifies its network connection every minute and reboots itself and all secondary nodes if it cannot ping the DNS server 8.8.8.8.