Automated restart of node app - SkycoinProject/skywire GitHub Wiki
This guide assumes that you have read and understood the readme.md, downloaded the official images, and follow every step exactly as described.
How to automatically restart the node app on the official images
Introduction
The node app starts automatically when the Orange Pi Prime boots. It is started by /etc/rebuild.sh via /etc/rc.local. The specific subprocess that hosts the node app is ./sockss.
A number of people in the community have reported that their node app crashed without them taking any action. This guide shows you how to change the official images so that the node app is restarted after a maximum delay of 5 minutes if it crashes.
Changing the images this way does not generate new public keys; it only generates new app keys. The steps in this guide are not mandatory, but they save you from having to monitor the nodes all the time to check whether they are still connected to the discovery server.
Setup
Log in via SSH or open a terminal in the manager node (see How-to open a terminal in the manager). Once that's done, we can get to work:
Crontab script
In this section you will create a script that will be executed by crontab for the root user.
Create the script
First we need to create a directory to store the script. Every 5 minutes the script checks whether the node app is running and restarts it if it is not. This is the content of the script:
#!/bin/bash
# Check if sockss is running.
# The -x flag only matches processes whose name (or command line, if -f is
# specified) exactly matches the pattern. The process name is "sockss",
# without the leading "./".
if pgrep -x "sockss" > /dev/null
then
    echo "Running"
else
    export GOPATH=/usr/local/skywire/go
    cd "$GOPATH/bin" || exit 1
    # Start sockss in the background and record its PID.
    ./sockss -node-address :5000 > /dev/null 2>&1 &
    echo $! > socks.pid
fi
Create a directory for it via
mkdir /etc/cron.5min
then create the script and paste the content from above via
nano /etc/cron.5min/sockss
Once you're done, save the changes with Ctrl+X, then 'y', then Enter. The file should then contain the script shown above.
Apply the script
Next we need to make the newly created script executable via
chmod a+x /etc/cron.5min/sockss
If you didn't tamper with the images beforehand, no crontab exists for the root user yet, so we need to set one up.
To create a new crontab for the root user and use nano as the editor, type
EDITOR=nano crontab -e
The default generated file has no entries; all lines are commented out.
Scroll down to the bottom of the file and add the following line
*/5 * * * * cd / && run-parts /etc/cron.5min
Once you're done, save the changes with Ctrl+X, then 'y', then Enter.
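As an alternative to editing the crontab interactively, the same entry could be added from a script. The following is a minimal sketch, not part of the official images: add_cron_line is a hypothetical helper that reads an existing crontab on stdin and prints it with the entry appended only if it is not already there, so running it twice does not create a duplicate.

```shell
#!/bin/bash
# Hypothetical helper, not part of the official images.
CRON_LINE='*/5 * * * * cd / && run-parts /etc/cron.5min'

# Reads a crontab on stdin and prints it, appending CRON_LINE if missing.
add_cron_line() {
    local current
    current=$(cat)
    # Re-emit the existing entries unchanged.
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    # -F: fixed string, -x: match the whole line, -q: no output.
    if ! printf '%s\n' "$current" | grep -Fxq -- "$CRON_LINE"; then
        printf '%s\n' "$CRON_LINE"
    fi
}

# Usage as root (commented out so nothing is changed by accident):
# crontab -l 2>/dev/null | add_cron_line | crontab -
```

Afterwards, crontab -l should show the new line at the bottom.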
To see what gets executed in cron.5min type
run-parts --test /etc/cron.5min
The output should list the script:
/etc/cron.5min/sockss
Optional: Replace newly generated app key
The node app generally uses keys stored in /usr/local/skywire/go/bin/.skywire/ss/keys.json.
If the node app (that is, ./sockss) crashes and is restarted by the script defined in crontab, it will generate a new app key, located in ~/.skywire/ss/keys.json, and use this key every time the script restarts the ./sockss process.
The following steps show you how to copy your old app key and replace the newly generated one, so that you always have the same app key no matter which script starts the ./sockss process.
Manually kill the ./sockss process to force the generation of new keys
You can obtain your app key via
cat /usr/local/skywire/go/bin/.skywire/ss/keys.json
This is the app key that is used when the orange pi prime boots up and starts the node with the rebuild.sh script which will execute the ./sockss process as well.
If our created script has to restart the ./sockss process it'll generate a new app key, located in ~/.skywire/ss/keys.json.
Verify
Type
ps aux | grep 'sockss' | awk '{print $2, $11}'
This will give you the PID of the ./sockss process.
The PID (2236 in this example) is the first number on the ./sockss line; the 'grep' line below it is just the command you used to obtain the PID, so you can ignore it.
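If the extra 'grep' line is distracting, the filtering can be done in awk alone. This is a sketch rather than part of the official images: sockss_lines is a hypothetical helper that reads ps aux output on stdin and prints the PID and command only for lines whose command (field 11) has the base name sockss.

```shell
#!/bin/bash
# Hypothetical helper: prints "PID COMMAND" for sockss lines of `ps aux`.
sockss_lines() {
    awk '{
        # Split the command path on "/" and compare only its last part,
        # so both "./sockss" and "/some/path/sockss" are accepted.
        n = split($11, parts, "/")
        if (parts[n] == "sockss") print $2, $11
    }'
}

# Usage:
# ps aux | sockss_lines
```

Because the grep process has the command name grep, not sockss, it no longer appears in the output.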
Kill it via
kill PID
in our example that is
kill 2236
Double check that it's down via
ps aux | grep 'sockss' | awk '{print $2, $11}'
Now you just have to wait a maximum of 5 minutes; then our recently added crontab script will restart the ./sockss process and generate a new app key while doing so.
Check if it has been restarted via
ps aux | grep 'sockss' | awk '{print $2, $11}'
The output should show a ./sockss line again, only with a different PID than before.
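The crontab script also writes the PID of the process it starts to socks.pid in /usr/local/skywire/go/bin. As a rough sketch, that file can be used for a quick liveness check. pid_file_alive is a hypothetical helper, and the check is only meaningful under the assumption that the file holds the PID of the instance that is currently running (a stale file or a reused PID would give a misleading answer).

```shell
#!/bin/bash
# Hypothetical helper: returns 0 if the PID stored in the given file
# belongs to a live process, 1 otherwise.
pid_file_alive() {
    local pid_file=$1 pid
    [ -r "$pid_file" ] || return 1
    pid=$(cat "$pid_file")
    # Reject empty or non-numeric content before calling kill.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 sends nothing; it only checks that the process exists.
    kill -0 "$pid" 2>/dev/null
}

# Usage:
# pid_file_alive /usr/local/skywire/go/bin/socks.pid && echo "sockss is up"
```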
Copy app key to new location to always use the same app key.
You can obtain your newly generated app key via
cat ~/.skywire/ss/keys.json
To always use the same app key, we need to copy your old app key into the new location that is used when ./sockss is restarted.
First rename the newly generated key file via
mv ~/.skywire/ss/keys.json ~/.skywire/ss/keys_new.json
You can validate if it worked via
cat ~/.skywire/ss/keys_new.json
This should output the same keys as before.
Now copy your old app key into the new location via
cp /usr/local/skywire/go/bin/.skywire/ss/keys.json ~/.skywire/ss/keys.json
Validate the procedure by comparing the results of
cat /usr/local/skywire/go/bin/.skywire/ss/keys.json
and
cat ~/.skywire/ss/keys.json
The two outputs should be identical.
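The rename, copy, and comparison steps above could also be bundled into one function. This is only a sketch: replace_app_key is a hypothetical name, and it uses cmp -s instead of comparing the cat output by eye.

```shell
#!/bin/bash
# Hypothetical helper: keeps the key at $2 as *_new.json (if it exists),
# copies the key at $1 over it, and verifies that both files match.
replace_app_key() {
    local old_key=$1 new_key=$2
    [ -f "$old_key" ] || { echo "missing $old_key" >&2; return 1; }
    # Keep the freshly generated key, mirroring the keys_new.json rename.
    if [ -f "$new_key" ]; then
        mv "$new_key" "${new_key%.json}_new.json"
    fi
    mkdir -p "$(dirname "$new_key")"
    cp "$old_key" "$new_key"
    # cmp -s prints nothing and exits 0 only if the files are byte-identical.
    cmp -s "$old_key" "$new_key"
}

# Usage as root (commented out so nothing is changed by accident):
# replace_app_key /usr/local/skywire/go/bin/.skywire/ss/keys.json ~/.skywire/ss/keys.json
```

The function returns a non-zero exit status if the old key is missing or the copy does not match, so it can be used in an if statement.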
Next step
Now go back to the beginning, log in to the next node, and repeat all steps. Do this for all 8 Orange Pi Primes (192.168.0.2-9).
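Once every node has been set up, a small loop can help to check all of them in one go. The following is a sketch under assumptions: node_ips is a hypothetical helper that prints the eight addresses named above, and the commented-out loop assumes that root can log in to each node via SSH.

```shell
#!/bin/bash
# Hypothetical helper: prints the node addresses 192.168.0.2 to 192.168.0.9.
node_ips() {
    local i
    for i in 2 3 4 5 6 7 8 9; do
        echo "192.168.0.$i"
    done
}

# Example: report per node whether sockss is running. Assumes root SSH
# access to each node; commented out so nothing connects by accident.
# for ip in $(node_ips); do
#     printf '%s: ' "$ip"
#     ssh "root@$ip" 'pgrep -x sockss > /dev/null && echo up || echo down'
# done
```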
Verify
If you've done the steps as described above, the script should restart the ./sockss process after a maximum delay of 5 minutes once it's down. To verify that it's working, follow the steps in the Verify section above.