supervise.8 - indimail/indimail-mta GitHub Wiki
supervise - start and monitor a service.
supervise dir [parent_ident]
supervise(8) switches to the directory named dir and checks for the file down. If this file exists, supervise doesn't start the service. If the directory run/svscan exists, supervise creates the directory dir/supervise in run/svscan, where run is either the /run or the /var/run tmpfs filesystem (depending on your operating system). You can disable the creation and use of this tmpfs filesystem by setting the DISABLE_RUN environment variable. From here on, /run stands for either /run or /var/run.
supervise also opens /run/svscan/dir/supervise/lock in exclusive mode (1) to prevent multiple copies of supervise from running for the same service. It exits 100 if it cannot open /run/svscan/dir/supervise/lock. The directory /run/svscan/dir must be writable by supervise. This directory is used to maintain status information in binary format and to create a few named pipes, see fifo(7). The status information can be read by svstat(8). The format of the status file is described in another section below.
If DISABLE_RUN is set, or if your system doesn't have /run, dir/supervise is created in the orig/dir directory, where orig is the directory in which dir is present. orig is usually /service on most systems. Using the /run tmpfs filesystem allows supervise to be used on systems where the root filesystem is read-only. So /run/svscan/dir/supervise is equivalent to dir/supervise when the run tmpfs filesystem is present.
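The choice between the tmpfs location and the service directory can be sketched as a small shell function. This is only an illustration of the rule described above, not code taken from supervise; the function name supervise_state_dir and its second argument (the tmpfs base, defaulting to /run) are assumptions made for the example.

```shell
# Hypothetical helper: print the directory supervise would use for its state.
# $1 = service directory name (relative), $2 = tmpfs base (assumed default: /run)
supervise_state_dir() {
    service=$1
    runbase=${2:-/run}
    # Use the tmpfs copy only if DISABLE_RUN is unset and <runbase>/svscan exists.
    if [ -z "${DISABLE_RUN+set}" ] && [ -d "$runbase/svscan" ]; then
        printf '%s\n' "$runbase/svscan/$service/supervise"
    else
        printf '%s\n' "$service/supervise"
    fi
}
```

For example, `supervise_state_dir myservice` prints /run/svscan/myservice/supervise on a system where /run/svscan exists, and myservice/supervise otherwise.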
supervise then executes ./init if it exists. If ./init exits with non-zero status, supervise pauses for 60 seconds before restarting ./init. The pause keeps supervise from looping too quickly and causing high CPU usage. dir has to be relative to the current working directory and cannot start with a dot (.) or a slash (/). parent_ident is passed as a command line argument by svscan(8) when it starts the supervise process for a log service, that is, when dir/log exists. This is useful when listing supervised log processes using the ps(1) command (see also the svps(1) command).
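The restriction on dir can be checked before invoking supervise. The function below is a minimal sketch of that rule; the name valid_service_dir is hypothetical and the check only covers the two constraints stated above.

```shell
# Hypothetical helper: succeed only if $1 is acceptable as the dir argument,
# i.e. it is non-empty and does not start with '.' or '/'.
valid_service_dir() {
    case "$1" in
        ""|.*|/*) return 1 ;;
        *)        return 0 ;;
    esac
}
```

A wrapper script could call `valid_service_dir "$dir" || exit 100` before running supervise; the exit code here is just an example.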
After ./init exits with zero exit status, supervise starts ./run. It restarts ./run if ./run exits. In case ./run exits with non-zero status, it pauses for a second after restarting ./run. The sleep keeps supervise from looping quickly when ./run has a problem. supervise expects ./run to remain in the foreground. Sometimes daemons fork themselves into the background, which some consider bad software design. If you want to monitor such a daemon, set the sticky bit on ./run. This makes supervise go into subreaper mode using prctl(2) PR_SET_CHILD_SUBREAPER on Linux or procctl(2) PROC_REAP_ACQUIRE on FreeBSD. In subreaper mode, or when the environment variable SETPGID is set, the command started by ./run will have its process group ID set to the value of its PID. Setting the process group ID is required to monitor ./run reliably when ./run has a command which forks into the background. It is also required in such cases to make the svc(8) command operate and control supervise reliably for such forked daemons or commands in ./run. You can also prefix such commands in ./run with fghack(1) instead of setting the sticky bit on ./run. supervise uses the selfpipe trick (2) to handle all SIGCHLD events reliably. This requires the use of two file descriptors for the selfpipe. ./run is passed two command line arguments with dir as argv[1] and how as argv[2] (how is explained later in this document).
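Setting the sticky bit is a single chmod(1) call. The helper below is a sketch; its name and the idea of passing the service directory as an argument are assumptions for the example.

```shell
# Hypothetical helper: set the sticky bit on a service's run script so that
# supervise switches to subreaper mode for it.
# $1 = service directory (for example /service/mydaemon, an assumed path)
mark_forking_service() {
    chmod +t "$1/run"
}
```

The sticky bit shows up as a trailing t (or T) in the permission string printed by `ls -l`.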
If the directory ./variables exists, supervise sets environment variables before calling ./run using envdir(8). Files in the ./variables directory must be in a format compatible with envdir. If the directory ./variables doesn't have execute permission for others, all existing environment variables will be cleared before setting environment variables for ./run.
supervise uses the /run/svscan/dir/supervise/control named pipe to read commands from svc. You can use svc to give commands to supervise, even if the file dir/down exists.
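A ./variables directory in envdir format holds one file per variable: the file name is the variable name and the first line of the file is its value. The two helpers below are a sketch of how such a directory could be maintained; both function names are hypothetical.

```shell
# Hypothetical helper: store NAME=value for a service in envdir format.
# $1 = service directory, $2 = variable name, $3 = value
set_service_var() {
    mkdir -p "$1/variables"
    printf '%s\n' "$3" > "$1/variables/$2"
}

# Hypothetical helper: remove execute permission for others on ./variables.
# Per the paragraph above, supervise then clears the inherited environment
# before applying the variables.
clear_inherited_env() {
    chmod o-x "$1/variables"
}
```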
On receipt of SIGTERM, supervise sends SIGTERM followed by SIGCONT to its child. If running in subreaper mode or when SETPGID is set, supervise uses killpg(3) to send the signals, else kill(2) is used to send signals.
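The signal sequence can be reproduced from the shell with kill(1). This is a sketch of the behaviour described above, not the code supervise runs; the function name stop_child and the "pgrp" switch are assumptions. Sending SIGCONT after SIGTERM lets a stopped process wake up and act on the SIGTERM.

```shell
# Hypothetical helper: send TERM followed by CONT.
# $1 = pid of the child
# $2 = "pgrp" to signal the whole process group (as with killpg), else the pid only
stop_child() {
    if [ "$2" = pgrp ]; then
        target="-$1"    # a negative pid addresses the process group
    else
        target="$1"
    fi
    kill -s TERM -- "$target" || return 1
    # The process may already be gone by now, so a failure here is ignored.
    kill -s CONT -- "$target" 2>/dev/null || true
}
```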
If the file dir/shutdown exists, supervise executes shutdown when asked to exit or on receipt of SIGTERM. dir is passed as the first argument and the pid of the process that exited is passed as the second argument to shutdown.
If the file dir/alert exists, supervise executes alert whenever ./run exits. dir is passed as the first argument, the pid of the process that exited is passed as the second argument, and the exit value or signal (if killed by a signal) is passed as the third argument to alert. The fourth argument is one of the strings exited, stopped or signalled.
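A dir/alert script only has to interpret its four arguments. The function below is a minimal sketch of such a script body; the name alert_message and the message wording are made up for the example.

```shell
# Sketch of the body of a dir/alert script, written as a function.
# $1 = dir, $2 = pid, $3 = exit value or signal, $4 = exited | stopped | signalled
alert_message() {
    case "$4" in
        exited)    printf '%s: pid %s exited with status %s\n' "$1" "$2" "$3" ;;
        signalled) printf '%s: pid %s killed by signal %s\n' "$1" "$2" "$3" ;;
        stopped)   printf '%s: pid %s stopped by signal %s\n' "$1" "$2" "$3" ;;
        *)         printf '%s: pid %s unexpected state %s (%s)\n' "$1" "$2" "$4" "$3" ;;
    esac
}
```

In an actual alert script the last line would be something like `alert_message "$@" >&2`, so that the message ends up wherever the standard error of supervise goes.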
supervise may exit immediately after startup if it cannot find the files it needs in dir or if another copy of supervise is already running in dir. Once supervise is successfully running, it will not exit unless it is killed or specifically asked to exit. On a successful startup supervise opens the named pipe /run/svscan/dir/supervise/ok in O_RDONLY|O_NDELAY mode. You can use svok(8) to check whether supervise is successfully running. You can use svscan to reliably start a collection of supervise processes. svscan mirrors the service directory in the /run or /var/run directory (whichever is found first). So /run/svscan/dir will be analogous to /service/dir. If started by svscan, error messages printed by supervise will go to the standard error output of the svscan process. In such a case they end up in the log file /var/log/svc/svscan/current.
supervise creates and opens the following named pipes / FIFOs with O_RDONLY|O_NDELAY mode. supervise will exit if it has trouble creating and opening these named pipes. See open(2) for description of O_RDONLY, O_WRONLY, O_NDELAY.
- dir/supervise/control - for reading commands from clients like svc.
- dir/supervise/ok - clients can open this in write mode (O_WRONLY) to test if supervise is running in dir. If write returns, it means supervise is running.
- dir/supervise/up - clients can open this in write mode (O_WRONLY) to test if the service in dir is up. Any client that opens this named pipe in O_WRONLY mode will block until the service dir is up. If write returns, it means the service in dir has executed dir/run. svc is one such client (-w option) that can be used to check if a service is up.
- dir/supervise/dn - clients can open this in write mode to test if the service in dir is down. This works exactly like dir/supervise/up. svc (-W option) can be used to check if a service is down.
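The blocking behaviour of these named pipes can be observed from the shell: opening a FIFO for writing blocks until some process has it open for reading. The helper below is a sketch that bounds the wait with timeout(1) from GNU coreutils, which is an assumption about the system; the function name is hypothetical. svok and svc remain the proper clients.

```shell
# Hypothetical helper: succeed if a reader opens the FIFO within the timeout.
# $1 = path to the FIFO (for example dir/supervise/up), $2 = timeout in seconds
fifo_reader_present() {
    # Refuse anything that is not a FIFO, so the redirection below
    # can never create a regular file by accident.
    [ -p "$1" ] || return 1
    # The child shell blocks in open() until a reader is present; nothing is written.
    timeout "$2" sh -c ': > "$1"' sh "$1" 2>/dev/null
}
```

Because supervise keeps ok open for reading while it runs, `fifo_reader_present dir/supervise/ok 1` would return immediately in that case, while the same call on up returns only once the service is up or the timeout expires.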
supervise opens /run/svscan/dir/supervise/up in read mode just after it executes ./run. Hence, if service w is up, write on /run/svscan/w/supervise/up returns immediately. If service w is down, the write will block until w is up and running. If service w doesn't have supervise running, supervise will wait for 60 seconds before attempting to open the file w/supervise/up again in read mode. The default value of 60 seconds gets overridden by the SCANINTERVAL environment variable used by svscan. If service w doesn't exist, dir/wait will be ignored.
supervise opens the /run/svscan/dir/supervise/dn named pipe in read mode when asked to bring down a service using svc (-d or -r option). It opens this named pipe after issuing the TERM and CONT signals to the service. Hence, if service w is down, write on w/supervise/dn returns immediately. If service w is up, the write will block until w is down.
supervise can wait for another service by having a file named dir/wait. This file has two lines. The first line is time t in seconds and the second line is a directory w for another supervise service. If w does not begin with '/', then it is assumed that the service for which supervise needs to wait is represented by ../w. If w begins with '/', then it represents a supervise service in a different base directory, probably being managed by another svscan process. By having dir/wait file, the supervise service for dir will wait t seconds after supervise service for w starts up successfully. The amount of time t is limited to a max of 32767 secs. Any value above this value will be limited to 60 secs. When w doesn't start with '/', the wait for another service is implemented by opening the named pipe ../w/supervise/up in O_WRONLY mode. When w starts with '/', then supervise will open the named pipe w/supervise/up in O_WRONLY mode. When using the wait feature, you can make supervise use a named pipe other than up by setting the environment variable UPFIFO. This allows supervise to wait for a non-supervise service, where the named fifo is opened by an external application.
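The way the two lines of dir/wait map to a named pipe can be sketched as follows. This only illustrates the path rule described above; the function name wait_target is hypothetical, and the sketch ignores UPFIFO and the limit on t.

```shell
# Hypothetical helper: read dir/wait and print "<seconds> <fifo path>".
# $1 = service directory containing the wait file
wait_target() {
    # Line 1 is the time t in seconds, line 2 is the directory w.
    # The second read also accepts a last line without a trailing newline.
    { IFS= read -r secs && { IFS= read -r w || [ -n "$w" ]; }; } < "$1/wait" || return 1
    case "$w" in
        /*) fifo="$w/supervise/up" ;;       # service under another base directory
        *)  fifo="../$w/supervise/up" ;;    # sibling service of dir
    esac
    printf '%s %s\n' "$secs" "$fifo"
}
```

With a wait file containing the lines 5 and mydb (a made-up service name), the function prints `5 ../mydb/supervise/up`.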
supervise logs informational, warning and error messages to descriptor 2. Informational messages can be turned on by setting the environment variable VERBOSE. Warning messages can be turned off by setting the environment variable SILENT. If you are using svscan for service startup (as setup for indimail-mta), you can set environment variables for supervise in /service/.svscan/variables directory.
supervise passes arguments to the run, alert and the shutdown scripts. This allows you to handle various events in the three scripts. The table below shows the arguments passed. The script /usr/libexec/indimail/svalert uses these arguments to send alerts on port 3001. See svalert(8).
To call svalert in your run, alert, shutdown scripts just include this line as the first call.
[ -x /usr/libexec/indimail/svalert ] && /usr/libexec/indimail/svalert "$@"
If you don't have your own alert or shutdown scripts, you can link those scripts to /usr/libexec/indimail/svalert.
ln -s /usr/libexec/indimail/svalert dir/alert
ln -s /usr/libexec/indimail/svalert dir/shutdown
where dir is your service directory.
When called in run with $@ as the argument, svalert is passed two command line arguments with dir as argv[1] and how as argv[2]. The possible values of how are listed in the table below.
| how | Description |
|---|---|
| abnormal startup | When ./run exits on its own |
| system failure | When supervise is unable to fork to execute ./run |
| manual restart | When svc -u or -r is used to start the service |
| one-time startup | When svc -o is used to start the service |
| auto startup | Normal startup after supervise is run by svscan or manually |
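A ./run script can branch on the value of how. The function below is a sketch of such a dispatch; the name start_reason and the messages are made up, and only the how strings come from the table above. Note that the how values contain spaces, so the arguments have to be quoted when they are passed on.

```shell
# Sketch for a ./run script: describe why the service is being started.
# $1 = dir, $2 = how
start_reason() {
    case "$2" in
        "auto startup")     echo "$1: normal startup" ;;
        "manual restart")   echo "$1: started with svc -u or -r" ;;
        "one-time startup") echo "$1: started once with svc -o" ;;
        "abnormal startup") echo "$1: restarted after ./run exited on its own" ;;
        "system failure")   echo "$1: started after supervise was unable to fork" ;;
        *)                  echo "$1: unknown start reason: $2" ;;
    esac
}
```

In a run script this could be called as `start_reason "$@" >&2` before the final `exec` of the daemon.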
When called in alert or as alert, svalert is passed four command line arguments with dir as argv[1] and pid as argv[2]. The exit value or signal (if killed by a signal) is passed as the third argument. The fourth argument is one of the strings exited, stopped or signalled.
When called in shutdown or as shutdown with $@ as the argument, svalert is passed two arguments. dir is passed as the first argument and the pid of the process that exited is passed as the second argument.
supervise sets the PPID environment variable to its own PID (process ID) for all its children.
The file supervise/status is used by supervise to write its PID and status information. This file can be read by svstat to display a human-readable status of supervise and the service it is running.
| Byte | Description |
|---|---|
| 0-11 | TAI64N label |
| 12-15 | PID of the supervise or service |
| 16 | If service is paused (1 - paused, 0 - not paused) |
| 17 | Wants to be up or down ('u' - up, 'd' - down) |
| 18-19 | Short integer for wait interval when waiting for another service (> 0 - wait interval, < -1 - wait indefinitely) |
| 20 | Up or Down (0 - down, 1 - up) |
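The single-byte fields of the status file can be inspected with od(1) and dd(1). The function below is a rough sketch based only on the offsets in the table above; the name sv_status_flags is hypothetical, and the multi-byte fields (PID, wait interval) are left out because their byte order is not stated here. svstat remains the supported way to read this file.

```shell
# Hypothetical helper: print the paused, want and up fields of a status file.
# $1 = path to supervise/status
sv_status_flags() {
    paused=$(od -An -tu1 -j16 -N1 "$1" | tr -d ' \n')   # byte 16: 1 = paused
    want=$(dd if="$1" bs=1 skip=17 count=1 2>/dev/null) # byte 17: 'u' or 'd'
    up=$(od -An -tu1 -j20 -N1 "$1" | tr -d ' \n')       # byte 20: 1 = up
    printf 'paused=%s want=%s up=%s\n' "$paused" "$want" "$up"
}
```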
supervise is designed to run forever after startup. However, it will exit 100 during startup if it fails to open dir/supervise/lock, or exit 111 if it encounters a system error.
1. The file dir/supervise/lock is opened in exclusive mode using flock(fd, LOCK_EX | LOCK_NB) or lockf(fd, F_TLOCK, 0).
2. The Self Pipe Trick - https://cr.yp.to/docs/selfpipe.html
envdir(8), envuidgid(8), fghack(8), minisvc(8), multilog(8), open(2), mkfifo(3), fifo(7), pipe(7), pgrphack(8), readproctitle(8), setlock(8), setuidgid(8), softlimit(8), svalert(8), svc(8), svctool(8), svok(8), svps(1), svscan(8), svscanboot(8), svstat(8), tai64n(8), tai64nlocal(8), http://cr.yp.to/daemontools.html