Working Notes: SEC440: High Availability Project
I had to set up a redundant web app for a class assignment. I chose poorly, going with Nextcloud, which does not have a good way to create a redundant setup.
This is an overview of how to set it up.
I wrote a "mind-map" in YAML, because it actually works better than a visual diagram for helping me understand what's going on. I'm hiding it in a `<details>` block, because it's nearly 300 lines long.

<details>
<summary>Click here to see it.</summary>

```yaml
---
hosts:
  - name: "ha1"
    vm: "ha1-SEC440-01-eli.minkoff"
    system:
      os: "Ubuntu 22.04"
      hostname: "ha1.eli.local"
    networks:
      - name: "SEC440-01-OPT-eli.minkoff"
        interface: "ens160"
        addresses:
          - "10.0.6.11"
  - name: "ha2"
    vm: "ha2-SEC440-01-eli.minkoff"
    system:
      os: "Ubuntu 22.04"
      hostname: "ha2.eli.local"
    networks:
      - name: "SEC440-01-OPT-eli.minkoff"
        interface: "ens160"
        addresses:
          - "10.0.6.12"
  - name: "u1"
    vm: "u1-SEC440-01-eli.minkoff"
    system:
      os: "Ubuntu 22.04"
      hostname: "u1-eli"
    networks:
      - name: "SEC440-01-OPT-eli.minkoff"
        interface: "ens160"
        addresses:
          - "10.0.6.21"
  - name: "u2"
    vm: "u2-SEC440-01-eli.minkoff"
    system:
      os: "Ubuntu 22.04"
      hostname: "u2-eli"
    networks:
      - name: "SEC440-01-OPT-eli.minkoff"
        interface: "ens160"
        addresses:
          - "10.0.6.22"
  - name: "u3"
    vm: "u3-SEC440-01-eli.minkoff"
    system:
      os: "Ubuntu 22.04"
      hostname: "u3-eli"
    networks:
      - name: "SEC440-01-OPT-eli.minkoff"
        interface: "ens160"
        addresses:
          - "10.0.6.23"
  - name: "vyos1"
    vm: "vyos1-SEC440-01-eli.minkoff"
    system:
      os: "VyOS 202209130217"
      hostname: "vyos1.eli.local"
    networks:
      - name: "SEC440-01-WAN"
        interface: "eth0"
        addresses:
          - "10.0.17.15"
      - name: "SEC440-01-LAN-eli.minkoff"
        interface: "eth1"
        addresses:
          - "10.0.5.2"
      - name: "SEC440-01-OPT-eli.minkoff"
        interface: "eth2"
        addresses:
          - "10.0.6.2"
  - name: "vyos2"
    vm: "vyos2-SEC440-01-eli.minkoff"
    system:
      os: "VyOS 202209130217"
      hostname: "vyos2.eli.local"
    networks:
      - name: "SEC440-01-WAN"
        interface: "eth0"
        addresses:
          - "10.0.17.75"
      - name: "SEC440-01-LAN-eli.minkoff"
        interface: "eth1"
        addresses:
          - "10.0.5.3"
      - name: "SEC440-01-OPT-eli.minkoff"
        interface: "eth2"
        addresses:
          - "10.0.6.3"
  - name: "web01"
    vm: "web01-SEC440-01-eli.minkoff"
    system:
      os: "CentOS 7"
      hostname: "web01.eli.local"
    networks:
      - name: "SEC440-01-LAN-eli.minkoff"
        interface: "ens192"
        addresses:
          - "10.0.5.100"
  - name: "web02"
    vm: "web02-SEC440-01-eli.minkoff"
    system:
      os: "CentOS 7"
      hostname: "web02.eli.local"
    networks:
      - name: "SEC440-01-LAN-eli.minkoff"
        interface: "ens192"
        addresses:
          - "10.0.5.101"
networks:
  - name: "SEC440-01-LAN-eli.minkoff"
    subnet: "10.0.5.0/24"
  - name: "SEC440-01-OPT-eli.minkoff"
    subnet: "10.0.6.0/24"
  - name: "SEC440-01-WAN"
    subnet: "10.0.17.0/24"
services:
  - name: "haproxy"
    description: "a high-availability proxy with a generic name"
  - name: "glusterfs"
    description: "a distributed filesystem with optional redundancy and failover handling"
  - name: "keepalived"
    description: "VRRP protocol for non-router services"
  - name: "galera"
    description: "cluster of MySQL/MariaDB servers that sync databases across nodes and handle node failures"
  - name: "vrrp"
    description: "allows primary and backup routers to share IP addresses and coordinate failover"
haproxy:
  frontends:
    - name: "ha1_web"
      on_hosts:
        - "ha1"
      settings:
        - name: "bind"
          options: ["10.0.6.10:80"]
        - name: "default_backend"
          options: ["web0N"]
    - name: "ha2_web"
      on_hosts:
        - "ha2"
      settings:
        - name: "bind"
          options: ["10.0.6.10:80"]
        - name: "default_backend"
          options: ["web0N"]
  backends:
    - name: "web0N"
      on_hosts:
        - "ha1"
        - "ha2"
      settings:
        - name: "mode"
          options: ["http"]
        - name: "balance"
          options: ["roundrobin"]
        - name: "option"
          options: ["httpchk"]
        - name: "cookie"
          options: ["โโโโโโโโโโโโ", "prefix", "nocache"]
        - name: "server"
          options: ["web01", "10.0.5.100:80", "check", "cookie", "web01"]
        - name: "server"
          options: ["web02", "10.0.5.101:80", "check", "cookie", "web02"]
  listens:
    - name: "ha1_sql"
      on_hosts:
        - "ha1"
      settings:
        - name: "bind"
          options: ["10.0.6.10:3306"]
        - name: "balance"
          options: ["roundrobin"]
        - name: "mode"
          options: ["tcp"]
        - name: "option"
          options: ["tcpka"]
        - name: "option"
          options: ["mysql-check", "user", "haproxy"]
        - name: "server"
          options: ["u1", "10.0.6.21:3306", "check", "weight", "1"]
        - name: "server"
          options: ["u2", "10.0.6.22:3306", "check", "weight", "1"]
        - name: "server"
          options: ["u3", "10.0.6.23:3306", "check", "weight", "1"]
    - name: "ha2_sql"
      on_hosts:
        - "ha2"
      settings:
        - name: "bind"
          options: ["10.0.6.10:3306"]
        - name: "balance"
          options: ["roundrobin"]
        - name: "mode"
          options: ["tcp"]
        - name: "option"
          options: ["tcpka"]
        - name: "option"
          options: ["mysql-check", "user", "haproxy"]
        - name: "server"
          options: ["u1", "10.0.6.21:3306", "check", "weight", "1"]
        - name: "server"
          options: ["u2", "10.0.6.22:3306", "check", "weight", "1"]
        - name: "server"
          options: ["u3", "10.0.6.23:3306", "check", "weight", "1"]
glusterfs:
  peers:
    - "u1"
    - "u2"
    - "u3"
  volumes:
    - name: "gv0"
      type: "replicate"
      bricks:
        - "u1:/data/glusterfs/nextcloud-store/brick"
        - "u2:/data/glusterfs/nextcloud-store/brick"
        - "u3:/data/glusterfs/nextcloud-store/brick"
keepalived:
  nodes:
    - host: "ha1"
      priority: 101
    - host: "ha2"
      priority: 99
  groups:
    - vrid: 51
      name: "LB_VIP"
      network: "SEC440-01-OPT-eli.minkoff"
      address: "10.0.6.10"
galera:
  nodes:
    - "u1"
    - "u2"
    - "u3"
  settings:
    - wsrep_cluster_address: "gcomm://10.0.6.21,10.0.6.22,10.0.6.23"
    - binlog_format: "row"
    - default_storage_engine: "InnoDB"
    - innodb_autoinc_lock_mode: 2
    - bind-address: "0.0.0.0"
    - wsrep_cluster_name: "MariaDB_Cluster"
    - wsrep_node_address: "10.0.6.21"
vrrp:
  nodes:
    - host: "vyos1"
      priority: 200
    - host: "vyos2"
      priority: 100
  groups:
    - vrid: 10
      name: "WAN-VRRP"
      network: "SEC440-01-WAN"
      address: "10.0.17.105"
    - vrid: 20
      name: "LAN-VRRP"
      network: "SEC440-01-LAN-eli.minkoff"
      address: "10.0.5.1"
    - vrid: 30
      name: "OPT-VRRP"
      network: "SEC440-01-OPT-eli.minkoff"
      address: "10.0.6.1"
```

</details>
Instructions for individual services are on their own pages, and selected configuration files with some info redacted are available here.
What you need to do to replicate my setup is as follows:
1. Set up routers running VyOS, and assign their addresses
2. Configure the VRRP service for VyOS with the `vrrp` settings in the YAML mind map.
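
    As a rough sketch of what that looks like on vyos1 (VyOS syntax drifts between releases; on some versions the last keyword is `virtual-address` rather than `address`):

    ```sh
    # vyos1 carries priority 200; vyos2 uses priority 100 for the same groups
    set high-availability vrrp group WAN-VRRP vrid 10
    set high-availability vrrp group WAN-VRRP interface eth0
    set high-availability vrrp group WAN-VRRP priority 200
    set high-availability vrrp group WAN-VRRP address 10.0.17.105/24
    set high-availability vrrp group LAN-VRRP vrid 20
    set high-availability vrrp group LAN-VRRP interface eth1
    set high-availability vrrp group LAN-VRRP priority 200
    set high-availability vrrp group LAN-VRRP address 10.0.5.1/24
    set high-availability vrrp group OPT-VRRP vrid 30
    set high-availability vrrp group OPT-VRRP interface eth2
    set high-availability vrrp group OPT-VRRP priority 200
    set high-availability vrrp group OPT-VRRP address 10.0.6.1/24
    commit
    save
    ```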
3. Set up the networking stack and firewalls on each of the various servers (for the Ubuntu hosts, this means a netplan file like the sketch below)
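
    As an illustration only, ha1's netplan config might look roughly like this (the filename is a guess, and DNS/firewall details are site-specific):

    ```yaml
    # hypothetical /etc/netplan/00-installer-config.yaml on ha1
    network:
      version: 2
      ethernets:
        ens160:
          addresses:
            - 10.0.6.11/24      # ha2 uses 10.0.6.12; u1-u3 use .21-.23
          routes:
            - to: default
              via: 10.0.6.1     # the OPT-VRRP virtual address
    ```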
4. Install Apache on web01 and web02
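
    On CentOS 7 that's roughly the following (PHP and the modules Nextcloud needs must also be present; I'm not reconstructing that package list here):

    ```sh
    sudo yum install -y httpd
    sudo systemctl enable --now httpd
    ```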
5. Set up the Galera cluster on u1, u2, and u3, and create the needed users, database, and grants with the SQL commands below:

    ```sql
    CREATE USER 'haproxy'@'10.0.6.10';
    CREATE USER 'haproxy'@'10.0.6.11';
    CREATE USER 'haproxy'@'10.0.6.12';
    CREATE DATABASE `nextclouddb`;
    CREATE USER 'nextcloud-user'@'10.0.5.100' IDENTIFIED BY "โโโโโโโโโโโโโโโโโโโโโโโโ";
    CREATE USER 'nextcloud-user'@'10.0.5.101' IDENTIFIED BY "โโโโโโโโโโโโโโโโโโโโโโโโ";
    CREATE USER 'nextcloud-user'@'10.0.6.11' IDENTIFIED BY "โโโโโโโโโโโโโโโโโโโโโโโโ";
    CREATE USER 'nextcloud-user'@'10.0.6.12' IDENTIFIED BY "โโโโโโโโโโโโโโโโโโโโโโโโ";
    GRANT ALL PRIVILEGES ON `nextclouddb`.* TO 'nextcloud-user'@'10.0.5.100';
    GRANT ALL PRIVILEGES ON `nextclouddb`.* TO 'nextcloud-user'@'10.0.5.101';
    GRANT ALL PRIVILEGES ON `nextclouddb`.* TO 'nextcloud-user'@'10.0.6.11';
    GRANT ALL PRIVILEGES ON `nextclouddb`.* TO 'nextcloud-user'@'10.0.6.12';
    FLUSH PRIVILEGES;
    ```

    Note that all four passwords must be the same.
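
    For reference, the `galera` settings from the mind map translate into a MariaDB config fragment along these lines (the file path and the `wsrep_provider` line are assumptions based on a standard Ubuntu MariaDB install; `wsrep_node_address` differs per node):

    ```ini
    # hypothetical /etc/mysql/mariadb.conf.d/60-galera.cnf on u1
    [galera]
    wsrep_on                 = ON
    wsrep_provider           = /usr/lib/galera/libgalera_smm.so
    wsrep_cluster_name       = "MariaDB_Cluster"
    wsrep_cluster_address    = gcomm://10.0.6.21,10.0.6.22,10.0.6.23
    wsrep_node_address       = 10.0.6.21    # u2: 10.0.6.22, u3: 10.0.6.23
    binlog_format            = row
    default_storage_engine   = InnoDB
    innodb_autoinc_lock_mode = 2
    bind-address             = 0.0.0.0
    ```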
6. Set up the GlusterFS cluster on u1, u2, and u3, as sketched below
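
    Roughly, with `glusterd` running on all three nodes (brick paths taken from the mind map; the peer names must resolve, e.g. via `/etc/hosts`):

    ```sh
    # from u1: add the other two nodes to the trusted pool
    sudo gluster peer probe u2
    sudo gluster peer probe u3
    # create a three-way replicated volume from the bricks, then start it
    sudo gluster volume create gv0 replica 3 \
        u1:/data/glusterfs/nextcloud-store/brick \
        u2:/data/glusterfs/nextcloud-store/brick \
        u3:/data/glusterfs/nextcloud-store/brick
    sudo gluster volume start gv0
    ```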
7. Set up HAProxy on ha1 and ha2. For now, omit the line starting with `cookie` from the backend configuration, and omit the `cookie` arguments from the `server` lines (see the sketch below).
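
    A minimal sketch of the web-facing part of `haproxy.cfg` at this stage, reconstructed from the mind map (`global`/`defaults` sections omitted; on ha2 the frontend is named `ha2_web` but is otherwise identical):

    ```
    frontend ha1_web
        bind 10.0.6.10:80
        default_backend web0N

    backend web0N
        mode http
        balance roundrobin
        option httpchk
        server web01 10.0.5.100:80 check
        server web02 10.0.5.101:80 check
    ```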
8. Set up keepalived on ha1 and ha2 (a config sketch follows)
    - see Networking: Infrastructure: High Availability: VRRP § keepalived
    - I based the keepalived.conf files heavily on those from this tutorial on kifarunix.com.
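
    A sketch of ha1's `/etc/keepalived/keepalived.conf` built from the mind map values (the MASTER/BACKUP split is my reading of the priorities):

    ```
    vrrp_instance LB_VIP {
        state MASTER            # BACKUP on ha2
        interface ens160
        virtual_router_id 51
        priority 101            # 99 on ha2
        advert_int 1
        virtual_ipaddress {
            10.0.6.10/24
        }
    }
    ```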
9. Set up port forwarding from the WAN-VRRP address to the LB_VIP address on the vyos systems, along the lines of the rule below
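
    On each VyOS router, that's a destination NAT rule roughly like this (the rule number is arbitrary, and the syntax again varies by release):

    ```sh
    set nat destination rule 10 description 'forward web traffic to the HAProxy VIP'
    set nat destination rule 10 inbound-interface eth0
    set nat destination rule 10 protocol tcp
    set nat destination rule 10 destination address 10.0.17.105
    set nat destination rule 10 destination port 80
    set nat destination rule 10 translation address 10.0.6.10
    commit
    save
    ```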
10. Mount the GlusterFS volume created in step 6 at the mountpoint `/var/www/html/nextcloud` on both web01 and web02.

    I installed the following packages, though I'm not sure they were all necessary simply to mount a GlusterFS volume:

    - `centos-release-gluster41`
    - `centos-release-gluster9`
    - `glusterfs`
    - `glusterfs-client-xlators`
    - `libglusterd0`
    - `libglusterfs0`

    To mount it, I added the following lines to the end of the `/etc/fstab` file:

    ```
    # nextcloud glusterfs
    u1-eli:gv0 /var/www/html/nextcloud glusterfs defaults,_netdev 0 0
    ```

    After that, I ran `sudo systemctl daemon-reload && sudo mount -a` to mount it.
11. Set up the background images for web01 and web02.

    This was done entirely to demonstrate that different requests were going to different servers. I whipped up two images in GIMP, which just had "web01" and "web02" written in giant letters as a repeated pattern. I have since noticed a tiny letter "t" that I must have accidentally typed and not noticed while exporting them. In case anyone reading this is curious, they can be found here. I copied `web01-nc.png` to `/var/www/background.png` on web01, and `web02-nc.png` to the same place on web02.
12. Install Nextcloud itself.

    Download the latest release tarball, extract it to `/var/www/html/nextcloud`, and configure the SELinux rules (requires the `policycoreutils-python` package):

    ```sh
    dl_dir="$(mktemp -d)"
    pushd "$dl_dir"
    wget https://download.nextcloud.com/server/releases/latest.tar.bz2
    wget https://download.nextcloud.com/server/releases/latest.tar.bz2.sha256
    # only continue if the checksum matches
    if sha256sum --check latest.tar.bz2.sha256; then
        # extract into /var/www/html/nextcloud, stripping the leading "nextcloud/"
        # from the archive paths so everything lands at the same level
        sudo tar -C /var/www/html/nextcloud --strip-components=1 -xjvf latest.tar.bz2
        sudo chown -R apache: /var/www/html/nextcloud
        sudo semanage fcontext -a -t httpd_sys_rw_content_t '/var/www/html/nextcloud/data(/.*)?'
        sudo semanage fcontext -a -t httpd_sys_rw_content_t '/var/www/html/nextcloud/config(/.*)?'
        sudo semanage fcontext -a -t httpd_sys_rw_content_t '/var/www/html/nextcloud/apps(/.*)?'
        sudo semanage fcontext -a -t httpd_sys_rw_content_t '/var/www/html/nextcloud/.htaccess'
        sudo semanage fcontext -a -t httpd_sys_rw_content_t '/var/www/html/nextcloud/.user.ini'
        sudo semanage fcontext -a -t httpd_sys_rw_content_t '/var/www/html/nextcloud/3rdparty/aws/aws-sdk-php/src/data/logs(/.*)?'
        sudo restorecon -Rv '/var/www/html/nextcloud/'
        pushd /var/www/html/nextcloud
        sudo -u apache php occ maintenance:install --database mysql --database-name=nextclouddb --database-host=10.0.6.10 --database-port=3306 --database-pass=โโโโโโโโโโโโโโโโโโโโโโโโ --database-user=nextcloud-user --admin-user=eliminmax --admin-pass=โโโโโโโโโโโโโโโโโโโโโโโโ
        popd
        popd
        rm -rf "$dl_dir"
    fi
    ```
13. On both web01 and web02, create a file in `/etc/httpd/conf.d` called `nextcloud.conf`, with the following contents:

    ```apache
    Alias "/nextcloud/index.php/apps/theming/image/background" "/var/www/background.png"
    Alias /nextcloud "/var/www/html/nextcloud/"
    <Directory /var/www/html/nextcloud/>
        Require all granted
        AllowOverride All
        Options FollowSymLinks MultiViews
        <IfModule mod_dav.c>
            Dav off
        </IfModule>
    </Directory>
    ```

    Note that the `database-pass` passed to `occ` in step 12 must be the password set in step 5.
14. Deal with Nextcloud's `config.php`:

    - In the file `/var/www/html/nextcloud/config/config.php`, there is an `instanceid` field. When a client connects, the Nextcloud server creates a session cookie whose name is the value of that field, which HAProxy can look at to ensure that it doesn't switch you between servers mid-session, which can break things. Make sure to make a note of that field's value (see the sketch below).
    - Nextcloud is pretty strict about rejecting requests addressed to a host it is not configured to trust, so I had to add the IP for `WAN-VRRP` to the `trusted_domains` array in that file.
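
    A sketch of the relevant fragment of `config.php` (the `instanceid` shown is a made-up placeholder):

    ```php
    <?php
    $CONFIG = array (
      // session cookie name; copy this value into the HAProxy "cookie" line (step 15)
      'instanceid' => 'ocxxxxxxxxxx',
      'trusted_domains' =>
      array (
        0 => 'localhost',
        1 => '10.0.17.105',   // the WAN-VRRP address
      ),
      // ... the rest of the generated configuration ...
    );
    ```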
15. Configure HAProxy for Nextcloud.

    Now that the `instanceid` value is available, go back to the HAProxy configuration on ha1 and ha2, add the `cookie` line (with the `instanceid` value in place of `โโโโโโโโโโโโ`) and the `cookie` arguments on the `server` lines to the `web0N` backend, and restart the services; the completed backend is sketched below.
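
    The completed backend then looks roughly like this (again with a made-up placeholder for the `instanceid`):

    ```
    backend web0N
        mode http
        balance roundrobin
        option httpchk
        cookie ocxxxxxxxxxx prefix nocache
        server web01 10.0.5.100:80 check cookie web01
        server web02 10.0.5.101:80 check cookie web02
    ```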
Note that a more complete Nextcloud setup would involve a cron job to run certain background tasks every 5 minutes, and an SMTP configuration to enable email. For the purposes of this assignment and demonstration, however, I do not believe those are necessary.