Edge Gateways - FLARE-forecast/flare-forecast.github.io GitHub Wiki

Description

Edge gateways are small, field-deployable computers at the FLARE sites that run the open-source Ubuntu Server Linux operating system and FLARE-specific code. They serve as storage staging grounds and compute nodes responsible for reliably sending data and logs to GitHub cloud repositories using Git. Because the data loggers append data samples as new lines to a CSV file, each commit only needs to transfer the differences since the previous one. Several Fitlet2 and Fitlet3 industrial gateways are used for this purpose. A set of custom-built Bash scripts runs the tasks and manages the services on the gateways. The online status of the gateways is monitored with a free service from HealthChecks.io. If a gateway stays offline longer than expected, a notification is sent to the #gateway-monitor channel in the CIBR-FLARE Slack workspace and an email is sent to the stakeholders.
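The append-then-commit flow described above can be sketched as follows. This is only an illustration, not the project's git-push.sh: the function name commit_new_samples and the commit message are hypothetical, and the repository is assumed to already have a remote named origin with push access.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: commit and push whatever the data logger appended
# since the last commit. A push transfers only the objects the remote does
# not have yet, not the whole history.

commit_new_samples() {
  local repo_dir="$1"

  # Stage every change in the working tree (new CSV lines, new logs).
  git -C "$repo_dir" add -A || return 1

  # "git diff --cached --quiet" exits 0 when nothing is staged.
  if git -C "$repo_dir" diff --cached --quiet; then
    echo "nothing to commit"
    return 0
  fi

  git -C "$repo_dir" commit -q -m "Add new data samples" || return 1
  git -C "$repo_dir" push -q origin HEAD
}

# Usage (assumes /data/fcre-catwalk-data is a clone with push access):
# commit_new_samples /data/fcre-catwalk-data
```

Checking for staged changes first avoids creating empty commits when the data logger has written nothing new.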


Setup and Configuration

Installing the Operating System and Core Software Packages on Gateways

Deploying FLARE-specific Code on Gateways

Clone Repository

Clone the main branch of the miscellaneous repository to the home directory (/home/ubuntu/ or ~/) of the gateway.

git clone https://github.com/FLARE-forecast/miscellaneous ~/miscellaneous

Copy Configuration File

Copy the default configuration file for the gateway:

cp ~/miscellaneous/gateways/config-files/<gateway_name>/config.yml \
   ~/miscellaneous/gateways/config-files/config.yml

Example Gateway Configuration

Example default config file for the Henrietta gateway at the FCRE Catwalk location (Henrietta Config File):

# General configurations and module-specific configurations for gateways

general: # General configuration for the gateway
  log_file: general.log                    # Log file for general gateway operations
  gateway_name: henrietta                  # Unique name of the gateway
  gateway_location: fcre-catwalk           # Physical location of the gateway
  gateway_power_mode: ac                   # Power source ("ac" or "battery")
  data_dir: /data                          # Directory for storing data
  apps_dir: /home/ubuntu/miscellaneous     # Directory for miscellaneous applications
  datalogger_data_dir: datalogger-data     # Directory for datalogger output
  git_repo: git@github.com:FLARE-forecast/FCRE-data.git # Git repository for data
  git_data_branch: fcre-catwalk-data       # Branch for data storage
  git_logs_branch: henrietta-logs          # Branch for log storage
  module_toggler_log_file: module-toggler.log # Log file for module toggling events

shutdown_scheduler: # Schedules when to shut down the gateway
  is_enabled: true                         # Enable or disable the shutdown scheduler
  log_file: shutdown-scheduler.log         # Log file for shutdown events
  post_reboot_delay_minutes: 30            # Delay after reboot before scheduling a shutdown (when on battery)
  shutdown_time: 00:10                     # Time to shut down when on AC power (HH:MM)

startup_notifier: # Notifies the startup of the gateway by pushing to a Git repo
  is_enabled: true                         # Enable or disable startup notifications
  log_file: startup-notifier.log           # Log file for startup notifier
  local_repo_dir: startup-notifier         # Local directory for the notifier repo
  git_repo: git@github.com:FLARE-forecast/FCRE-data.git # Git repository for push
  git_branch: main                         # Branch to push the startup notification

status_monitor: # Logs gateway status (e.g., connectivity, disk usage)
  is_enabled: true                         # Enable or disable status monitoring
  log_file: status-monitor.log             # Log file for status reports

git_push: # Pushes local data and logs to remote Git repositories
  is_enabled: true                         # Enable or disable Git pushes
  log_file: git-push.log                   # Log file for Git push actions
  directories:                             # List of directories to push
    - /data/fcre-catwalk-data
    - /data/henrietta-logs
    - /data/fcre-eddyflux-data

git_garbage_collector: # Cleans Git repositories to save disk space
  is_enabled: true                         # Enable or disable Git garbage collection
  log_file: git-garbage-collector.log      # Log file for garbage collection actions
  directories:                             # List of directories to clean
    - /data/fcre-catwalk-data
    - /data/henrietta-logs
    - /data/fcre-eddyflux-data

health_checks_io: # Sends periodic "alive" pings to HealthChecks.io
  is_enabled: true                         # Enable or disable HealthChecks monitoring
  log_file: health-checks-io.log           # Log file for HealthChecks
  ping_url: https://hc-ping.com/d3e6533f-a382-459b-addf-ca88aa668c8a # Unique HealthChecks URL to ping
  max_time: 60                             # Max time between pings (seconds)
  retry: 5                                 # Number of retry attempts on failure

reverse_ssh: # Maintains an SSH tunnel from the JS2 front VM to the gateway
  is_enabled: true                         # Enable or disable the reverse SSH tunnel
  log_file: reverse-ssh.log                # Log file for reverse SSH
  autossh_log_file: autossh.log            # Log file for AutoSSH
  local_port: 60011                        # Local port on the gateway to forward
  base_port: 61000                         # Base port for dynamic allocation
  remote_port: 22                          # Remote SSH port on the gateway
  localhost: localhost                     # Local hostname binding
  server: 149.165.159.29                   # Remote server (JS2 front VM) IP
  user: ubuntu                             # SSH user on the remote server
  ServerAliveInterval: 30                  # Keep-alive message interval (seconds)
  ServerAliveCountMax: 3                   # Max missed keep-alives before disconnect

datalogger_mock_data_generator: # Simulates sensor data for testing
  is_enabled: false                        # Enable or disable mock data generation
  log_file: datalogger-mock-data-generator.log # Log file for mock data generation
  data_file: datalogger-mock-data.csv      # File to store mock data
  interval: 10                             # Frequency of data generation (minutes)

network_interface_monitor: # Monitors network interface traffic
  is_enabled: false                        # Enable or disable network monitoring
  log_file: network-interface-monitor.log  # Log file for network activity
  log_rotation_interval: 86400             # Interval to rotate logs (seconds)
  interfaces:                              # List of network interfaces to monitor
    - name: enp2s0
      log_file_directory: enp2s0           # Directory for interface logs (pcap files)
    - name: nebula1
      log_file_directory: nebula1          # Directory for interface logs (pcap files)

led_monitor: # Monitors LED status (currently not supported on newer kernels)
  is_enabled: false                        # Enable or disable LED monitoring
  log_file: led-monitor.log                # Log file for LED monitor

lora_radio: # Configures LoRa radio communication settings
  is_enabled: true                         # Enable or disable LoRa radio
  log_file: lora-radio.log                 # Log file for LoRa activity
  mode: noevio                             # LoRa mode ("evio", "noevio", "noevio-nat", "pendant")
  serial_interface: ttyUSB0                # Serial device for LoRa radio
  lora_interface: tnc0                     # Network interface for LoRa
  evio_interface: appCIBR6                 # Interface for "evio" mode
  uplink_interface: enp1s0                 # Internet-facing interface in "noevio-nat" mode
  node_ip: 10.10.101.3                     # IP address for this node
  node_netmask: /24                        # Netmask for the LoRa network
  lora_gateway_ip: 10.10.101.1             # Gateway IP for "pendant" mode
  baud_rate: 115200                        # Baud rate for serial communication
  mtu: 400                                 # Maximum Transmission Unit (bytes)
  rate: 20                                 # Data rate (kbit)
  burst: 32                                # Burst data rate (kbit)
  latency: 400                             # Expected latency (ms)
  ingress_policing_rate: 10                # Ingress rate limit (kbit) for "evio"
  ingress_policing_burst: 10               # Burst rate limit (kbit) for "evio"

nebula_overlay_network: # Manages Nebula VPN overlay network
  is_enabled: true                         # Enable or disable Nebula
  log_file: nebula-overlay-network.log     # Log file for Nebula

eddyflux_get_files: # Retrieves EddyFlux files via SSH
  is_enabled: true                         # Enable or disable file retrieval
  log_file: eddyflux-get-files.log         # Log file for retrieval actions
  ssh_user: licor                          # SSH username to connect to the source
  ssh_host: 10.10.1.4                      # Host IP of the source device
  source_path: ~/data/summaries/*          # Source directory to pull files from
  destination_path: /data/fcre-eddyflux-data/ # Destination directory to store files
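How the gateway scripts actually parse config.yml is not shown here. As a rough illustration of the file's two-level layout (a top-level module name, then keys indented by two spaces), the following sketch reads a single scalar value with awk. The function name get_config_value is hypothetical, and the helper deliberately ignores list entries such as directories and interfaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from utils.sh): read one scalar value
# from a two-level YAML file laid out like config.yml.

get_config_value() {
  local file="$1" section="$2" key="$3"
  awk -v section="$section" -v key="$key" '
    # A top-level line such as "general:" starts a new section.
    /^[A-Za-z_][A-Za-z0-9_]*:/ {
      current = $0
      sub(/:.*/, "", current)
      next
    }
    # Inside the requested section, match the key at two-space indentation.
    current == section && $0 ~ ("^  " key ":") {
      value = $0
      sub("^  " key ":[ \t]*", "", value)  # drop the key and separator
      sub("[ \t]+#.*$", "", value)         # drop a trailing comment
      sub("[ \t]+$", "", value)            # drop trailing whitespace
      print value
      exit
    }
  ' "$file"
}

# Usage:
# get_config_value ~/miscellaneous/gateways/config-files/config.yml general gateway_name
```

For the example configuration above, the usage line would print henrietta. A dedicated YAML parser is the safer choice for anything beyond flat key/value lookups.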

Run Initializer

Run the initializer once to set up the gateway according to the config file (e.g., changing the hostname, setting up the reverse SSH connection, creating the local Git repositories, and creating the necessary symbolic links). The script does not need to be run again afterwards:

~/miscellaneous/gateways/base/initializer.sh

Set Up Crontab Jobs

Open the crontab of the non-root (ubuntu) user for editing:

crontab -e

Paste the contents of ~/miscellaneous/gateways/cron-jobs/non-root into the editor, then save and exit.
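As an alternative to pasting by hand, the same file can be installed non-interactively. The sketch below is an assumption about how one might do this, not part of the repository: the function name install_cron_jobs and the backup path are hypothetical. Note that "crontab FILE" replaces the user's entire crontab, so the current one is saved first.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: install the repository's cron entries without an editor.

install_cron_jobs() {
  local cron_file="$1" backup_file="$2"

  [ -r "$cron_file" ] || { echo "cannot read $cron_file" >&2; return 1; }

  # Keep a copy of the current crontab; "crontab -l" fails if none exists yet,
  # in which case an empty backup file is written instead.
  crontab -l > "$backup_file" 2>/dev/null || : > "$backup_file"

  # "crontab FILE" replaces the user's whole crontab with FILE's contents.
  crontab "$cron_file"
}

# Usage:
# install_cron_jobs ~/miscellaneous/gateways/cron-jobs/non-root ~/crontab.backup
```

Running "crontab -l" afterwards shows whether the installed entries match the file.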

Default Crontab Entries

# Schedules the system shutdown
@reboot /home/ubuntu/miscellaneous/gateways/system-setup/shutdown-scheduler.sh

# Facilitates SSH connection from JS2
@reboot sleep 90 && /home/ubuntu/miscellaneous/gateways/remote-access/reverse-ssh.sh

# Notifies the system is booted up
@reboot sleep 90 && /home/ubuntu/miscellaneous/gateways/system-monitors/startup-notifier.sh

# Captures and logs the status of the system
@reboot sleep 90 && /home/ubuntu/miscellaneous/gateways/system-monitors/status-monitor.sh
25 00,08,14,20 * * * /home/ubuntu/miscellaneous/gateways/system-monitors/status-monitor.sh

# Pushes the new additions to the remote repo and runs Git garbage collection afterwards
@reboot sleep 120 && /home/ubuntu/miscellaneous/gateways/git-maintenance/git-push.sh; /home/ubuntu/miscellaneous/gateways/git-maintenance/git-garbage-collector.sh; /home/ubuntu/miscellaneous/gateways/git-maintenance/git-push.sh
30 00,08,14,20 * * * /home/ubuntu/miscellaneous/gateways/git-maintenance/git-push.sh; /home/ubuntu/miscellaneous/gateways/git-maintenance/git-garbage-collector.sh; /home/ubuntu/miscellaneous/gateways/git-maintenance/git-push.sh

# Sends awake ping signals to healthchecks.io
* * * * * /home/ubuntu/miscellaneous/gateways/system-monitors/health-checks-io.sh

# Generates datalogger mock data
*/10 * * * * /home/ubuntu/miscellaneous/gateways/data-tools/datalogger-mock-data-generator.sh

# Runs tcpdump on network interface(s)
@reboot /home/ubuntu/miscellaneous/gateways/system-monitors/network-interface-monitor.sh

# Runs Nebula Overlay Network
@reboot sleep 60 && /home/ubuntu/miscellaneous/gateways/remote-access/nebula-overlay-network.sh
00 * * * * /home/ubuntu/miscellaneous/gateways/remote-access/nebula-overlay-network.sh

# Runs LoRa Radio
@reboot sleep 60 && /home/ubuntu/miscellaneous/gateways/remote-access/lora-radio.sh
00 * * * * /home/ubuntu/miscellaneous/gateways/remote-access/lora-radio.sh

# Gets EddyFlux Files
45 23 * * * /home/ubuntu/miscellaneous/gateways/eddyflux/eddyflux-get-files.sh
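To illustrate what the every-minute health-checks-io.sh entry might do, here is a minimal sketch of a HealthChecks.io ping using curl. It is not the repository's script. The function name ping_healthcheck is hypothetical, and mapping the max_time and retry configuration keys onto curl's --max-time and --retry options is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an "alive" ping; the real health-checks-io.sh may differ.

ping_healthcheck() {
  local ping_url="$1" max_time="${2:-10}" retry="${3:-5}"

  # -f: treat HTTP errors as failures, -sS: quiet but still show errors,
  # --max-time: overall time limit in seconds, --retry: retries on transient errors.
  if curl -fsS --max-time "$max_time" --retry "$retry" -o /dev/null "$ping_url"; then
    echo "ping ok"
  else
    echo "ping failed" >&2
    return 1
  fi
}

# Usage (replace the placeholder with the gateway's own ping URL):
# ping_healthcheck "https://hc-ping.com/<uuid>" 60 5
```

Because cron starts a new run every minute, a single failed ping is harmless: HealthChecks.io only raises an alert once no ping has arrived within the period configured for the check.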

FLARE-specific Code Directory Tree

miscellaneous
├── gateways #Scripts and config files for gateways
│   ├── base #Base scripts
│   │   ├── initializer.sh #Initializer script to set up the gateway for the first run
│   │   ├── module-toggler.sh #Toggler script to enable/disable modules
│   │   └── utils.sh #Util functions used in other scripts
│   ├── config-files #Default config files for each gateway, may need minor adjustments
│   │   ├── annie
│   │   │   └── config.yml
│   │   ├── bita
│   │   │   └── config.yml
│   │   ├── bjorn
│   │   │   └── config.yml
│   │   ├── carina
│   │   │   └── config.yml
│   │   ├── diana
│   │   │   └── config.yml
│   │   ├── henrietta
│   │   │   └── config.yml
│   │   ├── norvel
│   │   │   └── config.yml
│   │   └── config.yml #Placeholder for the actual config file
│   ├── cron-jobs #Cron jobs for scheduling tasks
│   │   └── non-root #Cron jobs for non-root (ubuntu) user
│   ├── data-tools
│   │   └── datalogger-mock-data-generator.sh #Datalogger Mock Data Generator
│   ├── eddyflux
│   │   └── eddyflux-get-files.sh #EddyFlux Script for FCR Catwalk Gateway
│   ├── git-maintenance
│   │   ├── git-garbage-collector.sh
│   │   └── git-push.sh
│   ├── remote-access
│   │   ├── lora-radio.sh
│   │   ├── nebula-overlay-network.sh
│   │   └── reverse-ssh.sh #Access gateways via Reverse-SSH from JS2 VM
│   ├── system-monitors
│   │   ├── health-checks-io.sh
│   │   ├── led-monitor.sh #Legacy script to use gateway LEDs
│   │   ├── network-interface-monitor.sh
│   │   ├── startup-notifier.sh #Pushes to GitHub as soon as the gateway is up
│   │   └── status-monitor.sh #Logs the status of the system
│   └── system-setup
│       ├── led-off.sh #Legacy script to turn off all gateway LEDs
│       └── shutdown-scheduler.sh #Shutdown for battery- or AC-powered gateways
├── lora
│   ├── restart_lora_at_evio_switch.sh
│   ├── restart_lora_at_noevio_gateway.sh
│   └── restart_lora_at_pendant.sh
├── nebula
│   ├── config-lighthouse.yaml
│   ├── config.yaml
│   └── restart_nebula.sh
└── README.md


Maintenance and Troubleshooting

Installing Updates on Ubuntu Machines

For the stability of production machines, automatic updates (unattended-upgrade) are disabled. It is nevertheless highly recommended to update the operating system periodically (e.g., once a year).

Full OS Updates

sudo apt update -y && sudo apt upgrade -y && sudo apt dist-upgrade -y && sudo apt autoremove -y && sudo snap refresh

Security Updates

List all available security upgrades:

apt list --upgradable 2>/dev/null | grep -i security

Install all available security upgrades:

sudo apt install --only-upgrade $(apt list --upgradable 2>/dev/null | grep -i security | cut -d/ -f1)
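The one-liner above can be wrapped so that it reports clearly when nothing is pending. This is a sketch under the assumption that "apt list --upgradable" prints one "name/suite version arch [upgradable from: ...]" line per package. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the security-upgrade one-liner.

# Reads "apt list --upgradable" output on stdin and prints the names of
# packages whose suite mentions "security", one per line.
security_package_names() {
  grep -i security | cut -d/ -f1
}

upgrade_security_packages() {
  local packages
  packages="$(apt list --upgradable 2>/dev/null | security_package_names)"

  if [ -z "$packages" ]; then
    echo "no security upgrades pending"
    return 0
  fi

  # Word splitting is intentional here: one argument per package name.
  sudo apt install --only-upgrade $packages
}

# Usage:
# upgrade_security_packages
```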

Disk Full Issue on Gateways

Gateways use separate disks for the operating system and sensor data and logs. Sensor data and logs are located at /data.

Check Disk Usage

df -h

Output Sample:

Filesystem                         Size  Used Avail Use% Mounted on
tmpfs                              374M  1.4M  373M   1% /run
/dev/mapper/ubuntu--vg-ubuntu--lv   57G   17G   38G  31% /
tmpfs                              1.9G     0  1.9G   0% /dev/shm
tmpfs                              5.0M     0  5.0M   0% /run/lock
/dev/sda1                           59G  9.3G   47G  17% /data
/dev/mmcblk0p2                     2.0G  379M  1.5G  21% /boot
/dev/mmcblk0p1                     1.1G  6.1M  1.1G   1% /boot/efi
tmpfs                              374M  4.0K  374M   1% /run/user/1000
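Since sensor data and logs live on /data, a small threshold check can flag a filling disk before it becomes a problem. The following is a sketch, not one of the gateway scripts. The function names and the 90% default threshold are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: warn when a mount point exceeds a usage threshold.

# Prints the usage of the given mount point as a bare number, e.g. "17".
data_usage_percent() {
  # "df -P" prints one header line and one data line per filesystem in the
  # POSIX format; column 5 is the capacity, e.g. "17%".
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

check_data_disk() {
  local mount_point="$1" threshold="${2:-90}" used
  used="$(data_usage_percent "$mount_point")"

  if [ "$used" -ge "$threshold" ]; then
    echo "WARNING: $mount_point is ${used}% full"
    return 1
  fi
  echo "OK: $mount_point is ${used}% full"
}

# Usage:
# check_data_disk /data 90
```

The non-zero exit status on a warning makes the function easy to combine with other commands, for example to trigger a cleanup only when the threshold is exceeded.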

Git Garbage Collector

The git-garbage-collector.sh script, which runs from crontab after each Git push, cleans up unnecessary Git residue to free disk space. It can also be run manually:

/home/ubuntu/miscellaneous/gateways/git-maintenance/git-garbage-collector.sh

Free Space

Space used by unreachable loose objects can be freed immediately, without requiring extra disk space. Run the following commands inside the repository:

git reflog expire --expire=now --expire-unreachable=now --all
git prune --progress
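When several data repositories live on the same disk, the two commands above can be applied to each of them in turn. The sketch below assumes the repositories are ordinary clones with a .git directory. The function name free_git_space is hypothetical. Keep in mind that expiring the reflog removes the usual way to recover commits that are no longer referenced.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: expire reflogs and prune loose objects in several repos.

free_git_space() {
  local repo_dir
  for repo_dir in "$@"; do
    if [ ! -d "$repo_dir/.git" ]; then
      echo "skipping $repo_dir: not a Git repository" >&2
      continue
    fi

    echo "cleaning $repo_dir"
    # Same commands as above, run via "git -C" so no "cd" is needed.
    git -C "$repo_dir" reflog expire --expire=now --expire-unreachable=now --all &&
      git -C "$repo_dir" prune --progress
  done
}

# Usage (directories taken from the example configuration):
# free_git_space /data/fcre-catwalk-data /data/henrietta-logs /data/fcre-eddyflux-data
```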