Autoscaling Algorithm Implementation using OpenStack and Terraform in Addition with HTM - caprivm/thesis_msc GitHub Wiki

This section shows how to implement an algorithm for autoscaling resources in OpenStack using Terraform and HTM. Consider all the steps outlined. The test was done on an OpenStack deployment based on Canonical's MicroStack distribution, with Terraform version 0.14.10 and HTM version 2.1.15. The deployment was performed on a machine named Deployment Machine with the following hardware requirements:

| Feature | Value |
|---------|-------|
| CPU | 2 |
| RAM | 4 GiB |
| Disk | 50 GB |
| OS Used | Ubuntu 20.04 LTS |

The basic connection architecture between components is presented in the next figure:

Figure: Connection architecture between Terraform, OpenStack, and HTM.

The contents of this page are:

- Prerequisites
- Introduction
- Predefined thresholds
- Procedure to Replicate Experimentation
- Autoscaling Algorithm Implemented with Python
- Expected Results

Prerequisites

Run the following commands to download the PyEnv repository and install Python 3.7.9:

# Use pyenv for the installation.
cd && sudo apt update && sudo apt upgrade -y
sudo apt install -y git fakeroot
cd && git clone https://github.com/pyenv/pyenv.git ~/.pyenv
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo -e 'if command -v pyenv 1>/dev/null 2>&1; then\n  eval "$(pyenv init -)"\nfi' >> ~/.bashrc
exec "$SHELL"
sudo apt-get update
sudo apt-get install -y --no-install-recommends make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev
# Confirm the available versions of Python 3.6.x - 3.8.x. 
pyenv install --list | grep " 3\.[678]"
# Install python 3.7.9
pyenv install 3.7.9
pyenv global 3.7.9

If you want to validate the installation, run:

python -V
# Python 3.7.9

Introduction

There is a very important aspect to consider in this autoscaling mechanism: it integrates the predictions made by the HTM algorithm on the demand for resources, as well as the current state of those resources. In both cases, the algorithm scales based on threshold-violation rules. For this reason, there are two scaling scenarios:

  1. Where only the current state of the resources is considered and scaling is carried out (reactive).
  2. Where both the predictions and the current state of the resources are considered to carry out the scaling (proactive).
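The difference between the two scenarios can be sketched as two decision functions. This is a minimal illustration, not the thesis code; the metric and threshold names mirror the variables defined later in the thresholds table:

```python
# Sketch of the two scaling scenarios. Values are percentages of usage.

def reactive_decision(cpu_usage, threshold_cpu_max):
    """Scenario 1 (reactive): scale based only on the current state."""
    return cpu_usage > threshold_cpu_max

def proactive_decision(cpu_usage, pd_cpu_usage, threshold_cpu_max):
    """Scenario 2 (proactive): scale when either the current state or
    the HTM prediction violates the threshold."""
    return cpu_usage > threshold_cpu_max or pd_cpu_usage > threshold_cpu_max

print(reactive_decision(95.0, 90.0))         # current usage above threshold
print(proactive_decision(70.0, 93.0, 90.0))  # prediction above threshold
```

The same pattern applies to RAM and throughput; a real policy would combine the three metrics.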

Before implementing the algorithm, take into account the following considerations:

  1. Scaling decisions are made based on the behavior of the CPU, the RAM, and the usage of a VM's network resources, such as the traffic on its network interfaces.
  2. Horizontal scaling is prioritized over vertical scaling.
  3. As a starting point, a single instance is deployed with an HTTP server based on the apache2 package. Traffic is generated from a VM that stresses the server.
  4. As the server is stressed using Apache Bench (ab hereinafter), the response time to changes in service demand is measured.
  5. The CPU is stressed using ab. However, ab has been shown to have little significant impact on RAM, so an additional package called stress-ng is used to stress the RAM. The traffic on the network interfaces also varies when using ab.
  6. When any of the autoscaling policies is met, a new instance is provisioned with Terraform. An external load balancer is not implemented.
  7. Also with Terraform, instances are destroyed when the demand for resources drops below the defined thresholds.
  8. No persistent storage service is implemented to maintain the configuration data of the created or deleted instances. This means that any configuration must be carried out during the creation of the machine, for example with a cloud-init file, or later through a script.
  9. Continuous delivery of the service is guaranteed: regardless of the query time, the service will always be available.
  10. Both proactive and reactive scaling policies are implemented. It is assumed that scaling decisions cannot overlap because they are analyzed at different instants of time.
  11. A scenario is implemented where the demand for the service follows a periodic trend, trying to imitate the normal behavior of traffic.
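Points 4 and 5 above rely on standard ab and stress-ng invocations. The sketch below shows how those commands could be assembled from Python; the request count, concurrency, memory size, and duration are assumptions, and the target is the floating IP used later in this guide:

```python
# Build (but do not execute) the stress commands used in the experiment.
SERVER_URL = "http://10.20.20.165/"  # floating IP of the HTTP server instance

def ab_command(requests=10000, concurrency=100, url=SERVER_URL):
    """Apache Bench: -n total requests, -c concurrent clients."""
    return ["ab", "-n", str(requests), "-c", str(concurrency), url]

def stress_ng_command(workers=1, vm_bytes="512M", timeout="60s"):
    """stress-ng: --vm spawns memory workers that pressure the RAM."""
    return ["stress-ng", "--vm", str(workers), "--vm-bytes", vm_bytes,
            "--timeout", timeout]

print(" ".join(ab_command()))
print(" ".join(stress_ng_command()))
```

These lists can be passed to subprocess.run() when the stress phases are orchestrated from a script.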

Predefined thresholds

The following table defines the variables and thresholds that determine the application of the autoscaling policies. The variables abstract the behavior of the infrastructure into quantitative values that, when exceeding certain thresholds, indicate that some policy should be applied.

| Id. | Variable | Description | Unit | Value |
|-----|----------|-------------|------|-------|
| 1. | cpu_usage[t] | Average CPU usage. | % | - |
| 2. | ram_usage[t] | Average RAM usage. | % | - |
| 3. | thrgpt_usage[t] | Average throughput usage. | Kbps | - |
| 4. | pd_cpu_usage | Prediction of average CPU usage in an instance. | % | - |
| 5. | pd_ram_usage | Prediction of average RAM usage in an instance. | % | - |
| 6. | pd_thrgpt_usage | Prediction of average throughput usage in the NIC of an instance. | Kbps | - |
| 7. | threshold_cpu_max | Upper threshold for efficient CPU usage in an instance. | % | 90% |
| 8. | threshold_cpu_min | Lower threshold for efficient CPU usage in an instance. | % | 10% |
| 9. | threshold_ram_max | Upper threshold for the efficient use of RAM in an instance. | % | 90% |
| 10. | threshold_ram_min | Lower threshold for the efficient use of RAM in an instance. | % | 10% |
| 11. | threshold_thrgpt_max | Upper threshold to avoid congestion on the NIC of an instance. | Kbps | pcl(95) |
| 12. | n_instances | Number of instances currently deployed. | - | - |

Here pcl() represents a percentile computed over the throughput data set, and t is the current instant of time at which the measurement is made. The same table establishes the maximum and minimum thresholds that determine the scaling decisions. It is assumed that the variables have predictable behavior, that is, a stationary component or a linear trend. This matters when an abnormality occurs: HTM can learn anomalies even when they are not part of the normal behavior, but ephemeral anomalies should not trigger the application of an autoscaling policy. This consideration is taken into account during the application of the autoscaling policies.
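The pcl(95) threshold can be computed with a simple nearest-rank percentile. A minimal sketch with hypothetical throughput samples (the real data set comes from the NIC measurements):

```python
import math

def pcl(data, p):
    """Nearest-rank percentile: smallest value such that at least p%
    of the samples are less than or equal to it."""
    s = sorted(data)
    k = max(math.ceil(p / 100 * len(s)), 1) - 1
    return s[k]

def thrgpt_violated(thrgpt_usage, threshold):
    """True when the measured NIC throughput exceeds threshold_thrgpt_max."""
    return thrgpt_usage > threshold

# Hypothetical throughput samples in Kbps.
samples = list(range(1, 101))            # 1..100 Kbps
threshold_thrgpt_max = pcl(samples, 95)
print(threshold_thrgpt_max)              # 95
print(thrgpt_violated(98, threshold_thrgpt_max))
```

Recomputing the percentile over a sliding window would let the threshold adapt as the traffic profile drifts.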

Procedure to Replicate Experimentation

The variables defined in the table are used in the implementation of the autoscaling policies, through an algorithm defined in this section. The starting point is a single instance with an HTTP server based on the apache2 package. The deployment of this instance is automated with Terraform. To set it up, clone the repository and follow these steps:

cd && sudo apt update && sudo apt upgrade -y
git clone https://github.com/caprivm/thesis_msc.git autoscaling
cd autoscaling
ls -1
# HTM
# OpenStack
# README.md
# autoreduce.sh
# autoscale.sh
# docs
# prometheus.yml        -> Modify OpenStack credentials if needed

Remember to have the variables for the OpenStack Identity service set in your /etc/environment and ~/.bashrc files. Also, remember that the instances created with Terraform use floating IPs in the 10.20.20.0/24 subnet. For example, the starting-point instance uses the floating IP 10.20.20.165. Follow the recommendations below for the instance.tf and user_data.tf files in the ~/autoscaling/OpenStack/OpenStack_LAB/terraform/openstack/debian/ folder if you need to make addressing modifications:

cd ~/autoscaling/OpenStack/OpenStack_LAB/terraform/openstack/debian/
ls -1
# instance.tf           -> Modify if needed
# terraform.tfstate.d
# user_data.tf          -> Modify if needed
# versions.tf
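Before invoking Terraform, it can help to verify that the Identity credentials mentioned above are actually exported. A minimal sketch, assuming the standard OS_* variable names (your deployment may require additional ones, such as domain-scoped variables):

```python
import os

# Common OpenStack Identity (Keystone) environment variables; the exact
# set your deployment needs may differ.
REQUIRED = ["OS_AUTH_URL", "OS_USERNAME", "OS_PASSWORD", "OS_PROJECT_NAME"]

def missing_openstack_vars(env=None):
    """Return the credential variables that are not set, so Terraform's
    OpenStack provider can fail fast with a clear message."""
    if env is None:
        env = os.environ
    return [name for name in REQUIRED if not env.get(name)]

missing = missing_openstack_vars()
if missing:
    print("Missing OpenStack credentials:", ", ".join(missing))
```

Running this check at the top of the autoscaling script avoids half-completed Terraform runs caused by absent credentials.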

The cloud-init file for Debian images can be found at:

cd ~/autoscaling/OpenStack/OpenStack_LAB/terraform/files
ls -1
# debian.tpl    -> Modify if needed
# ubuntu.tpl

Then, once the starting point is established, the instance that will be used to scale the service is also created from a Terraform configuration file. Again, you may need to adapt the network addressing. Consider the following folder and files:

cd ~/autoscaling/OpenStack/OpenStack_LAB/terraform/scaling/debian 
ls -1
# instance.tf           -> Modify if needed
# terraform.tfstate.d
# user_data.tf          -> Modify if needed
# versions.tf

The cloud init file is in the same folder as above. After making the necessary modifications, you should be able to run the autoscaling algorithm.
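Once these folders are configured, the algorithm can drive Terraform non-interactively to scale out or in. The sketch below is a hypothetical wrapper around the standard `terraform apply`/`terraform destroy` commands, not the repository's exact implementation:

```python
import subprocess

# Directory of the scaling instance's Terraform configuration (matches the
# repository layout shown above).
SCALING_DIR = "~/autoscaling/OpenStack/OpenStack_LAB/terraform/scaling/debian"

def terraform_cmd(action):
    """Build the non-interactive Terraform command for scale-out/scale-in."""
    assert action in ("apply", "destroy")
    return ["terraform", action, "-auto-approve"]

def scale(action, dry_run=True):
    """Run Terraform in SCALING_DIR; dry_run only prints the command."""
    cmd = terraform_cmd(action)
    if dry_run:
        print("Would run:", " ".join(cmd), "in", SCALING_DIR)
    else:
        subprocess.run(cmd, cwd=SCALING_DIR, check=True)

scale("apply")    # scale out: provision the extra instance
scale("destroy")  # scale in: remove it when demand drops
```

`-auto-approve` skips the interactive confirmation, which is what allows the policy loop to act without human intervention.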

NOTE: It is very important that you have deployed the other applications (Prometheus, Grafana, HTM, etc.) and that the full architecture shown at the beginning of this section is ready for experimentation. Also consider possible addressing modifications required by any of your applications.

Autoscaling Algorithm Implemented with Python

All the code is implemented in Python and can be consulted in the HTM folder of the repository (autoscaling.py).

NOTE: The building_htm() function in the Python code defines the configuration parameters used to implement the HTM.

Consider the following procedure to implement the autoscaling algorithm. Please make sure you have previously installed Python version 3.7.9 as explained in Prerequisites.

cd ~/autoscaling/HTM/
python autoscaling.py
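At a high level, the script follows a poll-predict-act loop. The sketch below shows a simplified shape of that loop; poll_metrics, htm_predict, and apply_policy are hypothetical placeholders for the functions in autoscaling.py, not its actual names:

```python
import time

def autoscaling_loop(poll_metrics, htm_predict, apply_policy,
                     interval=60, iterations=None):
    """Poll the metrics, feed them to the HTM model, and apply the
    scaling policy each cycle. iterations=None runs forever."""
    i = 0
    while iterations is None or i < iterations:
        metrics = poll_metrics()           # e.g. scraped from Prometheus
        prediction = htm_predict(metrics)  # HTM one-step-ahead forecast
        apply_policy(metrics, prediction)  # scale out / scale in / nothing
        time.sleep(interval)
        i += 1

# Demo with stub functions (no real Prometheus/HTM/Terraform involved):
events = []
autoscaling_loop(poll_metrics=lambda: {"cpu": 42.0},
                 htm_predict=lambda m: {"cpu": 47.0},
                 apply_policy=lambda m, p: events.append((m, p)),
                 interval=0, iterations=2)
print(len(events))  # 2
```

The interval determines how quickly the system reacts versus how much noise it absorbs; too short an interval lets ephemeral anomalies trigger scaling, which the considerations above rule out.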

Expected Results

By periodically combining the resource demand generated by apache-bench and stress-ng, it is possible to simulate a scenario that approximates real demand. The demand follows a periodic trend that tries to imitate the real behavior of traffic. While the simulation is running, you can consult the Grafana dashboard to see how the script makes the CPU and RAM usage fluctuate. The following figure shows an example of this dashboard.
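The periodic trend can be reproduced with a simple sine profile mapped to a request rate. The baseline, amplitude, and period below are assumptions for illustration, not the values used in the experiment:

```python
import math

def demand(t, baseline=200, amplitude=150, period=3600):
    """Requests per second at time t (seconds), following a periodic trend
    that oscillates around a baseline."""
    return baseline + amplitude * math.sin(2 * math.pi * t / period)

# Peak at a quarter period, trough at three quarters of a period.
print(round(demand(900)))   # 350
print(round(demand(2700)))  # 50
```

Sampling this function at each stress cycle and scaling the ab request count accordingly yields the fluctuating CPU/RAM curves visible in Grafana.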

Figure: Example of the fluctuating CPU and RAM behavior in Grafana.

The graph shows only the behavior when the VM is stressed. However, the algorithm performs its autoscaling tasks when required. Please consider the following graph to understand the expected results.

NOTE: Please consider that the language of the figure is Spanish.

Figure: Stages of the autoscaling process.

The first stage consists of generating the dynamic demand for resources. Then, once HTM has learned the behavior of the demand, the reactive/proactive evaluations activate the different decisions within the autoscaling algorithm. The second stage shows the moment at which the demand is evaluated, the third the moment at which the thresholds are exceeded, which triggers the scaling (fourth stage) and the stabilization of the service (fifth stage).
