Terraform SRE build
To set up a virtual server farm in AWS using Terraform while incorporating Site Reliability Engineering (SRE) principles such as auto-scaling and auto-healing, you define the required AWS resources in your Terraform configuration files. Here's a step-by-step guide to get you started:
1. Set Up Your Terraform Configuration File
First, create a main configuration file, typically named main.tf. This file will define your AWS provider, VPC, subnets, security groups, EC2 instances, and auto-scaling configuration.
2. Define the AWS Provider
provider "aws" {
region = "us-west-2" # Replace with your preferred region
}
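Optionally, pin the Terraform and AWS provider versions so runs are reproducible. A minimal sketch; the version constraints below are assumptions, so adjust them to whatever you standardize on:

terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed constraint; pin to the provider series you test against
    }
  }
}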
3. Create a VPC
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
}
resource "aws_subnet" "subnet1" {
vpc_id = aws_vpc.main.id
cidr_block = "10.0.1.0/24"
availability_zone = "us-west-2a"
}
resource "aws_subnet" "subnet2" {
vpc_id = aws_vpc.main.id
cidr_block = "10.0.2.0/24"
availability_zone = "us-west-2b"
}
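The security group in the next step opens port 80 to the internet, so if these are meant to be public subnets the VPC also needs an internet gateway and a route table, otherwise the instances are not reachable. A minimal sketch under that assumption (if the instances sit in private subnets behind a load balancer or NAT gateway, the routing looks different):

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  # Send all non-local traffic out through the internet gateway
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

resource "aws_route_table_association" "subnet1" {
  subnet_id      = aws_subnet.subnet1.id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "subnet2" {
  subnet_id      = aws_subnet.subnet2.id
  route_table_id = aws_route_table.public.id
}

If instances should get public IPs directly, you may also want map_public_ip_on_launch = true on the subnets.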
4. Define Security Groups
resource "aws_security_group" "web" {
vpc_id = aws_vpc.main.id
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
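If you need SSH access to the instances, a separate rule restricted to a trusted range is safer than opening port 22 to the world. The CIDR below is a hypothetical admin range used only for illustration:

resource "aws_security_group_rule" "ssh_admin" {
  type              = "ingress"
  from_port         = 22
  to_port           = 22
  protocol          = "tcp"
  cidr_blocks       = ["203.0.113.0/24"] # hypothetical admin range; replace with your own
  security_group_id = aws_security_group.web.id
}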
5. Launch Configuration and Auto Scaling Group
resource "aws_launch_configuration" "example" {
name = "example-launch-configuration"
image_id = "ami-0c55b159cbfafe1f0" # Replace with your preferred AMI ID
instance_type = "t2.micro"
security_groups = [aws_security_group.web.id]
lifecycle {
create_before_destroy = true
}
}
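Note that AWS now recommends launch templates over launch configurations, and newer accounts may not be able to create launch configurations at all. A roughly equivalent launch-template sketch, reusing the same AMI and instance type, looks like this; if you use it, replace the launch_configuration argument in the Auto Scaling group below with a launch_template block:

resource "aws_launch_template" "example" {
  name_prefix            = "example-"
  image_id               = "ami-0c55b159cbfafe1f0" # Replace with your preferred AMI ID
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.web.id]
}

# In the Auto Scaling group, instead of launch_configuration:
#   launch_template {
#     id      = aws_launch_template.example.id
#     version = "$Latest"
#   }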
resource "aws_autoscaling_group" "example" {
launch_configuration = aws_launch_configuration.example.id
min_size = 1
max_size = 5
desired_capacity = 2
vpc_zone_identifier = [aws_subnet.subnet1.id, aws_subnet.subnet2.id]
tag {
key = "Name"
value = "example-instance"
propagate_at_launch = true
}
lifecycle {
create_before_destroy = true
}
health_check_type = "EC2"
health_check_grace_period = 300
}
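If you later change the launch configuration (for example, a new AMI), existing instances are not replaced automatically. One option is an instance refresh, which rolls instances gradually while keeping a minimum healthy percentage. A sketch of the nested block, to be added inside the Auto Scaling group resource above (the 50% threshold is an assumption; tune it to your availability needs):

  # Inside resource "aws_autoscaling_group" "example" { ... }
  instance_refresh {
    strategy = "Rolling"
    preferences {
      min_healthy_percentage = 50
    }
  }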
6. Auto Healing with Auto Scaling Policies
Auto-healing itself is handled by the Auto Scaling group's health checks: with health_check_type = "EC2", instances that fail EC2 status checks are terminated and replaced. The scaling policies and CloudWatch alarms below add auto-scaling on top, growing capacity when average CPU is high and shrinking it when CPU drops.
resource "aws_autoscaling_policy" "scale_up" {
name = "scale_up"
scaling_adjustment = 1
adjustment_type = "ChangeInCapacity"
cooldown = 300
autoscaling_group_name = aws_autoscaling_group.example.name
}
resource "aws_autoscaling_policy" "scale_down" {
name = "scale_down"
scaling_adjustment = -1
adjustment_type = "ChangeInCapacity"
cooldown = 300
autoscaling_group_name = aws_autoscaling_group.example.name
}
resource "aws_cloudwatch_metric_alarm" "high_cpu" {
alarm_name = "high_cpu"
comparison_operator = "GreaterThanOrEqualToThreshold"
evaluation_periods = "2"
metric_name = "CPUUtilization"
namespace = "AWS/EC2"
period = "120"
statistic = "Average"
threshold = "70"
alarm_actions = [aws_autoscaling_policy.scale_up.arn]
dimensions = {
AutoScalingGroupName = aws_autoscaling_group.example.name
}
}
resource "aws_cloudwatch_metric_alarm" "low_cpu" {
alarm_name = "low_cpu"
comparison_operator = "LessThanOrEqualToThreshold"
evaluation_periods = "2"
metric_name = "CPUUtilization"
namespace = "AWS/EC2"
period = "120"
statistic = "Average"
threshold = "30"
alarm_actions = [aws_autoscaling_policy.scale_down.arn]
dimensions = {
AutoScalingGroupName = aws_autoscaling_group.example.name
}
}
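As an alternative to maintaining the two simple policies and their alarms by hand, a target tracking policy lets the Auto Scaling group create and manage the CloudWatch alarms itself. A minimal sketch targeting 50% average CPU (the target value is an assumption; tune it to your workload):

resource "aws_autoscaling_policy" "cpu_target" {
  name                   = "cpu-target-tracking"
  policy_type            = "TargetTrackingScaling"
  autoscaling_group_name = aws_autoscaling_group.example.name

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 50.0
  }
}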
7. Initialize and Apply Terraform Configuration
- Initialize your Terraform workspace: terraform init
- Validate the configuration: terraform validate
- Review the execution plan: terraform plan
- Apply the configuration: terraform apply
This configuration sets up a basic virtual server farm with auto-scaling and auto-healing capabilities using Terraform on AWS. Adjust the parameters as needed for your specific use case.