AutoScaling - devian-al/AWS-Solutions-Architect-Prep GitHub Wiki

Auto Scaling Simplified

AWS Auto Scaling lets you build scaling plans that automate how groups of different resources respond to changes in demand. You can optimize availability, costs, or a balance of both. AWS Auto Scaling automatically creates all of the scaling policies and sets targets for you based on your preference.

Auto Scaling Key Details

  • Auto Scaling is one of the main benefits of the cloud's economies of scale, so whenever a requirement involves scaling, consider the Auto Scaling service first.
  • When you use Auto Scaling, your applications gain the following benefits:
    • Better fault tolerance: Auto Scaling can detect when an instance is unhealthy, terminate it, and launch an instance to replace it. You can also configure Auto Scaling to use multiple Availability Zones. If one Availability Zone becomes unavailable, Auto Scaling can launch instances in another one to compensate.
    • Better availability: Auto Scaling can help you ensure that your application always has the right amount of capacity to handle the current traffic demands.
  • The Auto Scaling service is flexible: you can scale your instance groups in several ways.
  • To maintain the current number of running instances, Auto Scaling performs periodic health checks on them to ensure that they are all healthy. When the service detects that an instance is unhealthy, it terminates that instance and launches a new one.
  • When designing for high availability, spread each Auto Scaling group across multiple AZs. A group cannot span regions, so a multi-region design needs a separate group in each region.
  • Auto Scaling allows you to suspend and then resume one or more of the Auto Scaling processes in your Auto Scaling Group. This can be very useful when you want to investigate a problem in your application without triggering the Auto Scaling process when making changes.
  • You can use the same launch configuration with multiple Auto Scaling groups. However, an Auto Scaling group can have only one launch configuration at a time.
  • You cannot modify a launch configuration after you've created it. If you want to change the launch configuration for an Auto Scaling group, you must create a new launch configuration and update your Auto Scaling group to use it.
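
The last two points can be illustrated with a small in-memory model. This is only a sketch: the class names, the `roll_launch_configuration` helper, and the AMI IDs are hypothetical and do not correspond to any AWS SDK call. It shows one configuration shared by two groups, and a change made by creating a new configuration rather than editing the old one.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LaunchConfiguration:
    """Frozen on purpose: models the rule that a launch configuration cannot be edited."""
    name: str
    image_id: str
    instance_type: str


@dataclass
class AutoScalingGroup:
    """A group references exactly one launch configuration at a time."""
    name: str
    launch_configuration: LaunchConfiguration


def roll_launch_configuration(group, new_name, **changes):
    """Copy the group's current configuration with changes and point the group at the copy."""
    new_config = replace(group.launch_configuration, name=new_name, **changes)
    group.launch_configuration = new_config
    return new_config


# One configuration can back several groups (hypothetical names and AMI ID).
web_v1 = LaunchConfiguration("web-v1", "ami-11111111", "t3.micro")
blue = AutoScalingGroup("blue", web_v1)
green = AutoScalingGroup("green", web_v1)

# Changing the instance type means creating web-v2 and repointing the group.
web_v2 = roll_launch_configuration(blue, "web-v2", instance_type="t3.small")
```

After this runs, `blue` uses `web-v2` while `green` still uses the untouched `web-v1`.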

Amazon EC2 Auto Scaling

  • Amazon EC2 Auto Scaling helps ensure that you have the correct number of EC2 instances available to handle your application load, using Auto Scaling groups.

  • An Auto Scaling group contains a collection of EC2 instances that share similar characteristics and are treated as a logical grouping for the purposes of instance scaling and management.

  • You specify the minimum, maximum and desired number of instances in each Auto Scaling group.

  • Key Components

    • Groups - Your EC2 instances are organized into groups so that they are treated as a logical unit for scaling and management. When you create a group, you can specify its minimum, maximum, and desired number of EC2 instances.

    • Configuration templates - Your group uses a launch template or launch configuration (fewer features) as a template for its EC2 instances. When you create a launch template, you can specify information such as the AMI ID, instance type, key pair, security groups, and block device mapping for your instances.

    • Scaling options - How to scale your Auto Scaling groups.

  • You can add a lifecycle hook to your Auto Scaling group to perform custom actions when instances launch or terminate.

    • Applies to instances launched or terminated
    • Maximum instance lifetime
    • Instance refresh
    • Capacity rebalancing
    • Warm pools
  • Scaling Options

    • Scale to maintain current instance levels at all times
    • Manual Scaling
    • Scale based on a schedule
    • Scale based on demand
    • Use predictive scaling
  • Scaling Policy Types

    • Target tracking scaling—Increase or decrease the current capacity of the group based on a target value for a specific metric.
      • You can have multiple target tracking scaling policies for a scalable target, provided that each of them uses a different metric.
      • You can also optionally disable the scale-in portion of a target tracking scaling policy.
    • Step scaling—Increase or decrease the current capacity of the group based on a set of scaling adjustments, known as step adjustments, that vary based on the size of the alarm breach.
    • Simple scaling—Increase or decrease the current capacity of the group based on a single scaling adjustment.
  • The size of your Auto Scaling group is constrained by its capacity limits: the desired capacity can only be adjusted between the minimum and maximum size.

  • The cooldown period is a configurable setting that helps ensure the group does not launch or terminate additional instances before previous scaling activities take effect.

    • EC2 Auto Scaling supports cooldown periods when using simple scaling policies, but not when using target tracking policies, step scaling policies, or scheduled scaling.
  • You can use the default instance warmup to improve the CloudWatch metrics used for dynamic scaling. This feature lets your EC2 instances finish warming up before they contribute usage data.

  • Dynamic scaling can better react to the demand curve of your application if you utilize a target tracking scaling policy based on a custom Amazon SQS queue metric.

  • Amazon EC2 Auto Scaling marks an instance as unhealthy if the instance is in a state other than running, the system status is impaired, or Elastic Load Balancing reports that the instance failed the health checks.
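
The step scaling policy type and the minimum/maximum capacity limits described in this section can be sketched together. This is a simplified model, not the service's implementation: the `StepAdjustment` class, the function names, and the sample thresholds are assumptions made for illustration, and the sketch treats each lower bound as inclusive and each upper bound as exclusive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StepAdjustment:
    lower_bound: Optional[float]  # offset above the alarm threshold, inclusive; None = unbounded
    upper_bound: Optional[float]  # offset above the alarm threshold, exclusive; None = unbounded
    change_in_capacity: int       # instances to add (positive) or remove (negative)


def pick_adjustment(steps, metric_value, threshold):
    """Return the capacity change for the step that matches the size of the alarm breach."""
    breach = metric_value - threshold
    for step in steps:
        above_lower = step.lower_bound is None or breach >= step.lower_bound
        below_upper = step.upper_bound is None or breach < step.upper_bound
        if above_lower and below_upper:
            return step.change_in_capacity
    return 0  # no step matches, so capacity stays as it is


def new_desired_capacity(current, change, min_size, max_size):
    """Apply the change, then clamp the result to the group's capacity limits."""
    return max(min_size, min(max_size, current + change))


# Hypothetical scale-out policy for an alarm at 60% average CPU.
SCALE_OUT_STEPS = [
    StepAdjustment(0, 10, 1),     # 60% <= CPU < 70%: add 1 instance
    StepAdjustment(10, 20, 2),    # 70% <= CPU < 80%: add 2 instances
    StepAdjustment(20, None, 4),  # CPU >= 80%: add 4 instances
]

change = pick_adjustment(SCALE_OUT_STEPS, metric_value=75, threshold=60)
desired = new_desired_capacity(current=9, change=change, min_size=2, max_size=10)
```

A simple scaling policy corresponds to a list with a single step, and the clamp is why a group never grows past its maximum size however large the breach is.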

Auto Scaling Default Termination Policy

  • By default, an Auto Scaling Group treats a stopped instance as unhealthy and terminates it, so unless you've configured it to do otherwise, stopping an instance will result in termination whether or not you wanted that to happen. A new instance will be spun up in its place.

  • The default termination policy spares instances that you have protected from scale in, which is useful when some servers are running critical systems or applications. "Scale in" is simply the process of terminating instances deemed surplus to requirements.

  • The default termination policy is designed to help ensure that your network architecture spans Availability Zones evenly. With the default termination policy, the behavior of the Auto Scaling group is as follows

    • If there are instances in multiple Availability Zones, it will terminate an instance from the Availability Zone with the most instances. If there is more than one Availability Zone with the same max number of instances, it will choose the Availability Zone where instances use the oldest launch configuration.
    • It will then determine which unprotected instances in the selected Availability Zone use the oldest launch configuration. If there is one such instance, it will terminate it.
    • If multiple instances use the oldest launch configuration, it will determine which of those unprotected instances are closest to the next billing hour. (This helps you maximize the use of your EC2 instances and manage your Amazon EC2 usage costs.) If one instance matches this criterion, it is terminated; if several match, one of them is chosen at random.
  • This flow chart can provide further clarity on how the default Auto Scaling policy decides which instances to delete

  • Custom Termination Policies

    • OldestInstance – Terminate the oldest instance in the group.
    • NewestInstance – Terminate the newest instance in the group.
    • OldestLaunchConfiguration – Terminate instances that have the oldest launch configuration.
    • ClosestToNextInstanceHour – Terminate instances that are closest to the next billing hour.

    An instance can be temporarily removed from an Auto Scaling group by changing its state from InService to Standby.
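
The default termination policy steps listed earlier in this section can be sketched as a selection function. It is a minimal model under stated assumptions: the `Instance` fields and function names are hypothetical, times are plain epoch seconds, only Availability Zones with at least one unprotected instance are considered, and remaining ties are broken by name so the result is deterministic.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    instance_id: str
    az: str
    config_created_at: int   # creation time of its launch configuration (epoch seconds)
    launched_at: int         # launch time of the instance (epoch seconds)
    protected: bool = False  # True if scale-in protection is enabled


def seconds_to_next_billing_hour(instance, now):
    """Seconds until the instance completes its current hour of running time."""
    return 3600 - ((now - instance.launched_at) % 3600)


def choose_instance_to_terminate(instances, now):
    candidates = [i for i in instances if not i.protected]
    if not candidates:
        return None  # everything is protected from scale in

    # Step 1: pick the AZ with the most instances; on a tie, prefer the AZ
    # whose unprotected instances use the oldest launch configuration.
    counts = Counter(i.az for i in instances)

    def az_rank(az):
        oldest = min(i.config_created_at for i in candidates if i.az == az)
        return (-counts[az], oldest, az)

    target_az = min({i.az for i in candidates}, key=az_rank)

    # Step 2: within that AZ, keep the unprotected instances on the oldest configuration.
    in_az = [i for i in candidates if i.az == target_az]
    oldest_config = min(i.config_created_at for i in in_az)
    oldest = [i for i in in_az if i.config_created_at == oldest_config]
    if len(oldest) == 1:
        return oldest[0]

    # Step 3: among those, take the instance closest to its next billing hour.
    return min(oldest, key=lambda i: (seconds_to_next_billing_hour(i, now), i.instance_id))
```

Calling `choose_instance_to_terminate(fleet, now)` with a list of `Instance` objects returns the single instance this model would remove during scale in, or `None` when every instance is protected.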

Autoscaling ECS

Metrics available for instances

  • CPU Utilization
  • Disk Reads
  • Disk Read Operations
  • Disk Writes
  • Disk Write Operations
  • Network In
  • Network Out
  • Status Check Failed (Any)
  • Status Check Failed (Instance)
  • Status Check Failed (System)

Metrics available for ECS Service

  • ECSServiceAverageCPUUtilization — Average CPU utilization of the service.
  • ECSServiceAverageMemoryUtilization — Average memory utilization of the service.
  • ALBRequestCountPerTarget — Number of requests completed per target in an Application Load Balancer target group.
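
As an illustration of how one of these service metrics might drive scaling, the sketch below builds a target tracking policy as a plain dictionary. The field names are intended to follow the Application Auto Scaling `PutScalingPolicy` request, but treat them as an assumption and check the current API reference before use. The helper name, policy naming scheme, cluster name, and service name are all hypothetical.

```python
PREDEFINED_ECS_METRICS = {
    "ECSServiceAverageCPUUtilization",
    "ECSServiceAverageMemoryUtilization",
    "ALBRequestCountPerTarget",
}


def ecs_target_tracking_policy(cluster, service, metric, target_value,
                               resource_label=None, disable_scale_in=False):
    """Build request parameters for a target tracking policy on an ECS service."""
    if metric not in PREDEFINED_ECS_METRICS:
        raise ValueError(f"unsupported metric: {metric}")

    metric_spec = {"PredefinedMetricType": metric}
    if metric == "ALBRequestCountPerTarget":
        # This metric needs a label identifying the load balancer target group.
        if resource_label is None:
            raise ValueError("ALBRequestCountPerTarget requires a resource label")
        metric_spec["ResourceLabel"] = resource_label

    return {
        "PolicyName": f"{service}-{metric}",
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_value),
            "PredefinedMetricSpecification": metric_spec,
            "DisableScaleIn": disable_scale_in,
        },
    }


# Keep average CPU of a hypothetical "web" service near 75%.
cpu_policy = ecs_target_tracking_policy(
    "demo-cluster", "web", "ECSServiceAverageCPUUtilization", 75
)
```

Building the parameters separately from the API call keeps the example runnable without credentials. In a real deployment the dictionary would be passed to an SDK client, for example as keyword arguments to boto3's `application-autoscaling` `put_scaling_policy`.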

Monitoring

  • Health checks – identifies any instances that are unhealthy
    • Amazon EC2 status checks (default)
    • Elastic Load Balancing health checks
    • Custom health checks.
  • Auto Scaling does not perform health checks on instances in the Standby state. The Standby state can be used for performing updates, changes, or troubleshooting without health checks being performed or replacement instances being launched.
  • CloudWatch metrics – enables you to retrieve statistics about Auto Scaling-published data points as an ordered set of time-series data, known as metrics. You can use these metrics to verify that your system is performing as expected.
  • CloudWatch Events – Auto Scaling can submit events to CloudWatch Events when your Auto Scaling groups launch or terminate instances, or when a lifecycle action occurs.
  • SNS notifications – Auto Scaling can send Amazon SNS notifications when your Auto Scaling groups launch or terminate instances.
  • CloudTrail logs – enables you to keep track of the calls made to the Auto Scaling API by or on behalf of your AWS account, and stores the information in log files in an S3 bucket that you specify.
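
The health check rules in these notes (an instance is unhealthy if it is not running, its system status is impaired, or the load balancer reports a failed check, and Standby instances are skipped) can be sketched as a small filter. The `InstanceStatus` fields and function names are hypothetical, and the sketch assumes Elastic Load Balancing results only count when that check type is enabled for the group.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceStatus:
    instance_id: str
    lifecycle_state: str           # "InService" or "Standby"
    ec2_state: str                 # "running", "stopped", ...
    system_status_ok: bool = True  # False when the system status is impaired
    elb_healthy: bool = True       # result of the load balancer health check


def is_unhealthy(status, use_elb_checks=False):
    """Apply the EC2 status checks, plus the ELB check when it is enabled."""
    if status.ec2_state != "running":
        return True
    if not status.system_status_ok:
        return True
    return use_elb_checks and not status.elb_healthy


def instances_to_replace(statuses, use_elb_checks=False):
    """Return IDs of unhealthy instances, skipping any that are in Standby."""
    return [
        s.instance_id
        for s in statuses
        if s.lifecycle_state != "Standby" and is_unhealthy(s, use_elb_checks)
    ]
```

An instance that is stopped while in Standby is left alone by this filter, which is what makes Standby useful for maintenance work.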

Security

  • Use IAM to help secure your resources by controlling who can perform AWS Auto Scaling actions.

    By default, a brand new IAM user has NO permissions to do anything. To grant permissions to call Auto Scaling actions, you attach an IAM policy to the IAM users or groups that require the permissions it grants.
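
    As a rough illustration, an IAM policy granting read-only access to a group's state might look like the document built below. The choice of actions is an assumption made for this example, not a recommended baseline; it uses two Amazon EC2 Auto Scaling describe actions and serializes the policy to the JSON form that IAM expects.

```python
import json

# Hypothetical read-only policy: allows viewing groups and their scaling history.
read_only_scaling_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeScalingActivities",
            ],
            "Resource": "*",
        }
    ],
}

# IAM policies are attached as JSON documents.
policy_document = json.dumps(read_only_scaling_policy, indent=2)
```

    Anything not explicitly allowed stays denied, so a user with only this policy could inspect groups but not change their capacity.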

Auto Scaling Cooldown Period

  • The cooldown period is a configurable setting for your Auto Scaling Group that helps to ensure that it doesn't launch or terminate additional instances before the previous scaling activity takes effect.
  • After the Auto Scaling Group scales using a policy, it waits for the cooldown period to complete before resuming further scaling activities if needed.
  • The default waiting period is 300 seconds, but this can be modified.
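
The cooldown behaviour can be modelled as a simple gate on scaling activities. This is a sketch only: the `CooldownGate` class and its method are hypothetical, time is passed in as plain seconds, and the 300-second default is taken from the notes above.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_COOLDOWN_SECONDS = 300


@dataclass
class CooldownGate:
    cooldown_seconds: int = DEFAULT_COOLDOWN_SECONDS
    last_activity_at: Optional[float] = None  # time of the last scaling activity

    def try_scale(self, now):
        """Return True and start a new cooldown if scaling is allowed at time `now`."""
        in_cooldown = (
            self.last_activity_at is not None
            and now - self.last_activity_at < self.cooldown_seconds
        )
        if in_cooldown:
            return False  # the previous activity has not had time to take effect
        self.last_activity_at = now
        return True


gate = CooldownGate()
first = gate.try_scale(now=0)     # allowed: no earlier activity
second = gate.try_scale(now=120)  # rejected: still inside the 300-second window
third = gate.try_scale(now=300)   # allowed: the cooldown has elapsed
```

A shorter cooldown lets the group react faster but risks launching or terminating instances before the effect of the previous change shows up in the metrics.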