Elastic Block Store & File Store (EBS | EFS) - devian-al/AWS-Solutions-Architect-Prep GitHub Wiki

Elastic Block Store (EBS)

EBS Simplified

An Amazon EBS volume is a durable, block-level storage device that you can attach to a single EC2 instance. You can think of EBS as a cloud-based virtual hard disk. You can use EBS volumes as primary storage for data that requires frequent updates, such as the system drive for an instance or storage for a database application. You can also use them for throughput-intensive applications that perform continuous disk scans.

EBS Key Details

  • EBS volumes persist independently from the running life of an EC2 instance.
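  • Each EBS volume is automatically replicated within its Availability Zone to protect against component failure. Unlike S3 Standard, which stores data across multiple Availability Zones, this replication stays inside a single Availability Zone, so rely on snapshots for disaster recovery.
  • There are five different types of EBS storage: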
    • General Purpose (SSD)
    • Provisioned IOPS (SSD, built for speed)
    • Throughput Optimized Hard Disk Drive (magnetic, built for larger data loads)
    • Cold Hard Disk Drive (magnetic, built for less frequently accessed workloads)
    • Magnetic
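  • EBS volumes are designed for 99.999% availability.
  • An EBS volume always lives in the same availability zone as the EC2 instance it is attached to.
  • By default, an EBS volume can only be attached to one EC2 instance at a time. The exception is Multi-Attach, which is available on Provisioned IOPS SSD volumes.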
  • After you create a volume, you can attach it to any EC2 instance in the same availability zone.
  • Amazon EBS provides the ability to create snapshots (backups) of any EBS volume and write a copy of the data in the volume to S3, where it is stored redundantly in multiple Availability Zones.
  • An EBS snapshot reflects the contents of the volume at a specific instant in time.
  • An image (AMI) builds on a snapshot, but also includes an operating system and a boot loader so it can be used to boot an instance.
  • AMIs can also be thought of as pre-baked, launchable servers. AMIs are always used when launching an instance.
  • When you provision an EC2 instance, an AMI is actually the first thing you are asked to specify. You can choose a pre-made AMI or choose your own made from an EBS snapshot.
  • You can also use the following criteria to help pick your AMI
    • Operating System
    • Architecture (32-bit or 64-bit)
    • Region
    • Launch permissions
    • Root Device Storage (more in the relevant section below)
  • You can copy AMIs into entirely new regions.

    When copying AMIs to new regions, Amazon won’t copy launch permissions, user-defined tags, or Amazon S3 bucket permissions from the source AMI to the new AMI. You must ensure those details are properly set for the instances in the new region.

  • You can modify EBS volumes on the fly, including increasing the size and changing the storage type.

SSD vs. HDD

  • SSD-backed volumes are built for transactional workloads involving frequent read/write operations, where the dominant performance attribute is IOPS.

    Will your workload be IOPS heavy? Plan for SSD.

  • HDD-backed volumes are built for large streaming workloads where throughput (measured in MiB/s) is a better performance measure than IOPS.

    Will your workload be throughput heavy? Plan for HDD.
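
IOPS and throughput are linked through the size of each I/O operation, which is why the two volume families quote their limits at different I/O sizes. The snippet below is a minimal sketch of that arithmetic; the function name `throughput_mib_s` is made up for illustration.

```python
def throughput_mib_s(iops: int, io_size_kib: int) -> float:
    """Throughput in MiB/s delivered by `iops` operations of `io_size_kib` KiB each."""
    return iops * io_size_kib / 1024


# Small random I/O, typical of SSD-backed volumes: 16,000 IOPS at 16 KiB each.
print(throughput_mib_s(16_000, 16))   # 250.0

# Large sequential I/O, typical of HDD-backed volumes: 500 IOPS at 1 MiB each.
print(throughput_mib_s(500, 1024))    # 500.0
```

A workload made of many small operations runs out of IOPS long before it moves much data, while a workload made of a few large operations moves a lot of data with very few IOPS.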

SSD

| | gp3 | gp2 | io2 Block Express ‡ | io2 | io1 |
|---|---|---|---|---|---|
| Category | General Purpose SSD | General Purpose SSD | Provisioned IOPS SSD | Provisioned IOPS SSD | Provisioned IOPS SSD |
| Durability | 99.8% - 99.9% (0.1% - 0.2% annual failure rate) | 99.8% - 99.9% (0.1% - 0.2% annual failure rate) | 99.999% (0.001% annual failure rate) | 99.999% (0.001% annual failure rate) | 99.8% - 99.9% (0.1% - 0.2% annual failure rate) |
| Use cases | Transactional workloads; virtual desktops; medium-sized, single-instance databases; low-latency interactive applications; boot volumes; development and test environments | Same as gp3 | Workloads that require sub-millisecond latency, sustained IOPS performance, or more than 64,000 IOPS or 1,000 MiB/s of throughput | Workloads that require sustained IOPS performance or more than 16,000 IOPS; I/O-intensive database workloads | Same as io2 |
| Volume size | 1 GiB - 16 TiB | 1 GiB - 16 TiB | 4 GiB - 64 TiB | 4 GiB - 16 TiB | 4 GiB - 16 TiB |
| Max IOPS per volume (16 KiB I/O) | 16,000 | 16,000 | 256,000 | 64,000 † | 64,000 † |
| Max throughput per volume | 1,000 MiB/s | 250 MiB/s * | 4,000 MiB/s | 1,000 MiB/s † | 1,000 MiB/s † |
| Amazon EBS Multi-attach | Not supported | Not supported | Supported | Supported | Supported |

HDD

| | st1 | sc1 |
|---|---|---|
| Category | Throughput Optimized HDD | Cold HDD |
| Durability | 99.8% - 99.9% (0.1% - 0.2% annual failure rate) | 99.8% - 99.9% (0.1% - 0.2% annual failure rate) |
| Use cases | Big data; data warehouses; log processing | Throughput-oriented storage for data that is infrequently accessed; scenarios where the lowest storage cost is important |
| Volume size | 125 GiB - 16 TiB | 125 GiB - 16 TiB |
| Max IOPS per volume (1 MiB I/O) | 500 | 250 |
| Max throughput per volume | 500 MiB/s | 250 MiB/s |
| Amazon EBS Multi-attach | Not supported | Not supported |
| Boot volume | Not supported | Not supported |
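
As a rough study aid, the per-volume limits from the SSD and HDD tables above can be encoded as data and filtered by requirement. This is only a sketch: the names `Limits`, `EBS_LIMITS`, and `eligible_types` are invented here, the footnoted conditions (†, *) are ignored, and cost is not considered. Keep in mind that the SSD IOPS figures are quoted at 16 KiB I/O and the HDD figures at 1 MiB I/O, so the IOPS comparison is approximate.

```python
from typing import NamedTuple


class Limits(NamedTuple):
    max_iops: int    # maximum IOPS per volume
    max_mib_s: int   # maximum throughput per volume, MiB/s
    min_gib: int     # smallest volume size, GiB
    max_gib: int     # largest volume size, GiB


TIB = 1024  # GiB per TiB

# Values copied from the tables in this section.
EBS_LIMITS = {
    "gp3": Limits(16_000, 1_000, 1, 16 * TIB),
    "gp2": Limits(16_000, 250, 1, 16 * TIB),
    "io2 Block Express": Limits(256_000, 4_000, 4, 64 * TIB),
    "io2": Limits(64_000, 1_000, 4, 16 * TIB),
    "io1": Limits(64_000, 1_000, 4, 16 * TIB),
    "st1": Limits(500, 500, 125, 16 * TIB),
    "sc1": Limits(250, 250, 125, 16 * TIB),
}


def eligible_types(iops: int, mib_s: int, size_gib: int) -> list[str]:
    """Return the volume types whose per-volume limits cover the requirement."""
    return [
        name
        for name, lim in EBS_LIMITS.items()
        if lim.max_iops >= iops
        and lim.max_mib_s >= mib_s
        and lim.min_gib <= size_gib <= lim.max_gib
    ]


# A 1,000 GiB database volume needing 100,000 IOPS and 500 MiB/s.
print(eligible_types(100_000, 500, 1_000))   # ['io2 Block Express']
```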

EBS Snapshots

  • EBS Snapshots are point in time copies of volumes. You can think of Snapshots as photographs of the disk’s current state and the state of everything within it.
  • A snapshot is constrained to the region where it was created, although it can be copied to another region.
  • Snapshots are incremental: each new snapshot records only the blocks that changed since the previous snapshot, not the entire state of the volume.
  • Because of this, the first snapshot may take some time to create. There is nothing earlier to compare against, so its "change" is the entire volume. Only later snapshots capture just the delta.

  • EBS snapshots occur asynchronously which means that a volume can be used as normal while a snapshot is taking place.

  • When creating a snapshot for a future root device, it is considered best practice to stop the running instance that uses the original device before taking the snapshot.

  • The easiest way to move an EC2 instance and its volume to another availability zone is to take a snapshot, then create the new volume (or an AMI) from that snapshot in the target availability zone.

  • When creating an image from a snapshot, if you want to deploy a different volume type for the new image (e.g. General Purpose SSD -> Throughput Optimized HDD) then you must make sure that the virtualization for the new image is hardware-assisted.

  • A short summary for creating copies of EC2 instances

    Old instance -> Snapshot -> Image (AMI) -> New instance

  • You cannot delete a snapshot of an EBS Volume that is used as the root device of a registered AMI. If the original snapshot was deleted, then the AMI would not be able to use it as the basis to create new instances. For this reason, AWS protects you from accidentally deleting the EBS Snapshot, since it could be critical to your systems.

To delete an EBS Snapshot that backs a registered AMI, first deregister the AMI. The snapshot can then be deleted.
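
The incremental behaviour described in this section can be illustrated with a toy model. This is not how EBS is implemented internally; it only assumes a volume can be viewed as a mapping from block index to block contents, and the names `take_snapshot` and `restore` are hypothetical. Deleted blocks are not handled.

```python
def take_snapshot(volume: dict[int, bytes], captured: dict[int, bytes]) -> dict[int, bytes]:
    """Return only the blocks that differ from what earlier snapshots captured."""
    return {idx: data for idx, data in volume.items() if captured.get(idx) != data}


def restore(snapshots: list[dict[int, bytes]]) -> dict[int, bytes]:
    """Rebuild the volume state by replaying snapshots from oldest to newest."""
    state: dict[int, bytes] = {}
    for snap in snapshots:
        state.update(snap)
    return state


volume = {0: b"boot", 1: b"data-v1", 2: b"logs"}

# First snapshot: nothing to compare against, so every block is copied.
first = take_snapshot(volume, {})

# One block changes, then a second snapshot is taken.
volume[1] = b"data-v2"
second = take_snapshot(volume, restore([first]))

print(len(first), len(second))   # 3 1
```

Replaying both snapshots in order reproduces the current volume, even though the second snapshot stored a single block.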

EBS Root Device Storage

  • All AMI root volumes (where the EC2 instance's OS is installed) are one of two types: EBS-backed or Instance Store-backed.
  • When you terminate an EC2 instance that was using an Instance Store-backed root volume, the root volume is deleted as well. Any additional EBS volumes attached to the instance will persist, however.
  • If you use an EBS-backed root volume, the root volume persists when the instance is stopped, because EBS-backed volumes are not temporary storage devices like Instance Store-backed volumes. On termination, the root volume is deleted by default unless its DeleteOnTermination attribute is set to false.
  • EBS-backed Volumes are launched from an AWS EBS snapshot, as the name implies
  • Instance Store-backed Volumes are launched from an AWS S3 stored template. They are ephemeral, so be careful when shutting down an instance!
  • Secondary instance store volumes for an Instance Store-backed root device must be added during the original provisioning of the server. You cannot add more after the fact. However, you can attach EBS volumes to the same instance after the server's creation.
  • With these drawbacks of Instance Store volumes, why pick one? Because they have a very high IOPS rate. So while an Instance Store can't provide data persistence, it can provide much higher IOPS compared to network attached storage like EBS.
  • Further, instance stores are ideal for temporary storage of information that changes frequently such as buffers, caches, scratch data, and other temporary content, or for data that is replicated across a fleet of instances, such as a load-balanced pool of web servers.
  • When to use one over the other?
    • Use EBS for DB data, critical logs, and application configs.
    • Use instance storage for in-process data, noncritical logs, and transient application state.
    • Use S3 for data shared between systems like input datasets and processed results, or for static data needed by each new system when launched.
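
The rule of thumb in the last three bullets can be written down as a small decision function. This is a simplification for revision purposes, with a hypothetical function name and two yes/no questions standing in for a real requirements analysis.

```python
def choose_storage(shared_between_systems: bool, must_survive_instance: bool) -> str:
    """Pick a storage option following the rule of thumb in this section."""
    if shared_between_systems:
        # Input datasets, processed results, static data for newly launched systems.
        return "S3"
    if must_survive_instance:
        # Database data, critical logs, application configs.
        return "EBS"
    # In-process data, noncritical logs, transient application state.
    return "instance store"


print(choose_storage(shared_between_systems=False, must_survive_instance=True))   # EBS
```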

EBS Encryption

  • EBS encryption offers a straightforward encryption solution for EBS resources that doesn't require you to build, maintain, and secure your own key management infrastructure.
  • It uses AWS Key Management Service (AWS KMS) customer master keys (CMKs) when creating encrypted volumes and snapshots.
  • You can encrypt both the root device and secondary volumes of an EC2 instance. When you create an encrypted EBS volume and attach it to a supported instance type, the following types of data are encrypted:
    • Data at rest inside the volume
    • All data moving between the volume and the instance
    • All snapshots created from the volume
    • All volumes created from those snapshots
  • EBS encrypts your volume with a data key using the AES-256 algorithm.
  • Snapshots of encrypted volumes are naturally encrypted as well.

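    Volumes restored from encrypted snapshots are also encrypted. Only unencrypted snapshots can be shared publicly. An encrypted snapshot can be shared with specific AWS accounts only if it was encrypted with a customer managed key rather than the default AWS managed key.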

  • The old way of encrypting a root device was to create a snapshot of the root volume of a provisioned EC2 instance, then copy that snapshot with encryption enabled on the copy. Once the encrypted copy existed, you created an AMI from it and used that AMI to launch an EC2 instance with an encrypted root device. Because this is so cumbersome, you can now simply encrypt root devices as part of the EC2 provisioning options.
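
How encryption status propagates through snapshots and restored volumes, and why the old root-device workflow needed an encrypted copy, can be sketched with a toy model. The classes and functions below are hypothetical and only track an `encrypted` flag. No real encryption or AWS API call is involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    encrypted: bool


@dataclass(frozen=True)
class Snapshot:
    encrypted: bool


def snapshot_of(volume: Volume) -> Snapshot:
    """A snapshot inherits the encryption status of its volume."""
    return Snapshot(encrypted=volume.encrypted)


def copy_snapshot(snap: Snapshot, encrypt: bool = False) -> Snapshot:
    """A copy can add encryption, but an encrypted snapshot stays encrypted."""
    return Snapshot(encrypted=snap.encrypted or encrypt)


def volume_from(snap: Snapshot) -> Volume:
    """A volume restored from a snapshot inherits its encryption status."""
    return Volume(encrypted=snap.encrypted)


# The old workflow: unencrypted root -> snapshot -> encrypted copy -> new root.
old_root = Volume(encrypted=False)
plain_snapshot = snapshot_of(old_root)
encrypted_copy = copy_snapshot(plain_snapshot, encrypt=True)
new_root = volume_from(encrypted_copy)

print(new_root.encrypted)   # True
```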

EFS

EFS provides a simple and fully managed elastic NFS file system for use within AWS. EFS automatically and instantly scales your file system storage capacity up or down as you add or remove files without disrupting your application.

EFS Key Details

  • In EFS, storage capacity is elastic (grows and shrinks automatically) and its size changes based on adding or removing files.
  • While EBS attaches one volume to one instance, a single EFS file system can be mounted by multiple EC2 instances at once.
  • The EC2 instances communicate with the remote file system using the NFSv4 protocol. You therefore need to open the NFS port (TCP 2049) for inbound traffic in the relevant security group (EC2 firewall rules).
  • For an EFS file system, the mount target state tells you when a mount target is available for instances to mount.
  • With EFS, you only pay for the storage that you use so you pay as you go. No pre-provisioning required.
  • EFS can scale up to the petabytes and can support thousands of concurrent NFS connections.
  • Data is stored across multiple AZs in a region and EFS ensures read after write consistency.
  • It is best for file storage that is accessed by a fleet of servers rather than just one server
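
The security group requirement for NFS traffic can be sketched as a plain data structure. The dictionary below mirrors the shape of an `IpPermissions` entry accepted by the boto3 EC2 client's `authorize_security_group_ingress` call, but nothing is sent to AWS here. The helper name and the security group ID are placeholders, and the sketch assumes the EC2 clients are identified by their own security group.

```python
NFS_PORT = 2049  # standard NFS port over TCP


def nfs_ingress_rule(client_sg_id: str) -> dict:
    """Build an inbound rule that allows NFS traffic from one security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": NFS_PORT,
        "ToPort": NFS_PORT,
        # Allow traffic from members of the clients' security group
        # instead of opening the port to an IP range.
        "UserIdGroupPairs": [{"GroupId": client_sg_id}],
    }


# "sg-0123456789abcdef0" is a made-up ID used only for illustration.
rule = nfs_ingress_rule("sg-0123456789abcdef0")
print(rule["FromPort"], rule["ToPort"])   # 2049 2049
```

Referencing a source security group rather than a CIDR range keeps the rule valid as instances in the fleet come and go.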