AWS Misc - keshavbaweja-git/guides GitHub Wiki

Cost optimisation

  • Monitor, measure and analyse spend
    • AWS Billing and Cost Management
    • Cost Explorer
  • Stop unused EC2 instances
    • EBS backed EC2 instances can be stopped
    • Instance store backed EC2 instances can't be stopped, but can be terminated
  • AutoScaling
    • Scale capacity in and out automatically to match demand
  • Reserved instances
    • Purchase EC2 capacity for 1yr/3yrs
    • No upfront, partial upfront or all upfront payment
  • Spot instances
    • Purchase surplus AWS capacity at cheaper rates
    • Can be reclaimed by AWS with two minutes notice
  • Leverage AWS Lambda - function as a service
    • Pay by usage
    • Analyse and minimise IO wait time in function execution
    • Multiple external network calls should be executed in parallel
    • Use Step Functions to build state machines and event based workflows
  • Leverage caching
    • CloudFront
    • Avoid calls to DynamoDB, RDS with cache implementation
  • S3 Storage tiering
    • Standard, Standard-IA, One Zone-IA, Glacier
  • AWS Savings Plans
    • New, flexible pricing model
    • EC2, Fargate
    • Commit to consistent usage
    • Compute Savings Plans apply across
      • Regions
      • Instance Family
      • OS
      • EC2/Fargate
      • Tenancy
    • EC2 Instance Savings Plans (committed to one instance family in one region) apply across
      • AZ
      • Instance size
      • OS
      • Tenancy
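
The Lambda notes above recommend running multiple external network calls in parallel to cut IO wait time, which is billed like any other execution time. A minimal sketch of that idea, assuming a hypothetical handler and a stand-in `fetch` function that only sleeps to model a network call:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(name: str, delay: float = 0.2) -> str:
    """Stand-in for an external network call; the sleep models IO wait."""
    time.sleep(delay)
    return f"{name}:ok"


def handler(event, context=None):
    """Hypothetical Lambda handler that fans out its external calls."""
    services = event["services"]
    # Threads suit IO-bound work: while one call waits, the others proceed.
    with ThreadPoolExecutor(max_workers=len(services)) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(fetch, services))


if __name__ == "__main__":
    start = time.perf_counter()
    print(handler({"services": ["pricing", "inventory", "profile"]}))
    # Three 0.2s calls finish in roughly 0.2s instead of roughly 0.6s.
    print(f"elapsed: {time.perf_counter() - start:.2f}s")
```

The service names are illustrative only. In a real function the same shape applies to HTTP or SDK calls, as long as the clients used are safe to call from multiple threads.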

Security

  • Enable strong identity management
    • Centralize identities across accounts
    • Use identity federation
    • Use STS for temporary credentials
  • Attach identity based policies to users/groups/roles
  • Secure all layers
    • Edge
      • CloudFront
      • WAF & Shield
    • ALB in public subnets
    • DMZ subnet with NAT Gateway
    • Application in private subnet
    • Database in private subnet
  • Route Table
    • Define routing configuration
    • Each route maps a destination CIDR to a target (no port or protocol matching)
  • NACLs
    • Assigned at subnet level
    • Inbound and outbound rules
    • Allow and deny rules, evaluated in rule number order
    • Stateless
  • Security Groups
    • Assigned at instance (network interface) level
    • Inbound and outbound rules
    • Allow rules only, with an implicit deny
    • Stateful firewalls
  • Tracing
    • CloudTrail
    • CloudWatch Logs
  • Encryption
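
To make "identity based policies" concrete, here is a minimal sketch of a policy document that could be attached to a user, group or role. It uses the standard IAM policy JSON layout (`Version`, `Statement`, `Effect`, `Action`, `Resource`). The helper name and the bucket name are hypothetical, chosen only for illustration.

```python
import json


def read_only_bucket_policy(bucket: str) -> dict:
    """Build an identity-based policy granting read-only access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                # Listing is authorised on the bucket ARN itself.
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                # Object reads are authorised on the object ARNs.
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


# "example-reports-bucket" is a made-up bucket name.
policy_json = json.dumps(read_only_bucket_policy("example-reports-bucket"), indent=2)
print(policy_json)
```

Anything not explicitly allowed stays denied, so a narrow policy like this is a reasonable starting point for least privilege.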

Multi region deployment

Why

  • High availability
  • Disaster Recovery/Failover
  • GeoLocation
    • Lower latency
    • Data residency requirements

Design patterns

  • Read local, write global
    • Avoids inter region cluster
    • Higher write latency for users outside the write region
    • Supports transactions
  • Read local, write partitioned
    • Users in a region write to services based on partitioning strategy defined by application
  • Read local, write local
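
The three patterns differ only in where a write is sent, while reads always stay in the user's region. A minimal sketch of that routing decision follows. The region list, the choice of primary region and the hash-based partitioning strategy are all assumptions for illustration; the notes leave the partitioning strategy to the application.

```python
from zlib import crc32

REGIONS = ["us-east-1", "eu-west-1", "ap-southeast-2"]  # assumed deployment
GLOBAL_WRITE_REGION = "us-east-1"                       # assumed single write region


def read_region(user_region: str) -> str:
    """All three patterns read from the user's own region."""
    return user_region


def write_region(pattern: str, user_region: str, partition_key: str = "") -> str:
    """Pick the region that receives a write under the given pattern."""
    if pattern == "write_global":
        # Every write goes to one region, so no inter-region cluster is needed.
        return GLOBAL_WRITE_REGION
    if pattern == "write_partitioned":
        # Application-defined strategy; here a stable hash of the key.
        return REGIONS[crc32(partition_key.encode()) % len(REGIONS)]
    if pattern == "write_local":
        # Writes stay local and are replicated to the other regions.
        return user_region
    raise ValueError(f"unknown pattern: {pattern}")


print(write_region("write_global", "eu-west-1"))  # us-east-1
print(write_region("write_local", "eu-west-1"))   # eu-west-1
print(write_region("write_partitioned", "eu-west-1", partition_key="customer-42"))
```

With the partitioned pattern, the same key always lands in the same region, which keeps each record's writes in one place.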

Foundational pillars

High availability

- Region
  - AZs
    - Data centres
- Multi AZ by default 
  - File store
    - S3
    - EFS
  - NoSQL
    - DynamoDB
    - QLDB
  - Streams & Queue
    - Kinesis
    - SQS
- Multi AZ configurable
  - Search
    - Elasticsearch
  - RDBMS
    - RDS (Primary and Read Replicas)
    - Aurora (Supports multi master)
  - NoSQL
    - Neptune
    - DocumentDB
    - ElastiCache (Primary and read replicas)
  - Streams & Queue
    - Amazon MSK
    - Amazon MQ

Cross Region Replication

- S3 supports CRR
  - Object versioning needs to be enabled
- EBS incremental snapshots can be copied to another region
  - If encrypted, a customer managed CMK should be used
  - KMS CMKs are region specific, so the copy is encrypted with a key in the destination region
- DynamoDB Global Tables
  - DynamoDB Streams need to be enabled
  - Read local, write local
- RDS
  - Supports cross region read replicas
  - Read local, write global
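
For the S3 item above, cross region replication requires versioning on both the source and the destination bucket. The sketch below only builds the two configuration documents as plain dictionaries. Their shape is intended to match what boto3's `put_bucket_versioning` (`VersioningConfiguration`) and `put_bucket_replication` (`ReplicationConfiguration`) accept, but treat that as an assumption to check against the current API reference. The role ARN, bucket name and rule ID are hypothetical.

```python
def versioning_config() -> dict:
    """Versioning must be enabled on source and destination buckets."""
    return {"Status": "Enabled"}


def replication_config(role_arn: str, dest_bucket: str) -> dict:
    """Replicate every object in the source bucket to dest_bucket."""
    return {
        # IAM role that S3 assumes to copy objects on the owner's behalf.
        "Role": role_arn,
        "Rules": [
            {
                "ID": "replicate-all",       # hypothetical rule name
                "Status": "Enabled",
                "Priority": 1,
                "Filter": {"Prefix": ""},    # empty prefix: all objects
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": f"arn:aws:s3:::{dest_bucket}"},
            }
        ],
    }


config = replication_config(
    "arn:aws:iam::123456789012:role/example-replication-role",  # made-up ARN
    "example-dr-bucket",                                        # made-up bucket
)
print(versioning_config())
print(config["Rules"][0]["Destination"])
```

Keeping the documents as data makes them easy to review before they are passed to an SDK call or a template.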

Networking

  • VPN appliance + VPN Gateway
    • Data travels over the internet, with jitter and unpredictable latency
  • VPC Peering across regions
    • Data travels over AWS backbone
    • Spider web with multiple VPCs
  • Transit Gateway in each region
    • Data travels over AWS backbone
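
The "spider web" remark can be quantified. VPC peering is not transitive, so full connectivity between n VPCs needs one peering per pair, while a hub model needs one attachment per VPC. A small sketch of the arithmetic, under the simplifying assumption that peering between the regional Transit Gateways themselves is not counted:

```python
def full_mesh_peerings(vpcs: int) -> int:
    """Peering connections needed so every VPC reaches every other VPC."""
    return vpcs * (vpcs - 1) // 2


def hub_attachments(vpcs: int) -> int:
    """Attachments needed when each VPC connects once to a Transit Gateway."""
    return vpcs


for n in (3, 5, 10, 20):
    print(f"{n} VPCs: {full_mesh_peerings(n)} peerings vs {hub_attachments(n)} attachments")
# 10 VPCs already need 45 peerings, against 10 attachments.
```

The mesh grows quadratically and the hub linearly, which is why peering becomes hard to manage as the number of VPCs rises.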

Traffic Routing

  • Route 53 routing
    • Latency based routing
    • Geolocation based routing
    • DNS failover
      • Route 53 supports application endpoint monitoring with health checks
      • DNS responses are cached by clients, so failover takes effect only after the TTL expires
  • AWS Global Accelerator
    • Same pair of static IP addresses serves all regions
    • Anycast
    • TCP connection needs to be reestablished
    • Global Accelerator directs traffic to nearest functional region
  • CloudFront + Lambda@Edge
    • Path based routing
    • Geolocation based routing
    • Cookies
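
Latency based routing combined with health checks boils down to "send the client to the lowest-latency endpoint that is currently healthy". The sketch below is a toy model of that decision, not Route 53's or Global Accelerator's API. The `Endpoint` type, the latency figures and the health flags are assumed inputs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    region: str
    latency_ms: float  # latency from the client's location (assumed input)
    healthy: bool      # result of the latest health check (assumed input)


def pick_endpoint(endpoints: List[Endpoint]) -> Optional[Endpoint]:
    """Return the healthy endpoint with the lowest latency, or None."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda e: e.latency_ms)


endpoints = [
    Endpoint("us-east-1", 120.0, True),
    Endpoint("eu-west-1", 35.0, True),
    Endpoint("ap-southeast-2", 280.0, True),
]
print(pick_endpoint(endpoints).region)  # eu-west-1

# Failover: once the nearest region fails its health check, the next best wins.
endpoints[1].healthy = False
print(pick_endpoint(endpoints).region)  # us-east-1
```

With DNS based routing, clients keep using a cached answer until its TTL expires, so the switch shown here is not instantaneous in practice.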

Management

  • AWS Config Rules
  • AWS Systems Manager
  • Amazon CloudWatch Logs, CloudWatch Metrics
  • AWS CloudFormation StackSets