AWS Well Architected Framework and Tool - santhoshsantharam/aws-well-architected GitHub Wiki

AWS Well-Architected Framework

Why apply the Well-Architected Framework?

  1. Build and deploy faster
  2. Lower or mitigate risks
  3. Make informed decisions
  4. Learn AWS best practices

Basically, it provides a way to learn, measure, and improve your architecture.

What does it provide?

  1. Five pillars
  2. General and pillar-specific design principles
  3. Questions for each of the pillars

Five Pillars of Well-Architected Framework

Operational Excellence Pillar

The operational excellence pillar focuses on running and monitoring systems to deliver business value, and continually improving processes and procedures. Key topics include automating changes, responding to events, and defining standards to manage daily operations.

Security Pillar

The security pillar focuses on protecting information and systems. Key topics include confidentiality and integrity of data, identifying and managing who can do what with privilege management, protecting systems, and establishing controls to detect security events.

Reliability Pillar

The reliability pillar focuses on ensuring a workload performs its intended function correctly and consistently when it’s expected to. A resilient workload quickly recovers from failures to meet business and customer demand. Key topics include distributed system design, recovery planning, and how to handle change.

Performance Efficiency Pillar

The performance efficiency pillar focuses on using IT and computing resources efficiently. Key topics include selecting the right resource types and sizes based on workload requirements, monitoring performance, and making informed decisions to maintain efficiency as business needs evolve.

Cost Optimization Pillar

The cost optimization pillar focuses on avoiding unnecessary costs. Key topics include understanding and controlling where money is being spent, selecting the most appropriate and right number of resource types, analyzing spend over time, and scaling to meet business needs without overspending.

Design Principles

For example, the principle "Automate responses to security events" means monitoring for, and automatically triggering responses to, event-driven or condition-driven alerts.
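
As a rough illustration of this principle, the sketch below routes alerts to automated responses. The `Alert` structure, the alert kinds, and the handler names are all hypothetical, invented for this example; in a real AWS setup the alerts would come from a monitoring service and the handlers would call service APIs rather than return strings.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Alert:
    """A hypothetical security alert; the field names are assumptions."""
    kind: str       # e.g. "exposed_access_key"
    resource: str   # identifier of the affected resource


def disable_key(alert: Alert) -> str:
    # Placeholder action: a real handler would call the relevant service API.
    return f"disabled key {alert.resource}"


def isolate_instance(alert: Alert) -> str:
    # Placeholder action, as above.
    return f"isolated instance {alert.resource}"


# Map each (hypothetical) alert kind to its automated response.
RESPONSES: Dict[str, Callable[[Alert], str]] = {
    "exposed_access_key": disable_key,
    "suspicious_traffic": isolate_instance,
}


def respond(alert: Alert) -> str:
    """Run the automated response for an alert, or escalate if none exists."""
    handler = RESPONSES.get(alert.kind)
    if handler is None:
        return f"escalated {alert.kind} on {alert.resource} to a human"
    return handler(alert)


def respond_all(alerts: List[Alert]) -> List[str]:
    """Respond to a batch of alerts in the order they arrived."""
    return [respond(a) for a in alerts]
```

Keeping the mapping in one table makes it easy to see which alert kinds are handled automatically and which still fall through to a person.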

General Design Principles

  • Stop guessing your capacity needs
  • Test systems at production scale
  • Automate to make architectural experimentation easier
  • Allow for evolutionary architectures
  • Drive architectures using data (e.g., using CloudWatch data)
  • Improve through game days

Pillar-Specific Design Principles

Operational Excellence

  • Perform operations as code
  • Annotate documentation (tooling & monitoring systems auto feed data)
  • Make frequent, small, reversible changes
  • Refine operations procedures frequently
  • Anticipate failures
  • Learn from all operational failures

Security

  • Implement a strong identity foundation
  • Enable traceability
  • Apply security at all layers
  • Automate security best practices
  • Protect data in transit and at rest
  • Keep people away from data
  • Prepare for security events

Reliability

  • Test recovery procedures
  • Automatically recover from failures
  • Scale horizontally to increase aggregate system availability
  • Stop guessing capacity
  • Manage change in automation

Performance Efficiency

  • Democratize advanced technologies
  • Go global in minutes
  • Use serverless architectures
  • Experiment more often
  • Mechanical sympathy

Cost Optimization

  • Adopt a consumption model
  • Measure overall efficiency
  • Stop spending money on data center operations
  • Analyze and attribute expenditure
  • Use managed services to reduce cost of ownership

Questions

Below is a sample question

What is available?

AWS Well-Architected Tool

Features

  • Define workloads
  • Perform reviews
    • Helpful resources
    • Assign priorities to pillars
  • Results
    • Improvement plan
    • Generate PDF reports
    • Dashboard
    • Save milestones
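
To make the link between pillar priorities and the improvement plan more concrete, here is a minimal sketch that orders review findings by risk level and then by pillar priority. It is an illustration only and not how the AWS Well-Architected Tool builds its plan internally; the `Finding` type, the `improvement_plan` function, and the risk labels are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List

# Lower number sorts first. The labels are assumptions for this sketch.
RISK_ORDER = {"HIGH": 0, "MEDIUM": 1}


@dataclass(frozen=True)
class Finding:
    """A hypothetical review finding for one question."""
    question: str
    pillar: str
    risk: str  # "HIGH" or "MEDIUM"


def improvement_plan(findings: List[Finding], pillar_priority: List[str]) -> List[Finding]:
    """Order findings: high risk first, then by the given pillar priority.

    Pillars missing from pillar_priority sort after all listed pillars.
    """
    rank = {pillar: i for i, pillar in enumerate(pillar_priority)}
    return sorted(
        findings,
        key=lambda f: (RISK_ORDER[f.risk], rank.get(f.pillar, len(rank))),
    )
```

Calling `improvement_plan(findings, ["Security", "Reliability", "Cost Optimization"])` would place high-risk Security findings at the top of the list, followed by other high-risk items, then medium-risk items in the same pillar order.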

Intent of review

  • Working together to improve (Not an audit)
  • Pragmatic proven advice (Not architecture astronauts)
  • Through lifecycle (Not a one-time check)

Suggestions

  • Start using it from the design phase. Earlier is better
  • Review all the questions to avoid bad decisions
  • Most workloads can be improved with this tool

Customer Use Cases

Question: Learning best practices for the cloud

  • How do I architect for the cloud?
  • Being constrained by on-premises assumptions
  • So many resources, where to start?
  • How do I know if I have done something wrong?

Answer

  • Learn AWS Best Practice
  • Transition to cloud native
  • Sign-post resources/services
  • Identify improvements
  • Inform future architectures

Question: Technology governance

  • Ready to go into production?
  • Are my teams following best practice?
  • Consistent measurement?
  • Burn down risks?

Answer

  • Review Process
  • Consistent

Question: Portfolio management

  • Where is my inventory of workloads?
  • What decisions did I make in each?
  • What risks are in each?
  • How are risks changing over time?
  • Where should I invest?
  • Are there trends I can address holistically?
  • Can I build mechanisms?

Answer

  • Technology Portfolio

Demonstrations

  • Workload: The collection of AWS resources and code that delivers business value.
    • Subsets of resources in one or more accounts

Concepts

  • Result of a review:
    • High Risk
    • Medium Risk
  • An Improvement Plan provides the steps to improve
  • Use milestones to record progress
    • Design Time: such as Original Design, Design Review, Pre Go-Live
    • Run Time: such as Version 1, New Feature Release
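
One way to picture how milestones record progress is to compare the risk counts saved at each one. The sketch below assumes a hypothetical `Milestone` record holding the number of high-risk and medium-risk items at that point; the type, the function name, and any counts passed in are illustrative and not taken from the tool.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Milestone:
    """A hypothetical snapshot of review results at a point in time."""
    name: str
    high_risk: int
    medium_risk: int


def risk_trend(milestones: List[Milestone]) -> List[Dict[str, Union[str, int]]]:
    """Return the change in risk counts between consecutive milestones."""
    trend = []
    for prev, cur in zip(milestones, milestones[1:]):
        trend.append({
            "from": prev.name,
            "to": cur.name,
            # Negative deltas mean risks were burned down.
            "high_delta": cur.high_risk - prev.high_risk,
            "medium_delta": cur.medium_risk - prev.medium_risk,
        })
    return trend
```

Given milestones named after the design-time stages above (Original Design, Design Review, Pre Go-Live), the result has one entry per transition, which is enough to see whether risks are shrinking as the workload moves toward production.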

Benefits of AWS Well-Architected

  • Think Cloud-Natively
  • Consistent Approach to Reviewing Architectures
  • Understand Potential Impact
  • Visibility of Risks

How to get started?