AWS - kamialie/knowledge_corner GitHub Wiki
- Overview
- Architecture
- Migration
- Pricing and support
- Management
- DevOps
- CLI
- Other services
- Learning and certification
Well-Architected Tool - framework to evaluate 5 pillars. Provides questions and generates a report based on the answers, with information on what can be improved or optimized. Can be used to monitor architecture changes over time, set goals, etc.
Operational excellence
The ability to run and monitor systems to deliver business value, and continuously improve processes and procedures.
Design principles:
- operations as code - infrastructure as code
- small, frequent, reversible changes - automate deployments, be able to revert at any time
- annotated documentation - automated documentation creation
- refine operations frequently - the whole team is aware of procedures
- anticipate failure - learn from failures
Security
The ability to protect information, systems, and assets through risk assessments and mitigation strategies.
Design principles:
- strong identity foundation - centralized privilege management, principle of least privilege, avoid long-term credentials
- enable traceability - logs and metrics integration for fast response
- automate best practices
- apply encryption in transit and at rest
- perform data integrity checks
- reduce or limit direct access to or manual processing of data
Reliability
The ability to recover from infrastructure or service disruptions, dynamically acquire computational resources to meet demand, and mitigate disruptions such as misconfigurations or network issues.
Design principles:
- recovery planning and testing - simulate or recreate failures
- automate recovery
- dynamic scaling - maintain optimal level to satisfy demand, and distribute work to avoid single point of failure
Performance efficiency
The ability to use compute resources efficiently to meet requirements as demands change and technologies evolve.
Design principles:
- try out and experiment with new options and advanced technologies
- go global in minutes
- strive to use more serverless
- find the best fit for the given workload
- evaluate trade-offs (e.g. caching)
Cost optimization
The ability to run systems at the lowest possible cost.
Design principles:
- adopt consumption models - pay only for what is used (strive to use managed services and serverless)
- measure efficiency
- analyze and attribute expenditure - identify system components and associated costs (leverage tags)
Cloud Adoption Framework (AWS CAF)
Organizes guidance in 6 areas (perspectives) to focus on for the migration:
- business
- people
- governance
- platform
- security
- operations
Migration strategies (the 6 Rs):
- rehosting - move applications as is
- replatforming - move with a few cloud optimizations (no changes to the core architecture)
- refactoring - re-architect and develop the application with cloud-native features
- repurchasing - move from a traditional license to a software-as-a-service model (replace the existing vendor product with a cloud-based version)
- retaining - keep critical applications where they are (might require major refactoring or can be postponed)
- retiring - remove applications that are no longer used
Fundamental drivers of cost:
- compute - hourly from start to termination
- storage - per GB
- data transfer - pay for outbound; usually no charge for inbound or between AWS services within the same region (e.g. EC2 and S3)
Free tier types:
- always free (e.g. Lambda allows 1 million free invocations per month)
- first 12 months (e.g. S3 for 5 GB)
- trial (e.g. Lightsail, 1 month of up to 750 hours)
Pricing Calculator estimates the cost per service, service group or total infrastructure.
Organizations provides the Consolidated billing feature (free), which groups the billing info of multiple accounts into one (by creating one payer account that can view and pay the combined bills of all linked accounts). It can also apply bulk discounts and Savings Plans across multiple accounts (e.g. Dedicated instances or total volume used by S3).
Total Cost of Ownership (TCO) calculator creates a report on estimated savings from moving from on-prem to AWS.
Budgets can be used to create a budget plan for service usage, costs, and instance reservations. Can track cost per service, reserved instances, and Savings Plans utilization and coverage. Also provides a fully customizable alert system (e.g. when the budget has reached a certain percentage). Updates 3 times a day.
Cost Explorer provides cost analytics (reports, visualizations) over a specified period of time, forecasts (up to 12 months), and recommendations (all accessible via API as well). Among many grouping options (e.g. resource, region), it can also leverage tags.
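The tag-based grouping mentioned above can also be requested from the command line. The following is a minimal sketch, assuming the `aws ce get-cost-and-usage` command with a hypothetical `project` tag that has already been activated as a cost allocation tag; the `cost_by_tag` helper and the `AWS_CMD` override variable are illustrative names, not part of the CLI.

```shell
# AWS_CMD is an illustrative override so the command line can be inspected
# (e.g. AWS_CMD=echo) without calling AWS; by default the real CLI is used.
AWS_CMD="${AWS_CMD:-aws}"

# cost_by_tag TAG_KEY START END
# Monthly unblended cost grouped by the values of one tag.
# Dates use the YYYY-MM-DD format; END is exclusive.
cost_by_tag() {
  tag_key=$1
  start=$2
  end=$3
  "$AWS_CMD" ce get-cost-and-usage \
    --time-period "Start=$start,End=$end" \
    --granularity MONTHLY \
    --metrics UnblendedCost \
    --group-by "Type=TAG,Key=$tag_key"
}

# Usage (needs configured credentials, so it is left commented out):
# cost_by_tag project 2024-01-01 2024-02-01
```

Setting `AWS_CMD=echo` before calling the function prints the assembled command instead of running it, which is a cheap way to check the arguments.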
Plan | Cost | Support | Trusted Advisor | Other |
---|---|---|---|---|
Basic | free | 24/7 customer service limited to account and billing info, documentation, support forums | 7 core checks | Personal Health Dashboard - alert and remediation guidance when AWS is experiencing events that may affect you |
Developer | $29/month or 3% of AWS costs | Basic, plus email access to customer service, one person is specified as a primary contact (can also ask technical questions) | 7 core checks | Basic |
Business | $100/month or 3-10% of AWS costs | Developer, plus direct phone and chat access to customer support | full set of best-practice checks | Infrastructure event management (extra fee) |
Enterprise | $15,000/month or 3-10% of AWS costs | Business, plus 15-minute SLA for business-critical workloads | full set of best-practice checks | dedicated Technical Account Manager (TAM), concierge support team |
SLA (Service-level Agreement) specifies response time. Paid support plans (all except Basic) are on a month-to-month basis.
Organizations manages multiple AWS accounts (global service). Provides consolidated billing across all accounts (single payment method).
The main account is the master account (can't be changed), while other accounts are member accounts. A member account can only be part of one organization.
Organizational units (OUs) group multiple accounts with similar business or security requirements. An OU can include other OUs.
Policies can be attached to individual members or OUs.
Service Control Policies (SCPs) restrict the AWS services, resources, and individual API calls that users and roles can access. They can be applied to the organization root, an individual member account, or an OU, thus affecting all users, groups, and roles within an account (in contrast, IAM policies can not be applied to the root account), but they do not apply to the master account. An explicit DENY at a higher level (e.g. an OU) can not be overridden by an ALLOW at a lower level (e.g. an account in that OU).
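As an illustration of what such a policy looks like, the sketch below writes a small SCP document that denies one API call and shows, commented out, how it might be created and attached. The file location, policy name, and the `{policy-id}` / `{ou-id}` placeholders are assumptions made for the example; the denied action (`cloudtrail:StopLogging`) is just one plausible choice.

```shell
# Write an example SCP that denies stopping CloudTrail logging.
# The file location and names are illustrative.
scp_file="${TMPDIR:-/tmp}/deny-cloudtrail-stop.json"
cat > "$scp_file" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyStopLogging",
      "Effect": "Deny",
      "Action": "cloudtrail:StopLogging",
      "Resource": "*"
    }
  ]
}
EOF

# Creating and attaching the policy requires Organizations permissions in the
# master account; {policy-id} and {ou-id} are placeholders, so these commands
# are left commented out.
# aws organizations create-policy --name deny-cloudtrail-stop \
#     --description "Prevent disabling CloudTrail" \
#     --type SERVICE_CONTROL_POLICY --content "file://$scp_file"
# aws organizations attach-policy --policy-id {policy-id} --target-id {ou-id}
```

Attached to an OU, a statement like this would apply to every account in that OU, regardless of any ALLOW granted inside the accounts themselves.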
Service Catalog is a managed catalog of IT services to be used within organizations. Serves as an organizational catalog for the cloud. Supports lifecycle for service releases.
Resource Access Manager (RAM) allows sharing resources with other AWS accounts, outside or within an Organization.
Resources (not all):
- VPC subnets (can not be from the default VPC) - accounts can not view, modify or delete resources owned by other accounts on the subnet, but resources can communicate with each other using private IPs and reference security groups
- Transit Gateway
- Route53 Resolver Rules
- License Manager Configurations
Control Tower creates a multi-account environment that follows best practices in operational efficiency, security, and governance.
Centralizes users across all accounts, provides templates that can be used to create new accounts, and integrates Guardrails (specific protections are turned on, e.g. CloudTrail).
Systems Manager provides operational data and automation across infrastructure (e.g. updating a system library on a predefined set of VMs).
Gives a secure way of accessing servers using AWS credentials. Securely stores commonly used parameters.
OpsWorks - configuration management service. Provides managed instances of Chef and Puppet.
WorkSpaces - managed cloud desktop (Virtual Desktop Infrastructure). Paid on demand. Integrated with Microsoft Active Directory. Supports Linux and Windows.
Service limits, also called quotas, cap the usage of cloud resources.
API rate limits set the maximum number of API requests a client can make. Every API has its own limits, e.g. S3 GetObject, EC2 DescribeInstances. Clients can either implement an exponential backoff strategy (intermittent `ThrottlingException` errors are usually a sign that it is needed; the AWS SDK already has a mechanism for it) or ask AWS to increase the API throttling limit.
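For scripts that call the CLI directly, exponential backoff can be sketched as a small wrapper. This is a simplified illustration, not what the SDK does internally: the function name and parameters are made up for the example, the delay simply doubles after each failure, and the random jitter that production retry logic normally adds is omitted.

```shell
# retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached, doubling the
# wait between attempts (delay, 2*delay, 4*delay, ...).
retry_with_backoff() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while true; do
    if "$@"; then
      return 0                      # command succeeded
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))            # double the wait for the next attempt
    attempt=$((attempt + 1))
  done
}

# Usage (needs configured credentials, so it is left commented out):
# retry_with_backoff 5 1 aws ec2 describe-instances
```

The wrapper retries on any non-zero exit code; a stricter version would inspect the error output and retry only on throttling errors.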
Service Quotas set the maximum amount of resources for a given service, e.g. the maximum number of vCPUs for on-demand instances. A client can create a ticket requesting an increase; this is also possible using the Service Quotas API.
CodeCommit - managed source code repository. Uses git and controls access with IAM policies.
CodeBuild - continuous integration service.
CodeDeploy - managed deployment service; works with EC2, Fargate, Lambda and on-prem.
CodePipeline - works with the previously mentioned services to create a pipeline (continuous delivery).
CodeStar - bootstrapping; complete continuous delivery toolchain for custom applications.
CLI is also available in the AWS Console as CloudShell (terminal icon to the right of the search bar). Not available in all regions. Automatically assumes the credentials of the current user - no configuration needed.
Various examples; omitting the `--profile NAME` parameter implies the use of the default profile, specify a profile if another one should be used:
# create an s3 bucket, mb stands for make bucket
$ aws s3 mb s3://test-bucket
# copy files between S3 and the local file system (e.g. on EC2), or between buckets:
$ aws s3 cp s3://test-bucket/file.txt file.txt
$ aws s3 cp file.txt s3://test-bucket/file.txt
$ aws s3 cp s3://test-bucket1/file.txt s3://test-bucket/file.txt
# sync directories between S3 and the local file system (sync works on directories and prefixes, not single files):
$ aws s3 sync s3://test-bucket1 s3://test-bucket2
$ aws s3 sync . s3://test-bucket
$ aws s3 sync s3://test-bucket1 .
# create an ec2 instance
$ aws ec2 run-instances --image-id {take from images page} --instance-type t2.micro
# list instances (shows full info in JSON format):
$ aws ec2 describe-instances
# query specific info from previous command (second line also utilizes filtering):
$ aws ec2 describe-instances --query 'Reservations[].Instances[].PublicIpAddress'
$ aws ec2 describe-instances --query 'Reservations[].Instances[].PublicIpAddress' --filters "Name=platform,Values=windows"
# stop/terminate an instance:
$ aws ec2 stop-instances --instance-ids {id}
$ aws ec2 terminate-instances --instance-ids {id}
Create roles on the same page as IAM->User, then attach one to a running EC2 instance or select it when creating the instance; access tokens are then rotated automatically and can be accessed like this:
$ curl 169.254.169.254/latest/meta-data/iam/security-credentials/
$ curl 169.254.169.254/latest/meta-data/iam/security-credentials/{output from previous command}/
Configure credentials (you will be asked for the Access Key ID and Secret Access Key, which are obtained from the IAM->User->Create User page and can be downloaded as csv); this is the hard-coded way, not preferred (use roles instead).
$ aws configure [--profile NAME]
$ cat ~/.aws/credentials
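For orientation, the credentials file is a plain INI-style file with one section per profile. The sketch below writes a sample into a temporary directory rather than touching the real `~/.aws/credentials`; the `dev` profile name and the angle-bracket values are placeholders, not real keys.

```shell
# Write a sample credentials file to a temporary directory (illustrative only).
demo_dir=$(mktemp -d)
cat > "$demo_dir/credentials" <<'EOF'
[default]
aws_access_key_id = <access key id>
aws_secret_access_key = <secret access key>

[dev]
aws_access_key_id = <access key id of the dev profile>
aws_secret_access_key = <secret access key of the dev profile>
EOF
chmod 600 "$demo_dir/credentials"   # keep the file readable by the owner only

# The CLI can be pointed at a non-default file through an environment variable
# (needs real keys in the file, so it is left commented out):
# AWS_SHARED_CREDENTIALS_FILE="$demo_dir/credentials" aws s3 ls --profile dev
```

Each `[name]` section corresponds to a profile that can be selected with `--profile name`.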
CLI looks for credentials in this order:
- Command line options, e.g. `--region`, `--profile`
- Environment variables, e.g. `AWS_ACCESS_KEY_ID`
- Credentials file - `~/.aws/credentials`
- Configuration file - `~/.aws/config`
- Container credentials (ECS tasks)
- Instance profile
By default CLI uses a page size of 1,000 items, which means one API call to retrieve 1,000 items. If a given command implies retrieving 2,500 items, default CLI behavior would be making 3 separate API calls; results, however, are merged together before being displayed.
- `--page-size <number>` - adjusts the background behavior of the API calls made to execute the command, retrieving only that number of items per API call. Could help to eliminate timeout errors caused by too many items being retrieved at once.
- `--max-items <number>` - limits the number of items retrieved and displayed; the response also includes NextToken to be able to retrieve the next set of items. No NextToken attribute in the response indicates there are no more items to retrieve.
- `--starting-token <token>` - accepts a token to retrieve the next set of items
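The relationship between item count, page size, and the number of API calls is a ceiling division, which the sketch below reproduces for the 2,500-item example. The helper name is illustrative, and the commented CLI commands use a placeholder bucket name.

```shell
# api_calls_needed TOTAL_ITEMS PAGE_SIZE
# Number of API calls the CLI makes to fetch TOTAL_ITEMS (ceiling division).
api_calls_needed() {
  total_items=$1
  page_size=$2
  echo $(( (total_items + page_size - 1) / page_size ))
}

api_calls_needed 2500 1000   # prints 3
api_calls_needed 2500 100    # prints 25

# The corresponding CLI options (needs credentials, so left commented out):
# aws s3api list-objects-v2 --bucket test-bucket --page-size 100 --max-items 250
# aws s3api list-objects-v2 --bucket test-bucket --max-items 250 --starting-token <NextToken>
```

A smaller page size means more calls, but each call returns sooner, which is why it can help with timeouts.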
Marketplace - software from third-party providers.
Step Functions enables orchestration of workflows. Supports serverless architecture. Charges occur on state transitions and services leveraged. A workflow is defined using Amazon States Language.
In general, Step Functions helps to visualize serverless application(s), and automate and track triggers for each step. Usually the output of one step is the input of the next; all state changes and actions are logged. Mainly used to orchestrate Lambda functions, but can also be used with EC2, ECS, on-prem servers, API Gateway.
Flow is represented as a JSON state machine. Task is a single step or unit of work within state machine.
Step Functions provides different types of workflows that apply to different types of tasks that need to be automated/orchestrated.
Standard workflow fits well for long-running, durable workflows that could run up to a year. Full execution history is available up to 90 days after execution. By default tasks are not executed more than once, unless retry is explicitly stated. Works for non-idempotent actions.
Express workflow is designed for short-lived (up to 5 mins), high-volume, event-driven workflows. Tasks are assumed to potentially run more than once or concurrently, so it works well for idempotent actions. Synchronous and asynchronous types are available: the first starts the workflow, waits for completion, and returns the result, while the second starts the workflow and doesn't return anything; results can later be found in the logs.
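To make the state machine format concrete, here is a minimal sketch of an Amazon States Language definition with two Task states run in sequence, the first with an explicit retry. The state names, file location, and Lambda ARNs (including the account ID and region) are invented for the example, and the `{role-arn}` in the commented command is a placeholder.

```shell
# Write an example state machine definition (names and ARNs are illustrative).
sm_file="${TMPDIR:-/tmp}/order-workflow.asl.json"
cat > "$sm_file" <<'EOF'
{
  "Comment": "Two Lambda tasks run in sequence; the first is retried on failure",
  "StartAt": "ValidateOrder",
  "States": {
    "ValidateOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:validate-order",
      "Retry": [
        {
          "ErrorEquals": ["States.TaskFailed"],
          "IntervalSeconds": 2,
          "MaxAttempts": 3,
          "BackoffRate": 2.0
        }
      ],
      "Next": "ChargeCustomer"
    },
    "ChargeCustomer": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:charge-customer",
      "End": true
    }
  }
}
EOF

# Creating the state machine needs an IAM role and credentials, so the command
# is left commented out; --type can be STANDARD or EXPRESS.
# aws stepfunctions create-state-machine --name order-workflow \
#     --definition "file://$sm_file" --role-arn {role-arn} --type STANDARD
```

`StartAt` names the first state, `Next` chains states so that the output of one becomes the input of the next, and `End` marks the last state.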
Simple Workflow Service (SWF) coordinates work done by multiple applications running on EC2.
Step Functions is recommended for new applications, but SWF can be used if there is a need to intervene in the process or return values from child processes to the parent process.
AppSync stores and syncs data across mobile and web apps. Uses GraphQL.