CloudWatch - xe1gyq/amazon Wiki

CloudWatch

Agenda

  1. AWS Observability Options
  2. CloudWatch Dashboards
  3. CloudWatch Metric Alarms and Anomaly Detection Labs
  4. Break
  5. CloudWatch Composite Alarms
  6. CloudWatch Logs Insights
  7. CloudWatch Contributor Insights
  8. CloudWatch Managed Service for Prometheus
  9. Amazon Managed Service for Prometheus
  10. Wrap-Up

AWS Observability Options

Everything fails, all the time

  • Grow new revenue streams
  • Improve operational and financial efficiency
  • Lower business risk

Monitoring

  • Behaving as expected?
  • Its usage?
  • Its business impact?

Observability

Detect, Investigate, Remediate

Components

  • Logs
  • Metrics
  • Traces

Full-Stack Observability at AWS

  • CloudWatch Metrics
  • CloudWatch Logs
  • AWS X-Ray
  • Cloud
  • EventBridge

Why

AWS customer footprint: 6 quadrillion metric observations, 3.5 exabytes of logs per month, 32 trillion events ingested monthly

  • Applications & Infrastructure
  • Collect metrics in AWS and on premises
  • Performance and resource optimizations
  • Operational visibility and insight
  • Actionable insights

Open Source

Solid contributors

  • Amazon Managed Service for Prometheus
  • Amazon OpenSearch Engine
  • Grafana

CloudWatch

Monitoring service for AWS cloud resources and for the applications you run on AWS and on premises.

Benefits

  • Observability for applications and infrastructure
  • Collect metrics
  • Improve performance and utilization
  • Visibility
  • Actionable insight

How it works

CloudWatch Integration

  1. Collect
  2. Monitor
  3. Act
  4. Analyze

Collect

Amazon CloudWatch Logs service: collect and store logs from resources, applications, and services in near real time.

  • Amazon EC2 instances
  • On-premises servers
  • VPC Flow Logs
  • AWS CloudTrail
  • AWS Lambda
  • Other AWS Services

Metrics

  • Common
  • Custom
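A custom metric is one you publish yourself, as opposed to the common metrics that AWS services emit on their own. The following is a minimal sketch of how one data point could be prepared for the CloudWatch PutMetricData API. The metric name, namespace, and dimension are hypothetical, and the `publish` helper assumes boto3 and AWS credentials are available; it is not called here.

```python
from datetime import datetime, timezone


def build_custom_metric(name, value, unit="Count", dimensions=None):
    """Build one MetricData entry in the shape PutMetricData expects."""
    return {
        "MetricName": name,
        "Dimensions": [
            {"Name": key, "Value": val} for key, val in (dimensions or {}).items()
        ],
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(value),
        "Unit": unit,
    }


def publish(namespace, metric_data):
    """Send the data points to CloudWatch (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder above works without it

    boto3.client("cloudwatch").put_metric_data(
        Namespace=namespace, MetricData=metric_data
    )


# Hypothetical metric: orders processed by a checkout service.
datum = build_custom_metric("OrdersProcessed", 3, dimensions={"Service": "checkout"})
print(datum["MetricName"], datum["Value"], datum["Dimensions"])
```

To actually send the data point, something like `publish("Workshop/Checkout", [datum])` would be the next step; the namespace is again only an example.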

Monitor

  • Dashboards
    • Amazon CloudWatch Dashboards enable you to create reusable graphs and visualize your cloud resources and applications in a unified view.
  • High Resolution Alarms
    • Set a threshold
  • Logs and Metrics Correlation
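As a rough illustration of "set a threshold" for a high-resolution alarm, the sketch below builds the keyword arguments for the CloudWatch PutMetricAlarm API. The alarm name, namespace, metric name, and threshold are hypothetical. A sub-minute period such as 10 seconds only makes sense if the metric itself is published at high resolution, which is an assumption here.

```python
def build_threshold_alarm(name, namespace, metric, threshold, period_seconds=10):
    """Return keyword arguments for a static-threshold metric alarm."""
    return {
        "AlarmName": name,
        "Namespace": namespace,
        "MetricName": metric,
        "Statistic": "Average",
        # A 10 second period assumes a high-resolution metric.
        "Period": period_seconds,
        # Alarm only when 3 consecutive periods breach the threshold.
        "EvaluationPeriods": 3,
        "DatapointsToAlarm": 3,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
    }


def create_alarm(alarm_kwargs):
    """Create or update the alarm (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder above works without it

    boto3.client("cloudwatch").put_metric_alarm(**alarm_kwargs)


# Hypothetical alarm: average queue latency above 80 ms.
alarm = build_threshold_alarm("queue-latency-high", "Workshop/Queue", "LatencyMs", 80)
print(alarm["AlarmName"], alarm["Threshold"], alarm["Period"])
```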

Act

  • Auto Scaling
    • Automate capacity and resource planning
  • Automation
    • Send message to Slack
    • Disaster Recovery
  • Custom Operations on Metrics
    • Perform calculations across multiple metrics
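A calculation across multiple metrics can be expressed with metric math. The sketch below builds a query list for the CloudWatch GetMetricData API that derives a Lambda error rate from the `Errors` and `Invocations` metrics. The function name `my-function` is hypothetical, and the `fetch` helper assumes boto3 and AWS credentials; it is not called here.

```python
def build_error_rate_queries(function_name, period_seconds=300):
    """Build GetMetricData queries: two raw metrics plus one math expression."""

    def raw_metric(query_id, metric_name):
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Lambda",
                    "MetricName": metric_name,
                    "Dimensions": [
                        {"Name": "FunctionName", "Value": function_name}
                    ],
                },
                "Period": period_seconds,
                "Stat": "Sum",
            },
            # Inputs to the expression only; do not return them.
            "ReturnData": False,
        }

    return [
        raw_metric("errors", "Errors"),
        raw_metric("invocations", "Invocations"),
        {
            "Id": "error_rate",
            # The expression refers to the Ids of the queries above.
            "Expression": "100 * errors / invocations",
            "Label": "Error rate (%)",
            "ReturnData": True,
        },
    ]


def fetch(queries, start, end):
    """Run the queries (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder above works without it

    return boto3.client("cloudwatch").get_metric_data(
        MetricDataQueries=queries, StartTime=start, EndTime=end
    )


queries = build_error_rate_queries("my-function")
print([q["Id"] for q in queries])
```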

Analyze

Dashboard

  • Widgets
  • Custom
  • Automated
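A dashboard is defined by a JSON body containing a list of widgets. The following sketch builds a body with one text widget and one metric widget and shows how it could be handed to the CloudWatch PutDashboard API. The instance ID, region, and dashboard name are placeholders, and the `save` helper assumes boto3 and AWS credentials; it is not called here.

```python
import json


def build_dashboard_body(instance_id, region="us-east-1"):
    """Return a dashboard body (as a dict) with a text and a metric widget."""
    return {
        "widgets": [
            {
                "type": "text",
                "x": 0, "y": 0, "width": 24, "height": 2,
                "properties": {"markdown": "# Workshop dashboard"},
            },
            {
                "type": "metric",
                "x": 0, "y": 2, "width": 12, "height": 6,
                "properties": {
                    # Each metric is [namespace, name, dim name, dim value].
                    "metrics": [
                        ["AWS/EC2", "CPUUtilization", "InstanceId", instance_id]
                    ],
                    "period": 300,
                    "stat": "Average",
                    "region": region,
                    "title": "EC2 CPU utilization",
                },
            },
        ]
    }


def save(name, body):
    """Create or replace the dashboard (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder above works without it

    boto3.client("cloudwatch").put_dashboard(
        DashboardName=name, DashboardBody=json.dumps(body)
    )


# Placeholder instance ID, for illustration only.
body = build_dashboard_body("i-0123456789abcdef0")
body_json = json.dumps(body)
print(len(body["widgets"]), "widgets,", len(body_json), "bytes of JSON")
```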

Labs

First Labs

Then complete the Dashboard Labs (30 minutes).

NOTE: When you need to run CLI commands, you can use AWS CloudShell instead of the suggested Cloud9 IDE.

Proceed to the Custom Widgets Lab (20 minutes).

CloudWatch Composite Alarms

For the question about the bitmap graph custom widget: the stack it deploys is a Python-based Lambda function that the widget uses to render CloudWatch graphs on the server side instead of the client side, which can speed up the display of complex graphs or large numbers of graphs. All of the example custom widgets can be found, with short descriptions, in this aws-samples repo: https://github.com/aws-samples/cloudwatch-custom-widgets-samples
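To make the idea of a Lambda-backed custom widget more concrete, here is a much simpler sketch than the bitmap graph sample: a handler that returns HTML for the dashboard to display. It assumes the event shape used by the samples in the repo above (a `describe` flag when the widget's documentation is requested, and a `widgetContext` object carrying the dashboard name); the `name` parameter is hypothetical.

```python
import html

# Markdown shown when CloudWatch asks the widget to describe itself.
DOCS = "## Hello widget\nRenders a greeting. Parameter: `name` (string)."


def lambda_handler(event, context):
    """Return HTML that CloudWatch renders inside the custom widget."""
    if event.get("describe"):
        return DOCS

    # "name" is a hypothetical parameter configured on the widget.
    name = event.get("name", "world")
    widget_context = event.get("widgetContext", {})
    dashboard = widget_context.get("dashboardName", "unknown")

    # Escape values before placing them in the HTML.
    return (
        f"<h2>Hello, {html.escape(name)}</h2>"
        f"<p>Dashboard: {html.escape(dashboard)}</p>"
    )


# Local smoke test with a hand-written event; no AWS access is needed.
sample_event = {"name": "Ops", "widgetContext": {"dashboardName": "main"}}
print(lambda_handler(sample_event, None))
```

Because the rendering happens inside the function, the same pattern extends to heavier work such as producing graph images on the server side, which is what the bitmap graph sample does.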

Questions

  • Please point to AWS Blueprints example or Architectural samples that can get you started faster for each of the products you are describing.

You can use this landing page for Quick Starts, and filter by "Management & Governance" on the left side of the screen for our logging and monitoring Quick Starts: https://aws.amazon.com/quickstart/?solutions-all.sort-by=item.additionalFields.sortDate&solutions-all.sort-order=desc&awsf.filter-content-type=all&awsf.filter-tech-category=tech-category%23mgmt-govern&awsf.filter-industry=all

You can also use this Reference Architecture landing page to filter by domain or search for a specific service: https://aws.amazon.com/architecture/?cards-all.sort-by=item.additionalFields.sortDate&cards-all.sort-order=desc&awsf.content-type=all&awsf.methodology=all&awsf.tech-category=all&awsf.industries=all&cards-all.q=CloudWatch%2B&cards-all.q_operator=AND

Open Questions

  1. Collating two separate log lines into a single metric. Example: Lambda invocation for an endpoint and the Lambda execution metrics.
  2. Pass variable into a query on the dashboard to filter a specific set of data.
  3. Run a query on a schedule and based on the results generate an alarm

For #3: follow the steps here to create a metric filter, and then create an alarm on the resulting metric: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CreateMetricFilterProcedure.html

For #2: this seems to be possible via custom widgets, but I have not tested it myself yet: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/add_custom_widget_samples.html
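The answer to #3 can be sketched as two API payloads: a metric filter that turns matching log lines into a metric, and an alarm on that metric. The log group, filter name, namespace, and filter pattern below are hypothetical, and the `apply` helper assumes boto3 and AWS credentials; it is not called here.

```python
def build_metric_filter(log_group, metric_name, namespace, pattern="ERROR"):
    """Keyword arguments for CloudWatch Logs PutMetricFilter."""
    return {
        "logGroupName": log_group,
        "filterName": f"{metric_name}-filter",
        "filterPattern": pattern,
        "metricTransformations": [
            {
                "metricName": metric_name,
                "metricNamespace": namespace,
                "metricValue": "1",   # count 1 for every matching log line
                "defaultValue": 0.0,  # report 0 when nothing matches
            }
        ],
    }


def build_filter_alarm(metric_name, namespace, threshold=1):
    """Keyword arguments for PutMetricAlarm on the filtered metric."""
    return {
        "AlarmName": f"{metric_name}-alarm",
        "Namespace": namespace,
        "MetricName": metric_name,
        "Statistic": "Sum",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",
    }


def apply(filter_kwargs, alarm_kwargs):
    """Create both resources (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builders above work without it

    boto3.client("logs").put_metric_filter(**filter_kwargs)
    boto3.client("cloudwatch").put_metric_alarm(**alarm_kwargs)


# Hypothetical names, for illustration only.
log_filter = build_metric_filter("/workshop/app", "AppErrors", "Workshop/App")
error_alarm = build_filter_alarm("AppErrors", "Workshop/App")
print(log_filter["filterName"], "->", error_alarm["AlarmName"])
```

Note that a metric filter evaluates log events as they arrive rather than running a Logs Insights query on a schedule, so it covers the "alarm on log content" part of the question, not arbitrary scheduled queries.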