Project Idea ‐ EKS ECS (Campus-Castolo/m300)

🚀 Project Overview

This project aims to build and deploy a scalable and highly available application infrastructure using AWS services. A custom Docker image, built from a GitHub repository, is pushed to Amazon Elastic Container Registry (ECR) and deployed on either an ECS (Elastic Container Service) or EKS (Elastic Kubernetes Service) cluster.
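
As a small illustration of the image path described above, the sketch below derives the ECR image reference that a CI job would push and the cluster would pull. The account ID, region, repository name `m300-app`, and the choice to tag images with a shortened commit SHA are assumptions made for illustration; the project does not specify them.

```python
import re


def image_tag_from_commit(commit_sha: str, length: int = 12) -> str:
    """Derive an image tag from a Git commit SHA (assumed tagging scheme)."""
    return commit_sha[:length].lower()


def ecr_image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Build the registry path ECR uses for a private repository image."""
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("AWS account IDs are 12 digits")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


if __name__ == "__main__":
    # Hypothetical values; replace with the real account, region, and repository.
    tag = image_tag_from_commit("A1B2C3D4E5F6A7B8C9D0")
    print(ecr_image_uri("123456789012", "eu-central-1", "m300-app", tag))
```

Tagging by commit SHA rather than `latest` is one possible design choice: it keeps every deployed image traceable to the commit that produced it.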

The backend relies on a primary Amazon RDS instance, which is replicated asynchronously to a read-only RDS replica in a different Availability Zone. The replica offloads read traffic and can be promoted manually if the primary fails; unlike a Multi-AZ deployment, it does not fail over automatically. Monitoring and logging are handled by Amazon CloudWatch. For backups, snapshots of the primary RDS database are taken on a schedule and exported to an Amazon S3 bucket.
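
To make the role of the read-only replica concrete, here is a minimal sketch of how an application might choose between the two database endpoints. The `DbEndpoints` type, the `pick_endpoint` helper, and the hostnames are hypothetical, and the prefix check is deliberately naive: it is not a SQL parser, and reads served by the replica may be slightly stale because of replication lag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbEndpoints:
    primary: str  # writer endpoint (primary instance)
    replica: str  # read-only endpoint (replica in the other AZ)


# Statement prefixes treated as read-only in this sketch.
READ_PREFIXES = ("select", "show", "explain")


def pick_endpoint(sql: str, endpoints: DbEndpoints, replica_healthy: bool = True) -> str:
    """Send plain reads to the replica and everything else to the primary."""
    statement = sql.lstrip().lower()
    is_read = statement.startswith(READ_PREFIXES)
    takes_lock = " for update" in statement  # locking reads must hit the primary
    if is_read and not takes_lock and replica_healthy:
        return endpoints.replica
    return endpoints.primary


if __name__ == "__main__":
    # Hypothetical hostnames for illustration only.
    eps = DbEndpoints("primary.example.internal", "replica.example.internal")
    print(pick_endpoint("SELECT * FROM users", eps))
    print(pick_endpoint("UPDATE users SET name = 'x'", eps))
```

Falling back to the primary when the replica is unhealthy keeps reads available at the cost of extra load on the writer.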

This infrastructure supports high availability, fault tolerance, continuous monitoring, and disaster recovery.


🛠️ Technologies and Tools

| Category | Tool/Service | Purpose |
|---|---|---|
| Version Control | GitHub | Source code repository and Docker image build trigger |
| Containerization | Docker | Containerizing the application |
| Container Registry | Amazon ECR | Storing and versioning Docker images |
| Orchestration | Amazon ECS or EKS | Deploying and managing containers at scale |
| Database (Primary) | Amazon RDS (MySQL/PostgreSQL) | Persistent data storage for the application |
| Database (Replica) | Amazon RDS Read Replica | Redundant, read-optimized copy of the primary DB for failover/load |
| Monitoring | Amazon CloudWatch | Monitoring performance and logging |
| Backup | Amazon S3 + Lambda | Automated snapshot storage for disaster recovery |
| IaC | Terraform | Infrastructure automation and reproducibility |
| Networking | Amazon VPC, ALB | Secure traffic routing and load balancing |
| Security | IAM, Security Groups, Secrets Manager | Access control, secrets management, and secure communication |

🎯 Goals and Functionality

  • Automated CI/CD Pipeline

    • Build Docker image on GitHub push
    • Push to Amazon ECR
    • Deploy to ECS/EKS cluster
  • Highly Available Database Layer

    • Deploy primary RDS instance in AZ-1
    • Deploy read replica in AZ-2 for read scaling and manual failover (promotion)
  • Centralized Monitoring

    • Monitor cluster and RDS metrics using CloudWatch
    • Ingest logs and raise alerts on performance anomalies
  • Automated Backups

    • Take regular snapshots of the primary RDS
    • Use Lambda to export snapshots to S3 with timestamped names and tags
  • Scalable and Secure Infrastructure

    • Load balancing using ALB
    • Secure access using IAM roles and secrets stored in AWS Secrets Manager
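
The automated backup goal above could be sketched as a scheduled Lambda function along the following lines. RDS snapshots are not ordinary files that can be copied into a bucket; one way to land them in S3 is the RDS snapshot export feature (`start_export_task`), which writes the data as Apache Parquet and requires a KMS key and an IAM role that can write to the bucket. The environment variable names, the `m300-export-` identifier prefix, and the `rds-exports/` S3 prefix are assumptions for illustration. The sketch also ignores pagination of the snapshot listing.

```python
import os
from datetime import datetime, timezone


def latest_snapshot_arn(rds_client, db_instance_id):
    """Return the ARN of the newest available automated snapshot."""
    response = rds_client.describe_db_snapshots(
        DBInstanceIdentifier=db_instance_id, SnapshotType="automated"
    )
    available = [s for s in response["DBSnapshots"] if s.get("Status") == "available"]
    if not available:
        raise RuntimeError(f"no available snapshot for {db_instance_id}")
    return max(available, key=lambda s: s["SnapshotCreateTime"])["DBSnapshotArn"]


def build_export_request(snapshot_arn, bucket, role_arn, kms_key_id, now=None):
    """Build start_export_task parameters with a timestamped name and prefix."""
    now = now or datetime.now(timezone.utc)
    return {
        "ExportTaskIdentifier": f"m300-export-{now:%Y%m%d-%H%M%S}",  # assumed naming
        "SourceArn": snapshot_arn,
        "S3BucketName": bucket,
        "S3Prefix": f"rds-exports/{now:%Y/%m/%d}",  # assumed layout
        "IamRoleArn": role_arn,
        "KmsKeyId": kms_key_id,
    }


def handler(event, context, rds_client=None):
    """Lambda entry point; rds_client can be injected for local testing."""
    if rds_client is None:
        import boto3  # provided by the AWS Lambda Python runtime

        rds_client = boto3.client("rds")
    # Hypothetical environment variables; set them on the Lambda function.
    snapshot_arn = latest_snapshot_arn(rds_client, os.environ["DB_INSTANCE_ID"])
    request = build_export_request(
        snapshot_arn,
        bucket=os.environ["EXPORT_BUCKET"],
        role_arn=os.environ["EXPORT_ROLE_ARN"],
        kms_key_id=os.environ["EXPORT_KMS_KEY_ID"],
    )
    rds_client.start_export_task(**request)
    return {"exported": snapshot_arn, "task": request["ExportTaskIdentifier"]}
```

An EventBridge schedule would invoke `handler` periodically; retention of old exports is better left to an S3 lifecycle rule than to the function itself.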

⚠️ Probable Challenges

| Area | Potential Challenge | Suggested Mitigation |
|---|---|---|
| ECR Authentication | Handling access tokens from GitHub to ECR | Use OIDC or deploy GitHub Actions runner inside AWS |
| RDS Replication Lag | Delay between primary and replica under high write loads | Monitor replica lag and set CloudWatch alarms |
| Cluster Scaling | Handling unexpected traffic spikes | Enable auto-scaling for ECS tasks or EKS pods |
| Snapshot Automation | Ensuring consistent, timestamped backups and lifecycle policies for S3 | Use AWS Lambda + EventBridge scheduler with tags |
| Cost Management | ECS/EKS, RDS, and S3 can incur significant costs | Use budgeting alerts and clean up unused resources regularly |
| Terraform State Handling | Managing state files securely and collaboratively | Use remote backends like S3 + DynamoDB for locking |
| Security Compliance | Handling secrets, enforcing least-privilege IAM policies | Use AWS Secrets Manager + strict IAM role definitions |
| Multi-AZ Network Design | Ensuring consistent connectivity and failover between zones | Carefully design subnets, routing tables, and ALB configuration |
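
For the replication lag mitigation listed above, a CloudWatch alarm on the `ReplicaLag` metric (reported in seconds in the `AWS/RDS` namespace) is one option. The sketch below only builds the parameters for `put_metric_alarm`; the replica identifier, the 30-second threshold, the five-minute evaluation window, and treating missing data as breaching are all assumptions to tune for the real workload.

```python
def replica_lag_alarm(replica_id, threshold_seconds=30.0, sns_topic_arn=None):
    """Build put_metric_alarm parameters for an RDS read replica lag alarm."""
    params = {
        "AlarmName": f"{replica_id}-replica-lag",
        "Namespace": "AWS/RDS",
        "MetricName": "ReplicaLag",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": replica_id}],
        "Statistic": "Average",
        "Period": 60,  # one-minute datapoints
        "EvaluationPeriods": 5,  # lag must stay high for five minutes (assumed)
        "Threshold": float(threshold_seconds),
        "ComparisonOperator": "GreaterThanThreshold",
        # Assumed policy: no datapoints may mean replication has stopped.
        "TreatMissingData": "breaching",
    }
    if sns_topic_arn:
        params["AlarmActions"] = [sns_topic_arn]
    return params


def create_alarm(replica_id, **kwargs):
    """Create the alarm in the current AWS account (requires boto3 and credentials)."""
    import boto3

    boto3.client("cloudwatch").put_metric_alarm(**replica_lag_alarm(replica_id, **kwargs))


if __name__ == "__main__":
    # Hypothetical replica name; prints the parameters without calling AWS.
    print(replica_lag_alarm("m300-replica"))
```

Since the project uses Terraform, the same alarm would more likely be declared there; the Python form is only meant to show which metric and dimensions are involved.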