Kubernetes - 100-hours-a-week/3-team-ssammu-wiki GitHub Wiki

1. Limitations of the Existing System

Running on Docker Compose imposed structural limits that led to the following problems:

  • No orchestration: no orchestrated container-level health checks or automatic recovery
  • Limited network control: the ALB can distribute traffic, but fine-grained communication policies between containers cannot be applied
  • Weak monitoring and incident response: Compose has no built-in state monitoring, so operations depended on manual checks
  • Insufficient security policy: networks can be separated, but service-to-service authentication and authorization (mTLS) is not supported
  • Limited scalability: no support for managing large numbers of containers or for automating clustering

2. Background for Adopting Kubernetes

Kubernetes is adopted to overcome the Docker Compose limitations and to meet the following requirements:

  • High availability (HA): zero-downtime operation and automatic failure recovery (self-healing)
  • Scalability: multi-node operation and autoscaling
  • Control of service-to-service communication: Kubernetes NetworkPolicy
  • Operations automation: deployment, recovery, and monitoring
  • Multi-cloud support: interconnected AWS and GCP environments

3. Architecture Diagram

[Figure: SSMU-Kubernetes architecture diagram]

3.1 Overall Flow

External → AWS ALB → Kubernetes NGINX Ingress Controller → Kubernetes Service → Pod

3.2 Infrastructure Layout

| Component | AWS | GCP |
|---|---|---|
| Master Node | Private Subnet (kubeadm init) | Private Subnet (kubeadm init) |
| Worker Node | AZ 2a/2c Private Subnet (kubeadm join) | Private Subnet (kubeadm join); CPU-only and GPU worker nodes |
| Storage | EBS (automatic PVC provisioning) | Persistent Disk (PD) |

4. Key Technology Stack and Rationale

| Area | Chosen Technology | Main Reason |
|---|---|---|
| Orchestration | kubeadm | Standard installation, fine-grained customization |
| Network (CNI) | Calico | L3-based, full support for security policies |
| Container runtime | containerd | Official Kubernetes CRI support, lightweight |
| Ingress | NGINX Ingress Controller | Flexible routing, multi-cloud support |
| Monitoring | CloudWatch, GCP Monitoring | Native cloud integration, cost-efficient |
| Operations automation | GitHub Actions + ArgoCD | SaaS-based CI/CD plus GitOps |
| Storage | EBS, PD | Dynamic PVC provisioning, high performance |

5. Kubernetes Cluster Configuration Details

5.1 Infrastructure by Cloud and AZ

AWS (Control Plane + Data Plane)

  • VPC: 192.168.0.0/16
  • Private Subnet
    • 192.168.110.0/24 (AZ 2a/2c)
      • Kubernetes Master Node
      • Kubernetes Worker Node
    • 192.168.210.0/24 (AZ 2a/2c)
      • Dedicated DB server instances
  • Master Nodes (kubeadm init):
    • AZ 2a: t3.medium (vCPU 2, RAM 4GB)
    • AZ 2c: t3.medium (vCPU 2, RAM 4GB)
  • Worker Nodes (kubeadm join):
    • AZ 2a: m5.large (vCPU 2, RAM 8GB)
    • AZ 2c: m5.large (vCPU 2, RAM 8GB)
  • DB Nodes:
    • AZ 2a: r6i.large (vCPU 2, RAM 16GB)
    • AZ 2c: r6i.large (vCPU 2, RAM 16GB)
    • Storage: gp3 EBS, configured for 3000 IOPS
  • Storage:
    • For Kubernetes: automatic EBS PVC provisioning
    • For DB: separate high-performance gp3 EBS disks
  • Load balancer:
    • AWS ALB receives external traffic

GCP (Control Plane + Data Plane)

  • VPC: 10.0.0.0/16
  • Private Subnet:
    • 10.0.20.0/24 (hosts the Master and Worker Nodes)
  • Master Nodes (kubeadm init):
    • AZ a: e2-medium (vCPU 2, RAM 4GB)
    • AZ b: e2-medium (vCPU 2, RAM 4GB)
  • Worker Nodes (kubeadm join):
    • CPU-only node: e2-medium
    • GPU-only node: g2-standard-8 (NVIDIA L4)
  • Storage:
    • Automatic PVC provisioning backed by GCP Persistent Disk (PD)

5.2 Core Kubernetes Configuration

| Item | Details |
|---|---|
| Pod network | Calico in IP-in-IP mode |
| Control plane HA | Master Nodes duplicated across multiple AZs on both AWS and GCP |
| Data plane | Worker Nodes distributed across AWS AZ 2a/2c and GCP |
| Storage | Dynamic PVC provisioning backed by EBS (AWS) and PD (GCP) |
| Ingress Controller | AWS ALB receives traffic and routes it to NGINX Ingress |
| DNS routing | ALB domain managed through Route53 |
| TLS | AWS ACM certificate issuance and cert-manager automation |
| Network interconnect | Private network communication over AWS-GCP VPC Peering, plus control of service-to-service traffic |
| WAF integration | Web Application Firewall at the ALB level using AWS WAF |
| ECR integration | AWS ECR as the Docker image registry for deployment images |

  • Route53 manages the domain's DNS in front of the AWS ALB
  • AWS ACM certificates are issued for HTTPS and integrated with the NGINX Ingress Controller
  • AWS WAF is applied to the ALB to defend against attacks in the OWASP Top 10
  • Private network communication between AWS and GCP runs over VPC Peering
  • Kubernetes NetworkPolicy controls service-to-service communication
  • Direct access from the Frontend servers to the DB servers (192.168.210.0/24) is blocked
  • Database requests go through the Backend API server
  • Image management and deployment through ECR
    • AWS ECR is the Docker image registry for deployment images
    • Docker images built in GitHub Actions are pushed to ECR
    • Image versions are managed by tag, using the commit SHA or a version number so that rollback and version tracking stay simple
    • Rollback strategy: a faulty release is recovered by redeploying the image under the previous tag

5.3 Design Intent and Expected Benefits

  • Control plane high availability
    • Kubernetes Master Nodes are placed in both AZ 2a and AZ 2c
    • Cluster control stays available even if a single AZ fails
  • Data plane availability
    • Kubernetes Worker Nodes are distributed across AZ 2a and AZ 2c
    • Service traffic recovers without downtime even if one AZ fails
  • DB server high availability
    • Database servers are duplicated across AZ 2a and AZ 2c in the Private Subnet (192.168.210.0/24)
    • The data tier runs without interruption and keeps continuity
  • Data plane scalability
    • Kubernetes Worker Nodes scale horizontally across AWS and GCP
    • Workloads are distributed across both clouds
  • Stronger security
    • Private network communication over AWS-GCP VPC Peering
    • Service-to-service communication controlled by Kubernetes NetworkPolicy
    • Direct access from the Frontend servers to the DB servers is blocked
    • Database requests go through the Backend API server
  • Foundation for operations automation
    • GitOps keeps Kubernetes resources automatically synchronized with the state in Git
    • Manual operational work is minimized

6. Operations and Automation

6.1 Operations Automation Setup

Image Management and Rollback Strategy with ECR

  • AWS Elastic Container Registry (ECR) is the Docker image registry for deployment images
  • Docker images built in GitHub Actions are pushed to ECR
  • Image versions are managed by tag, using the commit SHA or a version number so that rollback and version tracking stay simple
  • Rollback strategy: a faulty release is recovered by redeploying the image under the previous tag

ECR ์ด๋ฏธ์ง€ ๋กค๋ฐฑ ํ๋ฆ„

[์žฅ์•  ๋ฐœ์ƒ ๋˜๋Š” ๋ฐฐํฌ ์‹คํŒจ ๊ฐ์ง€]
        โ†“
[์ด์ „ ์ •์ƒ ๋ฒ„์ „ Docker ์ด๋ฏธ์ง€ ์„ ํƒ (์˜ˆ: :v1.2.3)]
        โ†“
[kubectl set image ๋˜๋Š” Deployment Manifest ์ˆ˜์ •]
        โ†“
[ArgoCD ์ž๋™ Sync ๋˜๋Š” ์ˆ˜๋™ Sync ์ˆ˜ํ–‰]
        โ†“
[์ด์ „ ๋ฒ„์ „ ์„œ๋น„์Šค ์ •์ƒ ๋ณต๊ตฌ]
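
The selection and `kubectl set image` steps in this flow could be scripted roughly as follows. This is a sketch under stated assumptions: the tag history list, the deployment and container name `backend`, and the ECR repository URI (account ID and region included) are hypothetical placeholders, and the script only builds the command rather than running it.

```python
def previous_good_tag(history, failed_tag):
    """Return the tag deployed immediately before failed_tag.

    history is ordered oldest to newest; its contents here are hypothetical.
    """
    index = history.index(failed_tag)
    if index == 0:
        raise ValueError("no earlier tag to roll back to")
    return history[index - 1]


def set_image_command(deployment, container, repository, tag, namespace="default"):
    """Build the argv for `kubectl set image deployment/<name> <container>=<image>`."""
    return [
        "kubectl", "set", "image",
        f"deployment/{deployment}",
        f"{container}={repository}:{tag}",
        "--namespace", namespace,
    ]


history = ["v1.2.2", "v1.2.3", "v1.2.4"]
tag = previous_good_tag(history, "v1.2.4")
cmd = set_image_command(
    "backend", "backend",
    # Placeholder repository URI, not the project's real registry.
    "123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/backend",
    tag,
)
print(" ".join(cmd))
```

If ArgoCD automatic sync with self-healing is enabled, a manual `kubectl set image` is likely to be reverted at the next sync, so changing the tag in the Git manifest is probably the more durable of the two routes named in the flow.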

CI/CD Pipeline

  • GitHub Actions automates application build, test, image build/push, and Kubernetes deployment
  • Creating a PR (Pull Request) automatically runs code quality checks and tests
  • Merging into the main branch builds the Docker image and pushes it to ECR
  • ArgoCD detects the Git change and deploys to Kubernetes automatically

GitOps Operations

  • ArgoCD treats the Git repository as the Single Source of Truth for cluster state
  • The cluster is synchronized automatically to match the state in Git
  • Drift is corrected automatically when it occurs
  • Deployments and rollbacks are performed with Git commits alone

Monitoring and Alerting

  • Baseline monitoring tools
    • AWS CloudWatch (EC2, EBS, ALB Metrics)
    • GCP Monitoring (Compute Engine, PD Metrics)
  • Room for extension
    • Add a Prometheus + Grafana stack if needed
  • Alerting
    • CloudWatch Alarm and GCP Alerting are connected to a Slack Webhook
    • Failures, scaling events, and deployment success or failure are reported to Slack in real time

6.2 Operations Automation Architecture Flow

[GitHub Repository]
    ↓ (Push/PR Event)
[GitHub Actions]
    ↓ (Docker Build/Push, Manifest Update)
[ECR Registry] + [Git Repository change]
    ↓ (Auto Sync)
[ArgoCD]
    ↓ (Deploy/Sync)
[Kubernetes Cluster]
    ↓ (Metrics/Logs Export)
[CloudWatch/GCP Monitoring]
    ↓ (Alarm Trigger)
[Slack Notifications]

6.3 Key Characteristics of Operations Automation

| Item | Description |
|---|---|
| CI/CD automation | GitHub Actions handles build, test, and deployment automatically |
| GitOps | ArgoCD keeps the cluster synchronized with the state in Git |
| Baseline monitoring | AWS CloudWatch and GCP Monitoring are used by default |
| Extensibility | A Prometheus stack can be added later |
| Real-time alerts | Failures and events are reported to Slack |
| Lower operational complexity | Manual deployment and manual intervention are minimized |

7. Failure Recovery and Deployment Flows

7.1 Worker Node Failure Recovery Flow

Failure Recovery Flow

[Worker Node fails]
        ↓
[Kubernetes Node status changes to "NotReady"]
        ↓
[Kubernetes Scheduler automatically reschedules the Pods that were running on that node onto other healthy nodes]
        ↓
[Ingress Controller and ALB automatically route traffic to the newly scheduled Pods]
        ↓
[Service stays available]

7.1-1 Cluster Autoscaler Response Flow

[Worker Node failure causes a capacity shortage]
        ↓
[Cluster Autoscaler automatically adds a new worker node]
        ↓
[Kubernetes Scheduler schedules Pods onto the new node]
        ↓
[Service scaled and recovered]

Key Recovery Mechanisms

  • The Kubernetes Node Controller marks a node "NotReady" when it receives no heartbeat within node-monitor-grace-period
  • To keep the Pod count guaranteed by the ReplicaSet, replacement Pods are created and the Scheduler places them on other healthy nodes
  • AWS ALB and the Ingress Controller route at the Kubernetes Service level, so traffic reconnects automatically even when Pod IPs change
  • When node capacity runs short, Cluster Autoscaler adds worker nodes automatically to spread the load

7.2 Rolling Update Flow

Rolling Update Flow

[New Docker Image Build & Push (GitHub Actions)]
        ↓
[Deployment Manifest Update (GitHub Actions)]
        ↓
[Git Repository update detected (ArgoCD)]
        ↓
[Kubernetes Deployment Rolling Update starts]
        ↓
[Existing Pods are terminated one by one while new-version Pods are started one by one]
        ↓
[Traffic continues uninterrupted through ALB-Ingress-Service]

7.2-1 Considering Blue-Green Deployment

For releases with large traffic swings or high risk, consider a Blue-Green Deployment strategy: deploy the new version alongside the existing one, then switch traffic over for a zero-downtime cutover.

Key Deployment Mechanisms

  • The Rolling Update strategy replaces existing Pods with the new version gradually instead of all at once
  • The maxUnavailable and maxSurge settings preserve availability during a rollout
  • ArgoCD detects Git changes in real time and synchronizes the Deployment automatically
  • When needed, Blue-Green Deployment provides safer releases and faster rollback

7.3 Considerations for Failure Recovery and Deployment

| Category | Consideration |
|---|---|
| Worker node failure recovery | Worker nodes are spread across AZs so that recovery is uninterrupted even if a single AZ fails |
| Rolling update | maxUnavailable: 1 and maxSurge: 1 keep traffic stable |
| Alert integration | Failure recovery and deployment events are reported to Slack in real time |
| Backup and restore | etcd snapshots and PVC volume snapshots are stored in S3 once a day and retained for the last 7 days; a restore test runs once a month |
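
The seven-day retention rule in the backup row can be expressed as a small pruning function. This is a sketch of one reading of the rule (keep the snapshots from the most recent seven calendar days, today included); the snapshot dates are generated for illustration, and the actual deletion from S3 is left out.

```python
from datetime import date, timedelta


def snapshots_to_delete(snapshot_dates, today, keep_days=7):
    """Return snapshot dates that fall outside the retention window, oldest first."""
    cutoff = today - timedelta(days=keep_days)
    return sorted(d for d in snapshot_dates if d <= cutoff)


# Hypothetical history: one snapshot per day for the last ten days.
today = date(2025, 4, 29)
snapshots = [today - timedelta(days=n) for n in range(10)]

expired = snapshots_to_delete(snapshots, today)
print([d.isoformat() for d in expired])
```

With daily snapshots this leaves exactly seven copies. An S3 lifecycle rule could enforce the same window without any script, if the snapshots are stored under a common prefix.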

7.4 Master Node Failure Recovery Flow

Control Plane Failure Recovery Flow

[Master Node fails]
        ↓
[Restore from the etcd snapshot]
        ↓
[Run kubeadm init --skip-phases=etcd on the new Master Node]
        ↓
[Kubernetes API Server and control plane functions restored]
        ↓
[Cluster back to normal]
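
The snapshot restore step matters because etcd only accepts writes while a strict majority of its members is reachable. The sketch below works through that arithmetic. It assumes a stacked topology in which each Master Node runs one etcd member; this page does not describe the etcd topology, so treat that as an assumption.

```python
def quorum(members):
    """Votes needed for etcd to commit a write: a strict majority."""
    return members // 2 + 1


def failures_tolerated(members):
    """Members that can fail while the cluster still has quorum."""
    return members - quorum(members)


for n in (1, 2, 3, 5):
    print(n, quorum(n), failures_tolerated(n))
```

Under that assumption, two members tolerate no failures while three tolerate one, so with two Master Nodes per cloud the loss of either node would call for the restore flow above rather than an automatic failover.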

7.5 Strengthening Network Security

  • Kubernetes NetworkPolicy controls service-to-service communication
  • Direct access from the Frontend servers to the DB servers is blocked
  • Database access is allowed only through the Backend API server
  • Calico IPsec or WireGuard may be applied to encrypt Pod-to-Pod traffic if needed
  • Istio-based mTLS is under review as a way to strengthen service-to-service authentication

📌 This page was last updated on April 29, 2025.
