EKS - 100-hours-a-week/3-team-ssammu-wiki GitHub Wiki

💼 Overview

This phase moves beyond the existing manually operated kubeadm-based cluster and adopts EKS, AWS's managed Kubernetes service, to gain operational automation, scalability, and stability.

🎯 Core Goals

  • Operate a complex service architecture using advanced Kubernetes features
  • Adopt operational automation such as autoscaling, health checks, and rolling updates
  • Integrate EKS with CI/CD, security, storage, and service discovery
  • Separate AWS accounts (Dev/Prod) and build a multi-VPC network (Transit Gateway)
  • Isolate GPU nodes and AI workloads, and ensure data persistence with PVCs

Background and Need for Adoption

🚧 Limitations of the Existing kubeadm-Based Setup

| Item | kubeadm (manual setup on EC2) | EKS (fully managed Kubernetes) |
|---|---|---|
| Cluster management | Manual upgrades; heavy management burden | Automatic patching and management by AWS |
| High availability | Manual intervention required when a master node fails | Control plane replicated automatically (HA configuration) |
| Auto Scaling | Node groups must be configured by hand | Managed Node Groups and HPA supported |
| Network configuration | Manual setup inside the VPC | VPC-CNI plugin and ALB integration supported |
| Security and IAM integration | Authentication must be set up separately | Built-in IAM, RBAC, and OIDC integration |

💡 Why Our Service Needs It

  • Real-time availability is critical for features such as AI interview feedback and CS problem competitions
  • The complex backend and AI workloads need to be managed as microservices
  • GPU node expansion and cost optimization (Spot, Fargate) will be needed later
  • Long-term log storage and an incident response process need to be established

❓ Why Not ECS?

ECS vs EKS Comparison

| Item | ECS | EKS |
|---|---|---|
| Ecosystem integration | Centered on AWS-native tools | CNCF standards; integrates with ArgoCD/Helm/Kustomize |
| Multi-cloud portability | Tied to AWS | Compatible with GKE, AKS, and on-premises K8s |
| Security isolation | Task level | Namespaces + RBAC + NetworkPolicy |
| Cost structure | Simple billing | Flexible mix of Spot and Fargate |
| Deployment strategy | Simple Task restart | Rolling, Blue-Green, Canary (Argo Rollouts) |

🔧 Architecture Design

🖼️ EKS Architecture Diagram

CareerBee_EKS drawio

👥 Cluster Configuration

  • Cluster: careerbee-cluster
  • Namespaces: sys, frontend, backend, ai, monitoring
  • Multi-account IAM strategy: shared (dev) and prod kept separate
  • GPU-only Node Group: for AI workloads (FastAPI)
  • General Node Group (Worker Node): FE (Next.js), BE (Spring Boot)
  • Multi-VPC layout: Dev VPC and Prod VPC operated separately
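
The split between the general and GPU node groups is usually enforced from the Pod side. The manifest below is a minimal sketch of how an AI workload could be pinned to GPU nodes. The Deployment name, image, node label (`workload: ai`), and taint key are all hypothetical, and the `nvidia.com/gpu` resource assumes the NVIDIA device plugin is installed on the GPU nodes.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-inference            # hypothetical name
  namespace: ai
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ai-inference
  template:
    metadata:
      labels:
        app: ai-inference
    spec:
      nodeSelector:
        workload: ai            # assumed label on the GPU node group
      tolerations:
      - key: nvidia.com/gpu     # assumed taint that keeps FE/BE Pods off GPU nodes
        operator: Exists
        effect: NoSchedule
      containers:
      - name: ai-inference
        image: 123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/careerbee-ai:latest  # hypothetical image
        ports:
        - containerPort: 8000   # assumed FastAPI port
        resources:
          limits:
            nvidia.com/gpu: 1   # requires the NVIDIA device plugin
```

Tainting the GPU nodes and tolerating the taint only in the ai namespace keeps the expensive instances free for inference work.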

🌐 Network Traffic Flow

  • Public Subnet → ALB (Ingress Controller)
  • Ingress resource → Blue/Green Service → Blue/Green Pods
  • Service discovery through CoreDNS (<svc>.<ns>.svc.cluster.local)
  • Internal communication: sequential calls FE → BE → AI
  • Database connection: RDS (operated separately for Prod and Dev)
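
As an illustration of the CoreDNS naming scheme, the backend could reach the AI service through its cluster-internal DNS name. This is only a sketch: the ConfigMap name, the Service name `ai-service`, and port 8000 are assumptions, not values taken from the actual cluster.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: backend-upstreams       # hypothetical name
  namespace: backend
data:
  # Pattern: <svc>.<ns>.svc.cluster.local
  # "ai-service" and port 8000 are assumed for illustration.
  AI_BASE_URL: http://ai-service.ai.svc.cluster.local:8000
```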

📶 Network Control - NetworkPolicy

What is a NetworkPolicy?

  • A policy that controls network communication between Pods in Kubernetes
  • By default every Pod can talk to every other Pod; once a NetworkPolicy applies, only the explicitly allowed traffic is permitted
  • Essential for hardening security, separating services, and applying the principle of least privilege

Usage example

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-backend-to-db
  namespace: backend
spec:
  podSelector:
    matchLabels:
      role: db
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: backend
  • This example allows only Pods labeled "role=backend" to reach Pods labeled "role=db"
  • It makes it possible to block direct access from the frontend to the DB so that only the backend can connect
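
An allow rule like the one above is commonly paired with a namespace-wide default-deny policy, so that Pods not covered by any allow rule are also closed to inbound traffic. A minimal sketch follows; the policy name is hypothetical, and it only takes effect if the cluster's CNI enforces NetworkPolicy.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress    # hypothetical name
  namespace: backend
spec:
  podSelector: {}               # empty selector matches every Pod in the namespace
  policyTypes:
  - Ingress                     # no ingress rules listed, so all inbound traffic is denied
```

NetworkPolicies are additive, so the allow-backend-to-db rule continues to permit its traffic on top of this baseline.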

⚡ Kubernetes Autoscaling Summary

| Type | Function | Description |
|---|---|---|
| HPA (Horizontal Pod Autoscaler) | Adjusts the number of Pods automatically | Scales Pods out and in based on CPU, memory, or custom metrics |
| VPA (Vertical Pod Autoscaler) | Adjusts Pod resource requests/limits automatically | Tunes Pod CPU/memory requests dynamically (note: requires a Pod restart) |
| CA (Cluster Autoscaler) | Adjusts the number of nodes automatically | Grows and shrinks the number of cluster nodes (EC2 instances) |
| KEDA (Event-driven Autoscaler) | Scales on external events | Can adjust the Pod count based on event traffic from Kafka, SQS, Redis, and similar sources |
| HPA Custom Metrics | Scales on user-defined metrics | For example, Redis queue length or number of DB connections |

✅ All of these can be implemented on EKS as well

✅ Custom metrics require a separate adapter (for example Prometheus Adapter); FluentBit and CloudWatch alone are not sufficient
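
To make the custom-metrics row concrete, here is a sketch of an HPA that scales on a Redis queue length. The Deployment name `ai-worker`, the metric name `redis_queue_length`, and the thresholds are hypothetical; the metric must be published through the external metrics API by an adapter such as Prometheus Adapter.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-worker-hpa           # hypothetical name
  namespace: ai
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-worker             # hypothetical Deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: External
    external:
      metric:
        name: redis_queue_length   # assumed metric exposed by the adapter
      target:
        type: AverageValue
        averageValue: "30"         # aim for roughly 30 queued items per Pod
```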


🗄️ Storage (Persistent Volume)

  • PVC → PV → EBS structure
  • Guarantees persistence of AI data and models

PVC example:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: chroma-data-pvc
  namespace: ai
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: gp2
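
A workload consumes the claim by referencing it as a volume. The sketch below assumes a single-replica Deployment, because an EBS-backed ReadWriteOnce volume can only be attached to one node at a time. The Deployment name, image, and mount path are hypothetical.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: chroma                  # hypothetical name
  namespace: ai
spec:
  replicas: 1                   # ReadWriteOnce: one node mounts the volume at a time
  strategy:
    type: Recreate              # stop the old Pod before the new one attaches the volume
  selector:
    matchLabels:
      app: chroma
  template:
    metadata:
      labels:
        app: chroma
    spec:
      containers:
      - name: chroma
        image: 123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/careerbee-ai:latest  # hypothetical image
        volumeMounts:
        - name: data
          mountPath: /data      # assumed data directory
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: chroma-data-pvc
```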

📋 Kubernetes Resource Specifications

  • Deployments, HPAs, ConfigMaps, Secrets, and PVCs are deployed per namespace
  • The monitoring namespace runs the FluentBit DaemonSet with its ConfigMap and ServiceAccount

Deployment example (Backend)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  namespace: backend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
      - name: backend
        image: 123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/careerbee-be:prod-1.0.0
        ports:
        - containerPort: 8080
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10

HPA example

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: backend-hpa
  namespace: backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: backend
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

ConfigMap example

apiVersion: v1
kind: ConfigMap
metadata:
  name: backend-config
  namespace: backend
data:
  SPRING_PROFILES_ACTIVE: prod
  DB_HOST: rds.careerbee.ap-northeast-2.rds.amazonaws.com
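
Secrets are listed among the deployed resources but not shown. The sketch below pairs a Secret with the ConfigMap above and shows one way a container could load both as environment variables. The Secret name, key, and placeholder value are hypothetical, and the second document is an excerpt of a pod template rather than a complete manifest.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: backend-secret          # hypothetical name
  namespace: backend
type: Opaque
stringData:
  DB_PASSWORD: change-me        # placeholder; never commit real credentials to Git
---
# Excerpt of the backend Deployment's pod template (not a complete manifest)
containers:
- name: backend
  envFrom:
  - configMapRef:
      name: backend-config      # the ConfigMap defined above
  - secretRef:
      name: backend-secret
```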

🚀 Deployment and Operations Strategy

👷🏻‍♂️ CI/CD Flow (GitOps-based)

  • GitHub Actions
    • Docker Build → ECR Push
    • Update Helm values.yaml → Git Push
  • ArgoCD
    • Detects the Git change → syncs the EKS cluster automatically
  • Slack notification: developers are notified through Slack whether the build/deployment succeeded
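
The GitHub Actions half of this flow could look roughly like the workflow below. It is a sketch under several assumptions: the IAM role ARN, the branch name, the tag scheme, and the chart living in the same repository are all hypothetical, and the Slack step is omitted.

```yaml
# .github/workflows/release.yaml (hypothetical path)
name: build-and-release
on:
  push:
    branches: [main]
permissions:
  id-token: write               # OIDC federation to AWS
  contents: write               # allows pushing the values.yaml change
jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-ecr  # hypothetical role
          aws-region: ap-northeast-2
      - id: ecr
        uses: aws-actions/amazon-ecr-login@v2
      - name: Build and push image
        env:
          IMAGE: ${{ steps.ecr.outputs.registry }}/careerbee-be
          TAG: prod-${{ github.sha }}
        run: |
          docker build -t "$IMAGE:$TAG" .
          docker push "$IMAGE:$TAG"
      - name: Update Helm values and push
        env:
          TAG: prod-${{ github.sha }}
        run: |
          # Rewrite the two-space-indented "tag:" line in values.yaml
          sed -i "s|^  tag: .*|  tag: ${TAG}|" careerbee-chart/values.yaml
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git commit -am "chore: release ${TAG}"
          git push
```

The workflow never touches the cluster directly. It only changes Git, and ArgoCD picks up the new tag from there.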

👀 Logging and Monitoring

  • Log collection: the FluentBit DaemonSet collects logs on each node → ships them to CloudWatch Logs
  • Log streaming: CloudWatch Logs → Kinesis Firehose → S3 bucket (long-term retention and analysis)
  • Metrics collection: Kubernetes Metrics Server + Prometheus Operator → collects Pod, Node, and HPA status
  • Alarms: CloudWatch Alarm integration → Slack notification
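
The log collection step could be configured with a ConfigMap along these lines. It is a sketch only: the ConfigMap name, log group name, and stream prefix are hypothetical, and the FluentBit ServiceAccount is assumed to have IAM permission to write to CloudWatch Logs.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config       # hypothetical name
  namespace: monitoring
data:
  fluent-bit.conf: |
    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        Tag               kube.*
        multiline.parser  docker, cri
    [FILTER]
        Name   kubernetes
        Match  kube.*
    [OUTPUT]
        Name               cloudwatch_logs
        Match              kube.*
        region             ap-northeast-2
        log_group_name     /careerbee/eks/application
        log_stream_prefix  from-fluent-bit-
        auto_create_group  On
```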

🌓 Blue-Green Deployment

  • A deployment method that runs Blue and Green environments in parallel to roll out a new version without downtime

⚖️ Comparison of Kubernetes Deployment Methods

| Item | Rolling Update | Blue-Green Deployment | Canary Deployment |
|---|---|---|---|
| Overview | Replaces existing Pods one after another | Runs the old (Blue) and new (Green) versions in parallel, then switches over | Applies the new version to a fraction of traffic, then expands gradually |
| Pros | Saves resources, minimizes downtime | Clean rollback, fast incident response | Fine-grained quality verification, minimal risk |
| Cons | Rollback is complicated if an error occurs mid-replacement | Requires twice the resources (Blue and Green both kept running) | Requires traffic splitting and a monitoring setup |

Why We Chose Blue-Green Deployment

  • Uninterrupted user experience: the existing version stays in place while the new version is deployed, so at least half of the traffic is always served by a healthy environment (Blue or Green)
    • New features can be tested without interrupting the main service, which helps meet the SLA (Service Level Agreement: the quality standards, such as availability, performance, and support level, that a provider promises its customers)
  • Immediate rollback: if a problem is found in a new deployment, traffic can be returned to the previous environment by changing the load balancer configuration alone
    • Maximizes the speed of diagnosis and recovery, minimizing operational risk
  • Per-version traffic separation: Blue and Green each run a distinct version, so logs and monitoring metrics are collected independently
    • After the new version's stability is verified, traffic is shifted gradually while performance is reviewed
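
The switch-over and rollback described above can be sketched as an Ingress whose backend points at either the Blue or the Green Service. The Service names `backend-blue` and `backend-green` and port 80 are hypothetical, and the `alb` class and annotations assume the AWS Load Balancer Controller is installed.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: backend
  namespace: backend
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: backend-green   # set back to backend-blue to roll back
            port:
              number: 80
```

Because the cutover is a one-line change in Git, a rollback is simply a revert of that commit.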

🦑 ArgoCD (GitOps)

  • Declarative infrastructure management: every manifest is stored in Git and its change history is version-controlled
    • The state in Git becomes the state of the cluster, which guarantees consistency
  • Automatic sync and drift recovery
    • Drift (configuration drift) is a mismatch between the resource state declared in Git and the actual resource state in the cluster
    • ArgoCD detects this mismatch automatically and restores the Git state, continuously maintaining the desired declarative state
  • Easy and safe rollback: every deployment is recorded as a Git commit, so the cluster can be rolled back to a specific commit immediately
    • Integrated with the CI pipeline, the effect of a change can be verified before the PR is merged, so conflicts, errors, and the impact of resource changes are found and resolved in advance
  • Multi-cluster management: a single ArgoCD instance can manage several clusters (EKS, GKE, and others) together, greatly reducing the complexity of multi-cluster operations
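
Automatic sync and drift recovery correspond to the `automated` sync policy of an ArgoCD Application. The following is a minimal sketch; the Application name, repository URL, and chart path are hypothetical.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: careerbee-backend       # hypothetical name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/careerbee-deploy.git  # hypothetical repository
    targetRevision: main
    path: careerbee-chart
    helm:
      valueFiles:
      - values.yaml
  destination:
    server: https://kubernetes.default.svc   # the cluster ArgoCD itself runs in
    namespace: backend
  syncPolicy:
    automated:
      prune: true               # delete resources that were removed from Git
      selfHeal: true            # revert manual changes in the cluster (drift recovery)
```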

☸️ Helm Chart Example

Structure

careerbee-chart/
├── Chart.yaml             # Chart metadata
├── values.yaml            # Default values
├── templates/
│   ├── deployment.yaml    # Deployment resource template
│   ├── service.yaml       # Service resource template
│   ├── hpa.yaml           # HPA template
│   ├── ingress.yaml       # Ingress resource
│   └── _helpers.tpl       # Template helper functions

Chart.yaml

apiVersion: v2
name: careerbee
description: Helm chart for the CareerBee service
version: 0.1.0
appVersion: "1.0.0"

values.yaml

replicaCount: 3
image:
  repository: 123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/careerbee-be
  tag: prod-1.0.1
service:
  type: LoadBalancer
  port: 80
resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 250m
    memory: 256Mi
hpa:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
ingress:
  host: careerbee.dev.example.com

templates/deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "careerbee.fullname" . }}
  labels:
    app: {{ include "careerbee.name" . }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ include "careerbee.name" . }}
  template:
    metadata:
      labels:
        app: {{ include "careerbee.name" . }}
    spec:
      containers:
        - name: {{ include "careerbee.name" . }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          ports:
            - containerPort: 8080
          resources:
            limits:
              cpu: {{ .Values.resources.limits.cpu }}
              memory: {{ .Values.resources.limits.memory }}
            requests:
              cpu: {{ .Values.resources.requests.cpu }}
              memory: {{ .Values.resources.requests.memory }}
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10

templates/service.yaml

apiVersion: v1
kind: Service
metadata:
  name: {{ include "careerbee.fullname" . }}
spec:
  type: {{ .Values.service.type }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: 8080
      protocol: TCP
  selector:
    app: {{ include "careerbee.name" . }}

templates/hpa.yaml

{{- if .Values.hpa.enabled }}
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: {{ include "careerbee.fullname" . }}-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "careerbee.fullname" . }}
  minReplicas: {{ .Values.hpa.minReplicas }}
  maxReplicas: {{ .Values.hpa.maxReplicas }}
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: {{ .Values.hpa.targetCPUUtilizationPercentage }}
{{- end }}

templates/ingress.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ include "careerbee.fullname" . }}
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
    - host: {{ .Values.ingress.host }}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: {{ include "careerbee.fullname" . }}
                port:
                  number: {{ .Values.service.port }}

The ingress.host value must also be defined in values.yaml for this template to render correctly.

templates/_helpers.tpl

{{/*
Generate a name for resources
*/}}
{{- define "careerbee.name" -}}
careerbee
{{- end }}

{{/*
Generate a fullname for resources
*/}}
{{- define "careerbee.fullname" -}}
{{ .Release.Name }}-{{ include "careerbee.name" . }}
{{- end }}


💰 Cost Estimate

| Component | Estimated cost (monthly) | Description |
|---|---|---|
| EC2 t3.medium (Worker Node, FE/BE) X 4 | approx. $60~65 | Used as EKS cluster worker nodes (runs the FE/BE microservices) |
| EC2 t3.micro (OpenVPN) | approx. $5 | Developer-only VPN server (for access to the private subnets) |
| EKS (Cluster) X 2 | approx. $36 | Two EKS clusters, one each for Dev and Prod (control plane fee: $0.10/hour) |
| ECR storage | approx. $3 | Docker image registry (ECR is billed per GB) |
| S3 (static resource storage) | approx. $1~3 | S3 buckets for images and log backups (varies with storage volume and traffic) |
| VPC X 3 + NAT Gateway + Transit Gateway | approx. $90 | Three VPCs for Dev/Prod/Shared plus the base cost of the Transit connection |
| Elastic Load Balancer (ALB) X 2 | approx. $30~40 | Two public ALBs (placed in front of the Dev and Prod Ingress Controllers) |
| EC2 GPU instances (AI) X 2 | approx. $267 | g5g.2xlarge, FastAPI inference server + AI models |
| NAT Gateway | approx. $40 | Internet access for the private subnets (hourly plus per-traffic charges) |
| Transit Gateway | approx. $50 | Routes traffic between the Dev VPC, the Prod VPC, and external networks |
| RDS (MySQL) X 2 | approx. $200 | Production RDS (Multi-AZ enabled); one instance each for Dev and Prod |

💡 Estimated total (monthly): approx. $790 (assuming 45 hours of operation per week)
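
As a sanity check on the 45 hours/week basis, the EKS control plane row can be reproduced from the stated $0.10/hour fee. The 4-weeks-per-month figure is an assumption made here for the arithmetic, not something stated in the table.

```latex
\[
2\ \text{clusters} \times \$0.10/\text{h} \times 45\ \text{h/week} \times 4\ \text{weeks/month} = \$36/\text{month}
\]
```

This only holds if the resources are actually stopped outside those hours; clusters left running around the clock would be billed for every hour.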


📌 This page was last updated on April 29, 2025.
