[prometheus]

[Environment]

Infra: Azure cloud virtual machine
OS: Linux (ubuntu 20.04)
Server hostname: "vm-kuber-master-001: kubernetes control-plane"
Size: Standard D2as v5
vCPU: 2
RAM: 8GiB

[Create the namespace]

$ kubectl create namespace monitoring
namespace/monitoring created

$ kubectl get namespaces
NAME              STATUS   AGE
default           Active   28d
kube-node-lease   Active   28d
kube-public       Active   28d
kube-system       Active   28d
monitoring        Active   4m18s

[prometheus-cluster-role.yaml]

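To let the Prometheus container access the Kubernetes API, define a ClusterRole and a ClusterRoleBinding. The binding attaches the ClusterRole to the default service account of the monitoring namespace, which is the account the Prometheus pod runs under.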

$ cat prometheus-cluster-role.yaml 
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus   # ClusterRole is cluster-scoped, so it takes no namespace field
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - nodes/proxy
      - services
      - endpoints
      - pods
    verbs: ["get", "list", "watch"]
  - apiGroups:
      - extensions
      - networking.k8s.io   # Ingress is served from this group; extensions/v1beta1 was removed in v1.22
    resources:
      - ingresses
    verbs: ["get", "list", "watch"]
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: default
    namespace: monitoring
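
To see what these rules grant, the matching logic can be approximated in a few lines of Python. This is a simplified sketch, not the Kubernetes authorizer: it models only the core-group rule from the ClusterRole above and ignores wildcards, resourceNames, and nonResourceURLs. The name is_allowed is made up for illustration.

```python
# Core-group rule copied from the ClusterRole above (read-only verbs).
RULES = [
    {
        "apiGroups": [""],
        "resources": ["nodes", "nodes/proxy", "services", "endpoints", "pods"],
        "verbs": ["get", "list", "watch"],
    },
]


def is_allowed(rules, api_group, resource, verb):
    """Return True if any rule grants `verb` on `resource` in `api_group`."""
    return any(
        api_group in rule["apiGroups"]
        and resource in rule["resources"]
        and verb in rule["verbs"]
        for rule in rules
    )


print(is_allowed(RULES, "", "pods", "list"))     # read access is granted
print(is_allowed(RULES, "", "pods", "delete"))   # write access is not
print(is_allowed(RULES, "", "secrets", "get"))   # secrets are not listed at all
```

On a live cluster the same question can be asked with kubectl auth can-i list pods --as=system:serviceaccount:monitoring:default.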

[Configmap]

Prometheus needs a configuration file to start, and this ConfigMap defines it.
Two keys are defined under data: prometheus.rules and prometheus.yml.
prometheus.rules : alert conditions on the collected metrics; when a condition is met, an alert can be sent to AlertManager
prometheus.yml : which metrics to collect, the scrape interval, and so on
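
The memory alert defined in prometheus.rules below compares total container working-set memory against total allocatable node memory. A rough sketch of that arithmetic, in Python rather than PromQL; the byte values are invented for illustration and the function name is hypothetical:

```python
GIB = 2**30


def memory_usage_percent(working_set_bytes, allocatable_bytes):
    """Mirror of sum(working set) / sum(allocatable) * 100 from the rule."""
    return sum(working_set_bytes) / sum(allocatable_bytes) * 100


# Hypothetical cluster: two nodes with 8 GiB allocatable each,
# and three pods using 4 GiB, 3 GiB, and 2 GiB.
usage = memory_usage_percent([4 * GIB, 3 * GIB, 2 * GIB], [8 * GIB, 8 * GIB])
print(usage)        # 56.25
print(usage > 55)   # True: the alert expression holds
```

The alert itself fires only after the expression has stayed true for the whole for: 1m window.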

[prometheus-config-map.yaml]

$ cat prometheus-config-map.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server-conf
  labels:
    name: prometheus-server-conf
  namespace: monitoring
data:
  prometheus.rules: |-
    groups:
    - name: container memory alert
      rules:
      - alert: container memory usage rate is very high( > 55%)
        expr: sum(container_memory_working_set_bytes{pod!="", name=""})/ sum (kube_node_status_allocatable_memory_bytes) * 100 > 55
        for: 1m
        labels:
          severity: fatal
        annotations:
          summary: High Memory Usage on 
          identifier: ""
          description: " Memory Usage: "
    - name: container CPU alert
      rules:
      - alert: container CPU usage rate is very high( > 10%)
        expr: sum (rate (container_cpu_usage_seconds_total{pod!=""}[1m])) / sum (machine_cpu_cores) * 100 > 10
        for: 1m
        labels:
          severity: fatal
        annotations:
          summary: High Cpu Usage
  prometheus.yml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    rule_files:
      - /etc/prometheus/prometheus.rules
    alerting:
      alertmanagers:
      - scheme: http
        static_configs:
        - targets:
          - "alertmanager.monitoring.svc:9093"

    scrape_configs:
      - job_name: 'kubernetes-apiservers'

        kubernetes_sd_configs:
        - role: endpoints
        scheme: https

        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

        relabel_configs:
        - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
          action: keep
          regex: default;kubernetes;https

      - job_name: 'kubernetes-nodes'

        scheme: https

        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

        kubernetes_sd_configs:
        - role: node

        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics


      - job_name: 'kubernetes-pods'

        kubernetes_sd_configs:
        - role: pod

        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_pod_name]
          action: replace
          target_label: kubernetes_pod_name

      - job_name: 'kube-state-metrics'
        static_configs:
          - targets: ['kube-state-metrics.kube-system.svc.cluster.local:8080']

      - job_name: 'kubernetes-cadvisor'

        scheme: https

        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

        kubernetes_sd_configs:
        - role: node

        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

      - job_name: 'kubernetes-service-endpoints'

        kubernetes_sd_configs:
        - role: endpoints

        relabel_configs:
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
          action: replace
          target_label: __scheme__
          regex: (https?)
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
          action: replace
          target_label: __address__
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
        - action: labelmap
          regex: __meta_kubernetes_service_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_service_name]
          action: replace
          target_label: kubernetes_name
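
The regex ([^:]+)(?::\d+)?;(\d+) in the kubernetes-pods and kubernetes-service-endpoints jobs above rewrites the scrape address so that it uses the port from the prometheus.io/port annotation. The sketch below imitates that one relabel step in Python, assuming IPv4 addresses; rewrite_address is a hypothetical helper, and Prometheus itself uses RE2 rather than Python's re module.

```python
import re

# Prometheus anchors relabel regexes at both ends, so fullmatch() is the
# closest Python equivalent.
ADDRESS_RE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def rewrite_address(address, port_annotation):
    """Imitate the `replace` step that targets __address__."""
    # The values of source_labels are joined with ';' before matching.
    joined = f"{address};{port_annotation}"
    match = ADDRESS_RE.fullmatch(joined)
    if match is None:
        # No match: the replace action leaves the target label unchanged.
        return address
    # replacement: $1:$2
    return f"{match.group(1)}:{match.group(2)}"


print(rewrite_address("10.244.1.14:8080", "9102"))  # port swapped for the annotation
print(rewrite_address("10.244.1.14", "9102"))       # port appended
print(rewrite_address("10.244.1.14:8080", ""))      # no annotation, address kept
```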

[deployment] prometheus-deployment.yaml

Deployment controller that manages the pod running the Prometheus image

$ cat prometheus-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-deployment
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-server
  template:
    metadata:
      labels:
        app: prometheus-server
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus:latest
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus/"
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: prometheus-config-volume
              mountPath: /etc/prometheus/
            - name: prometheus-storage-volume
              mountPath: /prometheus/
      volumes:
        - name: prometheus-config-volume
          configMap:
            defaultMode: 420
            name: prometheus-server-conf

        - name: prometheus-storage-volume
          emptyDir: {}

[node exporter] prometheus-node-exporter.yaml

Prometheus collects more than the system metrics that Kubernetes provides by default, so a separate agent is needed to collect the rest.
node-exporter plays that role. One instance has to run on each node, so it is deployed as a DaemonSet.

$ cat prometheus-node-exporter.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring
  labels:
    k8s-app: node-exporter
spec:
  selector:
    matchLabels:
      k8s-app: node-exporter
  template:
    metadata:
      labels:
        k8s-app: node-exporter
    spec:
      containers:
      - image: prom/node-exporter
        name: node-exporter
        ports:
        - containerPort: 9100
          protocol: TCP
          name: http
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: node-exporter
  name: node-exporter
  namespace: monitoring   # same namespace as the DaemonSet, otherwise the selector matches no pods
spec:
  ports:
  - name: http
    port: 9100
    nodePort: 31672
    protocol: TCP
  type: NodePort
  selector:
    k8s-app: node-exporter
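
Each node-exporter pod serves its metrics as plain text on port 9100, one sample per line. As a minimal sketch of what Prometheus reads from that endpoint, the parser below handles the simplest form of the text format. It assumes samples carry no trailing timestamp; parse_metrics and the sample values are made up for illustration.

```python
def parse_metrics(text):
    """Parse Prometheus text-format samples into {series: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated token on the line.
        series, value = line.rsplit(" ", 1)
        samples[series] = float(value)
    return samples


SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
"""

for series, value in parse_metrics(SAMPLE).items():
    print(series, value)
```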

[Service] prometheus-svc.yaml

Service that exposes the Prometheus pod outside the cluster

$ cat prometheus-svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: prometheus-service
  namespace: monitoring
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9090"
spec:
  selector:
    app: prometheus-server
  type: NodePort
  ports:
    - port: 8080
      targetPort: 9090
      nodePort: 30003            # On-premises: Prometheus was reachable at <VM IP>:30003, presumably because each node had its own public IP.
##type: LoadBalancer             # Added for AKS to obtain an external IP; after that, access was via <external IP>:8080. AKS nodes have no public IPs, so the external IP belongs to the Azure load balancer created for the Service rather than to a node.

[Deploy]

$ kubectl apply -f prometheus-cluster-role.yaml
$ kubectl apply -f prometheus-config-map.yaml
$ kubectl apply -f prometheus-deployment.yaml
$ kubectl apply -f prometheus-node-exporter.yaml
$ kubectl apply -f prometheus-svc.yaml

[Check the pods]

$ kubectl get nodes -o wide
NAME                  STATUS   ROLES           AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
vm-kube-dev-002       Ready    <none>          27d   v1.27.4   10.1.0.6      <none>        Ubuntu 20.04.6 LTS   5.15.0-1051-azure   containerd://1.6.21
vm-kube-dev-003       Ready    <none>          27d   v1.27.4   10.1.0.5      <none>        Ubuntu 20.04.6 LTS   5.15.0-1051-azure   containerd://1.6.21
vm-kuber-master-001   Ready    control-plane   28d   v1.28.3   10.1.0.8      <none>        Ubuntu 20.04.6 LTS   5.15.0-1051-azure   containerd://1.6.24

#### The prometheus pod is running normally. This environment has two worker nodes, so two node-exporter pods are up as well
$ kubectl get pods -n monitoring
NAME                                     READY   STATUS    RESTARTS   AGE
node-exporter-tfzcr                      1/1     Running   0          14s
node-exporter-wdkp8                      1/1     Running   0          14s
prometheus-deployment-568f7f568f-p4n95   1/1     Running   0          19s

[Access on port 30003]

kube-state-metrics is shown as (0/1 up), meaning it is not running yet.
kube-state-metrics is a service that generates metrics about the objects in a Kubernetes cluster (Pods, for example), so it must be running in order to monitor Pod state.
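
The same up/total figures shown on the targets page can be derived from the Prometheus HTTP API endpoint /api/v1/targets. The sketch below counts healthy targets per job from an already parsed response. The payload is a trimmed, hypothetical example that keeps only the fields the function reads, and up_counts is a name invented here.

```python
from collections import defaultdict


def up_counts(targets_response):
    """Return {job: (up, total)} from a parsed /api/v1/targets response."""
    counts = defaultdict(lambda: [0, 0])
    for target in targets_response["data"]["activeTargets"]:
        job = target["labels"]["job"]
        counts[job][1] += 1
        if target["health"] == "up":
            counts[job][0] += 1
    return {job: tuple(pair) for job, pair in counts.items()}


# Hypothetical, trimmed payload: two healthy node targets and one
# kube-state-metrics target that is still down.
SAMPLE_RESPONSE = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "kubernetes-nodes"}, "health": "up"},
            {"labels": {"job": "kubernetes-nodes"}, "health": "up"},
            {"labels": {"job": "kube-state-metrics"}, "health": "down"},
        ]
    },
}

for job, (up, total) in sorted(up_counts(SAMPLE_RESPONSE).items()):
    print(f"{job} ({up}/{total} up)")
```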

[Deploy kube-state-metrics] kube-state-cluster-role.yaml

$ cat kube-state-cluster-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
  - kind: ServiceAccount
    name: kube-state-metrics
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-state-metrics
rules:
  - apiGroups:
      - ""
    resources:
      [
        "configmaps",
        "secrets",
        "nodes",
        "pods",
        "services",
        "resourcequotas",
        "replicationcontrollers",
        "limitranges",
        "persistentvolumeclaims",
        "persistentvolumes",
        "namespaces",
        "endpoints",
      ]
    verbs: ["list", "watch"]
  - apiGroups:
      - extensions
    resources: ["daemonsets", "deployments", "replicasets", "ingresses"]
    verbs: ["list", "watch"]
  - apiGroups:
      - apps
    resources: ["statefulsets", "daemonsets", "deployments", "replicasets"]
    verbs: ["list", "watch"]
  - apiGroups:
      - batch
    resources: ["cronjobs", "jobs"]
    verbs: ["list", "watch"]
  - apiGroups:
      - autoscaling
    resources: ["horizontalpodautoscalers"]
    verbs: ["list", "watch"]
  - apiGroups:
      - authentication.k8s.io
    resources: ["tokenreviews"]
    verbs: ["create"]
  - apiGroups:
      - authorization.k8s.io
    resources: ["subjectaccessreviews"]
    verbs: ["create"]
  - apiGroups:
      - policy
    resources: ["poddisruptionbudgets"]
    verbs: ["list", "watch"]
  - apiGroups:
      - certificates.k8s.io
    resources: ["certificatesigningrequests"]
    verbs: ["list", "watch"]
  - apiGroups:
      - storage.k8s.io
    resources: ["storageclasses", "volumeattachments"]
    verbs: ["list", "watch"]
  - apiGroups:
      - admissionregistration.k8s.io
    resources:
      ["mutatingwebhookconfigurations", "validatingwebhookconfigurations"]
    verbs: ["list", "watch"]
  - apiGroups:
      - networking.k8s.io
    resources: ["networkpolicies"]
    verbs: ["list", "watch"]

[Create the service account: kube-state-svcaccount.yaml] bound to the ClusterRole above

$ cat kube-state-svcaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-state-metrics
  namespace: kube-system

[Deployment for kube-state-metrics]

$ cat kube-state-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: kube-state-metrics
  name: kube-state-metrics
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-state-metrics
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      containers:
        - image: quay.io/coreos/kube-state-metrics:v1.8.0
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            timeoutSeconds: 5
          name: kube-state-metrics
          ports:
            - containerPort: 8080
              name: http-metrics
            - containerPort: 8081
              name: telemetry
          readinessProbe:
            httpGet:
              path: /
              port: 8081
            initialDelaySeconds: 5
            timeoutSeconds: 5
      nodeSelector:
        kubernetes.io/os: linux
      serviceAccountName: kube-state-metrics

[Service for kube-state-metrics]

$ cat kube-state-svc.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: kube-state-metrics
  name: kube-state-metrics
  namespace: kube-system
spec:
  clusterIP: None
  ports:
    - name: http-metrics
      port: 8080
      targetPort: http-metrics
    - name: telemetry
      port: 8081
      targetPort: telemetry
  selector:
    app: kube-state-metrics

[Deploy]

$ kubectl apply -f kube-state-cluster-role.yaml
$ kubectl apply -f kube-state-svcaccount.yaml
$ kubectl apply -f kube-state-deployment.yaml
$ kubectl apply -f kube-state-svc.yaml


$ kubectl get pod -n kube-system
NAME                                          READY   STATUS    RESTARTS       AGE
------------------ omitted ------------------
kube-state-metrics-7b9cdc6844-jkpww           1/1     Running   0              31s
------------------ ------------------ ------------------

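[Grafana integration]

Grafana is an open-source tool for analyzing and visualizing collected metrics, mainly used as a dashboard.
The steps above set up Prometheus, the monitoring tool, to collect metrics from Kubernetes.
Attaching Grafana makes it possible to visualize the data held in Prometheus.

[Create the Grafana pod and svc] grafana.yaml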

$ cat grafana.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      name: grafana
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: grafana/grafana:latest
          ports:
            - name: grafana
              containerPort: 3000
          env:
            - name: GF_SERVER_HTTP_PORT
              value: "3000"
            - name: GF_AUTH_BASIC_ENABLED
              value: "false"
            - name: GF_AUTH_ANONYMOUS_ENABLED
              value: "true"
            - name: GF_AUTH_ANONYMOUS_ORG_ROLE
              value: Admin
            - name: GF_SERVER_ROOT_URL
              value: /
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitoring
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "3000"
spec:
  selector:
    app: grafana
  type: NodePort
  ports:
    - port: 3000
      targetPort: 3000
      nodePort: 30004

[Deploy grafana.yaml]

$ kubectl apply -f grafana.yaml

$ kubectl get pod -n monitoring
NAME                                     READY   STATUS    RESTARTS   AGE
grafana-78476cc5c8-fkm7v                 1/1     Running   0          18s
node-exporter-tfzcr                      1/1     Running   0          21m
node-exporter-wdkp8                      1/1     Running   0          21m
prometheus-deployment-568f7f568f-p4n95   1/1     Running   0          21m

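[Open Grafana on port 30004]

Go to Add data source

Select Prometheus

Enter the details of the Prometheus instance to connect; for the URL, use the internal (cluster) IP of the service

When the success message appears, the connection works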
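
Instead of looking up the internal IP, the data source URL can usually be written with the Service's in-cluster DNS name. The helper below only builds that string. It assumes the default cluster domain cluster.local, and service_url is a hypothetical name; the service name, namespace, and port come from prometheus-svc.yaml above.

```python
def service_url(name, namespace, port, scheme="http", cluster_domain="cluster.local"):
    """Build the in-cluster URL <scheme>://<name>.<namespace>.svc.<domain>:<port>."""
    return f"{scheme}://{name}.{namespace}.svc.{cluster_domain}:{port}"


# prometheus-service listens on Service port 8080 (targetPort 9090).
print(service_url("prometheus-service", "monitoring", 8080))
```

The URL uses the Service port (8080), not the nodePort, because Grafana reaches Prometheus from inside the cluster.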

[Add a dashboard]

Grafana dashboards: https://grafana.com/grafana/dashboards/?plcmt=footer

Search for kubernetes

Click Copy ID to copy the dashboard ID

Go to Import Dashboard, paste the copied ID, and click Load

[Rancher]

A web UI based tool for deploying, managing, and monitoring k8s
In my view, a monitoring solution specialized for k8s
Rancher reference 1: https://rancher.com/
Rancher reference 2: https://rancher.com/docs/rancher/v2.x/en/installation/single-node/

[Environment]

Using Azure Kubernetes Service v1.25.11

[helm] Installing Rancher

[Install helm]


$ curl https://baltocdn.com/helm/signing.asc | sudo apt-key add -
$ sudo apt-get install apt-transport-https --yes
$ echo "deb https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
$ sudo apt-get update
$ sudo apt-get install helm

[Install rancher]

# Add the repository so that rancher can be installed with helm
$ helm repo add rancher-stable https://releases.rancher.com/server-charts/stable

# Create the namespace
$ kubectl create namespace cattle-system

# Install rancher with helm
$ helm install rancher rancher-stable/rancher \
  --namespace cattle-system \
  --set replicas=1 \
  --set bootstrapPassword=admin \
  --set ingress.tls.source=secret

# The error below asks for a DNS hostname to be specified; it was ignored here. Passing --set hostname=<DNS name> to helm install should avoid it.
Error: INSTALLATION FAILED: 1 error occurred:
* Ingress.extensions "rancher" is invalid: spec.tls[0].hosts[0]: Invalid value: "": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')

rancher.yaml

apiVersion: v1
kind: Service
metadata:
  name: rancher-lb
  namespace: cattle-system
  labels:
    app: rancher
spec:
  selector:
    app: rancher
  ports:
  - protocol: "TCP"
    port: 443
    targetPort: 443
  type: LoadBalancer

Apply the manifest after defining it

After applying rancher.yaml, check the external IP

rancher dashboard

Deploying an nginx pod

apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80

---

apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  type: LoadBalancer

[docker] Installing Rancher


$ sudo docker run -d --restart=unless-stopped -p 80:80 -p 443:443 --privileged rancher/rancher

To reach the Rancher server dashboard, assign an external static IP and open ports 80 and 443 (both must be open for access). Requests to port 80 are redirected to 443.


$ docker ps
CONTAINER ID   IMAGE                                 COMMAND                  CREATED          STATUS          PORTS                                                                                                                                  NAMES
bcde865ffc29   rancher/rancher                       "entrypoint.sh"          14 minutes ago   Up 12 minutes   0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp  
$ docker logs  bcde865ffc29  2>&1 | grep "Bootstrap Password:"
2023/09/22 07:21:52 [INFO] Bootstrap Password: kv9n6l24jx49kqjr9jwgrsbbg9xmlsjl8v9mt5gl95lrnvd4fjn6ng

Login screen

Select Azure AKS

Fill in the fields and click Create

Confirm the administrator

[SSH public key]

Create it with either of the commands below (Azure CLI or ssh-keygen)

# Create an SSH key pair using Azure CLI
az sshkey create --name "mySSHKey" --resource-group "myResourceGroup"

# Create an SSH key pair using ssh-keygen
ssh-keygen -t rsa -b 4096

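SSH key creation screen

This field is optional, so it was left blank.

Click Create at the bottom

The error occurs because the Azure SP (service principal) has no permission on the resource group, so grant the Azure SP a role on the resource group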

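[kubernetes Service]

Service: an abstraction that exposes an application running on a set of pods as a network service
- Pods are not permanent. They are destroyed and recreated from time to time, and although each pod has its own IP address, that address changes dynamically.
- In such an environment, other services normally need a service discovery mechanism to talk to this set of pods. Kubernetes gives pods their own IP addresses and, through a Service, a single DNS name for the whole set, and it load-balances across the replicas automatically.
Example: suppose the cluster has a frontend and a backend server. The backend runs as several replicas for high availability. Without the Service concept, the frontend would have to track the backends' actual IP addresses in order to talk to them. With the Service abstraction that Kubernetes provides, the frontend can communicate through the registered internal DNS name without tracking the real backend IPs. (Put simply: a Pod's IP changes whenever it goes down and comes back up, so a stable address is placed in front of it, much like a static NAT IP.)

[Service types]

Cluster IP

Assigns an IP that other Pods inside the cluster can reach. Only an internal IP is assigned, so the Service cannot be reached from outside the cluster.
The Pods live inside the K8s cluster, and the Cluster IP Service load-balances traffic across them. A Cluster IP is valid only inside the cluster; communication with the outside requires a NodePort or LoadBalancer Service.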


apiVersion: v1
kind: Service
metadata:
  name: dong-cluster-test
spec:
  type: ClusterIP
  selector:
    app: dong-service
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

A ClusterIP Service named dong-cluster-test is created.
Its cluster IP (CLUSTER-IP) is 10.101.8.69; it accepts traffic on port 80 and forwards it to the backend Pods
EXTERNAL-IP: the IP address through which the Service can be reached from outside the Kubernetes cluster
EXTERNAL-IP is normally populated when a LoadBalancer Service is created
A LoadBalancer Service uses a load balancer resource from the cloud provider to forward external traffic to the internal service. For a ClusterIP Service, however, EXTERNAL-IP is shown as <none>. This means the service is reachable only from inside the cluster and is not exposed externally. Since EXTERNAL-IP for dong-cluster-test is <none>, the service cannot currently be reached directly from outside the cluster, although other Pods and Services inside the cluster can reach it. If external access is needed, expose the service by a means other than ClusterIP (for example NodePort, LoadBalancer, or Ingress).
The endpoints/dong-cluster-test entry lists the IP and port of each endpoint (that is, each backend Pod) attached to the service. In this example there are two: 10.244.1.14:8080 and 10.244.2.16:8080. From inside the cluster, the service is reached through its ClusterIP (10.101.8.69) and port (80).
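
Conceptually, a connection to the ClusterIP is handed to one of the endpoints listed for the Service. The sketch below models that with a random pick, which is roughly how kube-proxy behaves in iptables mode. It is a toy model under that assumption, not kube-proxy's implementation, and pick_backend is an invented name.

```python
import random

# Endpoints of dong-cluster-test as listed in the example above.
ENDPOINTS = ["10.244.1.14:8080", "10.244.2.16:8080"]


def pick_backend(endpoints, rng=random):
    """Choose one backend for a new connection to the ClusterIP."""
    if not endpoints:
        raise LookupError("service has no ready endpoints")
    return rng.choice(endpoints)


# A client always dials 10.101.8.69:80; the backend behind it varies.
for _ in range(3):
    print("10.101.8.69:80 ->", pick_backend(ENDPOINTS))
```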

NodePort

NodePort exposes the Service on a fixed port on every node's IP. A NodePort Service automatically creates a ClusterIP Service as well. The Service can be reached from outside the cluster by requesting NodeIP:NodePort. Three terms appear here: nodePort is the port used when reaching the actual VM node, port is the Service's own port, and targetPort is the port used to reach the Pod. Incoming traffic flows nodePort -> port -> targetPort.

NodePort allows network access through an open port on the nodes. In a real distributed application, however, if the nodes' network environment changes dynamically because of auto scaling, the client side has to keep track of the nodes' network endpoints, for example through service discovery.

A load balancer solves this problem.


apiVersion: v1
kind: Service
metadata:  
  name: dong-nodeport-test
spec:
  selector:    
    app: dong-service
  type: NodePort
  ports:  
  - name: http
    port: 80
    targetPort: 8080
    nodePort: 30036
    protocol: TCP
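
The nodePort -> port -> targetPort flow described above can be traced with the values from this manifest. The sketch is only a lookup table in code form; the node, cluster, and pod IPs are illustrative addresses borrowed from earlier examples on this page, and ServicePort and trace are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServicePort:
    node_port: int    # opened on every node
    port: int         # the Service's own port on its ClusterIP
    target_port: int  # the port the Pod listens on


def trace(sp, node_ip, cluster_ip, pod_ip):
    """Return the three hops a request passes through, in order."""
    return [
        f"{node_ip}:{sp.node_port}",
        f"{cluster_ip}:{sp.port}",
        f"{pod_ip}:{sp.target_port}",
    ]


# Ports taken from dong-nodeport-test; the IPs are illustrative.
http = ServicePort(node_port=30036, port=80, target_port=8080)
print(" -> ".join(trace(http, "10.1.0.6", "10.101.8.69", "10.244.1.14")))
```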

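LoadBalancer

A load balancer is the standard way to expose a Service externally. It needs separate load-balancing equipment outside the cluster; in a cloud environment, use the load balancer that the vendor provides.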


apiVersion: v1
kind: Service
metadata:
  name: dong-cluster-test-loadbalancer
spec:
  type: LoadBalancer
  selector:
    app: dong-service
  ports:
    - protocol: TCP  
      port: 8088
      targetPort: 80

References

021. kubernetes monitoring (prometheus, rancher) - kimdonggwan337/dongdong GitHub Wiki
