Prometheus Federation - hyunsun/documentations GitHub Wiki

Install Kubernetes in Edge and Central

Assume that you have three nodes: one for the central K8S cluster and two for the edge K8S clusters, named onf-menlo and onf-tucson in this example.

# Run from all nodes
$ git clone https://github.com/kubernetes-incubator/kubespray.git -b release-2.11

# Run from central node
$ sed -i 's/node1/central/g' kubespray/inventory/local/hosts.ini

# Run from onf-menlo node
$ sed -i 's/node1/onf-menlo/g' kubespray/inventory/local/hosts.ini

# Run from onf-tucson node
$ sed -i 's/node1/onf-tucson/g' kubespray/inventory/local/hosts.ini

# Install K8S
$ sudo apt update
$ sudo apt install -y software-properties-common python-pip
$ sudo pip install virtualenv
$ virtualenv ${HOME}/venv/kubespray
$ source ${HOME}/venv/kubespray/bin/activate
$ cd ~/kubespray
$ pip install -r requirements.txt
$ ansible-playbook -b -i inventory/local/hosts.ini \
    -e "{'override_system_hostname' : True, 'disable_swap' : True}" \
    -e "{'docker_iptables_enabled' : True}" \
    -e "{'kubectl_localhost' : True}" \
    -e "{'kubeconfig_localhost' : True}" \
    -e "{'helm_enabled' : True}" \
    cluster.yml
$ cd; mkdir -p .kube
$ cp kubespray/inventory/local/artifacts/admin.conf .kube/config
$ kubectl get nodes
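
`kubectl get nodes` prints one row per node, and the cluster is usable once every row reports Ready. The following is a minimal sketch of a scripted version of that check. It assumes the default column layout of `kubectl get nodes --no-headers` (NAME first, STATUS second), and the helper name `not_ready_count` is made up for this example.

```shell
# Hypothetical helper: count nodes whose STATUS column is not exactly "Ready".
# Reads the output of `kubectl get nodes --no-headers` on stdin.
# A cordoned node ("Ready,SchedulingDisabled") is counted as not ready.
not_ready_count() {
  awk '$2 != "Ready" { n++ } END { print n + 0 }'
}

# Example (run on each node after the playbook finishes):
#   kubectl get nodes --no-headers | not_ready_count
```

A result of 0 means every node listed is Ready.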

Install Prometheus Operator in Edges

$ cat > prometheus-operator-edge.yaml << EOF
prometheus:
  service:
    type: NodePort
    nodePort: 30090
  prometheusSpec:
    externalLabels:
      datacenter: onf-menlo # change the name to onf-tucson for the other Edge cluster

# alert manager values
alertmanager:
  service:
    type: NodePort
    nodePort: 30903

# grafana values
grafana:
# User: admin
# Pass: prom-operator
  service:
    type: NodePort
    nodePort: 30091
EOF

$ helm install stable/prometheus-operator --name=monitoring --namespace=monitoring -f prometheus-operator-edge.yaml
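
The only value that differs between the two edge clusters is the `datacenter` external label. As a sketch, the values file could be generated from a small function that takes the edge name as its argument. The function name `edge_values` is hypothetical; the YAML it prints is the same as the values above.

```shell
# Hypothetical helper: print the edge values with the datacenter label set to $1.
edge_values() {
  cat << EOF
prometheus:
  service:
    type: NodePort
    nodePort: 30090
  prometheusSpec:
    externalLabels:
      datacenter: $1

alertmanager:
  service:
    type: NodePort
    nodePort: 30903

grafana:
  service:
    type: NodePort
    nodePort: 30091
EOF
}

# Example: write the values file on the onf-tucson node
#   edge_values onf-tucson > prometheus-operator-edge.yaml
```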

Install Prometheus Operator in Central

$ cat > prometheus-operator-central.yaml << EOF
prometheus:
  service:
    type: NodePort
    nodePort: 30090
  prometheusSpec:
    externalLabels:
      datacenter: central
    additionalScrapeConfigs:
      - job_name: prometheus-aggregator
        scrape_interval: 15s
        honor_labels: true
        metrics_path: /federate
        params:
          # need to fix this to get all valid monitoring metrics
          match[]:
            - '{job="kube-state-metrics"}'
            - '{__name__=~"job:.*"}'
        static_configs:
          - targets:
            # add real IP address of edges
            - [onf-menlo IP]:30090
            - [onf-tucson IP]:30090

# alert manager values
alertmanager:
  service:
    type: NodePort
    nodePort: 30903

# grafana values
grafana:
# User: admin
# Pass: prom-operator
  service:
    type: NodePort
    nodePort: 30091
EOF

$ helm install stable/prometheus-operator --name=monitoring --namespace=monitoring -f prometheus-operator-central.yaml
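
If the central Prometheus shows no edge metrics, it may help to query an edge's /federate endpoint directly with the same match[] selectors as the scrape job. The sketch below assumes curl is installed and that the edge NodePort 30090 is reachable from where you run it. The helper names `federate_url` and `check_federate` are made up for this example.

```shell
# Hypothetical helper: build the /federate URL for one edge.
# $1: edge IP or hostname, $2: NodePort (defaults to 30090, as in the values files)
federate_url() {
  printf 'http://%s:%s/federate\n' "$1" "${2:-30090}"
}

# Hypothetical helper: fetch the series an edge exposes for the two selectors.
# -G makes curl send the --data-urlencode pairs as a query string on a GET request.
check_federate() {
  curl -sG "$(federate_url "$1" "${2:-30090}")" \
    --data-urlencode 'match[]={job="kube-state-metrics"}' \
    --data-urlencode 'match[]={__name__=~"job:.*"}'
}

# Example, with the edge address as a placeholder:
#   check_federate <onf-menlo IP> | head
```

An empty response suggests that the selectors match nothing on that edge, while a connection error points at the NodePort or the network instead.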

Check Dashboards

prometheus: http://[central IP]:30090

To check that the central Prometheus scrapes metrics from the edges correctly, visit the URL below and look for datacenter="onf-menlo" and datacenter="onf-tucson" in the result.

http://[central IP]:30090/graph?g0.range_input=1h&g0.expr=kube_node_status_capacity&g0.tab=1
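
The same check can be done from a terminal through the Prometheus HTTP query API (/api/v1/query). This is a rough sketch that extracts the datacenter label values from the JSON response with grep instead of a JSON parser, so it assumes label values contain no escaped quotes. The helper name `datacenters` is hypothetical.

```shell
# Hypothetical helper: print the distinct datacenter label values found in a
# Prometheus query response read from stdin, one per line, sorted.
datacenters() {
  grep -oE '"datacenter" *: *"[^"]*"' | cut -d'"' -f4 | sort -u
}

# Example, with the central address as a placeholder:
#   curl -s 'http://<central IP>:30090/api/v1/query?query=kube_node_status_capacity' | datacenters
```

If federation works, the output should list both onf-menlo and onf-tucson.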

grafana: http://[central IP]:30091

User: admin
Pass: prom-operator