k8s_networking - henk52/knowledgesharing GitHub Wiki

Kubernetes Networking

Introduction

Purpose

Vocabulary

References

Troubleshooting

Troubleshooting CoreDNS

dig coredns times out

dig coredns
; <<>> DiG 9.16.27 <<>> metrics-server
;; global options: +cmd
;; connection timed out; no servers could be reached
  • kubectl create deployment nginx --image=nginx
  • kubectl get pods
> kubectl exec -it nginx-bf5d5cf98-qjbdx -- cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5

https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/

kubectl create deployment dnsutils --image=registry.k8s.io/e2e-test-images/agnhost:2.39

k exec -it deploy/dnsutils -- bash

Even on the longhorn manager, /etc/resolv.conf contains:

search longhorn-system.svc.cluster.local svc.cluster.local cluster.local default.svc.cluster.local
nameserver 10.233.0.3
options ndots:5

longhorn-admission-webhook.longhorn-system.svc

Why does the name in resolv.conf end in .cluster.local?
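A minimal sketch of how the search list and ndots from the resolv.conf above turn a short name into candidate fully qualified names. It assumes glibc-style resolver behavior (other resolvers, such as musl, differ in details), and the function name expand_name is made up for illustration. The cluster.local suffix itself is the usual default cluster domain, which the kubelet writes into each pod's search list.

```shell
#!/bin/sh
# Print the candidate names a glibc-style resolver would try, in order.
# Usage: expand_name NAME NDOTS SEARCH_DOMAIN...
# (expand_name is a hypothetical helper, not part of any tool.)
expand_name() {
    name=$1
    ndots=$2
    shift 2

    # A trailing dot marks the name as absolute: the search list is skipped.
    case $name in
        *.) printf '%s\n' "$name"; return 0 ;;
    esac

    # Count the dots in the name.
    only_dots=$(printf '%s' "$name" | tr -cd '.')
    dots=${#only_dots}

    # With at least ndots dots, the name is tried as-is first.
    if [ "$dots" -ge "$ndots" ]; then
        printf '%s.\n' "$name"
    fi

    # Then each search domain is appended in turn.
    for domain in "$@"; do
        printf '%s.%s.\n' "$name" "$domain"
    done

    # With fewer than ndots dots, the literal name is the last resort.
    if [ "$dots" -lt "$ndots" ]; then
        printf '%s.\n' "$name"
    fi
}

# Search list and ndots taken from the longhorn resolv.conf above.
expand_name longhorn-admission-webhook.longhorn-system.svc 5 \
    longhorn-system.svc.cluster.local svc.cluster.local \
    cluster.local default.svc.cluster.local
```

The third candidate printed is longhorn-admission-webhook.longhorn-system.svc.cluster.local., which is the name that actually exists in the cluster. Note that dig does not apply the search list unless +search is given, so `dig coredns` asks for the bare name.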

  • network
    • Service
      • k -n kube-system get svc coredns -o yaml > svc_coredns_longhorn.yaml
      • k -n kube-system get svc coredns -o yaml > svc_coredns_macau.yaml
      • kdiff3 svc_coredns_macau.yaml svc_coredns_longhorn.yaml
    • Endpoints
      • k -n kube-system get endpointslices -l kubernetes.io/service-name=coredns -o yaml > endpoint_coredns_longhorn.yaml
      • k -n kube-system get endpointslices coredns-4pftc -o yaml > endpoint_coredns_macau.yaml
      • kdiff3 endpoint_coredns_macau.yaml endpoint_coredns_longhorn.yaml
  • workloads
    • Deployment
      • k -n kube-system get deployments coredns -o yaml > deployment_coredns_longhorn.yaml
      • k -n kube-system get deployments coredns -o yaml > deployment_coredns_macau.yaml
      • kdiff3 deployment_coredns_macau.yaml deployment_coredns_longhorn.yaml
    • ReplicaSet
      • k -n kube-system get replicasets coredns-5d784884df -o yaml > replicaset_coredns_longhorn.yaml
      • k -n kube-system get replicasets coredns-5d784884df -o yaml > replicaset_coredns_macau.yaml
      • kdiff3 replicaset_coredns_macau.yaml replicaset_coredns_longhorn.yaml
  • config
    • ConfigMap
      • k -n kube-system get configmaps coredns -o yaml > configmaps_coredns_longhorn.yaml
      • k -n kube-system get configmaps coredns -o yaml > configmaps_coredns_macau.yaml
      • kdiff3 configmaps_coredns_macau.yaml configmaps_coredns_longhorn.yaml
  • All of the coredns objects above are identical between the two clusters, apart from timestamps and similar metadata.
  • kube-proxy
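The manual dump-and-kdiff3 steps above could be scripted roughly as follows. This is only a sketch: it assumes both clusters are reachable as kubeconfig contexts (the context names in the usage comment are placeholders), uses plain diff instead of kdiff3, and the helper names strip_volatile and compare_resource are made up for illustration.

```shell
#!/bin/sh
# Drop metadata fields that always differ between clusters,
# so the diff only shows real configuration differences.
strip_volatile() {
    grep -Ev '^[[:space:]]*(creationTimestamp|resourceVersion|uid|generation):'
}

# Dump one kube-system object from two contexts and diff the results.
# Usage: compare_resource KIND NAME CONTEXT_A CONTEXT_B
compare_resource() {
    kind=$1
    name=$2
    ctx_a=$3
    ctx_b=$4
    file_a="${kind}_${name}_${ctx_a}.yaml"
    file_b="${kind}_${name}_${ctx_b}.yaml"

    kubectl --context "$ctx_a" -n kube-system get "$kind" "$name" -o yaml \
        | strip_volatile > "$file_a"
    kubectl --context "$ctx_b" -n kube-system get "$kind" "$name" -o yaml \
        | strip_volatile > "$file_b"

    # diff exits with 1 when the files differ, 0 when they are identical.
    diff -u "$file_a" "$file_b"
}

# Example, assuming contexts named "macau" and "longhorn" exist:
# for kind in svc deployments configmaps; do
#     compare_resource "$kind" coredns macau longhorn
# done
```

Filtering the volatile fields first makes "identical except for timestamps" visible as an empty diff rather than something to judge by eye.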

Results from the dnsutils pod on both clusters:

command                | macau | longhorn
dig kubernetes.default | ok    | ok
dig coredns            | ok    | fails, then passes later

On longhorn

bash-5.0# dig metric-server

; <<>> DiG 9.16.27 <<>> metric-server
;; global options: +cmd
;; connection timed out; no servers could be reached

bash-5.0# nc -zv 10.233.0.3 53
nc: connect to 10.233.0.3 port 53 (tcp) failed: Operation timed out
bash-5.0# nc -zv 10.233.0.3 53
Connection to 10.233.0.3 53 port [tcp/domain] succeeded!
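Since the same nc check fails once and then succeeds, it may help to repeat it and count the failures. A rough sketch, where count_failures is a made-up helper name and the 2 second timeout is an arbitrary choice:

```shell
#!/bin/sh
# Run a command N times and report how many attempts failed.
# Usage: count_failures N COMMAND [ARG...]
count_failures() {
    attempts=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Only the exit status matters here, so discard the output.
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    printf '%s/%s attempts failed\n' "$failures" "$attempts"
}

# Example, run inside the dnsutils pod against the DNS Service IP:
# count_failures 20 nc -z -w 2 10.233.0.3 53
```

A failure rate near one half would be consistent with a Service that balances across two backends of which only one responds.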

  • dig @10.233.71.1 coredns
  • dig @10.233.74.67 coredns

It seems that the coredns pod at 10.233.71.1 always answers, while the coredns pod at 10.233.74.67 never answers. That would explain why queries through the Service IP 10.233.0.3 fail only intermittently.
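The per-pod dig checks can be wrapped in a loop so that every DNS backend is probed the same way. This is a sketch under assumptions: probe_dns_servers is a made-up helper, the DIG variable is a hypothetical override hook for dry runs, and the short timeout values are arbitrary.

```shell
#!/bin/sh
# Query each DNS server directly and report which ones answer.
# Usage: probe_dns_servers NAME SERVER_IP...
# DIG may be set to another command for dry runs (hypothetical hook);
# it defaults to dig.
probe_dns_servers() {
    name=$1
    shift
    failed=0
    for server in "$@"; do
        # +time=2 +tries=1 keeps a dead server from stalling the loop.
        # dig exits 0 for any DNS response (including NXDOMAIN)
        # and non-zero when no server replied.
        if ${DIG:-dig} +short +time=2 +tries=1 "@${server}" "$name" >/dev/null 2>&1; then
            printf '%s answers\n' "$server"
        else
            printf '%s NO ANSWER\n' "$server"
            failed=$((failed + 1))
        fi
    done
    # Succeed only if every server answered.
    [ "$failed" -eq 0 ]
}

# Example, run inside the dnsutils pod with the two pod IPs from above:
# probe_dns_servers kubernetes.default.svc.cluster.local 10.233.71.1 10.233.74.67
#
# The pod IPs can usually be listed with (label is an assumption, it is
# the one commonly used for CoreDNS):
# kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide
```

Querying the pod IPs directly bypasses the Service and kube-proxy, so a pod that never answers points at the pod itself or at the network path to its node.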

kubectl logs -n kube-system calico-node-lvmqr | grep -v INFO

The dns-autoscaler ConfigMap sets the minimum number of coredns pods to two.
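For context, a sketch of how the autoscaler's linear mode arrives at a replica count, assuming dns-autoscaler is the cluster-proportional-autoscaler running in linear mode. The input numbers below are illustrative, not taken from this cluster, and linear_replicas is a made-up helper name.

```shell
#!/bin/sh
# Replica count in linear mode, as understood here:
#   max(ceil(cores / coresPerReplica), ceil(nodes / nodesPerReplica)),
#   raised to the configured minimum if it falls below it.
# Usage: linear_replicas CORES NODES CORES_PER_REPLICA NODES_PER_REPLICA MIN
linear_replicas() {
    cores=$1
    nodes=$2
    cores_per_replica=$3
    nodes_per_replica=$4
    min=$5

    # Integer ceiling division.
    by_cores=$(( (cores + cores_per_replica - 1) / cores_per_replica ))
    by_nodes=$(( (nodes + nodes_per_replica - 1) / nodes_per_replica ))

    replicas=$by_cores
    if [ "$by_nodes" -gt "$replicas" ]; then
        replicas=$by_nodes
    fi
    if [ "$min" -gt "$replicas" ]; then
        replicas=$min
    fi
    printf '%s\n' "$replicas"
}

# Hypothetical small cluster: 12 cores, 3 nodes, min set to 2.
linear_replicas 12 3 256 16 2
```

With a minimum of two, a small cluster always runs two coredns pods, so a single unresponsive pod affects roughly half of the lookups that go through the Service.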