k8s_networking - henk52/knowledgesharing GitHub Wiki
Kubernetes Networking
Introduction
Purpose
Vocabulary
References
Troubleshooting
Troubleshooting CoreDNS
dig coredns times out
dig coredns
; <<>> DiG 9.16.27 <<>> metrics-server
;; global options: +cmd
;; connection timed out; no servers could be reached
- kubectl create deployment nginx --image=nginx
- kubectl get pods
> kubectl exec -it nginx-bf5d5cf98-qjbdx -- cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
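Every lookup from the pod goes to the `nameserver` address, so a quick first check is whether it matches the ClusterIP of the cluster DNS service. A minimal sketch: the helper name `resolv_nameservers` is made up for this page, and the service name `coredns` is taken from the commands further down (on many clusters the service is called `kube-dns` instead).

```shell
# Print the nameserver addresses found in resolv.conf-style text on stdin.
resolv_nameservers() {
  awk '$1 == "nameserver" { print $2 }'
}

# Example (not run here; needs a cluster):
#   kubectl exec nginx-bf5d5cf98-qjbdx -- cat /etc/resolv.conf | resolv_nameservers
#   kubectl -n kube-system get svc coredns -o jsonpath='{.spec.clusterIP}'
# The two commands should print the same address.
```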
https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/
kubectl create deployment dnsutils --image=registry.k8s.io/e2e-test-images/agnhost:2.39
k exec -it deploy/dnsutils -- bash
Even on the longhorn manager, /etc/resolv.conf contains:
search longhorn-system.svc.cluster.local svc.cluster.local cluster.local default.svc.cluster.local
nameserver 10.233.0.3
options ndots:5
longhorn-admission-webhook.longhorn-system.svc
Why does the name in the resolv.conf end in .cluster.local?
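Part of the answer: `cluster.local` is the default cluster domain, and the kubelet writes search domains based on it into each pod's resolv.conf. With `ndots:5`, a name with fewer than five dots is first tried with each search domain appended. The sketch below prints the candidates in the order a glibc-style resolver would try them. The function name `expand_search` is hypothetical, and other resolvers (musl, for example) may behave differently.

```shell
# Print the candidate names a glibc-style resolver would try, in order.
# Usage: expand_search NAME NDOTS SEARCH_DOMAIN...
expand_search() (
  name=$1
  ndots=$2
  shift 2
  # A trailing dot marks the name as fully qualified: the search list is skipped.
  case $name in
    *.) printf '%s\n' "$name"; return 0 ;;
  esac
  # Count the dots in the name.
  dots=$(printf '%s' "$name" | tr -cd '.' | wc -c | tr -d ' ')
  # With at least ndots dots, the name is tried as-is before the search list.
  if [ "$dots" -ge "$ndots" ]; then
    printf '%s.\n' "$name"
  fi
  for domain in "$@"; do
    printf '%s.%s.\n' "$name" "$domain"
  done
  # With fewer than ndots dots, the as-is lookup comes last.
  if [ "$dots" -lt "$ndots" ]; then
    printf '%s.\n' "$name"
  fi
)

expand_search longhorn-admission-webhook.longhorn-system.svc 5 \
  longhorn-system.svc.cluster.local svc.cluster.local \
  cluster.local default.svc.cluster.local
```

For the longhorn search list, the third candidate printed is `longhorn-admission-webhook.longhorn-system.svc.cluster.local.`, which is the form that matches a Service record when the cluster domain is `cluster.local`.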
- network
  - Service
    - k -n kube-system get svc coredns -o yaml > svc_coredns_longhorn.yaml
    - k -n kube-system get svc coredns -o yaml > svc_coredns_macau.yaml
    - kdiff3 svc_coredns_macau.yaml svc_coredns_longhorn.yaml
  - Endpoints
    - k -n kube-system get endpoints coredns -o yaml > endpoint_coredns_longhorn.yaml
    - k -n kube-system get endpointslices coredns-4pftc -o yaml > endpoint_coredns_macau.yaml
    - kdiff3 endpoint_coredns_macau.yaml endpoint_coredns_longhorn.yaml
- workloads
  - Deployment
    - k -n kube-system get deployments coredns -o yaml > deployment_coredns_longhorn.yaml
    - k -n kube-system get deployments coredns -o yaml > deployment_coredns_macau.yaml
    - kdiff3 deployment_coredns_macau.yaml deployment_coredns_longhorn.yaml
  - ReplicaSet
    - k -n kube-system get replicasets coredns-5d784884df -o yaml > replicaset_coredns_longhorn.yaml
    - k -n kube-system get replicasets coredns-5d784884df -o yaml > replicaset_coredns_macau.yaml
    - kdiff3 replicaset_coredns_macau.yaml replicaset_coredns_longhorn.yaml
- config
  - configmaps
    - k -n kube-system get configmaps coredns -o yaml > configmaps_coredns_longhorn.yaml
    - k -n kube-system get configmaps coredns -o yaml > configmaps_coredns_macau.yaml
    - kdiff3 configmaps_coredns_macau.yaml configmaps_coredns_longhorn.yaml
- All of the coredns objects above are identical between the two clusters, except for timestamps etc.
- kube-proxy
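Since the dumps differ only in timestamps and similar metadata, filtering those fields out before saving makes the kdiff3 comparison quieter. A minimal sketch: the helper name `strip_volatile` and the choice of fields are assumptions for illustration, and other per-cluster fields (for example under `managedFields`) will still show up.

```shell
# Drop metadata lines that always differ between clusters so that
# kdiff3 only shows meaningful differences. Reads YAML on stdin.
strip_volatile() {
  grep -v -E '^[[:space:]]*(creationTimestamp|resourceVersion|uid):'
}

# Example (not run here; needs a cluster):
#   k -n kube-system get svc coredns -o yaml | strip_volatile > svc_coredns_longhorn.yaml
```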
On both dnsutils pods:
| command | macau | longhorn |
|---|---|---|
| dig kubernetes.default | ok | ok |
| dig coredns | ok | fails, then passes later |
On longhorn
bash-5.0# dig metric-server
; <<>> DiG 9.16.27 <<>> metric-server
;; global options: +cmd
;; connection timed out; no servers could be reached
bash-5.0# nc -zv 10.233.0.3 53
nc: connect to 10.233.0.3 port 53 (tcp) failed: Operation timed out
bash-5.0# nc -zv 10.233.0.3 53
Connection to 10.233.0.3 53 port [tcp/domain] succeeded!
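The two nc runs above gave different results against the same service IP. Repeating the attempt a number of times gives a rough failure rate, which helps tell a one-off glitch from a backend that is consistently unreachable. A minimal sketch, assuming an nc that supports `-z` and `-w`; the function name `try_tcp` is made up for this page.

```shell
# Try a TCP connection several times and report how many attempts succeeded.
# Usage: try_tcp HOST PORT ATTEMPTS
try_tcp() (
  host=$1
  port=$2
  attempts=$3
  ok=0
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -z: only test the connection; -w 2: give up after two seconds.
    if nc -z -w 2 "$host" "$port" >/dev/null 2>&1; then
      ok=$((ok + 1))
    fi
    i=$((i + 1))
  done
  echo "$ok/$attempts connections succeeded"
)

# Example (not run here), from inside the dnsutils pod:
#   try_tcp 10.233.0.3 53 20
```

A success rate near one half would be consistent with one of two backends behind the service not answering, though it does not prove it.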
- dig @10.233.71.1 coredns
- dig @10.233.74.67 coredns
It seems that the coredns pod at 10.233.71.1 always answers and the coredns pod at 10.233.74.67 never answers, which would explain why lookups via the service IP 10.233.0.3 fail only some of the time.
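The per-pod dig checks can be wrapped in a loop so that every backend is queried the same way. A minimal sketch: the function name `probe_dns_servers` is hypothetical, and it relies on dig exiting non-zero when no server replies.

```shell
# Query each DNS server IP directly and report which ones answer.
# Usage: probe_dns_servers NAME IP...
probe_dns_servers() (
  name=$1
  shift
  for ip in "$@"; do
    # +time=2 +tries=1 keeps a dead server from stalling the loop.
    if dig +time=2 +tries=1 +short "@$ip" "$name" >/dev/null 2>&1; then
      echo "$ip ok"
    else
      echo "$ip FAILED"
    fi
  done
)

# Example (not run here), from inside the dnsutils pod:
#   probe_dns_servers kubernetes.default.svc.cluster.local 10.233.71.1 10.233.74.67
```

This only tests reachability: a server that replies with NXDOMAIN still counts as "ok".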
kubectl logs -n kube-system calico-node-lvmqr | grep -v INFO
The dns-autoscaler configmap has the minimum number of coredns pods set to two.
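If dns-autoscaler here is the cluster-proportional-autoscaler running in linear mode (an assumption; the configmap contents are not shown on this page), the minimum of two explains why a small cluster still runs two coredns pods. The sketch below reproduces that calculation with integer arithmetic. The function name `linear_replicas` and the example values in the comments are hypothetical, and the optional upper bound is left out.

```shell
# Replica count in the autoscaler's linear mode, without the max bound.
# Usage: linear_replicas NODES CORES NODES_PER_REPLICA CORES_PER_REPLICA MIN
linear_replicas() (
  nodes=$1
  cores=$2
  npr=$3
  cpr=$4
  min=$5
  by_nodes=$(( (nodes + npr - 1) / npr ))   # ceil(nodes / nodesPerReplica)
  by_cores=$(( (cores + cpr - 1) / cpr ))   # ceil(cores / coresPerReplica)
  replicas=$by_nodes
  if [ "$by_cores" -gt "$replicas" ]; then replicas=$by_cores; fi
  # The configured minimum wins when the cluster is small.
  if [ "$replicas" -lt "$min" ]; then replicas=$min; fi
  echo "$replicas"
)

# Hypothetical example: 3 nodes, 12 cores, 16 nodes per replica,
# 256 cores per replica, minimum 2.
linear_replicas 3 12 16 256 2
```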