
Troubleshooting

This page provides details on common problems encountered after deploying the OneKE Service.

Broken OneGate Access

[!TIP] For detailed info on OneGate, please refer to the OneGate Usage and OneGate Configuration documents.

Because OneKE is a OneFlow service, it requires the OneFlow and OneGate OpenNebula components to be operational.
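
As a first sanity check, you can verify on the OpenNebula front-end that both daemons are running and query the service state. A minimal sketch, assuming a systemd-based front-end with the standard opennebula-flow and opennebula-gate units (the exact output layout of oneflow show varies between versions):

# On the OpenNebula front-end: confirm the OneFlow and OneGate daemons are up
$ systemctl is-active opennebula-flow opennebula-gate
active
active

# Inspect the OneKE service state (replace <service_id> with your service's ID)
$ oneflow show <service_id> | grep -i state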

If the OneKE service is stuck in the DEPLOYING state and only the VMs from the VNF role are visible, there is likely a networking or configuration issue affecting the OneGate component. You can confirm whether OneGate is reachable from the VNF nodes by logging in to a VNF node via SSH and executing the following command:

$ ssh root@<VNF_public_IP> onegate vm show
VM 227
NAME                : vnf_0_(service_105)

If the OneGate endpoint is not reachable from VNF nodes, you'll see an error/timeout message.
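
If the onegate command times out, you can probe the endpoint directly. A minimal sketch, assuming the contextualization package exposes ONEGATE_ENDPOINT in /run/one-context/one_env (the default OneGate port is 5030):

# On the VNF node: read the configured OneGate endpoint and test HTTP reachability
$ source /run/one-context/one_env
$ echo $ONEGATE_ENDPOINT
http://<frontend_IP>:5030
$ curl -m 5 -i "$ONEGATE_ENDPOINT"

A timeout here points to a routing or firewall problem between the VNF network and the front-end, rather than to the onegate CLI itself.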

If the OneKE service is stuck in the DEPLOYING state, all VMs from all roles are visible, and you have confirmed that VMs from the VNF role can access the OneGate component, there may still be a networking issue on the leader VNF node itself. In this case, check whether OneGate is reachable from the Kubernetes nodes by executing the following command over SSH:

ssh -J root@<VNF_public_IP> root@<master_private_IP> onegate vm show

For example:

$ ssh -J root@<VNF_public_IP> root@<master_private_IP> onegate vm show
VM 228
NAME                : master_0_(service_105)

If you see an error/timeout message on a Kubernetes node, but not on a VNF node, you should investigate the networking configuration and logs on the leader VNF VM, specifically the /var/log/messages file.
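
The usual things to verify on the leader VNF VM are that it actually holds the virtual IP addresses and that NAT is in place for the private network. A minimal sketch (the keepalived-managed VIPs and iptables masquerading are assumptions based on the default VNF appliance configuration):

# On the leader VNF VM: confirm it holds the public and private VIP addresses
$ ip -brief address show

# Confirm NAT (masquerading) rules exist for traffic from the private VNET
$ iptables -t nat -L POSTROUTING -n -v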

Broken Access to the Public Internet

If you're constantly getting ImagePullBackOff errors in Kubernetes, verify the following on a worker node:

  • Check if the default gateway points to the private VIP address:

$ ssh -J root@<VNF_public_IP> root@<worker_private_IP> ip route show default
default via 172.20.0.86 dev eth0

  • Check if the DNS config points to the nameserver defined in the private VNET:

$ ssh -J root@<VNF_public_IP> root@<worker_private_IP> cat /etc/resolv.conf
nameserver 1.1.1.1
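
If both checks pass, you can also test name resolution and outbound HTTPS from the worker node directly, since image pulls require both. A minimal sketch (registry-1.docker.io is just an example; substitute whichever registry your images come from):

$ ssh -J root@<VNF_public_IP> root@<worker_private_IP> ping -c 3 1.1.1.1
$ ssh -J root@<VNF_public_IP> root@<worker_private_IP> curl -m 10 -sSI https://registry-1.docker.io/v2/

An immediate HTTP response (even 401 Unauthorized) means connectivity and DNS are fine; a timeout means traffic is not leaving the private network.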

If everything in all the above cases looks correct, you should investigate the networking configuration and logs on the leader VNF VM, specifically the /var/log/messages file.
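
When inspecting the leader VNF VM, messages logged by the VNF daemons are usually the quickest lead. A minimal sketch (the exact daemon names depend on the appliance version; keepalived and dnsmasq are assumptions based on the default VNF appliance):

# On the leader VNF VM: follow the system log while reproducing the failure
$ tail -f /var/log/messages

# Or filter for VNF-related daemons after the fact
$ grep -iE 'keepalived|dnsmasq' /var/log/messages | tail -n 50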