Debugging FAQ
bgrant0607 edited this page Sep 11, 2014 · 20 revisions
Under construction.
Tips that may help you debug why Kubernetes isn't working.
Of course, also take a look at the documentation, especially the getting-started guides.
Depending on the Linux distribution, the logs of system components, including Docker, will be in /var/log or /tmp, or can be accessed using journalctl on systemd-based systems, such as Fedora, RHEL7, or CoreOS.
If the logs don't show anything useful, try turning on verbose logging for the Kubernetes component you suspect is at fault. Verbosity is controlled through glog's -v flag; see https://github.com/golang/glog for more details.
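As a rough illustration of the log hunt described above, the helper below lists candidate log files for a component. It is only a sketch: the function name `find_component_logs` is made up for this page, and the file-name prefix you pass in (for example `docker`) is a guess, since actual file names differ between distributions.

```shell
# List log files for a component under the usual log directories.
# Usage: find_component_logs NAME [DIR...]
# NAME is a file-name prefix such as "docker". This helper and its
# name are illustrative; they are not part of Kubernetes.
find_component_logs() {
  name="$1"
  shift
  # Default to the two directories mentioned above.
  [ "$#" -gt 0 ] || set -- /var/log /tmp
  for dir in "$@"; do
    for f in "$dir/$name"*; do
      # An unmatched glob stays literal, so check that the file exists.
      if [ -f "$f" ]; then
        echo "$f"
      fi
    done
  done
}
```

On a systemd-based system, something like `journalctl -u docker --no-pager -n 50` shows the most recent entries instead; the unit name `docker` is an assumption and may differ on your distribution.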
- `dev-build-and-up.sh` waits forever at `Waiting for cluster initialization`
  - Try `cluster/kube-down.sh` and `hack/dev-build-and-up.sh` again
    - If it still hangs, ctrl-c and try `hack/dev-build-and-push.sh`
      - Check whether all the VMs exist -- typically one master VM and N minions
      - If so, check whether you can ssh into them
      - Check serial console output, if available
    - If it still doesn't work, see provider-specific issues below
- kubecfg cannot reach apiserver
  - Ensure KUBERNETES_MASTER or KUBE_MASTER_IP is set, or use -h
  - Ensure apiserver is running
    - Check that the process is running on the master
    - Check its logs
- kubecfg hangs forever or a pod is in state `Waiting` forever
  - Ensure all backend components are running: controller, scheduler, etcd, kubelets
  - Ensure all k8s components have --etcd_servers set correctly on the command line (if it isn't, you should see error messages in their logs)
    - If it's not set, your networking setup may be broken, since it is usually initialized from the IP address of kubernetes-master, such as in cluster/saltbase/salt/apiserver/default
- apiserver reports `Error synchronizing container: Get http://:10250/podInfo?podID=foo: dial tcp :10250: connection refused`
  - Just means that pod foo has not yet been scheduled (see #1285)
  - Check whether the scheduler is running properly
  - If the scheduler is running, possibly no minion addresses were passed to the apiserver using `--machines` (see `hack/local-cluster-up.sh` for an example)
- Cannot connect to the container
  - Try to telnet to the minion at its service port, and/or to the pod's IP and port
  - Check whether the container has been created in Docker: `sudo docker ps -a`
    - If you don't see the container, there could be a problem with the pod configuration, image, Docker, or Kubelet
    - If you see containers created every 10 seconds, then container creation is failing or the container's process is failing
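A few of the checks in the list above can be sketched as small shell helpers. These are illustrations under stated assumptions, not part of Kubernetes: the function names are invented for this page, the flag parser only understands the `--etcd_servers=VALUE` form, and the container counter assumes the image name is the second column of `docker ps -a` output.

```shell
# 1. kubecfg cannot reach apiserver: confirm a master address is configured.
#    Prints whichever variable is set; fails if neither is.
check_master_env() {
  if [ -n "${KUBERNETES_MASTER:-}" ]; then
    echo "KUBERNETES_MASTER=$KUBERNETES_MASTER"
  elif [ -n "${KUBE_MASTER_IP:-}" ]; then
    echo "KUBE_MASTER_IP=$KUBE_MASTER_IP"
  else
    echo "neither KUBERNETES_MASTER nor KUBE_MASTER_IP is set; pass -h to kubecfg instead" >&2
    return 1
  fi
}

# 2. Extract the --etcd_servers value from a component's command line.
#    Only the "--etcd_servers=VALUE" (or "-etcd_servers=VALUE") form is
#    handled; prints nothing if the flag is absent.
etcd_servers_flag() {
  printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^--*etcd_servers=//p'
}

# 3. Count containers per image, most frequent first. Reads one image
#    name per line on stdin. A count that keeps growing for one image
#    suggests a create-and-fail loop.
count_images() {
  awk 'NF { n[$1]++ } END { for (i in n) print n[i], i }' | sort -rn
}

# Example invocations (commented out because they need a live cluster):
#   check_master_env
#   etcd_servers_flag "$(ps -eo args | grep '[a]piserver')"
#   sudo docker ps -a | awk 'NR > 1 { print $2 }' | count_images
```

Running the third example a few times, some seconds apart, gives a quick sense of whether new containers keep appearing for the same image.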
rm -rf Godeps/_workspace/pkg output _output
TODO
TODO
- Ensure you can ssh to an instance, which may require enabling billing and/or creating an ssh key. Create an instance if you don't have one, then use `gcutil ssh` to ssh into it.
- `gcutil listfirewalls ; gcutil getfirewall default-ssh`
  - If `default-ssh` doesn't exist, do `gcutil addfirewall --description "SSH allowed from anywhere" --allowed=tcp:22 default-ssh`
  - If `gcutil listnetworks`