Pod Containers Keep on Restarting - kubernetes

Container getting killed at node after pod creation
The issue was raised on GitHub and I was asked to move it to SO:
https://github.com/kubernetes/kubernetes/issues/24241
However, I am briefing my issue here. After creating the pod it doesn't run, because I have to mention the pause container image in the kubelet args under --pod-infra-container-image, as described below.
I solved the pod status being stuck in ContainerCreating by adding the image to --pod-infra-container-image=, and then pod creation was successful.
However, I want to resolve this issue some other way instead of adding the image to the kubelet args. Kindly let me know how I can get this fixed.
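For reference, the kubelet flag in question looks roughly like this (the pause image tag is the one mentioned in the answer below; all other kubelet flags are omitted here):
# illustrative kubelet invocation; other flags omitted
kubelet --pod-infra-container-image=gcr.io/google_containers/pause:2.0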
Also, after pod creation is done, the containers keep on restarting, even though kubectl logs shows the container's expected output.
But the container restarts often. To stop the restarts I set restartPolicy: Never in the pod spec file; with that the pod no longer restarted, but the container doesn't run either. Kindly help me.

Your description is very confusing; can you please reply to this answer with:
1. What error you get when you do docker pull gcr.io/google_containers/pause:2.0
2. What cmd you're running in your container
For 2 you need a long running command, like while true; do echo SUCCESS; done, otherwise it'll just exit and get restarted by the kubelet with a RestartPolicy of Always.
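To illustrate point 2, here is a minimal pod sketch with a long-running command (the names and image are placeholders, not taken from the question):
apiVersion: v1
kind: Pod
metadata:
  name: long-running-example
spec:
  restartPolicy: Always
  containers:
  - name: app
    image: busybox
    # a command that exits immediately would be restarted over and over
    # under restartPolicy: Always; keep the process in the foreground
    command: ["sh", "-c", "while true; do echo SUCCESS; sleep 5; done"]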

My pods get SIGTERM and exit gracefully via the signal handler, but I am unable to find the root cause of why the kubelet sends SIGTERM to my pods?

My pods are getting SIGTERM automatically for an unknown reason, and I am unable to find the root cause of why the kubelet is sending SIGTERM to them.
When I ran kubectl describe pod <podname> -n <namespace>, only a Killing event is present under the Events section. I didn't see any unhealthy status before the kill event.
Is there any way to debug further with pod events, or are there specific log files where we can find a trace of the reason for sending SIGTERM?
I tried to run kubectl describe on the Killing event, but there seems to be no such command to drill down into events further.
Any other approach to debug this issue is appreciated. Thanks in advance!
kubectl describe pods snippet
Please can you share the yaml of your deployment so we can try to replicate your problem.
Based on your attached screenshot, it looks like your readiness probe failed to complete repeatedly (it didn't run and fail, it failed to complete entirely), and therefore the cluster killed it.
Without knowing what your Docker image is doing, it's hard to debug further from here.
As a first point of debugging, you can try running kubectl logs -f -n {namespace} {pod-name} to see what the pod is doing and whether it's erroring there.
The error Client.Timeout exceeded while waiting for headers implies your container is proxying something? So perhaps what you're trying to proxy upstream isn't responding.
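As a rough sketch of those debugging steps (the namespace and pod name here are placeholders):
# follow the pod's logs to see whether it errors before the kill
kubectl logs -f -n my-namespace my-pod
# check the Events section and the readiness probe configuration
kubectl describe pod my-pod -n my-namespace
# list recent events in the namespace, oldest first
kubectl get events -n my-namespace --sort-by=.metadata.creationTimestamp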

Iguazio job is stuck on 'Pending' status

I have a job I am running in Iguazio. It starts and then the status is "Pending" and the icon is blue. It stays like this indefinitely and there is nothing in the logs that describes what is going on. How do I fix this?
A job stuck in this status is usually a Kubernetes issue. The reason there are no logs for the job in the Iguazio dashboard is that the pod, which is where the logs come from, never started. You can navigate to the web shell / Jupyter service in Iguazio and use kubectl commands to find out what is going on in Kubernetes. Usually I see this when there is an issue with the Docker image for the pod: it either can't be found or has bugs.
In a terminal, run kubectl get pods and find your pod. It usually has ImagePullBackOff, CrashLoopBackOff, or some similar error. Check the Docker image, which is usually the culprit. You can kill the pod in Kubernetes, which in turn will error the job out. You can also "abort" the job from the menu in the dashboard under that specific job.
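For example (the namespace and pod name are placeholders; adjust them to wherever your Iguazio jobs run):
# find the job's pod and its status (ImagePullBackOff, CrashLoopBackOff, etc.)
kubectl get pods -n my-namespace
# see the events explaining why the pod never started
kubectl describe pod my-job-pod -n my-namespace
# delete the pod, which will error the job out
kubectl delete pod my-job-pod -n my-namespace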

Kubernetes' pods' status is CrashLoopBackOff but no logs are showing up

I am a beginner learning about Kubernetes. I tried pulling an unofficial image from a private registry for Zookeeper in my YAML file for testing, but the pod status was ImagePullBackOff. I got that error rectified and the image was pulled successfully, but the new error reflected in the pod status is CrashLoopBackOff. When I use the command "kubectl logs -f -p zookeeper-n1-pod-0 -c zookeeper-n1 -n test-1", or "kubectl logs podname" in any way or form in the PuTTY terminal, there isn't any output; the cursor just moves to the next line. I tried the "exit $?" command to see the exit status of my previous command and got 0, which means the last command executed successfully, yet I see the pod status as CrashLoopBackOff. I am not able to solve this issue as no logs are present. What is the probable cause and solution for this?
Thanks in advance!!
CrashLoopBackOff means that a pod crashes right after it starts. Kubernetes tries to start the pod again, but the pod crashes again, and this goes on in a loop.
kubectl logs [podname] -p
the -p option will read the logs of the previous (crashed) instance.
Next, you can check the "State Reason", "Last State Reason", and "Events" sections by describing the pod.
kubectl describe pod <podname> -n <namespace>
I would recommend checking this blog: Debugging CrashLoopBackOff.
I had missed changing the zoo_sample.cfg configuration file's name to zoo.cfg in my Dockerfile commands, which led to the Zookeeper server failing to launch, the ImagePullBackOff and CrashLoopBackOff errors, and no logs showing up. It is a compulsory step that must be followed, because zkServer.sh looks for zoo.cfg on startup.
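As a sketch, the relevant Dockerfile step could look like the following (the /opt/zookeeper path is an assumption; adjust it to your image's layout):
# zkServer.sh looks for conf/zoo.cfg on startup, so create it from the sample
RUN cp /opt/zookeeper/conf/zoo_sample.cfg /opt/zookeeper/conf/zoo.cfg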

Heapster status stuck in Container Creating or Pending status

I am new to Kubernetes and started working with it in the past month.
When setting up the cluster, I sometimes see that Heapster gets stuck in ContainerCreating or Pending status. When this happens, the only way I have found to fix it is to re-install everything from scratch, after which Heapster runs without any problem. But I don't think this is the optimal solution every time, so please help with solving this issue when it occurs again.
The Heapster image is pulled from GitHub for our use. Right now the cluster is running fine, so I could not attach a screenshot of Heapster failing with its status stuck in ContainerCreating or Pending.
Please suggest an alternative way to solve the problem if it occurs again.
Thanks in advance for your time.
A pod stuck in Pending state can mean more than one thing. Next time it happens you should run 'kubectl get pods' and then 'kubectl describe pod <pod-name>'. However, since it works sometimes, the most likely cause is that the cluster doesn't have enough resources on any of its nodes to schedule the pod. If the cluster is low on remaining resources, you should get an indication of this from 'kubectl top nodes' and 'kubectl describe nodes'. (Or with GKE, if you are on Google Cloud, you often get a low-resource warning in the web UI console.)
(Or if in Azure then be wary of https://github.com/Azure/ACS/issues/29 )
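A quick sketch of those checks (the pod name is a placeholder; Heapster usually runs in kube-system, but verify the namespace in your setup):
# look for FailedScheduling or image-pull events on the stuck pod
kubectl describe pod heapster-pod -n kube-system
# check how much capacity is left on the nodes
kubectl top nodes
kubectl describe nodes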

On what basis does the restart count in Kubernetes increase?

I have a Kubernetes cluster running fine. It has 4 workers and 1 master, with the dashboard to view the status. After running it for some time, I looked at the restart count of a pod and it was 8. I immediately ran the describe command to get any events, but there were no events for that pod. However, when I checked the logs of the containers, I found out that the node itself was powered down and up 4 times, but I don't know why there were no events for it.
On another node, while looking at the restart count, I got a "Sandbox changed" event, which probably means the node was powered down for some time, so the master lost connection to it and the restart count was incremented by 2.
I want to know how we can get the logs/debug information related to this restart count, to find out why the restarts happened.
Whenever a pod is recreated, does it take a new name? If so, how can we get the events of the previous pod?
Does the "Sandbox changed" event actually mean that the master lost connection?
Step by step:
I'd check the kubelet and Docker daemon logs; these restarts should appear somewhere in them, hopefully with more info about what caused them (see the command sketch after this answer).
Yes, a pod's name is unique, so it changes every time a pod is destroyed and recreated. You can try to find the pod with kubectl get po -a. Another solution is to get all events with kubectl get events and then filter to find your pod's events.
I've seen this error before, and in my case it meant a problem with the Docker daemon networking. But I searched a bit on Google and saw many other reasons. Again, try to analyse the Docker daemon and kubelet logs, and also dmesg. If you have doubts, please add a link to the logs in your question and I'll try to help.
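A short sketch of those checks (the pod name is a placeholder, and the journalctl unit names assume a systemd-based node, which may differ on your distribution):
# kubelet and Docker daemon logs on the affected node
journalctl -u kubelet --since "1 hour ago"
journalctl -u docker --since "1 hour ago"
# kernel messages, e.g. OOM kills or unexpected reboots
dmesg | tail -n 50
# cluster events filtered to a single pod
kubectl get events --field-selector involvedObject.name=my-pod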