Kubernetes pod status is CrashLoopBackOff but no logs are showing up

I am a beginner learning Kubernetes. I tried pulling an unofficial ZooKeeper image from a private registry in my YAML file for testing, but the pod status was ImagePullBackOff. I got that error rectified and the image was pulled successfully, but now the pod status is CrashLoopBackOff. When I run kubectl logs -f -p zookeeper-n1-pod-0 -c zookeeper-n1 -n test-1, or any other form of kubectl logs podname in the PuTTY terminal, there is no output; the cursor just moves to the next line. I ran echo $? to check the exit status of the previous command and got 0, which means the last command executed successfully, yet the pod status is still CrashLoopBackOff. I am not able to debug this because there are no logs. What is the probable cause, and how do I solve it?
Thanks in advance!!

CrashLoopBackOff means the pod crashes right after it starts. Kubernetes starts the pod again, it crashes again, and this repeats in a loop.
kubectl logs [podname] -p
the -p option will read the logs of the previous (crashed) instance.
Next, check the "State", "Last State" (including the reason) and "Events" sections by describing the pod:
kubectl describe pod <podname> -n <namespace>
I would also recommend the blog post Debugging CrashLoopBackOff.
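Putting these commands together, a minimal debugging session for the pod in the question might look like this (pod, container and namespace names are taken from the question; adjust them to yours):

```shell
# Logs of the previous (crashed) container instance
kubectl logs zookeeper-n1-pod-0 -c zookeeper-n1 -n test-1 --previous

# "State", "Last State" (reason, exit code) and "Events" of the pod
kubectl describe pod zookeeper-n1-pod-0 -n test-1

# All events in the namespace, oldest first
kubectl get events -n test-1 --sort-by=.metadata.creationTimestamp
```

The exit code shown under "Last State" usually narrows the failure down even when the container produced no logs at all.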

I had missed renaming the zoo_sample.cfg configuration file to zoo.cfg in my Dockerfile, which caused the ZooKeeper server launch to fail and led to the ImagePullBackOff error, the CrashLoopBackOff error, and no logs showing up. This step is mandatory because zkServer.sh looks for zoo.cfg on startup.
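For reference, the missing step amounts to a single line in the Dockerfile (the path shown is an assumption based on a standard ZooKeeper layout; adjust it to your image):

```dockerfile
# zkServer.sh only reads conf/zoo.cfg on startup, so the sample config must be renamed
RUN cp /opt/zookeeper/conf/zoo_sample.cfg /opt/zookeeper/conf/zoo.cfg
```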

Related

My pods receive SIGTERM and exit gracefully in the signal handler, but I am unable to find the root cause of why the kubelet sends SIGTERM to my pods

My pods are getting SIGTERM automatically for an unknown reason, and I am unable to find the root cause of why the kubelet sends SIGTERM to them.
When I ran kubectl describe pod <podname> -n <namespace>, only a Killing event was present under the Events section. I didn't see any unhealthy status before the kill event.
Is there any way to debug further with the pod's events, or any specific log files where I can find a trace of the reason for the SIGTERM?
I tried running kubectl describe on the Killing event, but there seems to be no such command to drill down into events further.
Any other approach to debugging this issue is appreciated. Thanks in advance!
kubectl describe pods snippet
Could you please share the YAML of your deployment so we can try to replicate your problem?
Based on your attached screenshot, it looks like your readiness probe failed to complete repeatedly (it didn't run and fail, it failed to complete entirely), and therefore the cluster killed it.
Without knowing what your Docker image is doing, it's hard to debug further from here.
As a first point of debugging, you can try kubectl logs -f -n {namespace} {pod-name} to see what the pod is doing and whether it's erroring there.
The error Client.Timeout exceeded while waiting for headers implies your container is proxying something, so perhaps the upstream you're trying to proxy to isn't responding.
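If the readiness probe is the suspect, its configuration and the probe-related events can be dumped directly (pod and namespace names are placeholders):

```shell
# Show the configured readiness probe (endpoint, timeouts, thresholds)
kubectl get pod <pod-name> -n <namespace> \
  -o jsonpath='{.spec.containers[*].readinessProbe}'

# List only the events belonging to this pod, including probe failures
kubectl get events -n <namespace> \
  --field-selector involvedObject.name=<pod-name>
```

Comparing the probe's timeout and failure threshold against how long the upstream takes to respond often explains a Killing event that has no unhealthy status before it.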

Iguazio job is stuck on 'Pending' status

I have a job I am running in Iguazio. It starts and then the status is "Pending" and the icon is blue. It stays like this indefinitely and there is nothing in the logs that describes what is going on. How do I fix this?
A job stuck in this status is usually a Kubernetes issue. The reason there are no logs in the Iguazio dashboard for the job is that the pod never started, which is where the logs come from. You can navigate to the web shell / Jupyter service in Iguazio and use kubectl commands to find out what is going on in Kubernetes. Usually, I see this when there is an issue with the Docker image for the pod: it either can't be found or has bugs.
In a terminal, run kubectl get pods and find your pod. It usually has ImagePullBackOff, CrashLoopBackOff, or some similar error. Check the Docker image, which is usually the culprit. You can kill the pod in Kubernetes, which in turn will error the job out. You can also "abort" the job from the menu in the dashboard under that specific job.
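A typical session from the Iguazio web shell might look like this (pod, job and namespace names are placeholders):

```shell
# Find the job's pod and its status (ImagePullBackOff, CrashLoopBackOff, ...)
kubectl get pods -n <namespace> | grep <job-name>

# Inspect why the pod never started (image pull errors show up under Events)
kubectl describe pod <pod-name> -n <namespace>

# Kill the pod, which in turn errors the job out in the dashboard
kubectl delete pod <pod-name> -n <namespace>
```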

Kubernetes: view logs of crashed Airflow worker pod

Pods on our k8s cluster are scheduled with Airflow's KubernetesExecutor, which runs all Tasks in a new pod.
I have a such a Task for which the pod instantly (after 1 or 2 seconds) crashes, and for which of course I want to see the logs.
This seems hard. As soon the pod crashes, it gets deleted, along with the ability to retrieve crash logs. I already tried all of:
kubectl logs -f <pod> -p: cannot be used since these pods are named uniquely (courtesy of KubernetesExecutor).
kubectl logs -l label_name=label_value: I struggle to apply the labels to the pod (if this is a known/used way of working, I'm happy to try further).
A shared NFS is mounted on all pods on a fixed log directory. The failing pod, however, does not log to this folder.
When I am really quick, I run kubectl logs -f -l dag_id=sample_dag --all-containers (the dag_id label is added by Airflow) between running and crashing and see Error from server (BadRequest): container "base" in pod "my_pod" is waiting to start: ContainerCreating. This might give me some clue, but:
these are only the last log lines
this is really backwards
I'm basically looking for the canonical way of retrieving logs from transient pods
You need to enable remote logging. Code sample below is for using S3. In airflow.cfg set the following:
remote_logging = True
remote_log_conn_id = my_s3_conn
remote_base_log_folder = s3://airflow/logs
The my_s3_conn connection can be set in the Airflow UI under Admin > Connections. In the Conn Type dropdown, select S3.
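As an alternative to the UI, Airflow can also pick up connections from environment variables of the scheduler and workers; a sketch, assuming the connection id my_s3_conn from above (the key values are placeholders):

```shell
# AIRFLOW_CONN_<CONN_ID> defines the connection as a URI;
# the uppercase suffix maps to the conn id "my_s3_conn"
export AIRFLOW_CONN_MY_S3_CONN='aws://<access-key-id>:<secret-access-key>@'
```

This is convenient in Kubernetes, where the variable can be injected into the worker pods via a Secret instead of being stored in the metadata database.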

Kubernetes is Down

All of a sudden, I get this error when I run kubectl commands
The connection to the server xx.xx.xxx.xx:6443 was refused - did you specify the right host or port?
I can see kubelet running, but none of the other Kubernetes related services are running.
docker ps -a | grep kube-api
returns me nothing.
What I tried after searching Google for a resolution:
Turned off swap --> issue persists.
Restarted the Linux machine --> right after restart, kubectl commands gave output, but after about 15 seconds it went back to the error mentioned above.
Restarted the kubelet --> for 1 second, kubectl commands gave output, but then back to square one.
I'm not sure what exactly I'm supposed to do here.
NB: the K8s cluster was installed with kubeadm.
Also, I can briefly see pods in the Evicted state during the moments when kubectl get pods produces output.
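With a kubeadm cluster, the API server runs as a static pod managed by the kubelet, so the kubelet's own logs are usually the first place to look when the API server keeps dying shortly after starting (commands assume a systemd host with the Docker runtime, as in the question):

```shell
# Why does the kubelet keep restarting the control-plane pods?
journalctl -u kubelet --no-pager | tail -n 50

# kubeadm's control-plane components are static pods defined here
ls /etc/kubernetes/manifests

# Is the kube-apiserver container cycling (created, exited, created, ...)?
docker ps -a | grep kube-apiserver
```

Evicted pods alongside a dying API server often point at node pressure (disk or memory), which the kubelet log entries will name explicitly.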

Pod Containers Keep Restarting

Container getting killed at node after pod creation
The issue was raised on GitHub, and I was asked to move it to SO:
https://github.com/kubernetes/kubernetes/issues/24241
However, I am briefing my issue here. After creating the pod, it doesn't run, since I have to mention the image name in the kubelet args under --pod-infra-container-image, as mentioned below.
I solved the issue of the pod being stuck in ContainerCreating status by adding the image name in --pod-infra-container-image=<image>; pod creation was then successful.
However, I want to resolve this issue some other way instead of adding the image name to the kubelet args. Kindly let me know how to get this fixed.
Also, after pod creation is done, the containers keep restarting, even though kubectl logs shows the container's expected output.
To stop the restarting, I set restartPolicy: Never in the pod's spec file; then it didn't restart, but the container doesn't run either. Kindly help me.
Your description is very confusing; can you please reply to this answer with:
1. What error you get when you do docker pull gcr.io/google_containers/pause:2.0
2. What cmd you're running in your container
For 2 you need a long-running command, like while true; do echo SUCCESS; done; otherwise the container will just exit and get restarted by the kubelet with a RestartPolicy of Always.
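A minimal pod spec with such a long-running command might look like this (the image and names are illustrative, not from the question):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: busybox-loop
spec:
  restartPolicy: Always
  containers:
  - name: loop
    image: busybox
    # A long-running command so the container does not exit immediately;
    # a one-shot command here would reproduce the restart loop described above.
    command: ["sh", "-c", "while true; do echo SUCCESS; sleep 5; done"]
```

With restartPolicy: Never and a one-shot command, the pod would instead end up in the Completed state after a single run, which matches the "didn't restart, but the container doesn't run" observation.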