How to diagnose an "Error syncing pod" error on Kubernetes? - kubernetes

I've got a deployment with a pod that is stuck.
The describe output has some sensitive details in it, but the events section has this at the end:
...
Normal Pulled 18m (x3 over 21m) kubelet, ip-10-151-21-127.ec2.internal Successfully pulled image "example/asdf"
Warning FailedSync 7m (x53 over 19m) kubelet, ip-10-151-21-127.ec2.internal Error syncing pod
What is the cause of this error? How can I diagnose this further?
It seems to be re-pulling the image, but it's odd that it's x10 over 27m. I wonder if it's maybe hitting a timeout?
Warning FailedSync 12m (x53 over 23m) kubelet, ip-10-151-21-127.ec2.internal Error syncing pod
Normal Pulling 2m (x10 over 27m) kubelet, ip-10-151-21-127.ec2.internal pulling image "aoeuoeauhtona.epgso"

The kubelet process is responsible for pulling images from a registry.
This is how you can check the kubelet logs:
$ journalctl -u kubelet
More information about images can be found in the documentation.
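For example, to narrow the kubelet log down to the pod from the events above (a sketch; <pod-name> is a placeholder for your actual pod name):
$ journalctl -u kubelet --since "30 min ago" --no-pager | grep -i "<pod-name>"
$ journalctl -u kubelet -f     # follow the log live while the kubelet retries the sync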

You can check the logs of your pod:
kubectl logs pod-id
More information here: https://kubernetes.io/docs/tasks/debug-application-cluster/debug-pod-replication-controller/
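If the pod's own logs are empty, the previous container instance and the recent cluster events can also help; a general sketch (not specific to this pod):
kubectl logs pod-id --previous
kubectl describe pod pod-id
kubectl get events --sort-by=.metadata.creationTimestamp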

Related

Back-off pulling image metrics in kubernetes

I installed the k8s dashboard by following the GitHub dashboard instructions:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.2.0/aio/deploy/recommended.yaml
and the output of
kubectl get pods --all-namespaces
kubernetes-dashboard dashboard-metrics-scraper-79c5968bdc-64wl9 0/1 ImagePullBackOff 0 48m
kubernetes-dashboard kubernetes-dashboard-9f9799597-w9cp9 1/1 Running 0 48m
and the output of kubectl describe pod is:
Normal Scheduled 41m default-scheduler Successfully assigned kubernetes-dashboard/dashboard-metrics-scraper-79c5968bdc-64wl9 to lab-vm
Normal Pulling 30m (x4 over 41m) kubelet Pulling image "kubernetesui/metrics-scraper:v1.0.6"
Warning Failed 27m (x7 over 37m) kubelet Error: ImagePullBackOff
Warning Failed 24m (x5 over 37m) kubelet Failed to pull image "kubernetesui/metrics-scraper:v1.0.6": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kubernetesui/metrics-scraper:v1.0.6": failed to resolve reference "docker.io/kubernetesui/metrics-scraper:v1.0.6": unexpected status code [manifests v1.0.6]: 408 Request Time-out
Warning Failed 10m (x7 over 37m) kubelet Error: ErrImagePull
Normal BackOff 81s (x73 over 37m) kubelet Back-off pulling image "kubernetesui/metrics-scraper:v1.0.6"
Please help me fix this issue.
Thanks a lot.
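The 408 Request Time-out in the Failed event suggests the node is timing out while talking to Docker Hub, rather than the image reference being wrong. One way to narrow this down is to try the pull directly on the node; a diagnostic sketch (the exact command depends on your container runtime):
# on the node (lab-vm), if Docker is the runtime:
docker pull kubernetesui/metrics-scraper:v1.0.6
# if containerd is the runtime, via crictl:
crictl pull docker.io/kubernetesui/metrics-scraper:v1.0.6
# once a manual pull succeeds, delete the stuck pod so it is recreated and retried:
kubectl delete pod -n kubernetes-dashboard dashboard-metrics-scraper-79c5968bdc-64wl9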

I can't get into the azure kubernetes pod

I want to get inside the pod to check a flaw I'm having with the image, but the errors below appear. As I verified with describe, the container name is correct. What else can I do to connect to the container in the cluster?
Command: kubectl exec -it -c airflow-console -n airflow airflow-console-xxxxxxx-xxxxx bash
error: unable to upgrade connection: container not found ("airflow-console")
Command: kubectl describe pod/airflow-console-xxxxxxx-xxxxx -n airflow
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 37m default-scheduler Successfully assigned airflow/airflow-console-xxxxxxx-xxxxx to aks-test
Normal Pulling 37m kubelet, aks-test Pulling image "test.azurecr.io/airflow:2"
Normal Pulled 37m kubelet, aks-test Successfully pulled image "test.azurecr.io/airflow:2"
Warning BackOff 36m kubelet, aks-test Back-off restarting failed container
Normal Pulled 36m (x3 over 37m) kubelet, aks-test Container image "k8s.gcr.io/git-sync:v3.1.2" already present on machine
Normal Created 36m (x3 over 37m) kubelet, aks-test Created container git-sync
Normal Started 36m (x3 over 37m) kubelet, aks-test Started container git-sync
Normal Created 36m (x3 over 36m) kubelet, aks-test Created container airflow-console
Normal Pulled 36m (x2 over 36m) kubelet, aks-test Container image "test.azurecr.io/airflow:2" already present on machine
Normal Started 36m (x3 over 36m) kubelet, aks-test Started container airflow-console
Warning BackOff 2m15s (x178 over 36m) kubelet, aks-test Back-off restarting failed container
this line
Warning BackOff 2m15s (x178 over 36m) kubelet, aks-test Back-off restarting failed container
shows that your pod/container is in a failed state. This will prevent you from executing commands in the container, because it is not alive.
To learn why your pod/container is in a bad state, you should look at the logs of the failed container
kubectl logs -n airflow airflow-console-xxxxxxx-xxxxx -c airflow-console
or the logs of the previous instance of the container that failed (sometimes that helps):
kubectl logs -n airflow airflow-console-xxxxxxx-xxxxx -c airflow-console -p
This explains the main reason why a user cannot exec into a container.
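The last termination state of the failing container (reason and exit code) is also recorded in the pod status; a sketch using the same placeholder pod name:
kubectl get pod airflow-console-xxxxxxx-xxxxx -n airflow \
  -o jsonpath='{.status.containerStatuses[?(@.name=="airflow-console")].lastState.terminated}'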

Troubles while installing GlusterFS on Kubernetes cluster using Heketi

I'm trying to install GlusterFS on my Kubernetes cluster using Heketi. I start gk-deploy, but it shows that the pods aren't found:
Using Kubernetes CLI.
Using namespace "default".
Checking for pre-existing resources...
GlusterFS pods ... not found.
deploy-heketi pod ... not found.
heketi pod ... not found.
gluster-s3 pod ... not found.
Creating initial resources ... Error from server (AlreadyExists): error when creating "/heketi/gluster-kubernetes/deploy/kube-templates/heketi-service-account.yaml": serviceaccounts "heketi-service-account" already exists
Error from server (AlreadyExists): clusterrolebindings.rbac.authorization.k8s.io "heketi-sa-view" already exists
clusterrolebinding.rbac.authorization.k8s.io/heketi-sa-view not labeled
OK
node/sapdh2wrk1 not labeled
node/sapdh2wrk2 not labeled
node/sapdh2wrk3 not labeled
daemonset.extensions/glusterfs created
Waiting for GlusterFS pods to start ... pods not found.
I've started gk-deploy more than once.
I have 3 nodes in my Kubernetes cluster, and it seems like the pods can't start up on any of them, but I don't understand why.
The pods are created but aren't ready:
kubectl get pods
NAME READY STATUS RESTARTS AGE
glusterfs-65mc7 0/1 Running 0 16m
glusterfs-gnxms 0/1 Running 0 16m
glusterfs-htkmh 0/1 Running 0 16m
heketi-754dfc7cdf-zwpwn 0/1 ContainerCreating 0 74m
Here is the event log of one GlusterFS pod; it ends with a warning:
Events:
Type Reason Age From Message
Normal Scheduled 19m default-scheduler Successfully assigned default/glusterfs-65mc7 to sapdh2wrk1
Normal Pulled 19m kubelet, sapdh2wrk1 Container image "gluster/gluster-centos:latest" already present on machine
Normal Created 19m kubelet, sapdh2wrk1 Created container
Normal Started 19m kubelet, sapdh2wrk1 Started container
Warning Unhealthy 13m (x12 over 18m) kubelet, sapdh2wrk1 Liveness probe failed: /usr/local/bin/status-probe.sh
failed check: systemctl -q is-active glusterd.service
Warning Unhealthy 3m58s (x35 over 18m) kubelet, sapdh2wrk1 Readiness probe failed: /usr/local/bin/status-probe.sh
failed check: systemctl -q is-active glusterd.service
GlusterFS 5.8-100.1 is installed and started on every node, including the master.
What is the reason the pods don't start up?
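Both probes run /usr/local/bin/status-probe.sh, and it is failing on systemctl -q is-active glusterd.service, so glusterd is not coming up inside the containers. A diagnostic sketch, using the pod names from the output above:
kubectl exec -it glusterfs-65mc7 -- systemctl status glusterd.service
kubectl exec glusterfs-65mc7 -- journalctl -u glusterd.service --no-pager -n 50
kubectl describe pod heketi-754dfc7cdf-zwpwn    # a pod stuck in ContainerCreating usually has a volume/mount event explaining why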

How to fix ImagePullBackOff with a Kubernetes pod?

I created a pod 5 hours ago. Now I have the error: ImagePullBackOff.
These are the events from describe pod:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4h51m default-scheduler Successfully assigned default/nodehelloworld.example.com to minikube
Normal Pulling 4h49m (x4 over 4h51m) kubelet, minikube pulling image "milenkom/docker-demo"
Warning Failed 4h49m (x4 over 4h51m) kubelet, minikube Failed to pull image "milenkom/docker-demo": rpc error: code = Unknown desc = Error response from daemon: manifest for milenkom/docker-demo:latest not found
Warning Failed 4h49m (x4 over 4h51m) kubelet, minikube Error: ErrImagePull
Normal BackOff 4h49m (x6 over 4h51m) kubelet, minikube Back-off pulling image "milenkom/docker-demo"
Warning Failed 4h21m (x132 over 4h51m) kubelet, minikube Error: ImagePullBackOff
Warning FailedMount 5m13s kubelet, minikube MountVolume.SetUp failed for volume "default-token-zpl2j" : couldn't propagate object cache: timed out waiting for the condition
Normal Pulling 3m34s (x4 over 5m9s) kubelet, minikube pulling image "milenkom/docker-demo"
Warning Failed 3m32s (x4 over 5m2s) kubelet, minikube Failed to pull image "milenkom/docker-demo": rpc error: code = Unknown desc = Error response from daemon: manifest for milenkom/docker-demo:latest not found
Warning Failed 3m32s (x4 over 5m2s) kubelet, minikube Error: ErrImagePull
Normal BackOff 3m5s (x6 over 5m1s) kubelet, minikube Back-off pulling image "milenkom/docker-demo"
Warning Failed 3m5s (x6 over 5m1s) kubelet, minikube Error: ImagePullBackOff
Images on my desktop
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
milenkom/docker-demo tagname 08d27ff00255 6 hours ago 659MB
Following the advice from Max and Shanica, I made a mess when tagging:
docker tag 08d27ff00255 docker-demo:latest
That works OK, but when I try
docker push docker-demo:latest
The push refers to repository [docker.io/library/docker-demo]
e892b52719ff: Preparing
915b38bfb374: Preparing
3f1416a1e6b9: Preparing
e1da644611ce: Preparing
d79093d63949: Preparing
87cbe568afdd: Waiting
787c930753b4: Waiting
9f17712cba0b: Waiting
223c0d04a137: Waiting
fe4c16cbf7a4: Waiting
denied: requested access to the resource is denied
although I am logged in
Output of docker image inspect 08d27ff00255:
[
{
"Id": "sha256:08d27ff0025581727ef548437fce875d670f9e31b373f00c2a2477f8effb9816",
"RepoTags": [
"docker-demo:latest",
"milenkom/docker-demo:tagname"
],
Why does it fail to assign the pod now?
manifest for milenkom/docker-demo:latest not found
Looks like there's no latest tag on the image you want to pull: https://hub.docker.com/r/milenkom/docker-demo/tags.
Try an image tag that actually exists.
UPD (based on question update):
docker push milenkom/docker-demo:tagname
update k8s pod to point to milenkom/docker-demo:tagname
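For the second step, a sketch (the deployment and container names are placeholders for whatever your manifest actually uses):
kubectl set image deployment/<deployment-name> <container-name>=milenkom/docker-demo:tagname
If it is a bare pod rather than a deployment, change the image field in the pod manifest and re-apply it instead.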

Kubernetes pod deployment error: FailedSync | Error syncing pod

Env:
VirtualBox on a Windows 10 desktop machine
Two Ubuntu VMs: one VM is the master and the other one is a k8s (1.7) worker.
I can see both nodes are "Ready" in get nodes. But even when deploying a very simple nginx pod, I get these error messages from the pod describe output:
"Normal | SandboxChanged | Pod sandbox changed, it will be killed and re-created." and "Warning | FailedSync | Error syncing pod".
But if I run the Docker container directly on the worker, the container comes up and runs fine. Does anyone have a suggestion for what I can check?
k8s-master@k8smaster-VirtualBox:~$ kubectl get pods
NAME                            READY   STATUS             RESTARTS   AGE
movie-server-1517284798-lbb01   0/1     CrashLoopBackOff   6          16m
k8s-master@k8smaster-VirtualBox:~$ kubectl describe pod movie-server-1517284798-lbb01
--- clip ---
kubelet, master-virtualbox   spec.containers{movie-server}   Warning   Failed           Error: failed to start container "movie-server": Error response from daemon: {"message":"cannot join network of a non running container: 3f59947dbd404ecf2f6dd0b65dd9dad8b25bf0c418aceb8cf666ad0761402b53"}
kubelet, master-virtualbox   spec.containers{movie-server}   Warning   BackOff          Back-off restarting failed container
kubelet, master-virtualbox                                   Normal    SandboxChanged   Pod sandbox changed, it will be killed and re-created.
kubelet, master-virtualbox   spec.containers{movie-server}   Normal    Pulled           Container image "nancyfeng/movie-server:0.1.0" already present on machine
kubelet, master-virtualbox   spec.containers{movie-server}   Normal    Created          Created container
kubelet, master-virtualbox                                   Warning   FailedSync       Error syncing pod
kubelet, master-virtualbox   spec.containers{movie-server}   Warning   Failed           Error: failed to start container "movie-server": Error response from daemon: {"message":"cannot join network of a non running container: 72ba77b25b6a3969e8921214f0ca73ffaab4c82d8a2852e3d1b1f3ac5dde6ce1"}
--- clip ---
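The "cannot join network of a non running container" message usually means the pod's sandbox (pause) container keeps dying, which most often points at the pod network plugin on the worker. A diagnostic sketch, assuming Docker is the container runtime:
# on the worker node:
journalctl -u kubelet --no-pager -n 100 | grep -i movie-server
docker ps -a | grep -E "movie-server|_POD_"    # is the pause (sandbox) container for the pod running?
# from the master:
kubectl logs movie-server-1517284798-lbb01 --previous
kubectl get pods -n kube-system -o wide        # is the network plugin pod (flannel, calico, weave, ...) healthy on the worker?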