K8s pod ImagePullBackOff - kubernetes

I created a very simple nginx pod and ran into the status ImagePullBackOff.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 32m default-scheduler Successfully assigned reloader/nginx to aks-appnodepool1-22779252-vmss000000
Warning Failed 29m kubelet Failed to pull image "nginx": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/library/nginx:latest": failed to resolve reference "docker.io/library/nginx:latest": failed to do request: Head "https://registry-1.docker.io/v2/library/nginx/manifests/latest": dial tcp 52.200.78.26:443: i/o timeout
Warning Failed 27m kubelet Failed to pull image "nginx": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/library/nginx:latest": failed to resolve reference "docker.io/library/nginx:latest": failed to do request: Head "https://registry-1.docker.io/v2/library/nginx/manifests/latest": dial tcp 52.21.28.242:443: i/o timeout
Warning Failed 23m kubelet Failed to pull image "nginx": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/library/nginx:latest": failed to resolve reference "docker.io/library/nginx:latest": failed to do request: Head "https://registry-1.docker.io/v2/library/nginx/manifests/latest": dial tcp 3.223.210.206:443: i/o timeout
Normal Pulling 22m (x4 over 32m) kubelet Pulling image "nginx"
Warning Failed 20m (x4 over 29m) kubelet Error: ErrImagePull
Warning Failed 20m kubelet Failed to pull image "nginx": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/library/nginx:latest": failed to resolve reference "docker.io/library/nginx:latest": failed to do request: Head "https://registry-1.docker.io/v2/library/nginx/manifests/latest": dial tcp 3.228.155.36:443: i/o timeout
Warning Failed 20m (x7 over 29m) kubelet Error: ImagePullBackOff
Warning Failed 6m41s kubelet Failed to pull image "nginx": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/library/nginx:latest": failed to resolve reference "docker.io/library/nginx:latest": failed to do request: Head "https://registry-1.docker.io/v2/library/nginx/manifests/latest": dial tcp 52.5.157.114:443: i/o timeout
Normal BackOff 2m17s (x65 over 29m) kubelet Back-off pulling image "nginx"
Checked network status:
A VM in the same subnet can access "https://registry-1.docker.io/v2/library/nginx/manifests/latest", and telnet 52.5.157.114 443 succeeds.
docker pull nginx succeeds on the VM in the same subnet.
From a running pod in the same cluster (via kubectl exec), wget https://registry-1.docker.io/v2/library/nginx/manifests/latest succeeds.
What is the possible problem?

When I access the following URL with wget, curl, or any other client:
https://registry-1.docker.io/v2/library/nginx/manifests/latest
it returns:
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":[{"Type":"repository","Class":"","Name":"library/nginx","Action":"pull"}]}]}
However, this is because you need to be logged in to pull this image from this repository.
There are 2 solutions:
The first is simple: in the image field, just replace this URL with nginx:latest and it should work.
The second: create a regcred (an image pull secret holding your registry credentials).
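For the second option, a minimal sketch of what a regcred holds may help. The snippet below builds a Secret manifest of type kubernetes.io/dockerconfigjson as a Python dict. The function name build_regcred, the registry URL, the username, and the password are placeholders for illustration, not values from the question, and such a secret only matters when the registry actually requires authentication for the image.

```python
import base64
import json


def build_regcred(registry, username, password, name="regcred", namespace="default"):
    """Return a Secret manifest (as a dict) of type kubernetes.io/dockerconfigjson."""
    # The "auth" field is base64("username:password").
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    docker_config = {
        "auths": {registry: {"username": username, "password": password, "auth": auth}}
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "kubernetes.io/dockerconfigjson",
        # Values under "data" in a Secret must be base64-encoded.
        "data": {
            ".dockerconfigjson": base64.b64encode(
                json.dumps(docker_config).encode()
            ).decode()
        },
    }


if __name__ == "__main__":
    # Placeholder credentials, for illustration only.
    secret = build_regcred("https://index.docker.io/v1/", "someuser", "s3cret")
    print(json.dumps(secret, indent=2))
```

The pod then refers to the secret by name under imagePullSecrets. On a real cluster, kubectl create secret docker-registry produces a secret of the same type without hand-building the payload.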

In your pod YAML, change image: docker.io/library/nginx:latest to docker.io/nginx:latest.

It turned out that a firewall was dropping the packets.
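The checks in the question were run from a neighbouring VM and from inside a pod, not from the node whose kubelet pulls the image, so a probe run on the node itself can help tell a silently dropped connection from a rejected one. The following is a minimal sketch; probe is a hypothetical helper, and the 5 second timeout is an arbitrary assumption.

```python
import socket


def probe(host, port, timeout=5.0):
    """Try a TCP connect and report 'open', 'refused', 'timeout' or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The connection was actively rejected: nothing is listening on the
        # port, or a firewall answered with a reset.
        return "refused"
    except socket.timeout:
        # No answer at all, which is what a firewall silently dropping
        # packets looks like from the client side.
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar problems.
        return f"error: {exc}"


if __name__ == "__main__":
    # Registry host taken from the events in the question.
    print(probe("registry-1.docker.io", 443))
```

A result of "timeout" on the node, while the same probe reports "open" from another machine in the subnet, would be consistent with the i/o timeout seen in the kubelet events.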

Related

coredns connection refused error while setting up kubernetes cluster

I've got a kubernetes cluster set up with kubeadm. I haven't deployed any pods yet, but the coredns pods are stuck in a ContainerCreating status.
[root@master-node ~]# kubectl get -A pods
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-64897985d-f5kjh 0/1 ContainerCreating 0 151m
kube-system coredns-64897985d-xz9nt 0/1 ContainerCreating 0 151m
[...]
When I check it out with kubectl describe I see this:
[root@master-node ~]# kubectl describe -n kube-system pod coredns-64897985d-f5kjh
[...]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreatePodSandBox 22m (x570 over 145m) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "4974dadd11fecf1ebfbcccd75701641b752426808889895672f34e6934776207": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/4974dadd11fecf1ebfbcccd75701641b752426808889895672f34e6934776207": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 18m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "bce2558b24468c0d0e83fe1eedf2fa70108420a466d000b74ceaf351e595007d": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/bce2558b24468c0d0e83fe1eedf2fa70108420a466d000b74ceaf351e595007d": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 18m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "e53e79bc3642c9a0c2b240dc174931af9f5dddf7d5b7df50382fcb3fea351df9": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/e53e79bc3642c9a0c2b240dc174931af9f5dddf7d5b7df50382fcb3fea351df9": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 18m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "b6da6e72057c3b48ac6ced3ba6b81917111e94c20216b65126a2733462139ed1": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/b6da6e72057c3b48ac6ced3ba6b81917111e94c20216b65126a2733462139ed1": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 18m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "09416534b75ef7beea279f9389eb1a732b6a288c3b170a489e04cce01c294fa2": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/09416534b75ef7beea279f9389eb1a732b6a288c3b170a489e04cce01c294fa2": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 17m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "411fe06179ab24a3999b1c034bc99452d99249bbb6cb966b496f7a8b467e1806": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/411fe06179ab24a3999b1c034bc99452d99249bbb6cb966b496f7a8b467e1806": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 17m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "e0fc2a5d4852cd31eca4b473f614cadcb9235a2a325c01b469110bfd6bbf9a3b": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/e0fc2a5d4852cd31eca4b473f614cadcb9235a2a325c01b469110bfd6bbf9a3b": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 17m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "4528997239e55f7ef546c0af9cc7c12cf5fe4942a370ed2a772ba7fc405773d2": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/4528997239e55f7ef546c0af9cc7c12cf5fe4942a370ed2a772ba7fc405773d2": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 17m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "b534273b4fe3b893cdeac05555e47429bc7578c1e0c0095481fe155637f0c4ae": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/b534273b4fe3b893cdeac05555e47429bc7578c1e0c0095481fe155637f0c4ae": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 17m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "afc479a4bfa16cef4367ecfee74333dfa9bbf12c59995446792f22c8e39ca16d": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/afc479a4bfa16cef4367ecfee74333dfa9bbf12c59995446792f22c8e39ca16d": dial tcp 127.0.0.1:6784: connect: connection refused
Warning FailedCreatePodSandBox 3m50s (x61 over 16m) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "a9254528ba611403a9b2293a2201c8758ff4adf75fd4a1d2b9690d15446cc92a": unable to allocate IP address: Post "http://127.0.0.1:6784/ip/a9254528ba611403a9b2293a2201c8758ff4adf75fd4a1d2b9690d15446cc92a": dial tcp 127.0.0.1:6784: connect: connection refused
Any idea what could be causing this?
It turns out this is a firewall issue. I was using Weave Net as my CNI, which requires port 6784 to be open to work. You can see this in the error, where it's trying to access 127.0.0.1:6784 and getting connection refused (pretty obvious in hindsight). I fixed it by opening port 6784 on my firewall. For firewalld, I ran:
firewall-cmd --permanent --add-port=6784/tcp
firewall-cmd --reload
This might be a security problem. The Weave Net docs say something about this port only being accessible to certain processes, but I'm not sure of the details. For my application security isn't a big concern, so I didn't look into it.
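The endpoint and the failure reason can be read straight out of the kubelet event message, as noted above. As a rough sketch, the hypothetical helper below pulls them out with a regular expression. The pattern is based only on the message shape seen in these events and is an assumption, not a stable format.

```python
import re

# Matches the tail of dial errors such as
# "dial tcp 127.0.0.1:6784: connect: connection refused".
DIAL_RE = re.compile(r"dial tcp (?P<host>\S+):(?P<port>\d+): (?P<reason>.+)")


def parse_dial_error(message):
    """Return (host, port, reason) from an event message, or None if no match."""
    match = DIAL_RE.search(message)
    if match is None:
        return None
    reason = match.group("reason")
    # Drop the "connect: " prefix that some errors carry.
    if reason.startswith("connect: "):
        reason = reason[len("connect: "):]
    return match.group("host"), int(match.group("port")), reason


if __name__ == "__main__":
    event = (
        'unable to allocate IP address: Post "http://127.0.0.1:6784/ip/4974": '
        "dial tcp 127.0.0.1:6784: connect: connection refused"
    )
    print(parse_dial_error(event))
```

A loopback address with "connection refused" points at a local component that is not accepting connections, whereas a public address with "i/o timeout" points at traffic being dropped somewhere on the way out.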

AKS cluster to Nexus repo via VPN

I need to be able to connect to a Nexus repository from my AKS cluster. The pods get deployed but end up with an ImagePullBackOff error, with an rpc error code. The repo can be accessed only via a VPN, and I can manually run a docker pull command to pull the image, but the pods on the AKS cluster are not able to connect. What am I doing wrong? Please help!
Here is the error message I get when I describe the pod. I have even applied a secret with the credentials for the remote Nexus repo. Still no luck.
Events:
Type Reason Age From Message
Normal Scheduled 51s default-scheduler Successfully assigned my-namespace/export-example-5c649db546-trhm6 to aks-nodepool1-30782560-vmss00006z
Warning Failed 20s kubelet Failed to pull image "dockerhub.myrepoexample.com/omnius-vnext/export-example:4.0.1": rpc error: code = Unknown desc = failed to pull and unpack image "dockerhub.myrepoexample.com/omnius-vnext/export-example:4.0.1": failed to resolve reference "dockerhub.myrepoexample.com/omnius-vnext/export-example:4.0.1": failed to do request: Head "https://dockerhub.myrepoexample.com/v2/omnius-vnext/export-example/manifests/4.0.1": dial tcp 35.154.211.153:443: i/o timeout
Warning Failed 20s kubelet Error: ErrImagePull
Normal BackOff 20s kubelet Back-off pulling image "dockerhub.myrepoexample.com/omnius-vnext/export-example:4.0.1"
Warning Failed 20s kubelet Error: ImagePullBackOff
Normal Pulling 9s (x2 over 51s) kubelet Pulling image "dockerhub.myrepoexample.com/omnius-vnext/export-example:4.0.1"

What can cause an ErrImagePull in kubernetes? The registry returns 403

Hey, I'm trying to get a pipeline to work with Kubernetes, but I keep getting ErrImagePull.
Earlier I was getting something along the lines of "authentication failed".
I created a secret in the namespace of the pod and referred to it in the deployment file:
imagePullSecrets:
- name: "registry-secret"
I still get ErrImagePull but now for different reasons. When describing the failed pod I get:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m46s default-scheduler Successfully assigned <project> to <server>
Normal Pulling 3m12s (x4 over 4m45s) kubelet Pulling image "<container_url>"
Warning Failed 3m12s (x4 over 4m45s) kubelet Failed to pull image "<container_url>": rpc error: code = Unknown desc = Requesting bear token: invalid status code from registry 403 (Forbidden)
Warning Failed 3m12s (x4 over 4m45s) kubelet Error: ErrImagePull
Warning Failed 3m (x6 over 4m45s) kubelet Error: ImagePullBackOff
Normal BackOff 2m46s (x7 over 4m45s) kubelet Back-off pulling image "<container_url>"
I guess the Registry is returning 403, but why? Does it mean the user in registry-secret is not allowed to pull the image?
OP has posted in the comment that the problem is resolved:
I found the error. So I had a typo and my secret was in fact not created in the correct namespace.
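A pod can only use image pull secrets that live in its own namespace, which is why a secret created in the wrong namespace does not help. Below is a minimal sketch of that check, working on plain dicts shaped like Pod manifests. The helper missing_pull_secrets and the sample namespace names are hypothetical.

```python
def missing_pull_secrets(pod, secrets_by_namespace):
    """Return names in spec.imagePullSecrets that are absent from the pod's namespace.

    secrets_by_namespace maps a namespace name to the set of Secret names in it.
    """
    namespace = pod.get("metadata", {}).get("namespace", "default")
    available = secrets_by_namespace.get(namespace, set())
    wanted = [ref["name"] for ref in pod.get("spec", {}).get("imagePullSecrets", [])]
    return [name for name in wanted if name not in available]


if __name__ == "__main__":
    pod = {
        "metadata": {"name": "app", "namespace": "ci"},
        "spec": {"imagePullSecrets": [{"name": "registry-secret"}]},
    }
    # The secret exists, but in a different namespace than the pod.
    print(missing_pull_secrets(pod, {"default": {"registry-secret"}}))
```

On a live cluster, the same question can be answered with kubectl get secret registry-secret -n followed by the pod's namespace.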

helm3 installation fails with: failed post-install: timed out waiting for the condition

I'm new to Kubernetes and Helm. I have installed k3d and helm:
k3d version v1.7.0
k3s version v1.17.3-k3s1
helm version
version.BuildInfo{Version:"v3.2.4", GitCommit:"0ad800ef43d3b826f31a5ad8dfbb4fe05d143688", GitTreeState:"clean", GoVersion:"go1.13.12"}
I do have a cluster created with 10 worker nodes. When I try to install stackstorm-ha on the cluster I see the following issues:
helm install stackstorm/stackstorm-ha --generate-name --debug
client.go:534: [debug] stackstorm-ha-1592860860-job-st2-apikey-load: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
Error: failed post-install: timed out waiting for the condition
helm.go:84: [debug] failed post-install: timed out waiting for the condition
njbbmacl2813:~ gangsh9$ kubectl get pods
Unable to connect to the server: net/http: TLS handshake timeout
kubectl describe pods shows either:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/stackstorm-ha-1592857897-st2api-7f6c877b9c-dtcp5 to k3d-st2hatest-worker-5
Warning Failed 23m kubelet, k3d-st2hatest-worker-5 Error: context deadline exceeded
Normal Pulling 17m (x5 over 37m) kubelet, k3d-st2hatest-worker-5 Pulling image "stackstorm/st2api:3.3dev"
Normal Pulled 17m (x5 over 28m) kubelet, k3d-st2hatest-worker-5 Successfully pulled image "stackstorm/st2api:3.3dev"
Normal Created 17m (x5 over 28m) kubelet, k3d-st2hatest-worker-5 Created container st2api
Normal Started 17m (x4 over 28m) kubelet, k3d-st2hatest-worker-5 Started container st2api
Warning BackOff 53s (x78 over 20m) kubelet, k3d-st2hatest-worker-5 Back-off restarting failed container
or
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/stackstorm-ha-1592857897-st2timersengine-c847985d6-74h5k to k3d-st2hatest-worker-2
Warning Failed 6m23s kubelet, k3d-st2hatest-worker-2 Failed to pull image "stackstorm/st2timersengine:3.3dev": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/stackstorm/st2timersengine:3.3dev": failed to resolve reference "docker.io/stackstorm/st2timersengine:3.3dev": failed to authorize: failed to fetch anonymous token: Get https://auth.docker.io/token?scope=repository%3Astackstorm%2Fst2timersengine%3Apull&service=registry.docker.io: net/http: TLS handshake timeout
Warning Failed 6m23s kubelet, k3d-st2hatest-worker-2 Error: ErrImagePull
Normal BackOff 6m22s kubelet, k3d-st2hatest-worker-2 Back-off pulling image "stackstorm/st2timersengine:3.3dev"
Warning Failed 6m22s kubelet, k3d-st2hatest-worker-2 Error: ImagePullBackOff
Normal Pulling 6m10s (x2 over 6m37s) kubelet, k3d-st2hatest-worker-2 Pulling image "stackstorm/st2timersengine:3.3dev"
Kind of stuck here.
Any help would be greatly appreciated.
The TLS handshake timeout error is very common when the machine you are running your deployment on is running out of resources. An alternative cause is a slow internet connection or some proxy settings, but we ruled that out since you can pull and run Docker images locally and deploy a small nginx web server in your cluster.
As you may notice, the stackstorm helm chart installs a large number of services/pods inside your cluster, which can take up a lot of resources.
It will install 2 replicas for each component of StackStorm
microservices for redundancy, as well as backends like RabbitMQ HA,
MongoDB HA Replicaset and etcd cluster that st2 relies on for MQ, DB
and distributed coordination respectively.
I deployed stackstorm on both k3d and GKE but I had to use fast machines in order to deploy this quickly and successfully.
NAME: stackstorm
LAST DEPLOYED: Mon Jun 29 15:25:52 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
Congratulations! You have just deployed StackStorm HA!

FailedCreatePodSandBox on GCP Kubernetes cluster

Warning FailedCreatePodSandBox 6s (x6 over 74s) kubelet, gke-xxxx-default-pool-71axxxx-dkbr Failed create pod sandbox: rpc error: code = Unknown desc = Error response from daemon: Get https://k8s.gcr.io/v2/: dial tcp: lookup k8s.gcr.io on 1xx.2xx.1xx.2xx:53: server misbehaving
This is the status of my pod when I run kubectl describe pod <podname>.
The pod is also stuck in the ContainerCreating state.
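The "lookup k8s.gcr.io ... server misbehaving" part of the message points at name resolution on the node rather than at the registry itself. As a hedged sketch, the hypothetical helper below checks whether a hostname resolves from wherever it is run. It uses the system resolver, so it would need to run on the affected node to say anything about that node.

```python
import socket


def resolve(hostname):
    """Return the sorted list of IP addresses the system resolver gives for hostname."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # Resolution failed: no such name, or the resolver is unreachable
        # or returning errors.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Hostname taken from the event in the question.
    print(resolve("k8s.gcr.io"))
```

An empty list on the node, while other machines resolve the name fine, would suggest looking at the DNS server the node is configured to use.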