kubernetes issue : Init:ErrImagePull - kubernetes

I have following critical issue on kubernetes.
From couple of days I'm fighting with images from docker issues.
I have one server with docker-ce and on another master-server on kubernetes.
I don't understant this issue because everything was working fine and last time I have issues wirth images on docker.
Images on docker are there, and from kubernetes I see below errors:
Failed to pull image "busybox": rpc error: code = Unknown desc = Error
response from daemon: Get https://registry-1.docker.io/v2/: net/http:
TLS handshake timeout
and also issues with get image from docker: Error: ErrImagePull
Back-off pulling image "busybox" Error: ImagePullBackOff
Can someone help me with that please ?

Related

k8s access denied when pulling local images

I use podman with a local registry. I am able to pull the images from the command line and also see the manifest. When I deploy my k8s it fails to pull the image with error access denied. Any idea's? I Googled for days now but do not get an answer that works.
I run Ubuntu 22.04 on VMWARE if that maybe makes a difference. Thank you.
kubelet Failed to pull image "localhost:5001/datawire/aes:3.1.0": rpc error: code = Unknown desc = failed to pull and unpack image "localhost:5001/datawire/aes:3.1.0": failed to resolve reference "localhost:5001/datawire/aes:3.1.0": failed to do request: Head "http://localhost:5001/v2/datawire/aes/manifests/3.1.0": dial tcp 127.0.0.1:5001: connect: connection refused
When Kubernetes attempts to create a new pod but is unable to pull the required container image, an error message stating "Failed to pull image" will be displayed. When you try to add a new resource to your cluster with a command like "kubectl apply," you should see this right away. When you inspect the pod with "kubectl describe pod/my-pod," the error will appear in the Events.
Pull errors are caused by the nodes in your cluster. Each node's Kubelet worker process is in charge of obtaining the images required to process a pod scheduling request. When a node is unable to download an image, the status is reported to the cluster control plane.
Images may not pull for a variety of reasons. It's possible that your nodes' networking failed, or that the cluster as a whole is experiencing connectivity issues. If you are online, the registry is up, and pull errors continue to occur, your firewall or traffic filtering may be malfunctioning.
Refer this doc1 and doc2 for more information.

Not able to pull registry.k8s.io in Microk8s start

I'm trying to kickstart a MicroK8s cluster but the Calico pod stays on Pending status because of a 403 error against the pull of registry.k8s.io/pause:3.7
This is the error:
Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox image "registry.k8s.io/pause:3.7": failed to pull image "registry.k8s.io/pause:3.7": failed to pull and unpack image "registry.k8s.io/pause:3.7": failed to resolve reference "registry.k8s.io/pause:3.7": pulling from host registry.k8s.io failed with status code [manifests 3.7]: 403 Forbidden
We're talking about a new server which might be missing some configuration.
The insecure registry, according to Microk8s documentation, is enabled at localhost:3200.
I've enabled the dns on the Microk8s but nothing has change.
If I try to pull from docker I get a forbidden error.

Failed to pull image "velero/velero-plugin-for-gcp:v1.1.0" while installing Velero in GKE Cluster

I'm trying to install and configure Velero for kubernetes backup. I have followed the link to configure it in my GKE cluster. The installation went fine, but velero is not working.
I am using google cloud shell for running all my commands (I have installed and configured velero client in my google cloud shell)
On further inspection on velero deployment and velero pods, I found out that it is not able to pull the image from the docker repository.
kubectl get pods -n velero
NAME READY STATUS RESTARTS AGE
velero-5489b955f6-kqb7z 0/1 Init:ErrImagePull 0 20s
Error from velero pod (kubectl describe pod) (output redacted for readability - only relevant info shown below)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 38s default-scheduler Successfully assigned velero/velero-5489b955f6-kqb7z to gke-gke-cluster1-default-pool-a354fba3-8674
Warning Failed 22s kubelet, gke-gke-cluster1-default-pool-a354fba3-8674 Failed to pull image "velero/velero-plugin-for-gcp:v1.1.0": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Failed 22s kubelet, gke-gke-cluster1-default-pool-a354fba3-8674 Error: ErrImagePull
Normal BackOff 21s kubelet, gke-gke-cluster1-default-pool-a354fba3-8674 Back-off pulling image "velero/velero-plugin-for-gcp:v1.1.0"
Warning Failed 21s kubelet, gke-gke-cluster1-default-pool-a354fba3-8674 Error: ImagePullBackOff
Normal Pulling 8s (x2 over 37s) kubelet, gke-gke-cluster1-default-pool-a354fba3-8674 Pulling image "velero/velero-plugin-for-gcp:v1.1.0"
Command used to install velero: (some of the values are given as variables)
velero install \
--provider gcp \
--plugins velero/velero-plugin-for-gcp:v1.1.0 \
--bucket $storagebucket \
--secret-file ~/velero-backup-storage-sa-key.json
Velero Version
velero version
Client:
Version: v1.4.2
Git commit: 56a08a4d695d893f0863f697c2f926e27d70c0c5
<error getting server version: timed out waiting for server status request to be processed>
GKE version
v1.15.12-gke.2
Isn't this a Private Cluster ? – mario 31 mins ago
#mario this is a private cluster but I can deploy other services without any issues (for eg: I have deployed nginx successfully) –
Sreesan 15 mins ago
Well, this is a know limitation of GKE Private Clusters. As you can read in the documentation:
Can't pull image from public Docker Hub
Symptoms
A Pod running in your cluster displays a warning in kubectl describe such as Failed to pull image: rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Potential causes
Nodes in a private cluster do not have outbound access to the public
internet. They have limited access to Google APIs and services,
including Container Registry.
Resolution
You cannot fetch images directly from Docker Hub. Instead, use images
hosted on Container Registry. Note that while Container Registry's
Docker Hub
mirror
is accessible from a private cluster, it should not be exclusively
relied upon. The mirror is only a cache, so images are periodically
removed, and a private cluster is not able to fall back to Docker Hub.
You can also compare it with this answer.
It can be easily verified on your own by making a simple experiment. Try to run two different nginx deployments. First based on image nginx (which equals to nginx:latest) and the second one based on nginx:1.14.2.
While the first scenario is perfectly feasible because the nginx:latest image can be pulled from Container Registry's Docker Hub mirror which is accessible from a private cluster, any attempt of pulling nginx:1.14.2 will fail which you'll see in Pod events. It happens because the kubelet is not able to find this version of the image in GCR and it tries to pull it from public docker registry (https://registry-1.docker.io/v2/), which in Private Clusters is not possible. "The mirror is only a cache, so images are periodically removed, and a private cluster is not able to fall back to Docker Hub." - as you can read in docs.
If you still have doubts, just ssh into your node and try to run following commands:
curl https://cloud.google.com/container-registry/
curl https://registry-1.docker.io/v2/
While the first one works perfectly, the second one will eventually fail:
curl: (7) Failed to connect to registry-1.docker.io port 443: Connection timed out
Reason ? - "Nodes in a private cluster do not have outbound access to the public internet."
Solution ?
You can search what is currently available in GCR here.
In many cases you should be able to get the required image if you don't specify it's exact version (by default latest tag is used). While it can help with nginx, unfortunatelly no version of velero/velero-plugin-for-gcp is currently available in Google Container Registry's Docker Hub mirror.
Granting private nodes outbound internet access by using Cloud NAT seems the only reasonable solution that can be applied in your case.
I solved this problem by realizing that version of:
velero/velero-plugin-for-gcp
is not following the version of:
velero/velero
For example, now latest versions are:
velero/velero:v1.9.1 and velero/velero-plugin-for-gcp:v1.5.0

One node for a GKE cluster cannot pull image from dockerhub

This is a very wried thing.
I created a private GKE cluster with a node pool of 3 nodes. Then I have a replica set with 3 pods. some of these pods will be scheduled to one node.
So one of these pods always get ImagePullBackOff, I check the error
Failed to pull image "bitnami/mongodb:3.6": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
And the pods scheduled to the remaining two nodes work well.
I ssh to that node, run docker pull and everything is fine. I cannot find another way to troubleshoot this error.
I tried to drain or delete that node and let the cluster to recreate the node. but it is still not working.
Help me, please.
Update:
From GCP documentation, it will fail to pull images from the docker hub.
BUT the weirdest thing is ONLY ONE node is unable to pull the images.
There was a related reported bug in Kubernetes 1.11
Make sure it is not your case

kubernetes | docker | no image found error while rolling update

Have create updated image with new tag for rolling but then while performing update with this command: kubectl set image deployments/hello-node-1 hello-node-1=hello-node:v2
Getting error: kubelet, minikube Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "hello-node-1" with ErrImagePull: "rpc error: code = 2 desc = Error: image library/hello-node not found"
It looks like you didn't set the image correctly. Did you push it to the correct repository? A way to test it could be to create a new deployment that uses your newly created image.
You are referring to the wrong image. The error message shows that the kubelet is attempting to pull hello-node:v2 as an official image from docker hub (library/...).
If you did push your image to docker hub then prefix the image name with your docker hub username.
If this is in some private repository then prefix it with a repository hostname.
If you built the image locally on the node then make sure your imagePullPolicy in your Deployment is set to IfNotPresent and make sure the image is actually present on all nodes this pod might be scheduled to run on.
For minikube check out this post.