Kubernetes can't pull certain images from IBM Cloud registry

My pod fails with the following event:
Warning Failed 21m (x4 over 23m) kubelet, 10.76.199.35 Failed to pull image "registryname/image:version1.2": rpc error: code = Unknown desc = Error response from daemon: unauthorized: authentication required
Other images pull without problems. The output of
ibmcloud cr images
doesn't show anything different about the images that don't work. What could be going wrong here?

Given that this is in Kubernetes and you can see the image in ibmcloud cr images, it is most likely a misconfiguration of your imagePullSecrets.
If you run kubectl get pod <pod-name> -o yaml, you will see which imagePullSecrets are in scope for the pod and can check whether they look correct (it could be worth comparing against a pod that is working).
Note that if your cluster is an instance in the IBM Cloud Kubernetes Service, a default imagePullSecret for your account is added to the default namespace. If you are running the pod in a different Kubernetes namespace, you will need additional steps to make that work. This is a good place to start for information on this topic:
https://console.bluemix.net/docs/containers/cs_images.html#other
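
The check and the "additional steps" described above could look roughly like the sketch below. It is only an illustration: the function names are made up, and the secret name you pass in (for example all-icr-io) is an assumption, so list the real names first with kubectl get secrets -n default.

```shell
# Print the names of the imagePullSecrets a pod was created with.
pod_pull_secrets() {
  # $1 = pod name, $2 = namespace (defaults to "default")
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.spec.imagePullSecrets[*].name}'
}

# Copy a secret into another namespace by rewriting its namespace field.
copy_pull_secret() {
  # $1 = secret name, $2 = source namespace, $3 = target namespace
  kubectl get secret "$1" -n "$2" -o yaml \
    | sed "s/^  namespace: $2\$/  namespace: $3/" \
    | kubectl create -n "$3" -f -
}

# Example (hypothetical names):
#   pod_pull_secrets my-pod my-namespace
#   copy_pull_secret all-icr-io default my-namespace
```

After copying, the pod spec (or the service account it runs under) in the target namespace still has to reference the secret under imagePullSecrets.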

It looks like you haven't logged in to the IBM Cloud Container Registry. If you haven't done this yet, log in with this command:
ibmcloud cr login
Other possible causes:
Docker is not installed.
The Docker client is not logged in to IBM Cloud Container Registry.
Your IBM Cloud access token might have expired.
You can find more troubleshooting instructions here

Related

not able to pull image in POD ,getting ImagePullBackOff

My Kubernetes nodes run on separate VMs, one node per VM. In total I have 7 worker nodes.
When I try to create a pod on one of the nodes I get an ImagePullBackOff error, even though docker pull on the same node succeeds.
The rest of the worker nodes are working fine.
My Docker registry is already listed under insecure-registries in daemon.json.
Please help.
ImagePullBackOff is often caused by a typo in the image name. Make sure you specified the name correctly.
You need to describe the Pod using: kubectl describe pod <name>. It will show a more detailed message why pulling fails.
The kubernetes service account attached to the Pod is probably not able to pull the image. The service account must have the correct ImagePullSecrets.
When no service account is configured, it uses the default service account.
kubectl get sa -o yaml
This will give a list of ImagePullSecrets attached to this service account. See if you have created the correct secret and attached it to the service account.
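
As a rough sketch of that check, the helpers below read the imagePullSecrets of a service account and attach an existing secret to it with kubectl patch. The function names and the secret name regcred are placeholders, not something from the question.

```shell
# Show the imagePullSecrets attached to a service account.
sa_pull_secrets() {
  # $1 = service account (default: "default"), $2 = namespace (default: "default")
  kubectl get serviceaccount "${1:-default}" -n "${2:-default}" \
    -o jsonpath='{.imagePullSecrets[*].name}'
}

# Attach an existing registry secret to a service account.
attach_pull_secret() {
  # $1 = secret name, $2 = service account, $3 = namespace
  kubectl patch serviceaccount "${2:-default}" -n "${3:-default}" \
    -p "{\"imagePullSecrets\": [{\"name\": \"$1\"}]}"
}

# Example (hypothetical secret name):
#   sa_pull_secrets default my-namespace
#   attach_pull_secret regcred default my-namespace
```

Pods pick up the service account's imagePullSecrets when they are created, so existing pods need to be recreated after the patch.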
I resolved the issue.
The issue was the container runtime. The affected nodes were using containerd as their runtime, so I configured containerd on those nodes to access my insecure registry. Everything was OK after that.
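
This explains why docker pull worked on the node while the pod still failed: daemon.json only configures the Docker daemon, and a kubelet that talks to containerd does not read it. A quick way to spot such nodes, sketched here with made-up function names, is to list the runtime each node reports:

```shell
# List each node with the container runtime its kubelet reports.
node_runtimes() {
  kubectl get nodes --no-headers \
    -o custom-columns=NAME:.metadata.name,RUNTIME:.status.nodeInfo.containerRuntimeVersion
}

# Keep only nodes whose runtime is not Docker (for example containerd://...).
non_docker_nodes() {
  node_runtimes | awk '$2 !~ /^docker:\/\// { print $1, $2 }'
}
```

Nodes printed by non_docker_nodes need the insecure registry configured in their own runtime's settings rather than in daemon.json.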

401 Unauthorized error while trying to pull image from Google Container Registry

I am using google container registry (GCR) to push and pull docker images. I have created a deployment in kubernetes with 3 replicas. The deployment will use a docker image pulled from the GCR.
Out of the 3 replicas, 2 pull the image and run fine. But the third replica shows the error below, and the pod's status remains "ImagePullBackOff" or "ErrImagePull":
"Failed to pull image "gcr.io/xxx:yyy": rpc error: code = Unknown desc = failed to pull and unpack image "gcr.io/xxx:yyy": failed to resolve reference "gcr.io/xxx:yyy": unexpected status code: 401 Unauthorized"
I am confused about why only one of the replicas shows the error while the other 2 run without any issue. Can anyone please clarify this?
Thanks in Advance!
ImagePullBackOff and ErrImagePull indicate that the image used by a container cannot be loaded from the image registry.
A 401 Unauthorized error might occur when you pull an image from a private Container Registry repository. To troubleshoot the error:
Identify the node that runs the pod with kubectl describe pod POD_NAME | grep "Node:"
Verify that the node has the storage scope by running the command
gcloud compute instances describe NODE_NAME --zone=COMPUTE_ZONE --format="flattened(serviceAccounts[].scopes)"
The node's access scope should contain at least one of the following:
serviceAccounts[0].scopes[0]: https://www.googleapis.com/auth/devstorage.read_only
serviceAccounts[0].scopes[0]: https://www.googleapis.com/auth/cloud-platform
Recreate the node pool that the node belongs to with sufficient scope. You cannot modify the scopes of existing nodes; you must recreate the nodes with the correct scope.
Create a new node pool with the gke-default scope with the following command:
gcloud container node-pools create NODE_POOL_NAME --cluster=CLUSTER_NAME --zone=COMPUTE_ZONE --scopes="gke-default"
Or create a new node pool with only the storage scope:
gcloud container node-pools create NODE_POOL_NAME --cluster=CLUSTER_NAME --zone=COMPUTE_ZONE --scopes="https://www.googleapis.com/auth/devstorage.read_only"
Refer to the link for more information on the troubleshooting process.
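
The scope check from the steps above can be wrapped in a small helper, sketched below. The function name node_can_pull is made up for illustration; the gcloud command and the two accepted scopes are the ones quoted in the answer.

```shell
# Succeed if the node's service account has a scope that allows image pulls.
node_can_pull() {
  # $1 = node (instance) name, $2 = compute zone
  gcloud compute instances describe "$1" --zone="$2" \
    --format="flattened(serviceAccounts[].scopes)" \
    | grep -Eq 'auth/(devstorage\.read_only|cloud-platform)$'
}

# Example:
#   node_can_pull NODE_NAME COMPUTE_ZONE || echo "node lacks the storage scope"
```

Running it against the node of the failing replica and against a node of a working replica should show whether the two come from node pools with different scopes.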
You need to set up a role that lets the cluster access GCR for pulling and pushing images; see https://github.com/GoogleContainerTools/skaffold/issues/336

How to find the pod that led to an error in GKE

If I look at my logs in GCP logs, I see for instance that I got a request that gave 500
log_message: "Method: some_cloud_goo.Endpoint failed: INTERNAL_SERVER_ERROR"
I would like to quickly go to that pod and do a kubectl logs on it. But I did not find a way to do this.
I am fairly new to k8s and GKE, any way to traceback the pod that handled that request?
You can run kubectl get pods to check the status of all pods, and then run kubectl describe pod <pod-name> on a suspect pod for a detailed description of the error.
As mentioned in @Neelam's answer, you can get the pod names with the command kubectl get pods -A and go through your pods to find the error.
Alternatively, you could deploy a custom monitoring system such as Elastic GKE Logging, available in the GCP GitHub Click-to-deploy repository.
See here to install it from Marketplace with a few clicks.
It is a free way to get a complete monitoring system, and once it is deployed you can filter your logs in the Kibana dashboard.
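
Going through the pods by hand gets tedious, so here is a minimal sketch that searches the recent logs of every pod in one namespace for a given string. The function name and the one-hour window are arbitrary choices for illustration.

```shell
# Print the pods in a namespace whose recent logs contain a given string.
pods_with_log_line() {
  # $1 = namespace, $2 = fixed string to search for
  for pod in $(kubectl get pods -n "$1" -o name); do
    if kubectl logs -n "$1" "$pod" --all-containers --since=1h 2>/dev/null \
        | grep -Fq -- "$2"; then
      echo "$pod"
    fi
  done
}

# Example:
#   pods_with_log_line default "INTERNAL_SERVER_ERROR"
```

This only works while the pods still exist and the message is recent enough to fall inside the chosen window.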

In kubernetes not able to attach to container in a pod

I am not able to attach to a container in a pod. I receive the message below:
Error from server (Forbidden): pods "sleep-76df4f989c-mqvnb" is forbidden: cannot exec into or attach to a privileged container
Could someone please let me know what I am missing?
This seems to be a permission (possibly RBAC) issue.
See Kubernetes pod security-policy.
For instance, gluster/gluster-kubernetes issue 432 points to Azure PR 1961, which disables the cluster-admin rights (although you can customize/override the admission-controller flags passed to the API server).
So it depends on the nature of your Kubernetes environment.
I have not enabled RBAC at all. What I have done is enable Istio, and all the pods now run with a sidecar.
I am not able to attach or exec into pods that have the Istio sidecar.
I am able to attach or exec into pods that do not have the Istio proxy sidecar.
I need help here.

Kubernetes unable to pull images from gcr.io

I am trying to set up Kubernetes for the first time. I am following the Fedora Manual installation guide: http://kubernetes.io/v1.0/docs/getting-started-guides/fedora/fedora_manual_config.html
I am trying to get the Kubernetes addons running, specifically the kube-ui. I created the service and replication controller like so:
kubectl create -f cluster/addons/kube-ui/kube-ui-rc.yaml --namespace=kube-system
kubectl create -f cluster/addons/kube-ui/kube-ui-svc.yaml --namespace=kube-system
When I run
kubectl get events --namespace=kube-system
I see errors such as this:
Failed to pull image "gcr.io/google_containers/pause:0.8.0": image pull failed for gcr.io/google_containers/pause:0.8.0, this may be because there are no credentials on this request. details: (Authentication is required.)
How am I supposed to tell Kubernetes to authenticate? This isn't covered in the documentation. So how do I fix this?
This happened because of a recent outage in GCE storage, which caused all of us to hit this error while pulling images from GCR (which uses GCE storage on the backend).
Are you still seeing this error?
As the message says, you need credentials. Are you using Google Container Engine? Then you need to run
gcloud config set project <your-project>
gcloud config set compute/zone <your-zone, like us-central1-f>
gcloud beta container clusters get-credentials --cluster <your-cluster-name>
Then your GCE cluster will have the credentials.