Pulling a image from gcr.to fails - kubernetes

I am able to create a kubernetes cluster and I followed the steps in to pull a private image from GCR repository.
https://cloud.google.com/container-registry/docs/advanced-authentication
https://cloud.google.com/container-registry/docs/access-control
I am unable to pull the image from GCR. I have used the below commands
gcloud auth login
I have authendiacted the service accounts.
Connection between the local machine and gcr as well.
Below is the error
$ kubectl describe pod test-service-55cc8f947d-5frkl
Name: test-service-55cc8f947d-5frkl
Namespace: default
Priority: 0
Node: gke-test-gke-clus-test-node-poo-c97a8611-91g2/10.128.0.7
Start Time: Mon, 12 Oct 2020 10:01:55 +0530
Labels: app=test-service
pod-template-hash=55cc8f947d
tier=test-service
Annotations: kubernetes.io/limit-ranger: LimitRanger plugin set: cpu request for container test-service
Status: Pending
IP: 10.48.0.33
IPs:
IP: 10.48.0.33
Controlled By: ReplicaSet/test-service-55cc8f947d
Containers:
test-service:
Container ID:
Image: gcr.io/test-256004/test-service:v2
Image ID:
Port: 8080/TCP
Host Port: 0/TCP
State: Waiting
Reason: ErrImagePull
Ready: False
Restart Count: 0
Requests:
cpu: 100m
Environment:
test_SERVICE_BUCKET: test-pt-prod
COPY_FILES_DOCKER_IMAGE: gcr.io/test-256004/test-gcs-copy:latest
test_GCP_PROJECT: test-256004
PIXALATE_GCS_DATASET: test_pixalate
PIXALATE_BQ_TABLE: pixalate
APP_ADS_TXT_GCS_DATASET: test_appadstxt
APP_ADS_TXT_BQ_TABLE: appadstxt
Mounts:
/test/output from test-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-6g7nl (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
test-volume:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: test-pvc
ReadOnly: false
default-token-6g7nl:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-6g7nl
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 42s default-scheduler Successfully assigned default/test-service-55cc8f947d-5frkl to gke-test-gke-clus-test-node-poo-c97a8611-91g2
Normal SuccessfulAttachVolume 38s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-25025b4c-2e89-4400-8e0e-335298632e74"
Normal SandboxChanged 31s kubelet, gke-test-gke-clus-test-node-poo-c97a8611-91g2 Pod sandbox changed, it will be killed and re-created.
Normal Pulling 15s (x2 over 32s) kubelet, gke-test-gke-clus-test-node-poo-c97a8611-91g2 Pulling image "gcr.io/test-256004/test-service:v2"
Warning Failed 15s (x2 over 32s) kubelet, gke-test-gke-clus-test-node-poo-c97a8611-91g2 Failed to pull image "gcr.io/test-256004/test-service:v2": rpc error: code = Unknown desc = Error response from daemon: pull access denied for gcr.io/test-256004/test-service, repository does not exist or may require 'docker login': denied: Permission denied for "v2" from request "/v2/test-256004/test-service/manifests/v2".
Warning Failed 15s (x2 over 32s) kubelet, gke-test-gke-clus-test-node-poo-c97a8611-91g2 Error: ErrImagePull
Normal BackOff 3s (x4 over 29s) kubelet, gke-test-gke-clus-test-node-poo-c97a8611-91g2 Back-off pulling image "gcr.io/test-256004/test-service:v2"
Warning Failed 3s (x4 over 29s) kubelet, gke-test-gke-clus-test-node-poo-c97a8611-91g2 Error: ImagePullBackOff

If you don't use workload identity, the default service account of your pod is this one of the nodes, and the nodes, by default, use the Compute Engine service account.
Make sure to grant it the correct permission to access to GCR.
If you use another service account, grant it with the Storage Object Reader role (when you pull an image, you read a blob stored in Cloud Storage (at least it's the same permission)).
Note: even if it's the default service account, I don't recommend to use the Compute Engine service account with any change in its roles. Indeed, it is project editor, that is a lot of responsability.

Related

Failed to pull image with "x509: certificate signed by unknown authority" error

I am using k3s kubernetes, and Harbor as a private container registry. I use a self-sign cert in Harbor. And I have a sample image in Harbor, which I want to create a sample pod in Kubernetes using this private Harbor image.
I created a file call testPod.yml with the following content to create the pod:
apiVersion: v1
kind: Pod
metadata:
name: test
spec:
containers:
- name: test
image: harbor-server/t_project/test:001
imagePullSecrets:
- name: testcred
However, there is an error after I applied this yml file, x509: certificate signed by unknow authority, which is shown below:
Name: test
Namespace: default
Priority: 0
Node: server/10.1.0.11
Start Time: Thu, 07 Jul 2022 15:20:32 +0800
Labels: <none>
Annotations: <none>
Status: Pending
IP: 10.42.2.164
IPs:
IP: 10.42.2.164
Containers:
test:
Container ID:
Image: harbor-server/t_project/test:001
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-47cgb (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-47cgb:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 19s default-scheduler Successfully assigned default/test to server
Normal BackOff 19s kubelet Back-off pulling image "harbor-server/t_project/test:001"
Warning Failed 19s kubelet Error: ImagePullBackOff
Normal Pulling 4s (x2 over 19s) kubelet Pulling image "harbor-server/t_project/test:001"
Warning Failed 4s (x2 over 19s) kubelet Failed to pull image "harbor-server/t_project/test:001": rpc error: code = Unknown desc = failed to pull and unpack image "harbor-server/t_project/test:001": failed to resolve reference "harbor-server/t_project/test:001": failed to do request: Head "https://harbor-server:443/v2/t_project/test/manifests/001?ns=harbor-server": x509: certificate signed by unknown authority
Warning Failed 4s (x2 over 19s) kubelet Error: ErrImagePull
How to solve this x509 error? Is there any step that I have missed?
The CA’s certificate needs to be trusted first.
Put the CA into the host system’s trusted CA's chain. Run the following command.
sudo mkdir -p /usr/local/share/ca-certificates/myregistry
sudo cp registry/myca.pem /usr/local/share/ca-certificates/myregistry/myca.crt
sudo update-ca-certificates
Notice, the cert on the specific directory have to be named with crt extension. restart the K3s service to let the change in effect.

Pod with Debian image is getting created but container is continuously crashing

Below is my Pod manifest:
apiVersion: v1
kind: Pod
metadata:
name: pod-debian-container
spec:
containers:
- name: pi
image: debian
command: ["/bin/echo"]
args: ["Hello, World."]
And below is the output of "describe" command for this Pod:
C:\Users\so.user\Desktop>kubectl describe pod/pod-debian-container
Name: pod-debian-container
Namespace: default
Priority: 0
Node: minikube/192.168.49.2
Start Time: Mon, 15 Feb 2021 21:47:43 +0530
Labels: <none>
Annotations: <none>
Status: Running
IP: 10.244.0.21
IPs:
IP: 10.244.0.21
Containers:
pi:
Container ID: cri-o://f9081af183308f01bf1de6108b2c988e6bcd11ab2daedf983e99e1f4d862981c
Image: debian
Image ID: docker.io/library/debian#sha256:102ab2db1ad671545c0ace25463c4e3c45f9b15e319d3a00a1b2b085293c27fb
Port: <none>
Host Port: <none>
Command:
/bin/echo
Args:
Hello, World.
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 15 Feb 2021 21:56:49 +0530
Finished: Mon, 15 Feb 2021 21:56:49 +0530
Ready: False
Restart Count: 6
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-sxlc9 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-sxlc9:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-sxlc9
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 15m default-scheduler Successfully assigned default/pod-debian-container to minikube
Normal Pulled 15m kubelet Successfully pulled image "debian" in 11.1633901s
Normal Pulled 15m kubelet Successfully pulled image "debian" in 11.4271866s
Normal Pulled 14m kubelet Successfully pulled image "debian" in 11.0252907s
Normal Pulled 14m kubelet Successfully pulled image "debian" in 11.1897469s
Normal Started 14m (x4 over 15m) kubelet Started container pi
Normal Pulling 13m (x5 over 15m) kubelet Pulling image "debian"
Normal Created 13m (x5 over 15m) kubelet Created container pi
Normal Pulled 13m kubelet Successfully pulled image "debian" in 9.1170801s
Warning BackOff 5m25s (x31 over 15m) kubelet Back-off restarting failed container
Warning Failed 10s kubelet Error: ErrImagePull
And below is another output:
C:\Users\so.user\Desktop>kubectl get pod,job,deploy,rs
NAME READY STATUS RESTARTS AGE
pod/pod-debian-container 0/1 CrashLoopBackOff 6 15m
Below are my question:
I can see that Pod is running but Container inside it is crashing. I can't understand "why" because I see that Debian image is successfully pulled
As you can see in "kubectl get pod,job,deploy,rs" output, RESTARTS is equal to 6, is it the Pod which has restarted 6 times or is it the container?
Why 6 restart happened, I didn't mention anything in my spec
This looks like a liveness problem related to the CrashLoopBackOff have you cosidered taking a look into this blog it explains very well how to debug the problem blog

How to use private registry provider, Service Account - from Kubernertes deployments

Update I suspect this to be a google issue, I have created a new more clean question here.
Update: yes this is different than the suggested "This question may already have an answer here:", as this is about a "Service Account" - not a "User accounts".
Do you now how to use a private registry like Google Container Registry from DigitalOcean or any other Kubernetes not running on the same provider?
I tried following this, but unfortunately it did not work for me.
Update: I suspect it to be a Google SA issue, I will go and try using Docker Hub and get back if that succeeds. I am still curious to see the solution for this, so please let me know - thanks!
Update: Also tried this
Update: tried to activate Google Service Account
Update: tried to download Google Service Account key
Update: in the linked description is says:
kubectl create secret docker-registry $SECRETNAME \
--docker-server=https://gcr.io \
--docker-username=_json_key \
--docker-email=user#example.com \
--docker-password="$(cat k8s-gcr-auth-ro.json)"
Is the --docker-password="$(cat k8s-gcr-auth-ro.json)" really the password?
If I do cat k8s-gcr-auth-ro.json the format is:
{
"type": "service_account",
"project_id": "<xxx>",
"private_key_id": "<xxx>",
"private_key": "-----BEGIN PRIVATE KEY-----\<xxx>\n-----END PRIVATE KEY-----\n",
"client_email": "k8s-gcr-auth-ro#<xxx>.iam.gserviceaccount.com",
"client_id": "<xxx>",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://oauth2.googleapis.com/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/k8s-gcr-auth-ro%<xxx>.iam.gserviceaccount.com"
}
kubectl get pods
I get: ...is waiting to start: image can't be pulled
from a deployment with:
image: gcr.io/<project name>/<image name>:v1
deployment.yaml
# K8s - Deployment
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: <image-name>-deployment-v1
spec:
replicas: 1
template:
metadata:
labels:
app: <image-name>-deployment
version: v1
spec:
containers:
- name: <image-name>
image: gcr.io/<project-name>/<image-name>:v1
imagePullPolicy: Always
ports:
- containerPort: 80
imagePullSecrets:
- name: <name-of-secret>
I can see from the following that it logs: repository does not exist or may require 'docker login'
kubectl describe pod :
k describe pod <image-name>-deployment-v1-844568c768-5b2rt
Name: <image-name>-deployment-v1-844568c768-5b2rt
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: my-cluster-digitalocean-1-7781/10.135.153.236
Start Time: Mon, 25 Mar 2019 15:51:37 +0100
Labels: app=<image-name>-deployment
pod-template-hash=844568c768
version=v1
Annotations: <none>
Status: Pending
IP: <ip address>
Controlled By: ReplicaSet/<image-name>-deployment-v1-844568c768
Containers:
chat-server:
Container ID:
Image: gcr.io/<project-name/<image-name>:v1
Image ID:
Port: 80/TCP
Host Port: 0/TCP
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-dh8dh (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-dh8dh:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-dh8dh
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 50s default-scheduler Successfully assigned default/<image-name>-deployment-v1-844568c768-5b2rt to my-cluster-digitalocean-1-7781
Normal Pulling 37s (x2 over 48s) kubelet, my-cluster-digitalocean-1-7781 pulling image "gcr.io/<project-name><image-name>:v1"
Warning Failed 37s (x2 over 48s) kubelet, my-cluster-digitalocean-1-7781 Failed to pull image "gcr.io/<project-name>/<image-name>:v1": rpc error: code = Unknown desc = Error response from daemon: pull access denied for gcr.io/<project-name>/<image-name>, repository does not exist or may require 'docker login'
Warning Failed 37s (x2 over 48s) kubelet, my-cluster-digitalocean-1-7781 Error: ErrImagePull
Normal SandboxChanged 31s (x7 over 47s) kubelet, my-cluster-digitalocean-1-7781 Pod sandbox changed, it will be killed and re-created.
Normal BackOff 29s (x6 over 45s) kubelet, my-cluster-digitalocean-1-7781 Back-off pulling image "gcr.io/<project-name>/<image-name>:v1"
Warning Failed 29s (x6 over 45s) kubelet, my-cluster-digitalocean-1-7781 Error: ImagePullBackOff
Just a note: docker pull on local machine pulls the image alright

Kafka Pod doesn't start on GKE

I followed this tutorial and when I tried to run it on GKE I was not able to start kafka pod.
It returns CrashLoopBackOff all the time. And I don't know how to show pod error logs.
Here is the result when I hit kubectl describe pod my-pod-xxx:
Name: kafka-broker1-54cb95fb44-hlj5b
Namespace: default
Node: gke-xxx-default-pool-f9e313ed-zgcx/10.146.0.4
Start Time: Thu, 25 Oct 2018 11:40:21 +0900
Labels: app=kafka
id=1
pod-template-hash=1076519600
Annotations: kubernetes.io/limit-ranger=LimitRanger plugin set: cpu request for container kafka
Status: Running
IP: 10.48.8.10
Controlled By: ReplicaSet/kafka-broker1-54cb95fb44
Containers:
kafka:
Container ID: docker://88ee6a1df4157732fc32b7bd8a81e329dbdxxxx9cbe614689e775d183dbcd61
Image: wurstmeister/kafka
Image ID: docker-pullable://wurstmeister/kafka#sha256:4f600a95fa1288f7b1xxxxxa32ca00b4fb13b83b31533fa6b40499bd9bdf192f
Port: 9092/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Thu, 25 Oct 2018 14:35:32 +0900
Finished: Thu, 25 Oct 2018 14:35:51 +0900
Ready: False
Restart Count: 37
Requests:
cpu: 100m
Environment:
KAFKA_ADVERTISED_PORT: 9092
KAFKA_ADVERTISED_HOST_NAME: 35.194.100.32
KAFKA_ZOOKEEPER_CONNECT: zoo1:2181
KAFKA_BROKER_ID: 1
KAFKA_CREATE_TOPICS: topic1:3:3
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-w6s7n (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
default-token-w6s7n:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-w6s7n
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning BackOff 5m (x716 over 2h) kubelet, gke-xxx-default-pool-f9e313ed-zgcx Back-off restarting failed container
Normal Pulling 36s (x38 over 2h) kubelet, gke-xxxdefault-pool-f9e313ed-zgcx pulling image "wurstmeister/kafka"
I noticed that on the first run it is going well but after that,Node is changing status to NotReady and kafka pod is entering the CrashLoopBackOff
state.
Here is the log before it goes down:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 5m default-scheduler Successfully assigned kafka-broker1-54cb95fb44-wwf2h to gke-xxx-default-pool-f9e313ed-8mr6
Normal SuccessfulMountVolume 5m kubelet, gke-xxx-default-pool-f9e313ed-8mr6 MountVolume.SetUp succeeded for volume "default-token-w6s7n"
Normal Pulling 5m kubelet, gke-xxx-default-pool-f9e313ed-8mr6 pulling image "wurstmeister/kafka"
Normal Pulled 5m kubelet, gke-xxx-default-pool-f9e313ed-8mr6 Successfully pulled image "wurstmeister/kafka"
Normal Created 5m kubelet, gke-xxx-default-pool-f9e313ed-8mr6 Created container
Normal Started 5m kubelet, gke-xxx-default-pool-f9e313ed-8mr6 Started container
Normal NodeControllerEviction 38s node-controller Marking for deletion Pod kafka-broker1-54cb95fb44-wwf2h from Node gke-dev-centurion-default-pool-f9e313ed-8mr6
Could anyone tell me what's wrong with my pod and how can I catch the error for pod failure?
I just figured out that my cluster's nodes have not enough resources.
After creating a new cluster with more memory, it works.

Unable to pull from private docker hub registry on kubernetes

I'm running a k8 cluster on google container engine. I'm having trouble getting it to pull images from a private docker repo.
I get the following when trying to boot:
Name: ds-expected-date
Namespace: default
Node: gke-ds-cluster-1-default-pool-8980b100-l64j/10.132.0.3
Start Time: Wed, 24 May 2017 13:24:11 +0100
Labels: <none>
Annotations: kubernetes.io/limit-ranger=LimitRanger plugin set: cpu request for container ds-expected-date-flask
Status: Pending
IP: 10.40.0.23
Controllers: <none>
Containers:
ds-expected-date-flask:
Container ID:
Image: fluidy/ds-expected-date:latest
Image ID:
Port:
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Requests:
cpu: 100m
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-h340m (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
default-token-h340m:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-h340m
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
21s 21s 1 default-scheduler Normal Scheduled Successfully assigned ds-expected-date to gke-ds-cluster-1-default-pool-8980b100-l64j
18s 18s 1 kubelet, gke-ds-cluster-1-default-pool-8980b100-l64j spec.containers{ds-expected-date-flask} Normal BackOff Back-off pulling image "fluidy/ds-expected-date:latest"
18s 18s 1 kubelet, gke-ds-cluster-1-default-pool-8980b100-l64j Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "ds-expected-date-flask" with ImagePullBackOff: "Back-off pulling image \"fluidy/ds-expected-date:latest\""
20s 6s 2 kubelet, gke-ds-cluster-1-default-pool-8980b100-l64j spec.containers{ds-expected-date-flask} Normal Pulling pulling image "fluidy/ds-expected-date:latest"
19s 5s 2 kubelet, gke-ds-cluster-1-default-pool-8980b100-l64j spec.containers{ds-expected-date-flask} Warning Failed Failed to pull image "fluidy/ds-expected-date:latest": Error response from daemon: unauthorized: authentication required
19s 5s 2 kubelet, gke-ds-cluster-1-default-pool-8980b100-l64j Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "ds-expected-date-flask" with ErrImagePull: "Error response from daemon: unauthorized: authentication required"
I have followed all the instructions on the docs page. I'm confident my registry secret is being read - if I put duff credentials in it, the error changes to 'invalid user name or password'.
You have not configured your cluster to pull private images from Docker Hub with your credentials.
Read and apply this guide: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
Google Container Engine can automatically pull from Google Container Registry (http://gcr.io), consider using that without pulling images from a private registry.