Unable to pull Artifact Registry private images in newly created GKE cluster - kubernetes

I cannot pull Artifact Registry images in a newly created GKE cluster provisioned with Terraform and a user-defined service account.
The Terraform used to stand up the cluster is below.
locals {
  service         = "example"
  resource_prefix = format("%s-%s", local.service, var.env)
  location        = format("%s-b", var.gcp_region)
}

resource "google_service_account" "main" {
  account_id   = format("%s-sa", local.resource_prefix)
  display_name = format("%s-sa", local.resource_prefix)
  project      = var.gcp_project
}

resource "google_container_cluster" "main" {
  name                     = local.resource_prefix
  description              = format("Cluster primarily servicing the service %s", local.service)
  location                 = local.location
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "main" {
  name       = format("%s-node-pool", local.resource_prefix)
  location   = local.location
  cluster    = google_container_cluster.main.name
  node_count = var.gke_cluster_node_count

  node_config {
    preemptible  = true
    machine_type = var.gke_node_machine_type

    # Google recommends custom service accounts that have cloud-platform scope and permissions granted via IAM Roles.
    service_account = google_service_account.main.email
    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
      "https://www.googleapis.com/auth/cloud-platform",
      "https://www.googleapis.com/auth/devstorage.read_only",
      "https://www.googleapis.com/auth/servicecontrol",
      "https://www.googleapis.com/auth/service.management.readonly",
      "https://www.googleapis.com/auth/trace.append"
    ]
  }

  autoscaling {
    min_node_count = var.gke_cluster_autoscaling_min_node_count
    max_node_count = var.gke_cluster_autoscaling_max_node_count
  }
}
I run a Helm deployment to deploy an application and get the following error.
default php-5996c7fbfd-d6xf5 0/1 ImagePullBackOff 0 37m
Normal Pulling 36m (x4 over 37m) kubelet Pulling image "europe-docker.pkg.dev/example-999999/eu.gcr.io/example-php-fpm:latest"
Warning Failed 36m (x4 over 37m) kubelet Failed to pull image "europe-docker.pkg.dev/example-999999/eu.gcr.io/example-php-fpm:latest": rpc error: code = Unknown desc = failed to pull and unpack image "europe-docker.pkg.dev/example-999999/eu.gcr.io/example-php-fpm:latest": failed to resolve reference "europe-docker.pkg.dev/example-999999/eu.gcr.io/example-php-fpm:latest": failed to authorize: failed to fetch oauth token: unexpected status: 403 Forbidden
Warning Failed 36m (x4 over 37m) kubelet Error: ErrImagePull
Warning Failed 35m (x6 over 37m) kubelet Error: ImagePullBackOff
It seems to me that I've missed something to do with the service account. Using cloud SSH I am able to generate an OAuth token, but that also does not work with crictl.
UPDATE: issue resolved
I have been able to resolve my problem with the following additional Terraform code.
resource "google_project_iam_member" "artifact_role" {
  role    = "roles/artifactregistry.reader"
  member  = "serviceAccount:${google_service_account.main.email}"
  project = var.gcp_project
}

Turning my comment into an answer as it resolved @David's issue.
Because the user-defined service account is being used for the node_pool, the appropriate roles need to be bound to this service account.
In this case: roles/artifactregistry.reader
Configuring Artifact Registry permissions
Best practice is to grant the minimum required roles.
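For reference, the same grant can be applied with gcloud; a minimal sketch where PROJECT_ID and NODE_SA_EMAIL are placeholders for your project and node pool service account:
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:NODE_SA_EMAIL" \
  --role="roles/artifactregistry.reader"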

As the error says: unexpected status: 403 Forbidden
You might be having an issue with the deployment's image pull secret.
For GKE you can use a service account JSON key.
Ref doc: https://cloud.google.com/container-registry/docs/advanced-authentication#json-key
With Terraform you can create the secret in GKE, which you can then use in your deployment:
resource "kubernetes_secret" "gcr" {
type = "kubernetes.io/dockerconfigjson"
metadata {
name = "gcr-image-pull"
namespace = "default"
}
data = {
".dockerconfigjson" = jsonencode({
auths = {
"gcr.io" = {
username = "_json_key"
password = base64decode(google_service_account_key.myaccount.private_key)
email = google_service_account.main.email
auth = base64encode("_json_key:${ base64decode(google_service_account_key.myaccount.private_key) }")
}
}
})
}}
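Note that the snippet above references a google_service_account_key.myaccount resource that is not shown here. If you prefer to create the key outside Terraform, a rough gcloud sketch (the service account email is a placeholder):
gcloud iam service-accounts keys create google-service-account-key.json \
  --iam-account=my-sa@my-project.iam.gserviceaccount.com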
Or use kubectl to create the secret:
kubectl create secret docker-registry gcr \
--docker-server=gcr.io \
--docker-username=_json_key \
--docker-password="$(cat google-service-account-key.json)" \
--docker-email=<Email address>
Now, in your Pod or Deployment, reference the secret in the YAML config like:
apiVersion: v1
kind: Pod
metadata:
  name: uses-private-registry
spec:
  containers:
  - name: hello-app
    image: <image URI>
  imagePullSecrets:
  - name: secret-that-you-created
Update:
As per Guillaume's suggestion, for GKE/GCP you can follow the *workload identity* option as a best practice; with other external repos it might not work.
Create the IAM service account in GCP:
gcloud iam service-accounts create gke-workload-identity \
    --project=<project-id>
Create a service account in the K8s cluster:
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    iam.gke.io/gcp-service-account: gke-workload-identity@PROJECT-ID.iam.gserviceaccount.com
  name: gke-sa-workload
  namespace: default
For the policy binding, run the below gcloud command:
gcloud iam service-accounts add-iam-policy-binding gke-workload-identity@PROJECT_ID.iam.gserviceaccount.com \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:PROJECT_ID.svc.id.goog[default/K8s_SANAME]"
Now you can create the Deployment or Pod with an image from GCR/Artifact Registry; just set the serviceAccountName:
spec:
  containers:
  - name: container
    image: IMAGE
  serviceAccountName: gke-sa-workload
Read more at : https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/

Related

Unable to fetch Vault Token for Pod Service Account

I am using Vault CSI Driver on Charmed Kubernetes v1.19 where I'm trying to retrieve secrets from Vault for a pod running in a separate namespace (webapp) with its own service account (webapp-sa) following the steps in the blog.
As far as I have been able to understand so far, the Pod is trying to authenticate to the Kubernetes API, so that later it can generate a Vault token to access the secret from Vault.
$ kubectl get po webapp
NAME READY STATUS RESTARTS AGE
webapp 0/1 ContainerCreating 0 22m
It appears to me there's some issue authenticating with the Kubernetes API.
The pod remains stuck in the Container Creating state with the message - failed to create a service account token for requesting pod
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 35m default-scheduler Successfully assigned webapp/webapp to host-03
Warning FailedMount 4m38s (x23 over 35m) kubelet MountVolume.SetUp failed for volume "secrets-store-inline" : rpc error: code = Unknown desc = failed to mount secrets store objects for pod webapp/webapp, err: rpc error: code = Unknown desc = error making mount request: **failed to create a service account token for requesting pod** {webapp xxxx webapp webapp-sa}: the server could not find the requested resource
I can get the Vault token using the CLI in the pod namespace:
$ vault write auth/kubernetes/login role=database jwt=$SA_JWT_TOKEN
Key Value
--- -----
token <snipped>
I do get the vault token using the API as well:
$ curl --request POST --data @payload.json https://127.0.0.1:8200/v1/auth/kubernetes/login
{
"request_id":"1234",
<snipped>
"auth":{
"client_token":"XyZ",
"accessor":"abc",
"policies":[
"default",
"webapp-policy"
],
"token_policies":[
"default",
"webapp-policy"
],
"metadata":{
"role":"database",
"service_account_name":"webapp-sa",
"service_account_namespace":"webapp",
"service_account_secret_name":"webapp-sa-token-abcd",
"service_account_uid":"123456"
},
<snipped>
}
}
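For reference, the payload.json used in the curl call above would look roughly like this (a sketch, based on the role and JWT from the vault write example):
cat > payload.json <<EOF
{"role": "database", "jwt": "$SA_JWT_TOKEN"}
EOF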
Reference: https://www.vaultproject.io/docs/auth/kubernetes
As per the vault documentation, I've configured Vault with the Token Reviewer SA as follows:
$ cat vault-auth-service-account.yaml
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: role-token-review-binding
  namespace: vault
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: vault-auth
  namespace: vault
Vault is configured with JWT from the Token Reviewer SA as follows:
$ vault write auth/kubernetes/config \
    token_reviewer_jwt="<TOKEN Reviewer service account JWT>" \
    kubernetes_host="https://$KUBERNETES_PORT_443_TCP_ADDR:443" \
    kubernetes_ca_cert=@ca.crt
I have defined a Vault Role to allow the webapp-sa access to the secret:
$ vault write auth/kubernetes/role/database \
bound_service_account_names=webapp-sa \
bound_service_account_namespaces=webapp \
policies=webapp-policy \
ttl=72h
Success! Data written to: auth/kubernetes/role/database
The webapp-sa is allowed access to the secret as per the Vault Policy defined as follows:
$ vault policy write webapp-policy - <<EOF
> path "secret/data/db-pass" {
> capabilities = ["read"]
> }
> EOF
Success! Uploaded policy: webapp-policy
The Pod and its SA are defined as follows:
$ cat webapp-sa-and-pod.yaml
kind: ServiceAccount
apiVersion: v1
metadata:
  name: webapp-sa
---
kind: Pod
apiVersion: v1
metadata:
  name: webapp
spec:
  serviceAccountName: webapp-sa
  containers:
  - image: registry/jweissig/app:0.0.1
    name: webapp
    volumeMounts:
    - name: secrets-store-inline
      mountPath: "/mnt/secrets-store"
      readOnly: true
  volumes:
  - name: secrets-store-inline
    csi:
      driver: secrets-store.csi.k8s.io
      readOnly: true
      volumeAttributes:
        providerName: vault
        secretProviderClass: "vault-database"
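The manifest references a SecretProviderClass named vault-database that is not shown here; as a quick sanity check (a sketch, assuming the webapp namespace from above), confirm it exists where the pod runs:
kubectl get secretproviderclass vault-database -n webapp -o yaml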
Does anyone have any clue as to why the Pod won't authenticate with the Kubernetes API?
Do I have to enable flags on the kube-apiserver for the Token Review API to work?
Is it enabled by default on Charmed Kubernetes v1.19?
Would be grateful for any help.
Regards,
Sana

How to pull ACR image from k3s pods

I have customized the coredns image and pushed it to my Azure container registry (ACR). Now, in the default coredns pod that comes with the k3s installation, I want to use my my_azure_acr_repo/proj/customize-coredns:latest image instead of rancher/coredns-coredns:1.8.3. So I edited the coredns deployment (kubectl edit deploy coredns -n kube-system) and replaced the rancher image with my ACR one. But now the coredns pod is not able to pull my ACR image and gives this error in the pod description:
Failed to pull image "my_azure_acr_repo/proj/customize-coredns:latest": rpc error:
code = Unknown desc = failed to pull and unpack image "my_azure_acr_repo/proj/customize-coredns:latest":
failed to resolve reference "my_azure_acr_repo/proj/customize-coredns:latest": failed to
authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized
How can I authenticate to ACR so that the pod can pull the image?
That's because your container is not authorized to pull images from your private ACR.
First you have to create a secret so that you can access your ACR, then pass that secret to your deployment using imagePullSecrets.
You can create the secret with this command; make sure to replace your credential variables:
kubectl create secret docker-registry <name> --docker-server=DOCKER_REGISTRY_SERVER --docker-username=DOCKER_USER --docker-password=DOCKER_PASSWORD --docker-email=DOCKER_EMAIL
For ACR it will be something like this
kubectl create secret docker-registry regkey --docker-server=https://myregistry.azurecr.io --docker-username=ACR_USERNAME --docker-password=ACR_PASSWORD --docker-email=ANY_EMAIL_ADDRESS
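If you don't have the registry credentials handy and the ACR admin user is enabled, they can be retrieved with the Azure CLI; a sketch, assuming the registry is named myregistry:
az acr credential show --name myregistry \
  --query "{username:username, password:passwords[0].value}"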
Your deployment spec:
spec:
  containers:
  - name: foo
    image: janedoe/awesomeapp:v1
  imagePullSecrets:
  - name: regkey
More info related to this.
https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod

kubernetes external secrets on GKE, Permission error

I installed kubernetes-external-secrets with Helm on GKE.
GKE: 1.16.15-gke.6000 on asia-northeast1
helm app version 6.2.0
using Workload Identity as the documentation describes
For workload identity, the service account I bind with the command below (my-secrets-sa@$PROJECT.iam.gserviceaccount.com) has the SecretManager.admin role, which seems necessary for using Google Secret Manager.
gcloud iam service-accounts add-iam-policy-binding --role roles/iam.workloadIdentityUser --member "serviceAccount:$CLUSTER_PROJECT.svc.id.goog[$SECRETS_NAMESPACE/kubernetes-external-secrets]" my-secrets-sa@$PROJECT.iam.gserviceaccount.com
Workload identity looks to be set correctly, because checking the service account from a pod on GKE shows the correct service account.
https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#enable_workload_identity_on_a_new_cluster
I create a pod in the cluster and check auth inside it; it shows my-secrets-sa@$PROJECT.iam.gserviceaccount.com:
$ kubectl run -it --image google/cloud-sdk:slim --serviceaccount ksa-name --namespace k8s-namespace workload-identity-test
$ gcloud auth list
But even after creating the ExternalSecret, it shows an error:
ERROR, 7 PERMISSION_DENIED: Permission 'secretmanager.versions.access' denied for resource 'projects/project-id/secrets/my-gsm-secret-name/versions/latest' (or it may not exist).
The secret my-gsm-secret-name itself exists in Secret Manager, so it should not "not exist".
Also, the permission should be correctly set via workload identity.
This is the ExternalSecret I defined:
apiVersion: kubernetes-client.io/v1
kind: ExternalSecret
metadata:
  name: gcp-secrets-manager-example # name of the k8s external secret and the k8s secret
spec:
  backendType: gcpSecretsManager
  projectId: my-gsm-secret-project
  data:
  - key: my-gsm-secret-name # name of the GCP secret
    name: my-kubernetes-secret-name # key name in the k8s secret
    version: latest # version of the GCP secret
    property: value # name of the field in the GCP secret
Has anyone had similar problem before ?
Thank you
Full commands:
Create a cluster with a workload pool:
$ gcloud container clusters create cluster --region asia-northeast1 --node-locations asia-northeast1-a --num-nodes 1 --preemptible --workload-pool=my-project.svc.id.goog
Create a Kubernetes service account:
$ kubectl create serviceaccount --namespace default ksa
Bind the Kubernetes service account and the Google service account:
$ gcloud iam service-accounts add-iam-policy-binding \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:my-project.svc.id.goog[default/ksa]" \
    my-secrets-sa@my-project.iam.gserviceaccount.com
Add the annotation:
$ kubectl annotate serviceaccount \
    --namespace default \
    ksa \
    iam.gke.io/gcp-service-account=my-secrets-sa@my-project.iam.gserviceaccount.com
Install with Helm:
$ helm install my-release external-secrets/kubernetes-external-secrets
Create the ExternalSecret:
apiVersion: kubernetes-client.io/v1
kind: ExternalSecret
metadata:
  name: gcp-secrets-manager-example # name of the k8s external secret and the k8s secret
spec:
  backendType: gcpSecretsManager
  projectId: my-gsm-secret-project
  data:
  - key: my-gsm-secret-name # name of the GCP secret
    name: my-kubernetes-secret-name # key name in the k8s secret
    version: latest # version of the GCP secret
    property: value # name of the field in the GCP secret
$ kubectl apply -f external-secret.yaml
I noticed that I had used a different Kubernetes service account.
When installing the Helm chart, a new Kubernetes service account my-release-kubernetes-external-secrets was created, and the service/pods run under this service account.
So I should bind my-release-kubernetes-external-secrets to the Google service account.
Now it works well.
Thank you @matt_j @norbjd
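For completeness, a sketch of the corrected binding and annotation, reusing the names from the commands above but targeting the Helm-created service account:
$ gcloud iam service-accounts add-iam-policy-binding \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:my-project.svc.id.goog[default/my-release-kubernetes-external-secrets]" \
    my-secrets-sa@my-project.iam.gserviceaccount.com
$ kubectl annotate serviceaccount \
    --namespace default \
    my-release-kubernetes-external-secrets \
    iam.gke.io/gcp-service-account=my-secrets-sa@my-project.iam.gserviceaccount.com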

GKE: Service account for Config Connector lacks permissions

I'm attempting to get Config Connector up and running on my GKE project and am following this getting started guide.
So far I have enabled the appropriate APIs:
> gcloud services enable cloudresourcemanager.googleapis.com
Created my service account and added policy binding:
> gcloud iam service-accounts create cnrm-system
> gcloud iam service-accounts add-iam-policy-binding cnrm-system@test-connector.iam.gserviceaccount.com --member="serviceAccount:test-connector.svc.id.goog[cnrm-system/cnrm-controller-manager]" --role="roles/iam.workloadIdentityUser"
> kubectl wait -n cnrm-system --for=condition=Ready pod --all
Annotated my namespace:
> kubectl annotate namespace default cnrm.cloud.google.com/project-id=test-connector
And then run through trying to apply the Spanner yaml in the example:
~ >>> kubectl describe spannerinstance spannerinstance-sample
Name: spannerinstance-sample
Namespace: default
Labels: label-one=value-one
Annotations: cnrm.cloud.google.com/management-conflict-prevention-policy: resource
cnrm.cloud.google.com/project-id: test-connector
API Version: spanner.cnrm.cloud.google.com/v1beta1
Kind: SpannerInstance
Metadata:
Creation Timestamp: 2020-09-18T18:44:41Z
Generation: 2
Resource Version: 5805305
Self Link: /apis/spanner.cnrm.cloud.google.com/v1beta1/namespaces/default/spannerinstances/spannerinstance-sample
UID:
Spec:
Config: northamerica-northeast1-a
Display Name: Spanner Instance Sample
Num Nodes: 1
Status:
Conditions:
Last Transition Time: 2020-09-18T18:44:41Z
Message: Update call failed: error fetching live state: error reading underlying resource: Error when reading or editing SpannerInstance "test-connector/spannerinstance-sample": googleapi: Error 403: Request had insufficient authentication scopes.
Reason: UpdateFailed
Status: False
Type: Ready
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning UpdateFailed 6m41s spannerinstance-controller Update call failed: error fetching live state: error reading underlying resource: Error when reading or editing SpannerInstance "test-connector/spannerinstance-sample": googleapi: Error 403: Request had insufficient authentication scopes.
I'm not really sure what's going on here, because my cnrm service account has ownership of the project my cluster is in, and I have the APIs listed in the guide enabled.
The CC pods themselves appear to be healthy:
~ >>> kubectl wait -n cnrm-system --for=condition=Ready pod --all
pod/cnrm-controller-manager-0 condition met
pod/cnrm-deletiondefender-0 condition met
pod/cnrm-resource-stats-recorder-58cb6c9fc-lf9nt condition met
pod/cnrm-webhook-manager-7658bbb9-kxp4g condition met
Any insight in to this would be greatly appreciated!
From the error message you have posted, I suspect it might be an issue with your GKE scopes.
For GKE to access other GCP APIs, you must allow this access when creating the cluster. You can check the enabled scopes with the command
gcloud container clusters describe <cluster-name> and look for oauthScopes in the result.
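For example, a sketch (cluster name and location are placeholders):
gcloud container clusters describe <cluster-name> --zone <zone> \
  --format="value(nodeConfig.oauthScopes)"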
Here you can see the scope name for Cloud Spanner; you must enable the scope https://www.googleapis.com/auth/cloud-platform as the minimum permission.
To verify in the GUI, you can see the permissions under: Kubernetes Engine > <Cluster-name> > expand the Permissions section and look for Cloud Platform.

Pulling Images from GCR into GKE

Today is my first day playing with GCR and GKE. So apologies if my question sounds childish.
So I have created a new registry in GCR. It is private. Using this documentation, I got hold of my Access Token using the command
gcloud auth print-access-token
#<MY-ACCESS_TOKEN>
I know that my username is oauth2accesstoken
On my local laptop when I try
docker login https://eu.gcr.io/v2
Username: oauth2accesstoken
Password: <MY-ACCESS_TOKEN>
I get:
Login Successful
So now it's time to create a docker-registry secret in Kubernetes.
I ran the below command:
kubectl create secret docker-registry eu-gcr-io-registry --docker-server='https://eu.gcr.io/v2' --docker-username='oauth2accesstoken' --docker-password='<MY-ACCESS_TOKEN>' --docker-email='<MY_EMAIL>'
And then my Pod definition looks like:
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: my-app
    image: eu.gcr.io/<my-gcp-project>/<repo>/<my-app>:latest
    ports:
    - containerPort: 8090
  imagePullSecrets:
  - name: eu-gcr-io-registry
But when I spin up the pod, I get the ERROR:
Warning Failed 4m (x4 over 6m) kubelet, node-3 Failed to pull image "eu.gcr.io/<my-gcp-project>/<repo>/<my-app>:latest": rpc error: code = Unknown desc = Error response from daemon: unauthorized: You don't have the needed permissions to perform this operation, and you may have invalid credentials. To authenticate your request, follow the steps in: https://cloud.google.com/container-registry/docs/advanced-authentication
I verified my secrets checking the YAML file and doing a base64 --decode on the .dockerconfigjson and it is correct.
So what have I missed here ?
If your GKE cluster & GCR registry are in the same project: you don't need to configure authentication. GKE clusters are authorized to pull from private GCR registries in the same project with no config. (Very likely this is you!)
If your GKE cluster & GCR registry are in different GCP projects: follow these instructions to give the "service account" of your GKE cluster access to read private images in your GCR registry: https://cloud.google.com/container-registry/docs/access-control#granting_users_and_other_projects_access_to_a_registry
In a nutshell, this can be done by:
gsutil iam ch serviceAccount:[PROJECT_NUMBER]-compute@developer.gserviceaccount.com:objectViewer gs://[BUCKET_NAME]
where [BUCKET_NAME] is the GCS bucket storing your GCR images (like artifacts.[PROJECT-ID].appspot.com) and [PROJECT_NUMBER] is the numeric GCP project ID hosting your GKE cluster.
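Note that for images hosted on eu.gcr.io (as in this question) the backing bucket is the regional one, e.g. eu.artifacts.[PROJECT-ID].appspot.com. A sketch to verify the resulting bucket policy afterwards:
gsutil iam get gs://eu.artifacts.[PROJECT-ID].appspot.com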