GKE Workload Identity PermissionDenied - kubernetes

I am trying to use Google's preferred "Workload Identity" method to enable my GKE app to securely access secrets from Google Secret Manager.
I've completed the setup and even checked all steps in the Troubleshooting section (https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity?hl=sr-ba#troubleshooting) but I'm still getting the following error in my logs:
Unhandled exception. Grpc.Core.RpcException:
Status(StatusCode=PermissionDenied, Detail="Permission
'secretmanager.secrets.list' denied for resource
'projects/my-project' (or it may not exist).")
I figured the problem was due to the node pool not using the correct service account, so I recreated it, this time specifying the correct service account.
The service account has the following roles added:
Cloud Build Service Account
Kubernetes Engine Developer
Container Registry Service Agent
Secret Manager Secret Accessor
Secret Manager Viewer
The relevant source code for the package I am using to authenticate is as follows:
var data = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
var request = new ListSecretsRequest
{
    ParentAsProjectName = ProjectName.FromProject(projectName),
};
var secrets = secretManagerServiceClient.ListSecrets(request);
foreach (var secret in secrets)
{
    var value = secretManagerServiceClient.AccessSecretVersion($"{secret.Name}/versions/latest");
    string secretVal = this.manager.Load(value.Payload);
    string configKey = this.manager.GetKey(secret.SecretName);
    data.Add(configKey, secretVal);
}
Data = data;
Ref. https://github.com/jsukhabut/googledotnet
Am I missing a step in the process?
Any idea why Google is still saying "Permission 'secretmanager.secrets.list' denied for resource 'projects/my-project' (or it may not exist)"?

Like @sethvargo mentioned in the comments, you need to map the service account to your pod, because Workload Identity does not use the underlying node identity; instead, it maps a Kubernetes service account to a GCP service account. Everything happens at the per-pod level with Workload Identity.
Assign a Kubernetes service account to the application and configure it to act as a Google service account.
1. Create a GCP service account with the required permissions.
2. Create a Kubernetes service account.
3. Assign the Kubernetes service account permission to impersonate the GCP service account.
4. Run your workload as the Kubernetes service account.
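The four steps above can be sketched with gcloud and kubectl. All of the names below (GSA_NAME, KSA_NAME, NAMESPACE, PROJECT_ID) are placeholders, so treat this as a rough template rather than exact commands for your setup:

```shell
# 1. Create the GCP service account (then grant it the Secret Manager roles).
gcloud iam service-accounts create GSA_NAME --project=PROJECT_ID

# 2. Create the Kubernetes service account.
kubectl create serviceaccount KSA_NAME --namespace NAMESPACE

# 3. Allow the Kubernetes SA to impersonate the GCP SA...
gcloud iam service-accounts add-iam-policy-binding \
    GSA_NAME@PROJECT_ID.iam.gserviceaccount.com \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME]"

# ...and annotate the Kubernetes SA so GKE knows about the mapping.
kubectl annotate serviceaccount KSA_NAME --namespace NAMESPACE \
    iam.gke.io/gcp-service-account=GSA_NAME@PROJECT_ID.iam.gserviceaccount.com
```

For step 4, set serviceAccountName: KSA_NAME in your pod spec so the workload runs as that Kubernetes service account.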
Also make sure you are using the project ID rather than the project name when referencing the project or the secret.
Note that you cannot change the service account of a pod that has already been created; you have to recreate the pod.
Refer to the Workload Identity documentation linked above for how to run pods as the Kubernetes service account.

Related

Why is my GCP image failing to deploy to local kubernetes?

I am getting "can't be pulled" when I use Cloud Code plugin in VS code to build and deploy an image to a local Kubernetes cluster. There are no errors being logged on GCP, but locally I'm getting the following:
- deployment/<redacted> failed. Error: container <redacted> is waiting to start: gcr.io/<redacted>/<redacted>:latest#sha256:<redacted> can't be pulled.
If your GCR registry is a private registry then you need to configure your local Kubernetes cluster with an imagePullSecret to use to authenticate to GCR. The general process is to create a service account in your GCP project, and then configure the corresponding service account key file as the pull secret.
There are a variety of tutorials, and this one looks pretty good.
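As a sketch of that process, assuming you have already downloaded a service account key file named key.json and deploy into the default namespace (both of these are assumptions):

```shell
# Create a docker-registry pull secret from the GCP service account key.
kubectl create secret docker-registry gcr-pull-secret \
    --docker-server=gcr.io \
    --docker-username=_json_key \
    --docker-password="$(cat key.json)" \
    --docker-email=unused@example.com

# Attach it to the default service account so pods pick it up automatically.
kubectl patch serviceaccount default \
    -p '{"imagePullSecrets": [{"name": "gcr-pull-secret"}]}'
```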
Can you try gcloud auth list and check whether you are using the right account? To switch accounts, use gcloud auth login <account>.
Also make sure the account has the permissions needed to pull images from GCR (pulling from GCR requires read access to the underlying Cloud Storage bucket, e.g. the Storage Object Viewer role).
Once these two things are in place, you should be able to pull the image from GCR.

Create cluster with Shared Network in GKE

I’m trying to create a cluster in GKE project-1 with shared network of project-2.
Roles given to Service account:
project-1: Kubernetes Engine Cluster Admin, Compute Network Admin, Kubernetes Engine Host Service Agent User
project-2: Kubernetes Engine Service Agent, Compute Network User, Kubernetes Engine Host Service Agent User
Service Account is created under project-1.
API & Services are enabled in both Projects.
But I am getting this error persistently.
Error: googleapi: Error 403: Kubernetes Engine Service Agent is missing required permissions on this project. See Troubleshooting | Kubernetes Engine Documentation | Google Cloud for more info: required “container.hostServiceAgent.use” permission(s) for “projects/project-2”., forbidden
data "google_compute_network" "shared_vpc" {
  name    = "network-name-in-project-2"
  project = "project-2"
}

data "google_compute_subnetwork" "shared_subnet" {
  name    = "subnet-name-in-project-2"
  project = "project-2"
  region  = "us-east1"
}

# cluster creation under project 1
# project 1 specified in Provider
resource "google_container_cluster" "mowx_cluster" {
  name               = var.cluster_name
  location           = "us-east1"
  initial_node_count = 1

  master_auth {
    username = ""
    password = ""

    client_certificate_config {
      issue_client_certificate = false
    }
  }

  remove_default_node_pool = true

  cluster_autoscaling {
    enabled = false
  }

  # cluster_ipv4_cidr = var.cluster_pod_cidr
  ip_allocation_policy {
    cluster_secondary_range_name  = "pods"
    services_secondary_range_name = "svc"
  }

  network    = data.google_compute_network.shared_vpc.id
  subnetwork = data.google_compute_subnetwork.shared_subnet.id
}
This is a community wiki answer based on the discussion in the comments and posted for better visibility. Feel free to expand it.
The error you encountered:
Error: googleapi: Error 403: Kubernetes Engine Service Agent is missing required permissions on this project. See Troubleshooting | Kubernetes Engine Documentation | Google Cloud for more info: required “container.hostServiceAgent.use” permission(s) for “projects/project-2”., forbidden
means that the necessary service agent was not created:
roles/container.serviceAgent - Kubernetes Engine Service Agent:
Gives Kubernetes Engine account access to manage cluster resources.
Includes access to service accounts.
The official troubleshooting docs describe a solution for such problems:
To resolve the issue, if you have removed the Kubernetes Engine Service Agent role from your Google Kubernetes Engine service account,
add it back. Otherwise, you must re-enable the Kubernetes Engine API,
which will correctly restore your service accounts and permissions.
You can do this in the gcloud tool or the Cloud Console.
The solution above works as in your use case the account was missing so it had to be (re)created.
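If you only removed the role (rather than the agent itself being deleted), re-adding it can be sketched like this, where PROJECT_ID and PROJECT_NUMBER are placeholders for your own project:

```shell
# Restore the Kubernetes Engine Service Agent role on the GKE service agent.
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member "serviceAccount:service-PROJECT_NUMBER@container-engine-robot.iam.gserviceaccount.com" \
    --role roles/container.serviceAgent

# Alternatively, re-enabling the Kubernetes Engine API (after disabling it)
# restores the service agent and its grants.
gcloud services enable container.googleapis.com --project PROJECT_ID
```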
For me, even though the GKE service account existed and had the roles Kubernetes Engine Host Service Agent and Kubernetes Engine Service Agent in both the service and host projects, I still got the 403 error.
The problem was that the service account also needed roles/compute.networkUser and roles/compute.instanceAdmin granted on the shared VPC's subnetwork.
See: resource google_compute_subnetwork_iam_binding
See also module "shared_vpc_access"
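For reference, the gcloud equivalent of that subnetwork-level binding might look like the following, using the subnet names from the question and a placeholder SERVICE_PROJECT_NUMBER:

```shell
# Grant the GKE service agent networkUser on the host project's subnet.
gcloud compute networks subnets add-iam-policy-binding subnet-name-in-project-2 \
    --project project-2 \
    --region us-east1 \
    --member "serviceAccount:service-SERVICE_PROJECT_NUMBER@container-engine-robot.iam.gserviceaccount.com" \
    --role roles/compute.networkUser
```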

gcloud confusion about set/get IAM policy for a service account

There are two commands I have been confused about for some time:
gcloud iam service-accounts get-iam-policy
gcloud iam service-accounts set-iam-policy
From their --help output, these two commands treat the service account as a resource. Most often I use a service account as an identity: for example, in a project, setting a policy that binds a role to the service account so that it can operate on something in that project.
Can someone please point out the use case for attaching a policy to a service account? How does a service account act as a resource rather than an identity?
As explained in this part of the official documentation, Managing service accounts:
When thinking of a service account as a resource, you can grant roles to other users to access or manage that service account.
So, using it as a resource has the goal of letting you manage who can use and control the service account. To provide some additional details, as in the example there, with the policies attached to them you can configure the level of access that different users have within a service account: some users can have viewer access, while others have editor access.
To summarize, the functionality of attaching policies to a service account lets you set different levels of access and permissions for the users who can access that service account.
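A concrete sketch of both directions, with placeholder names: granting a user a role on the service account (treating it as a resource), then reading the policy attached to it:

```shell
# Let alice act as (impersonate) this service account.
gcloud iam service-accounts add-iam-policy-binding \
    my-sa@my-project.iam.gserviceaccount.com \
    --member "user:alice@example.com" \
    --role roles/iam.serviceAccountUser

# Inspect who holds roles *on* the service account itself.
gcloud iam service-accounts get-iam-policy \
    my-sa@my-project.iam.gserviceaccount.com
```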

Kubernetes service connections in azure devops w/ AAD bound AKS cluster

Will kubernetes service connections in azure devops work with an AKS cluster that is bound to AAD via openidconnect? Logging into such clusters goes through an openidconnect flow that involves a device login + browser. How is this possible w/ azure devops k8s service connections?
Will kubernetes service connections in azure devops work with an AKS
cluster that is bound to AAD via openidconnect?
Unfortunately, no, this is not supported at the moment.
Based on your description, what you want to connect to with the Azure DevOps Kubernetes service connection is Azure Kubernetes Service. This means you would select Azure Subscription under Choose authentication. But this connection method uses Service Principal Authentication (SPA), which is not yet supported for an AKS cluster that is bound to AAD auth.
If you connect to your AKS cluster as part of your CI/CD deployment in Azure DevOps and attempt to get the cluster credentials, you will get a warning response asking you to log in, since the service principal cannot handle it:
WARNING: To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code *** to authenticate.
You may be familiar with this message: it requires you to open a browser and complete the device code authentication manually, which cannot be done in Azure DevOps.
There is a feature request on the Developer Community forum asking to support non-interactive login for AAD-integrated clusters. You can vote and comment there to raise the priority of this suggestion so it can be considered for the product roadmap.
Although this cannot be achieved directly, there are two workarounds you can try.
The first workaround is to change how Azure DevOps authenticates, from the AAD client to the cluster admin credentials.
Use the az aks get-credentials command with the --admin parameter. This bypasses Azure AD auth by retrieving the admin credentials, which work without Azure AD.
However, I do not recommend this method, because it sidesteps the authentication rules set in AAD for security. If you want a quick way to achieve what you want and are not too worried about security, you can try it.
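For completeness, the admin-credentials workaround is a single command (resource group and cluster name are placeholders):

```shell
# Fetch the admin kubeconfig, bypassing AAD device-code login.
az aks get-credentials --resource-group MY_RESOURCE_GROUP --name MY_CLUSTER --admin
```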
The second workaround is to use a Kubernetes service account.
You can follow this doc to create a service account. Then, in Azure DevOps, use this service account to communicate with the AKS API. You also need to take the authorized IP address ranges of the AKS cluster into account.
After the service account is created successfully, choose Service account in the service connection of Azure DevOps:
Server URL: get it from the AKS instance (API server address) in the Azure portal, and do not forget to prepend https:// when entering it in the service connection.
Secret: retrieve it with:
kubectl get secret <secret-name> -o yaml -n service-accounts
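Putting the pieces together, a sketch of creating the service account and dumping its token secret; the names are placeholders, and this assumes a cluster version that still auto-generates token secrets for service accounts:

```shell
kubectl create namespace service-accounts
kubectl create serviceaccount azdo-deployer -n service-accounts

# Look up the token secret generated for the service account and dump it.
SECRET_NAME=$(kubectl get serviceaccount azdo-deployer -n service-accounts \
    -o jsonpath='{.secrets[0].name}')
kubectl get secret "$SECRET_NAME" -n service-accounts -o yaml
```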
See this doc: Deploy Vault on Azure Kubernetes Service (AKS).
Then you can use this service connection in Azure Devops tasks.

permission error: service account don't have access to cloud-ml platform

I am running a Kubeflow pipeline (Docker approach) and the cluster uses an endpoint to navigate to the dashboard. The cluster was created following the instructions in this link: Deploy Kubeflow. Everything was created successfully, the cluster generated the endpoint, and it works perfectly.
Endpoint link would be something like this https://appname.endpoints.projectname.cloud.goog.
Every workload of the pipeline works fine except the last one. In the last workload, I am trying to submit a job to the Cloud ML Engine, but the logs show that the application has no access to the project. Here is the relevant part of the log:
ERROR:
(gcloud.ml-engine.versions.create) PERMISSION_DENIED: Request had
insufficient authentication scopes.
ERROR:
(gcloud.ml-engine.jobs.submit.prediction) User
[clustername@project_name.iam.gserviceaccount.com]
does not have permission to access project [project_name]
(or it may not exist): Request had insufficient authentication scopes.
From the logs, it's clear that this service account doesn't have access to the project itself. However, I tried to grant this service account access to the Cloud ML service, but it still throws the same error.
Are there any other ways to give Cloud ML service credentials to this application?
Check two things:
1) GCP IAM: check that clustername-user@projectname.iam.gserviceaccount.com has the ML Engine Admin role.
2) Your pipeline DSL: check that the cloud-ml engine step calls .apply(gcp.use_gcp_secret('user-gcp-sa')), e.g. https://github.com/kubeflow/pipelines/blob/ea07b33b8e7173a05138d9dbbd7e1ce20c959db3/samples/tfx/taxi-cab-classification-pipeline.py#L67
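For point 1, the grant might look like the following; roles/ml.admin is the ML Engine Admin role, and the project and account names below echo the question's redacted log, so substitute your own:

```shell
# Give the cluster's service account ML Engine Admin on the project.
gcloud projects add-iam-policy-binding project_name \
    --member "serviceAccount:clustername-user@projectname.iam.gserviceaccount.com" \
    --role roles/ml.admin
```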