How does the GKE metadata server work in Workload Identity - kubernetes

I've recently been making use of the GKE Workload Identity feature. I'd be interested to know in more detail how the gke-metadata-server component works.
GCP client code (gcloud or other language SDKs) falls through to the GCE metadata method
Request made to http://metadata.google.internal/path
(guess) Setting GKE_METADATA_SERVER on my node pool configures this to resolve to the gke-metadata-server pod on that node.
(guess) The gke-metadata-server pod, running with --privileged and host networking, has a means of determining the source of the request (the pod IP?), then looks up the pod and its service account to check for the iam.gke.io/gcp-service-account annotation.
(guess) The proxy calls the metadata server with the pod's 'pseudo' identity set (e.g. [PROJECT_ID].svc.id.goog[[K8S_NAMESPACE]/[KSA_NAME]]) to get a token for the GCP service account annotated on its Kubernetes service account.
If this account has token creator / workload identity user rights on the service account, presumably the response from GCP is a success and contains a token, which is then packaged and sent back to the calling pod for authenticated calls to other Google APIs.
I guess the main puzzle for me right now is the verification of the calling pod's identity. Originally I thought this would use the TokenReview API, but now I'm not sure how the Google client tools would know to use the service account token mounted into the pod...
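For reference, the call the client libraries end up making is the standard GCE metadata token request, roughly (path and header per the Compute metadata docs):

    # What the client libraries effectively do when they fall through to the metadata method
    curl -s -H "Metadata-Flavor: Google" \
      "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"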
Edit follow-up questions:
Q1: In between step 2 and 3, is the request to metadata.google.internal routed to the GKE metadata proxy by the setting GKE_METADATA_SERVER on the node pool?
Q2: Why does the metadata server pod need host networking?
Q3: In the video here: https://youtu.be/s4NYEJDFc0M?t=2243 it's taken as a given that the pod makes a GCP call. How does the GKE metadata server identify the pod making the call to start the process?

Before going into details, please familiarize yourself with these components:
OIDC provider: Runs on Google's infrastructure, provides cluster-specific metadata and signs authorized JWTs.
GKE metadata server: Runs as a DaemonSet, meaning one instance on every node; it exposes pod-specific metadata (providing backwards compatibility with old client libraries) and emulates the existing node metadata server.
Google IAM: Issues access tokens, validates bindings, validates OIDC signatures.
Google Cloud: Accepts access tokens, does pretty much anything.
JWT: JSON Web token
mTLS: Mutual Transport Layer Security
The steps below explain how GKE metadata server components work:
Step 1: An authorized user binds the Kubernetes service account to the GCP service account (example commands after step 8).
Step 2: Workload tries to access Google Cloud service using client libraries.
Step 3: The GKE metadata server requests an OIDC-signed JWT from the control plane. That connection is authenticated using mutual TLS (mTLS) with the node credential.
Step 4: The GKE metadata server then uses that OIDC-signed JWT to request an access token for the [identity namespace]/[Kubernetes service account] from IAM. IAM validates that the appropriate bindings exist on the identity namespace and in the OIDC provider.
Step 5: IAM then validates that the JWT was signed by the cluster's correct OIDC provider, and returns an access token for the [identity namespace]/[Kubernetes service account].
Step 6: The metadata server then sends the access token it just received back to IAM. IAM exchanges it for a short-lived GCP service account token after validating the appropriate bindings.
Step 7: Then GKE metadata server returns the GCP service account token to the workload.
Step 8: The workload can then use that token to make calls to any Google Cloud Service.
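For concreteness, step 1 and the service account annotation typically look something like this (the project, namespace and account names are placeholders):

    # Allow the KSA to impersonate the GSA (the Workload Identity binding from step 1)
    gcloud iam service-accounts add-iam-policy-binding \
      --role roles/iam.workloadIdentityUser \
      --member "serviceAccount:PROJECT_ID.svc.id.goog[K8S_NAMESPACE/KSA_NAME]" \
      GSA_NAME@PROJECT_ID.iam.gserviceaccount.com

    # Annotate the Kubernetes service account with the GCP service account it should act as
    kubectl annotate serviceaccount KSA_NAME --namespace K8S_NAMESPACE \
      iam.gke.io/gcp-service-account=GSA_NAME@PROJECT_ID.iam.gserviceaccount.com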
I also found a video regarding Workload Identity which you will find useful.
EDIT Follow-up questions' answers:
Below are answers to your follow-up questions:
Q1: In between step 2 and 3, is the request to metadata.google.internal routed to the gke metadata proxy by the setting GKE_METADATA_SERVER on the node pool?
You are right: GKE_METADATA_SERVER is set on the node pool. This exposes a metadata API to the workloads that is compatible with the V1 Compute Metadata APIs. When a workload tries to access a Google Cloud service, the GKE metadata server performs a lookup (it checks whether a pod exists in its list whose IP matches the incoming IP of the request) before it goes on to request the OIDC token from the control plane.
Keep in mind that the GKE_METADATA_SERVER node metadata setting can only be enabled if Workload Identity is enabled at the cluster level.
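For context, enabling this looks roughly like the following (cluster and node pool names are placeholders; these are the beta-era flag names, and newer gcloud releases use --workload-pool and --workload-metadata=GKE_METADATA instead):

    # Enable Workload Identity on the cluster, then GKE_METADATA_SERVER on a node pool
    gcloud beta container clusters update CLUSTER_NAME \
      --identity-namespace=PROJECT_ID.svc.id.goog

    gcloud beta container node-pools create NODE_POOL_NAME \
      --cluster=CLUSTER_NAME \
      --workload-metadata-from-node=GKE_METADATA_SERVER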
Q2: Why does the metadata server pod need host networking?
The gke-metadata-server intercepts all GCE metadata server requests from pods; host networking is what lets it receive the traffic the node redirects to the metadata address and see the original pod IP as the source. Note that pods which themselves use the host network are not intercepted.
Q3: How does the GKE metadata server identify the pod making the call to start the process?
The pods are identified using iptables rules.
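Conceptually, the rule looks something like this (purely illustrative; the actual chain, target address and port GKE uses may differ):

    # Illustrative only: redirect pod traffic destined for the GCE metadata address
    # to the gke-metadata-server listening on the node (host network)
    iptables -t nat -A PREROUTING -p tcp \
      -d 169.254.169.254 --dport 80 \
      -j DNAT --to-destination 169.254.169.252:988

    # Because the connection is redirected rather than proxied through another hop,
    # the gke-metadata-server still sees the original pod IP as the source, which is
    # what the lookup described in Q1 keys on.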

Related

Using HTTP API to call multiple services running on AWS ECS

My goal here is to deploy two Spring Boot services using AWS ECS Fargate in a private subnet and access them via AWS API Gateway. Basically, I want to use a single HTTP API and have it call the appropriate service based on the path. I am using VPC Links for reaching the services running in a private subnet, and Cloud Map for service discovery. First of all, is this assumption even correct, i.e. can we use a single HTTP API to call two different services based on the path?
Some considerations of how I created the ECS services.
ECS Service A is deployed in a private subnet, it has no public IP enabled and service discovery has been enabled. While enabling service discovery I chose the DNS record type to be SRV, giving a port number and a TTL of 60 secs.
ECS Service B is also deployed similarly.
Both ECS Service A and B have a separate Service discovery endpoint.
Now in the API Gateway, the steps I followed were
Created a new HTTP API using the defaults, this means the default stage and no routes and integrations configured yet.
Then I created a VPC Link for the HTTP API by assigning it a name (service-a-vpclink), a VPC, a subnet and the appropriate security group (the one that was assigned to the ECS service for service A).
Then I created a route where the method is "ANY" and the path is "$default", and assigned an integration to it; with this I am able to reach all my endpoints of service A running in the private subnet. (So all good here, as this shows that I am able to reach a service running in a private subnet using API Gateway.)
For the integration that I mentioned in point 3, this was of type "Private Resource", target service as "Cloud Map" and then selecting the namespace and appropriate service (serviceA) along with the VPC link that was created in step 2.
But this is what I don't want to do. I want something like the below:
Hitting any endpoint like "https://uzhgtf6t8u.execute-api.eu-west-2.amazonaws.com/serviceA/any-serviceA-endpoints" where /serviceA is a path that is configured in API Gateway and then any-serviceA-endpoints are the actual endpoints configured in the backend service running, navigates to service A endpoints.
Hitting any endpoint like "https://uzhgtf6t8u.execute-api.eu-west-2.amazonaws.com/serviceB/any-serviceB-endpoints" where /serviceB is a path that is configured in API Gateway and then any-serviceB-endpoints are the actual endpoints configured in the backend service running, navigates to service B endpoints.
Here I attach separate integrations to the path /serviceA and to the path /serviceB, but this does not work; instead the response is 404 Not Found.
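For reference, the path-based routes I attached look roughly like this in CLI terms (the API ID is from the URLs above; the integration IDs are placeholders):

    # Roughly the routes I attached, one integration per path
    aws apigatewayv2 create-route --api-id uzhgtf6t8u \
      --route-key 'ANY /serviceA' --target integrations/aaaa111

    aws apigatewayv2 create-route --api-id uzhgtf6t8u \
      --route-key 'ANY /serviceB' --target integrations/bbbb222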
What exactly am I not following?
Many thanks..
Screenshot of route

Kubernetes service account role using OIDC

I am trying out the capability where 2 pods deployed to the same worker node in EKS are associated with different service accounts. Below are the steps:
Each service account is associated with a different role, one with access to SQS and the other without.
Used eksctl to associate the OIDC provider with the cluster and also created an iamserviceaccount, with a service account in Kubernetes and a role with a policy for accessing SQS attached (the implicit annotation of the service account with the IAM role is provided by eksctl create iamserviceaccount).
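The commands for that step were roughly (cluster name, namespace, service account name and policy are placeholders):

    # Associate the cluster's OIDC provider and create the service account + role
    eksctl utils associate-iam-oidc-provider --cluster my-cluster --approve

    eksctl create iamserviceaccount \
      --cluster my-cluster \
      --namespace default \
      --name sqs-reader \
      --attach-policy-arn arn:aws:iam::aws:policy/AmazonSQSFullAccess \
      --approve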
But when I try to start the pod whose service account is tied to the role with SQS access, I get access denied for SQS. However, if I add SQS permissions to the worker node instance role, it works fine.
Am I missing any steps and is my understanding correct?
So, there are a few things required to get IRSA to work:
There has to be an OIDC provider associated with the cluster, following the directions here.
The IAM role has to have a trust relationship with the OIDC provider, as defined in the AWS CLI example here.
The service account must be annotated with a matching eks.amazonaws.com/role-arn (see the sketch below).
The pod must have the appropriate service account specified with a serviceAccountName in its spec, as per the API docs.
The SDK for the app needs to support the AssumeRoleWithWebIdentity API call. Weirdly, the aws-sdk-go-v2 SDK doesn't currently support it at all (the "old" aws-sdk-go does).
It's working with the node role because one of the requirements above isn't met, meaning the credential chain "falls through" to the underlying node role.
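A minimal sketch of requirements 3 and 4 together (the role ARN, names and image are placeholders):

    # Requirement 3: annotate the service account with the role it should assume
    kubectl annotate serviceaccount sqs-reader \
      eks.amazonaws.com/role-arn=arn:aws:iam::123456789012:role/sqs-reader-role

    # Requirement 4: reference that service account from the pod spec
    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: sqs-consumer
    spec:
      serviceAccountName: sqs-reader
      containers:
      - name: app
        image: my-app:latest
    EOF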

Can I use the Ambassador to authenticate service-to-service communication inside a Kubernetes cluster?

I have a Kubernetes cluster with services and I use Ambassador as an API gateway between outside world and my services.
With Ambassador I know that I can use a service, which I have, to check authentication and authorization for incoming requests, but does this only apply to requests coming from outside the cluster?
I want to intercept service-to-service calls as well.
I would be surprised if you cannot.
This answer needs some terminology, to avoid getting lost in word-soup.
App-A is a consumer of an in-cluster Service, and the one which will be authenticating to Ambassador
App-Z is the provider of an in-cluster Service (the selector would target its Pods)
The k8s Service for app-Z we'll call z-service in the z namespace, for a FQDN of z-service.z.svc.cluster.local
It seems like you can use its v-host support and teach it to honor the in-cluster virtual host (the aforementioned FQDN), then update the z-service selector to target the Ambassador Pods rather than the underlying app-Z Pods.
From app-A's point of view, the only thing that would change is that it now must provide authentication for contacting z-service.z.svc.cluster.local.
Without studying Ambassador's setup more, it's hard to know if Ambassador would Just Work™ at that point, or whether you would then need an "implementation" Service -- such as z-for-real.z.svc.cluster.local -- so that Ambassador knows how to find the actual app-Z Pods.
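A hypothetical sketch of that wiring, assuming the Ambassador Pods carry a service: ambassador label (names follow the terminology above):

    # Hypothetical: z-service now fronts Ambassador; z-for-real still targets the app-Z Pods
    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Service
    metadata:
      name: z-service
      namespace: z
    spec:
      selector:
        service: ambassador   # retargeted at the Ambassador Pods
      ports:
      - port: 80              # match this to the port your Ambassador listener uses
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: z-for-real
      namespace: z
    spec:
      selector:
        app: app-z            # the actual app-Z Pods
      ports:
      - port: 80
    EOF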
I have the same problem at the moment. Ambassador routes every request to an auth service (if provided); the auth service can be anything. So you can set up HTTP basic auth, OAuth, JWT auth and so on.
The next important thing to mention is that your services may use header-based routing (https://www.getambassador.io/reference/headers). Only if a bearer token (or something similar) is present will the request hit your service; otherwise it will fail. In your service you can then check permissions and so on. So all in all Ambassador can help you, but you still have to program something yourself.
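A rough sketch of the header-based routing idea, using the annotation-style Mapping config from that era (the service name and prefix are placeholders):

    # Rough sketch: this Mapping only matches requests that carry an Authorization header
    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Service
    metadata:
      name: my-service
      annotations:
        getambassador.io/config: |
          ---
          apiVersion: ambassador/v1
          kind: Mapping
          name: my-service-mapping
          prefix: /my-service/
          service: my-service
          headers:
            Authorization: true
    spec:
      selector:
        app: my-service
      ports:
      - port: 80
    EOF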
If you want something that is ready from the start or more advanced, you can try
https://github.com/ory/oathkeeper or https://istio.io.
If you already found a solution, it would be interesting to know.

custom load balancing within kubernetes

I am trying to deploy an application with load balancing within Kubernetes.
Below is my intended deployment diagram:
Ideally, the application is deployed as a set of pods using a k8s deployment of type "backend".
Normally, user instances are stored in the archive. They are restored into one of the pods dynamically upon request, stay there for a TTL (say 30 minutes), and are then deleted and backed up into the archive.
Ideally, the load balancer is deployed as a set of pods using a k8s deployment of type "frontend".
Ideally, the frontend is configured as layer-7 session sticky with "sticky = host", where the host equals the UID of a backend pod.
A user requests the service with a SOAP message, which contains the parameters "host" and "user" in its body.
When a SOAP message reaches the frontend, the "host" value is extracted from the message body.
If the "host" value is valid, the SOAP message is forwarded to the corresponding backend pod (whose UID equals the host value); otherwise, a random backend pod is assigned.
(processing from here on is application specific)
In a backend pod, the application checks the availability of the user instance by the value of "user".
If it already exists, it is simply used; otherwise, the pod tries to restore it from the archive; if restoring fails (new user), a new user instance is created.
I searched around and did not find any similar examples, especially for layer-7 session sticky configuration and for the implementation of custom extraction of the sticky value from the incoming message body.
This sounds like a use case where you are doing authentication through the front-end load balancer. Have you looked at Istio and Ambassador? It seems like Istio and Envoy could provide the service mesh to route the requests to the pods. Then you would have to write a custom plugin module for Ambassador to create the specific routing and authentication mechanism you are seeking.
Example of Ambassador custom authentication service: https://www.getambassador.io/user-guide/auth-tutorial
https://www.getambassador.io/user-guide/with-istio
This custom sticky session routing can also be done using other API gateways but still using Istio for routing to the different pods. However it would be best if the pods are defined as separate services in order to have easier segmentation by the API gateway (Ambassador, Kong, Nginx) based on the parameters of the message body.

REST API for Kubernetes apiserver proxy?

I have a K8s cluster created with kubeadm v1.8.4 running on virtual machines. Now I want to access this K8s cluster using the REST API from my laptop/workstation.
One way I am able to do this is by running the command "kubectl proxy --address= --accept-hosts '.*'". But I have to run this command manually in order to access my cluster from my laptop, and I don't want that.
While going through the docs I found that there is another proxy available, the apiserver proxy. I am trying to use it by following this link (https://kubernetes.io/docs/tasks/access-application-cluster/access-cluster/#manually-constructing-apiserver-proxy-urls), but I get the error "could not get any response" in Postman.
So I was wondering whether I am using the apiserver proxy properly or not. Or is there any other way I can make REST requests from my laptop to my cluster on the VMs without manually running the "kubectl proxy" command?
What kubectl proxy does for you is essentially two things.
First, obviously, it proxies your traffic from a localhost port to the Kubernetes API.
Second, it also authenticates you against the cluster so that all the proxied calls do not need authentication information.
To hit the API directly, it's as simple as pointing your client to the right IP:PORT of your VM, but... you need to either ignore TLS issues (not advised) or trust the kube CA cert. Also, you still need to authenticate, so you need to use appropriate client credentials (i.e. a bearer token).
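For example, something along these lines (the API server address, secret name and CA path are placeholders; on a kubeadm cluster the CA is typically under /etc/kubernetes/pki):

    # Sketch: call the API server directly with a bearer token and the cluster CA
    TOKEN=$(kubectl get secret default-token-xxxxx -o jsonpath='{.data.token}' | base64 -d)

    curl --cacert /etc/kubernetes/pki/ca.crt \
      -H "Authorization: Bearer ${TOKEN}" \
      https://<VM_IP>:6443/api/v1/namespaces/default/pods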
Manually constructing apiserver proxy URLs refers to a different kind of beast, which allows you to proxy traffic to services deployed in your Kubernetes cluster by means of accessing a particular path on the kube API server. So to use that, you need to have access to the API already.
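Constructed per the page you linked, such a proxy URL looks roughly like this (namespace, service name and port are placeholders):

    # Sketch: reach an in-cluster Service through the API server's proxy path
    curl --cacert /etc/kubernetes/pki/ca.crt \
      -H "Authorization: Bearer ${TOKEN}" \
      https://<VM_IP>:6443/api/v1/namespaces/default/services/my-service:80/proxy/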