How to authenticate to a GKE cluster without using the gcloud CLI - kubernetes

I've got a container inside a GKE cluster and I want it to be able to talk to the Kubernetes API of another GKE cluster to list some resources there.
This works well if I run the following command in a separate container to proxy the connection for me:
gcloud container clusters get-credentials MY_CLUSTER --region MY_REGION --project MY_PROJECT; kubectl --context MY_CONTEXT proxy --port=8001 --v=10
But this requires me to run a separate container that, due to the size of the gcloud CLI, is more than 1 GB.
Ideally I would like to talk directly from my primary container to the other GKE cluster. But I can't figure out how to determine the IP address and set up the authentication required for the connection.
I've seen a few questions:
How to Authenticate GKE Cluster on Kubernetes API Server using its Java client library
Is there a golang sdk equivalent of "gcloud container clusters get-credentials"
But it's still not really clear to me if/how this would work with the Java libraries, if at all possible.
Ideally I would write something like this.
var info = gkeClient.GetClusterInformation(...);
var auth = gkeClient.getAuthentication(info);
...
// using the io.fabric8.kubernetes.client.ConfigBuilder / DefaultKubernetesClient
var config = new ConfigBuilder().withMasterUrl(info.url())
.withNamespace(null)
// certificate or other authentication mechanism
.build();
return new DefaultKubernetesClient(config);
Does that make sense, is something like that possible?

There are multiple ways to connect to your cluster without using the gcloud CLI. Since you are trying to access the cluster from another cluster within Google Cloud, you can use the Workload Identity authentication mechanism. Workload Identity is the recommended way for workloads running on Google Kubernetes Engine (GKE) to access Google Cloud services in a secure and manageable way. For more information, refer to the official documentation, which details a step-by-step procedure for configuring Workload Identity and provides reference links for code libraries.
This answer is based on information provided in Google's official documentation.
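As a rough illustration of what `gcloud container clusters get-credentials` does under the hood, here is a minimal sketch using only the JDK's `java.net.http` classes. It builds the two requests involved: one to the metadata server for an access token (this works from a pod with Workload Identity or a node service account), and one to the GKE API for the cluster description. The class name `GkeDirectAccess` and the placeholder values are made up for this example. The token response is JSON with an `access_token` field, and the cluster response is JSON containing `endpoint` (the API server IP) and `masterAuth.clusterCaCertificate` (base64-encoded PEM). As far as I know, those values map onto fabric8's `withMasterUrl`, `withCaCertData` and `withOauthToken`, but treat that mapping as an assumption to verify against your client version.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Sketch: the two HTTP requests that stand in for `gcloud container clusters get-credentials`. */
final class GkeDirectAccess {

    // Metadata-server endpoint that returns an OAuth2 access token for the pod's service account.
    static final String TOKEN_URL =
        "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token";

    /** Request for an access token; the metadata server rejects calls without this header. */
    static HttpRequest tokenRequest() {
        return HttpRequest.newBuilder(URI.create(TOKEN_URL))
            .header("Metadata-Flavor", "Google")
            .GET()
            .build();
    }

    /** Request for the cluster description (endpoint and CA certificate) from the GKE API. */
    static HttpRequest clusterRequest(String project, String location, String cluster,
                                      String accessToken) {
        URI uri = URI.create(String.format(
            "https://container.googleapis.com/v1/projects/%s/locations/%s/clusters/%s",
            project, location, cluster));
        return HttpRequest.newBuilder(uri)
            .header("Authorization", "Bearer " + accessToken)
            .GET()
            .build();
    }

    public static void main(String[] args) {
        // Placeholder values; no network call is made here, the request is only printed.
        HttpRequest request = clusterRequest("MY_PROJECT", "MY_REGION", "MY_CLUSTER", "TOKEN");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Each request can be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. The same access token can then be used as the bearer token against the other cluster's API server, provided the service account has been granted access to that cluster.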

Related

Can we configure AWS Secrets Manager to integrate with an on-premises k8s cluster

I set up an EKS cluster and integrated AWS Secrets Manager in it following the steps mentioned in https://github.com/aws/secrets-store-csi-driver-provider-aws and it worked as expected.
Now we have a requirement to integrate the AWS Secrets Manager on an on-premises k8s cluster and I am unable to follow the same steps as they seem to be explicitly for AWS EKS based clusters.
I googled around a bit and found you can call the Secrets Manager programmatically using one of the ways in https://docs.aws.amazon.com/secretsmanager/latest/userguide/asm_access.html, but this approach won't work for us.
Is there a k8s way to directly connect to AWS Secrets Manager without setting up AWS-CLI and the OIDC cluster ID on the on-premises cluster?
Any help would be highly appreciated.
You can set up external OIDC providers with AWS and also set up K8s to work with OIDC, but that is a lot of work.
AWS recently announced IAM Roles Anywhere which will let you use host based certificates to authenticate, but you will still have to call the Secrets Manager APIs.
If you are willing to have the secrets synced into Kubernetes Secrets in etcd (where they may be stored only base64-encoded on the cluster), you can look at using the open-source External Secrets solution.

CloudSQL Proxy on GKE : Service vs Sidecar

Does anyone know the pros and cons for installing the CloudSQL-Proxy (that allows us to connect securely to CloudSQL) on a Kubernetes cluster as a service as opposed to making it a sidecar against the application container?
I know that it is mostly used as a sidecar. I have used it as both (in non-production environments), but I never understood why sidecar is more preferable to service. Can someone enlighten me please?
The sidecar pattern is preferred because it is the easiest and most secure option. Traffic to the Cloud SQL Auth proxy is not encrypted or authenticated, so it relies on the user to restrict access to the proxy (typically by running it on localhost).
When you run the Cloud SQL proxy, you are essentially saying "I am user X and I'm authorized to connect to the database". When you run it as a service, anyone that connects to that service is connecting authorized as "user X".
You can see this warning in the Cloud SQL proxy example running as a service in k8s, or watch this video on Connecting to Cloud SQL from Kubernetes which explains the reason as well.
The Cloud SQL Auth proxy is the recommended way to connect to Cloud SQL, even when using private IP. This is because the Cloud SQL Auth proxy provides strong encryption and authentication using IAM, which can help keep your database secure.
When you connect using the Cloud SQL Auth proxy, the Cloud SQL Auth proxy is added to your pod using the sidecar container pattern. The Cloud SQL Auth proxy container is in the same pod as your application, which enables the application to connect to the Cloud SQL Auth proxy using localhost, increasing security and performance.
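To make the "connect using localhost" point concrete, here is a minimal sketch of the JDBC URLs an application would use when the proxy runs as a sidecar. The class name, the database name `appdb` and the ports are assumptions for illustration; use whatever port your proxy container is configured to listen on.

```java
/** Sketch: JDBC URLs for an app whose Cloud SQL Auth proxy sidecar listens on localhost. */
final class SidecarJdbcUrl {

    // Containers in one pod share a network namespace, so the sidecar is reachable on loopback.
    static final String PROXY_HOST = "127.0.0.1";

    /** Builds a PostgreSQL JDBC URL pointing at the local proxy. */
    static String postgres(int proxyPort, String database) {
        return "jdbc:postgresql://" + PROXY_HOST + ":" + proxyPort + "/" + database;
    }

    /** Builds a MySQL JDBC URL pointing at the local proxy. */
    static String mysql(int proxyPort, String database) {
        return "jdbc:mysql://" + PROXY_HOST + ":" + proxyPort + "/" + database;
    }

    public static void main(String[] args) {
        // Hypothetical database name and the default ports of each engine.
        System.out.println(postgres(5432, "appdb"));
        System.out.println(mysql(3306, "appdb"));
    }
}
```

With the proxy deployed as a service instead, the host would be the service's DNS name, and that unencrypted, unauthenticated hop would cross the cluster network rather than stay inside the pod.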
A sidecar is a container that runs in the same Pod as the application container. Because it shares the same volumes and network as the main container, it can “help” or enhance how the application operates. In Kubernetes, a pod is a group of one or more containers with shared storage and network. A sidecar is a utility container in a pod that’s loosely coupled to the main application container.
Sidecar Pros: Scales indefinitely as you increase the number of pods. Can be injected automatically. Already used by service meshes.
Sidecar Cons: A bit difficult to adopt, as developers can't just deploy their app; they have to deploy a whole stack in a deployment. It consumes more resources and is harder to secure, because every Pod must deploy the log aggregator to push the logs to the database or queue.
Refer to the documentation for more information.

Anthos showing wrong status of Deployment on on-premise external cluster

I wanted to try GCP's Anthos GKE On-Prem offering.
For the sake of my demo, I set up a Kubernetes cluster in GCP itself using Google Compute Engine, following the instructions from (https://kubernetes.io/docs/setup/production-environment/turnkey/gce/)
After this I followed the Anthos documentation to register my cluster to Anthos. I was able to register the cluster and log in to it using both token-based and basic-authentication-based mechanisms.
Now when I try to deploy anything from the GCP console, I get the following error
But the deployment succeeds, I can see deployment and associated pods in Running state on my cluster.
Also, when I try to deploy using Marketplace, I get the following error.
I wish to know whether this is a bug in Anthos or whether my cluster is missing some configuration.
You're not running Anthos GKE On-Prem, you're running open-source Kubernetes on Google Cloud. Things designed for Anthos - the marketplace and connecting clusters to Cloud Console - are not supposed to work in your setup. The fact that they mostly work despite that is an accident (and a testament to the portability and compatibility of Kubernetes).
To get Cloud Console integration and use the marketplace, you need to use either Anthos GKE On-Prem, which runs on VMware, or regular GKE.

Google cloud access mongo deployed on compute engine from app deployed on kubernetes engine

I have three instances for kubernetes cluster and three instances for mongo cluster as shown here:
I can access my mongo cluster from the app console and other compute instances using a URI like this:
mongo mongodb://root:passwd@mongodb-1-servers-vm-0:27017,mongodb-1-servers-vm-1:27017/devdb?replicaSet=rs0
I also tried replacing instance names with internal and external IP addresses, but that didn't help either.
But the same command does not work from instances inside the Kubernetes cluster. I assume that I have to configure some kind of permissions for my Kubernetes cluster to access compute instances? Can someone help?
Ok, I managed to find a solution, not sure if the best one.
First we add a firewall rule to allow MongoDB traffic:
gcloud compute firewall-rules create allow-mongodb --allow tcp:27017
Then we use the external IPs to connect to MongoDB from the Kubernetes instances:
mongodb://root:passwd@<ip1>:27017,<ip2>:27017/devdb?replicaSet=rs0
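For reference, the standard MongoDB connection string format is `mongodb://user:password@host1:port,host2:port/database?options`. Here is a minimal sketch that assembles such a URI for a replica set and percent-encodes the credentials. The class and method names are made up for this example, and it assumes every member listens on the same port.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

/** Sketch: builds a MongoDB replica-set connection string from its parts. */
final class MongoReplicaSetUri {

    /** Percent-encodes a credential so characters such as '@' or ':' cannot break the URI. */
    static String encode(String value) {
        // URLEncoder emits '+' for spaces; the userinfo part of a URI needs %20 instead.
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    /** Joins the hosts as host:port pairs and appends the database and replicaSet option. */
    static String build(String user, String password, List<String> hosts, int port,
                        String database, String replicaSet) {
        String hostList = hosts.stream()
            .map(host -> host + ":" + port)
            .collect(Collectors.joining(","));
        return "mongodb://" + encode(user) + ":" + encode(password) + "@" + hostList
            + "/" + database + "?replicaSet=" + replicaSet;
    }

    public static void main(String[] args) {
        // Host names taken from the question; substitute IP addresses as needed.
        System.out.println(build("root", "passwd",
            List.of("mongodb-1-servers-vm-0", "mongodb-1-servers-vm-1"),
            27017, "devdb", "rs0"));
    }
}
```

The hosts can be instance names or IP addresses; only the list passed to `build` changes.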

Connect to a DB hosted within a Kubernetes engine cluster from a PySpark Dataproc job

I am a new Dataproc user and I am trying to run a PySpark job that is supposed to use the MongoDB connector to retrieve data from a MongoDB replica set hosted within a Google Kubernetes Engine cluster.
Is there a way to achieve this, given that my replica set is not supposed to be accessible from the outside without using a port-forward or something similar?
In this case I assume by saying "outside" you're pointing to the internet or other networks than your GKE cluster's. If you deploy your Dataproc cluster on the same network as your GKE cluster, and expose the MongoDB service to the internal network, you should be able to connect to the databases from your Dataproc job without needing to expose it to outside of the network.
See this link for how to create a Cloud Dataproc cluster with internal IP addresses.
Just expose your MongoDB service in GKE and you should be able to reach it from within the same VPC network.
Take a look at this post for reference.
You should also be able to automate the service exposure through an init script.
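Before debugging the Spark connector itself, it can help to confirm that the exposed service is reachable at the TCP level from a machine in the same VPC (for example a Dataproc node, which has a JVM available). A minimal sketch using only the JDK follows; the class name is made up, and the host and port you pass in are whatever internal address your MongoDB service was exposed on.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Sketch: checks whether a TCP port is reachable, e.g. MongoDB's 27017 on an internal IP. */
final class TcpProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unroutable: all count as "not reachable" here.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: java TcpProbe.java <host> <port>");
            return;
        }
        boolean ok = canConnect(args[0], Integer.parseInt(args[1]), 3000);
        System.out.println(args[0] + ":" + args[1] + (ok ? " is reachable" : " is NOT reachable"));
    }
}
```

A failed probe usually points to a firewall rule or to the service not being exposed on the internal network, rather than to the job's configuration.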