How do I use gcloud to find which regions my dataproc clusters are in?

If I issue gcloud dataproc clusters list 0 clusters are listed:
$ gcloud dataproc clusters list
Listed 0 items.
However if I specify the region gcloud dataproc clusters list --region europe-west1 I get back a list of clusters:
$ gcloud dataproc clusters list --region europe-west1
NAME WORKER_COUNT STATUS ZONE
mydataproccluster1 2 RUNNING europe-west1-d
mydataproccluster2 2 RUNNING europe-west1-d
I'm guessing that the inability to get a list of clusters without specifying --region is a consequence of a decision made by my org's administrators; however, I'm hoping there is a way around it. I can visit https://console.cloud.google.com/ and see a list of all the clusters in the project; can I get the same using gcloud? Having to visit https://console.cloud.google.com/ just so I can issue gcloud dataproc clusters list --region europe-west1 seems a bit of a limitation.

The underlying regional services are by design isolated from each other, so there's no single URL that returns the combined list (that would be a global dependency and a global failure mode). Unfortunately, at the moment the layout of the gcloud libraries is such that there's no option for specifying a list of regions, or a shorthand for "all regions", when listing Dataproc clusters or jobs.
However, you can work around this by obtaining the list of possible regional stacks from the Compute API:
gcloud compute regions list --format="value(name)" | \
xargs -n 1 gcloud dataproc clusters list --region
The only Dataproc region that doesn't match up to one of the Compute regions is the special "global" Dataproc region, which is backed by a separate Dataproc service that spans all compute regions.
For convenience you can also just add global to a for-loop:
for REGION in global $(gcloud compute regions list --format="value(name)"); do gcloud dataproc clusters list --region ${REGION}; done

Having to specify --region is how the Dataproc command group in gcloud works. The Developers Console issues list requests against all regions (you could file a feature request for gcloud to do the same).
Alternatively, you can use the global multiregion (which is the gcloud default). This interacts well with organization policies: if your organization has region-restricted VM locations, you will be able to create VMs in Europe but will get an error when doing so elsewhere.
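For example (a minimal sketch, not from the original answer), listing clusters in the global multiregion, or making it the default so --region can be omitted:
# List clusters in the special "global" Dataproc region
gcloud dataproc clusters list --region=global
# Optionally make global the default Dataproc region on this machine
gcloud config set dataproc/region global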

Related

CLI command ordering for toggling between two GKE/kubectl accounts w/ two emails

Several weeks ago I asked this question and received a very helpful answer. The gist of that question was: "how do I switch back and forth between two different K8s/GCP accounts on the same machine?" I have 2 different K8s projects with 2 different emails (gmails) that live on 2 different GKE clusters in 2 different GCP accounts. And I wanted to know how to switch back and forth between them so that when I run kubectl and gcloud commands, I don't inadvertently apply them to the wrong project/account.
The answer was to basically leverage kubectl config set-context along with a script.
This question (today) is an extension of that question, a "Part 2" so to speak.
I am confused about the order in which I:
Set the K8s context (again via kubectl config set-context ...); and
Run gcloud init; and
Run gcloud auth; and
Can safely run kubectl and gcloud commands and be sure that I am hitting the right GKE cluster
My understanding is that gcloud init only has to be run once to initialize the gcloud CLI on your system. Which I have already done.
So my thinking here is that I should be able to do the following:
# 1. switch K8s context to Project 1
kubectl config set-context <context for GKE project 1>
# 2. authenticate w/ GCP so that now gcloud commands will only hit the GCP
# resources associated with Project 1 (and GCP Account 1)
gcloud auth
# 3. run a bunch of kubectl and gcloud commands for Project/GCP Account 1
# 4. switch K8s context to Project 2
kubectl config set-context <context for GKE project 2>
# 5. authenticate w/ GCP so that now gcloud commands will only hit the GCP
# resources associated with Project 2 (and GCP Account 2)
gcloud auth
# 6. run a bunch of kubectl and gcloud commands for Project/GCP Account 2
Is my understanding here correct or is it more involved/complicated than this (and if so, why)?
I'll assume familiarity with the earlier answer.
gcloud
gcloud init need only be run once per machine, and again only if you really want to re-initialize the CLI (gcloud).
gcloud auth login ${ACCOUNT} authenticates a Google (user or service) account, persists the credentials (on Linux, by default under ${HOME}/.config/gcloud), and renews them.
gcloud auth list lists the accounts that have been authenticated with gcloud auth login. The results show which account is used by default (the ACTIVE one, marked with *).
Somewhat inconveniently, switching the currently ACTIVE account means changing gcloud's global (machine-wide) configuration using gcloud config set account ${ACCOUNT}.
kubectl
To facilitate using previously authenticated (i.e. gcloud auth login ${ACCOUNT}) credentials with Kubernetes Engine, Google provides the command gcloud container clusters get-credentials. This uses the currently ACTIVE gcloud account to create a kubectl context that joins a Kubernetes Cluster with a User and possibly with a Kubernetes Namespace too. gcloud container clusters get-credentials makes changes to kubectl config (on Linux by default in ${HOME}/.kube/config).
What is a User? See Users in Kubernetes. Kubernetes Engine (via kubectl) wants (OpenID Connect) Tokens. And, conveniently, gcloud can provide these tokens for us.
How? Per the previous answer:
user:
  auth-provider:
    config:
      access-token: [[redacted]]
      cmd-args: config config-helper --format=json
      cmd-path: path/to/google-cloud-sdk/bin/gcloud
      expiry: "2022-02-22T22:22:22Z"
      expiry-key: '{.credential.token_expiry}'
      token-key: '{.credential.access_token}'
    name: gcp
kubectl uses this configuration to invoke gcloud config config-helper --format=json and extracts the access_token and token_expiry from the result. GKE can then use the access_token to authenticate the user and, if necessary, the token is renewed via Google's token endpoint after it expires (token_expiry).
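If you're curious, you can run that helper yourself to see exactly what kubectl extracts (a minimal sketch, assuming jq is installed):
# Show the access token and its expiry, as kubectl's gcp auth plugin reads them
gcloud config config-helper --format=json \
  | jq -r '.credential.access_token, .credential.token_expiry'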
Scenario
So, how do you combine all of the above?
Authenticate gcloud with all your Google accounts
ACCOUNT="client1#gmail.com"
gcloud auth login ${ACCOUNT}
ACCOUNT="client2#gmail.com"
gcloud auth login ${ACCOUNT} # Last will be the `ACTIVE` account
Enumerate these
gcloud auth list
Yields:
ACTIVE  ACCOUNT
        client1@gmail.com
*       client2@gmail.com # This is ACTIVE
To set the active account, run:
$ gcloud config set account `ACCOUNT`
Switch between users for gcloud commands
NOTE This doesn't affect kubectl
Either
gcloud config set account client1@gmail.com
gcloud auth list
Yields:
ACTIVE  ACCOUNT
*       client1@gmail.com # This is ACTIVE
        client2@gmail.com
Or you can explicitly add --account=${ACCOUNT} to any gcloud command, e.g.:
# Explicitly unset your account
gcloud config unset account
# This will work and show projects accessible to client1
gcloud projects list --account=client1@gmail.com
# This will work and show projects accessible to client2
gcloud projects list --account=client2@gmail.com
Create kubectl contexts for any|all your Google accounts (via gcloud)
Either
ACCOUNT="client1#gmail.com"
PROJECT="..." # Project accessible to ${ACCOUNT}
gcloud container clusters get-credentials ${CLUSTER} \
--ACCOUNT=${ACCOUNT} \
--PROJECT=${PROJECT} \
...
Or, equivalently, using kubectl config set-context directly:
kubectl config set-context ${CONTEXT} \
  --cluster=${CLUSTER} \
  --user=${USER}
But the gcloud approach avoids having to look up the cluster and user entries yourself (e.g. with kubectl config get-clusters, kubectl config get-users etc.).
NOTE gcloud container clusters get-credentials uses derived names for contexts, and GKE uses derived names for clusters. If you're confident, you can edit the kubectl config directly (or use kubectl config commands) to rename these cluster, context and user references to suit your needs.
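For example (a hedged sketch; the derived context name below is hypothetical):
# See the derived names, then rename a context to something shorter
kubectl config get-contexts
kubectl config rename-context gke_my-project_europe-west1-d_a-cluster client1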
List kubectl contexts
kubectl config get-contexts
Yields:
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* client1 a-cluster client1
client2 b-cluster client2
Switch between kubectl contexts (clusters*users)
NOTE This doesn't affect gcloud
Either
kubectl config use-context ${CONTEXT}
Or you can explicitly add the --context flag to any kubectl command:
# Explicitly unset default|current context
kubectl config unset current-context
# This will work and list deployments accessible to ${CONTEXT}
kubectl get deployments --context=${CONTEXT}

How to clean up after a GKE cluster created with gcloud container clusters create?

I'm creating Kubernetes clusters programmatically for end-to-end tests in GitLab CI/CD, using gcloud container clusters create. I've been doing this for half a year and have created and deleted a few hundred clusters. The cost went up and down. Now I got an unusually high bill from Google and checked the cost breakdown. I noticed that >95% of the cost is for "Storage PD Capacity". I found out that gcloud container clusters delete never deleted the Google Compute disks created for Persistent Volume Claims in the Kubernetes cluster.
How can I delete those programmatically? What else could be left running after deleting the Kubernetes cluster and the disks?
Suggestions:
To answer your immediate question: you can programmatically delete your disk resource(s) with the disks.delete API method.
To determine what other resources might have been allocated, look here: Listing all Resources in your Hierarchy (see the sketch after this list).
Finally, this link might also help: GKE: Understanding cluster resource usage
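For the resource-listing suggestion, a hedged sketch using Cloud Asset Inventory (assumes the Cloud Asset API is enabled for the project; ${PROJECT} is a placeholder):
# Search the project for leftover persistent disks
gcloud asset search-all-resources \
  --scope="projects/${PROJECT}" \
  --asset-types="compute.googleapis.com/Disk"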
Because this part of the answer is lengthy, here's a worked example. First, create a couple of test disks:
gcloud compute disks create disk-a \
  --size=10GB \
  --zone=us-west1-a \
  --labels=something=monday \
  --project=${PROJECT}

gcloud compute disks create disk-b \
  --size=10GB \
  --zone=us-west1-b \
  --labels=something=else \
  --project=${PROJECT}
Then:
DISK=$(gcloud compute disks list \
  --filter="name~disk zone~us-west1 labels.something=else" \
  --format="value(name)" \
  --project=${PROJECT}) && echo ${DISK}
NB
the filter AND is implicit and omitted
you may remove terms as needed
you should make the filter as specific as possible
And -- when you're certain, because deletion is irrecoverable:
gcloud compute disks delete ${DISK} --project=${PROJECT} --zone=us-west1-b
If there are multiple matches, you can iterate:
DISKS=$(gcloud compute disks list ...)
for DISK in ${DISKS}
do
  gcloud compute disks delete ${DISK} --project=${PROJECT} --zone=us-west1-b
done
If you prefer the awesome jq, you'll have a general-purpose way (not gcloud-specific):
gcloud compute disks list \
--project=${PROJECT} \
--format=json \
| jq --raw-output '.[] | select(.name | contains("disk")) | select(.zone | contains("us-west1")) | select(.labels.something=="else")'
...
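To complete that pipeline (a hedged sketch, not the only way), pull out just the matching names and feed them to the delete command; the zone here matches where disk-b was created above:
gcloud compute disks list --project=${PROJECT} --format=json \
  | jq --raw-output '.[] | select(.name | contains("disk")) | select(.zone | contains("us-west1")) | select(.labels.something=="else") | .name' \
  | xargs -r -I {} gcloud compute disks delete {} --project=${PROJECT} --zone=us-west1-b --quiet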

Identify redundant GCP resources created by Kubernetes

When creating various Kubernetes objects in GKE, associated GCP resources are automatically created. I'm specifically referring to:
forwarding-rules
target-http-proxies
url-maps
backend-services
health-checks
These have names such as k8s-fw-service-name-tls-ingress--8473ea5ff858586b.
After deleting a cluster, these resources remain. How can I identify which of these are still in use (by other Kubernetes objects, or another cluster) and which are not?
There is no easy way to identify which of these added GCP resources (load balancers, backends, etc.) are linked to which cluster. You need to inspect each resource manually to see what it is linked to.
If you delete a cluster that has additional resources attached, you also have to delete those resources manually. For now, I would suggest taking note of which added GCP resources relate to which cluster, so that you know what to delete when the time comes to delete the GKE cluster.
I would also suggest creating a feature request here to ask either for a more clearly defined naming convention linking additional GCP resources to a specific cluster, and/or for the ability to automatically delete all additional resources linked to a cluster when deleting said cluster.
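As a starting point (a hedged sketch; the name pattern follows the example above, and --global assumes an external HTTP(S) load balancer), you can list the GKE-created load-balancing resources and inspect their description fields, which often record the originating Kubernetes service or ingress:
# List candidate resources created by GKE ingresses/services
gcloud compute forwarding-rules list --filter="name~^k8s-" --format="table(name,IPAddress,target)"
gcloud compute backend-services list --filter="name~^k8s-" --format="table(name,protocol)"
# Describe one of them to inspect its description field for hints
gcloud compute forwarding-rules describe k8s-fw-service-name-tls-ingress--8473ea5ff858586b --global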
I would recommend you to look at https://github.com/kelseyhightower/kubernetes-the-hard-way/blob/master/docs/14-cleanup.md
You can easily delete all of these objects using the Google Cloud SDK in the following manner:
gcloud -q compute firewall-rules delete \
  kubernetes-the-hard-way-allow-nginx-service \
  kubernetes-the-hard-way-allow-internal \
  kubernetes-the-hard-way-allow-external \
  kubernetes-the-hard-way-allow-health-check

{
  gcloud -q compute routes delete \
    kubernetes-route-10-200-0-0-24 \
    kubernetes-route-10-200-1-0-24 \
    kubernetes-route-10-200-2-0-24

  gcloud -q compute networks subnets delete kubernetes

  gcloud -q compute networks delete kubernetes-the-hard-way

  gcloud -q compute forwarding-rules delete kubernetes-forwarding-rule \
    --region $(gcloud config get-value compute/region)

  gcloud -q compute target-pools delete kubernetes-target-pool

  gcloud -q compute http-health-checks delete kubernetes

  gcloud -q compute addresses delete kubernetes-the-hard-way
}
This assumes you named your resources 'kubernetes-the-hard-way'. If you do not know the names, you can also use various filter mechanisms (filtering resources by name, label, etc.) to find and remove them.
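For example (a minimal sketch; adjust the name pattern to whatever prefix your resources actually use):
# Find firewall rules by name prefix and delete whatever the list returns
gcloud compute firewall-rules list \
  --filter="name~^kubernetes-the-hard-way" \
  --format="value(name)" \
  | xargs -r gcloud -q compute firewall-rules delete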

Unable to create Dataproc cluster using custom image

I am able to create a Google Dataproc cluster from the command line using a custom image:
gcloud beta dataproc clusters create cluster-name --image=custom-image-name
as described in https://cloud.google.com/dataproc/docs/guides/dataproc-images, but I am unable to find information about how to do the same using the v1beta2 REST API in order to create a cluster from within Airflow. Any help would be greatly appreciated.
Since custom images can theoretically reside in a different project (as long as you grant read/use access on the custom image to whatever service account you use for the Dataproc cluster), images currently always need a full URI, not just a short name.
When you use gcloud, there's syntactic sugar where gcloud will resolve the full URI automatically; you can see this in action if you use --log-http with your gcloud command:
gcloud beta dataproc clusters create foo --image=custom-image-name --log-http
If you created a cluster with gcloud, you can also run gcloud dataproc clusters describe on your cluster to see the fully-resolved custom image URI.
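For example (a hedged sketch; the cluster name and region are placeholders, and the field path assumes the current cluster resource layout):
# Show the fully-resolved custom image URI recorded on an existing cluster
gcloud dataproc clusters describe cluster-name \
  --region=us-central1 \
  --format="value(config.masterConfig.imageUri)"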

Access Kubernetes GKE cluster outside of GKE cluster with client-go?

I have multiple kubernetes clusters running on GKE (let's say clusterA and clusterB)
I want to access both of those clusters from client-go in an app that is running in one of those clusters (e.g. access clusterB from an app that is running on clusterA)
In general, for authenticating with Kubernetes clusters from client-go, I see that I have two options:
InCluster config
or from kube config file
So it is easy to access clusterA from clusterA but not clusterB from clusterA.
What are my options here? It seems that I just cannot pass GOOGLE_APPLICATION_CREDENTIALS and hope that client-go will take care of itself.
So my thinking:
create a dedicated IAM service account
create kube config with tokens for both clusters by doing gcloud container clusters get-credentials clusterA and gcloud container clusters get-credentials clusterB
use that kube config file in client-go via BuildConfigFromFlags on clusterA
Is this the correct approach, or is there a simpler way? I see that tokens have an expiration date?
Update:
It seems I can also use CLOUDSDK_CONTAINER_USE_CLIENT_CERTIFICATE=True gcloud beta container clusters get-credentials clusterB --zone ..., which would add certificates to the kube config that I could use. But AFAIK those certificates cannot be revoked.
client-go needs to know about:
cluster master’s IP address
cluster’s CA certificate
(If you're using GKE, you can see this info in $HOME/.kube/config, populated by the gcloud container clusters get-credentials command.)
I recommend that you either:
Have a kubeconfig file that contains these info for clusters A & B
Use GKE API to retrieve these info for clusters A & B (example here) (You'll need a service account to do this, explained below.)
Once you create a *rest.Config object in client-go, client-go will use the auth plugin specified in the kubeconfig file (or in the in-memory equivalent you constructed). The gcp auth plugin knows how to retrieve a token.
Then, create a Cloud IAM service account and give it the "Container Developer" role. Download its key.
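A hedged sketch of that setup (all names are placeholders; roles/container.developer is the "Container Developer" role):
PROJECT="my-project"
SA_NAME="cluster-reader"
# Create the service account, grant it access to GKE clusters, and download a key
gcloud iam service-accounts create ${SA_NAME} --project=${PROJECT}
gcloud projects add-iam-policy-binding ${PROJECT} \
  --member="serviceAccount:${SA_NAME}@${PROJECT}.iam.gserviceaccount.com" \
  --role="roles/container.developer"
gcloud iam service-accounts keys create key.json \
  --iam-account="${SA_NAME}@${PROJECT}.iam.gserviceaccount.com"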
Now, you have two options:
Option 1: your program uses gcloud
gcloud auth activate-service-account --key-file=key.json
KUBECONFIG=a.yaml gcloud container clusters get-credentials clusterA
KUBECONFIG=b.yaml gcloud container clusters get-credentials clusterB
Then, in your program, create 2 different *rest.Config objects, one built from a.yaml and another from b.yaml.
Now your program will rely on the gcloud binary to retrieve a token every time your token expires (every 1 hour).
Option 2: use GOOGLE_APPLICATION_CREDENTIALS
Don't install gcloud in your program's environment.
Set your key.json in the GOOGLE_APPLICATION_CREDENTIALS environment variable for your program.
Figure out a way to get the cluster IP/CA (explained above) so you can construct two different *rest.Config objects for clusters A & B.
Now your program will use the specified key file to get an access_token to the Google API every time it expires (every 1h).
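For example (a minimal sketch; the binary name is hypothetical):
# Point the program at the downloaded key; the Google auth libraries pick this up automatically
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/key.json"
./my-client-go-app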
Hope this helps.
P.S. do not forget to import _ "k8s.io/client-go/plugin/pkg/client/auth/gcp" in your Go program. This loads the gcp auth plugin!