One of our Google Kubernetes Engine clusters has lost access to Google Cloud Platform via its main service account. It was not using the 'default' service account but a custom one, and that account is now gone. Is there a way to restore or change the service account for a GKE cluster after it has been created? Or are we out of luck and do we have to re-create the cluster?
Good news! We found a way to solve the issue without having to re-create the entire cluster.
Create a new node-pool and make sure it has the default permissions to Google Cloud Platform (this is the case if you create the pool via the Console UI).
'Force' all workloads onto the new node pool (e.g. by using node labels; see the sketch after these steps).
Re-deploy the workloads.
Remove the old (broken) node pool.
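A minimal sketch of those steps, assuming a zonal cluster and placeholder names ($CLUSTER, $ZONE, new-pool, old-pool, my-app); omitting --service-account makes the new pool use the Compute Engine default service account:
# Create a replacement node pool; without --service-account it gets the Compute Engine default service account.
gcloud container node-pools create new-pool --cluster $CLUSTER --zone $ZONE --num-nodes 3
# Steer a workload onto the new pool via the built-in node-pool label (my-app is a placeholder deployment).
kubectl patch deployment my-app -p '{"spec":{"template":{"spec":{"nodeSelector":{"cloud.google.com/gke-nodepool":"new-pool"}}}}}'
# Once everything has been re-deployed onto the new pool, remove the broken one.
gcloud container node-pools delete old-pool --cluster $CLUSTER --zone $ZONE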
Hope this helps anyone with the same issue in the future.
Looks like you are out of luck. According to the documentation, the gcloud container clusters update command does not let you update the service account.
It's not possible to either restore the service account or update the cluster to use a new one. You can normally edit Compute Engine instances, but since the cluster's nodes are managed as a group you can't edit them individually; and even if you could, with the autoscaler or node auto-repair enabled, new nodes would not get the new service account.
So it seems you're out of luck: you will have to recreate the cluster.
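For reference, you can at least confirm which service account a node pool is currently using (pool and cluster names are placeholders; the field is read-only and cannot be changed in place):
gcloud container node-pools describe default-pool --cluster $CLUSTER --zone $ZONE --format="value(config.serviceAccount)"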
Sorry to bother you, but I am having a serious issue with my online DevOps learning.
I am taking a DevOps course and we are using Google Cloud Platform as the cloud. When I create my cluster with gcloud container clusters create xxx and then run gcloud container clusters describe xxx, it works, but I get no information about the login and password for Kubernetes.
That is one of the problems.
After creating the cluster, I got no Kubernetes dashboard link from kubectl cluster-info. Normally I should have a Kubernetes dashboard to manage my app. Instead of the Kubernetes dashboard, there is something called Kubernetes system metrics.
Can somebody help me fix this problem, ideally someone who is used to practicing on GCP?
Best regards
Can you please go through the Google Cloud Kubernetes dashboards docs [1]?
I'm able to see the Kubernetes dashboard in my console, so I'm not sure why you cannot. I also checked the Google Cloud Status Dashboard [2] and there is no service outage affecting Kubernetes; it's working fine. So please go through those Kubernetes docs; they should give you a better understanding of working with Kubernetes on GCP.
If you're still facing any issue or abnormal behavior, please go to the public issue tracker [3] or open a support case from the GCP console and raise a ticket.
[1]. https://cloud.google.com/kubernetes-engine/docs/concepts/dashboards
[2]. https://status.cloud.google.com/
[3]. https://cloud.google.com/support/docs/issue-trackers#trackers-list
When you visit the GCP dashboards docs, you should see a red warning at the top of the page, saying:
Warning: The open source Kubernetes Dashboard addon is deprecated for clusters on GKE and will be removed as an option in version 1.15. As an alternative, use the Cloud Console dashboards described in this guide.
Below that you can read:
Starting with GKE v1.15, you will no longer be able to enable the Kubernetes Dashboard by using the add-on API. You will still be able to install Kubernetes Dashboard manually by following the instructions in the project's repository. For clusters in which you have already deployed the add-on, it will continue to function but you will need to manually apply any updates and security patches that are released.
To deploy it, follow the instructions in the Kubernetes Dashboard GitHub repo.
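A minimal sketch of such a manual install, assuming a recent Dashboard release (the version tag below is an example; check the repo for the current one):
# Deploy the open source Kubernetes Dashboard straight from its GitHub repo.
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0/aio/deploy/recommended.yaml
# Then reach it through kubectl's local proxy.
kubectl proxy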
We have been running a cluster on GKE for around three years. As such, legacy authorization is enabled.
The control plane has been getting updated automatically, and our node pools are running a mixture of 1.12 and 1.14.
We have an increasing number of services, and are planning on incrementally adopting istio.
We want to enable a minimal RBAC setup without causing errors and downtime of our services.
I haven't been able to find any guides for how to accomplish this. Some people say just to enable RBAC authorization on the GKE cluster, but I assume that would take down all of our services.
It has also been implied that k8s can run in a hybrid ABAC/RBAC mode, but we can't tell whether ours is doing so or not!
Is there a good guide for migrating to RBAC for GKE?
If your cluster is regional you won't have downtime in your application during the upgrade, but if your cluster is single-zonal or multi-zonal the best approach is:
Add a new node pool
Cordon and drain the old node pool so the applications migrate to the new node pool
Delete the old node pool after all pods are migrated.
This is the safest way to update a zonal node pool without downtime; a command sketch follows. Please read the references below to understand every step in detail.
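Roughly, the migration looks like this (pool names are placeholders; on older kubectl versions the drain flag is --delete-local-data rather than --delete-emptydir-data):
# Cordon every node in the old pool so nothing new is scheduled there.
kubectl cordon -l cloud.google.com/gke-nodepool=old-pool
# Drain the old nodes so their pods are rescheduled onto the new pool.
for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=old-pool -o name); do
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
done
# Remove the old pool once all workloads are running on the new one.
gcloud container node-pools delete old-pool --cluster $CLUSTER --zone $ZONE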
References:
https://kubernetes.io/docs/concepts/architecture/nodes/#reliability
https://kubernetes.io/docs/reference/kubectl/cheatsheet/#interacting-with-nodes-and-cluster
I am trying to use KubeVirt with a GKE cluster.
I found that I am able to create a nested-virtualization-enabled GCP VM, but I didn't find a way to achieve the same thing for a GKE cluster node.
If I cannot enable nested virtualization for a GKE cluster node, I can only use KubeVirt with debug.useEmulation, which is not what I want.
Thanks
Yes you can -- it isn't even hard to do, it just isn't very intuitive.
Start a GKE cluster with ubuntu/containerd node images, n1-standard nodes and a minimum CPU platform of Haswell. I think you also need to enable basic authentication to get virtctl working (sorry).
Find the instance template used for your new cluster, then determine the proper source image:
gcloud compute instance-templates describe $TEMPLATE_NAME --format=json | jq -r ".properties.disks[0].initializeParams.sourceImage"
Create a copy of the source image with nested virtualization enabled:
gcloud compute images create $NEW_IMAGE_NAME --project $PROJECT --source-image $SOURCE_IMAGE --source-image-project $SOURCE_PROJECT --licenses "https://www.googleapis.com/compute/v1/projects/vm-options/global/licenses/enable-vmx"
Use "Create Similar" on the template for your GKE cluster. Change the boot disk to $NEW_IMAGE_NAME. You will also need to drill down to networking/alias and change the default subnet to your pod network.
Trigger a rolling update on the group for your GKE nodes to move them to the new template.
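For the rolling update step, something like this should work (the instance group and template names are placeholders; use the zone of your node pool):
# Roll the managed instance group behind the GKE node pool onto the copied template.
gcloud compute instance-groups managed rolling-action start-update $MIG_NAME --version template=$NEW_TEMPLATE_NAME --zone $ZONE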
You can now install KubeVirt (I had to use 0.38.1 instead of the current release).
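The install itself is the usual operator plus custom resource pair from the KubeVirt releases; a sketch pinned to 0.38.1 as mentioned above (adjust the version if a newer one works for you):
# Install the KubeVirt operator, then the KubeVirt custom resource that triggers the deployment.
export KUBEVIRT_VERSION=v0.38.1
kubectl create -f https://github.com/kubevirt/kubevirt/releases/download/${KUBEVIRT_VERSION}/kubevirt-operator.yaml
kubectl create -f https://github.com/kubevirt/kubevirt/releases/download/${KUBEVIRT_VERSION}/kubevirt-cr.yaml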
Caveats: I don't know how to use Google disk images for KubeVirt, which would be an obvious match. I haven't figured out how to get private GCR working with CDI either. Also, the console doesn't work due to websocket problems. But you can shell into a GKE node and see /dev/kvm, and you can launch a VM with KubeVirt and then ssh into it, so yes, it does work.
Anyone know how to make any of this better?
Currently nested virtualization is available only on GCE, as per these docs.
There is already a question about supporting nested virtualization on GKE and it can be found here. I'd say it has not been introduced yet; that's why you cannot find proper documentation about GKE and nested virtualization.
Also, please consider that GCE and GKE are quite different.
A Google Compute Engine VM instance is not managed by Google, so apart from the ready-made base image, you can do whatever you need, just as with a normal VM.
Google Kubernetes Engine, however, was created specifically for containers. Those VMs are managed by Google: GKE creates the cluster for you and all VMs are automatically part of it. In GKE you are unable to run Minikube or kubeadm.
Here are some characteristics of GKE:
Can anyone explain, with an example, how to use kiam on Kubernetes to manage service-level access control to AWS resources?
According to the docs:
The server is the only process that needs to call sts:AssumeRole and can be placed on an isolated set of EC2 instances that don't run other user workloads.
I would like to know how to run the server part of it away from the nodes that host your services.
The kiam architecture is well explained here:
https://www.bluematador.com/blog/iam-access-in-kubernetes-kube2iam-vs-kiam
Basically you want to use the master nodes in your cluster, which carry the sts:AssumeRole permissions, to run the server portion of kiam, and then let the kiam agents on your worker nodes connect to the masters to retrieve credentials.
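If kiam is deployed as DaemonSets, one way to keep the server off the worker nodes is a node selector plus toleration on the server DaemonSet; a sketch, assuming it is named kiam-server in kube-system and that your masters carry the usual node-role label and taint:
# Pin the kiam-server pods to master nodes (names and labels here are assumptions;
# adjust them to whatever your kiam manifests or chart actually use).
kubectl -n kube-system patch daemonset kiam-server --type merge -p '{"spec":{"template":{"spec":{"nodeSelector":{"node-role.kubernetes.io/master":""},"tolerations":[{"key":"node-role.kubernetes.io/master","effect":"NoSchedule"}]}}}}'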
DISCLAIMER: I did some digging on kube2iam and kiam without taking them all the way to a test bench, and I wasn't happy with what I found. It turns out we don't need them any more starting with Kubernetes 1.13 on EKS, that is, as of September 4th, because AWS has added native support for pods to access IAM/STS.
https://docs.aws.amazon.com/en_pv/eks/latest/userguide/iam-roles-for-service-accounts.html
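With that native support (IAM roles for service accounts), you associate an IAM role directly with a Kubernetes service account through an annotation; the names and role ARN below are placeholders:
# Annotate the service account with the IAM role to assume; pods that use this
# service account then obtain AWS credentials via the injected web identity token.
kubectl annotate serviceaccount my-app -n default eks.amazonaws.com/role-arn=arn:aws:iam::111122223333:role/my-app-role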
I cannot find a way to remove a GPU (accelerator resource) from a Google Kubernetes Engine (GKE) cluster. There is no official documentation on how to change this. Can you suggest a proper way to do so? The UI is grayed out and it does not allow me to make the change from the console.
Here is the screenshot when I click to edit the cluster.
Thank you
You cannot edit the settings of a node pool once it is created.
You should create a new node pool with the settings you want (GPU, machine type, etc.) and delete the old node pool.
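In gcloud terms that is roughly the following (names are placeholders; the new pool simply omits the --accelerator flag):
# Create a replacement pool without any accelerator attached.
gcloud container node-pools create cpu-pool --cluster $CLUSTER --zone $ZONE --machine-type n1-standard-4 --num-nodes 3
# After migrating the workloads off it (cordon/drain), delete the GPU pool.
gcloud container node-pools delete gpu-pool --cluster $CLUSTER --zone $ZONE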
There's a tutorial on how to migrate to a new node pool smoothly here: https://cloud.google.com/kubernetes-engine/docs/tutorials/migrating-node-pool If you don't care about pods terminating gracefully, you can create a new pool and just delete the old one.
You can find more content about this at https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-upgrading-your-clusters-with-zero-downtime.