We have been running a cluster on GKE for around three years. As such, legacy authorization is enabled.
The control plane has been getting updated automatically, and our node pools are running a mixture of 1.12 and 1.14.
We have an increasing number of services, and are planning on incrementally adopting istio.
We want to enable a minimal RBAC setup without causing errors and downtime of our services.
I haven't been able to find any guides for how to accomplish this. Some people say just to enable RBAC authorization on the GKE cluster, but I assume that would take down all of our services.
It has also been implied that k8s can run in a hybrid ABAC/RBAC mode, but we can't tell if it is or not!
Is there a good guide for migrating to RBAC for GKE?
If you cluster is Regional you won't have downtime in your application when upgrade, but if your cluster is single-zonal or multi-zonal the best approach here is:
Add a new node pool
Cordon the old node pool to migrate the applications to the new node pool
Delete the old node pool after all pods are migrated.
It is the safesty way to update your node pool (zonal) without downtimes. Please read the references below to understand in details every step.
References:
https://kubernetes.io/docs/concepts/architecture/nodes/#reliability
https://kubernetes.io/docs/reference/kubectl/cheatsheet/#interacting-with-nodes-and-cluster
Related
I'm experiencing downtimes whenever the GKE cluster gets upgraded during the maintenance window. My services (APIs) become unreachable for like ~5min.
The cluster Location type is set to "Zonal", and all my pods have 2 replicas. The only affected pods seem to be the ones using nginx ingress controller.
Is there anything I can do to prevent this? I read that using Regional clusters should prevent downtimes in the control plane, but I'm not sure if it's related to my case. Any hints would be appreciated!
You mention "downtime" but is this downtime for you using the control plane (i.e. kubectl stop working) or is it downtime in that the end user who is using the services stops seeing the service working.
A GKE upgrade upgrades two parts of the cluster: the control plane or master nodes, and the worker nodes. These are two separate upgrades although they can happen at the same time depending on your configuration of the cluster.
Regional clusters can help with that, but they will cost more as you are having more nodes, but the upside is that the cluster is more resilient.
Going back to the earlier point about the control plane vs node upgrades. The control plane upgrade does NOT affect the end-user/customer perspective. The services will remaining running.
The node upgrade WILL affect the customer so you should consider various techniques to ensure high availability and resiliency on your services.
A common technique is to increase replicas and also to include pod antiaffinity. This will ensure the pods are scheduled on different nodes, so when the node upgrade comes around, it doesn't take the entire service out because the cluster scheduled all the replicas on the same node.
You mention the nginx ingress controller in your question. If you are using Helm to install that into your cluster, then out of the box, it is not setup to use anti-affinity, so it is liable to be taken out of service if all of its replicas get scheduled onto the same node, and then that node gets marked for upgrade or similar.
Seems like every week the GKE cluster gets restarted. Is there anything I could do to prevent that from happening? It does migrate pods to other node while it does maintenance on one of the node. But I'm not sure if there is downtime during migration and also sometimes the pods gets stuck in crash crashloopbackoff or errimagepull state.
How does the migration happen while maintenance? Does it create a new pod and then route the traffic and then delete the old pod when the total number of replica is just one? Just wanted to know if there is downtime. Its a new cluster and monitoring hasn't been setup so don't know if players are experiencing downtime during maintenance.
Is there a way to prevent GCP from doing maintenance? I used terraform to create the cluster so if I could prevent it I need to do it via terraform since GKE nodes can't be edited using GCP console.
You can configure your maintenance windows and enable/disable automatic node upgrades.
Here's an example of the configuration options in the GCP console:
You can also decide on which release channel you want to be (rapid, regular and stable).
Your Kubernetes control plane will have downtime if you have a zonal cluster. Only regional clusters replicate the control plane.
In terms of your own applications they should have zero downtime and GKE will automatically create new nodes and divert traffic when pods are ready to receive traffic.
I have a GKE cluster I'm trying to switch the default node machine type on.
I have already tried:
Creating a new node pool with the machine type I want
Deleting the default-pool. GKE will process for a bit, then not remove the default-pool. I assume this is some undocumented behavior where you cannot delete the default-pool.
I'd prefer to not re-create the cluster and re-apply all of my deployments/secrets/configs/etc.
k8s version: 1.14.10-gke.24 (Stable channel)
Cluster Type: Regional
The best approach to change/increase/decrease your node pool specification would be with:
Migration
To migrate your workloads without incurring downtime, you need to:
Create a new node pool.
Mark the existing node pool as unschedulable.
Drain the workloads running on the existing node pool.
Check if the workload is running correctly on a new node pool.
Delete the existing node pool.
Your workload will be scheduled automatically onto a new node pool.
Kubernetes, which is the cluster orchestration system of GKE clusters, automatically reschedules the evicted Pods to the new node pool as it drains the existing node pool.
There is official documentation about migrating your workload:
This tutorial demonstrates how to migrate workloads running on a GKE cluster to a new set of nodes within the same cluster without incurring downtime for your application. Such a migration can be useful if you want to migrate your workloads to nodes with a different machine type.
-- GKE: Migrating workloads to different machine types
Please take a look at above guide and let me know if you have any questions in that topic.
Disable the default-pool's autoscaler and set the pool size to 0 nodes.
Wish there was a way I could just switch the machine type on the default-pool...
We are working on provisioning our service using Kubernetes and the service needs to register/unregister some data for scaling purposes. Let's say the service handles long-held transactions so when it starts/scales out, it needs to store the starting and ending transaction ids somewhere. When it scales out further, it will need to find the next transaction id and save it with the ending transaction id that is covered. When it scales in, it needs to delete the transaction ids, etc. ETCD seems to make the cut as it is used (by Kubernetes) to store deployment data and not only that it is close to Kubernetes, it is actually inside and maintained by Kubernetes; thus we'd like to find out if that is open for our use. I'd like to ask the question for both EKS, AKS, and self-installed. Any advice welcome. Thanks.
Do not use the kubernetes etcd directly for an application.
Access to read/write data to the kubernetes etcd store is root access to every node in your cluster. Even if you are well versed in etcd v3's role based security model avoid sharing that specific etcd instance so you don't increase your clusters attack surface.
For EKS and GKE, the etcd cluster is hidden in the provided cluster service so you can't break things. I would assume AKS takes a similar approach unless they expose the instances to you that run the management nodes.
If the data is small and not heavily updated, you might be able to reuse the kubernetes etcd store via the kubernetes API. Create a ConfigMap or a custom resource definition for your data and edit it via the easily securable and namespaced functionality in the kubernetes API.
For most application uses run your own etcd cluster (or whatever service) to keep Kubernetes free to do it's workload scheduling. The coreos etcd operator will let you define and create new etcd clusters easily.
I'm currently learning about Kubernetes and still trying to figure it out. I get the general use of it but I think that there still plenty of things I'm missing, here's one of them. If I want to run Kubernetes on my public cloud, like GCE or AWS, will Kubernetes spin up new VMs by itself in order to make more compute for new pods that might be needed? Or will it only use a certain amount of VMs that were pre-configured as the compute pool. I heard Brendan say, in his talk in CoreOS fest, that Kubernetes sees the VMs as a "sea of compute" and the user doesn't have to worry about which VM is running which pod - I'm interested to know where that pool of compute comes from, is it configured when setting up Kubernetes? Or will it scale by itself and create new machines as needed?
I hope I managed to be coherent.
Thanks!
Kubernetes supports scaling, but not auto-scaling. The addition and removal of new pods (VMs) in a Kubernetes cluster is performed by replication controllers. The size of a replication controller can be changed by updating the replicas field. This can be performed in a couple ways:
Using kubectl, you can use the scale command.
Using the Kubernetes API, you can update your config with a new value in the replicas field.
Kubernetes has been designed for auto-scaling to be handled by an external auto-scaler. This is discussed in responsibilities of the replication controller in the Kubernetes docs.