Can I configure a turn on/off on a GKE node? - kubernetes

I have a GKE cluster with 1 node-pool and 2 nodes in it, one with node affinity to only accept pods of production and the other to development and testing pods. For financial purposes, I want to configure like a cronJob or something similar on the dev/test node so I can spend less money but I don't know if that's possible.

Yes, you can add another node pool named test so you can have two node pools; one to develop and one to test. You can also turn on the auto [scaling]1 in your development pool, this feature of GKE will allow you to save money because it automatically resizes your GKE Cluster node pool based on the demand workload when your production pool is not demanding and you can put a maximum of pods to limit the numbers of pods deployed; in case of your workload increase the demand.
Once you have configured the production pool, you can create the new test node pool with a fixed size of one node.
Then, you will use a node selector in your pods to make sure that they can run in the production node pool. And you could use an anti-affinity rule to ensure that two of your training pods cannot be scheduled on the same node.

Yes, it is possible. But you should separate production and development nodes into different pools. Then you can fire up a cronjob, target that to come up in the production pool, and use that to perform the following actions:
Turn off autoscaling on the development pool (if you use autoscaling, that is)
Set the node pool size of the development pool to 0
Remember to fire up another cronjob to reverse the steps:
Scale up development pool to 1
Enable autoscaling on the development pool

Related

In Kubernetes / AKS how do you direct services to specific node pools' individual nodes. Is there a way for the deployments to choose or is it auto?

I am new to Kubernetes. One thing I am not sure of is when creating all of these deployments I am starting to max out my cpu/memory usage on my current node pool.
In this article it states that the SAME configuration of nodes will be created as a "node pool"
In Azure Kubernetes Service (AKS), nodes of the same configuration are grouped together into node pools.
System node pools serve the primary purpose of hosting critical system pods such as CoreDNS and tunnelfront. User node pools serve the primary purpose of hosting your application pods.
Also, you can create "taints (lol what)", tags and labels which only seem to "label" per se the node pool. Not an individual node.
When creating a node pool, you can add taints, labels, or tags to that node pool. When you add a taint, label, or tag, all nodes within that node pool also get that taint, label, or tag.
So with all of that said, it doesn't seem like the control is inside of a node pool's node. So how does it work for nodes in a node pool when deploying workloads and services?
Do I need to worry about managing that or is that automatically managed and pods are created across the plane of nodes in a node pool. I'm not really seeing the documentation for this.
One more thing, "Vertical Pod Autoscaling"
This seems like a good option but doesn't really explain in the documentation what is going on in terms of the nodes in a node pool. Except for this one statement at the end.
This article showed you how to automatically scale resource utilization, such as CPU and memory, of cluster nodes to match application requirements.
The question about the vertical auto scaler (which is really vertical + horizontal (IMHO)) but I understand the reference/verbiage, is what happens if you aren't using this? Do you have to manage each deployment on your own? How do deployments distribute over the individual node pool plane?

Can I force Kubernetes not to run more than X replicas of a pod in the same node?

I have a tiny Kubernetes cluster consisting of just two nodes running on t3a.micro AWS EC2 instances (to save money).
I have a small web app that I am trying to run in this cluster. I have a single Deployment for this app. This deployment has spec.replicas set to 4.
When I run this Deployment, I noticed that Kubernetes scheduled 3 of its pods in one node and 1 pod in the other node.
Is it possible to force Kubernetes to schedule at most 2 pods of this Deployment per node? Having 3 instances in the same pod puts me dangerously close to running out of memory in these tiny EC2 instances.
Thanks!
The correct solution for this would be to set memory requests and limits correctly matching your steady state and burst RAM consumption levels on every pod, then the scheduler will do all this math for you.
But for the future and for others, there is a new feature which kind of allows this https://kubernetes.io/blog/2020/05/introducing-podtopologyspread/. It's not an exact match, you can't put a global cap, rather you can require pods be evenly spaced over the cluster subject to maximum skew caps.

Change node machine type on GKE cluster

I have a GKE cluster I'm trying to switch the default node machine type on.
I have already tried:
Creating a new node pool with the machine type I want
Deleting the default-pool. GKE will process for a bit, then not remove the default-pool. I assume this is some undocumented behavior where you cannot delete the default-pool.
I'd prefer to not re-create the cluster and re-apply all of my deployments/secrets/configs/etc.
k8s version: 1.14.10-gke.24 (Stable channel)
Cluster Type: Regional
The best approach to change/increase/decrease your node pool specification would be with:
Migration
To migrate your workloads without incurring downtime, you need to:
Create a new node pool.
Mark the existing node pool as unschedulable.
Drain the workloads running on the existing node pool.
Check if the workload is running correctly on a new node pool.
Delete the existing node pool.
Your workload will be scheduled automatically onto a new node pool.
Kubernetes, which is the cluster orchestration system of GKE clusters, automatically reschedules the evicted Pods to the new node pool as it drains the existing node pool.
There is official documentation about migrating your workload:
This tutorial demonstrates how to migrate workloads running on a GKE cluster to a new set of nodes within the same cluster without incurring downtime for your application. Such a migration can be useful if you want to migrate your workloads to nodes with a different machine type.
-- GKE: Migrating workloads to different machine types
Please take a look at above guide and let me know if you have any questions in that topic.
Disable the default-pool's autoscaler and set the pool size to 0 nodes.
Wish there was a way I could just switch the machine type on the default-pool...

Why kubernetes taints the master node with "NoSchedule" by default?

A few days ago, I looked up why none of pods are being scheduled to the master node, and found this question: Allow scheduling of pods on Kubernetes master?
It tells that it is because the master node is tainted with "NoSchedule" effect, and gives the command to remove that taint.
But before I execute that command on my cluster, I want to understand why it was there in the first place.
Is there a reason why the master node should not run pods? Any best-practices it relates to?
The purpose of kubernetes is to deploy application easily and scale them based on the demand. The pod is a basic entity which runs the application and can be increased and decreased based on high and low demands respectively (Horizontal Pod Autoscalar).
These worker pods needs to be run on worker nodes specially if you’re looking at big application where your cluster might scale upto 100’s of nodes based on demand (Cluster Autoscalar). These increasing pods can put up pressure on your nodes and once they do you can always increase the worker node in cluster using cluster autoscalar.
Suppose, you made your master schedulable then the high memory and CPU pressure put your master at risk of crashing the master. Mind you can’t autoscale the master using autoscalar. This way you’re putting your whole cluster at risk. If you have single master then your will not be able to schedule anything if master crashed. If you have 3 master and one of them crashed, then the other two master has to take the extra load of scheduling and managing worker nodes and increasing the load on themselves and hence the increased risk of failure
Also, In case of larger cluster, you already need the master nodes with high resources just to manage your worker nodes. You can’t put additional load on master nodes to run the workload as well in that case. Please have a look at the setting up large cluster in kubernetes here
If you have manageable workload and you know it doesn’t increase beyond a certain level. You can make master schedulable. However for production cluster it is not recommended at all.
Primary role of master is cluster management. Already many components of k8 are running on master.Suppose If pods scheduled on master without limit of resources and pods are consuming all the resources( cpu or memory), then master and in turn whole cluster will be at risk.
So while designing Highly Available production cluster minimum 3 master, 3 etcd, 3 infra node are created and application pods are not scheduled on these nodes. Separate worker nodes added to assign workload.
Master is intended for cluster management tasks and should not be used to run workloads. In development and test environments it is ok to schedule pods on master servers but in production better to keep it only for cluster level management activities. Use workers or nodes to schedule workloads

GKE cluster upgrade by switching to a new pool: will inter-cluster service communication fail?

From this article(https://cloudplatform.googleblog.com/2018/06/Kubernetes-best-practices-upgrading-your-clusters-with-zero-downtime.html) I learnt that it is possible to create a new node pool, and cordon and drain old nodes one by one, so that workloads get re-scheduled to new nodes in the new pool.
To me, a new node pool seems to indicate a new cluster. The reason: we have two node pools in GKE, and they're listed as two separate clusters.
My question is: after the pods under a service get moved to a new node, if that service is being called from other pods in the old node, will this inter-cluster service call fail?
You don't create a new cluster per se. You upgrade the master(s) and then you create a new node pool with nodes that have a newer version. Make sure the new node pool shares the same network as the original node pool.
If you have a service with one replica (one pod) if that pod is living in one of the nodes you are upgrading you need to allow time for Kubernetes to create a new replica on a different node that is not being upgraded. For that time, your service will be unavailable.
If you have a service with multiple replicas chances are that you won't see any downtime unless for some odd reason all your replicas are scheduled on the same node.
Recommendation: scale your resources which serve your services (Deployments, DaemonSets, StatefulSets, etc) by one or two replicas before doing node upgrades.
StatefulSet tip: You will have some write downtime if you are running something like mysql in a master-slave config when you reschedule your mysql master.
Note that creating a new node Pool does not create a new cluster. You can have multiple node pools within the same cluster. Workloads within the different node pools will still interact with each other since they are in the same cluster.
gcloud container node-pools create (the command to create node pools) requires that you specify the --cluster flag so that the new node pool is created within an existing cluster.
So to answer the question directly, following the steps from that Google link will not cause any service interruption nor will there be any issues with pods from the same cluster communicating with each other during your migration.