What happens if I delete a node in GKE - Kubernetes

I have set up GKE on a free trial account.
Here is a screenshot of the cluster.
I have already set up a VM instance in GCE, so my Kubernetes cluster has limited resources; I set it up just for testing. I want to know: if I delete 1 node out of the 3, what will happen?
My pods are running on all 3 nodes (distributed).
If I delete one node, will it create a new node, or will my running pods be deployed onto the other 2 nodes, making them heavily loaded?
How do I know it is highly available, and how do I see it scale up and scale down?
Please clarify these questions.

If I delete one node, will it create a new node, or will my running pods be deployed onto the other 2 nodes, making them heavily loaded?
GKE manages the nodes using the node pool configuration.
If your node pool is set to 3 nodes and you manually remove 1 instance, GKE will automatically create a new node in the cluster.
Your pods might get moved to another node if there is space left there; otherwise they will go into the Pending state and wait for a new node to join the GKE cluster.
If you want to reduce the number of nodes in GKE, you have to reduce the node count in the GKE node pool.
If you want to test scaling up and down, you can enable autoscaling on the node pool and increase the pod count on the cluster; GKE will automatically add nodes. Make sure you have set the minimum and maximum node counts correctly in the node pool's autoscaling section.
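As a rough sketch, enabling autoscaling on an existing node pool looks something like this (the cluster, zone and pool names here are placeholders):

    gcloud container clusters update my-cluster \
        --zone us-central1-a \
        --node-pool default-pool \
        --enable-autoscaling \
        --min-nodes 1 \
        --max-nodes 5

With this in place, scheduling more pods than the current nodes can hold should trigger a scale-up, and mostly-empty nodes should eventually be removed down to the minimum.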

When you delete a node, its pods are also deleted. What happens next depends on your Deployment: e.g. with a pod scale of 3, one of the remaining nodes will hold 2 pods and the other 1. Whether your app suffers or not depends on the actual traffic.

Related

Pods are not rescheduled to a failed node when the node comes back alive

My situation is: I have a Kubernetes cluster with 5 nodes, and initially the pods are distributed across all 5 nodes. Sometimes we need to restart a particular node server. That node goes down and its pods are created on other nodes. But once the failed/down node comes back up, no pods are created on it automatically, as the replica count has already been reached. We need every node to run a minimum of 1 pod. Could you please help with this?
We need every node to run a minimum of 1 pod
Instead of running a Deployment, you can run a DaemonSet if you want a minimum of one pod running on each node.
Then, any time a new node joins the cluster, it will run one replica of your pod.
Read more about DaemonSets: https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/
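A minimal DaemonSet sketch (the name and image are hypothetical) looks like this:

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: my-app
    spec:
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
          - name: my-app
            image: my-app:1.0   # hypothetical image

The DaemonSet controller runs exactly one pod per matching node, so a node that rejoins the cluster automatically gets a pod again.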
You can also spread the replicas while running a Deployment, though it is trickier to manage, using affinity or pod topology spread constraints (see the sketch below).
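A hedged sketch of such a constraint, as a fragment of the Deployment's pod template (labels are illustrative):

    # fragment of the Deployment's pod template spec
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway   # use DoNotSchedule for a hard constraint
        labelSelector:
          matchLabels:
            app: my-app

Note that spreading is only evaluated when pods are scheduled; it will not move existing pods back onto a node that rejoins, which is part of why it is trickier than a DaemonSet for this requirement.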

Does Kubernetes restore a worker node if the worker node dies?

I am creating a Kubernetes cluster which includes 1 master node (M1) and 2 worker nodes (W1 and W2).
I am using a Deployment to create pods with a replica count of 5.
If a pod dies, Kubernetes recreates it, so the count remains 5.
Let's suppose worker node W2 dies for some reason.
In this case, will Kubernetes create a new node, or just run all the replicas on the remaining node W1?
If I want the dead node to be restored automatically, how can I do that?
This mostly depends on how you deployed things. Most cloud-integrated installers and hosted providers (GKE, EKS, AKS, kops) use a node group of some kind, so a fully failed node (machine terminated) would be replaced at that level. If the node is up but jammed, that would generally be solved by the cluster-autoscaler starting a new node for you. Some installers that don't make per-cloud assumptions (Kubespray, etc.) leave this up to you to handle yourself.

Can I make sure that a "cordoned" node is the one deleted when I downsize the cluster in Azure AKS?

I need to downsize a cluster from 3 to 2 nodes.
I have critical pods running on some nodes (0 and 1). Since I found that the last node (2) in the cluster is the one holding the non-critical pods, I have "cordoned" it so it won't get any new ones.
What I wonder is whether I can make sure that this last node (2) is the one that will be removed when I go to the Azure portal and downsize my cluster to 2 nodes (it is the last node and it is cordoned).
I have read that if I manually delete the node, the system will still consider there to be 3 nodes running, so it's important to use the cluster management to downsize it.
You cannot control which node will be removed when scaling down the AKS cluster.
However, there are some workarounds for that:
Delete the cordoned node manually via the portal and then launch an upgrade. It will try to add the node back, but without success because the subnet has no space left.
Another option is to:
Set up the cluster autoscaler with two nodes
Scale up the number of nodes in the UI
Drain the node you want to delete and wait for the autoscaler to do its job (see the sketch below)
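As a sketch of the drain step (the node name is a placeholder):

    # mark the node unschedulable, then evict its pods onto the remaining nodes
    kubectl cordon aks-nodepool1-12345678-2
    kubectl drain aks-nodepool1-12345678-2 --ignore-daemonsets

Once the node is empty, the cluster autoscaler should treat it as underutilised and remove it down to the configured minimum count.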
Here are some sources and useful info:
Scale the node count in an Azure Kubernetes Service (AKS) cluster
Support selection of nodes to remove when scaling down
az aks scale
Please let me know if that helped.

AWS EKS Cluster Autoscaler - Scale-In Policy

I have a CA (Cluster Autoscaler) deployed on EKS, following this post. What I expected is that the CA scales the cluster only when needed, i.e. if there are 3 nodes with a capacity of 8 pods each and a 9th pod comes up, the CA would provision a 4th node to run that 9th pod, and it would not scale down a node that still has pods on it. What I see instead is that the CA is continuously terminating and creating a node, randomly chosen from within the cluster, disturbing other pods and nodes.
How can I tell EKS (without defining a minimum node count or disabling the scale-in policy in the ASG) not to kill a node that has at least 1 pod running on it? Any suggestion would be appreciated.
You cannot use the pod as the unit; the CA works with resource units, CPU and memory.
If the cluster does not have enough CPU or memory, it adds a new node.
You have to play with your resource requests (in the pod definition), or redefine your nodes to use an instance type with more or fewer resources, depending on how many pods you want on each.
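For example, requests are set per container in the pod spec; the numbers below are only illustrative:

    # container fragment of a pod / Deployment template
    containers:
    - name: my-app
      image: my-app:1.0
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
        limits:
          cpu: 500m
          memory: 512Mi

The scheduler and the CA reason about these requests, not about the raw pod count, when deciding whether a node is needed or can be removed.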
Or you can play with the parameter scale-down-utilization-threshold:
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-are-the-parameters-to-ca
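As a sketch, that flag is passed on the cluster-autoscaler container command; the values below are only examples:

    # fragment of the cluster-autoscaler Deployment
    command:
    - ./cluster-autoscaler
    - --cloud-provider=aws
    - --scale-down-utilization-threshold=0.3   # only consider nodes below 30% requested utilisation for removal
    - --scale-down-unneeded-time=10m           # a node must stay underutilised this long before scale-down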

Kubernetes Horizontal Pod Autoscaler not utilising node resources

I am currently running Kubernetes 1.9.7 and successfully using the Cluster Autoscaler and multiple Horizontal Pod Autoscalers.
However, I recently started noticing the HPA would favour newer pods when scaling down replicas.
For example, I have 1 replica of service A running on a node alongside several other services. This node has plenty of available resources. During load, the target CPU utilisation for service A rose above the configured threshold, so the HPA decided to scale it to 2 replicas. As there were no other nodes available, the CA spun up a new node on which the new replica was successfully scheduled - so far so good!
The problem is, when the target CPU utilisation drops back below the configured threshold, the HPA decides to scale down to 1 replica. I would expect to see the new replica on the new node removed, thereby enabling the CA to turn off that new node. However, the HPA removed the existing service A replica that was running on the node with plenty of available resources. This means I now have service A running on a new node, by itself, that can't be removed by the CA even though there is plenty of room for service A to be scheduled on the existing node.
Is this a problem with the HPA or the Kubernetes scheduler? Service A has now been running on the new node for 48 hours and still hasn't been rescheduled despite there being more than enough resources on the existing node.
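For context, the HPA in question is roughly of this shape (a simplified sketch; the name and numbers are illustrative):

    apiVersion: autoscaling/v1
    kind: HorizontalPodAutoscaler
    metadata:
      name: service-a
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: service-a
      minReplicas: 1
      maxReplicas: 2
      targetCPUUtilizationPercentage: 70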
After scouring through my cluster configuration, I managed to come to a conclusion as to why this was happening.
Service A was configured to run on a public subnet and the new node created by the CA was public. The existing node running the original replica of Service A was private, therefore leading the HPA to remove this replica.
I'm not sure how Service A was scheduled onto this node in the first place, but that is a different issue.