Doesn't Kubernetes honor HPA configuration when we execute "kubectl scale deploy"? - kubernetes

The scenario:
I deployed a service using a Helm chart, and I can see my service, HPA, deployment, pods, etc.
In my HPA settings, the min pod count is set to 1.
I can see my pod is running and able to handle service requests.
After a while, I executed "kubectl scale deploy --replicas=0".
Once I ran the above command, I could see my pod got deleted (although the HPA min pod setting was 1). I was expecting that after a while the HPA would scale back up to the min pod count, i.e. 1.
However, that did not happen: I waited more than an hour and no new pod was created by the HPA.
I also tried sending a request to my Kubernetes service, thinking the HPA would now scale up since there was no pod to serve the request. However, the HPA doesn't seem to do that, and I got a response that my service is not available.
Here is what I can see in kubectl get hpa:
NAME   REFERENCE         TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
test   Deployment/xxxx   /1000%    1         4         0          1h
Interestingly, I found that the HPA scales down quickly: when I execute "kubectl scale deploy --replicas=2" (please note that the min HPA count is 1), I can see 2 pods get created quickly; however, within 5 minutes, 1 pod gets removed by the HPA.
Is this the expected behavior of Kubernetes (particularly the HPA)?
That is, if we delete all pods by executing "kubectl scale deploy --replicas=0",
a) the HPA won't block reducing the replica count below the min pod count configured (in the HPA config), and
b) the HPA won't scale up (based on the HPA spinning cycle) to the min number of pods as configured,
and essentially c) until we redeploy or execute another round of "kubectl scale deploy" to update the replica count, there will be no pods for this service.
Is this expected behavior or a (possible) bug in the Kubernetes codebase?
I am using Kubernetes version 1.8.

That was a great observation. I was going through the HPA documentation and came across the mathematical formula the HPA uses to scale pods. It looks like this:
TargetNumOfPods = ceil(sum(CurrentPodsCPUUtilization) / Target)
In your case, the current pod utilization is zero because your pod count is zero, so mathematically this equation results in zero. This is the reason the HPA does not act when the pod count is zero.
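To make the arithmetic concrete, here is a minimal Python sketch of the formula quoted above. The function name and the sample utilization values are illustrative assumptions, not part of Kubernetes; the sketch only evaluates the formula and does not model the real controller.

```python
import math

def target_num_of_pods(current_pods_cpu_utilization, target):
    """Evaluate TargetNumOfPods = ceil(sum(CurrentPodsCPUUtilization) / Target).

    current_pods_cpu_utilization: per-pod CPU utilization in percent (hypothetical values).
    target: target utilization in percent.
    """
    return math.ceil(sum(current_pods_cpu_utilization) / target)

# Two pods at 80% and 90% against a 50% target: ceil(170 / 50)
print(target_num_of_pods([80, 90], 50))

# No pods at all: the sum is 0, so the formula yields 0 and nothing is scaled up
print(target_num_of_pods([], 50))
```

With an empty pod list the numerator is zero whatever the target is, which matches the behavior described in the question.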
a: The HPA should not block manual scaling of pods, as it is triggered only by resource metrics (CPU, memory, etc.). Once you scale using "kubectl scale" or by any other means, the HPA comes into the picture depending on the min and max replicas and the average utilization value.
b: The HPA scales up to the min number of replicas if the current count is non-zero. I tried it and it works perfectly fine.
c: Yes, unless you bring the replica count to a non-zero value, the HPA will not work. So you have to scale up to some non-zero value.
Hope this answers your doubts about HPA.

Related

GKE node pool with Autoscaling does not scale down

I have a GKE cluster with two nodepools. I turned on autoscaling on one of my nodepools but it does not seem to automatically scale down.
I have enabled HPA and that works fine. It scales the pods down to 1 when I don't see traffic.
The API is currently not getting any traffic so I would expect the nodes to scale down as well.
But it still runs the maximum 5 nodes despite some nodes using less than 50% of allocatable memory/CPU.
What did I miss here? I am planning to move these pods to bigger machines but to do that I need the node autoscaling to work to control the monthly cost.
There are many reasons that can cause CA to fail to scale down. To summarize how this normally works:
Cluster Autoscaler periodically checks (every 10 seconds) the utilization of the nodes.
If the utilization factor is less than 0.5, the node is considered underutilized.
The node is then marked for removal and monitored for the next 10 minutes to make sure the utilization factor stays below 0.5.
If it is still underutilized after 10 minutes, the node is removed by Cluster Autoscaler.
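The steps above can be sketched as a small Python simulation. This is not Cluster Autoscaler code: the function name, the sample format, and the simplification that a single utilization number describes a node are assumptions made for illustration. Only the 0.5 threshold and the 10-minute window come from the description above.

```python
UTILIZATION_THRESHOLD = 0.5   # below this, a node counts as underutilized
UNNEEDED_SECONDS = 10 * 60    # how long it must stay underutilized

def removable_at(samples, threshold=UTILIZATION_THRESHOLD, unneeded=UNNEEDED_SECONDS):
    """Return the timestamp at which a node becomes removable, or None.

    samples: list of (timestamp_seconds, utilization) tuples in time order,
             e.g. one sample per 10-second check (hypothetical input format).
    """
    underutilized_since = None
    for timestamp, utilization in samples:
        if utilization < threshold:
            if underutilized_since is None:
                underutilized_since = timestamp  # node has just been marked
            if timestamp - underutilized_since >= unneeded:
                return timestamp
        else:
            underutilized_since = None  # utilization recovered, reset the timer
    return None

# A node that stays at 30% utilization, checked every 10 seconds
steady = [(t, 0.3) for t in range(0, 701, 10)]
print(removable_at(steady))

# The same node with one spike to 70% at t=300, which resets the timer
spiky = [(t, 0.7 if t == 300 else 0.3) for t in range(0, 701, 10)]
print(removable_at(spiky))
```

A single spike above the threshold restarts the 10-minute window, which is one reason a lightly loaded node can stay around longer than expected.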
If the above is not happening, then something else is preventing your nodes from scaling down. In my experience, PDBs need to be applied to kube-system pods, and I would say that could be the reason; however, there are several other possible causes. Here are reasons that can cause scale-down issues:
1. PDB is not applied to your kube-system pods. Kube-system pods prevent Cluster Autoscaler from removing nodes on which they are running. You can manually add Pod Disruption Budgets (PDBs) for the kube-system pods that can be safely rescheduled elsewhere, with the following command:
`kubectl create poddisruptionbudget PDB-NAME --namespace=kube-system --selector app=APP-NAME --max-unavailable 1`
2. Containers using local storage (volumes), even empty volumes. Kubernetes prevents scale-down events on nodes with pods using local storage. Look for this kind of configuration, which prevents Cluster Autoscaler from scaling down nodes.
3. Pods annotated with cluster-autoscaler.kubernetes.io/safe-to-evict: false. Look for pods with this annotation, which can prevent node scale-down.
4. Nodes annotated with cluster-autoscaler.kubernetes.io/scale-down-disabled: true. Look for nodes with this annotation, which can prevent Cluster Autoscaler from scaling them down.
These configurations are the ones I suggest you check in order to get your cluster to scale down underutilized nodes.
You can also see this page, which explains the configurations that prevent scale-down; that may be what is happening to you.

HPA could not get CPU metric during GKE node auto-scaling

Cluster information:
Kubernetes version: 1.12.8-gke.10
Cloud being used: GKE
Installation method: gcloud
Host OS: (machine type) n1-standard-1
CNI and version: default
CRI and version: default
During node scaling, HPA couldn't get CPU metric.
At the same time, kubectl top pod and kubectl top node output is:
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
For more detail, here is the flow in which the problem occurs:
Suddenly many requests arrive at the GKE server (using a testing tool).
HPA detects that current CPU usage is above the target CPU usage (50%), and thus tries to scale up pods incrementally.
An Insufficient CPU warning occurs when creating pods, so GKE tries to scale up nodes incrementally.
Soon the HPA fails to get the metric, and kubectl top node or kubectl top pod doesn't get a response.
- At this time one or more OutOfcpu pods are found, and several pods are in ContainerCreating (from Pending state).
After node scale-up is complete and some time has elapsed (a few minutes), HPA starts to fetch the CPU metric successfully and tries to scale up/down based on the metric.
The same situation happens on node scale-down.
This causes pod scaling to stop and causes some failures in responding to clients' requests. Is this normal?
I think HPA should get the CPU metric (or other metrics) from running pods even during node scaling, to keep track of the optimal pod count at the moment, so that when node scaling is done, HPA creates the necessary pods at once (rather than incrementally).
Can I make my cluster work like this?
Maybe your node runs out of one resource, either memory or CPU. There are config maps that describe how addons are scaled depending on the cluster size. You need to edit the metrics-server-config config map in the kube-system namespace:
kubectl edit cm/metrics-server-config -n kube-system
you should add
baseCPU
cpuPerNode
baseMemory
memoryPerNode
to NannyConfiguration; here you can find an extensive manual:
Heapster also suffers from the same OOM issue (too many pods to handle all metrics within the assigned resources), so please modify heapster's config map accordingly:
kubectl edit cm/heapster-config -n kube-system
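As a rough illustration of what the four NannyConfiguration fields listed above control, here is a Python sketch. It assumes the addon resizer applies a linear rule (base amount plus a per-node amount times the node count); that rule, the function name, and every number in the example are assumptions for illustration, so check the addon resizer documentation for the exact behavior on your cluster.

```python
def scaled_resources(node_count, base_cpu_m, cpu_per_node_m,
                     base_memory_mi, memory_per_node_mi):
    """Estimate addon resources under an assumed linear rule.

    CPU is in millicores and memory in MiB. The parameters mirror the
    baseCPU, cpuPerNode, baseMemory and memoryPerNode fields.
    """
    return {
        "cpu_millicores": base_cpu_m + cpu_per_node_m * node_count,
        "memory_mib": base_memory_mi + memory_per_node_mi * node_count,
    }

# Hypothetical settings: 80m CPU + 1m per node, 100Mi memory + 4Mi per node
for nodes in (1, 5, 50):
    print(nodes, scaled_resources(nodes, 80, 1, 100, 4))
```

Under this assumption, raising the base values helps small clusters, while the per-node values matter more as the node count grows during autoscaling.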

How to know if autoscaling has taken place in Kubernetes

Is there any way to know that the number of pods has scaled up or down as a result of Horizontal Pod Autoscaling, apart from the kubectl get hpa command?
I want to trigger a particular file on every scale-up or scale-down of pods.
You can use the status field of the HPA to know when the HPA last scaled.
Details about this can be found with the command below:
kubectl explain hpa.status
From this status, you can use the lastScaleTime field for your problem.
lastScaleTime <string>
last time the HorizontalPodAutoscaler scaled the number of pods; used by
the autoscaler to control how often the number of pods is changed.
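One way to act on lastScaleTime is to poll it and react when the value changes. The Python sketch below is a minimal example under several assumptions: kubectl is on the PATH and configured for the cluster, the HPA name and namespace are placeholders you supply, and the helper names and 30-second polling interval are arbitrary choices rather than anything Kubernetes prescribes.

```python
import subprocess
import time

def read_last_scale_time(hpa_name, namespace="default"):
    """Return .status.lastScaleTime of the HPA as a string ('' if it never scaled)."""
    result = subprocess.run(
        ["kubectl", "get", "hpa", hpa_name, "-n", namespace,
         "-o", "jsonpath={.status.lastScaleTime}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

def scale_happened(previous, current):
    """True when lastScaleTime is set and differs from the previously seen value."""
    return bool(current) and current != previous

def watch(hpa_name, on_scale, interval_seconds=30, namespace="default"):
    """Poll the HPA forever and call on_scale(timestamp) after each scaling event."""
    previous = read_last_scale_time(hpa_name, namespace)
    while True:
        time.sleep(interval_seconds)
        current = read_last_scale_time(hpa_name, namespace)
        if scale_happened(previous, current):
            on_scale(current)  # e.g. run your script here
        previous = current

# Example usage (not executed here because it loops forever):
# watch("my-hpa", lambda ts: print("HPA scaled at", ts))
```

Note that lastScaleTime records when the last scaling happened, not its direction; comparing the replica count before and after would be needed to tell scale-up from scale-down.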

Kubernetes hpa patch command not working

I have a Kubernetes cluster hosted in Google Cloud.
I deployed my deployment and added an hpa rule for scaling.
kubectl autoscale deployment MY_DEP --max 10 --min 6 --cpu-percent 60
After waiting a minute, I ran kubectl get hpa to verify my scaling rule. As expected, I have 6 pods running (according to the min parameter).
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
MY_DEP Deployment/MY_DEP <unknown>/60% 6 10 6 1m
Now, I want to change the min parameter:
kubectl patch hpa MY_DEP -p '{"spec":{"minReplicas": 1}}'
I waited 30 minutes and ran the command again:
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
MY_DEP Deployment/MY_DEP <unknown>/60% 1 10 6 30m
Expected replicas: 1, actual replicas: 6.
More information:
You can assume that the system is not computing anything (0% CPU utilization).
I waited for more than an hour. Nothing changed.
The same behavior is seen when I deleted the scaling rule and deployed it again. The replicas parameter has not changed.
Question:
If I changed the MINPODS parameter to "1", why do I still have 6 pods? How do I make Kubernetes actually change the min pods in my deployment?
If I changed the MINPODS parameter to "1" - why I still have 6 pods?
I believe the answer is because of the <unknown>/60% present in the output. The fine manual states:
Please note that if some of the pod's containers do not have the relevant resource request set, CPU utilization for the pod will not be defined and the autoscaler will not take any action for that metric
and one can see an example of 0% / 50% in the walkthrough page. Thus, I would believe that since kubernetes cannot prove what percentage of CPU is being consumed -- neither above nor below the target -- it takes no action for fear of making whatever the situation is worse.
As for why there is a <unknown>, I would hazard a guess it's the dreaded heapster-to-metrics-server cutover that might be obfuscating that information from the kubernetes API. Regrettably, I don't have first-hand experience testing that theory, in order to offer you concrete steps beyond "see if your cluster is collecting metrics in a place that kubernetes can see them."
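The reasoning in this answer can be sketched in Python. The ratio formula used here, ceil(currentReplicas * currentUtilization / targetUtilization) clamped to the min and max, is the general rule described in the Kubernetes HPA documentation; treating an unknown metric as "keep the current replica count" is a simplified model of the explanation above, and the function name is an assumption.

```python
import math

def desired_replicas(current_replicas, cpu_utilization, target,
                     min_replicas, max_replicas):
    """Simplified model of an HPA decision for a single CPU metric.

    cpu_utilization is the average utilization in percent, or None when the
    metric is unavailable (shown as <unknown> in kubectl get hpa).
    """
    if cpu_utilization is None:
        # No metric: take no action, so minReplicas changes are not acted on
        return current_replicas
    raw = math.ceil(current_replicas * cpu_utilization / target)
    return max(min_replicas, min(max_replicas, raw))

# The situation from the question: 6 replicas, min patched to 1, metric unknown
print(desired_replicas(6, None, 60, 1, 10))

# The same HPA once metrics are available and the pods are idle (0% CPU)
print(desired_replicas(6, 0, 60, 1, 10))
```

In this model the patched minimum only takes effect once the metric becomes available, which is consistent with the replica count staying at 6 while TARGETS shows <unknown>.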

Autoscaling in Google Container Engine

I understand that Container Engine is currently in alpha and not yet complete.
From the docs I assume there is no auto-scaling of pods (e.g. depending on CPU load) yet, correct? I'd love to be able to configure a replication controller to automatically add pods (and VM instances) when the average CPU load reaches a defined threshold.
Is this somewhere on the near future roadmap?
Or is it possible to use the Compute Engine Autoscaler for this? (if so, how?)
As we work towards a Beta release, we're definitely looking at integrating the Google Compute Engine AutoScaler.
There are actually two different kinds of scaling:
Scaling the number of worker nodes in the cluster up and down depending on the number of containers in the cluster.
Scaling pods up and down.
Since Kubernetes is an OSS project as well, we'd also like to add a Kubernetes native autoscaler that can scale replication controllers. It's definitely something that's on the roadmap. I expect we will actually have multiple autoscaler implementations, since it can be very application specific...
Kubernetes autoscaling: http://kubernetes.io/docs/user-guide/horizontal-pod-autoscaling/
kubectl command: http://kubernetes.io/docs/user-guide/kubectl/kubectl_autoscale/
Example:
kubectl autoscale deployment foo --min=2 --max=5 --cpu-percent=80
You can autoscale your deployment by using kubectl autoscale.
Autoscaling means changing the number of pods automatically as demand changes.
kubectl autoscale deployment task2deploy1 --cpu-percent=50 --min=1 --max=10
kubectl get deployment task2deploy1
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
task2deploy1 1 1 1 1 49s
As resource consumption increases, the number of pods will increase and can exceed the number of pods you specified in your deployment.yaml file, but it will never exceed the maximum number of pods specified in the kubectl autoscale command.
kubectl get deployment task2deploy1
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
task2deploy1 7 7 7 3 4m
Similarly, as the resource consumption decreases, the number of pods will go down but never less than the number of minimum pods specified in the kubectl autoscale command.