How to know if autoscaling has taken place in Kubernetes

Is there any way to know that the number of pods has scaled up or down as a result of Horizontal Pod Autoscaling, apart from the kubectl get hpa command?
I want to trigger a particular file on every scale up or scale down of pods.

You can use the status field of the HPA to know when the HPA last scaled.
Details about this field can be found with the command below:
kubectl explain hpa.status
From this status, you can use the lastScaleTime field for your problem.
lastScaleTime <string>
last time the HorizontalPodAutoscaler scaled the number of pods; used by
the autoscaler to control how often the number of pods is changed.
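As a minimal sketch (the HPA name my-hpa, the 30-second polling interval and the script on-scale-event.sh are all hypothetical, not from the question), you could read lastScaleTime directly and trigger your file whenever it changes:
# read the last scale time from the HPA status
kubectl get hpa my-hpa -o jsonpath='{.status.lastScaleTime}'

# poll the value and run a script whenever it changes
last=""
while true; do
  now=$(kubectl get hpa my-hpa -o jsonpath='{.status.lastScaleTime}')
  if [ -n "$last" ] && [ "$now" != "$last" ]; then
    ./on-scale-event.sh   # the file you want to trigger
  fi
  last="$now"
  sleep 30
done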

Related

How do I reset a deployment's scale to use HPA after scaling down manually?

I have to turn off my service in production and turn it on again after a small period (doing a DB migration).
I know I can use kubectl scale deployment mydeployment --replicas=0. This service uses a HorizontalPodAutoscaler (HPA), so how would I go about resetting it to scale according to the HPA?
Thanks in advance :)
As suggested by Gari Singh, HPA will not scale from 0, so once you are ready to reactivate your deployment, just run kubectl scale deployment mydeployment --replicas=1 and HPA will then take over again.
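As a minimal sketch of that sequence, using the deployment name from the question:
# stop the service for the DB migration
kubectl scale deployment mydeployment --replicas=0

# ... run the migration ...

# bring one pod back; from a non-zero replica count the HPA takes over again
kubectl scale deployment mydeployment --replicas=1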
In Kubernetes, a HorizontalPodAutoscaler automatically updates a workload resource (such as a Deployment or StatefulSet), with the aim of automatically scaling the workload to match demand.
Horizontal scaling means that the response to increased load is to deploy more Pods. This is different from vertical scaling, which for Kubernetes would mean assigning more resources (for example: memory or CPU) to the Pods that are already running for the workload.
If the load decreases, and the number of Pods is above the configured minimum, the HorizontalPodAutoscaler instructs the workload resource (the Deployment, StatefulSet, or other similar resource) to scale back down.
Refer to the Kubernetes documentation on Horizontal Pod Autoscaling for more detailed information.

ReplicationController wait for pods to terminate

I'm currently learning Kubernetes and I'm facing a problem with trying to realize a concept using Kubernetes.
I'm looking for something that works like a ReplicationController where I can tell K8s to start 50 replicas. But when I reduce the number of replicas I need K8s to wait for the pods to terminate by themselves.
I know that there are Jobs, but from what I've read they don't seem to be a fitting solution, since Jobs are kind of a one-time thing. I need to keep the desired number of pods until I decrease it myself.
Basically a behavior like this:
You can use the kind Deployment; in the background it uses ReplicaSets, the updated replacement for the older ReplicationController.
You can set the desired number of replicas in the YAML file.
When you scale the deployment it will spin up that number of replicas, and when you want fewer you pass the new desired count.
For example:
kubectl scale deployment test-deployment --replicas=50
Now 50 replicas are running and you want to scale down:
kubectl scale deployment test-deployment --replicas=40
You can also check out the HPA
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
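As a sketch of declaring the desired replica count in the YAML itself (the labels and the nginx image are illustrative; only the deployment name matches the example above):
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-deployment
spec:
  replicas: 50          # desired number of pods
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      containers:
      - name: app
        image: nginx    # example image
EOF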

Doesn't Kubernetes honor HPA configuration when we execute "kubectl scale deploy"?

The Scenario:
I have deployed a service using a helm chart; I can see my service, hpa, deployment, pods etc.
In my hpa setting: the min pod count is set to 1.
I can see my Pod is running and able to handle service request.
After a while ---
I have executed -- "kubectl scale deploy --replicas=0"
Once I ran the above command I saw my pod get deleted (although the hpa min pod setting was set to 1). I was expecting that after a while the hpa would scale back up to the min pod count, i.e. 1.
However I don't see that happening; I have waited more than an hour and no new pod has been created by the hpa.
I have also tried sending a request to my Kubernetes service, thinking the hpa would now scale up the pod since there is no pod to serve the request; however the hpa doesn't seem to do that, and I got a response that my Service is not available.
Here is what I can see in kubectl get hpa
NAME   REFERENCE         TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
test   Deployment/xxxx   /1000%    1         4         0          1h
Interestingly, I found that the hpa scales down quickly: when I execute "kubectl scale deploy --replicas=2" (please note that the min pod count in the hpa is 1), I can see 2 pods get created quickly; however, within 5 minutes 1 pod gets removed by the hpa.
Is this the expected behavior of Kubernetes (particularly hpa)?
As in, if we delete all pods by executing "kubectl scale deploy --replicas=0",
a) the hpa won't block reducing the replica count below the min pod count configured (in the hpa config), and
b) the hpa won't scale up (on its next sync cycle) to the min number of pods as configured,
and essentially c) until we redeploy or execute another round of "kubectl scale deploy" to update the replica count, there will be no pods for this service.
Is this expected behavior or a (possible) bug in the Kubernetes codebase?
I am using Kubernetes version 1.8.
That was a great observation. I was going through the HPA documentation and came across the mathematical formula HPA uses to scale pods, and it looks like:
TargetNumOfPods = ceil(sum(CurrentPodsCPUUtilization) / Target)
In your case, the current pod utilization is zero because your pod count is zero, so mathematically this equation results in zero. This is the reason HPA does not work if the pod count is zero.
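To illustrate with made-up numbers (not from the question): with two pods each at 90% utilization against a 50% target, versus zero pods, the formula gives:
TargetNumOfPods = ceil((90 + 90) / 50) = ceil(3.6) = 4
TargetNumOfPods = ceil(0 / 50) = 0
So once the replica count reaches zero, there is nothing for the autoscaler to grow from.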
a: HPA should not block manual scaling of pods, as it is triggered only by resource metrics (cpu, memory etc). Once you scale using "kubectl scale" or by any other means, HPA comes back into the picture depending on the min replicas, max replicas and average utilization values.
b: HPA scales up to the min number of replicas if the current count is non-zero. I tried it and it works perfectly fine.
c: Yes, unless you bring the replica count to a non-zero value, HPA will not work. So you have to scale up to some non-zero value.
Hope this answers your doubts about HPA.

how to update max replicas in running pod

I'm looking to manually update my maximum number of replicas for autoscaling with the kubectl autoscale command.
However, each time I run the command it creates a new hpa that fails to launch the pod; why, I don't know at all :(
Do you have an idea how I can manually update my HPA with kubectl?
https://gist.github.com/zyriuse75/e75a75dc447eeef9e8530f974b19c28a
I think you are mixing two topics here: one is manually scaling a pod (you can do it through a deployment by applying kubectl scale deploy {mydeploy} --replicas={#repl}). On the other hand you have the HPA (Horizontal Pod Autoscaler); in order to use an HPA you should have a metrics provider configured in the cluster,
e.g:
metrics server
https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/metrics-server
heapster (deprecated) https://github.com/kubernetes-retired/heapster
Then you can create an HPA to handle your autoscaling; you can get more info at this link: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/
Once created, you can patch your HPA or delete it and create it again:
kubectl delete hpa hpa-pod -n ns-svc-cas
kubectl autoscale deployment {mydeploy} --min={#number} --max={#number} -n ns-svc-cas
This is the easiest way.
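Alternatively, as a sketch reusing the HPA name and namespace from the commands above (the new maximum of 10 is just an example), you can patch the existing HPA in place instead of recreating it:
# raise the maximum replica count on the existing HPA
kubectl patch hpa hpa-pod -n ns-svc-cas -p '{"spec":{"maxReplicas":10}}'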

Autoscaling in Google Container Engine

I understand the Container Engine is currently in alpha and not yet complete.
From the docs I assume there is no auto-scaling of pods (e.g. depending on CPU load) yet, correct? I'd love to be able to configure a replication controller to automatically add pods (and VM instances) when the average CPU load reaches a defined threshold.
Is this somewhere on the near future roadmap?
Or is it possible to use the Compute Engine Autoscaler for this? (if so, how?)
As we work towards a Beta release, we're definitely looking at integrating the Google Compute Engine AutoScaler.
There are actually two different kinds of scaling:
Scaling up/down the number of worker nodes in the cluster depending on # of containers in the cluster
Scaling pods up and down.
Since Kubernetes is an OSS project as well, we'd also like to add a Kubernetes native autoscaler that can scale replication controllers. It's definitely something that's on the roadmap. I expect we will actually have multiple autoscaler implementations, since it can be very application specific...
Kubernetes autoscaling: http://kubernetes.io/docs/user-guide/horizontal-pod-autoscaling/
kubectl command: http://kubernetes.io/docs/user-guide/kubectl/kubectl_autoscale/
Example:
kubectl autoscale deployment foo --min=2 --max=5 --cpu-percent=80
You can autoscale your deployment by using kubectl autoscale.
Autoscaling means the number of pods is modified automatically as demand requires.
kubectl autoscale deployment task2deploy1 --cpu-percent=50 --min=1 --max=10
kubectl get deployment task2deploy1
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
task2deploy1 1 1 1 1 49s
As resource consumption increases, the number of pods will increase and may exceed the number of pods you specified in your deployment.yaml file, but it will never exceed the maximum number of pods specified in the kubectl autoscale command.
kubectl get deployment task2deploy1
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
task2deploy1 7 7 7 3 4m
Similarly, as resource consumption decreases, the number of pods will go down, but never below the minimum number of pods specified in the kubectl autoscale command.
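Since kubectl autoscale names the created HorizontalPodAutoscaler after the target deployment, a quick way to inspect it (a sketch; output omitted) is:
kubectl get hpa task2deploy1
kubectl describe hpa task2deploy1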