How do I reset a deployment's scale to use HPA after scaling down manually? - kubernetes

I have to turn off my service in production and turn it on again after a small period (doing a DB migration).
I know I can use kubectl scale deployment mydeployment --replicas=0. This service uses a HorizontalPodAutoscaler (HPA), so how would I go about resetting it to scale according to the HPA?
Thanks in advance :)

As Gari Singh suggested, the HPA will not scale from 0, so once you are ready to reactivate your deployment, just run kubectl scale deployment mydeployment --replicas=1 and the HPA will then take over again.
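The scale-down and scale-up sequence described above could be wrapped in a small shell helper. This is only a sketch: the deployment name mydeployment comes from the question, while the function name pause_for_migration and the migration command passed to it are hypothetical, and it assumes an HPA already targets the deployment.

```shell
# Hypothetical helper: stop a deployment, run a migration step, then hand
# control back to the HPA by scaling to a non-zero replica count.
# Usage: pause_for_migration <deployment> <migration-command...>
pause_for_migration() {
  deployment="$1"
  shift

  # At 0 replicas the HPA stops adjusting the target.
  kubectl scale deployment "$deployment" --replicas=0 || return 1

  # Run the caller-supplied command (placeholder for the DB migration).
  "$@" || return 1

  # A non-zero count reactivates the HPA, which then takes over scaling again.
  kubectl scale deployment "$deployment" --replicas=1
}
```

It might be called as pause_for_migration mydeployment ./run-migration.sh, where run-migration.sh stands in for whatever performs the migration. Afterwards, kubectl get hpa shows whether the autoscaler has resumed adjusting the replica count.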
In Kubernetes, a HorizontalPodAutoscaler automatically updates a workload resource (such as a Deployment or StatefulSet), with the aim of automatically scaling the workload to match demand.
Horizontal scaling means that the response to increased load is to deploy more Pods. This is different from vertical scaling, which for Kubernetes would mean assigning more resources (for example: memory or CPU) to the Pods that are already running for the workload.
If the load decreases, and the number of Pods is above the configured minimum, the HorizontalPodAutoscaler instructs the workload resource (the Deployment, StatefulSet, or other similar resource) to scale back down.
Refer to the Kubernetes documentation on Horizontal Pod Autoscaling for more detailed information.

Related

Does GKE Autopilot autoscale both pods and nodes?

When I change replicas: x in my .yaml file, I can see GKE Autopilot boot pods up or down depending on the value, but what happens if the load on my deployment gets too big? Will it then autoscale the number of pods and nodes to handle the traffic, and then reduce back to the value specified in replicas when the request load drops again?
I'm basically asking: how does Autopilot horizontal autoscaling work?
And how do I get a minimum of 2 pod replicas that can horizontally autoscale in Autopilot?
GKE autopilot by default will not scale the replicas count beyond what you specified. This is the default behavior of Kubernetes in general.
If you want automatic autoscaling, you have to use the Horizontal Pod Autoscaler (HPA), which is supported in Autopilot.
If you deploy HPA to scale up and down your workload, Autopilot will scale up and down the nodes automatically and that's transparent for you as the nodes are managed by Google.
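As a rough illustration of the point above, an HPA with a minimum of 2 replicas could be created with kubectl autoscale. The helper name autoscale_deployment is hypothetical, and the maximum and CPU target in the usage note are assumed values, not recommendations. The sketch also assumes the containers have CPU requests set, since the utilization target is a percentage of the request.

```shell
# Hypothetical helper: create an HPA that keeps a deployment between MIN and
# MAX replicas, targeting an average CPU utilization in percent.
# Usage: autoscale_deployment <deployment> <min> <max> <cpu-percent>
autoscale_deployment() {
  # Reject an inverted range before calling kubectl.
  if [ "$2" -gt "$3" ]; then
    echo "min replicas must not exceed max replicas" >&2
    return 1
  fi
  kubectl autoscale deployment "$1" --min="$2" --max="$3" --cpu-percent="$4"
}
```

For example, autoscale_deployment mydeployment 2 10 70 would ask for at least 2 and at most 10 replicas at roughly 70% average CPU utilization. Once the HPA exists, it manages the replica count, so the replicas value in the .yaml file is no longer the number to tune.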
GKE autoscales only the nodes by default; scaling your deployment with an HPA is up to you.
Autopilot: GKE provisions and manages the cluster's underlying infrastructure, including nodes and node pools, giving you an optimized cluster with a hands-off experience.
For the deployment itself, we need to configure both scaling options, VPA and HPA.
Pre-configured: Autopilot handles all the scaling and configuring of your nodes.
Default: You configure Horizontal Pod Autoscaling (HPA) and Vertical Pod Autoscaling (VPA).
GKE manages scaling the nodes in your node pools up and down, so you don't have to worry about the infrastructure; you just deploy the application with HPA and VPA autoscaling.
You can read more about the options here: https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview#comparison

ReplicationController wait for pods to terminate

I'm currently learning Kubernetes and I'm facing a problem with trying to realize a concept using Kubernetes.
I'm looking for something that works like a ReplicationController where I can tell K8s to start 50 replicas. But when I reduce the amount of replicas I need K8s to wait for the pods to terminate by themselves.
I know that there are Jobs but from what I've read it doesn't seem to be the fitting solution, since jobs are kind of a one-time thing. I need to keep the amount of desired pods until I decrease the amount of desired pods.
Basically a behavior like this:
You can use kind: Deployment, which uses ReplicaSets in the background.
ReplicationController is the older mechanism, while ReplicaSet is the updated approach and is what a Deployment manages.
You can set the desired number of replicas in the YAML file.
When you scale the deployment, it spins up that number of replicas, and when you want pods terminated you can pass the desired replica count again.
For example:
kubectl scale deployment test-deployment --replicas=50
Now 50 replicas are running and you want to scale down:
kubectl scale deployment test-deployment --replicas=40
You can also check out the HPA
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/

Azure Kubernetes Service - can the Cluster Autoscaler get triggered even if I don't set autoscaling explicitly?

I am deploying a service to Azure Kubernetes Service.
The Horizontal Pod Autoscaler scales the number of pods, whereas the Cluster Autoscaler scales the number of nodes based on the number of pending pods. If my understanding is correct, if I don't set up autoscaling in my deployment file, the HPA won't get triggered, and only one pod will run; therefore, the CA won't get triggered either.
My question is - is there a scenario in AKS where the CA would get triggered, even without setting autoscaling in my deployment file?
Cluster autoscaler is typically used together with the horizontal pod autoscaler. The Horizontal Pod Autoscaler increases or decreases the number of pods based on application demand, and the cluster autoscaler adjusts the number of nodes as needed to run those additional pods accordingly.
If your deployment cannot scale up or down automatically via the HPA, and you do not manually increase the number of pods to the point where additional pods cannot run due to insufficient resources on your nodes, then the CA would not be triggered. So the answer is no.
You might also find the official Azure documentation on this helpful.
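Since the Cluster Autoscaler reacts to pods that cannot be scheduled, one rough way to see whether it has anything to react to is to count pods in the Pending phase. The helper name count_pending_pods is hypothetical. Pending also covers pods that are already scheduled but still starting, so treat the number as a hint rather than proof that the CA will add nodes.

```shell
# Hypothetical helper: count pods currently in the Pending phase across all
# namespaces. With no matches, kubectl prints its notice on stderr, which is
# discarded here, so the count is 0.
count_pending_pods() {
  kubectl get pods --all-namespaces \
    --field-selector=status.phase=Pending --no-headers 2>/dev/null \
    | wc -l | tr -d ' '
}
```

A result of 0 while only one pod of the deployment is running is consistent with the answer above: nothing is waiting for capacity, so the CA has no reason to scale up.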

What is the relationship between the HPA and ReplicaSet in Kubernetes?

I can't seem to find an answer to this, but what is the relationship between an HPA and a ReplicaSet? From what I know, we define a Deployment object which defines replicas, which creates the RS, and the RS is responsible for supervising our pods and scaling them up and down.
Where does the HPA fit into this picture? Does it wrap over the Deployment object? I'm a bit confused as you define the number of replicas in the manifest for the Deployment object.
Thank you!
When we create a deployment, it creates a ReplicaSet and the number of pods we gave in replicas. The Deployment controls the RS, and the RS controls the pods. The HPA is another abstraction that gives instructions to the Deployment, and the RS then makes sure the pods fulfill the requested scale.
According to the k8s docs: The Horizontal Pod Autoscaler automatically scales the number of Pods in a replication controller, deployment, replica set or stateful set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics). Note that Horizontal Pod Autoscaling does not apply to objects that can't be scaled, for example, DaemonSets.
A brief high-level overview: it's all about controllers. Every k8s object has a controller. When a deployment object is created, its controller creates the RS and the associated pods; the RS controls the pods, and the deployment controls the RS. The HPA controller, on the other hand, watches the observed metrics, and when they call for more or fewer pods than are currently running, it updates the deployment's replica count.
Read more in the k8s docs.
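To make the HPA's role concrete, here is a sketch of the replica calculation described in the Kubernetes documentation, desired = ceil(current replicas * current metric / target metric), clamped between minReplicas and maxReplicas. The function name hpa_desired_replicas is hypothetical, the inputs are whole numbers only, and the real controller also applies a tolerance and stabilization behavior that are left out here.

```shell
# Sketch of the documented HPA calculation, integer inputs only:
#   desired = ceil(current_replicas * current_metric / target_metric)
# then clamped to [min_replicas, max_replicas].
# Usage: hpa_desired_replicas <current> <metric> <target> <min> <max>
hpa_desired_replicas() {
  current="$1"; metric="$2"; target="$3"; min="$4"; max="$5"

  # A target scaled to 0 is left alone: the HPA does not scale up from 0.
  if [ "$current" -eq 0 ]; then
    echo 0
    return 0
  fi

  # Integer ceiling division.
  desired=$(( (current * metric + target - 1) / target ))

  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}
```

With 4 replicas at 90% average CPU against a 60% target, hpa_desired_replicas 4 90 60 2 10 prints 6. The HPA writes that number to the Deployment's replica count, and the Deployment and ReplicaSet do the rest.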

Kubernetes scale down particular pod

I have a Kubernetes deployment which can have multiple replica pods. I wish to horizontally increase and decrease the pods based on some logic in my Python application (not custom metrics in the HPA).
I have two ways to do this:
Using a Horizontal Pod Autoscaler and changing minReplicas and maxReplicas through my application by using the Kubernetes APIs
Directly updating the "/spec/replicas" field in my deployment using the APIs
Both the above things are working for upscale and downscale.
But, when I scale down, I want to remove a particular Pod, and not any other pod.
If I update minReplicas and maxReplicas in the HPA, then it randomly deletes a pod.
Same when I update the /spec/replicas field in the deployment.
How can I delete a particular pod while scaling down?
I am not aware of any way to ensure that a particular pod in a ReplicaSet will be deleted during a scale down. You could achieve this behavior with a StatefulSet which will always delete the last pod on scale down.
For example, if we had a StatefulSet foo that was scaled to 3 we would have pods:
foo-0
foo-1
foo-2
And if we scaled the StatefulSet to 2, the controller would delete foo-2. But note that there are other limitations to be aware of with StatefulSet.
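The ordering described above can be sketched as a small shell function that lists which pods a StatefulSet would remove on scale down, highest ordinal first. The function name statefulset_removed_pods is hypothetical. It only models the naming and ordering and does not talk to a cluster.

```shell
# Print the pods a StatefulSet would remove when scaled from FROM to TO
# replicas. Pods are named <name>-<ordinal> and removed highest ordinal first.
# Usage: statefulset_removed_pods <name> <from> <to>
statefulset_removed_pods() {
  name="$1"
  i=$(( $2 - 1 ))
  while [ "$i" -ge "$3" ]; do
    echo "$name-$i"
    i=$(( i - 1 ))
  done
}
```

For the example in the answer, statefulset_removed_pods foo 3 2 prints foo-2. The actual scale down would be done with kubectl scale statefulset foo --replicas=2. The pod you want removed therefore has to be the one with the highest ordinal, which is one of the limitations to keep in mind.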