How to set min and max replicas at the Deployment level in Kubernetes?

Can we set min and max replica limits at the Deployment level, rather than at the cluster or ReplicaSet level, in Kubernetes?

This is not possible on the Deployment itself, but you can do it indirectly with a HorizontalPodAutoscaler (HPA for short):
HPA automatically updates a workload resource (such as a Deployment or
StatefulSet), with the aim of automatically scaling the workload to
match demand.
Example code for HPA:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
More information can be found in the Kubernetes documentation.

At the Deployment level there is only the replicas attribute. When you define an HPA, you can set minReplicas and maxReplicas.

Related

Kubernetes min replica count best practices

Will setting the minimum number of replicas higher than 1 (let's say starting with 2) in one of the code snippets below increase resiliency and availability in production environments?
What are the best practices when setting this value?
I'm new to Kubernetes and was not able to find an answer to this question.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
  namespace: test
  labels:
    app: test
spec:
  replicas: 1
  ...

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: test-hpa
  namespace: test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test
  minReplicas: 1
The short answer is 'yes', but there are many cases where the answer would be 'no'. For instance, if all instances of the pod are scheduled on the same node and that node dies, there will be some downtime before the pods start up again on an available node.
You can configure topologySpreadConstraints with a per-node topologyKey (for example the well-known node label kubernetes.io/hostname), as documented here https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/ to raise your resilience level.
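As a minimal sketch of that advice, the Deployment from the question could spread two replicas across nodes as shown below. The image name and the choice of DoNotSchedule are assumptions made for illustration; kubernetes.io/hostname is the standard per-node label.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
  namespace: test
  labels:
    app: test
spec:
  replicas: 2                       # at least two, so one node failure leaves a pod running
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      topologySpreadConstraints:
      - maxSkew: 1                  # pod counts per node may differ by at most 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule   # assumption; ScheduleAnyway is the softer option
        labelSelector:
          matchLabels:
            app: test
      containers:
      - name: test
        image: registry.example.com/test:1.0   # hypothetical image
```

If an HPA targets this Deployment, the HPA manages the replica count, so its minReplicas is the value to raise to 2.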

Kubernetes - when to use HorizontalPodAutoscaler resource type?

As mentioned in this answer, Deployments allow for easy updating of a ReplicaSet as well as the ability to roll back to a previous deployment.
So, kind: Deployment scales ReplicaSets, which scale Pods, and it supports zero-downtime updates by creating and destroying ReplicaSets.
What is the purpose of the HorizontalPodAutoscaler resource type?
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: xyz
spec:
  maxReplicas: 4
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: xyz
  targetCPUUtilizationPercentage: 70
As you write, with a Deployment it is easy to manually scale an app horizontally by changing the number of replicas.
By using a HorizontalPodAutoscaler, you can automate the horizontal scaling, e.g. by configuring metric thresholds; hence the name autoscaler.
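To make the "metric threshold" concrete, here is a sketch of the same HPA written against autoscaling/v2, where the CPU target moves into a metrics list. The comments describe the scaling rule from the Kubernetes documentation; the numbers in the worked example are illustrative.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: xyz
spec:
  minReplicas: 2
  maxReplicas: 4
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: xyz
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # percent of each pod's CPU request
# Scaling rule: desired = ceil(current * currentUtilization / targetUtilization),
# then clamped to [minReplicas, maxReplicas].
# Example: 2 replicas averaging 105% of their request -> ceil(2 * 105 / 70) = 3.
```

Utilization is measured relative to the containers' CPU requests, so the target pods need requests set for this metric to work.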

Kubernetes HPA for deployment A and VPA for deployment B

The VPA documentation states that HPA and VPA should not be used together. They can only be used together when the HPA scales on custom metrics.
I have scaling enabled on CPU.
My question is: can I have HPA enabled for one Deployment (let's say A) and VPA enabled for another Deployment (let's say B)? Or will this also lead to errors?
Using them both at the same time on the same workload is not recommended: if both detect that more memory is needed, they may try to resolve the same problem at the same time, which leads to wrongly allocated resources.
Which autoscaler manages a workload is not set on the Deployment itself. Instead, each autoscaler object names the Deployment it scales, via targetRef for a VPA and scaleTargetRef for an HPA, so A and B can each have their own autoscaler.
So for the Deployment app1 you can specify a VPA:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app1-vpa
spec:
  targetRef:            # the VPA uses targetRef, not scaleTargetRef
    apiVersion: apps/v1
    kind: Deployment
    name: app1
And for app2 you can specify an HPA:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app2-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app2
  maxReplicas: 10       # required field; illustrative value
If you need to use HPA and VPA together on the same Deployment, you just have to make sure that they base their behavior on different metrics. This way you prevent them from reacting to the same event. To summarize, VPA and HPA can be used together if the HPA config does not use CPU or memory to determine its targets, as stated in the documentation:
"Vertical Pod Autoscaler should not be used with the Horizontal
Pod Autoscaler (HPA) on CPU or memory at this moment"
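A hedged sketch of an HPA that would satisfy this rule is shown below: it scales on a per-pod custom metric instead of CPU or memory. The metric name http_requests_per_second, the target value, and the replica bounds are hypothetical, and the sketch assumes a custom metrics adapter (for example Prometheus Adapter) is installed to expose that metric to the HPA.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app1-hpa                         # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app1                           # same Deployment the VPA targets
  minReplicas: 2                         # illustrative bounds
  maxReplicas: 10
  metrics:
  - type: Pods                           # per-pod custom metric, not cpu/memory
    pods:
      metric:
        name: http_requests_per_second   # hypothetical metric name
      target:
        type: AverageValue
        averageValue: "100"              # illustrative: 100 requests/s per pod
```

With this split, the VPA adjusts CPU and memory requests while the HPA reacts only to request rate, so the two do not respond to the same signal.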

How to autoscale Kubernetes Pods based on average memory usage in EKS?

I am running an EKS cluster and I have a HorizontalPodAutoscaler created for autoscaling the number of pods based on average CPU utilisation.
How can I do the same for average memory utilization?
Suppose the pods running in the EKS cluster use, on average, 70% of the memory they are allocated (using resources); then the deployment should be autoscaled.
How to do this? Is creating a custom metric in CloudWatch the only way?
Even if CloudWatch is the only way, how to do that? Is there specific documentation, or a tutorial or blog, that does this?
Please try the HPA configuration object below.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
Apply the object using kubectl apply. Memory is a built-in resource metric, so a custom CloudWatch metric is not required as long as the Kubernetes Metrics Server is installed in the cluster.
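Memory utilization is computed as a percentage of the containers' memory requests, so the target Deployment needs those requests set. The following is a sketch of what that might look like for the nginx Deployment; the labels, image, and request size are assumptions for illustration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  selector:
    matchLabels:
      app: nginx                # assumed label
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx            # placeholder image
        resources:
          requests:
            memory: 256Mi       # illustrative; the 70% target is measured against this
```

With a 256Mi request, the HPA above would start adding replicas once average usage across the pods exceeds roughly 179Mi.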

YAML File to Achieve HPA and Autoscaler

YAML File for Horizontal Pod Autoscaler & Cluster Autoscaler
I have a cluster, ss1, which is split into two agent pools: Pool1 and Pool2. I need an HPA for the Pool2-Worker pods, which run on Pool2, together with the cluster autoscaler running on Pool2. I need to achieve this via a YAML file. Is there any way to configure both the HPA and the Cluster Autoscaler in a single YAML file? Any helper files to achieve this would be appreciated.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: test-app
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
For more, you can also visit the official Kubernetes documentation.