The VPA documentation states that HPA and VPA should not be used together. They can only be used together when the HPA scales on custom metrics.
I have scaling enabled on CPU.
My question is: can I have HPA enabled for one deployment (let's say A) and VPA enabled for another deployment (let's say B)? Or will this also lead to errors?
Using them both at the same time is not suggested, because if they both detect that more memory is needed they may try to resolve the same problem at the same time, which will lead to wrongly allocated resources.
This is not something that can be specified at the application deployment level, but you can specify which deployment HPA and VPA should each scale using targetRef.
So for the deployment app1 you can specify a VPA:
apiVersion: autoscaling.k8s.io/v1beta2
kind: VerticalPodAutoscaler
metadata:
  name: app1-vpa
spec:
  targetRef:          # the VPA API uses targetRef (not scaleTargetRef)
    apiVersion: apps/v1
    kind: Deployment
    name: app1
And for app2 you can specify an HPA:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: app2-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app2
  minReplicas: 1
  maxReplicas: 10      # maxReplicas is a required field
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
If you need to use HPA and VPA together on the same deployment, you just have to make sure that they base their behavior on different metrics. This way you prevent them from reacting to the same event. To summarize: VPA and HPA can be used together if the HPA config doesn't use CPU or memory to determine its targets, as stated in the documentation:
"Vertical Pod Autoscaler should not be used with the Horizontal
Pod Autoscaler (HPA) on CPU or memory at this moment"
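For illustration, here is a minimal sketch of such a combination: an HPA driven by a custom Pods metric, so it does not compete with a VPA managing CPU and memory on the same deployment. The metric name and target value are assumptions, and serving them requires a custom metrics adapter in the cluster.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: app1-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app1
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second   # hypothetical custom metric
      target:
        type: AverageValue
        averageValue: "100"              # assumed target; tune to your app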
Related
Can we set min and max limits for deployments at the deployment level, not at the cluster or replica-set level, in Kubernetes?
At the deployment level it is not possible, but there is an option to do this indirectly. You should use a HorizontalPodAutoscaler (HPA for short):
HPA automatically updates a workload resource (such as a Deployment or
StatefulSet), with the aim of automatically scaling the workload to
match demand.
Example code for HPA:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
More information can be found in the Kubernetes documentation.
At the deployment level only the replicas attribute is available. When you define an HPA, there are options for min and max replicas.
As mentioned in this answer, Deployments "allow for easy updating of a Replica Set as well as the ability to roll back to a previous deployment."
So kind: Deployment scales ReplicaSets, which scale Pods, and it supports zero-downtime updates by creating and destroying ReplicaSets.
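To make the relationship concrete, here is a minimal sketch of a Deployment (all names and the image are placeholders): the replicas field below is exactly what manual horizontal scaling changes, and what an HPA adjusts automatically.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: xyz
spec:
  replicas: 3                # manual horizontal scale; an HPA would adjust this
  selector:
    matchLabels:
      app: xyz
  template:
    metadata:
      labels:
        app: xyz
    spec:
      containers:
      - name: app
        image: nginx         # placeholder image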
What is the purpose of the HorizontalPodAutoscaler resource type?
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: xyz
spec:
  maxReplicas: 4
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: xyz
  targetCPUUtilizationPercentage: 70
As you write, with a Deployment it is easy to manually scale an app horizontally by changing the number of replicas.
By using a HorizontalPodAutoscaler, you can automate the horizontal scaling, e.g. by configuring metric thresholds; hence the name autoscaler.
I am running an EKS cluster, and I have a HorizontalPodAutoscaler created to autoscale the number of pods based on average CPU utilization.
How do I do the same for average memory utilization?
Suppose all of the pods running in an EKS cluster have used, on average, 70% of the memory they are allocated (via resources requests); then the deployment should be autoscaled.
How do I do this? Is creating a custom metric in CloudWatch the only way?
Even if CloudWatch is the only way, how do I do that? Is there specific documentation, a tutorial, or a blog post that covers this?
Try the HPA configuration object below.
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1     # Deployments are served from apps/v1 on current clusters
    kind: Deployment
    name: nginx
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: 70
and apply the object using kubectl apply. Note that memory is a standard resource metric served by the metrics-server, so a custom CloudWatch metric is not required for this.
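If your cluster serves the stable autoscaling/v2 API (Kubernetes 1.23+), the same memory target is written with a nested target block instead of targetAverageUtilization; a sketch under that assumption:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization        # percentage of the pods' memory requests
        averageUtilization: 70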
I am using Kubernetes in GCP. I am scaling my pods using a queue-size metric uploaded to Cloud Monitoring.
The problem:
Kubernetes scales up pods at very short intervals, about 12-15 seconds between each scale-up. My machines take about 30 seconds to boot, so I would like the scale-up interval to be close to 30 seconds.
Adding
spec:
  minReadySeconds: 30
to the deployment YAML did not work.
Example hpa:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: <NAME>
  namespace: <NAMESPACE>
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: <DEPLOYMENT>
  minReplicas: <MIN_REPLICAS>
  maxReplicas: <MAX_REPLICAS>
  metrics:
  - type: External
    external:
      metricName: "custom.googleapis.com|rabbit_mq|<QUEUE>|messages_count"
      metricSelector:
        matchLabels:
          metric.labels.name: <NAMESPACE>
      targetValue: <TARGETVALUE>
Is there a way to control this scale-up interval?
The delays between scale-ups are determined internally by the HPA algorithm.
From the documentation:
Starting from v1.12, a new algorithmic update removes the need for the upscale delay.
It seems this was a configurable parameter before, but now the algorithm tries to be smart about it and decides on its own how quickly to scale up your app.
To be really sure how the HPA does it and how you could influence it, you can inspect the code.
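That said, if your cluster is on Kubernetes 1.18 or newer, the autoscaling/v2beta2 API later added a configurable behavior section that can slow scale-ups down; a sketch under that assumption (all names are placeholders):
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa          # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example            # placeholder deployment
  minReplicas: 1
  maxReplicas: 10
  # metrics omitted for brevity; reuse the External metric from the question
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 30   # consider a 30s window before scaling up again
      policies:
      - type: Pods
        value: 4                       # add at most 4 pods ...
        periodSeconds: 30              # ... per 30-second period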
I am fairly new to the Kubernetes engine and I have a use case that I can't seem to get working. I want each pod to run on its own dedicated node and then autoscale the cluster.
For now I have tried using a DaemonSet to run each pod, and I have created a HorizontalPodAutoscaler targeting the node pool.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: test
spec:
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      containers:
      - name: actions
        image: image_link
      nodeSelector:
        cloud.google.com/gke-nodepool: test
  updateStrategy:
    type: RollingUpdate
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: test
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: DaemonSet
    name: test
  minReplicas: 1
  maxReplicas: 2
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 80
I then use the stress utility to test the autoscaling process, but the number of nodes stays constant. Is there something I am missing here? Is there another component I can use for my use case?
HorizontalPodAutoscaler is used to scale pods depending on metric thresholds; it is not applicable to DaemonSets.
A DaemonSet deploys one pod on each node in the cluster. If you want to scale a DaemonSet, you need to scale your node pool.
HorizontalPodAutoscaler is best used to autoscale Deployment objects. In your case, change the DaemonSet object to a Deployment object (sketched below) or scale out the node pool. Autoscaling of nodes is supported on Google Cloud Platform; I am not sure about other cloud providers, so check your cloud provider's documentation.
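As a rough sketch of that suggestion, the DaemonSet above could become a Deployment pinned to the same node pool. The pod anti-affinity rule is my assumption about how to keep the one-pod-per-node property: it prevents two replicas from landing on the same node.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
spec:
  replicas: 1                  # the HPA can now scale this up and down
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      containers:
      - name: actions
        image: image_link
      nodeSelector:
        cloud.google.com/gke-nodepool: test
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: test
            topologyKey: kubernetes.io/hostname   # one replica per node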
A DaemonSet is a controller which deploys a pod on each node matching the selector expression; you can't have more than one such pod running on each node. You should look at another controller. I could not see what kind of app you want to deploy, but I would suggest:
Deployment: if you want a stateless application which can scale up and down without consistency requirements between replicas
StatefulSet: if you want a stateful application which needs some care when scaling as well as data consistency
One important thing to notice about the HPA is that you must have metrics enabled (for example via the metrics-server); otherwise the reconciliation loop cannot observe the metrics it needs to decide on scale actions.
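Also note that Resource-type utilization metrics (like the 80% CPU target above) are computed as a percentage of the containers' resource requests, so the pod template must declare them. A sketch of the fragment to add to the pod spec (the values are placeholders):
    spec:
      containers:
      - name: actions
        image: image_link
        resources:
          requests:
            cpu: 100m          # utilization is measured against this request
            memory: 128Mi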