I'm using a k8s HPA template for CPU and memory like below:
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: {{.Chart.Name}}-cpu
  labels:
    app: {{.Chart.Name}}
    chart: {{.Chart.Name}}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{.Chart.Name}}
  minReplicas: {{.Values.hpa.min}}
  maxReplicas: {{.Values.hpa.max}}
  targetCPUUtilizationPercentage: {{.Values.hpa.cpu}}
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: {{.Chart.Name}}-mem
  labels:
    app: {{.Chart.Name}}
    chart: {{.Chart.Name}}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{.Chart.Name}}
  minReplicas: {{.Values.hpa.min}}
  maxReplicas: {{.Values.hpa.max}}
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageValue: {{.Values.hpa.mem}}
Having two separate HPAs means that any new pod spun up when the memory HPA limit is hit is immediately terminated by the CPU HPA, because the pods' CPU usage is below the CPU scale-down threshold.
It always terminates the newest pod, which keeps the older pods around and triggers the memory HPA again, causing an infinite loop.
Is there a way to instruct the CPU HPA to terminate pods with higher usage rather than the newest pods every time?
As suggested in the comments, using a single HPA solved my issue. I just had to move the CPU HPA to the same apiVersion as the memory HPA.
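A minimal sketch of what the merged template could look like, reusing the value names from the question (.Values.hpa.min, .Values.hpa.max, .Values.hpa.cpu, .Values.hpa.mem). Two things here are assumptions: the HPA name without the -cpu/-mem suffix, and that .Values.hpa.mem holds an absolute quantity such as 500Mi, which is why the memory target uses AverageValue. If it holds a percentage instead, Utilization with averageUtilization would be the matching pair.

```yaml
# Sketch only: one HPA carrying both metrics, so a single controller
# decides the replica count for the Deployment.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: {{.Chart.Name}}          # assumed name; the original used -cpu and -mem suffixes
  labels:
    app: {{.Chart.Name}}
    chart: {{.Chart.Name}}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{.Chart.Name}}
  minReplicas: {{.Values.hpa.min}}
  maxReplicas: {{.Values.hpa.max}}
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: {{.Values.hpa.cpu}}
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue       # assumes hpa.mem is a quantity like 500Mi
        averageValue: {{.Values.hpa.mem}}
```

With both metrics in one object, the controller computes a replica count per metric and uses the largest, so low CPU usage alone no longer scales down pods that the memory metric still requires.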
Autoscaling based on multiple metrics/custom metrics:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue
        averageValue: 100Mi
When created, the Horizontal Pod Autoscaler monitors the nginx Deployment for average CPU utilization, average memory utilization, and (if you uncommented it) the custom packets_per_second metric. The Horizontal Pod Autoscaler autoscales the Deployment based on the metric whose value would create the larger autoscale event.
https://cloud.google.com/kubernetes-engine/docs/how-to/horizontal-pod-autoscaling#kubectl-apply
Related
I'm trying to understand how the Kubernetes HorizontalPodAutoscaler works.
Until now, I have used the following configuration:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-deployment
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
This uses the targetCPUUtilizationPercentage parameter, but I would like to use a metric for the percentage of memory used, and I was not able to find any example.
Any hint?
I also found that there is this type of configuration to support multiple metrics, but the apiVersion is autoscaling/v2alpha1. Can this be used in a production environment?
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2alpha1
metadata:
  name: WebFrontend
spec:
  scaleTargetRef:
    kind: ReplicationController
    name: WebFrontend
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 80
  - type: Object
    object:
      target:
        kind: Service
        name: Frontend
      metricName: hits-per-second
      targetValue: 1k
Here is an example manifest for what you need, including memory metrics:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: web-servers
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-servers
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 20
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue
        averageValue: 30Mi
Note that it uses the autoscaling/v2beta2 API version, so you need to follow all the previous instructions listed here.
Regarding the possibility of using autoscaling/v2alpha1: yes, you can use it, as it includes support for scaling on memory and custom metrics as this URL specifies, but keep in mind that alpha versions are released for testing and are not final versions.
For more autoscaling/v2beta2 YAML examples and a deeper look into memory metrics, you can take a look at this thread.
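Since the question asks for a memory percentage specifically, here is a sketch of the same kind of manifest with memory expressed as a utilization target instead of an absolute value. The 70 is an illustrative number, not a recommendation. The percentage is computed relative to the containers' memory requests, so this assumes the pods of the target Deployment set resources.requests.memory.

```yaml
# Sketch only: memory target given as a percentage of the requested memory.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: web-servers
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-servers
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70   # illustrative value; percent of resources.requests.memory
```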
This is my hpa yaml file:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: mysql-hpa
spec:
  maxReplicas: 2
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: mysql
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
The problem is that when I send requests to my app with JMeter, the HPA creates a second pod but the traffic is not shared between both pods, except a few times!
You can see it in the photos below.
If I create the pod with 2 replicas (via the YAML file) without the HPA, traffic is divided normally!
Any idea?
I have another pod with 12 containers and the HPA works fine.
I am currently trying to set up a GKE cluster and to configure a HorizontalPodAutoscaler based on a custom metric (GPU consumption).
I have two node pools and I want to scale them horizontally based on the average GPU consumption of each node pool. I have configured two identical HPAs like this:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: ner
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ner
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: kubernetes.io|container|accelerator|duty_cycle
      target:
        type: AverageValue
        averageValue: 60
where I only change the scaleTargetRef, but it turns out that this metric seems to be aggregated at the cluster level. I have double-checked that each scaleTargetRef is properly defined.
Is there a way to filter the metrics by container_name or node_pool? Any other suggestion would be awesome!
So I think you are looking for metrics for your k8s cluster, specifically filtered by container_name or node_pool.
You have five types of metrics you can use in an HPA object (autoscaling/v2beta2):
kubectl explain HorizontalPodAutoscaler.spec.metrics.type --api-version=autoscaling/v2beta2
Edit update
ContainerResource
External # Use this if the metrics not related to Kubernetes objects.
Object
Pods
Resource
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: ner
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ner
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: ContainerResource
    containerResource:
      name: gpu
      container: your-application-container
      target:
        type: Utilization
        averageUtilization: 60
Edit update
For GKE, see Autoscaling Deployments with Cloud Monitoring metrics.
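As a hedged sketch of how filtering might look with the External metric from the question: the v2beta2 API lets an external metric carry a label selector (metric.selector). The label key used below, resource.labels.container_name, is an assumption based on how Cloud Monitoring labels container resources; check the labels actually attached to the metric in your project before relying on it. The value ner is simply the Deployment name from the question reused as a hypothetical container name.

```yaml
# Sketch only: External metric narrowed down with a label selector.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: ner
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ner
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: kubernetes.io|container|accelerator|duty_cycle
        selector:
          matchLabels:
            # Assumed label key and hypothetical value; verify in Cloud Monitoring.
            resource.labels.container_name: ner
      target:
        type: AverageValue
        averageValue: 60
```

Each of the two HPAs would then use its own selector value, so that each one only sees the series belonging to its own workload.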
I am deploying a microservice application that I built with Node.
Issue
The pods won't autoscale when I apply load with JMeter. CPU utilization goes to 50m, which doesn't cause the HPA to start autoscaling. I want it to start replicating as soon as it reaches 80% of the CPU request (which is 10m).
HPA config:
# apiVersion: autoscaling/v1
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: client-hpa
  namespace: default
spec:
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  scaleTargetRef:
    # apiVersion: apps/v1beta1
    apiVersion: apps/v1
    kind: Deployment
    name: client-depl
Deployment config:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: client-depl
spec:
  replicas: 1
  selector:
    matchLabels:
      app: client
  template:
    metadata:
      labels:
        app: client
    spec:
      containers:
      - name: client
        image: <docker-id>/<image-name>
        resources:
          requests:
            memory: 350Mi
            cpu: 10m ### I want it to autoscale when it reaches 8m ###
Also, kubectl get hpa shows the following output:
$ kubectl get hpa
NAME         REFERENCE                TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
client-hpa   Deployment/client-depl   <unknown>/80%   1         4         1          8m32s
The HPA is based on the following equation:
desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
As you can see, the equation is driven by the current CPU utilization rather than by the CPU request alone (the question asks to autoscale when usage reaches 8m).
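As a worked example with the numbers from the question, and assuming the HPA can actually read the metric (the <unknown> in the kubectl get hpa output suggests it currently cannot): for a Utilization target, the current metric value is the average CPU usage expressed as a percentage of the container's CPU request. With 50m of usage against a 10m request, one current replica, and a target of 80%:

```latex
\begin{align*}
\text{currentMetricValue} &= \frac{50\,\text{m}}{10\,\text{m}} \times 100\% = 500\% \\
\text{desiredReplicas} &= \left\lceil 1 \times \frac{500}{80} \right\rceil
                        = \lceil 6.25 \rceil = 7
\end{align*}
```

The result is then clamped to the configured bounds, so with maxReplicas: 4 the Deployment would stop at 4 replicas.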
That said, the following may be of interest to you:
Vertical Pod Autoscaling
Is it possible to keep the 'cpu' and 'memory' metrics together as shown below? This does not seem to work. I tried the manifest below as the HPA, but the pods instantly grew to 5.
That's not what I was expecting.
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: myservice-metrics
  namespace: myschema
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myservice
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 500Mi
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 70
If I keep them in separate HPAs, it does not complain. Is it best practice to set both metrics for a service? Is there any other way to set both metrics?
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: myservice-metrics-memory
  namespace: myschema
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myservice
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 500Mi
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: myservice-metrics-cpu
  namespace: myschema
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myservice
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 70
Support for scaling based on multiple metrics was added in Kubernetes v1.6.
I would suggest trying to switch to the autoscaling/v2beta2 API.
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-multiple-metrics
https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/horizontal-pod-autoscaler-v2beta2/
metrics is of type []MetricSpec, and the maximum replica count across all metrics will be used.
A single file is possible.
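A sketch of the two HPAs from the question merged into one autoscaling/v2beta2 object. The targets (500Mi and 70) and the name are carried over from the question; maxReplicas: 5 is taken from the combined attempt there (the separate HPAs used 3), so adjust it to whichever limit is actually intended.

```yaml
# Sketch only: single HPA with both metrics in the v2beta2 schema.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: myservice-metrics
  namespace: myschema
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myservice
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue       # v2beta1 targetAverageValue
        averageValue: 500Mi
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization        # v2beta1 targetAverageUtilization
        averageUtilization: 70
```

Because the larger of the two per-metric replica counts wins, a memory average above 500Mi per pod is enough to drive the Deployment toward maxReplicas even when CPU usage is low.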