Kubernetes - HPA won't autoscale the pods - kubernetes

I am deploying my microservice application I built using node.
Issue
The pods won't autoscale when I put load using Jmeter. The CPU utitilization goes to 50m, which doesn't invoke HPA to start autoscaling. I want it to start replicating as soon as it reaches 80% of the CPU request(which is 10m).
HPA config :
# apiVersion: autoscaling/v1
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
name: client-hpa
namespace: default
spec:
minReplicas: 1
maxReplicas: 4
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 80
scaleTargetRef:
# apiVersion: apps/v1beta1
apiVersion: apps/v1
kind: Deployment
name: client-depl
Deployment config :
apiVersion: apps/v1
kind: Deployment
metadata:
name: client-depl
spec:
replicas: 1
selector:
matchLabels:
app: client
template:
metadata:
labels:
app: client
spec:
containers:
- name: client
image: <docker-id>/<image-name>
resources:
requests:
memory: 350Mi
cpu: 10m ### I want it to autoscale when it reaches 8m ###
Also, kubectl get hpa shows the following output :
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
client-hpa Deployment/client-depl <unknown>/80% 1 4 1 8m32s

HPA is based on the following equtation
desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
As you can notice, equation is based on current cpu utilization and not on CPU requests. (I want it to autoscale when it reaches 8m).
That said, perhaps the following maybe of interest to you:
Vertical Pod Autoscaling

Related

hpa doesn't share cpu traffic to replicas

This is my hpa yaml file:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
name: mysql-hpa
spec:
maxReplicas: 2
minReplicas: 1
scaleTargetRef:
apiVersion: apps/v1
kind: StatefulSet
name: mysql
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 60
The problem is that while i send requests to my app with jmeter, hpa creates a 2nd pod but doesn't share the traffic to both pods, except a few times!
You can see it to the photos below..
Ιf i create a pod with 2 replicas (by yaml file) without hpa, traffic is devided normally!
Any idea?
i have another pod with 12 containers and the hpa works fine.

Unable to fetch metrics from custom metrics API

I am using prometheus kubernetes adapter to scale up application.
I have service nginx running at port 8080 displaying metric requests.
I use hpa.yaml as follows:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: nginx-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: nginx
minReplicas: 1
maxReplicas: 10
metrics:
- type: Pods
pods:
metricName: requests
targetAverageValue: 10
This is working... But I want to scale up different deployment using requests as metric so I change my hpa to
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: nginx-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: nginx2
minReplicas: 1
maxReplicas: 10
metrics:
- type: Object
object:
target:
kind: Service
name: nginx
metricName: requests
targetValue: 10
But it is giving error as " invalid metrics (1 invalid out of 1), first error is: failed to get pods metric value: unable to get metric requests: no metrics returned from custom metrics API"
"the HPA was unable to compute the replica count: unable to get metric requests: no metrics returned from custom metrics API"
So I can not scale up other application base on nginx. I want to use nginx service to scale up other deployments.
Any Suggestions... Where I am going wrong?

How to make k8s cpu and memory HPA work together?

I'm using a k8s HPA template for CPU and memory like below:
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: {{.Chart.Name}}-cpu
labels:
app: {{.Chart.Name}}
chart: {{.Chart.Name}}
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: {{.Chart.Name}}
minReplicas: {{.Values.hpa.min}}
maxReplicas: {{.Values.hpa.max}}
targetCPUUtilizationPercentage: {{.Values.hpa.cpu}}
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
name: {{.Chart.Name}}-mem
labels:
app: {{.Chart.Name}}
chart: {{.Chart.Name}}
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: {{.Chart.Name}}
minReplicas: {{.Values.hpa.min}}
maxReplicas: {{.Values.hpa.max}}
metrics:
- type: Resource
resource:
name: memory
target:
type: Utilization
averageValue: {{.Values.hpa.mem}}
Having two different HPA is causing any new pods spun up for triggering memory HPA limit to be immediately terminated by CPU HPA as the pods' CPU usage is below the scale down trigger for CPU.
It always terminates the newest pod spun up, which keeps the older pods around and triggers the memory HPA again, causing an infinite loop.
Is there a way to instruct CPU HPA to terminate pods with higher usage rather than nascent pods every time?
As per the suggestion in comments, using a single HPA solved my issue. I just had to move CPU HPA to same apiVersion as memory HPA.
Autoscaling based on multiple metrics/Custom metrics:-
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
name: nginx
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: nginx
minReplicas: 1
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 50
- type: Resource
resource:
name: memory
target:
type: AverageValue
averageValue: 100Mi
When created, the Horizontal Pod Autoscaler monitors the nginx Deployment for average CPU utilization, average memory utilization, and (if you uncommented it) the custom packets_per_second metric. The Horizontal Pod Autoscaler autoscales the Deployment based on the metric whose value would create the larger autoscale event.
https://cloud.google.com/kubernetes-engine/docs/how-to/horizontal-pod-autoscaling#kubectl-apply

Kubernetes HPA wrong metrics?

I've created a GKE test cluster on Google Cloud. It has 3 nodes with 2 vCPUs / 8 GB RAM. I've deployed two java apps on it
Here's the yaml file:
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapi
spec:
selector:
matchLabels:
app: myapi
strategy:
type: Recreate
template:
metadata:
labels:
app: myapi
spec:
containers:
- image: eu.gcr.io/myproject/my-api:latest
name: myapi
imagePullPolicy: Always
ports:
- containerPort: 8080
name: myapi
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: myfrontend
spec:
selector:
matchLabels:
app: myfrontend
strategy:
type: Recreate
template:
metadata:
labels:
app: myfrontend
spec:
containers:
- image: eu.gcr.io/myproject/my-frontend:latest
name: myfrontend
imagePullPolicy: Always
ports:
- containerPort: 8080
name: myfrontend
---
Then I wanted to add a HPA with the following details:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: myfrontend
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: myfrontend
minReplicas: 2
maxReplicas: 5
targetCPUUtilizationPercentage: 50
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: myapi
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: myapi
minReplicas: 2
maxReplicas: 4
targetCPUUtilizationPercentage: 80
---
If I check kubectl top pods it shows some really weird metrics:
NAME CPU(cores) MEMORY(bytes)
myapi-6fcdb94fd9-m5sh7 194m 1074Mi
myapi-6fcdb94fd9-sptbb 193m 1066Mi
myapi-6fcdb94fd9-x6kmf 200m 1108Mi
myapi-6fcdb94fd9-zzwmq 203m 1074Mi
myfrontend-788d48f456-7hxvd 0m 111Mi
myfrontend-788d48f456-hlfrn 0m 113Mi
HPA info:
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
myapi Deployment/myapi 196%/80% 2 4 4 32m
myfrontend Deployment/myfrontend 0%/50% 2 5 2 32m
But If I check uptime on one of the nodes it shows a less lower value:
[myapi#myapi-6fcdb94fd9-sptbb /opt/]$ uptime
09:49:58 up 47 min, 0 users, load average: 0.48, 0.64, 1.23
Any idea why it shows a completely different thing. Why hpa shows 200% of current CPU utilization? And because of this it uses the maximum replicas in idle, too. Any idea?
The targetCPUUtilizationPercentage of the HPA is a percentage of the CPU requests of the containers of the target Pods. If you don't specify any CPU requests in your Pod specifications, the HPA can't do its calculations.
In your case it seems that the HPA assumes 100m as the CPU requests (or perhaps you have a LimitRange that sets the default CPU request to 100m). The current usage of your Pods is about 200m and that's why the HPA displays a utilisation of about 200%.
To set up the HPA correctly, you need to specify CPU requests for your Pods. Something like:
containers:
- image: eu.gcr.io/myproject/my-api:latest
name: myapi
imagePullPolicy: Always
ports:
- containerPort: 8080
name: myapi
resources:
requests:
cpu: 500m
Or whatever value your Pods require. If you set the targetCPUUtilizationPercentage to 80, the HPA will trigger an upscale operation at 400m usage, because 80% of 500m is 400m.
Besides that, you use an outdated version of HorizontalPodAutoscaler:
Your version: v1
Newest version: v2beta2
With the v2beta2 version, the specification looks a bit different. Something like:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
name: myapi
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: myapi
minReplicas: 2
maxReplicas: 4
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 80
See examples.
However, the CPU utilisation mechanism described above still applies.

Kubernetes - HPA metrics - memory & cpu together

Is it possible to keep 'cpu' and 'memory' metrics together as shown below ? This seems to be not working. I tried below script as HPA. But instently pods has grown upto 5.
That's not what i was expecting.
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: myservice-metrics
namespace: myschema
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: myservice
minReplicas: 1
maxReplicas: 5
metrics:
- type: Resource
resource:
name: memory
targetAverageValue: 500Mi
- type: Resource
resource:
name: cpu
targetAverageUtilization: 70
If i keep it individually, it is not complaining. Is it the best practice to set both the metrics for a service ? is there any other way to set both the metrics.
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: myservice-metrics-memory
namespace: myschema
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: myservice
minReplicas: 1
maxReplicas: 3
metrics:
- type: Resource
resource:
name: memory
targetAverageValue: 500Mi
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: myservice-metrics-cpu
namespace: myschema
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: myservice
minReplicas: 1
maxReplicas: 3
metrics:
- type: Resource
resource:
name: cpu
targetAverageUtilization: 70
Starting from Kubernetes v1.6 support for scaling based on multiple metrics has been added.
I would suggest to try and switch to the autoscaling/v2beta2 API.
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-multiple-metrics
https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/horizontal-pod-autoscaler-v2beta2/
metrics is of type []MetricSpec and
the maximum replica count across all metrics will be used
Single file is possible.