Kubernetes HPA pod custom metrics shows as <unknown>

I have managed to install Prometheus and its adapter, and I want to use one of the pod metrics for autoscaling:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq . | grep "pods/http_request"
"name": "pods/http_request_duration_milliseconds_sum",
"name": "pods/http_request",
"name": "pods/http_request_duration_milliseconds",
"name": "pods/http_request_duration_milliseconds_count",
"name": "pods/http_request_in_flight",
Checking the API, I want to use pods/http_request, so I added it to my HPA configuration:
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: app
  namespace: app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 4
  maxReplicas: 8
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_request
      target:
        type: AverageValue
        averageValue: 200
After applying the yaml and checking the hpa status, it shows up as <unknown>:
$ k apply -f app-hpa.yaml
$ k get hpa
NAME   REFERENCE        TARGETS
app    Deployment/app   306214400/2000Mi, <unknown>/200 + 1 more...
But when using other pod metrics, such as pods/memory_usage_bytes, the value is properly detected.
Is there a way to check the proper values for this metric? And how do I properly add it to my HPA configuration?
Reference https://www.ibm.com/support/knowledgecenter/SSBS6K_3.2.0/manage_cluster/hpa.html

First, deploy the metrics server; it should be up and running.
$ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
After a few seconds the metrics server is deployed. Check the HPA again; the values should resolve.
$ kubectl get deployment -A
NAMESPACE     NAME             READY   UP-TO-DATE   AVAILABLE   AGE
.
.
kube-system   metrics-server   1/1     1            1           34s
$ kubectl get hpa
NAME                 REFERENCE                    TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
ha-xxxx-deployment   Deployment/xxxx-deployment   1%/5%     1         10        1          6h46m
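If the target still shows <unknown> for the custom metric, you can also query the custom metrics API directly to see the raw per-pod values the adapter reports (a sketch using the app namespace and http_request metric from the question):
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/app/pods/*/http_request" | jq .
An empty items list or an error here means the adapter is not collecting that metric for the pods, which is exactly what the HPA reports as <unknown>.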

Related

FailedGetPodsMetric: for HPA autoscaling

I am trying to autoscale using custom metrics, with the metric type "http_request". The following command shows correct output:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/" | jq
Below is my hpa.yaml file:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: podinfo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: http_requests
      targetAverageValue: 1
but my scaling is failing with the following error:
the HPA was unable to compute the replica count:
unable to get metric http_requests: unable to fetch metrics from custom metrics API: an error on the server
("Internal Server Error: \"/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%!A(MISSING)/http_requests?labelSelector=app%!D(MISSING)podinfo\": the server could not find the requested resource")
has prevented the request from succeeding (get pods.custom.metrics.k8s.io *)
Please help me out in this :)
It seems like you are missing pods in your cluster that match the provided deployment specification. Can you check whether your podinfo deployment is running, and that it has healthy pods in it?
The command works because you're only checking the availability of the metrics endpoint. That simply implies the endpoint is live and ready to provide metrics; it doesn't guarantee that you will receive metrics (without any backing resources).
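A quick way to verify both points, assuming the app=podinfo label implied by the labelSelector in the error message:
kubectl get pods -l app=podinfo
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests" | jq .
The first command should list healthy pods; the second should return a MetricValueList with one item per pod.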

Kubernetes resource quota, have non schedulable pod staying in pending state

So I wish to limit the resources used by pods running in each of my namespaces, and therefore want to use a resource quota.
I am following this tutorial.
It works well, but I wish something a little different.
When trying to schedule a pod that would go over the limit of my quota, I get a 403 error.
What I wish is for the request to be accepted, with the pod waiting in a pending state until one of the other pods ends and frees some resources.
Any advice?
Instead of using straight pod definitions (kind: Pod), use a deployment.
Why?
Pods in Kubernetes are designed as relatively ephemeral, disposable entities:
You'll rarely create individual Pods directly in Kubernetes—even singleton Pods. This is because Pods are designed as relatively ephemeral, disposable entities. When a Pod gets created (directly by you, or indirectly by a controller), the new Pod is scheduled to run on a Node in your cluster. The Pod remains on that node until the Pod finishes execution, the Pod object is deleted, the Pod is evicted for lack of resources, or the node fails.
Kubernetes assumes that for managing pods you should use workload resources instead of creating pods directly:
Pods are generally not created directly and are created using workload resources. See Working with Pods for more information on how Pods are used with workload resources.
Here are some examples of workload resources that manage one or more Pods:
Deployment
StatefulSet
DaemonSet
By using a deployment you will get behaviour very similar to the one you want.
Example below:
Let's suppose that I created a pod quota for a custom namespace, set to 2 as in this example (a sketch of the quota manifest follows the pod listing below), and that I have two pods running in this namespace:
kubectl get pods -n quota-demo
NAME           READY   STATUS    RESTARTS   AGE
quota-demo-1   1/1     Running   0          75s
quota-demo-2   1/1     Running   0          6s
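For reference, a minimal sketch of such a pod-count quota; the name pod-demo matches the error message below, and the rest of the manifest is assumed from the linked example:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: pod-demo
spec:
  hard:
    pods: "2"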
Third pod definition:
apiVersion: v1
kind: Pod
metadata:
  name: quota-demo-3
spec:
  containers:
  - name: quota-demo-3
    image: nginx
    ports:
    - containerPort: 80
Now I will try to apply this third pod in this namespace:
kubectl apply -f pod.yaml -n quota-demo
Error from server (Forbidden): error when creating "pod.yaml": pods "quota-demo-3" is forbidden: exceeded quota: pod-demo, requested: pods=1, used: pods=2, limited: pods=2
As you can see, this does not behave the way you want: the pod is rejected outright instead of waiting.
Now I will convert the pod definition into a deployment definition:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: quota-demo-3-deployment
  labels:
    app: quota-demo-3
spec:
  selector:
    matchLabels:
      app: quota-demo-3
  template:
    metadata:
      labels:
        app: quota-demo-3
    spec:
      containers:
      - name: quota-demo-3
        image: nginx
        ports:
        - containerPort: 80
I will apply this deployment:
kubectl apply -f deployment-v3.yaml -n quota-demo
deployment.apps/quota-demo-3-deployment created
The deployment is created successfully, but there is no new pod. Let's check the deployment:
kubectl get deploy -n quota-demo
NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
quota-demo-3-deployment   0/1     0            0           12s
We can see that the pod quota is working: the deployment is monitoring resources and waiting for the possibility to create a new pod.
Let's now delete one of the pods and check the deployment again:
kubectl delete pod quota-demo-2 -n quota-demo
pod "quota-demo-2" deleted

kubectl get deploy -n quota-demo
NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
quota-demo-3-deployment   1/1     1            1           2m50s
The pod from the deployment is created automatically after the other pod is deleted:
kubectl get pods -n quota-demo
NAME                                       READY   STATUS    RESTARTS   AGE
quota-demo-1                               1/1     Running   0          5m51s
quota-demo-3-deployment-7fd6ddcb69-nfmdj   1/1     Running   0          29s
It works the same way for memory and CPU quotas on a namespace: when resources are freed, the deployment will automatically create new pods.
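For completeness, a minimal sketch of a memory/CPU quota of that kind, modelled on the standard Kubernetes example (the name and values here are illustrative assumptions):
apiVersion: v1
kind: ResourceQuota
metadata:
  name: mem-cpu-demo
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi
With this quota in place, a deployment whose pods would exceed the aggregate requests or limits simply stays below its desired replica count until resources are freed.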

devspace: how to auto-scale deployments?

My deployment never auto-scales on DigitalOcean. Here is what I have in my devspace.yaml:
deployments:
  - name: app
    namespace: "mynamespace"
    helm:
      componentChart: true
      values:
        replicas: 1
        autoScaling:
          horizontal:
            maxReplicas: 3
            averageCPU: 5m
            # averageRelativeCPU: 1
        containers:
          - name: app
            image: pablorsk/app
It always has 1 replica. I tried small values for averageCPU, like 5m, or averageRelativeCPU, like 1, but the replica count never increases on the cluster.
$ kubectl get hpa
NAME   REFERENCE        TARGETS        MINPODS   MAXPODS   REPLICAS   AGE
app    Deployment/app   <unknown>/5m   1         3         1          13d
This is my node configuration on DigitalOcean: (screenshot omitted)
Installing the metrics server is required for the HPA to auto-scale deployments:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Then you can see values under TARGETS:
$ kubectl get hpa
NAME   REFERENCE        TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
app    Deployment/app   216/5m    1         3         1          1d
More information is available in the metrics-server repository.
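To sanity-check that the metrics server is actually serving resource metrics, kubectl top should return numbers rather than errors (mynamespace is the namespace from the devspace.yaml above):
$ kubectl top nodes
$ kubectl top pods -n mynamespace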

Create an HPA for k8s with metrics and pod in different namespaces

I have three namespaces: monitoring, with prometheus-operator; rabbitmq, with a RabbitMq queue manager and prometheus-adapter; and worker, with an app that just creates inputs for the RabbitMq pod. I want to use a Horizontal Pod Autoscaler (HPA) to scale the worker pod (in the worker namespace) with metrics from the queue "task_queue" of the RabbitMq pod (in the rabbitmq namespace). All those metrics are collected by the prometheus operator (in the monitoring namespace) and are shown in the prometheus front-end:
Query "rabbitmq_queue_messages" at prometheus-url:8080/graph:
rabbitmq_queue_messages{durable="true",endpoint="metrics",instance="x.x.x.x:9419",job="rabbitmq-server",namespace="rabbitmq",pod="rabbitmq-server-0",queue="task_queue",service="rabbitmq-server",vhost="/"}
RabbitMQ, Prometheus-operator and Prometheus-adapter were installed from helm charts.
RabbitMQ (values.yaml has the password and enables metrics at 9419 for scraping):
helm install --namespace rabbitmq rabbitmq-server stable/rabbitmq \
  --set extraPlugins=rabbitmq_prometheus \
  -f charts/default/rabbitmq/values.yaml
Prometheus-adapter:
helm upgrade --install --namespace rabbitmq prometheus-adapter stable/prometheus-adapter \
  --set prometheus.url="http://pmt-server-prometheus-oper-prometheus.monitoring.svc" \
  --set prometheus.port="9090"
Prometheus-operator:
helm upgrade --install --namespace monitoring pmt-server stable/prometheus-operator \
  --set prometheusOperator.createCustomResource=false \
  -f charts/default/values.yaml
Prometheus values.yaml:
prometheus:
  additionalServiceMonitors:
    - name: rabbitmq-svc-monitor
      selector:
        matchLabels:
          app: rabbitmq
      namespaceSelector:
        matchNames:
          - rabbitmq
      endpoints:
        - port: metrics
          interval: 10s
          path: /metrics
The custom metrics are ok:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/rabbitmq/services/rabbitmq-server/rabbitmq_queue_messages" | jq .
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/rabbitmq/services/rabbitmq-server/rabbitmq_queue_messages"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Service",
        "namespace": "rabbitmq",
        "name": "rabbitmq-server",
        "apiVersion": "/v1"
      },
      "metricName": "rabbitmq_queue_messages",
      "timestamp": "2020-08-20T12:15:39Z",
      "value": "0",
      "selector": null
    }
  ]
}
And here is my hpa.yaml:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: rabbitmq-queue-worker-hpa
  namespace: worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker
  minReplicas: 1
  maxReplicas: 50
  metrics:
  - type: Object
    object:
      metric:
        name: rabbitmq_queue_messages
      describedObject:
        apiVersion: "/v1"
        kind: Service
        name: rabbitmq-server.rabbitmq.svc.cluster.local
      target:
        type: Value
        value: 100
But the HPA doesn't work, as kubectl describe shows:
kubectl describe hpa/rabbitmq-queue-worker-hpa -n worker
Name: rabbitmq-queue-worker-hpa
Namespace: worker
Labels: app.kubernetes.io/managed-by=Helm
Annotations: meta.helm.sh/release-name: rabbitmq-scaling-demo-app
meta.helm.sh/release-namespace: worker
CreationTimestamp: Thu, 20 Aug 2020 08:42:32 -0300
Reference: Deployment/worker
Metrics: ( current / target )
"rabbitmq_queue_messages" on Service/rabbitmq-server.rabbitmq.svc.cluster.local (target value): <unknown> / 100
Min replicas: 1
Max replicas: 50
Deployment pods: 1 current / 0 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
ScalingActive False FailedGetObjectMetric the HPA was unable to compute the replica count: unable to get metric rabbitmq_queue_messages: Service on worker rabbitmq-server.rabbitmq.svc.cluster.local/unable to fetch metrics from custom metrics API: the server could not find the metric rabbitmq_queue_messages for services rabbitmq-server.rabbitmq.svc.cluster.local
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedComputeMetricsReplicas 60m (x12 over 63m) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get object metric value: unable to get metric rabbitmq_queue_messages: Service on worker rabbitmq-server.rabbitmq.svc.cluster.local/unable to fetch metrics from custom metrics API: no custom metrics API (custom.metrics.k8s.io) registered
Warning FailedGetObjectMetric 60m (x13 over 63m) horizontal-pod-autoscaler unable to get metric rabbitmq_queue_messages: Service on worker rabbitmq-server.rabbitmq.svc.cluster.local/unable to fetch metrics from custom metrics API: no custom metrics API (custom.metrics.k8s.io) registered
Warning FailedGetObjectMetric 58m (x3 over 58m) horizontal-pod-autoscaler unable to get metric rabbitmq_queue_messages: Service on worker rabbitmq-server.rabbitmq.svc.cluster.local/unable to fetch metrics from custom metrics API: the server is currently unable to handle the request (get services.custom.metrics.k8s.io rabbitmq-server.rabbitmq.svc.cluster.local)
Warning FailedGetObjectMetric 2m59s (x218 over 57m) horizontal-pod-autoscaler unable to get metric rabbitmq_queue_messages: Service on worker rabbitmq-server.rabbitmq.svc.cluster.local/unable to fetch metrics from custom metrics API: the server could not find the metric rabbitmq_queue_messages for services rabbitmq-server.rabbitmq.svc.cluster.local
I believe that the HPA is trying to find the RabbitMq service in the worker namespace,
Warning FailedGetObjectMetric 60m (x13 over 63m) horizontal-pod-autoscaler unable to get metric rabbitmq_queue_messages: Service on worker rabbitmq-server.rabbitmq.svc.cluster.local/unable to fetch metrics from custom metrics API: no custom metrics API (custom.metrics.k8s.io) registered
but the service is in the rabbitmq namespace. I tried the rabbit service's FQDN (rabbitmq-server.rabbitmq.svc.cluster.local) and just the service's name (rabbitmq-server). What am I missing? Is there a way to make this work? The point here is that I have another project with 10+ namespaces, and all of them use the same rabbit server (in the rabbitmq namespace), so moving them all into the same namespace would be a nightmare. Thanks.
Edit 1: My custom metrics config.yaml
prometheus:
  url: http://pmt-server-prometheus-oper-prometheus.monitoring.svc
  port: 9090
rbac:
  create: true
serviceAccount:
  create: true
service:
  port: 443
logLevel: 6
rules:
  custom:
    - seriesQuery: 'rabbitmq_queue_messages{namespace!="",service!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          service: {resource: "service"}
      metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>,queue="task_queue"}) by (<<.GroupBy>>)
And the adapter is installed via helm with this file:
helm upgrade --install --namespace rabbitmq prometheus-adapter stable/prometheus-adapter -f config.yaml
And this is the HPA describe output when the HPA is created in the rabbitmq namespace:
Name: rabbitmq-queue-worker-hpa
Namespace: rabbitmq
Labels: app.kubernetes.io/managed-by=Helm
Annotations: meta.helm.sh/release-name: rabbitmq-scaling-demo-app
meta.helm.sh/release-namespace: worker
CreationTimestamp: Fri, 21 Aug 2020 08:45:25 -0300
Reference: Deployment/worker
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): <unknown> / 80%
Min replicas: 1
Max replicas: 50
Deployment pods: 0 current / 0 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale False FailedGetScale the HPA controller was unable to get the target's current scale: deployments/scale.apps "worker" not found
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetScale 9s (x17 over 4m11s) horizontal-pod-autoscaler deployments/scale.apps "worker" not found

ReplicaSet does not update pods when the pod image is modified

I have created a replicaset with a wrong container image, with the configuration below.
apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
  name: rs-d33393
  namespace: default
spec:
  replicas: 4
  selector:
    matchLabels:
      name: busybox-pod
  template:
    metadata:
      labels:
        name: busybox-pod
    spec:
      containers:
      - command:
        - sh
        - -c
        - echo Hello Kubernetes! && sleep 3600
        image: busyboxXXXXXXX
        name: busybox-container
Pods Information:
$ kubectl get pods
NAME              READY   STATUS             RESTARTS   AGE
rs-d33393-5hnfx   0/1     InvalidImageName   0          11m
rs-d33393-5rt5m   0/1     InvalidImageName   0          11m
rs-d33393-ngw78   0/1     InvalidImageName   0          11m
rs-d33393-vnpdh   0/1     InvalidImageName   0          11m
After this, I tried to edit the image inside the replicaset using kubectl edit replicasets.extensions rs-d33393, updating the image to busybox.
I was expecting the pods to be recreated with the proper image as part of the replicaset.
That has not been the result.
Can someone please explain why this is so?
Thanks :)
With ReplicaSets directly, you have to kill the old pods so the new ones will be created with the right image.
If you were using a Deployment, and you should be, changing the image would force the pods to be re-created.
ReplicaSets do not support updates. As long as the required number of pods matching the selector labels exists, the replicaset's job is done. You should use a Deployment instead.
https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/
From the docs:
To update Pods to a new spec in a controlled way, use a Deployment, as
ReplicaSets do not support a rolling update directly.
Deployment is a higher-level concept that manages ReplicaSets and provides declarative updates to Pods. Therefore, it is recommended to use Deployments instead of directly using ReplicaSets, unless you don't require updates at all (i.e. one may never need to manipulate ReplicaSet objects when using a Deployment).
It's easy to perform rolling updates and rollbacks when deploying using deployments.
$ kubectl create deployment busybox --image=busyboxxxxxxx --dry-run -o yaml > busybox.yaml
$ cat busybox.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: busybox
  name: busybox
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: busybox
    spec:
      containers:
      - image: busyboxxxxxxx
        name: busyboxxxxxxx
ubuntu@dlv-k8s-cluster-master:~$ kubectl create -f busybox.yaml --record=true
deployment.apps/busybox created
Check rollout history
ubuntu@dlv-k8s-cluster-master:~$ kubectl rollout history deployment busybox
deployment.apps/busybox
REVISION  CHANGE-CAUSE
1         kubectl create --filename=busybox.yaml --record=true
Update image on deployment
ubuntu@dlv-k8s-cluster-master:~$ kubectl set image deployment.app/busybox *=busybox --record
deployment.apps/busybox image updated
ubuntu@dlv-k8s-cluster-master:~$ kubectl rollout history deployment busybox
deployment.apps/busybox
REVISION  CHANGE-CAUSE
1         kubectl create --filename=busybox.yaml --record=true
2         kubectl set image deployment.app/busybox *=busybox --record=true
Rollback Deployment
ubuntu@dlv-k8s-cluster-master:~$ kubectl rollout undo deployment busybox
deployment.apps/busybox rolled back
ubuntu@dlv-k8s-cluster-master:~$ kubectl rollout history deployment busybox
deployment.apps/busybox
REVISION  CHANGE-CAUSE
2         kubectl set image deployment.app/busybox *=busybox --record=true
3         kubectl create --filename=busybox.yaml --record=true
You could also scale the replicaset down to zero and back up, which recreates the pods with the current image (substituting your replicaset's name):
k scale rs new-replica-set --replicas=0
and then
k scale rs new-replica-set --replicas=<your number of replicas>
Edit the replicaset (assuming it is called new-replica-set) with the command:
kubectl edit rs new-replica-set
Change the image name in the editor, save the file and exit the editor.
Now you will need to either delete the replicaset or delete the existing pods:
kubectl delete rs new-replica-set
kubectl delete pod pod_1 pod_2 pod_3 pod_4
The replicaset will then spin up new pods with the new image.
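Instead of listing each pod by name, you can also delete them via the label selector from the replicaset's pod template (a sketch using the name=busybox-pod label from the question's manifest):
kubectl delete pod -l name=busybox-pod
The replicaset immediately replaces the deleted pods, and the replacements are created from the updated template, i.e. with the new image.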