How to display the number of Kubernetes pods restarted during a time period?

I have Kubernetes clusters with Prometheus and Grafana for monitoring, and I am trying to build a dashboard panel that displays the number of pods that have been restarted in the period I am looking at.
At the moment I have this query, which fills a vector with 1 if the pod's creation time is in the range (meaning it has been restarted during this period) and -1 otherwise:
-sgn((time() - kube_pod_created{cluster="$cluster"}) - $__range_s)
Is there a way to count the number of positive values in this vector and display it, for example as a single stat panel that just shows a box with the count inside?
Or maybe there is a better way to accomplish what I am trying to do.

To display pod restarts, Prometheus provides the metric kube_pod_container_status_restarts_total. This is a counter metric that records container restarts.
To calculate the restarts:
If you want to see all pods in a namespace, then use
sum(increase(kube_pod_container_status_restarts_total{namespace="My-Namespace"}[5m])) by(pod)
or, if you want a particular pod, then use
sum(increase(kube_pod_container_status_restarts_total{namespace="My-Namespace", pod="My-Pod"}[5m]))
or, to break it down by container, use
sum(increase(kube_pod_container_status_restarts_total{namespace="My-Namespace", pod="My-Pod"}[5m])) by(container)
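To tie this back to the original question of counting pods restarted within the dashboard's time range, the same counter can be evaluated over Grafana's built-in $__range variable instead of a fixed 5m window. A hedged sketch for a Stat panel, assuming your kube-state-metrics series carry the same cluster label used in your kube_pod_created query:
count(
  sum by (pod) (
    increase(kube_pod_container_status_restarts_total{cluster="$cluster"}[$__range])
  ) > 0
)
This returns the number of pods with at least one container restart in the visible range; use a plain sum(increase(...)) if you want the total number of restarts instead. Note that this counts container restarts inside existing pods, which is what the answer above measures; pods that were deleted and recreated (i.e. with new kube_pod_created timestamps) would need a separate query.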

Related

kubernetes / prometheus custom metric for horizontal autoscaling

I'm wondering about the approach to take for our server setup. We have pods that are short-lived. They are started with a minimum of 3 pods, and each server waits on a single request, handles it, and then the pod is destroyed. I'm not sure of the mechanism by which the pod is destroyed, but my question is not about that part anyway.
There is an "active session count" metric that I am envisioning. Each of these pod resources could make a rest call to some "metrics" pod that we would create for our cluster. The metrics pod would expose a sessionStarted and sessionEnded endpoint - which would increment/decrement the kubernetes activeSessions metric. That metric would be what is used for horizontal autoscaling of the number of pods needed.
Since having a pod as "up" counts as zero active sessions, the custom event that increments the session count would update the metric server session count with a rest call and then decrement again on session end (the pod being up does not indicate whether or not it has an active session).
Is it correct to think that I need this metric server (and write it myself)? Or is there something that Prometheus exposes where this type of metric is supported already - rest clients and all (for various languages), that could modify this metric?
Looking for guidance and confirmation that I'm on the right track. Thanks!
It's impossible to give only one way to solve this, and your question is rather opinion-based. However, there is a useful similar question on Stack Overflow; please check the comments there, which can give you some tips. If nothing works, you will probably have to write it yourself. There is no exact solution from the Kubernetes side.
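Regarding the "rest clients and all (for various languages)" part of the question: the official Prometheus client libraries already support this pattern directly with a Gauge that you increment on session start and decrement on session end, so the "metrics" pod could be a very thin wrapper around one, or each worker pod could expose the gauge itself. A minimal sketch in Python, where the metric name and port are illustrative assumptions rather than anything from your setup:
from prometheus_client import Gauge, start_http_server
import random
import time

# Gauge goes up on sessionStarted and down on sessionEnded.
ACTIVE_SESSIONS = Gauge(
    "active_sessions",
    "Number of requests currently being handled by this pod",
)

def handle_request():
    ACTIVE_SESSIONS.inc()            # sessionStarted
    try:
        time.sleep(random.random())  # stand-in for the real request handling
    finally:
        ACTIVE_SESSIONS.dec()        # sessionEnded

if __name__ == "__main__":
    start_http_server(8000)          # serves /metrics for Prometheus to scrape
    while True:
        handle_request()
Prometheus (or the Prometheus adapter) can then aggregate active_sessions across pods, and that aggregate can drive the Horizontal Pod Autoscaler through the custom metrics API.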
Please also take Apache Flink into consideration. It has a Reactive Mode that works in combination with Kubernetes:
Reactive Mode allows to run Flink in a mode, where the Application Cluster is always adjusting the job parallelism to the available resources. In combination with Kubernetes, the replica count of the TaskManager deployment determines the available resources. Increasing the replica count will scale up the job, reducing it will trigger a scale down. This can also be done automatically by using a Horizontal Pod Autoscaler.

How to increase or decrease number of pods in Kubernetes deployment

I have a requirement where, based on some input value, I need to decide the number of active pods. Let's say in the beginning the number is 1, so we need to start one pod; if after some time the number goes to 3, I need to start 2 more pods.
The next day the number could go back to 1, so I need to accordingly remove 2 pods and keep 1 active. How can this be achieved in Kubernetes?
There are a few ways to achieve this. The most obvious and manual one is to use kubectl scale [--resource-version=version] [--current-replicas=count] --replicas=COUNT (-f FILENAME | TYPE NAME) as per this document or this tutorial, as shown in the example below. You can also consider taking advantage of Kubernetes autoscaling (Horizontal Pod Autoscaler and Cluster Autoscaler) described in this article and this document.
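For example, assuming a Deployment named my-app (the name is just a placeholder):
kubectl scale deployment my-app --replicas=3
kubectl scale deployment my-app --replicas=1
The first command brings the Deployment up to 3 pods; the second scales it back down to 1 when your input value drops again.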

Kubernetes HPA - How to avoid scaling-up for CPU utilisation spike

When the business configuration is loaded for a different country, the CPU load increases for about 1 minute (this is not a startup spike), but we want to avoid scaling up for that 1 minute.
Is CurrentMetricValue just the current value from the metric, or is it an average over the duration from the last poll to the current poll (--horizontal-pod-autoscaler-sync-period)?
The default HPA check interval is 30 seconds. This can be configured, as you mentioned, by changing the value of the controller manager's --horizontal-pod-autoscaler-sync-period flag.
The Horizontal Pod Autoscaler is implemented as a control loop, with a period controlled by the controller manager’s --horizontal-pod-autoscaler-sync-period flag.
During each period, the controller manager queries the resource utilization against the metrics specified in each HorizontalPodAutoscaler definition. The controller manager obtains the metrics from either the resource metrics API (for per-pod resource metrics), or the custom metrics API (for all other metrics).
In order to change or add flags in kube-controller-manager, you need access to the /etc/kubernetes/manifests/ directory on the master node and the ability to modify the parameters in /etc/kubernetes/manifests/kube-controller-manager.yaml.
Note: you are not able to do this on GKE, EKS and other managed clusters.
What is more, I recommend increasing --horizontal-pod-autoscaler-downscale-stabilization (the replacement for the deprecated --horizontal-pod-autoscaler-downscale-delay).
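As a rough sketch of what that looks like on a kubeadm-style control plane (the durations below are examples, not recommendations), the flags are appended to the command list in the static Pod manifest:
# excerpt from /etc/kubernetes/manifests/kube-controller-manager.yaml
spec:
  containers:
  - name: kube-controller-manager
    command:
    - kube-controller-manager
    - --horizontal-pod-autoscaler-sync-period=30s
    - --horizontal-pod-autoscaler-downscale-stabilization=10m
    # ...the existing flags stay as they are
Because this is a static Pod manifest, the kubelet restarts kube-controller-manager automatically once the file is saved.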
If you're worried about long outages I would recommend setting up a custom metric (1 if network was down in last ${duration}, 0 otherwise) and setting the target value of the metric to 1 (in addition to CPU-based autoscaling). This way:
If the network was down in the last ${duration}, the recommendation based on the custom metric will be equal to the current size of your deployment. The max of this recommendation and the very low CPU recommendation will be equal to the current size of the deployment, so there will be no scale-downs until connectivity is restored (plus a few minutes after that because of the scale-down stabilization window).
If the network is available, the recommendation based on the metric will be 0. Maxed with the CPU recommendation, it will be equal to the CPU recommendation and the autoscaler will operate normally.
I think this solves your issue better than limiting the size of the autoscaling step. Limiting the step size will only slow down the rate at which the number of pods decreases, so a longer network outage will still result in your deployment shrinking to the minimum allowed size.
You can also use memory-based scaling.
Since it was not possible to create a memory-based HPA in Kubernetes at the time (newer autoscaling/v2 API versions do support memory as a resource metric), a script has been written to achieve the same. You can find the script at this link:
https://github.com/powerupcloud/kubernetes-1/blob/master/memory-based-autoscaling.sh
Clone the repository:
https://github.com/powerupcloud/kubernetes-1.git
and then go to the Kubernetes directory. Execute the help command to get the instructions:
./memory-based-autoscaling.sh --help
Read more here: memory-based-autoscaling.

how to perform HorizontalPodAutoscaling in Kubernetes based on response time (custom metric) using Prometheus adapter?

Hi everyone,
I have a cluster based on kubeadm with 1 master and 2 workers. I have already implemented the built-in horizontal pod autoscaling (based on cpu_utilization and memory), and now I want to perform autoscaling on the basis of custom metrics (response time in my case).
I am using the Prometheus Adapter for custom metrics, and I could not find any metric with the name response_time in Prometheus.
Is there any metric available in Prometheus that scales the application based on response time, and what is its name?
Will I need to edit the default horizontal autoscaling algorithm, or will I have to make an autoscaling algorithm from scratch that could scale my application on the basis of response time?
Prometheus has only 4 metric types: Counter, Gauge, Histogram and Summary.
I guess a Histogram is what you need:
A histogram samples observations (usually things like request durations or response sizes) and counts them in configurable buckets. It also provides a sum of all observed values.
A histogram with a base metric name of <basename> exposes multiple time series during a scrape:
cumulative counters for the observation buckets, exposed as <basename>_bucket{le="<upper inclusive bound>"}
the total sum of all observed values, exposed as <basename>_sum
the count of events that have been observed, exposed as <basename>_count (identical to <basename>_bucket{le="+Inf"} above)
1. There is a Stack Overflow question where you can get a query for latency (response time), so I think this might be useful for you.
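For example, if your application exports a request-duration histogram (the metric name http_request_duration_seconds below is only an assumption about your instrumentation), a typical 95th-percentile response-time query looks like:
histogram_quantile(0.95,
  sum by (le) (
    rate(http_request_duration_seconds_bucket[5m])
  )
)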
2. I don't know if I understand you correctly, but if you want to edit the HPA, you can edit the YAML file, delete the previous HPA and create a new one instead:
kubectl delete hpa <hpa-name>
kubectl apply -f <name.yaml>
There is a good article about Autoscaling on custom metrics with custom Prometheus Metrics.
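To make point 2 more concrete, here is a hedged sketch of what such an HPA could look like once the Prometheus adapter exposes a latency metric through the custom metrics API; the metric name, target value and Deployment name are all assumptions that depend entirely on your adapter configuration:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-latency
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_request_duration_seconds_p95   # as exposed by the Prometheus adapter
      target:
        type: AverageValue
        averageValue: 500m                        # scale out above roughly 0.5s per pod
On older clusters the same spec is available under autoscaling/v2beta2.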

Why autoscale Kubernetes if not doing a rolling update?

In reference to this doc, I think I understand the value of temporarily horizontally scaling a pod during an update. For example, you go from 1 pod to 2 pods: update pod 1 and then delete pod 2.
Is there any value to horizontally scaling Kubernetes if you're not doing an update? Won't replicating pods simply decrease the performance of each one?
For example, doubling the number of pods while keeping the amount of RAM fixed just means that each pod will have half as much RAM.
... doubling the number of pods while keeping the amount of RAM fixed ...
I think you are misunderstanding what happens when you horizontally scale pods. Each pod is allocated a certain amount of memory; when you create new pods, each existing pod keeps using that much memory and each new pod gets the same amount. So the overall memory usage increases linearly with the number of running pods as you scale horizontally.
The only constraint is when you hit the total limit of memory available in your cluster, at which point you won't be able to schedule new pods. At this point, you would need to (manually or automatically) scale the number of nodes in the cluster to add additional resources.
Is there any value to horizontally scaling Kubernetes if you're not doing an update?
Yes. If you are serving traffic and the current number of pods can't handle the load, adding new pods can help. For example, during the One million requests per second demo, the author determines how many requests each nginx pod can serve and then scales the number of pods to handle the expected load. If this was a web site, the horizontal pod autoscaler could allow you to dynamically scale the number of nginx pods based on the current number of requests.
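For example, a minimal CPU-based autoscaler for such an nginx Deployment (the name and thresholds are illustrative):
kubectl autoscale deployment nginx --min=3 --max=20 --cpu-percent=70
Because each new replica gets its own CPU and memory requests, adding pods adds serving capacity (as long as the cluster has node capacity) rather than splitting a fixed amount of RAM between them.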