Kubernetes resource type's history - kubernetes

Is there a way to check the history of different Kubernetes resource types? It could be an additional plugin.
Use case:
For example, we currently have a StatefulSet on a 5-node cluster:
name: X
replicas: 3
resources:
  memory:
    limit: 2Gi
    request: 1Gi
Currently, replica 1 is on node_1, replica 2 is on node_2, and replica 3 is on node_3.
I am curious about the state of these resources at any given point in time.
Let's say I want to check what the resource limits were one month ago, how many replicas we had, and on which nodes they were allocated.

To directly answer your question - you can't do that using out-of-the-box functionality.
You need an existing monitoring solution for Kubernetes that is capable of exposing the metrics you need. What comes to mind:
kube-state-metrics server + Prometheus
kube-state-metrics server + Metricbeat
For instance, the kube-state-metrics server exposes metrics such as a Pod container's resource limits (kube_pod_container_resource_limits), and Metricbeat ingests those metrics and helps to visualize them.
Collecting Kubernetes state metrics and events
A single instance is deployed to collect Kubernetes metrics. It is
integrated with the kube-state-metrics API to monitor state changes of
objects managed by Kubernetes. This is the section of the config that
defines state_metrics collection.
$HOME/k8s-o11y-workshop/Metricbeat/Metricbeat.yml:
kubernetes.yml: |-
  - module: kubernetes
    metricsets:
      - state_node
      - state_deployment
      - state_replicaset
      - state_pod
      - state_container
      # Uncomment this to get k8s events:
      #- event
    period: 10s
    host: ${NODE_NAME}
    hosts: ["kube-state-metrics:8080"]
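Once those metrics land in a time-series store, you can query them at a past point in time. As a rough sketch, assuming the metrics end up in Prometheus, the StatefulSet's pods are named X-0, X-1, ..., and the retention window covers the period you care about, queries like these show the memory limits and pod placement as they were 30 days ago (label names follow recent kube-state-metrics versions):
kube_pod_container_resource_limits{resource="memory", pod=~"X-.*"} offset 30d
kube_pod_info{pod=~"X-.*"} offset 30d
The first returns the per-container memory limit; the second carries a node label, so it also answers how many replicas existed and on which nodes they ran.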

Related

Prometheus for k8s multi clusters

I have 3 Kubernetes clusters (prod, test, monitoring). I am new to Prometheus, so I have tested it by installing it in my test environment with the Helm chart:
# https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack
helm install [RELEASE_NAME] prometheus-community/kube-prometheus-stack
But if I want to have metrics from the prod and test clusters, I have to repeat the same Helm installation, and each "kube-prometheus-stack" would be standalone in its own cluster. That is not ideal at all. I am trying to find a way to have a single Prometheus/Grafana which would federate/aggregate the metrics from each cluster's Prometheus server.
I found this link about Prometheus federation:
https://prometheus.io/docs/prometheus/latest/federation/
If I install the Helm chart "kube-prometheus-stack" and get rid of Grafana on the 2 other clusters, how can I make the 3rd "kube-prometheus-stack", on the 3rd cluster, scrape metrics from the 2 other ones?
Thanks
You have to modify the Prometheus federation configuration so it can scrape metrics from the other clusters, as described in the documentation:
scrape_configs:
  - job_name: 'federate'
    scrape_interval: 15s
    honor_labels: true
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job="prometheus"}'
        - '{__name__=~"job:.*"}'
    static_configs:
      - targets:
          - 'source-prometheus-1:9090'
          - 'source-prometheus-2:9090'
          - 'source-prometheus-3:9090'
The params field defines which series to pull from the /federate endpoint. In this particular example, it will scrape any series with the label job="prometheus" or a metric name starting with job: from the Prometheus servers at source-prometheus-{1,2,3}:9090.
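If you want to check what a source Prometheus would hand over before wiring up federation, you can query its /federate endpoint directly. A small sketch, assuming the hostnames from the config above:
curl -G 'http://source-prometheus-1:9090/federate' \
  --data-urlencode 'match[]={job="prometheus"}' \
  --data-urlencode 'match[]={__name__=~"job:.*"}'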
You can check the following articles to get more insight into Prometheus federation:
Monitoring Kubernetes with Prometheus - outside the cluster!
Prometheus federation in Kubernetes
Monitoring multiple federated clusters with Prometheus - the secure way
Monitoring a Multi-Cluster Environment Using Prometheus Federation and Grafana
You have a few options here:
Option 1:
You can achieve this by having vmagent or grafana-agent in the prod and test clusters and configuring remote write on them to your monitoring cluster.
But in this case you will need to install kube-state-metrics and node-exporter separately into the prod and test clusters.
It's also important to add an extra label with the cluster name (or any unique identifier) before sending metrics via remote write, to make sure that the recording rules from "kube-prometheus-stack" work correctly.
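For illustration, the remote-write part could look roughly like this in Prometheus/grafana-agent style configuration (the endpoint URL and cluster name are placeholders; vmagent achieves the same with its -remoteWrite.* flags rather than this exact syntax):
global:
  external_labels:
    cluster: prod                      # unique identifier for this cluster
remote_write:
  - url: "https://monitoring.example.com/api/v1/write"   # endpoint in the monitoring cluster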
Option 2:
You can install the victoria-metrics-k8s-stack chart. It has similar functionality to kube-prometheus-stack - it also installs a bunch of components, recording rules, and dashboards.
In this case you install victoria-metrics-k8s-stack in every cluster, but with different values.
For the monitoring cluster you can use the default values, with
grafana:
  sidecar:
    dashboards:
      multicluster: true
and a properly configured Ingress for vmsingle.
For the prod and test clusters you need to disable a bunch of components:
defaultRules:
  create: false
vmsingle:
  enabled: false
alertmanager:
  enabled: false
vmalert:
  enabled: false
vmagent:
  spec:
    remoteWrite:
      - url: "<vmsingle-ingress>/api/v1/write"
    externalLabels:
      cluster: <cluster-name>
grafana:
  enabled: false
  defaultDashboardsEnabled: false
In this case the chart will deploy vmagent, kube-state-metrics, node-exporter, and the scrape configurations for vmagent.
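For reference, installing the chart into each cluster with its own values file could look roughly like this (the repository alias, release name, and values file names are assumptions; check the chart's README for the current instructions):
helm repo add vm https://victoriametrics.github.io/helm-charts/
helm repo update
# monitoring cluster: default values plus the multicluster dashboard flag shown above
helm install vmks vm/victoria-metrics-k8s-stack -f values-monitoring.yaml
# prod/test clusters: the reduced values shown above, each with its own cluster label
helm install vmks vm/victoria-metrics-k8s-stack -f values-prod.yaml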
You could try looking at Wavefront. It's a commercial tool now, but you can get a free 30-day trial - also, it understands PromQL. So essentially, you could use the same Prometheus rules and config across all clusters, and then use Wavefront to just connect to all of those Prometheus instances.
Another option may be Thanos, but I've never used it personally.

Dynamically update prometheus scrape config based on pod labels

I'm trying to enhance my monitoring and want to expand the amount of metrics pulled into Prometheus from our Kube estate. We already have a standalone Prometheus implementation which has a hard-coded config file monitoring some bare-metal servers, and hooks into cAdvisor for generic Pod metrics.
What I would like to do is configure Kube to monitor the apache_exporter metrics from a webserver deployed in the cluster, but also dynamically add a 2nd, 3rd, etc. webserver as the instances are scaled up.
I've looked at the kube-prometheus project, but this seems to be more geared to instances where there is no established Prometheus deployed. Is there a simple way to get Prometheus to scrape the Kube API or etcd to pull in the current list of pods which match a certain criterion (i.e., a label like deploymentType=webserver) and scrape the apache_exporter metrics for these pods, and scrape the mysqld_exporter metrics where deploymentType=mysql?
There's a project called kube-prometheus-stack (formerly prometheus-operator): https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack
It has concepts called ServiceMonitor and PodMonitor:
https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/design.md#servicemonitor
https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/design.md#podmonitor
Basically, this is a selector that points your Prometheus instance at scrape targets. In the case of a service selector, it discovers all the pods behind the service. In the case of a pod selector, it discovers pods directly. The Prometheus scrape config is updated and reloaded automatically in both cases.
Example PodMonitor:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example
  namespace: monitoring
spec:
  podMetricsEndpoints:
    - interval: 30s
      path: /metrics
      port: http
  namespaceSelector:
    matchNames:
      - app
  selector:
    matchLabels:
      app.kubernetes.io/name: my-app
Note that this PodMonitor object itself must be discovered by the controller. To achieve this, you write a PodMonitorSelector (link). This additional explicit linkage is intentional - in this way, if you have 2 Prometheus instances on your cluster (say Infra and Product), you can separate which Prometheus gets which Pods in its scraping config.
The same applies to a ServiceMonitor.
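To illustrate that linkage, a Prometheus custom resource can be told to pick up only the monitors carrying a given label; the prometheus: infra label here is just an assumed convention, not something the operator requires:
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: infra
  namespace: monitoring
spec:
  # only PodMonitors/ServiceMonitors labelled prometheus: infra are scraped by this instance
  podMonitorSelector:
    matchLabels:
      prometheus: infra
  serviceMonitorSelector:
    matchLabels:
      prometheus: infra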

How can I use K8s autoscaling with a Redis Cluster if I have a Spring Boot application using Spring Data (Jedis) to connect to the Redis Cluster?

Do I need to list all the nodes of my Redis Cluster in the spring.redis.sentinel.nodes attribute? Is that right?
I want to run a Redis Cluster on K8s to use the autoscaling provided by K8s. To use autoscaling, is it necessary to list all the nodes in spring.redis.sentinel.nodes?
Good question 💯.
The short answer is that you typically don't do autoscaling with stateful apps like Redis, since you have to be careful about not corrupting your data. Most of the time you migrate and shard your data, i.e. multiple clusters with different segments of your data, etc.
Having said that, there is no silver-bullet Redis autoscaling solution, but it's doable with a lot of monitoring and testing 🦄. A challenge here is that sentinels change the master in case of failover, so your solution needs to be able to determine or monitor who the master is at a certain interval; this is very critical during downscales. Redis has written a pretty good guide on how to create clients, which you will probably have to understand if you want a reliable autoscaling solution.
So the idea here 💡 is that you start with a set of sentinel/redis nodes managed by a Kubernetes Operator. With some config like this:
apiVersion: databases.spotahome.com/v1
kind: RedisFailover
metadata:
  name: redisfailover
spec:
  sentinel:
    replicas: 3
    resources:
      requests:
        cpu: 100m
      limits:
        memory: 100Mi
  redis:
    replicas: 3
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
      limits:
        cpu: 400m
        memory: 500Mi
Then maybe modify the controller of this operator to autoscale based on certain metrics (CPU, memory, storage, etc.).
The moment there is an autoscaling operation, you will have to make a configuration change in your Spring Boot application to account for it (say, in the ConfigMap of your application). For example, automatically change the value of this:
spring:
  cache:
    type: redis
  redis:
    port: 6666
    password: 123pwd
    sentinel:
      master: masterredis
      nodes:
        - 10.0.0.16
        - 10.0.0.17
        - 10.0.0.18
    lettuce:
      shutdown-timeout: 200ms
Now, after the config change, you need to do a rolling restart to prevent any downtime. The best way, in my opinion, to do this in Kubernetes is to have another Operator (or extend the Redis operator) for your application, with a controller that automatically detects when there is a scaling operation, makes the ConfigMap change, and finally performs the rolling restart of your app. Your scaling operations need to allow enough time for rebalancing of data and for the rolling restart, to prevent any thrashing/starvation and possible downtime/data corruption.
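As a minimal sketch of that restart step, assuming the Spring Boot app runs as a Deployment named my-spring-app in namespace my-namespace (both names hypothetical):
kubectl rollout restart deployment/my-spring-app -n my-namespace
kubectl rollout status deployment/my-spring-app -n my-namespace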
✌️☮️

Horizontal pod Autoscaling without custom metrics

We want to scale our pods horizontally based on the amount of messages in our Kafka topic. The standard solution is to publish the metrics to the custom metrics API of Kubernetes. However, due to company guidelines we are not allowed to use the custom metrics API of Kubernetes. We are only allowed to use non-admin functionality. Is there a solution for this with Kubernetes-native features, or do we need to implement a customized solution?
I'm not exactly sure if this would fit your needs but you could use Autoscaling on metrics not related to Kubernetes objects.
Applications running on Kubernetes may need to autoscale based on metrics that don’t have an obvious relationship to any object in the Kubernetes cluster, such as metrics describing a hosted service with no direct correlation to Kubernetes namespaces. In Kubernetes 1.10 and later, you can address this use case with external metrics.
Using external metrics requires knowledge of your monitoring system; the setup is similar to that required when using custom metrics. External metrics allow you to autoscale your cluster based on any metric available in your monitoring system. Just provide a metric block with a name and selector, as above, and use the External metric type instead of Object. If multiple time series are matched by the metricSelector, the sum of their values is used by the HorizontalPodAutoscaler. External metrics support both the Value and AverageValue target types, which function exactly the same as when you use the Object type.
For example if your application processes tasks from a hosted queue service, you could add the following section to your HorizontalPodAutoscaler manifest to specify that you need one worker per 30 outstanding tasks.
- type: External
  external:
    metric:
      name: queue_messages_ready
      selector:
        matchLabels:
          queue: "worker_tasks"
    target:
      type: AverageValue
      averageValue: 30
When possible, it’s preferable to use the custom metric target types instead of external metrics, since it’s easier for cluster administrators to secure the custom metrics API. The external metrics API potentially allows access to any metric, so cluster administrators should take care when exposing it.
You may also have a look at zalando-incubator/kube-metrics-adapter and use Prometheus collector external metrics.
This is an example of an HPA configured to get metrics based on a Prometheus query. The query is defined in the annotation metric-config.external.prometheus-query.prometheus/processed-events-per-second, where processed-events-per-second is the query name which will be associated with the result of the query. A matching query-name label must be defined in the matchLabels of the metric definition. This allows having multiple Prometheus queries associated with a single HPA.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  annotations:
    # This annotation is optional.
    # If specified, then this prometheus server is used,
    # instead of the prometheus server specified as the CLI argument `--prometheus-server`.
    metric-config.external.prometheus-query.prometheus/prometheus-server: http://prometheus.my-namespace.svc
    # metric-config.<metricType>.<metricName>.<collectorName>/<configKey>
    # <configKey> == query-name
    metric-config.external.prometheus-query.prometheus/processed-events-per-second: |
      scalar(sum(rate(event-service_events_count{application="event-service",processed="true"}[1m])))
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: custom-metrics-consumer
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: prometheus-query
          selector:
            matchLabels:
              query-name: processed-events-per-second
        target:
          type: AverageValue
          averageValue: "10"

Can a container in Kubernetes determine its own resource utilization and limits?

How can my containerized app determine its own current resource utilization, as well as the maximum limit allocated by Kubernetes? Is there an API to get this info from cAdvisor and/or the Kubelet?
For example, my container is allowed to use a maximum of 1 core, and it's currently consuming 800 millicores. In this situation, I want to drop/reject all incoming requests that are marked as "low priority".
How can I see my resource utilization & limits from within my container?
Note that this assumes auto-scaling is not available, e.g. when cluster resources are exhausted, or our app is not allowed to auto-scale (further).
You can use the Kubernetes Downward API to fetch the limits and requests. The syntax is:
volumes:
  - name: podinfo
    downwardAPI:
      items:
        - path: "cpu_limit"
          resourceFieldRef:
            containerName: client-container
            resource: limits.cpu
            divisor: 1m
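For completeness, a sketch of the container side, assuming a mount path of /etc/podinfo (the image name is a placeholder); the same resourceFieldRef can also be exposed as an environment variable instead of a file:
containers:
  - name: client-container
    image: registry.example.com/my-app:latest   # placeholder image
    env:
      - name: CPU_LIMIT_MILLICORES
        valueFrom:
          resourceFieldRef:
            containerName: client-container
            resource: limits.cpu
            divisor: 1m
    volumeMounts:
      - name: podinfo
        mountPath: /etc/podinfo          # the app reads /etc/podinfo/cpu_limit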