Manage Prometheus configuration using Prometheus Operator - kubernetes-helm

I am looking for recommendations on how to manage Prometheus configuration (such as providing our custom alert rule files and reloading the config dynamically when those files are modified) using the Prometheus Operator (Helm charts).
Installing Prometheus with the Prometheus Operator generates a default configuration, so I am trying to customize it through the operator's values.yaml file.
Any suggestions?
Thanks.

The Prometheus Operator exposes a CRD called PrometheusRule. If you intend to provide your own custom rules, the best thing would be to start out with some simple rules, just to get used to the vector-centric query language PromQL. You could also expose the Prometheus web UI just to get used to the interface, have a look at metrics, play with the console, and evaluate your targets.
You do not have to worry about rules being picked up; it happens automatically, as long as your targets are healthy and the required TCP ports are open.
Also, in case you face some challenge, their GitHub page is quite responsive and they have a 'support' label for open issues.
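For reference, a minimal PrometheusRule manifest could look roughly like the sketch below; the name, namespace, and release label are placeholders, and the label must match the ruleSelector of your Prometheus resource for the operator to pick it up:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-alert-rules        # placeholder name
  namespace: monitoring            # placeholder namespace
  labels:
    release: prometheus-operator   # must match your Prometheus resource's ruleSelector
spec:
  groups:
  - name: example.rules
    rules:
    - alert: HighPodRestartRate
      expr: increase(kube_pod_container_status_restarts_total[1h]) > 5
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "Pod {{ $labels.pod }} is restarting frequently"
Once such an object is applied, the operator regenerates the rule files and triggers a Prometheus config reload, so no manual restart is needed.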

Related

Adding prometheus alerting rules to grafana

I'm struggling to get alerts working in Grafana (running in k8s).
Are there standard alerts (like a default file or so) which I can import into Grafana?
For instance, to get notified when something is wrong with the k8s cluster?
I've downloaded a file from awesome-prometheus-alerts, which provides a starting set of rules. According to the grafana doc, it can also be used in grafana.
Unfortunately, I'm unable to get it running. Any ideas how this can be accomplished?
Our setting:
We are using helm for deploying
Use of loki-stack
Configurations are being made in a values.yml file, overwriting values in grafana.ini
Grafana Loki supports Prometheus-style queries, and you can update the queries/rules in the values.yaml file and apply it to get the changes picked up. Follow this document for more information. If you want a list of PromQL queries, follow this document.
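As a rough sketch of that (the exact keys depend on the loki-stack chart version, and the rule below is adapted from awesome-prometheus-alerts), enabling the bundled Prometheus sub-chart and feeding it an alerting rule through the values file could look like:
prometheus:
  enabled: true
  serverFiles:
    alerting_rules.yml:
      groups:
      - name: kubernetes
        rules:
        - alert: KubernetesNodeNotReady
          expr: kube_node_status_condition{condition="Ready",status="true"} == 0   # needs kube-state-metrics
          for: 10m
          labels:
            severity: critical
          annotations:
            summary: "Node {{ $labels.node }} is not ready"
Then apply it with something like helm upgrade loki-stack grafana/loki-stack -f values.yml (release and repo names here are assumptions).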

How to set MaxRevisionTimeoutSeconds in Knative?

I have deployed a service using Cloud Run on GKE, which uses Knative as an abstraction over k8s. The default MaxRevisionTimeoutSeconds is set to 600s in the Knative default config, but according to this PR it is customizable.
I couldn't find anything about it in the official Knative documentation; can anybody help me out here?
UPDATE:
After digging a bit more into the Knative source code and documentation, it looks like MaxRevisionTimeoutSeconds is defined in the ConfigMap config-defaults, so it has to be updated with a custom value.
From this it looks like we can use something called an operator to modify the ConfigMap resource, but it did not work, probably because GCP does not use the operator to install the Knative components. Anyway, I went on to install the operator and then used the KnativeServing resource to overwrite config-defaults, but this also did not work when I tried re-deploying the service.
The next solution is to directly edit config-defaults using kubectl edit. I even tried doing this but encountered weird behavior: after editing the YAML, when I used kubectl describe to check the changed value, it sometimes showed the modified value, sometimes showed the old value, and sometimes didn't show that key-value pair at all. It also doesn't work when trying to re-deploy the service after this edit.
If anyone can help me with this, it would be really great.
MaxRevisionTimeoutSeconds is a cluster-global setting which enforces the max value for TimeoutSeconds on each Revision. This value exists so that cluster administrators can set upper bounds on the amount of time a single HTTP request can be in the system. Knowing an upper bound can be useful when configuring graceful shutdown settings on the HTTP routing components to prevent dropped requests during upgrades.
It's possible that Cloud Run on GKE has overridden these configurations so that they can upgrade the underlying Istio and Knative components on a predictable schedule. (If you have a 10% upgrade budget and it takes 10m to drain a component, your minimum upgrade time is probably around 110m, taking into account additional scheduling / image fetch / startup time.)
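For reference, the knob mentioned in the update above lives in the config-defaults ConfigMap in the knative-serving namespace. A minimal sketch of overriding it (1200 is just an example value, and a managed offering such as Cloud Run on GKE may reconcile the change back to its own defaults):
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-defaults
  namespace: knative-serving
data:
  max-revision-timeout-seconds: "1200"   # upper bound for any Revision's timeoutSeconds
Apply it with kubectl apply -f config-defaults.yaml, then set timeoutSeconds in the Revision template up to that maximum.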

How to connect applications with services created through Kubernetes operators?

I am looking for best practices to connect applications with services. I have a database operator which creates a service, and there is an application pod which needs to connect to it. Is the following approach going to work?
The operator injects the access details into the pods as Secret and ConfigMap.
The operator identifies application pods through label selectors (e.g., connects-to: mysql).
The application pod receives service access details through environment variables.
The operator can document the environment variables and the label selectors.
If the above flow is going to work, how can I inject values into pods?
I can see a few mechanisms. Which one would be better?
PodPreset (alpha since 2017)
Initializer
MutatingAdmissionWebhook
This is the expected interaction between controllers and actors (PodPreset can be substituted with one of the other mechanisms).
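Whichever injection mechanism is chosen, the end state on the application Pod would look roughly like the sketch below; all names here are hypothetical:
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  labels:
    connects-to: mysql              # label the operator's selector watches
spec:
  containers:
  - name: app
    image: my-app:1.0               # hypothetical image
    envFrom:
    - secretRef:
        name: mysql-credentials     # Secret created by the database operator (assumed name)
    - configMapRef:
        name: mysql-connection      # ConfigMap with host/port details (assumed name)
A PodPreset or mutating webhook would add the envFrom entries automatically based on the connects-to label, so the application only has to read environment variables.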
Your question is not completely clear. If I understand it correctly, you can use Helm with multiple values files:
helm install ./repo/ --name repo --values ./values.file
You can also pass --set to helm to add values that are not in the values file for some reason.
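For example (the chart path and the keys passed via --set are placeholders):
helm install ./repo/ --name repo --values ./values.file --set image.tag=1.2.3 --set replicaCount=2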
Looking into the question, it seems too general.
For example you can find more information about "CONTAINERS & KUBERNETES - Best practices for building Kubernetes Operators and stateful apps" here.
As per the documentation, the most important points are:
Operators exercise one of the most valuable capabilities of Kubernetes—its extensibility—through the use of CRDs and custom controllers. Operators extend the Kubernetes API to support the modernization of different categories of workloads and provide improved lifecycle management, scheduling.
Please refer to "Application management made easier with Kubernetes Operators on GCP Marketplace". As you can see, GCP has published a set of Kubernetes operators that encapsulate best practices and end-to-end solutions for specific applications.
On Stack Overflow you can find a discussion about CRDs.
As an example, please see the Postgres operator, with a comparison between the two methods of setting the Postgres Operator configuration here.
In this case "The CRD-based configuration is more powerful than the one based on ConfigMaps and should be used unless there is a compatibility requirement to use an already existing configuration"
Here you can also find information on "When to use a ConfigMap or a custom resource?"
Hope this helps.

Customizing kubernetes dashboard with company name and environment

Problem statement:
Currently we are running k8s in multiple environments, e.g. dev, uat, staging.
It becomes very difficult for us to tell which environment we are looking at just from the k8s dashboard UI.
Is there any facility to customize the k8s dashboard to indicate, somewhere in the header or footer, which cluster or environment we are using?
Since K8S is open source, you have the ability to do whatever you want. You will of course need to play with the code and build your own custom dashboard image.
You can start off from here
https://github.com/kubernetes/dashboard/tree/master/src/app/frontend
This feature was released back in 2017, with the introduction of the settings ConfigMap. You just need to set the values of the kubernetes-dashboard-settings ConfigMap in kubernetes-dashboard namespace. You don't even need to restart the dashboard service/deployment.
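A rough sketch of that edit (the data key and JSON shape vary across dashboard versions, so check what your deployment already stores there first; 'staging' is just an example value):
kubectl -n kubernetes-dashboard get configmap kubernetes-dashboard-settings -o yaml
kubectl -n kubernetes-dashboard patch configmap kubernetes-dashboard-settings \
  --type merge -p '{"data":{"_global":"{\"clusterName\":\"staging\"}"}}'
The cluster name set this way should then be displayed by the dashboard UI, which is enough to tell dev, uat, and staging apart.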

What is the difference between the CoreOS projects kube-prometheus and prometheus-operator?

The GitHub repo of the Prometheus Operator project, https://github.com/coreos/prometheus-operator/, says that
The Prometheus Operator makes the Prometheus configuration Kubernetes native and manages and operates Prometheus and Alertmanager clusters. It is a piece of the puzzle regarding full end-to-end monitoring.
kube-prometheus combines the Prometheus Operator with a collection of manifests to help getting started with monitoring Kubernetes itself and applications running on top of it.
Can someone elaborate this?
I've always had this exact same question and repeatedly bumped into both, but to be honest reading the above answer didn't clarify it for me; I needed a short explanation. I found this GitHub issue that made it crystal clear to me.
https://github.com/coreos/prometheus-operator/issues/2619
Quoting nicgirault of GitHub:
At last I realized that prometheus-operator chart was packaging kube-prometheus stack but it took me around 10 hours playing around to realize this.
Here's my summarized explanation:
"kube-prometheus" and the "Prometheus Operator Helm Chart" both do the same thing: basically the Ingress/Ingress Controller concept, applied to metrics/the Prometheus Operator.
Both are a means of easily configuring, installing, and managing a huge distributed application (the Kubernetes Prometheus stack) on Kubernetes.
What is the entire kube-prometheus stack, you ask? Prometheus, Grafana, Alertmanager, CRDs (Custom Resource Definitions), the Prometheus Operator (a software bot app), IaC alert rules, IaC Grafana dashboards, and IaC ServiceMonitor CRDs (which auto-generate Prometheus metric collection configuration and hot-import it into the Prometheus server).
(Also, when I say "easily configuring", I mean 1,000-10,000++ lines of easy-for-humans-to-understand config that generates and auto-manages 10,000-100,000 lines of machine config, with sensible defaults, monitoring configuration self-service, and distributed configuration sharding with an operator/controller to combine config and generate verbose boilerplate machine-readable config from nice human-readable config.)
If they achieve the same end goal, you might ask what's the difference between them?
https://github.com/coreos/kube-prometheus
https://github.com/helm/charts/tree/master/stable/prometheus-operator
Basically, CoreOS's kube-prometheus deploys the Prometheus Stack using Ksonnet.
Prometheus Operator Helm Chart wraps kube-prometheus / achieves the same end result but with Helm.
So which one to use?
Doesn't matter + they achieve the same end result + shouldn't be crazy difficult to start with 1 and switch to the other.
Helm tends to be faster to learn/develop basic mastery of.
Ksonnet is harder to learn/develop basic mastery of, but:
it's more idempotent (better for CICD automation) (but it's only a difference of 99% idempotent vs 99.99% idempotent.)
it has built-in templating, which means that if you have multiple clusters to manage that you want to keep consistent with each other, you can leverage ksonnet's templating to manage multiple instances of the Kube Prometheus Stack (for multiple envs) using a DRY code base with lots of code reuse. (If you only have a few envs and Prometheus doesn't need to change often, it's not completely unreasonable to keep 4 Helm values files in sync by hand. I've also seen Jinja2 templating used to template out Helm values files, but if you're going to bother with that you may as well just consider ksonnet.)
Kubernetes operators are Kubernetes-specific applications (pods) that configure, manage, and optimize other Kubernetes deployments automatically. They are implemented as custom controllers.
According to the official CoreOS website:
Operators were introduced by CoreOS as a class of software that operates other software, putting operational knowledge collected by humans into software.
The Prometheus Operator provides an easy way to deploy, configure, and monitor your Prometheus instances on a Kubernetes cluster. To do so, the Prometheus Operator introduces three types of custom resource definitions (CRDs) in Kubernetes:
Prometheus
Alertmanager
ServiceMonitor
Now, with the help of the above CRDs, you can directly create a Prometheus instance by providing kind: Prometheus, and the instance is ready to serve; likewise for Alertmanager. Without this, you would have to set up the Deployment for Prometheus yourself, with its image, configuration, and many more things.
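For instance, a minimal Prometheus custom resource (namespace, service account, and the team label are placeholders) looks roughly like:
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example
  namespace: monitoring
spec:
  replicas: 2
  serviceAccountName: prometheus          # ServiceAccount with RBAC to scrape targets
  serviceMonitorSelector:
    matchLabels:
      team: frontend                      # picks up ServiceMonitors labeled team=frontend
  resources:
    requests:
      memory: 400Mi
The operator watches this object and creates and manages the underlying StatefulSet, configuration, and reloads for you.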
The Prometheus Operator serves to make running Prometheus on top of Kubernetes as easy as possible, while preserving Kubernetes-native configuration options.
Now, kube-prometheus builds on the Prometheus Operator and provides you with minimal YAML files to create a basic setup of Prometheus, Alertmanager, and Grafana by running a single command:
git clone https://github.com/coreos/prometheus-operator.git
kubectl apply -f prometheus-operator/contrib/kube-prometheus/manifests/
By running the above commands in the kube-prometheus directory, you will get a monitoring namespace with an instance of Alertmanager, Prometheus, and Grafana for the UI. This is enough for most basic implementations, and if you need anything more specific for your application, you can add more YAMLs for the exporters you need.
kube-prometheus is more of a contribution to the prometheus-operator project; it implements the Prometheus Operator functionality very well and provides you with a complete monitoring setup for your Kubernetes cluster. You can start with kube-prometheus and extend your monitoring setup from there according to your application.
You can learn more about prometheus-operator here
As of today, 28-09-2020, this is the way to install Prometheus in a Kubernetes cluster
https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack#kube-prometheus-stack
According to official documentation, kube-prometheus-stack is a rename of prometheus-operator.
As I understood, kube-prometheus-stack also has preinstalled grafana dashboards and prometheus rules.
Note: This chart was formerly named prometheus-operator chart, now renamed to more clearly reflect that it installs the kube-prometheus project stack, within which Prometheus Operator is only one component.
Taken from https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack
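A typical installation with the renamed chart (release name and namespace are up to you):
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install monitoring prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace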
Architecturally, the containers run on Docker.
By default, container logs are managed by Docker, and the default log driver is json-file:
"log-driver": "json-file"
https://docs.docker.com/config/containers/logging/configure/
If the default json-file driver is used to manage container logs, log rotation is not performed by default. As a result, the log files stored by the json-file driver can consume a large amount of disk space for containers that generate a lot of output, which can cause the node to run out of disk space.
In this case, you can ship the logs to Elasticsearch (ES), store them there separately, and periodically delete old indices using Curator.
That cleanup can be run as a scheduled task in K8s.
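A rough sketch of such a scheduled cleanup with Elasticsearch Curator (the image tag, schedule, and the curator-config ConfigMap holding config.yml/actions.yml are all assumptions):
apiVersion: batch/v1                      # use batch/v1beta1 on clusters older than 1.21
kind: CronJob
metadata:
  name: es-curator
  namespace: logging                      # assumed namespace
spec:
  schedule: "0 1 * * *"                   # once a day
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: curator
            image: untergeek/curator:5.8.4          # assumed image/tag
            args: ["--config", "/etc/curator/config.yml", "/etc/curator/actions.yml"]
            volumeMounts:
            - name: config
              mountPath: /etc/curator
          volumes:
          - name: config
            configMap:
              name: curator-config                  # assumed ConfigMap with config.yml and actions.yml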
Another way to deal with the disk-space problem is to periodically rotate and delete old json-file logs.
Typically we limit the size and number of log files.
The settings below (usually placed in /etc/docker/daemon.json) allow a maximum of 10 log files of up to 20 MB each, so a container keeps at most 200 MB of logs:
"log-driver": "json-file", "log-opts": { "max-size": "20m", "max-file": "10" },
Note: In general, the default Docker logs are placed under
/var/lib/docker/containers/
In the same way, Kubernetes also saves logs and creates a directory structure to help you find pod-based logs, so you can find the container logs for each Pod running on a node under:
/var/log/pods/<namespace>_<pod_name>_<pod_id>/<container_name>/
When a Pod is removed, both the Docker logs under /var/lib/docker/containers/ and the logs Kubernetes creates under /var/log/pods/ are deleted.
For example, if a Pod is restarted in production, its logs are deleted, whether it stays on the original node or is rescheduled to another node.
Therefore, these logs need to be saved in ES for centralized management; in most cases R&D teams will need them for troubleshooting.