Prometheus Alertmanager - What does cluster status "disabled" mean? - kubernetes-helm

I am using kube-prometheus-stack, deployed via Helm to my K8s cluster. This includes the Prometheus Alertmanager, among others. I'm trying to configure it to send alert notifications to Slack and am running into some problems.
When I go to the Alertmanager page and select the Status tab, it shows that it's up, but that the cluster status is disabled.
What does that mean? Does it need to be enabled for it to be able to send notifications? If so, how do I enable it? Keep in mind I am using Helm, so I apply configuration via the chart's values.yaml file.
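For reference, this is roughly how I am trying to wire up the Slack receiver through values.yaml. It's a minimal sketch assuming the chart's alertmanager.config key is the right place; the webhook URL and channel are placeholders:

alertmanager:
  config:
    route:
      receiver: slack-notifications      # send everything to the Slack receiver
      group_by: ['alertname']
    receivers:
      - name: slack-notifications
        slack_configs:
          - api_url: https://hooks.slack.com/services/XXX/YYY/ZZZ   # placeholder webhook URL
            channel: '#alerts'                                      # placeholder channel
            send_resolved: true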

Related

Adding prometheus alerting rules to grafana

I'm struggling to activate alerts in Grafana (running in k8s).
Are there standard alerts (such as a default rules file) that I can import into Grafana?
For instance, to get notified when something is wrong with the k8s cluster?
I've downloaded a file from awesome-prometheus-alerts, which provides a starting set of rules. According to the Grafana docs, it can also be used in Grafana.
Unfortunately, I'm unable to get it running. Any ideas how this can be accomplished?
Our setup:
We are using Helm for deployment
We use loki-stack
Configuration is done in a values.yml file, overriding values in grafana.ini
Grafana Loki supports Prometheus queries, so you can update the queries in the values.yaml file and re-apply it to pick up the changes. Follow this document for more information. If you want a list of PromQL queries, follow this document.
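For illustration, here is a minimal sketch of what such a rule group could look like in the values file, assuming loki-stack bundles the community prometheus chart as a subchart and exposes its serverFiles section; the key names and the example alert are placeholders in the awesome-prometheus-alerts style, not taken from your actual values.yml:

prometheus:
  enabled: true
  serverFiles:
    alerting_rules.yml:                    # assumed key; older chart versions use "alerts"
      groups:
        - name: kubernetes-pods
          rules:
            - alert: KubernetesPodCrashLooping        # example rule only
              expr: increase(kube_pod_container_status_restarts_total[1m]) > 3
              for: 2m
              labels:
                severity: warning
              annotations:
                summary: Pod {{ $labels.pod }} is crash looping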

How can I log CRD reads in a GKE cluster?

According to GKE, I can enable cluster audit logs from the k8s.io API, which will forward cluster events to Cloud Logging. However, I'm unable to find RBAC logs for read requests on custom resources.
Specifically, if I have a CR foo, I seem to only be able to view create and delete events on foo. get and list are separate permissions as well (in both IAM and cluster RBAC), but those calls don't seem to be audited.
Is there a way to see those requests, and their responses, or is that not possible?
It's weird because the cluster's own kube-apiserver.log seems to log those requests:
... httplog.go:109] "HTTP" verb="GET" URI="/apis/foo.io/v1/namespaces/foo-ns/custom-resource/foo" latency="26.286746ms" userAgent="kubectl/v1.xx.x (linux/amd64)" audit-ID="baz" srcIP="1.2.3.4:55555" resp=200
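The only related knob I have found so far is the project-level Data Access audit configuration, which I assume would need something like the sketch below (the k8s.io service name is my guess for the Kubernetes API server entry, and whether get/list land in ADMIN_READ or DATA_READ may depend on the resource; please verify against the GKE audit logging docs before applying with gcloud projects set-iam-policy):

auditConfigs:
  - service: k8s.io             # assumed service name for Kubernetes API server audit logs
    auditLogConfigs:
      - logType: ADMIN_READ     # reads of metadata/configuration
      - logType: DATA_READ      # get/list/watch of resource data
      - logType: DATA_WRITE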

How to set up Prometheus Operator with Grafana to enable basic Kubernetes monitoring

I followed a bunch of tutorials on how to monitor Kubernetes with Prometheus and Grafana, all referring to a deprecated Helm operator.
According to the tutorials, Grafana comes out of the box, complete with cluster monitoring.
In practice, Grafana is not installed with the chart
helm install prometheus-operator stable/prometheus -n monitor
nor is it installed with the newer community repo
helm install prometheus-operator prometheus-community/prometheus -n monitor
I installed the Grafana chart independently
helm install grafana-operator grafana/grafana -n monitor
and through the UI I tried to connect using the in-cluster URLs
prometheus-operator-server.monitor.svc.cluster.local:80
prometheus-operator-alertmanager.monitor.svc.cluster.local:80
The data source test in the UI indicates success but produces no metrics.
Is there a ready-made Helm operator with out-of-the-box Grafana?
How can Grafana interact with Prometheus?
You've used the wrong charts. Currently the project is named kube-prometheus-stack:
https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack
If you look at values.yaml you'll notice switches for everything, including Prometheus, all the exporters, Grafana, all the standard dashboards, alerts for Kubernetes, and so on. It's all installed by one chart, and it's all linked together out of the box.
The only additional thing you might need is an Ingress/ELB for Grafana, Prometheus, and Alertmanager so you can open them without port-forwarding (don't forget to add oauth2-proxy or something similar, because they are all exposed without a password by default).
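As a rough sketch, the relevant switches in the chart's values.yaml look something like the following (key names are what the chart currently uses, but check its own values.yaml; the hostname and password are placeholders):

grafana:
  enabled: true
  adminPassword: change-me            # placeholder; use a real secret
  ingress:
    enabled: true                     # optional: expose Grafana without port-forwarding
    hosts:
      - grafana.example.com           # placeholder hostname
alertmanager:
  enabled: true
prometheus:
  enabled: true
nodeExporter:
  enabled: true
kubeStateMetrics:
  enabled: true

Installing with helm install monitoring prometheus-community/kube-prometheus-stack -f values.yaml then brings up Prometheus, Alertmanager and Grafana already wired together, with the Prometheus data source provisioned in Grafana automatically.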
I wouldn't bother; look at PaaS offerings like Datadog, New Relic, etc. What you are describing becomes a costly nightmare at scale. It's just not worth the hassle for what you get, IMHO.

Customizing kubernetes dashboard with company name and environment

Problem statement:
Currently we are running k8s in multiple environments, e.g. dev, uat, staging.
It is very difficult for us to identify which one we are looking at just from the k8s dashboard UI.
Is there any facility to customize the k8s dashboard so that a header or footer indicates which cluster or environment we are using?
Since K8s is open source, you should have the ability to do whatever you want. You will, of course, need to play with the code and build your own custom dashboard image.
You can start off from here
https://github.com/kubernetes/dashboard/tree/master/src/app/frontend
This feature was released back in 2017, with the introduction of the settings ConfigMap. You just need to set the values of the kubernetes-dashboard-settings ConfigMap in the kubernetes-dashboard namespace. You don't even need to restart the dashboard service/deployment.
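A minimal sketch of what that ConfigMap could look like, assuming a recent kubernetes-dashboard version where the global settings (including the cluster name shown in the header) are stored as JSON under a _global key; the cluster name is a placeholder and the exact key layout varies between dashboard versions:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kubernetes-dashboard-settings
  namespace: kubernetes-dashboard
data:
  # "clusterName" is displayed in the dashboard header; "staging" is a placeholder
  _global: '{"clusterName":"staging","itemsPerPage":10,"labelsLimit":3}'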

How to change fluentd config for GKE-managed logging agent?

I have a container cluster in Google Container Engine with the Stackdriver logging agent enabled. It is correctly pulling stdout logs from my containers. Now I would like to change the fluentd config to specify a log parser so that the logs shown in the GCP Logging view will have the correct severity and component.
Following this Stackdriver logging guide from kubernetes.io, I have attempted to:
Get the fluentd ConfigMap as a yml file
Add a new <filter> according to my log4js log format
Create a new ConfigMap named fluentd-cm-2 in the kube-system namespace
Edit the DaemonSet for fluentd and set its ConfigMap to fluentd-cm-2. I did this using kubectl edit ds instead of kubectl replace -f because the latter failed with the error message "the object has been modified", even after getting a fresh copy of the DaemonSet yaml.
Unexpected result: the DaemonSet is restarted, but its configuration is reverted to the original ConfigMap, so my changes did not take effect.
I have also tried editing the ConfigMap directly (kubectl edit cm fluentd-gcp-config-v1.1 --namespace kube-system) and saved it, but it was also reverted.
I noticed that the DaemonSet and ConfigMap for fluentd are tagged with addonmanager.kubernetes.io/mode: Reconcile. I would conclude that GKE has overwritten my settings because of this "reconcile" mode.
So, my question is: how can I change the fluentd configuration in a Google Container Engine cluster, when the logging agent was installed by GKE on cluster provisioning?
Please take a look at the Prerequisites section on the documentation page you mentioned. It's mentioned there that on GKE you cannot change the default Stackdriver Logging integration. The reason is that GKE maintains this configuration: it updates the agent, watches its health, and so on. It's not possible to provide the same level of support for all possible configurations.
However, you can always disable the default integration and deploy your own, patched version of the DaemonSet. You can find out how to disable the default integration in the GKE documentation:
gcloud beta container clusters update [CLUSTER-NAME] \
--logging-service=none
Note that after you disable the default integration, you have to maintain the new deployment yourself: update the agent, set the resources, and watch its health.
Here is a solution for using your own fluentd DaemonSet that is very much like the one included with GKE.
https://cloud.google.com/solutions/customizing-stackdriver-logs-fluentd
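To give an idea of the shape of the change, here is a sketch of the kind of parser filter the custom ConfigMap could carry. The ConfigMap name, the match pattern, and the assumption that the log4js output is JSON in the container "log" field are all placeholders, and the exact directives depend on the fluentd version the DaemonSet ships:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-gcp-config-custom     # hypothetical name for the patched config
  namespace: kube-system
data:
  extra.conf: |
    # Parse JSON-formatted log4js lines out of the container "log" field
    <filter kubernetes.**>
      @type parser
      key_name log
      reserve_data true
      <parse>
        @type json
      </parse>
    </filter>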