Dashboards and visualizations get lost when Grafana restarts on DC/OS - grafana

I have used this documentation in order to deploy Prometheus with Grafana on the cluster.
The problem is that whenever we restart Prometheus and Grafana with a changed configuration, all our dashboards and visualizations are gone.
Is there a workaround to persist the dashboards and visualizations?

You need to define volumes for the Grafana/Prometheus containers so that they store their data persistently.
Doc: https://docs.mesosphere.com/1.7/administration/storage/mount-disk-resources/
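For illustration, the relevant fragment of a Marathon app definition with a local persistent volume might look like the sketch below (Marathon itself takes the JSON equivalent of this; the app id, volume name and size are made up). The usual pattern is one persistent volume plus a Docker volume that maps Grafana's data directory onto it:

id: /monitoring/grafana
container:
  type: DOCKER
  docker:
    image: grafana/grafana
  volumes:
    # local persistent volume, kept across task restarts on the same agent
    - containerPath: grafanadata
      mode: RW
      persistent:
        size: 512                  # MiB
    # map Grafana's data directory onto the persistent volume above
    - containerPath: /var/lib/grafana
      hostPath: grafanadata
      mode: RW
residency:
  taskLostBehavior: WAIT_FOREVER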

Related

Prometheus Grafana Pod metrics

We are using AWS EKS to deploy our application in a Kubernetes cluster, and we have set up a Grafana application using the AWS managed service for Prometheus.
However, most of the application-related metrics show no data.
The namespace is visible on some of the dashboards, but it seems that none of the applications are captured by Prometheus.
These are Java applications built on top of Spring Boot.
I tried many other dashboards, but no luck.
Does this need some changes in the application's deployment.yaml?
Any help on this is highly appreciated.
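(For reference, whether deployment.yaml needs changes depends on how the Prometheus instance discovers targets. One common pattern is annotation-based pod discovery plus a metrics endpoint exposed by the app; the fragment below is a hypothetical sketch of that, where every name, image and port is illustrative, and /actuator/prometheus assumes Spring Boot Actuator with the Micrometer Prometheus registry on the classpath.)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-spring-app                # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-spring-app
  template:
    metadata:
      labels:
        app: my-spring-app
      annotations:
        prometheus.io/scrape: "true"              # only honoured if the scrape config looks for it
        prometheus.io/path: /actuator/prometheus
        prometheus.io/port: "8080"
    spec:
      containers:
        - name: app
          image: registry.example.com/my-spring-app:latest  # illustrative image
          ports:
            - containerPort: 8080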

Access to Loki in Grafana Cloud

I decided to try out Grafana Cloud, and especially Loki. As a first experiment, I set up pushing my Nextcloud logs to it.
The push seems to be working; at least I see an ingest rate on the account dashboard.
Now I would like to explore the logs, but I don't have any clue where. Maybe I'm missing some links - how can I access the logs now? Is there a URL to access the ingested logs?
The same question will probably arise for accessing the provided Prometheus instance.
Generally, Grafana is the "UI" for Loki, so you need some Grafana instance.
You very likely also have a Grafana instance in Grafana Cloud, where a Loki datasource is preconfigured, so you can explore logs there (with Grafana's Explore feature, for example).
Alternatively, the Loki overview in Grafana Cloud has the Grafana data source settings, which you can use to configure a Loki datasource in any other Grafana instance (e.g. you can start your own Grafana instance locally).
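If you do run your own Grafana instance, the datasource can also be provisioned from a file rather than through the UI. A minimal sketch, assuming the URL, user and API key come from your Grafana Cloud Loki settings page (all values below are placeholders):

# e.g. /etc/grafana/provisioning/datasources/loki.yaml
apiVersion: 1
datasources:
  - name: Grafana Cloud Loki
    type: loki
    access: proxy
    url: https://logs-prod-us-central1.grafana.net   # placeholder; use the URL from your account
    basicAuth: true
    basicAuthUser: "123456"                          # placeholder user / instance ID
    secureJsonData:
      basicAuthPassword: "<your Grafana Cloud API key>"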

Grafana dashboard doesn't display data in Rancher UI

I enabled monitoring for my project in the Rancher UI, and it installed successfully. But when I click "Go to Grafana" at my workload (such as nginx), it takes me to the Grafana dashboard, and Grafana shows nothing: 0 CPU, 0 memory, 0 networking, ...
Why doesn't it have data?
And how can I know the consumed quota of my resources (workload, service, pod)?
Please see my screenshots:
Many thanks
Prometheus is a necessary component for using Grafana.
You can check this blogpost:
Kubernetes Monitoring with Prometheus, Grafana and Rancher
Prometheus is an open-source application for monitoring systems and generating alerts. ... Prometheus will scrape targets at designated intervals and store the information in a time-series database.
Grafana is also open source and runs as a web application. ... Grafana makes it easy to create graphs and assemble those graphs into dashboards.
Check if Prometheus is turned on and whether it is configured correctly:
Configuring Project Monitoring
From the Global view, navigate to the project that you want to configure project monitoring.
Select Tools > Monitoring in the navigation bar.
Select Enable to show the Prometheus configuration options. Enter in your desired configuration options.
Click Save.
Also check these settings:
Prometheus Configuration
Enable Persistent Storage for Prometheus: whether or not to configure storage for Prometheus so that metrics can be retained even if the Prometheus pod fails.
Enable Persistent Storage for Grafana: whether or not to configure storage for Grafana so that the Grafana dashboards and configuration can be retained even if the Grafana pod fails.

Best practices when trying to implement custom Kubernetes monitoring system

I have two Kubernetes clusters representing dev and staging environments.
Separately, I am also deploying a custom DevOps dashboard which will be used to monitor these two clusters. On this dashboard I will need to show information such as:
RAM/HD Space/CPU usage of each deployed Pod in each environment
Pod health (as in if it has too many container restarts etc)
Pod uptime
All these stats have to be at the cluster level and preferably also per namespace. That is, if I query for a particular namespace, I have to get all the resource usage of that namespace.
So the webservice layer of my dashboard will send a service request to the master node of my respective cluster in order to fetch this information.
Another thing I need is to implement real time notifications in my DevOps dashboard. Every time a container fails, I need to catch that event and notify relevant personnel.
I have been reading around, and two things that pop up a lot are Prometheus and Metrics Server. Do I need both, or will one do? I set up Prometheus on a local cluster, but I can't find any endpoints it exposes that could be called by my dashboard service. I'm also trying to set up Prometheus Alertmanager, but so far it hasn't worked as expected; I'm trying to fix it now. I just wanted to check whether these technologies have the capabilities to meet my requirements.
Thanks!
I don't know why you are considering your own custom monitoring system; the Prometheus Operator provides all the functionality you mentioned.
You will end up only with your own Grafana dashboard showing all the required information.
If you need custom notifications, you can set them up in Alertmanager by creating the appropriate prometheusrules.monitoring.coreos.com resources; you can find a lot of preconfigured PrometheusRules in kubernetes-mixin.
Using labels and namespaces in Alertmanager, you can set up a correct route to notify the person responsible for a given deployment.
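For example, a minimal Alertmanager configuration routing on a namespace label might look like the sketch below (the team names, addresses and SMTP host are all illustrative):

global:
  smtp_smarthost: 'smtp.example.com:587'   # illustrative SMTP relay
  smtp_from: alertmanager@example.com
route:
  receiver: default
  routes:
    - match:
        namespace: payments                # alerts from the payments namespace...
      receiver: payments-team              # ...go to the team that owns it
receivers:
  - name: default
    email_configs:
      - to: ops@example.com
  - name: payments-team
    email_configs:
      - to: payments-oncall@example.com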
"Do I need both or will one do?" - yes, you need both: Prometheus collects and aggregates metrics, while metrics-server exposes metrics from your cluster nodes for Prometheus to scrape.
If you have problems with Prometheus, Alertmanager and so on, consider using a Helm chart as an entry point.
Prometheus + Grafana are a pretty standard setup.
Installing kube-prometheus or prometheus-operator via Helm will give you Grafana, Alertmanager, node-exporter and kube-state-metrics by default, all set up for Kubernetes metrics.
Configure Alertmanager to do something with the alerts. SMTP is usually the first thing set up, but I would recommend some sort of event manager if this is a service people need to rely on.
Although a dashboard isn't part of your requirements, this will inform how you can connect to Prometheus as a data source. There is documentation on adding a Prometheus data source to Grafana.
There are a number of prebuilt dashboards available to add to Grafana, and there are some dashboards to visualise Alertmanager too.
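As with any datasource, connecting Grafana to Prometheus can be done in the UI or via a provisioning file. A minimal sketch, assuming an in-cluster Prometheus service (the service name and namespace below are illustrative; check what your chart actually created):

# e.g. /etc/grafana/provisioning/datasources/prometheus.yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus-server.monitoring.svc:9090   # illustrative in-cluster service URL
    isDefault: true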
Your external service won't be querying the metrics endpoints directly; it will be querying the data collected and stored by Prometheus inside your cluster. To access the API externally, you will need to set up an external path to the Prometheus service. This can be configured via an ingress controller in the Helm deployment:
prometheus.ingress.enabled: true
You can do the same for the alertmanager API and grafana if needed.
alertmanager.ingress.enabled: true
grafana.ingress.enabled: true
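In values.yaml form, those flags for a prometheus-operator-style chart look roughly like the sketch below (hostnames are placeholders and the exact keys vary between chart versions, so check your chart's default values):

# values.yaml fragment; applied with something like:
#   helm upgrade --install monitoring stable/prometheus-operator -f values.yaml
prometheus:
  ingress:
    enabled: true
    hosts:
      - prometheus.example.com
alertmanager:
  ingress:
    enabled: true
    hosts:
      - alertmanager.example.com
grafana:
  ingress:
    enabled: true
    hosts:
      - grafana.example.com

Once exposed, your dashboard's webservice can query Prometheus's HTTP API, e.g. GET https://prometheus.example.com/api/v1/query?query=up.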
You could use Grafana outside the cluster as your dashboard via the same prometheus ingress if it proves useful.

How does Grafana import old data when I restart Prometheus?

I use Grafana to show metrics from Prometheus.
But when I restart the Prometheus server, Grafana will not draw the data that was scraped before the restart.
How can I make Grafana draw all the data that Prometheus has scraped?
I don't think Grafana knows or cares about Prometheus restarts. Are you running Prometheus in Docker? Do you have the Prometheus storage set to persistent storage? Grafana will just graph the data it gets from the respective data store.
The correct answer is addressed here: How to persist data in Prometheus running in a Docker container?
In a nutshell, you need to launch the Prometheus Docker container with its data volume (/prometheus/ by default) mounted persistently, so you don't lose data upon restart. I smashed my head over it for a week, and finally I got it to work :)
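A minimal docker-compose sketch of that (service and volume names are illustrative; /prometheus is Prometheus's default data directory):

services:
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      - prometheus-data:/prometheus                        # persist the TSDB across restarts
      - ./prometheus.yml:/etc/prometheus/prometheus.yml    # your scrape config
volumes:
  prometheus-data:                                         # named volume managed by Docker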