How can Grafana show old data after I restart Prometheus? - grafana

I use Grafana to show metrics from Prometheus.
But when I restart the Prometheus server, Grafana no longer draws the data that was scraped before the restart.
How can I make Grafana draw all the data that Prometheus has scraped?

I don't think Grafana knows or cares about Prometheus restarts. Are you running Prometheus in Docker? Is the Prometheus storage on a persistent volume? Grafana will just graph the data it gets from the respective data store.

The correct answer is covered here: How to persist data in Prometheus running in a Docker container?
In a nutshell, you need to launch the Prometheus container with its data directory (/prometheus by default) mounted on persistent storage, so you don't lose data on restart. I banged my head against this for a week before I finally got it to work :)
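As a rough sketch of that idea, the Python snippet below assembles a docker run command that mounts a named volume at /prometheus. The volume name prometheus-data, the container name prometheus and the published port 9090 are assumptions chosen for illustration; adjust them to your setup.

```python
import shlex


def prometheus_run_command(volume="prometheus-data", data_dir="/prometheus"):
    """Build a `docker run` argv that keeps Prometheus data in a named volume.

    The volume and container names are placeholders; data_dir is the
    default storage path of the prom/prometheus image.
    """
    return [
        "docker", "run", "-d",
        "--name", "prometheus",
        "-p", "9090:9090",
        # A named volume outlives the container, so data survives restarts.
        "-v", f"{volume}:{data_dir}",
        "prom/prometheus",
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it by hand.
    print(shlex.join(prometheus_run_command()))
```

A bind mount to a host directory (for example -v /some/host/path:/prometheus) should work the same way, as long as the user inside the container can write to that directory.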

Related

Access to Loki in Grafana Cloud

I just decided to try out Grafana Cloud, and Loki in particular. As a first experiment I set up pushing my Nextcloud logs to it.
The push seems to be working; at least I see an ingest rate on the account dashboard.
Now I would like to explore the logs but have no clue where. Maybe I'm missing some links. How can I access the logs now? Is there a URL for the ingested logs?
The same question will probably come up for the provided Prometheus instance.
Generally, Grafana is the "UI" for Loki, so you need a Grafana instance.
You very likely also have a Grafana instance in Grafana Cloud with the Loki data source preconfigured, so you can explore the logs there (with Grafana's Explore feature, for example).
Alternatively, the Loki overview in Grafana Cloud shows Grafana Data Source settings, which you can use to configure the Loki data source in any other Grafana instance (e.g. one you run locally).

Moving Logs into a Kubernetes Cluster

I have Grafana running inside a Kubernetes cluster and I want to push logs from outside Kubernetes (apps not running in K8s, databases, etc.) into the cluster so I can view them in Grafana. What's the best way of doing this?
Grafana is a GUI for reporting on data stored in other databases. It sounds like you are capturing metrics from the cluster and storing them in a separate database. If you are running Prometheus, that is the database holding Grafana's time-series data. Depending on the volume of data, you may also end up running a long-term storage system such as Thanos to retain it over time.
Back to logging. Similarly, to use Grafana for logs you'll need some kind of logging database. The most popular is the formerly open-source ELK (Elasticsearch, Logstash, Kibana) stack. You can now use OpenSearch, an open-source fork of Elasticsearch and Kibana. Most K8s distributions come with Fluentd, which replaces Logstash for shipping data. You can also install Fluentd or Fluent Bit on any host to send data to this stack. You'll find that Grafana is not the best tool for log analysis, so most people use Kibana (OpenSearch Dashboards). You can use Grafana as well; it's just painful, IMO.
Another option, if you don't want to run ELK, is Grafana Loki, another open-source database for logs. It's a lot simpler, but also more limited in how you can query the logs because of the way it indexes. It works nicely with Grafana, but since it does not do full-text indexing, querying will be somewhat limited.
Hope this is helpful, let me know if you have questions!

Best practices when trying to implement custom Kubernetes monitoring system

I have two Kubernetes clusters representing dev and staging environments.
Separately, I am also deploying a custom DevOps dashboard which will be used to monitor these two clusters. On this dashboard I will need to show information such as:
RAM/HD Space/CPU usage of each deployed Pod in each environment
Pod health (as in if it has too many container restarts etc)
Pod uptime
All these stats have to be available at cluster level and preferably also per namespace. That is, if I query for a particular namespace, I should get all the resource usage of that namespace.
So the webservice layer of my dashboard will send a service request to the master node of my respective cluster in order to fetch this information.
Another thing I need is to implement real time notifications in my DevOps dashboard. Every time a container fails, I need to catch that event and notify relevant personnel.
I have been reading around, and two things that come up a lot are Prometheus and Metrics Server. Do I need both or will one do? I set up Prometheus on a local cluster but I can't find any endpoints it exposes that my dashboard service could call. I'm also trying to set up Prometheus Alertmanager, but so far it hasn't worked as expected; I'm trying to fix that now. I just wanted to check whether these technologies can meet my requirements.
Thanks!
I don't know why you are considering building your own custom monitoring system. The Prometheus Operator provides all the functionality you mentioned.
You will end up needing only your own Grafana dashboard with all the required information.
If you need custom notifications, you can set them up in Alertmanager by creating the appropriate PrometheusRule resources (prometheusrules.monitoring.coreos.com); you can find a lot of preconfigured rules in kubernetes-mixin.
Using labels and namespaces in Alertmanager, you can set up a route that notifies the person responsible for a given deployment.
Do I need both or will one do? They do different jobs: Prometheus scrapes, stores and aggregates metrics, while Metrics Server exposes current CPU and memory usage from your cluster nodes through the Kubernetes Metrics API.
If you have problems with Prometheus, Alertmanager and so on, consider using a Helm chart as a starting point.
Prometheus + Grafana are a pretty standard setup.
Installing kube-prometheus or prometheus-operator via Helm will give you Grafana, Alertmanager, node-exporter and kube-state-metrics by default, all set up for Kubernetes metrics.
Configure Alertmanager to do something with the alerts. SMTP is usually the first thing people set up, but I would recommend some sort of event manager if this is a service people need to rely on.
Although a Grafana dashboard isn't part of your requirements, it will show you how to connect to Prometheus as a data source. There is documentation on adding a Prometheus data source to Grafana.
There are a number of prebuilt dashboards available to add to Grafana, including some that visualise Alertmanager.
Your external service won't be scraping the metrics itself; it will be querying the data that Prometheus has collected and stored inside your cluster. To access the API externally, you will need to set up an external path to the Prometheus service. This can be configured via an ingress in the Helm deployment:
prometheus.ingress.enabled: true
You can do the same for the alertmanager API and grafana if needed.
alertmanager.ingress.enabled: true
grafana.ingress.enabled: true
You could use Grafana outside the cluster as your dashboard via the same prometheus ingress if it proves useful.
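Once the ingress is in place, an external dashboard service could call the Prometheus HTTP API along these lines. This is only a sketch: the base URL is a placeholder for whatever host your ingress exposes, and it assumes the cAdvisor metric container_memory_working_set_bytes with namespace and pod labels is being scraped, which a kube-prometheus style install usually provides.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Return the instant-query URL of the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def parse_vector(payload):
    """Map each series' pod label to its value from an instant-query response."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error')}")
    return {
        series["metric"].get("pod", ""): float(series["value"][1])
        for series in payload["data"]["result"]
    }


def memory_by_pod(base_url, namespace):
    """Fetch working-set memory per pod in one namespace (needs network access)."""
    promql = (
        "sum by (pod) "
        f'(container_memory_working_set_bytes{{namespace="{namespace}"}})'
    )
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_vector(json.load(resp))
```

For example, memory_by_pod("http://prometheus.example.com", "dev") would return a dict of pod names to bytes, where prometheus.example.com stands in for your own ingress host. CPU usage, restart counts and uptime can be fetched the same way with different PromQL expressions.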

Dashboards and visualisations get lost when Grafana restarts on DC/OS

I have used this documentation in order to deploy Prometheus with Grafana on the cluster.
The problem is that whenever we restart Prometheus and Grafana with a changed configuration, all our dashboards and visualizations are gone.
Is there a workaround where we can persist the dashboards and visualizations?
You need to define volumes, which will be used in the Grafana/Prometheus containers to store data persistently.
Doc: https://docs.mesosphere.com/1.7/administration/storage/mount-disk-resources/

Export PM2 Cluster Stats to Prometheus

I am trying to add monitoring to a Node.js PM2 cluster. I am looking for aggregated stats in Prometheus, which I will then use in Grafana.
I have been able to configure prom-client and get metrics for a single process into Prometheus and Grafana, but not for a PM2 cluster.
I looked at https://github.com/siimon/prom-client/issues/165 and https://github.com/siimon/prom-client/issues/80, and both say it's not possible.
Is there any other way to do it? I also looked at https://github.com/redar9/pm2-cluster-prometheus but couldn't get it working either.
Following https://github.com/Unitech/pm2/issues/2035, I was able to find out in my script which instance is the master and which is the slave, but I'm not sure how to go on from there.
Any help is appreciated.
I came up with this solution.
It correctly collects metrics across all instances of a PM2 cluster.
Unlike with the cluster module, there is no direct access to the master process in PM2. To return metrics for the whole cluster, you can make IPC calls from the active instance to the rest of them and wait until all their locally collected metrics have been sent. Finally, you have to aggregate all the received metrics.
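The final aggregation step might look roughly like the sketch below. It is written in Python only to illustrate the logic and is not the linked solution; the sample shape (name, labels, value) and the function names are assumptions. Summing is appropriate for counters; gauges and histograms would need their own merge rules.

```python
from collections import defaultdict


def aggregate_counters(per_instance_samples):
    """Sum counter samples reported by several worker instances.

    Each instance reports a list of (metric_name, labels_dict, value)
    tuples. Samples sharing name and labels are added together.
    """
    totals = defaultdict(float)
    for samples in per_instance_samples:
        for name, labels, value in samples:
            key = (name, tuple(sorted(labels.items())))
            totals[key] += value
    return dict(totals)


def to_exposition(totals):
    """Render aggregated samples in the Prometheus text exposition format."""
    lines = []
    for (name, labels), value in sorted(totals.items()):
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        if label_str:
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"
```

The instance that receives the scrape would collect one sample list per worker over IPC, pass them to aggregate_counters, and return the rendered text to Prometheus.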
Node cluster is incompatible with the "pull model" of Prometheus, so a solution is to make Node push data to a "collector", from which Prometheus pulls data. For example, statsd should work.
The idea in sketch:
node_instance -> statsd_exporter <- Prometheus
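The push side of that sketch could look something like this. It is shown in Python for illustration; a Node process would send the same datagram. It assumes statsd_exporter is listening for StatsD traffic on UDP port 9125 on the local host, which is its default as far as I know, and the metric name is made up.

```python
import socket


def format_counter(name, value=1):
    """Encode a counter increment in the StatsD line format: <name>:<value>|c"""
    return f"{name}:{value}|c"


def send_counter(name, value=1, host="127.0.0.1", port=9125):
    """Send one counter increment over UDP (fire-and-forget, no reply expected)."""
    payload = format_counter(name, value).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


if __name__ == "__main__":
    # Every worker calls this independently; the exporter sums the increments.
    send_counter("http_requests")
```

Because every worker pushes to the same exporter, Prometheus only has one target to scrape and the per-process aggregation problem goes away.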