Export PM2 Cluster Stats to Prometheus - grafana

I am trying to add monitoring to a Node.js PM2 cluster where I am looking for aggregated stats in prometheus which I will then import in Grafana.
I have been able to configure prom-client and get metrics for a single process to prometheus and grafana but not a pm2 cluster.
I referred https://github.com/siimon/prom-client/issues/165 and https://github.com/siimon/prom-client/issues/80 and both says its not possible.
Is there any other way to do it? I also referred https://github.com/redar9/pm2-cluster-prometheus but can't get it working as well.
I referred https://github.com/Unitech/pm2/issues/2035 and I was able to use it in my script and find which is the master and which is the slave. But not sure how I go ahead from there.
Any help is appreciated.

I've came up with this solution.
It correctly collects metrics across all instances of PM2 cluster.
Instead of cluster module there is no direct access to the master process in pm2. To return metrics for the whole cluster you can do IPC calls from the active instance to the rest of them and wait while all their locally collected metrics will be sent. Finally you have to aggregate all received metrics.

Node cluster is incompatible with the "pull model" of Prometheus, so a solution is to make node push data to an "collector", from which Prometheus pulls data. For example, statsd should work.
The idea in sketch:
node_instance -> statsd_exporter <- Prometheus

Related

Export logs of Kubernetes cronjob to a path after each run

I currently have a Cronjob that has a job that schedule at some period of time and run in a pattern. I want to export the logs of each pod runs to a file in the path as temp/logs/FILENAME
with the FILENAME to be the timestamp of the run being created. How am I going to do that? Hopefully to provide a solution. If you would need to add a script, then please use python or shell command. Thank you.
According to Kubernetes Logging Architecture:
In a cluster, logs should have a separate storage and lifecycle
independent of nodes, pods, or containers. This concept is called
cluster-level logging.
Cluster-level logging architectures require a separate backend to
store, analyze, and query logs. Kubernetes does not provide a native
storage solution for log data. Instead, there are many logging
solutions that integrate with Kubernetes.
Which brings us to Cluster-level logging architectures:
While Kubernetes does not provide a native solution for cluster-level
logging, there are several common approaches you can consider. Here
are some options:
Use a node-level logging agent that runs on every node.
Include a dedicated sidecar container for logging in an application pod.
Push logs directly to a backend from within an application.
Kubernetes does not provide log aggregation of its own. Therefore, you need a local agent to gather the data and send it to the central log management. See some options below:
Fluentd
ELK Stack
You can find all logs that PODs are generating at /var/log/containers/*.log
on each Kubernetes node. You could work with them manually if you prefer, using simple scripts, but you will have to keep in mind that PODs can run on any node (if not restricted), and nodes may come and go.
Consider sending your logs to an external system like ElasticSearch or Grafana Loki and manage them there.

Best practices when trying to implement custom Kubernetes monitoring system

I have two Kubernetes clusters representing dev and staging environments.
Separately, I am also deploying a custom DevOps dashboard which will be used to monitor these two clusters. On this dashboard I will need to show information such as:
RAM/HD Space/CPU usage of each deployed Pod in each environment
Pod health (as in if it has too many container restarts etc)
Pod uptime
All these stats have to be at a cluster level and also per namespace, preferably. As in, if I query a for a particular namespace, I have to get all the resource usages of that namespace.
So the webservice layer of my dashboard will send a service request to the master node of my respective cluster in order to fetch this information.
Another thing I need is to implement real time notifications in my DevOps dashboard. Every time a container fails, I need to catch that event and notify relevant personnel.
I have been reading around and two things that pop up a lot are Prometheus and Metric Server. Do I need both or will one do? I set up Prometheus on a local cluster but I can't find any endpoints it exposes which could be called by my dashboard service. I'm also trying to set up Prometheus AlertManager but so far it hasn't worked as expected. Trying to fix it now. Just wanted to check if these technologies have the capabilities to meet my requirements.
Thanks!
I don't know why you are considering your own custom monitoring system. Prometheus operator provides all the functionality that you mentioned.
You will end up only with your own grafana dashboard with all required information.
If you need custom notification you can set it up in Alertmanager creating correct prometheusrules.monitoring.coreos.com, you can find a lot of preconfigured prometheusrules in kubernetes-mixin
.
Using labels and namespaces in Alertmanager you can setup a correct route to notify person responsible for a given deployment.
Do I need both or will one do?, yes, you need both - Prometheus collects and aggregates metric when Metrick server exposes metrics from your cluster node for your Prometheus to scrape it.
If you have problems with Prometheus, Alertmanger and so on consider using helm chart as entrypoint.
Prometheus + Grafana are a pretty standard setup.
Installing kube-prometheus or prometheus-operator via helm will give you
Grafana, Alertmanager, node-exporter and kube-state-metrics by default and all be setup for kubernetes metrics.
Configure alertmanager to do something with the alerts. SMTP is usually the first thing setup but I would recommend some sort of event manager if this is a service people need to rely on.
Although a dashboard isn't part of your requirements, this will inform how you can connect into prometheus as a data source. There is docco on adding prometheus data source for grafana.
There are a number of prebuilt charts available to add to Grafana. There are some charts to visualise alertmanager too.
Your external service won't be querying the metrics directly with prometheus, in will be querying the collected data in prometheus stored inside your cluster. To access the API externally you will need to setup an external path to the prometheus service. This can be configured via an ingress controller in the helm deployment:
prometheus.ingress.enabled: true
You can do the same for the alertmanager API and grafana if needed.
alertmanager.ingress.enabled: true
grafana.ingress.enabled: true
You could use Grafana outside the cluster as your dashboard via the same prometheus ingress if it proves useful.

Live monitoring of container, nodes and cluster

we are using k8s cluster for one of our application, cluster is owned by other team and we dont have full control over thereā€¦ We are trying to find out metrics around resource utilization (CPU and memory), detail about running containers/pods/nodes etc. Need to find out how many parallel containers are running. Problem is they have exposed monitoring of cluster via Prometheus but with Prometheus we are not getting live data, it does not have info about running containers.
My query is , what is that API which is by default available in k8s cluster and can give all what we need. We dont want to read data form another client like Prometheus or anything else, we want to read metrics directly from cluster so that data is not stale. Any suggestions?
As you mentioned you will need metrics-server (or heapster) to get those information.
You can confirm if your metrics server is running kubectl top nodes/pods or just by checking if there is a heapster or metrics-server pod present in kube-system namespace.
Also the provided command would be able to show you the information you are looking for. I wont go into details as here you can find a lot of clues and ways of looking at cluster resource usage. You should probably take a look at cadvisor too which should be already present in the cluster. It exposes a web UI which exports live information about all the containers on the machine.
Other than that there are probably commercial ways of acheiving what you are looking for, for example SignalFx and other similar projects - but this will probably require the cluster administrator involvement.

Service Level metrics Prometheus in k8

would like to see k8 Service level metrics in Grafana from underlying prometheus server.
For instance:
1) If i have 3 application pods exposed through a service i would like to see service level metrics for CPU,memory & network I/O pressure ,Total # of requests,# of requests failed
2)Also if i have group of pods(replicas) related to an application which doesn"t have Service on top of them would like to see the aggregated metrics of the pods related to that application in a single view on grafana
What would be the prometheus queries to achieve the same
Service level metrics for CPU, memory & network I/O pressure
If you have Prometheus installed on your Kubernetes cluster, all those statistics are being already collected by Prometheus. There are many good articles about how to install and how to use Kubernetes+Prometheus, try to check that one, as an example.
Here is an example of a request to fetch container memory usage:
container_memory_usage_bytes{image="CONTAINER:VERSION"}
Total # of requests,# of requests failed
Those are service-level metrics, and for collecting them, you need to use Prometheus Exporter created especially for your service. Check the list with exporters, find one which you need for your service and follow its instruction.
If you cannot find an Exporter for your application, you can write it yourself, here is an official documentation about it.
application which doesn"t have Service on top of them would like to see the aggregated metrics of the pods related to that application in a single view on grafana
It is possible to combine any graphics in a single view in Grafana using Dashboards and Panels. Check an official documentation, all that topics pretty detailed and easy to understand.
Aggregation can be done by Prometheus itself by aggregation operations.
All metrics from Kubernetes has labels, so you can group by them:
sum(http_requests_total) by (application, group), where application and group is labels.
Also, here is an official Prometheus instruction about how to add Prometheus to Grafana as a Datasourse.

Spring boot and prometheus

I am trying to figure out how to best collect metrics from a set of spring boot based services running within a Kubernetes cluster. Looking at the various docs, it seems that the choice for internal monitoring is between Actuator or Spectator with metrics being pushed to an external collection store such as Redis or StatsD or pulled, in the case of Prometheus.
Since the number of instances of a given service is going to vary, I dont see how Prometheus can be configured to poll those running services since it will lack knowledge of them. I am also building around a Eureka service registry so not sure if that is polled first in this configuration.
Any real world insight into this kind of approach would be welcome.
You should use the Prometheus java client (https://www.robustperception.io/instrumenting-java-with-prometheus/) for instrumenting. Approaches like redis and statsd are to be avoided, as they mean hitting the network on every single event - greatly limiting what you can monitor.
Use file_sd service discovery in Prometheus to provide it with a list of targets from Eureka (https://www.robustperception.io/using-json-file-service-discovery-with-prometheus/), though if you're using Kubernetes like your tag hints Prometheus has a direct integration there.