Prometheus zookeeper monitoring - apache-zookeeper

I want to move ZooKeeper to being monitored by Prometheus.
I deployed jmx-exporter (sscaling/jmx-prometheus-exporter:0.1.0) and got most of the metrics, but some are missing, for example zookeeper.approximate_data_size and the ParNew metrics of the GarbageCollector.
For example, I get this ParNew metric from Logstash with the same JMX exporter:
java_lang_GarbageCollector_LastGcInfo_memoryUsageAfterGc_used{name="ParNew",key="Par Survivor Space",}
but from ZooKeeper I only get Copy metrics:
java_lang_GarbageCollector_LastGcInfo_memoryUsageAfterGc_used{name="Copy",key="Metaspace",} 1.4809288E7

The most likely reason you are getting different GC metrics is that the two JVMs are running with different memory settings/garbage collectors, and thus expose different metrics.
If ZooKeeper is exposing a number via JMX, the JMX exporter should be returning it.
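The "Copy" collector name means ZooKeeper's JVM is using the default serial collector, while Logstash's "ParNew" comes from the CMS/ParNew pair. If you want ZooKeeper to report the same ParNew metrics, a minimal sketch (assuming a Java 8-era JVM; whether you should change collectors at all is a separate decision) is to switch the collector in conf/java.env:
# conf/java.env - switch ZooKeeper's JVM to CMS/ParNew (illustrative only)
export SERVER_JVMFLAGS="-XX:+UseConcMarkSweepGC -XX:+UseParNewGC $SERVER_JVMFLAGS"
After a restart, the GarbageCollector MBeans are named ParNew/ConcurrentMarkSweep instead of Copy/MarkSweepCompact, and the exporter reports them under those names.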

To collect ZooKeeper metrics with Prometheus:
Create a conf/java.env file (my ZooKeeper configuration directory is /etc/zookeeper/conf, so the file is /etc/zookeeper/conf/java.env) with the contents below:
export SERVER_JVMFLAGS="-Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=10701 -javaagent:/opt/jmx_prometheus/jmx_prometheus_javaagent-0.3.0.jar=10801:/opt/jmx_prometheus/config.yml $SERVER_JVMFLAGS"
Restart the ZooKeeper service; the JMX metrics can then be scraped from port 10801 (the port I configured for the java agent above).
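The agent also needs the /opt/jmx_prometheus/config.yml referenced above. A minimal, permissive configuration is enough to get started (an illustrative sketch, not a tuned ZooKeeper rule set); point a Prometheus scrape job at port 10801 afterwards:
# /opt/jmx_prometheus/config.yml - pass every MBean attribute through with auto-generated names
startDelaySeconds: 0
lowercaseOutputName: true
rules:
  - pattern: ".*"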

Related

file_descriptor_count, avg_latency and outstanding_request metrics are missing from a zookeeper instance after we enabled the prometheus exporter

After I enabled the Prometheus metrics exporter in ZooKeeper 3.7.1, one instance of ZooKeeper, running in a cluster of 3, has stopped sending metrics for file_descriptor_count, avg_latency and outstanding_request. When I deploy it on a single instance it seems to work fine, sending all the metric data.
The zoo.cfg contents and screenshots of the missing metrics were attached to the original question.
I am currently stuck on this; it would be great if someone could help me. Thanks in advance!
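For reference, since the original zoo.cfg is not reproduced here, the built-in Prometheus metrics provider in ZooKeeper 3.6+ is typically enabled with lines like the following (the port is just an example):
# zoo.cfg - enable the built-in Prometheus metrics provider (illustrative)
metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
metricsProvider.httpPort=7000
metricsProvider.exportJvmInfo=true
Each server then serves its metrics on its own http://<host>:7000/metrics endpoint.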

Is there any Java agent to monitor the metrics of all JVM processes running in my pod?

I am looking for a Java agent to monitor the metrics of all JVM processes running in my pod. I want to collect metrics for each process, like CPU and memory, and forward them to stdout or Splunk. How can I achieve that?
There are a bunch of ways to do this, but I'll highlight three of them.
Jolokia - Jolokia is remote JMX with JSON over HTTP. You essentially enable it in your Java arguments and it spins up a server that lets you request JSON. You can also install Hawtio Online for a GUI in Kubernetes. On Red Hat, the JBoss image has Jolokia installed by default; that's how it does its health checks.
Prometheus - The Prometheus JMX exporter java agent can also be installed, and you can then ask for metrics on an HTTP/HTTPS port just like with Jolokia (a small pod-spec sketch follows below).
OpenTelemetry - OpenTelemetry is newer and I honestly haven't played around with it yet.
If you run Red Hat JBoss images, both the Prometheus and Jolokia extensions are added by default. Hope this helps, Jay
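As a sketch of the Prometheus option: one common pattern is to attach the JMX exporter agent via JAVA_TOOL_OPTIONS and let an annotation-based Prometheus configuration scrape the pod; the image name, agent path and port below are assumptions for illustration.
# Pod spec fragment (illustrative; agent path, port and image are assumptions)
metadata:
  annotations:
    prometheus.io/scrape: "true"    # only honoured if Prometheus is set up for annotation-based discovery
    prometheus.io/port: "9404"
spec:
  containers:
    - name: app
      image: my-java-app:latest     # hypothetical image
      env:
        - name: JAVA_TOOL_OPTIONS   # picked up by any JVM started in this container
          value: "-javaagent:/opt/jmx_prometheus/jmx_prometheus_javaagent.jar=9404:/opt/jmx_prometheus/config.yml"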

Failed to send the transaction successfully to the order status: SERVICE UNAVAILABLE

I am using the Kafka orderer service for Hyperledger Fabric 1.4. While updating chaincode or making any putState call I get an error message stating "Failed to send the transaction successfully to the order status: SERVICE UNAVAILABLE". Checking the ZooKeeper and Kafka nodes, it seems the Kafka nodes are not able to talk to each other.
Kafka & ZooKeeper logs
Could you provide more info about the topology of the ZooKeeper-Kafka cluster?
Did you use Docker to deploy the ZooKeeper cluster? If so, you can refer to this file:
https://github.com/whchengaa/hyperledger-fabric-technical-tutorial/blob/cross-machine-hosts-file/balance-transfer/artifacts/docker-compose-kafka.yaml
Remember to specify the IP addresses of the other ZooKeeper nodes in the hosts file which is mounted to /etc/hosts of that ZooKeeper node.
Make sure the port numbers of the ZooKeeper nodes listed in the ZOO_SERVERS environment variable are correct.
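A minimal sketch of what that looks like for one node in a compose file (the service names and IP addresses are assumptions; the linked docker-compose-kafka.yaml is the authoritative example):
# docker-compose fragment for one ZooKeeper node (illustrative)
zookeeper0:
  image: hyperledger/fabric-zookeeper
  environment:
    - ZOO_MY_ID=1
    - ZOO_SERVERS=server.1=zookeeper0:2888:3888 server.2=zookeeper1:2888:3888 server.3=zookeeper2:2888:3888
  extra_hosts:                    # entries end up in the container's /etc/hosts
    - "zookeeper1:192.168.1.12"   # assumed IPs of the other hosts
    - "zookeeper2:192.168.1.13"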

mirrormaker2 prometheus metrics

Based on this link, we can set up a Prometheus rule to monitor MM2:
https://github.com/apache/kafka/tree/trunk/connect/mirror#monitoring-an-mm2-process
I tried to apply this rule in a ConfigMap on Kubernetes using the Confluent Connect distribution, but with no success and no error.
Could those JMX rules have been removed or be unavailable in the Confluent Connect distribution? (We are using 5.5.0.)
Confluent has nothing to do with the issue, since MM2 is Apache-licensed.
You'll need to configure your Prometheus jmx-exporter with the correct MBeans.
I suggest using VisualVM or JMXTerm to view what information is available.
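For illustration, a jmx-exporter rule for the per-partition MirrorSourceConnector MBeans described in the linked README might look like the sketch below; the MBean and attribute names are taken from that README and should be verified against what VisualVM or JMXTerm actually shows:
# jmx-exporter rule sketch for MM2 source-connector metrics (verify names before use)
rules:
  - pattern: 'kafka.connect.mirror<type=MirrorSourceConnector, target=(.+), topic=(.+), partition=(.+)><>([\w-]+)'
    name: "kafka_connect_mirror_source_connector_$4"
    labels:
      target: "$1"
      topic: "$2"
      partition: "$3"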

Getting custom metrics from Kubernetes pods

I was looking into Kubernetes Heapster and Metrics Server for getting metrics from the running pods. But the issue is, I need some custom metrics which might vary from pod to pod, and apparently Heapster only provides CPU- and memory-related metrics. Is there any tool already out there which provides the functionality I want, or do I need to build one from scratch?
What you're looking for is application- and infrastructure-specific metrics. For this, the TICK stack could be helpful! Specifically, Telegraf can be set up to gather detailed infrastructure metrics like memory and CPU pressure or even the resources used by individual Docker containers, network and IO metrics, etc., but it can also scrape Prometheus metrics from pods. These metrics are then shipped to InfluxDB and visualized using either Chronograf or Grafana.
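A minimal Telegraf sketch of that pipeline (the metrics endpoint URL, InfluxDB address and database name are assumptions):
# telegraf.conf fragment (illustrative)
[[inputs.prometheus]]
  urls = ["http://localhost:9404/metrics"]   # assumed pod metrics endpoint
  ## alternatively, discover annotated pods:
  # monitor_kubernetes_pods = true

[[outputs.influxdb]]
  urls = ["http://influxdb:8086"]            # assumed InfluxDB service
  database = "telegraf"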
Not sure if this is still open.
I would classify metrics into 3 types.
Events or Logs - system and application events which are sent to logs. These are non-deterministic.
Metrics - CPU and memory utilization on the node the app is hosted on. These are deterministic and are collected periodically.
APM - Application Performance Monitoring metrics - application-level metrics like requests received vs. failed vs. responded, etc.
Not all platforms do everything. ELK, for instance, does both metrics and log monitoring but does not do APM. Some of these tools have plugins for collector daemons which gather perfmon metrics from the node.
APM is a completely different area, as it requires developer tooling to provide metrics, the way Spring Boot does with Actuator or Node.js with AppMetrics. This carries request-level data. StatsD is an open-source library which an application can use to send APM metrics to StatsD agents installed on the node.
AWS offers CloudWatch agents for log shipping and as a metrics sink, and X-Ray for distributed tracing, which can be used for APM.