How do I view metrics for a confluentinc/cp-kafka container? - apache-kafka

Hi, I have a Kafka container built from the image 'confluentinc/cp-kafka:6.1.0'.
How do I view the metrics from the container?

You can add an environment variable for JMX_PORT, then attach a tool like jconsole or VisualVM to that port.
This is mentioned in the docs, but I think it might be incorrect (at least, trying to use /jmx on ZooKeeper; the variable is only JMX_PORT and shouldn't be different in the container).
If you want to use Prometheus/Grafana, then you'll need to extend the container to add the JMX exporter.
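For example, something along these lines should expose JMX from the container (a minimal sketch only; the ZooKeeper address, listener, and port 9101 below are placeholder values, not taken from your setup):
```
# Minimal sketch: assumes a ZooKeeper container named "zookeeper" is already
# running on the same Docker network; all values below are placeholders.
docker run -d --name kafka \
  -p 9092:9092 -p 9101:9101 \
  -e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 \
  -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092 \
  -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
  -e KAFKA_JMX_PORT=9101 \
  -e KAFKA_JMX_HOSTNAME=localhost \
  confluentinc/cp-kafka:6.1.0
# then attach a JMX client from the host, e.g.:
jconsole localhost:9101
```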

I set up Kafka using https://docs.confluent.io/platform/current/quickstart/ce-docker-quickstart.html#ce-docker-quickstart. This launches Kafka with JMX enabled.
This installation provides Confluent Control Center so you can view metrics there.
However I wanted the raw metrics exposed by JMX so I proceeded to the next steps.
I installed VisualVM from here https://visualvm.github.io/download.html.
(You can also use jconsole, available in the JDK's bin folder on your local machine, but I had connectivity issues running jconsole against the container's JMX.)
Install the VisualVM-MBeans plugin in VisualVM.
Add a JMX connection using the KAFKA_JMX_HOSTNAME:KAFKA_JMX_PORT values from your docker-compose.yml in Step 1.
Bingo, you can see the metrics from Confluent Kafka running in the container!
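For reference, assuming your quickstart docker-compose.yml sets KAFKA_JMX_HOSTNAME: localhost and KAFKA_JMX_PORT: 9101 (check your own file; these are the stock quickstart values at the time of writing), the connection string to enter is just host:port:
```
# Same address works for VisualVM's "Add JMX Connection" dialog or for jconsole:
jconsole localhost:9101
```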

Related

Confluent Control Center no clusters found

I am following this tutorial:
https://docs.confluent.io/platform/current/platform-quickstart.html
At step 3, when I click on "Connect", I see no option to add a connector.
How do I add a connector?
For reference, I am using an M1 MacBook Air and Docker v4.12.0.
You'll only be able to add a connector if you are running a Kafka Connect server and have properly configured Control Center to use it.
On Mac: Docker memory should be allocated at a minimum of 6 GB. When using Docker Desktop for Mac, the default Docker memory allocation is 2 GB; change the default allocation to 6 GB in the Docker Desktop app by navigating to Preferences > Resources > Advanced.
Assuming you already did that, you need to look at the output from docker-compose ps and docker-compose logs connect to determine whether the Connect containers are healthy and running.
Personally, I don't use Control Center since I prefer to manage connectors as config files, not copy/paste or click through UI fields. In other words, if the Connect container is healthy, try using its HTTP endpoints directly with curl, Postman, etc.
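For example (a sketch only; it assumes the quickstart publishes the Connect REST API on localhost:8083, and the connector name and class below are placeholders, not a real config):
```
# List connectors currently deployed on the Connect worker
curl -s http://localhost:8083/connectors
# Create a connector from a JSON config instead of clicking through the UI
curl -s -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{"name": "my-connector", "config": {"connector.class": "...", "tasks.max": "1"}}'
```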
I had exactly the same issue with there being no way to add a Connector.
Updating the container version from my old version 6.2.1 to 7.3.0 solved it.

How to send logs from Google Stackdriver to Kafka

I see many docs and posts about how to send logs to Stackdriver, but almost no information about how to do the opposite: send logs from Stackdriver to Kafka.
In my case, our Ops team wants to collect the logs from our web servers using Google's Stackdriver agents and push them to Stackdriver ... However, for my stream-processing needs I want to get the logs into Kafka to use its unparalleled ability to retain and reprocess data by any number of consumers, something that I cannot do with Pub/Sub.
So, what are the options for doing this? I only saw a couple of possible avenues - neither sounds too good:
Based on this post (https://powerspace.tech/how-to-stream-data-from-google-pubsub-to-kafka-with-kafka-connect-dbef1c340a76): push data into Pub/Sub first, and then read from it using either a Kafka connector or my own Kafka consumer. I hate the thought of adding yet another hop (serialize/deserialize/ack/etc.) between the source of the data and Kafka ...
I noticed a brief mention in passing of adding a plugin to Google's version of Fluentd (which is what the Stackdriver log collection agent is based on) here: https://powerspace.tech/how-to-stream-data-from-google-pubsub-to-kafka-with-kafka-connect-dbef1c340a76 . Not many details, so it's hard to tell how involved this approach is ...
Any other options?
Thank you!
Enter the Kafka console and add certain elements in the console. Once you have added the elements in the Kafka console, you need to check whether these elements are reflected successfully in Cloud Shell. For this, run the command gcloud pubsub subscriptions pull from-kafka --auto-ack --limit=10. Once you run this command it will take some time to sync with the Kafka console, so you will get the results after running the command a couple of times.
You will run the commands in the Cloud Shell and see the output in the Kafka VM SSH.
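For reference, a rough sketch of that first check (the topic name and broker address are placeholders; from-kafka is the subscription name used above):
```
# On the Kafka VM: publish a few test messages
# (the script may be named kafka-console-producer.sh depending on the install)
kafka-console-producer --broker-list localhost:9092 --topic to-pubsub
# In Cloud Shell: pull them from the Pub/Sub subscription
gcloud pubsub subscriptions pull from-kafka --auto-ack --limit=10
```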
Now you will verify the exact opposite procedure, where you run the command in the Kafka VM and see the output in the Cloud Shell. It will take some time for the output to be reflected, and you may have to run the command gcloud pubsub subscriptions pull from-kafka --auto-ack --limit=10 a couple of times to see the output.
The Kafka plugin is deprecated. For more information, refer to https://cloud.google.com/stackdriver/docs/deprecations
Note: This functionality is only available for agents running on Linux. It is not available on Windows.
Kafka is monitored via JMX. The Monitoring agent supports Kafka version 0.8.2 and higher.
On your VM instance, download kafka-082.conf from the GitHub configuration repository and place it in the directory /etc/stackdriver/collectd.d/:
(cd /etc/stackdriver/collectd.d/ && sudo curl -O https://raw.githubusercontent.com/Stackdriver/stackdriver-agent-service-configs/master/etc/collectd.d/kafka-082.conf)
The downloaded plugin configuration file assumes that your Kafka server is configured to accept JMX connections on port 9999. If you have configured Kafka with a different JMX port, edit the file as root and follow the instructions in it to change the JMX port settings.
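If you just need to swap the port, something like this should do it (a sketch; it assumes 9999 only appears in the JMX connection setting of the stock file, so keep the backup and review the change):
```
# Replace the assumed default JMX port (9999) with your own, e.g. 9101;
# sed keeps the original file as kafka-082.conf.bak
sudo sed -i.bak 's/9999/9101/' /etc/stackdriver/collectd.d/kafka-082.conf
```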
After adding the configuration file, restart the Monitoring agent by running the following command:
sudo service stackdriver-agent restart
What is monitored:
https://cloud.google.com/monitoring/api/metrics_agent#agent-kafka

Directly connecting jaeger client to remote collector using kafka as intermediate buffer

I am trying to connect to a Jaeger collector which uses Kafka as an intermediate buffer.
Here are my doubts; could anyone please point me to some docs?
Questions:
1. How do I connect to the collector directly, skipping the agent and using Kafka as an intermediate buffer? Please provide a command or configuration.
2. What is the configuration for Kafka to connect to a particular host? When I tried the command below it still points to localhost and fails:
docker run -e SPAN_STORAGE_TYPE=kafka jaegertracing/jaeger-collector:1.17
```{"level":"fatal","ts":1585063279.3705006,"caller":"collector/main.go:70","msg":"Failed to init storage factory","error":"kafka: client has run out of available brokers to talk to (Is your cluster reachable?)","stacktrace":"main.main.func1\n\tgithub.com/jaegertraci```
Please provide some sample examples so that I can go through them ...
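For what it's worth, a minimal sketch of pointing the collector at a non-local broker might look like this (all hostnames are placeholders; it assumes this collector version exposes the --kafka.producer.brokers flag, which maps to the KAFKA_PRODUCER_BROKERS environment variable, so confirm with the collector's --help output):
```
# Sketch: override the default localhost broker address (placeholder values).
# Verify the supported flags first:
#   docker run --rm jaegertracing/jaeger-collector:1.17 --help
docker run \
  -e SPAN_STORAGE_TYPE=kafka \
  -e KAFKA_PRODUCER_BROKERS=kafka-host:9092 \
  -e KAFKA_PRODUCER_TOPIC=jaeger-spans \
  jaegertracing/jaeger-collector:1.17
```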

Brokers for Celery Executor in Airflow

Is it possible to use the following brokers instead of Redis or RabbitMQ:
Zookeeper
IBM MQ
Kafka
Megacache
If so, how would I be able to use it ?
Thanks
As per the Celery documentation on broker transport support, RabbitMQ and Redis are the fully featured transports and qualify as stable solutions.
Of the alternatives on the list you've provided, Zookeeper can also be adopted as a Celery broker in Airflow, but only as an experimental option with some functional limitations.
Installation details for the Zookeeper broker implementation can be found here.
Using Python package:
$ pip install "celery[zookeeper]"
You can check out all the available extensions in the source setup.py code.
Referencing Airflow documentation:
CeleryExecutor is one of the ways you can scale out the number of workers. For this to work, you need to setup a Celery backend
(RabbitMQ, Redis, …) and change your airflow.cfg to point the executor parameter to CeleryExecutor and provide the related Celery
settings.
After the particular Celery backend has been prepared, adjust the appropriate settings in the airflow.cfg file; for any remaining doubts, refer to this example.
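As a sketch, the relevant pieces might look like this (the host, port, chroot path, and result backend below are placeholders, and the Zookeeper transport remains experimental):
```
# Install Airflow's Celery support plus Celery's experimental Zookeeper transport
pip install "apache-airflow[celery]" "celery[zookeeper]"
# Then, in airflow.cfg (placeholder values):
#   [core]
#   executor = CeleryExecutor
#   [celery]
#   broker_url = zookeeper://zk-host:2181/celery
#   result_backend = db+postgresql://airflow:airflow@postgres/airflow
```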

Adding standalone zookeeper servers into fabric ensemble

We're attempting to add standalone zookeeper servers into the zk ensemble ran by fuse fabric (as either followers or observers). However, it looks like fabric has pretty tight control over the zk configuration and I haven't been able to find any documentation relating to adding hardcoded server configuration params to the dynamic ones used by fabric. Anyone else try this or have some idea of where to look?