I want to see all commands issued to my Zookeeper cluster, something like the general log in MySQL or the "monitor" command in Redis. How can this be done?
The intent is to see how Storm uses Zookeeper (for state management & ack-ing), and a good way to do that would be to run a sample topology and watch all the commands Storm issues to Zookeeper.
I tried enabling the debug log for Zookeeper, but that is insufficient and has a lot of noise. For example, issuing a create /node only prints
2018-02-27 18:05:34 ZooKeeperMain [DEBUG] Processing create
Here is what I've found so far.
Parse the Zookeeper transaction logs.
The best repo I've found to do this is https://github.com/alenca/zklogtool. It's feature-packed. The only downside is that it won't record read queries, since the transaction log only contains writes.
Enable debug logging and grep for Processing request::. This is hit and miss: sometimes the request paths are not captured, and it requires debug logging to be enabled, which can be expensive on a production server.
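A minimal sketch of the grep approach, assuming your log4j configuration writes ZooKeeper's DEBUG output to /var/log/zookeeper/zookeeper.log (the path and exact log format vary by setup):

tail -f /var/log/zookeeper/zookeeper.log | grep "Processing request::"

Each matching line should include the session id, request type and path, although, as noted above, the path is sometimes logged as n/a.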
Related
I am new to Kafka. I have been given a task to send 2 KB messages with optimized throughput and latency, and I really don't know how to benchmark these two metrics or how to set up my cluster for it. I don't have any cluster monitoring tool; I can only see the statistics printed in the terminal when I start the producer and consumer. Can anyone please tell me which script I can use to see the relevant statistics on the consumer end while the data flow is in progress?
Make sure you check the command-line tools that come with the Apache Kafka installation (the bin/ directory). Those include kafka-producer-perf-test.sh and kafka-consumer-perf-test.sh, which can help you test your cluster's performance.
This article includes good examples: https://community.cloudera.com/t5/Community-Articles/Kafka-2-3-Performance-testing/ta-p/284767
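For example, a rough sketch for your 2 KB message case; the topic name, broker address and record counts are placeholders, and depending on your Kafka version the consumer tool takes --broker-list or --bootstrap-server:

bin/kafka-producer-perf-test.sh --topic perf-test --num-records 100000 \
  --record-size 2048 --throughput -1 \
  --producer-props bootstrap.servers=localhost:9092 acks=1

bin/kafka-consumer-perf-test.sh --broker-list localhost:9092 \
  --topic perf-test --messages 100000

The producer tool prints throughput (records/sec, MB/sec) and latency percentiles while it runs; the consumer tool prints its consumption throughput when it finishes.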
I have a critical Kafka application that needs to be up and running all the time. The source topics are created by the Debezium Kafka Connect connector for the MySQL binlog. Unfortunately, many things can go wrong with this setup. A lot of the time the Debezium connectors fail and need to be restarted, and so do my apps (without throwing any exception, they just hang and stop consuming). My manual way of testing and discovering the failure is checking the Kibana logs and then consuming the suspicious topic through a terminal. I could mimic this in code, but that is obviously far from best practice. I wonder whether the Kafka Streams API offers a way to do such a health check, and to check other parts of the Kafka cluster?
Another point that bothers me is whether I can keep the stream alive and rejoin the topics when the connectors are up again.
You can check the Kafka Streams state to see if it is rebalancing/running, which would indicate healthy operation. However, if no data is getting into the topology, I would assume no errors would show up there, so you then need to look up the health of your upstream dependencies.
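A minimal sketch of that check against the Java API; the class name and how you expose isHealthy() (HTTP endpoint, JMX, whatever your apps already use) are up to you:

import org.apache.kafka.streams.KafkaStreams;

public class StreamsHealthCheck {

    private final KafkaStreams streams;

    public StreamsHealthCheck(KafkaStreams streams) {
        this.streams = streams;
    }

    // RUNNING and REBALANCING count as healthy; CREATED, PENDING_SHUTDOWN,
    // NOT_RUNNING and ERROR do not.
    public boolean isHealthy() {
        KafkaStreams.State state = streams.state();
        return state == KafkaStreams.State.RUNNING
                || state == KafkaStreams.State.REBALANCING;
    }
}

You can also register streams.setStateListener(...) before calling streams.start() if you want to log or alert on every state transition.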
Overall, it sounds like you might want to invest some time into monitoring tools like Consul or Sensu, which can run local service health checks and send out alerts when services go down. Or, at the very least, Elasticsearch alerting.
As far as Kafka health checking goes, you can do that in several ways:
Are the broker and ZooKeeper processes running? (SSH to the node and check the processes)
Are the broker and ZooKeeper ports open? (use a socket connection)
Are there important JMX metrics you can track? (Metricbeat)
Can you find an active controller broker? (use AdminClient#describeCluster)
Do a required minimum number of brokers respond as part of the cluster metadata? (which can also be obtained from the AdminClient)
Do the topics you use have the proper configuration (retention, min-isr, replication factor, partition count, etc.)? (again, use the AdminClient)
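A minimal sketch of the controller and broker-count checks with the Java AdminClient; the bootstrap address, the timeout and the minimum broker count of 3 are assumptions to adjust for your cluster:

import java.util.Collection;
import java.util.Properties;
import java.util.concurrent.TimeUnit;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.DescribeClusterResult;
import org.apache.kafka.common.Node;

public class KafkaClusterHealthCheck {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            DescribeClusterResult cluster = admin.describeCluster();

            // A healthy cluster has an active controller.
            Node controller = cluster.controller().get(10, TimeUnit.SECONDS);
            System.out.println("Active controller: " + controller);

            // Require a minimum number of brokers to respond (3 is just an example).
            Collection<Node> brokers = cluster.nodes().get(10, TimeUnit.SECONDS);
            System.out.println("Brokers responding: " + brokers.size());

            if (controller == null || brokers.size() < 3) {
                throw new IllegalStateException("Kafka cluster is not healthy");
            }
        }
    }
}

Topic configuration can be checked the same way via admin.describeTopics(...) and admin.describeConfigs(...).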
I am running a development environment for Confluent Kafka (Community edition) on Windows, version 3.0.1-2.11.
I am trying to achieve load balancing of tasks between 2 instances of a connector. I am running ZooKeeper, the Kafka server, the REST services and 2 instances of Connect distributed, all on the same machine.
The only difference between the properties files for the connectors is the REST port, since they are running on the same machine.
I don't create the topics for connector offsets, config and status. Should I?
I have custom code for the sink connector.
When I create a worker for my sink connector, I do this by executing a POST request
POST http://localhost:8083/connectors
against either of the running Connect instances. Checking whether the worker is loaded is done at
GET http://localhost:8083/connectors
My sink connector has System.out.println() lines in the code, which let me follow its output in the console log.
When my worker is running, I can see that only one connector instance is executing the code. If I terminate that instance, the other one takes over the worker and execution resumes. However, this is not what I want.
My goal is for both connector instances to run the worker code so that they can share the load between them.
I've tried to go over some open-source connectors to see whether there are specifics to how connector code should be written, but with no success.
I've made several different attempts to tackle this problem, but with no success.
I could rewrite my business code to work around this, but I'm pretty sure I'm missing something that isn't obvious to me.
Recently I commented on Robin Moffatt's answer to this question.
From the sounds of it, your custom code is not correctly spawning the number of tasks that you are expecting.
Make sure that you've set tasks.max > 1 in your config.
Make sure that your connector's taskConfigs() implementation actually returns the appropriate number of task configurations (up to maxTasks).
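On the second point, a minimal sketch of what that usually looks like; MySinkConnector, MySinkTask and the way the original properties are passed through are assumptions about your custom code:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.Task;
import org.apache.kafka.connect.sink.SinkConnector;

public class MySinkConnector extends SinkConnector {

    private Map<String, String> connectorConfig;

    @Override
    public void start(Map<String, String> props) {
        this.connectorConfig = new HashMap<>(props);
    }

    // The crucial part: return one config map per task, up to maxTasks.
    // Returning a single-element list here is the usual reason why only one
    // worker ever executes task code.
    @Override
    public List<Map<String, String>> taskConfigs(int maxTasks) {
        List<Map<String, String>> configs = new ArrayList<>();
        for (int i = 0; i < maxTasks; i++) {
            configs.add(new HashMap<>(connectorConfig));
        }
        return configs;
    }

    @Override
    public void stop() {
    }

    @Override
    public Class<? extends Task> taskClass() {
        return MySinkTask.class; // your existing SinkTask implementation
    }

    @Override
    public ConfigDef config() {
        return new ConfigDef();
    }

    @Override
    public String version() {
        return "1.0";
    }
}

With tasks.max set to 2 or more and both Connect instances in the same group, the resulting tasks get distributed across the two workers. Note that for a sink connector the useful parallelism is also bounded by the partition count of the topics it consumes.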
References:
https://opencredo.com/blogs/kafka-connect-source-connectors-a-detailed-guide-to-connecting-to-what-you-love/
https://docs.confluent.io/current/connect/devguide.html
https://enfuse.io/a-diy-guide-to-kafka-connectors/
In the Lagom dev environment, after starting Kafka using lagomKafkaStart,
it sometimes shows "KafkaServer Closed Unexpectedly", after which I need to run the clean command to get it running again.
Is this the expected behaviour?
This can happen if you forcibly shut down sbt and the ZooKeeper data becomes corrupted.
Other than running the clean command, you can manually delete the target/lagom-dynamic-projects/lagom-internal-meta-project-kafka/ directory.
This will clear your local data from Kafka, but not from any other database (Cassandra or RDBMS). If you are using Lagom's message broker API, it will automatically repopulate the Kafka topic from the source database when you restart your service.
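For example, assuming you run it from the root of your Lagom project:

rm -rf target/lagom-dynamic-projects/lagom-internal-meta-project-kafka/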
I'm getting started with the Confluent Platform, which requires running Zookeeper (zookeeper-server-start /etc/kafka/zookeeper.properties) and then Kafka (kafka-server-start /etc/kafka/server.properties). I am writing an Upstart script that should run both Kafka and Zookeeper. The issue is that Kafka should block until Zookeeper is ready (because it depends on it), but I can't find a reliable way to know when Zookeeper is ready. Here are some attempts in pseudo-code, run after starting the Zookeeper server:
1. Use a hardcoded delay
sleep 5
Does not work reliably on slower computers and/or waits longer than needed.
2. Check whether something (hopefully Zookeeper) is listening on port 2181
wait until $(echo stat | nc localhost ${port}) is not none
This did not seem to work as it doesn't wait long enough for Zookeeper to accept a Kafka connection.
3. Check the logs
wait until specific string in zookeeper log is found
This is sketchy, and there isn't even a string that couldn't also be found in an error case (e.g. "binding to port [...]").
Is there a reliable way to know when Zookeeper is ready to accept a Kafka connection? Otherwise, I will have to resort to a combination of 1 and 2.
I found that using a timer is not reliable. The second option (waiting for the port) worked for me:
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties && \
while ! nc -z localhost 2181; do sleep 0.1; done && \
bin/kafka-server-start.sh -daemon config/server.properties
The Kafka error message from your comment is definitely relevant:
FATAL [Kafka Server 0], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer) java.lang.RuntimeException: A broker is already registered on the path /brokers/ids/0. This probably indicates that you either have configured a brokerid that is already in use, or else you have shutdown this broker and restarted it faster than the zookeeper timeout so it appears to be re-registering.
This indicates that ZooKeeper is up and running, and Kafka was able to connect to it. As I would have expected, technique #2 was sufficient for verifying that ZooKeeper is ready to accept connections.
Instead, the problem appears to be on the Kafka side. It has registered a ZooKeeper ephemeral node to represent the starting Kafka broker. An ephemeral node is deleted automatically when the client's ZooKeeper session expires (e.g. the process terminates and so stops heartbeating to ZooKeeper). However, this is based on timeouts. If the Kafka broker restarts rapidly, then after the restart it sees that a znode representing that broker already exists. To the newly started process, this looks like another broker is already started and registered at that path. Since brokers are expected to have unique IDs, it aborts.
Waiting for a period of time past the ZooKeeper session expiration is an appropriate response to this problem. If necessary, you could potentially tune the session expiration to happen faster as discussed in the ZooKeeper Administrator's Guide. (See discussion of tickTime, minSessionTimeout and maxSessionTimeout.) However, tuning session expiration to something too rapid could cause clients to experience spurious session expirations during normal operations.
I have less knowledge of Kafka, but perhaps there is also something that can be done on the Kafka side. I know that some management tools like Apache Ambari take steps to guarantee the assignment of a unique ID to each broker during provisioning.