Kafka Streams in production - how to start another Kafka Streams instance when a node fails

I have a Kafka cluster with 3 nodes, and my question is:
Is there a way to start a new Kafka Streams instance on another node when the main node fails?
Basically, I would like to apply the same concept used in Kafka Connect: start Kafka Connect on the other 2 nodes, and when the main node fails, Kafka starts the connector on another node automatically.
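For reference, Kafka Streams already behaves the way Kafka Connect's distributed mode does: every instance started with the same application.id joins one consumer group, and when an instance dies its stream tasks are reassigned to the surviving instances. Below is a minimal sketch of such an app (topic names, hosts, and the application.id are placeholders); running the same jar on all 3 nodes gives the failover described above.

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;

    public class FailoverStreamsApp {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Same application.id on every node: all instances join one consumer
            // group, so Kafka reassigns tasks to the survivors when a node fails.
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "box1:9092,box2:9092,box3:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();
            // Trivial pass-through topology, just to have something running.
            builder.stream("input-topic").to("output-topic");

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            streams.start();
            Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        }
    }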

Related

Removing ZooKeeper node in Kafka cluster procedure?

What is the best way or procedure to remove a ZooKeeper node from a Kafka cluster because of infrastructure changes?
Is it enough to stop this one node, remove it from the Kafka configuration, and do a rolling restart of all Kafka nodes?

Unable to connect kafka Consumer to Kafka cluster

We have a Kafka cluster with 3 broker nodes. When all are up and running, the consumer is able to read data from Kafka. However, if I stop all Kafka servers and bring up only 2 of them, excluding the one that was stopped last, the consumer is unable to connect to the Kafka cluster.
What could be the reason behind this? Thanks in advance.
I would guess that the problem is the offsets.topic.replication.factor on the broker, which defaults to 3, while you are now running a cluster with only 2 brokers.
The __consumer_offsets topic is where consumers store their offsets while consuming, and it was created with a replication factor of 3 on the first run.
When, on the second run, you start only 2 brokers, that topic is under-replicated, which could be the problem now.
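If you want to verify this theory, a sketch using the AdminClient (the broker address is a placeholder) can describe the internal __consumer_offsets topic and print the replica count and in-sync replica count per partition:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.TopicDescription;

    public class CheckOffsetsTopic {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "box1:9092"); // placeholder
            try (AdminClient admin = AdminClient.create(props)) {
                TopicDescription desc = admin
                        .describeTopics(Collections.singletonList("__consumer_offsets"))
                        .values().get("__consumer_offsets").get();
                // With replication factor 3 but only 2 brokers up, the ISR
                // shrinks and some partitions may be unavailable to consumers.
                desc.partitions().forEach(p -> System.out.printf(
                        "partition %d: replicas=%d, isr=%d%n",
                        p.partition(), p.replicas().size(), p.isr().size()));
            }
        }
    }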

How can I run Kafka Consumer processor instance on multiple nodes with Apache Nifi

Currently we are using Apache NiFi to consume messages via its Kafka consumer processor. The output of the Kafka consumer is connected to a Hive processor.
I'm looking into how to run a Kafka consumer instance on each node of a NiFi cluster.
I have a 3-node NiFi cluster and a Kafka topic with 3 partitions. I want the Kafka consumer to run on each node, so that each consumer can poll messages from one of the topic's partitions.
After I started the Kafka consumer processor, I can see that the Kafka consumer always runs on a single node, not on all nodes.
Is there any configuration that I missed?
NiFi uses the Apache Kafka client, which is what performs the assignment of consumers to partitions. When you start the processor, assuming you have it set to 1 concurrent task, you should get 1 consumer on each node of your cluster, and each consumer should be assigned a different partition.
https://bryanbende.com/development/2016/09/15/apache-nifi-and-apache-kafka
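The behavior described in that answer is plain Kafka consumer-group rebalancing, which NiFi inherits from the client. Here is a standalone sketch of the same mechanism (topic, group, and hosts are placeholders): start it on each of the 3 nodes with the same group.id, and each process gets one of the 3 partitions.

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class GroupedConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "box1:9092,box2:9092,box3:9092");
            // Same group.id on every node: the topic's partitions are divided
            // among all live consumers in the group.
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "nifi-style-group");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("my-topic"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                    // Shows which partitions this particular node was assigned.
                    System.out.println("assigned: " + consumer.assignment());
                    for (ConsumerRecord<String, String> r : records) {
                        System.out.printf("p%d@%d: %s%n", r.partition(), r.offset(), r.value());
                    }
                }
            }
        }
    }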

Kafka - consumers / producers works with all Zookeper instances down

I've configured a cluster of Kafka brokers and a cluster of ZooKeeper instances using the kafka_2.11-1.1.0 distribution archive.
For the Kafka brokers I've configured config/server.properties (broker.id is 1, 2, and 3 respectively on the three boxes):
zookeeper.connect=box1:2181,box2:2181,box3:2181
For the ZooKeeper instances I've configured config/zookeeper.properties:
server.1=box1:2888:3888
server.2=box2:2888:3888
server.3=box3:2888:3888
I've created a basic producer and a basic consumer, and I don't understand why I am able to write and read messages even when I shut down all the ZooKeeper instances while keeping all the Kafka brokers up and running.
Even booting up new consumers and producers works without any issue.
I thought having a quorum of ZooKeeper instances was vital for a Kafka cluster.
For both the consumer and the producer, I've used the following configuration:
bootstrap.servers=box1:9092,box2:9092,box3:9092
Thanks
I thought having a quorum of Zk instances is a vital point for a Kafka cluster.
A ZooKeeper quorum is vital for managing partition lists, leader election, etc. In general, ZK is necessary for the management work done by the controller broker in the cluster.
Basically, right now (with ZK down), you cannot modify topics (as the partition metadata is stored in ZK), start up or shut down brokers (as they use ZK for discovery), or perform other similar operations.
Even booting up new consumers, producers works without any issue.
Producer/consumer operations reach out to brokers only. The broker instance can still append to the log and can still communicate with other brokers for replication. So it is possible to send a message, have it received by the broker and saved to disk, and have the other brokers replicate it: they continuously send fetch requests to the leader, and they know who the partition leader is because that information was saved while ZK was still running.
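This is also visible in the client configuration: neither the producer nor the consumer is ever given a ZooKeeper address, only bootstrap.servers, which is why a broker that already knows its partition leadership can keep serving them. A minimal producer sketch (the topic name is a placeholder):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class ZkFreeProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Only broker addresses here: modern clients never contact
            // ZooKeeper, so sends keep working while ZK is down.
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "box1:9092,box2:9092,box3:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("test-topic", "key", "value"));
                producer.flush();
            }
        }
    }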

Connecting Storm with remote Kafka cluster, what would happen if new brokers are added

We are working on an application that uses Storm to pull data from a remote Kafka cluster. As the two clusters lie in different environments, there is an issue with network connectivity between them. In simple terms, by default the remote ZooKeeper and Kafka brokers do not allow connections from our Storm worker/supervisor nodes; to allow them, firewall access needs to be granted.
My concern is: what would happen if new brokers or ZooKeeper nodes are added to the remote cluster? I understand that we don't have to specify all the ZK nodes in order to consume, but say they add a few brokers and we need to consume from a partition that is served by those new nodes. What would be the impact on the running Storm application?
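On the discovery part of the question: Kafka clients (including the consumer inside a Storm Kafka spout) use the configured bootstrap addresses only for the initial metadata request; the metadata response lists every broker in the cluster, and the client then connects directly to whichever broker leads each partition. The hypothetical sketch below illustrates why the firewall must allow traffic to all brokers, including newly added ones, even though only one appears in bootstrap.servers (hosts and topic are placeholders):

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.PartitionInfo;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class BrokerDiscovery {
        public static void main(String[] args) {
            Properties props = new Properties();
            // A subset of brokers is enough to bootstrap; the full broker
            // list comes back in the metadata response.
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "remote-broker1:9092"); // placeholder
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // Leaders may be brokers that were never listed above; the
                // firewall must allow connections to them as well.
                for (PartitionInfo p : consumer.partitionsFor("remote-topic")) {
                    System.out.printf("partition %d led by %s:%d%n",
                            p.partition(), p.leader().host(), p.leader().port());
                }
            }
        }
    }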