Unclean shutdown breaks Kafka cluster - apache-kafka

My team has observed that if a broker process dies uncleanly, producers are blocked from sending messages to the Kafka topic.
Here is how to reproduce the problem:
1) Create a Kafka 0.10 cluster with three brokers (A, B and C).
2) Create a topic with replication_factor = 2.
3) Set the producer to send messages with acks=all, meaning all in-sync replicas must acknowledge a message before the producer moves on to the next one (see the sketch below).
4) Force IEM (IBM Endpoint Manager) to push a patch to broker A and reboot the server after the patches are installed.
Note: min.insync.replicas = 1
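For reference, the setup is roughly as follows (host names, ports and partition count are placeholders, not our exact values):
bin/kafka-topics.sh --create --zookeeper zk1:2181 --topic logstash --partitions 3 --replication-factor 2
and the producer is configured with:
acks=all
bootstrap.servers=brokerA:9092,brokerB:9092,brokerC:9092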
Result:
- Producers are not able to send messages to the Kafka topic after the broker reboots and rejoins the cluster, failing with the following error message.
[2016-09-28 09:32:41,823] WARN Error while fetching metadata with correlation id 0 : {logstash=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
We suspect that a replication_factor of 2 is not sufficient for our Kafka environment, but we really need an explanation of what happens when a broker goes through an unclean shutdown. The same issue occurred with a cluster of 2 brokers and replication_factor = 1.
The workaround I used to recover the service was to clean up both the Kafka topic log files and the ZooKeeper data (rmr /brokers/topics/XXX and rmr /consumers/XXX).
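For completeness, the cleanup was roughly the following (a sketch; the log directory is a placeholder, XXX stands for the topic name, and this permanently deletes the topic's data):
# stop all brokers, then remove the topic's partition directories under log.dirs
rm -rf /var/kafka-logs/XXX-*
# then remove the topic metadata from ZooKeeper
bin/zookeeper-shell.sh zk1:2181
  rmr /brokers/topics/XXX
  rmr /consumers/XXX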
Thanks,
Anukool

Related

A Kafka broker is gracefully shut down, and incorrect metadata was passed to the Kafka Connect client

For server maintenance, one of the 20 brokers was shut down gracefully, but the whole Kafka Connect cluster (sink) died with the following NPE. The replication factor of all topics was greater than 2, and there were 50 topics and 200 partitions. Looking into the error and the Kafka library source code, it seems the error occurred when the Connect client cached metadata received from the broker, including the broker node id set and partition info set.
How can this happen, and how can we deal with it in the future?
(Both broker and client are v2.3.1.)
This is a bug. The Connect cluster should not be negatively impacted by a broker shutting down and it should not throw an NPE.
Please open a ticket at https://issues.apache.org/jira/projects/KAFKA/issues/. It's also best if you paste the stack trace as text instead of an image.

kafka + what could be the reasons a kafka broker isn't the leader for a topic partition

We have an HDP 2.6.4 cluster with Ambari 2.6.1.
We have 3 Kafka brokers on version 0.10.1 and 3 ZooKeeper servers.
In /var/log/kafka/server.log we see many error messages like the one below.
In this example we have 6601 error lines about:
This server is not the leader for that topic-partition
Example:
[2019-01-06 14:56:53,312] ERROR [ReplicaFetcherThread-0-1011], Error for partition [topic1-example,34] to broker 1011:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)
We checked the connectivity between the Kafka brokers and it seems to be OK (we verified /var/log/messages and dmesg on the Linux Kafka machines).
We also suspect the connections between the ZooKeeper client on the Kafka brokers and the ZooKeeper servers,
but we do not know how to check the relationship between the Kafka brokers and the ZooKeeper servers.
We also know that Kafka sends heartbeats to the ZooKeeper servers (I think the heartbeat interval is 2 seconds), but we are not sure whether this is the right direction for finding out what causes the leader to disappear.
Any ideas what the reasons could be for "kafka broker isn't the leader for topic partition"?
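One way to check what the brokers have registered in ZooKeeper is the ZooKeeper shell (a sketch; the ZooKeeper host is a placeholder, and the broker id, topic and partition are taken from the log line above):
bin/zookeeper-shell.sh zk-host:2181
  ls /brokers/ids                                          # lists the ids of the live brokers, e.g. 1011
  get /brokers/ids/1011                                    # registration details of one broker
  get /brokers/topics/topic1-example/partitions/34/state   # current leader and ISR of the partition from the error
It is also worth checking zookeeper.session.timeout.ms in the broker configuration: if a broker's ZooKeeper session expires it loses its leaderships, and the replica fetchers on the other brokers log NotLeaderForPartitionException until the new metadata propagates.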
Other related links:
https://jar-download.com/artifacts/org.apache.kafka/kafka-clients/0.10.2.0/source-code/org/apache/kafka/common/protocol/Errors.java
kafka : one broker keeping print INFO log : "NOT_LEADER_FOR_PARTITION"
https://github.com/SOHU-Co/kafka-node/issues/297

Doubts Regarding Kafka Cluster Setup

I have a use case where I want to set up a Kafka cluster. Initially I have 1 Kafka broker (A) and 1 ZooKeeper node. Below are my queries:
On adding a new Kafka broker (B) to the cluster, will the data present on broker A be distributed automatically? If not, what do I need to do to distribute the data?
Now let's suppose the above is somehow solved and my data is distributed on both brokers. Due to a maintenance issue, I want to take down server B.
How do I transfer the data of broker B to the already existing broker A, or to a new broker C?
How can I increase the replication factor of my topics at runtime?
How can I change the ZooKeeper IPs present in the Kafka broker config at runtime without restarting Kafka?
How can I dynamically change the Kafka configuration at runtime?
Regarding Kafka Client:
Do I need to specify all Kafka broker IPs in the Kafka client connection?
And each and every time a broker is added or removed, do I need to add or remove its IP from the Kafka client connection string? That would always require restarting my producers and consumers.
Note:
Kafka Version: 2.0.0
Zookeeper: 3.4.9
Broker Size : (2 core, 8 GB RAM) [4GB for Kafka and 4 GB for OS]
To run a topic from a single Kafka broker you have to set a replication factor of 1 when creating that topic (explicitly, or implicitly via default.replication.factor). This means that the topic's partitions will stay on that single broker, even after increasing the number of brokers.
You will have to increase the number of replicas as described in the Kafka documentation. You will also have to make sure that the internal __consumer_offsets topic has enough replicas. This starts the replication process, and eventually the original broker will be the leader of every topic partition, with the other broker as a fully caught-up follower. You can use kafka-topics.sh --describe to check that every partition has both brokers in the ISR (in-sync replicas).
Once that is done, you should be able to take the original broker offline and Kafka will elect the new broker as the leader of every topic partition. Don't forget to update the clients so they are aware of the new broker as well, in case a client needs to restart while the original broker is down (otherwise it won't find the cluster).
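For example, the check could look like this (a sketch; the topic name and ZooKeeper address are placeholders):
bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic my-topic
Every partition line should list both broker ids under Isr once the new broker has caught up.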
Here are the answers in brief:
Yes, the data present on broker A will also be distributed to Kafka broker B.
You can set up three brokers A, B and C, so that if A fails then B and C take over, if B also fails then C takes over, and so on.
You can increase the replication factor of your topics.
For example, you could create increase-replication-factor.json with the content below and then apply it with kafka-reassign-partitions.sh (shown after the JSON):
{"version":1,
"partitions":[
{"topic":"signals","partition":0,"replicas":[0,1,2]},
{"topic":"signals","partition":1,"replicas":[0,1,2]},
{"topic":"signals","partition":2,"replicas":[0,1,2]}
]}
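Then apply the plan (a sketch, assuming ZooKeeper on localhost:2181; "signals" is the topic from the JSON above):
bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --execute
bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --verify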
Separately, to increase the number of partitions for a given topic (say from 2 to 3), you add the extra partitions to the existing topic with the command below:
bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic topic-to-increase --partitions 3
There is a zoo.cfg file where you can add the IPs and configuration related to ZooKeeper.
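For example, the ensemble section of zoo.cfg looks roughly like this (host names are placeholders):
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
and each broker points at the ensemble via zookeeper.connect=zk1:2181,zk2:2181,zk3:2181 in server.properties. As far as I know, changing zookeeper.connect still requires a broker restart; it is not one of the dynamically updatable broker configs.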

Kafka 0.10 quickstart: consumer fails when "primary" broker is brought down

So I'm trying the Kafka quickstart as per the main documentation. I got the multi-broker example all set up and tested per the instructions, and it works. For example, I can bring down one broker and the producer and consumer can still send and receive.
However, as per the example, we set up 3 brokers and bring down broker 2 (broker id = 1). Now if I bring all brokers up again but take down broker 1 (broker id = 0), the consumer just hangs. This only happens with broker 1 (id = 0); it does not happen with broker 2 or 3. I'm testing this on Windows 7.
Is there something special about broker 1? Looking at the configs, they are exactly the same for all 3 brokers except for the id, port number and log file location.
I thought it was just a problem with the provided console consumer, which doesn't take a broker list, so I wrote a simple Java consumer as per the documentation, using the default setup but specifying the list of brokers in the "bootstrap.servers" property (see the sketch below). No dice, I still get the same problem.
The moment I start up broker 1 (broker id = 0), the consumers just resume working. This isn't highly available/fault-tolerant behavior for the consumer... any help on how to set up an HA/fault-tolerant consumer?
Producers don't seem to have an issue.
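The consumer I tested with is roughly the following (a sketch; the ports and topic name are the quickstart defaults, and the group id is arbitrary):
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // list all three quickstart brokers, not just the first one
        props.put("bootstrap.servers", "localhost:9092,localhost:9093,localhost:9094");
        props.put("group.id", "test-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Arrays.asList("my-replicated-topic"));
        while (true) {
            // 0.10 API: poll timeout in milliseconds
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records)
                System.out.printf("offset=%d, value=%s%n", record.offset(), record.value());
        }
    }
}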
If you follow the quickstart, the created topic has only one partition with one replica, which is hosted on the first broker by default, namely broker 1. That's why the consumer fails when you bring down this broker.
Try creating a topic with multiple replicas (by specifying --replication-factor when creating the topic) and rerun your test to see whether it gives higher availability.
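For example, the quickstart itself creates a replicated topic like this (topic name as used in the quickstart; ZooKeeper on its default port):
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
The describe output shows the leader, replicas and ISR for the partition, so you can see which brokers can serve it.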

kafka new producer is not able to update metadata after one of the brokers is down

I have a Kafka environment which has 2 brokers and 1 ZooKeeper.
While I am producing messages to Kafka, if I stop broker 1 (which is the leader), the client stops producing messages and gives me the error below, even though broker 2 has been elected as the new leader for the topic and partitions.
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
After 10 minutes, since broker 2 is the new leader, I expected the producer to send data to broker 2, but it kept failing with the above exception. lastRefreshMs and lastSuccessfullRefreshMs are still the same, although metadataExpireMs is 300000 for the producer.
I am using the new Kafka producer implementation on the producer side.
It seems that when the producer is initiated, it binds to one broker, and if that broker goes down it does not even try to connect to the other brokers in the cluster.
But my expectation is that if a broker goes down, the producer should immediately check the metadata for other available brokers and send the data to them.
By the way, my topic has 4 partitions and a replication factor of 2. Giving this info in case it is relevant.
Configuration params:
{request.timeout.ms=30000, retry.backoff.ms=100, buffer.memory=33554432, ssl.truststore.password=null, batch.size=16384, ssl.keymanager.algorithm=SunX509, receive.buffer.bytes=32768, ssl.cipher.suites=null, ssl.key.password=null, sasl.kerberos.ticket.renew.jitter=0.05, ssl.provider=null, sasl.kerberos.service.name=null, max.in.flight.requests.per.connection=5, sasl.kerberos.ticket.renew.window.factor=0.8, bootstrap.servers=[10.201.83.166:9500, 10.201.83.167:9500], client.id=rest-interface, max.request.size=1048576, acks=1, linger.ms=0, sasl.kerberos.kinit.cmd=/usr/bin/kinit, ssl.enabled.protocols=[TLSv1.2, TLSv1.1, TLSv1], metadata.fetch.timeout.ms=60000, ssl.endpoint.identification.algorithm=null, ssl.keystore.location=null, value.serializer=class org.apache.kafka.common.serialization.ByteArraySerializer, ssl.truststore.location=null, ssl.keystore.password=null, key.serializer=class org.apache.kafka.common.serialization.ByteArraySerializer, block.on.buffer.full=false, metrics.sample.window.ms=30000, metadata.max.age.ms=300000, security.protocol=PLAINTEXT, ssl.protocol=TLS, sasl.kerberos.min.time.before.relogin=60000, timeout.ms=30000, connections.max.idle.ms=540000, ssl.trustmanager.algorithm=PKIX, metric.reporters=[], compression.type=none, ssl.truststore.type=JKS, max.block.ms=60000, retries=0, send.buffer.bytes=131072, partitioner.class=class org.apache.kafka.clients.producer.internals.DefaultPartitioner, reconnect.backoff.ms=50, metrics.num.samples=2, ssl.keystore.type=JKS}
Use Case:
1- Start BR1 and BR2, produce data (leader is BR1)
2- Stop BR2, produce data (fine)
3- Stop BR1 (which means there is no active working broker in the cluster at this point), then start BR2 and produce data (fails, although the leader is now BR2)
4- Start BR1, produce data (leader is still BR2, but data is produced fine)
5- Stop BR2 (now BR1 is the leader)
6- Stop BR1 (BR1 is still the leader)
7- Start BR1, produce data (messages are produced fine again)
If the producer last successfully sent data to BR1 and then all brokers go down, the producer expects BR1 to come up again, even though BR2 is up and is the new leader. Is this expected behaviour?
After spending hours I figured out Kafka's behaviour in my situation. Maybe this is a bug, or maybe it needs to be done this way for reasons that lie under the hood, but if I were doing such an implementation I wouldn't do it this way :)
When all brokers are down and you are able to bring up only one broker, it must be the broker which went down last in order to produce messages successfully.
Let's say you have 5 brokers: BR1, BR2, BR3, BR4 and BR5. If all of them go down and the last broker to die was BR3 (which was the last leader), then even if you start BR1, BR2, BR4 and BR5, it won't help unless you also start BR3.
You need to increase the number of retries.
In your case you need to set it to >=5.
That is the only way for your producer to know that your cluster has a new leader.
Besides that, make sure that all your brokers have a copy of your partition(s). Else you aren't going to get a new leader.
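For example, the relevant producer settings would look roughly like this (the retry count follows the advice above; the other values mirror the configuration dump in the question):
retries=5
retry.backoff.ms=100
max.block.ms=60000
acks=1
With retries > 0, a failed send is retried, and each retry gives the producer a chance to refresh its metadata and discover the new leader.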
In recent Kafka versions, when a broker that hosts the leader of a partition used by a producer goes down, the producer retries until it hits a retriable exception, at which point it needs to update its metadata. The new metadata can be fetched from the least-loaded node (leastLoadedNode), so the producer learns about the new leader and can write to it.