Cannot get leader of topic "test" partition - apache-kafka

I was performing a rolling Kafka broker upgrade from 2.2.1 to 3.2.0 (from Confluent Platform 5.2.2 to 7.2.1). This was done as a rolling upgrade of the Kafka cluster. There are three pods, i.e. kafka-0, kafka-1 and kafka-2, and the replication factor for this topic is 2. During the rolling upgrade I am facing downtime because no leader is assigned for the "test" partition:
Cannot get leader of topic test partition 4: kafka server: In the middle of a leadership election, there is currently no leader for this partition and hence it is unavailable for writes
When I added logging I can see Partition leader=-1.
I tried increasing the replication factor for one of the topics and that worked, but I have no concrete explanation of why.

You probably have min.insync.replicas=2 and unclean.leader.election.enable=false. Therefore, when the current leader went down, the remaining replica wasn't fully caught up, so no other replica could take its place.
You have 3 brokers, so there's no reason not to use a replication factor of 3, other than to save cost, with the tradeoff of unavailability when losing any one replica.
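If you want to verify those two settings for the topic, a minimal sketch with the Java AdminClient could look like the following (the topic name "test" and the bootstrap address are assumptions; adjust them to your cluster):

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.common.config.ConfigResource;

public class CheckTopicConfig {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Assumed bootstrap address; replace with your cluster's.
        props.put("bootstrap.servers", "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "test");
            Map<ConfigResource, Config> configs =
                    admin.describeConfigs(List.of(topic)).all().get();
            Config config = configs.get(topic);
            // These two settings decide whether a leaderless partition can recover
            // without the old leader coming back.
            System.out.println("min.insync.replicas = "
                    + config.get("min.insync.replicas").value());
            System.out.println("unclean.leader.election.enable = "
                    + config.get("unclean.leader.election.enable").value());
        }
    }
}
```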

The producer should retry until a new leader is elected. Make sure you have acks=all and min.insync.replicas=2, as @OneCricketeer suggested.
You should expect to see the "NotLeaderException" during upgrades or broker failures, though. It's normal, and as long as the producer retries and the above configurations are set, there won't be data loss.
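For reference, a minimal Java producer configuration along these lines might look like the sketch below (the bootstrap address and topic name are placeholders; min.insync.replicas itself is a topic/broker setting, not a producer one):

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ResilientProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Wait for all in-sync replicas to acknowledge each write.
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        // Keep retrying transient errors (e.g. "not leader" during a rolling upgrade)
        // until delivery.timeout.ms expires.
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120_000);
        // Idempotence avoids duplicates introduced by those retries.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test", "key", "value"), (metadata, exception) -> {
                if (exception != null) {
                    // Only non-retriable errors (or a timeout) end up here.
                    exception.printStackTrace();
                }
            });
        }
    }
}
```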

Related

Will Kafka brokers share the same location to store data logs if they are in a cluster?

I am reading an article on Kafka basics. It says that if one Kafka broker (brokerX) dies in a cluster, copies of brokerX's data will move to the other live brokers in the cluster.
If that is the case, will ZooKeeper / the Kafka controller copy brokerX's data folder to the live brokers, like a copy-paste from one machine's hard disk to another (a physical copy)?
Or do the live brokers share a common location, so that ZooKeeper / the controller just links/points to brokerX's locations (a logical copy)?
I am having a little trouble understanding this. Could someone help me?
If a broker dies, it's dead. There's no background process that will copy data off of it.
The replication of topics only happens while the broker is running.
Plus, that image is wrong. partitions = 2 means exactly that; a third partition doesn't just appear when a broker dies.
This all depends on whether the topic has a replication factor > 1. In that case, brokers holding a follower replica constantly send fetch requests to the leader replica (a specific broker), with the goal of staying head to head with the leader (both the follower replica and the leader replica having the same records stored on disk).
So when a broker goes down, all it takes is for the controller to select and promote an in-sync replica (by default; it could select a non-in-sync replica) to take over as leader of the partition. No copy/paste is required: all brokers holding a partition of that topic (as a follower replica or leader replica) were storing the same information before the shutdown.
If a broker dies, the behaviour depends on the dead broker. If it was not the leader for its partition, there is no problem; when the broker comes back online it will copy all missing data from the leader replica. If the dead broker was the leader for the partition, a new leader will be elected according to some rules. If the newly elected leader was in sync before the old leader died, there will be no message loss, and the follower brokers will sync their replicas from the new leader, as will the broken leader once it is up again. If the newly elected leader was not in sync, you might have some message loss. Either way, you can drive the behaviour of your Kafka cluster by setting various parameters to balance speed, data integrity and reliability.
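To see this in action, a small sketch with the Java AdminClient (topic name and bootstrap address are assumptions) prints the leader, replicas and ISR for each partition; after stopping a broker you can watch leadership move rather than any data being copied around:

```java
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.TopicDescription;

public class ShowPartitionState {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption

        try (AdminClient admin = AdminClient.create(props)) {
            // Describe the topic and print per-partition leader, replica list and ISR.
            TopicDescription description =
                    admin.describeTopics(List.of("my-topic")).all().get().get("my-topic");
            description.partitions().forEach(p ->
                    System.out.printf("partition %d leader=%s replicas=%s isr=%s%n",
                            p.partition(), p.leader(), p.replicas(), p.isr()));
        }
    }
}
```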

How failover works in kafka along with keeping up replication factor

I am trying to understand how failover and the replication factor work in Kafka.
Let's say my cluster has 3 brokers and the replication factor is also 3. In this case each broker will have one copy of the partition, and one of the brokers is the leader. If the leader broker fails, then one of the follower brokers will become leader, but now there are only 2 live copies. At this point, if I add a new broker to the cluster, will Kafka make sure the replication factor is 3 again and copy the required data onto the new broker?
How will the above scenario work if my cluster already has an additional broker?
In your setup (3 brokers, 3 replicas), when 1 broker fails Kafka will automatically elect new leaders (on the remaining brokers) for all the partitions whose leaders were on the failed broker.
The replication factor does not change. The replication factor is a topic configuration that can only be changed by the user.
Similarly, the replica list does not change: it lists the brokers that should host each partition.
However, the in-sync replicas (ISR) list will change and will only contain the 2 remaining brokers.
If you add another broker to the cluster, what happens depends on its broker.id:
if the broker.id is the same as that of the broker that failed, this new broker will start replicating data and eventually join the ISR for all the existing partitions.
if it uses a different broker.id, nothing will happen. You will be able to create new topics with 3 replicas (which is not possible while there are only 2 brokers), but Kafka will not automatically replicate existing partitions. You can manually trigger a reassignment if needed (see the docs, and the sketch below).
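For completeness, here is a rough sketch of such a manual reassignment using the Java AdminClient (requires Kafka 2.4+; the topic, partition and target broker ids are made up for illustration). The kafka-reassign-partitions.sh tool does the same thing from the command line:

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewPartitionReassignment;
import org.apache.kafka.common.TopicPartition;

public class ReassignToNewBroker {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption

        try (AdminClient admin = AdminClient.create(props)) {
            // Move partition 0 of "my-topic" onto brokers 1, 2 and the new broker 4.
            Map<TopicPartition, Optional<NewPartitionReassignment>> reassignment = Map.of(
                    new TopicPartition("my-topic", 0),
                    Optional.of(new NewPartitionReassignment(List.of(1, 2, 4))));
            admin.alterPartitionReassignments(reassignment).all().get();
            System.out.println("Reassignment submitted; followers will now copy the data.");
        }
    }
}
```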
Leaving partitions aside (they are a separate Kafka concept):
The replication factor does not say how many times a topic is currently replicated, but rather how many times it should be replicated. It is not affected by brokers shutting down.
Once a leader broker shuts down, the "leader" status goes over to another broker which is in sync, that is, a broker that has the current state replicated and is not behind. Giving "leader" status to a broker that is not in sync would obviously lead to data loss, so this will never happen (when using the right settings).
The replicas eligible to take over leadership are called in-sync replicas (ISR). This matters because of a configuration called min.insync.replicas, which specifies how many in-sync replicas (the leader counts as one) must have a write before it is acknowledged to a producer using acks=all. With the minimum value of 1, a message is acknowledged as "successful" as soon as it reaches the leader broker; if that broker dies before the followers replicate it, all data that was not yet replicated is lost. With min.insync.replicas=2, the acknowledgement waits until at least one follower replica also has the message, so if the leader dies there is still a replica covering this data. If there are not enough in-sync brokers to meet the minimum, acks=all writes will start failing.
So to answer your question: if you had 2 running brokers, min.insync.replicas=1 (the default) and a replication factor of 3, your cluster runs fine and will add a replica as soon as you start up another broker. If another of the 2 brokers dies before you launch the third one, you will run into problems.
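If you want to raise min.insync.replicas for a single topic rather than broker-wide, here is a minimal sketch with the Java AdminClient (requires broker 2.3+; the topic name and bootstrap address are placeholders):

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class SetMinInsyncReplicas {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic");
            // Require the leader plus at least one follower to have each write
            // before an acks=all produce request is acknowledged.
            AlterConfigOp setMinIsr = new AlterConfigOp(
                    new ConfigEntry("min.insync.replicas", "2"), AlterConfigOp.OpType.SET);
            Map<ConfigResource, Collection<AlterConfigOp>> updates =
                    Map.of(topic, List.of(setMinIsr));
            admin.incrementalAlterConfigs(updates).all().get();
        }
    }
}
```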

Does an even number of Kafka brokers affect ensemble operation and leader election?

We are a little confused about whether it is valid to use a Kafka cluster with an even number of brokers.
The story began when we built a Kafka cluster with 6 machines and tried to use the Kafka partition reassignment tool, because the load was not balanced across the brokers.
Unfortunately, the partition reassignment tool did not succeed in reassigning the partitions,
so we suspect the even number of Kafka machines.
So, is it true or false that an even number of Kafka brokers is a problem?
As commented, and per the official Apache users mailing-list thread:
No, Kafka clusters have no such restriction. That restriction applies to a ZooKeeper quorum (see the definition of quorum): in order to elect a leader, more than half of the ensemble must agree on a value, so an even-sized ensemble tolerates no more failures than the next smaller odd-sized one.
"kafka Reassignment partitions tool not succeeded"
You'll need to be more specific. I assume the rest of the cluster works, so why should an even number be an issue?
Failure of reassignment is unlikely to be caused by the number of brokers being even.
It was most likely caused by follower brokers not being in sync with the leader, by a network issue, or by the leader broker going offline at a critical time.
The reassignment tool works well if the partitions are all in sync before running the tool.
The tool will usually fail if partitions are out of sync, or if the leader is down and a leader election is in progress while the tool is running.
As a best practice, it is recommended to make sure the following conditions are met before running the tool (a quick check is sketched below):
Each partition has at least one replica in sync with the leader.
The leader is healthy and no election is in progress.
Out-of-sync replicas are not nominated as targets for the reassignment.
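A quick pre-flight check along these lines, sketched here with the Java AdminClient (topic name and bootstrap address are assumptions), is to confirm that every partition has a leader and that its ISR matches its replica list before submitting the reassignment:

```java
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.TopicPartitionInfo;

public class PreReassignmentCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption

        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription desc =
                    admin.describeTopics(List.of("my-topic")).all().get().get("my-topic");
            for (TopicPartitionInfo p : desc.partitions()) {
                boolean hasLeader = p.leader() != null;
                boolean fullyInSync = p.isr().size() == p.replicas().size();
                // Flag partitions that are not safe to reassign yet.
                if (!hasLeader || !fullyInSync) {
                    System.out.printf("partition %d not ready: leader=%s isr=%s replicas=%s%n",
                            p.partition(), p.leader(), p.isr(), p.replicas());
                }
            }
        }
    }
}
```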

Zookeeper failures in Kafka 0.9 and above

Based on the answer given in Is Zookeeper a must for Kafka?,
it is clear what the responsibility of ZooKeeper is in Kafka 0.9 and above.
I just wanted to understand: what will the impact be if the ZooKeeper cluster goes down completely?
Kafka uses ZK for membership (figuring out which brokers exist and which of them are alive) and leader election (electing the single broker that acts as controller for the cluster at any moment).
Simply put: if ZK fails, Kafka dies.
If ZK sneezes (say, a particularly long GC pause or a short network connectivity issue), a Kafka cluster may temporarily "lose" any number of members and/or the controller. By the time this settles you may have a new controller and new leader brokers for all partitions (which may or may not cause loss of acknowledged data; see "unclean leader election"). I'm not sure if all ongoing transactions would be rolled back - never tried.

Fixing under replicated partitions in kafka

In our production environment, we often see partitions go under-replicated while consuming messages from the topics. We are using Kafka 0.11. From the documentation, what I understand is:
The configuration parameter replica.lag.max.messages was removed. Partition leaders will no longer consider the number of lagging messages when deciding which replicas are in sync.
The configuration parameter replica.lag.time.max.ms now refers not just to the time passed since the last fetch request from the replica, but also to the time since the replica last caught up. Replicas that are still fetching messages from leaders but did not catch up to the latest messages within replica.lag.time.max.ms will be considered out of sync.
How do we fix this issue? What are the different reasons for replicas going out of sync? In our scenario, we have all the Kafka brokers in a single rack of blade servers and all use the same network with 10 Gbps Ethernet (simplex). I do not see any reason for the replicas to go out of sync due to the network.
We faced the same issue. The solution was:
Restart the ZooKeeper leader.
Restart the broker(s) that are not replicating some of the partitions.
There was no data loss.
The issue was due to a faulty state in ZK; there was an open issue on ZK for this, but I don't remember the number.
I faced the same issue on Kafka 2.0.
Restarting the Kafka controller node caught everything up on the replicas.
But I am still looking for the reason why a few partitions go under-replicated while other partitions on the same nodes for the same topic work fine, and the issue appears on random partitions.
Do NOT run reassignment for all topics together; consider running it for small portions at a time.
Find the topic that has under-replicated partitions and for which the reassignment process can't complete.
Set unclean.leader.election.enable to true for this topic.
Find an under-replicated partition that is stuck for this topic. Check its leader ID.
Stop that broker (just the service, not the instance).
Execute a preferred replica election (in yahoo/kafka-manager or manually; see the sketch below).
Start the broker back up.
Repeat for the rest of the topics that have the same problem.
I also tried this advice, but it didn't help me: https://stackoverflow.com/a/51063607/1929406
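For reference, the two steps above that touch topic configuration and leader election (setting unclean.leader.election.enable and running a preferred replica election) can also be done programmatically with the Java AdminClient on reasonably recent clusters (roughly 2.4+ clients and brokers); on an old 0.11 cluster like the one in the question you would use the CLI tools instead. The topic, partition and bootstrap address below are placeholders:

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.Set;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.ElectionType;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.config.ConfigResource;

public class RecoverStuckPartition {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption

        try (AdminClient admin = AdminClient.create(props)) {
            // 1. Allow an out-of-sync replica to become leader for this topic
            //    (accepting possible data loss, as described above).
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic");
            AlterConfigOp allowUnclean = new AlterConfigOp(
                    new ConfigEntry("unclean.leader.election.enable", "true"),
                    AlterConfigOp.OpType.SET);
            Map<ConfigResource, Collection<AlterConfigOp>> updates =
                    Map.of(topic, List.of(allowUnclean));
            admin.incrementalAlterConfigs(updates).all().get();

            // 2. Trigger a preferred replica election for the stuck partition.
            admin.electLeaders(ElectionType.PREFERRED,
                    Set.of(new TopicPartition("my-topic", 3))).all().get();
        }
    }
}
```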