When does Kafka throw BufferExhaustedException? - apache-kafka

org.apache.kafka.clients.producer.BufferExhaustedException: Failed to allocate memory within the configured max blocking time 5 ms.
This says that exception is thrown when producer is unable to allocate memory to the record within the configured max blocking time 5 ms.
This is what it says when I was trying to add Kafka s3-sink connectors. There are 11 topics in two kafka brokers and there were consumers present already consuming from these topics. I was spinning out a 2 node Kafka connect cluster with 11 connectors trying to consume from these topics. But there was a huge spike in errors when I started these s3-sink connectors. Once I stopped these connectors, the errors dropped and seemed to be fine. Then I started the consumers again with less number of tasks and this time the errors spiked up when there was a sudden surge in the traffic and back to normal when the traffic was back to normal. There was a max retry of 5 and it messages failed to write even after 5 attempts.
From whatever I had read, it might be due to producer batch size or producer rate being higher than the consumer rate. And I guess each consumer will be occupying upto 64 mb when there is bursty traffic. Could that be the reason? Should I try and increase the blocking time?
Producer Config:
lingerTime: 0
maxBlockTime: 5
bufferMemory: 1024000
batchSize: 102400
ack: "1"
maxRequestSize: 102400
retries: 1
maxInFlightRequestsPerConn: 1000

It was actually due to the increase in the IOPS of the EC2 instances that Kafka brokers couldn't handle. Increasing the number of bytes fetched per poll and decreasing the frequency of polls fixed it.

Related

Kafka - broker partitions not in-sync after restart

We use 3 node kafka clusters running 2.7.0 with quite high number of topics and partitions. Almost all the topics have only 1 partition and replication factor of 3 so that gives us roughly:
topics: 7325
partitions total in cluster (including replica): 22110
Brokers are relatively small with
6vcpu
16gb memory
500GB in /var/lib/kafka occupied by partitions data
As you can imagine because we have 3 brokers and replication factor 3 the data is very evenly spread across brokers. Each broker leads very similar (same) amount of partitions and the number of partitions per broker is equal. Under normal circumstances.
Before doing rolling restart yesterday everything was in-sync. We stopped the process and started it again after 1 minute. It took some 10minutes to get synchronized with Zookeeper and start listening on port.
After saing 'Kafka server started'. Nothing is happening. There is no CPU, memory or disk activity. The partition data is visible on data disk. There are no messages in log for more than 1 day now since process booted up.
We've tried restarting zookeeper cluster (one by one). We've tried restart of broker again. Now it's been 24 hours since last restart and still not change.
Broker itself is reporting it leads 0 partitions. Leadership for all the partitions moved to other brokers and they are reporting that everything located in this broker is not in sync.
I'm aware the number of partitions per broker is far exceeding the recommendation but I'm still confused by lack of any activity or log messages. Any ideas what should be checked further? It looks like something is stuck somewhere. I checked the kafka ACLs and there are no block messages related to broker username.
I tried another restart with DEBUG mode and it seems there is some problem with metadata. These two messages are constantly repeating:
[2022-05-13 16:33:25,688] DEBUG [broker-1-to-controller-send-thread]: Controller isn't cached, looking for local metadata changes (kafka.server.BrokerToControllerRequestThread)
[2022-05-13 16:33:25,688] DEBUG [broker-1-to-controller-send-thread]: No controller defined in metadata cache, retrying after backoff (kafka.server.BrokerToControllerRequestThread)
With kcat it's also impossible to fetch metadata about topics (meaning if I specify this broker as bootstrap server).

Kafka is trying to send messages to a broker in "recovery mode"

I have the following setup
3 Kafka (v2.1.1) Brokers
5 Zookeeper instances
Kafka brokers have the following configuration:
auto.create.topics.enable: 'false'
default.replication.factor: 1
delete.topic.enable: 'false'
log.cleaner.threads: 1
log.message.format.version: '2.1'
log.retention.hours: 168
num.partitions: 1
offsets.topic.replication.factor: 1
transaction.state.log.min.isr: '2'
transaction.state.log.replication.factor: '3'
zookeeper.connection.timeout.ms: 10000
zookeeper.session.timeout.ms: 10000
min.insync.replicas: '2'
request.timeout.ms: 30000
Producer configuration (using Spring Kafka) is more or less as following:
...
acks: all
retries: Integer.MAX_VALUE
deployment.timeout.ms: 360000ms
enable.idempotence: true
...
This configuration I read as follows: There are three Kafka brokers, but once one of them dies, it is fine if only at least two replicate and persist the data before sending the ack back (= in sync replicas). In case of failure, Kafka producer will keep retrying for 6 minutes, but then gives up.
This is the scenario which causes me headache:
All Kafka and Zookeeper instances are up and alive
I start sending messages in chunks (500 pcs each)
In the middle of the processing, one of the Brokers dies (hard kill)
Immediately, I see logs like 2019-08-09 13:06:39.805 WARN 1 --- [b6b45bb5c-7dxh7] o.a.k.c.NetworkClient : [Producer clientId=bla-6b6b45bb5c-7dxh7, transactionalId=bla-6b6b45bb5c-7dxh70] 4 partitions have leader brokers without a matching listener, including [...] (question 1: I do not see any further messages coming in, does this really mean the whole cluster is now stuck and waiting for the dead Broker to come back???)
After the dead Broker starts to boot up again, it starts with recovery of the corrupted index. This operation takes more than 10 minutes as I have a lot of data on the Kafka cluster
Every 30s, the producer tries to send the message again (due to request.timeout.ms property set to 30s)
Since my deployment.timeout.ms is se to 6 minutes and the Broker needs 10 minutes to recover and does not persist the data until then, the producer gives up and stops retrying = I potentially lose the data
The questions are
Why the Kafka cluster waits until the dead Broker comes back?
When the producer realizes the Broker does not respond, why it does not try to connect another Broker?
The thread is completely stuck for 6 minutes and waiting until the dead Broker recovers, how can I tell the producer to rather try another Broker?
Am I missing something or is there any good practice to avoid such scenario?
You have a number of questions, I'll take a shot at providing our experience which will hopefully shed light on some of them.
In my product, IBM IDR Replication, we had to provide information for robustness to customers who's topics were being rebalanced, or whom had lost a broker in their clusters. The results of some of our testing was the simply setting the request timeout was not sufficient because in certain circumstances the request would decide not to wait the entire time, and rather perform another retry almost instantly. This burned through the configured number of retries Ie. there are circumstances where the timeout period is circumvented.
As such we instructed users to utilize a formula like the following...
https://www.ibm.com/support/knowledgecenter/en/SSTRGZ_11.4.0/com.ibm.cdcdoc.cdckafka.doc/tasks/robust.html
"To tune the values for your environment, adjust the Kafka producer properties retry.backoff.ms and retries according to the following formula:
retry.backoff.ms * retries > the anticipated maximum time for leader change metadata to propagate in the clusterCopy
For example, you might wish to configure retry.backoff.ms=300, retries=150 and max.in.flight.requests.per.connection=1."
So maybe try utilizing retries and retry.backoff.ms. Note that utilizing retries without idempotence can cause batches to be written out of order if you have more than one in flight... so choose accordingly based on your business logic.
It was our experience that the Kafka Producer writes to the broker which is the leader for the topic, and so you have to wait for the new leader to be elected. When it is, if the retry process is still ongoing, the producer transparently determines the new leader and writes data accordingly.

Kafka broker taking too long to come up

Recently, one of our Kafka broker (out of 5) got shut down incorrectly. Now that we are starting it up again, there are a lot of warning messages about corrupted index files and the broker is still starting up even after 24 hours. There is over 400 GB of data in this broker.
Although the rest of the brokers are up and running but some of the partitions are showing -1 as their leader and the bad broker as the only ISR. I am not seeing other Replicas to be appointed as new leaders, maybe because the bad broker is the only one in sync for those partitions.
Broker Properties:
Replication Factor: 3
Min In Sync Replicas: 1
I am not sure how to handle this. Should I wait for the broker to fix everything itself? is it normal to take so much time?
Is there anything else I can do? Please help.
After an unclean shutdown, a broker can take a while to restart as it has to do log recovery.
By default, Kafka only uses a single thread per log directory to perform this recovery, so if you have thousands of partitions it can take hours to complete.
To speed that up, it's recommended to bump num.recovery.threads.per.data.dir. You can set it to the number of CPU cores.

Cross-Datacenter Kafka Cluster

We operate a relatively low throughput Kafka cluster with brokers in 2 data centers. Replication factor set to guarantee both data centers host a full set of data.
The data centers have high speed interconnect with low latency. This configuration lets us operate our application hot/hot. It has been running this way for approx 8 months.
The cluster appears to be running fine (no data loss) however there are frequent ERRORs in the kafka broker logs (below). Any suggestions?
FollowerRequestProcessor: Unexpected exception causing error
StateChangeFailedException: encountered error while electing leader
for partition [alert20,4] due to: Preferred replica 34 for partition
[alert20,4[ is either not alive or not in the isr. Current leader
and ISR: [{"leader":32,"leader_epoch":49,"isr":[32]}]
LearnerHandler:Unexpected exception causing shutdown while sock still
open.
NotLeaderForPartitionException: This server is not the leader for
that topic-partition

Kafka producer is not able to update metadata after some time

I have an kafka environment which has 3 brokers and 1 zookeeper. I had pushed around >20K message in my topic. Apache Storm is computing the data in topic which is added by producer.
After few hours passed, While I am trying to produce messages to kafka, its showing the following exception
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
After restarting the kafka servers its working fine.
but on production i can't restart my server everytime.
so can any one help me out to figure out my issue.
my kafka configuration are as follows :
prodProperties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,"list of broker");
prodProperties.put(ProducerConfig.ACKS_CONFIG, "1");
prodProperties.put(ProducerConfig.RETRIES_CONFIG, "3");
prodProperties.put(ProducerConfig.LINGER_MS_CONFIG, 5);
Although Kafka producer tuning is a quite hard topic, I can imagine that your producer is trying to generate records faster than it can transfer to your Kafka cluster.
There is a producer setting buffer.memory which defines how much memory producer can use before blocking. Default value is 33554432 (33 MB).
If you increase the producer memory, you will avoid blocking. Try different values, like 100MB.