Kafka Producer 0.9 performance issue with small messages - apache-kafka

We are observing very poor performance with a Java Kafka Producer 0.9 client when sending small messages. The messages are not being accumulated into a larger request batch and thus each small record is being sent separately.
What is wrong with our client configuration? Or is this some other issue?
We are using Kafka Client 0.9.0.0. We did not see any related postings in the fixed or unresolved issue lists for the unreleased Kafka 0.9.0.1 or 0.9.1, so we are focused on our client configuration and server instance.
We understand that linger.ms should cause the client to accumulate records into a batch.
We set linger.ms to 10 (and also tried 100 and 1000), but none of these values resulted in the batch accumulating records. With a record size of about 100 bytes and a request buffer size of 16K, we would have expected about 160 messages to be sent in a single request.
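The expectation above is simple arithmetic, sketched below. The class and method names are hypothetical, and the 110-byte figure is an assumption: it is taken from the batch-size-max metric reported later in this post for a single-record batch, used here as a rough per-record footprint including overhead.

```java
final class BatchEstimate {
    /** Upper bound on how many records of a given size fit in one batch buffer. */
    static int recordsPerBatch(int batchSizeBytes, int bytesPerRecord) {
        return batchSizeBytes / bytesPerRecord;
    }

    public static void main(String[] args) {
        int batchSize = 16384; // batch.size from the ProducerConfig dump

        // About 100 bytes of payload per record, as stated in the question.
        System.out.println(recordsPerBatch(batchSize, 100)); // prints 163

        // Assumed footprint of 110 bytes per record including overhead.
        System.out.println(recordsPerBatch(batchSize, 110)); // prints 148
    }
}
```

Either way, a full batch would carry well over a hundred records, which is far from the records-per-request-avg of 1.0 shown in the metrics.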
The trace at the client seems to indicate that the partition may be full, despite having allocated a fresh Bluemix Messaging Hub (Kafka Server 0.9) service instance. The test client is sending multiple messages in a loop with no other I/O.
The log shows a repeating sequence with a suspect line: "Waking up the sender since topic mytopic partition 0 is either full or getting a new batch".
The newly allocated partition should be essentially empty in our test case, so why would the producer client be getting a new batch for every record?
2015-12-10 15:14:41,335 3677 [main] TRACE com.isllc.client.producer.ExploreProducer - Sending record: Topic='mytopic', Key='records', Value='Kafka 0.9 Java Client Record Test Message 00011 2015-12-10T15:14:41.335-05:00'
2015-12-10 15:14:41,336 3678 [main] TRACE org.apache.kafka.clients.producer.KafkaProducer - Sending record ProducerRecord(topic=mytopic, partition=null, key=[B@670b40af, value=[B@4923ab24 with callback null to topic mytopic partition 0
2015-12-10 15:14:41,336 3678 [main] TRACE org.apache.kafka.clients.producer.internals.RecordAccumulator - Allocating a new 16384 byte message buffer for topic mytopic partition 0
2015-12-10 15:14:41,336 3678 [main] TRACE org.apache.kafka.clients.producer.KafkaProducer - Waking up the sender since topic mytopic partition 0 is either full or getting a new batch
2015-12-10 15:14:41,348 3690 [kafka-producer-network-thread | ExploreProducer] TRACE org.apache.kafka.clients.producer.internals.Sender - Nodes with data ready to send: [Node(0, kafka01-prod01.messagehub.services.us-south.bluemix.net, 9094)]
2015-12-10 15:14:41,348 3690 [kafka-producer-network-thread | ExploreProducer] TRACE org.apache.kafka.clients.producer.internals.Sender - Created 1 produce requests: [ClientRequest(expectResponse=true, callback=org.apache.kafka.clients.producer.internals.Sender$1@6d62e963, request=RequestSend(header={api_key=0,api_version=1,correlation_id=11,client_id=ExploreProducer}, body={acks=-1,timeout=30000,topic_data=[{topic=mytopic,data=[{partition=0,record_set=java.nio.HeapByteBuffer[pos=0 lim=110 cap=16384]}]}]}), createdTimeMs=1449778481348, sendTimeMs=0)]
2015-12-10 15:14:41,412 3754 [kafka-producer-network-thread | ExploreProducer] TRACE org.apache.kafka.clients.producer.internals.Sender - Received produce response from node 0 with correlation id 11
2015-12-10 15:14:41,412 3754 [kafka-producer-network-thread | ExploreProducer] TRACE org.apache.kafka.clients.producer.internals.RecordBatch - Produced messages to topic-partition mytopic-0 with base offset offset 130 and error: null.
2015-12-10 15:14:41,412 3754 [main] TRACE com.isllc.client.producer.ExploreProducer - Send returned metadata: Topic='mytopic', Partition=0, Offset=130
2015-12-10 15:14:41,412 3754 [main] TRACE com.isllc.client.producer.ExploreProducer - Sending record: Topic='mytopic', Key='records', Value='Kafka 0.9 Java Client Record Test Message 00012 2015-12-10T15:14:41.412-05:00'
Log entries like the above repeat for each record sent.
We provided the following properties file:
2015-12-10 15:14:37,843 185 [main] INFO com.isllc.client.AbstractClient - Properties retrieved from file for Kafka client: kafka-producer.properties
2015-12-10 15:14:37,909 251 [main] INFO com.isllc.client.AbstractClient - acks=-1
2015-12-10 15:14:37,909 251 [main] INFO com.isllc.client.AbstractClient - ssl.protocol=TLSv1.2
2015-12-10 15:14:37,909 251 [main] INFO com.isllc.client.AbstractClient - key.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
2015-12-10 15:14:37,910 252 [main] INFO com.isllc.client.AbstractClient - client.id=ExploreProducer
2015-12-10 15:14:37,910 252 [main] INFO com.isllc.client.AbstractClient - ssl.truststore.identification.algorithm=HTTPS
2015-12-10 15:14:37,910 252 [main] INFO com.isllc.client.AbstractClient - value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
2015-12-10 15:14:37,910 252 [main] INFO com.isllc.client.AbstractClient - ssl.truststore.password=changeit
2015-12-10 15:14:37,910 252 [main] INFO com.isllc.client.AbstractClient - ssl.truststore.type=JKS
2015-12-10 15:14:37,910 252 [main] INFO com.isllc.client.AbstractClient - ssl.enabled.protocols=TLSv1.2
2015-12-10 15:14:37,910 252 [main] INFO com.isllc.client.AbstractClient - ssl.truststore.location=/Library/Java/JavaVirtualMachines/jdk1.8.0_51.jdk/Contents/Home/jre/lib/security/cacerts
2015-12-10 15:14:37,910 252 [main] INFO com.isllc.client.AbstractClient - bootstrap.servers=kafka01-prod01.messagehub.services.us-south.bluemix.net:9094,kafka02-prod01.messagehub.services.us-south.bluemix.net:9094,kafka03-prod01.messagehub.services.us-south.bluemix.net:9094,kafka04-prod01.messagehub.services.us-south.bluemix.net:9094,kafka05-prod01.messagehub.services.us-south.bluemix.net:9094
2015-12-10 15:14:37,910 252 [main] INFO com.isllc.client.AbstractClient - security.protocol=SASL_SSL
Plus we added linger.ms=10 in code.
The Kafka client then logs the expanded/merged configuration list (including the linger.ms setting):
2015-12-10 15:14:37,970 312 [main] INFO org.apache.kafka.clients.producer.ProducerConfig - ProducerConfig values:
compression.type = none
metric.reporters = []
metadata.max.age.ms = 300000
metadata.fetch.timeout.ms = 60000
reconnect.backoff.ms = 50
sasl.kerberos.ticket.renew.window.factor = 0.8
bootstrap.servers = [kafka01-prod01.messagehub.services.us-south.bluemix.net:9094, kafka02-prod01.messagehub.services.us-south.bluemix.net:9094, kafka03-prod01.messagehub.services.us-south.bluemix.net:9094, kafka04-prod01.messagehub.services.us-south.bluemix.net:9094, kafka05-prod01.messagehub.services.us-south.bluemix.net:9094]
retry.backoff.ms = 100
sasl.kerberos.kinit.cmd = /usr/bin/kinit
buffer.memory = 33554432
timeout.ms = 30000
key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
ssl.keystore.type = JKS
ssl.trustmanager.algorithm = PKIX
block.on.buffer.full = false
ssl.key.password = null
max.block.ms = 60000
sasl.kerberos.min.time.before.relogin = 60000
connections.max.idle.ms = 540000
ssl.truststore.password = [hidden]
max.in.flight.requests.per.connection = 5
metrics.num.samples = 2
client.id = ExploreProducer
ssl.endpoint.identification.algorithm = null
ssl.protocol = TLSv1.2
request.timeout.ms = 30000
ssl.provider = null
ssl.enabled.protocols = [TLSv1.2]
acks = -1
batch.size = 16384
ssl.keystore.location = null
receive.buffer.bytes = 32768
ssl.cipher.suites = null
ssl.truststore.type = JKS
security.protocol = SASL_SSL
retries = 0
max.request.size = 1048576
value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
ssl.truststore.location = /Library/Java/JavaVirtualMachines/jdk1.8.0_51.jdk/Contents/Home/jre/lib/security/cacerts
ssl.keystore.password = null
ssl.keymanager.algorithm = SunX509
metrics.sample.window.ms = 30000
partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
send.buffer.bytes = 131072
linger.ms = 10
The Kafka metrics after sending 100 records:
Duration for 100 sends 8787 ms. Sent 7687 bytes.
batch-size-avg = 109.87 [The average number of bytes sent per partition per-request.]
batch-size-max = 110.0 [The max number of bytes sent per partition per-request.]
buffer-available-bytes = 3.3554432E7 [The total amount of buffer memory that is not being used (either unallocated or in the free list).]
buffer-exhausted-rate = 0.0 [The average per-second number of record sends that are dropped due to buffer exhaustion]
buffer-total-bytes = 3.3554432E7 [The maximum amount of buffer memory the client can use (whether or not it is currently used).]
bufferpool-wait-ratio = 0.0 [The fraction of time an appender waits for space allocation.]
byte-rate = 291.8348916277093 []
compression-rate = 0.0 []
compression-rate-avg = 0.0 [The average compression rate of record batches.]
connection-close-rate = 0.0 [Connections closed per second in the window.]
connection-count = 2.0 [The current number of active connections.]
connection-creation-rate = 0.05180541884681138 [New connections established per second in the window.]
incoming-byte-rate = 10.342564641029007 []
io-ratio = 0.0038877559207471236 [The fraction of time the I/O thread spent doing I/O]
io-time-ns-avg = 353749.2840375587 [The average length of time for I/O per select call in nanoseconds.]
io-wait-ratio = 0.21531227995769162 [The fraction of time the I/O thread spent waiting.]
io-wait-time-ns-avg = 1.9591901192488264E7 [The average length of time the I/O thread spent waiting for a socket ready for reads or writes in nanoseconds.]
metadata-age = 8.096 [The age in seconds of the current producer metadata being used.]
network-io-rate = 5.2937784999213795 [The average number of network operations (reads or writes) on all connections per second.]
outgoing-byte-rate = 451.2298783403283 []
produce-throttle-time-avg = 0.0 [The average throttle time in ms]
produce-throttle-time-max = 0.0 [The maximum throttle time in ms]
record-error-rate = 0.0 [The average per-second number of record sends that resulted in errors]
record-queue-time-avg = 15.5 [The average time in ms record batches spent in the record accumulator.]
record-queue-time-max = 434.0 [The maximum time in ms record batches spent in the record accumulator.]
record-retry-rate = 0.0 []
record-send-rate = 2.65611304417116 [The average number of records sent per second.]
record-size-avg = 97.87 [The average record size]
record-size-max = 98.0 [The maximum record size]
records-per-request-avg = 1.0 [The average number of records per request.]
request-latency-avg = 0.0 [The average request latency in ms]
request-latency-max = 74.0 []
request-rate = 2.6468892499606897 [The average number of requests sent per second.]
request-size-avg = 42.0 [The average size of all requests in the window..]
request-size-max = 170.0 [The maximum size of any request sent in the window.]
requests-in-flight = 0.0 [The current number of in-flight requests awaiting a response.]
response-rate = 2.651196976060479 [The average number of responses received per second.]
select-rate = 10.989861465830819 [Number of times the I/O layer checked for new I/O to perform per second]
waiting-threads = 0.0 [The number of user threads blocked waiting for buffer memory to enqueue their records]
Thanks

Guozhang Wang on the Kafka Users mailing list was able to recognize the problem by reviewing our application code:
Guozhang,
Yes - you identified the problem!
We had inserted the .get() for debugging, but didn’t think of the
(huge!) side-effects.
Using the async callback works perfectly well.
We are now able to send 100,000 records in 14 sec from a laptop to the
Bluemix cloud - ~1000x faster,
Thank you very much!
Gary
On Dec 13, 2015, at 2:48 PM, Guozhang Wang wrote:
Gary,
You are calling "kafkaProducer.send(record).get();" for each message.
The get() call blocks until the Future completes, which
effectively makes every send synchronous: you wait for the ACK of
each message before sending the next one, hence no batching.
You can try using "send(record, callback)" for async sending and let
the callback handle errors from the returned metadata.
Guozhang
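The effect Guozhang describes can be illustrated with a toy model. This is only a sketch: ToyProducer and its methods are hypothetical stand-ins written for this example, not Kafka's API or its RecordAccumulator. It shows that when the caller waits for each acknowledgement, nothing else can join the request, whereas non-blocking sends let records pile up into one request.

```java
import java.util.ArrayList;
import java.util.List;

final class BlockingSendSketch {
    /** Toy stand-in for a producer: queues records and sends the queue as one request. */
    static final class ToyProducer {
        private final List<String> queued = new ArrayList<>();
        private final List<Integer> recordsPerRequest = new ArrayList<>();

        /** Models send(record, callback): enqueue and return immediately. */
        void sendAsync(String record) {
            queued.add(record);
        }

        /** Models send(record).get(): the caller waits for the ack of this record. */
        void sendAndWait(String record) {
            queued.add(record);
            drain(); // nothing else can be queued while the caller is blocked
        }

        /** Sends everything queued so far as a single request. */
        void drain() {
            if (queued.isEmpty()) {
                return;
            }
            recordsPerRequest.add(queued.size());
            queued.clear();
        }

        List<Integer> recordsPerRequest() {
            return recordsPerRequest;
        }
    }

    /** Sends the given number of records and returns the size of each request made. */
    static List<Integer> run(int records, boolean waitForEachAck) {
        ToyProducer producer = new ToyProducer();
        for (int i = 0; i < records; i++) {
            String record = "message-" + i;
            if (waitForEachAck) {
                producer.sendAndWait(record);
            } else {
                producer.sendAsync(record);
            }
        }
        producer.drain(); // stands in for linger.ms expiring or the batch filling up
        return producer.recordsPerRequest();
    }

    public static void main(String[] args) {
        System.out.println(run(5, true));  // prints [1, 1, 1, 1, 1]
        System.out.println(run(5, false)); // prints [5]
    }
}
```

The first line of output mirrors the records-per-request-avg of 1.0 in the metrics above; the second is the batching that linger.ms was expected to produce.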

Related

few kafka stream tasks stopped running in a stream application

We have a Stream Topology with 2 source topics, 3 processors, 5 state stores and 4 sink topics.
Those two topics have 12 partitions.
Our application runs in Kubernetes. It has three pods, so we have 3 instances of the streams application, one per pod.
When we install the application, a rebalance happens and the stream instances in the three pods go to the running state with 4 tasks each.
These streams work fine: they consume events, update the state stores, and produce messages through the sinks.
But sometimes message consumption from some partitions stops in one or two stream instances. Message consumption from other partitions in the same instance continues fine. So I think that the corresponding tasks suddenly stopped running in those pods. There are no warning or error messages in the streams client's log files.
Edited: Added Section:
As no consumer join request or rebalance is triggered, I think the tasks are running but somehow no records are consumed.
There are two log messages after this.
17:43:44.105 [kafka-admin-client-thread | XXXXX-admin] INFO o.apache.kafka.clients.NetworkClient - [AdminClient clientId=XXXXX-admin] Node -1 disconnected.
17:42:45.050 [XXXXX-StreamThread-1] INFO o.a.k.s.p.internals.StreamThread - stream-thread [XXXXX-StreamThread-1] Processed 11036 total records, ran 0 punctuators, and committed 45 total tasks since the last update
What could be the reason for that? I could not find the root cause and am out of ideas. Could someone please help?
We are using kafka-streams-3.1.1.
We have the config: session.timeout.ms: 10000
StreamsConfig values:
acceptable.recovery.lag = 10000
application.id = XXXXX
application.server =
bootstrap.servers = XXXXX
buffered.records.per.partition = 1000
built.in.metrics.version = latest
cache.max.bytes.buffering = 10485760
client.id =
commit.interval.ms = 1000
connections.max.idle.ms = 540000
default.deserialization.exception.handler = class org.apache.kafka.streams.errors.LogAndContinueExceptionHandler
default.key.serde = class org.apache.kafka.common.serialization.Serdes$StringSerde
default.list.key.serde.inner = null
default.list.key.serde.type = null
default.list.value.serde.inner = null
default.list.value.serde.type = null
default.production.exception.handler = class org.apache.kafka.streams.errors.DefaultProductionExceptionHandler
default.timestamp.extractor = class org.apache.kafka.streams.processor.FailOnInvalidTimestamp
default.value.serde = null
max.task.idle.ms = 0
max.warmup.replicas = 2
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
num.standby.replicas = 0
num.stream.threads = 1
poll.ms = 50
probing.rebalance.interval.ms = 600000
processing.guarantee = at_least_once
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
replication.factor = 3
request.timeout.ms = 40000
retries = 0
retry.backoff.ms = 100
rocksdb.config.setter = null
security.protocol = SASL_SSL
send.buffer.bytes = 131072
state.cleanup.delay.ms = 600000
state.dir = /tmp/kafka-streams
task.timeout.ms = 300000
topology.optimization = none
upgrade.from = null
window.size.ms = null
windowed.inner.class.serde = null
windowstore.changelog.additional.retention.ms = 86400000

Asynchronous auto-commit of offsets fails

I have a question on Kafka auto-commit mechanism.
I'm using Spring-Kafka with auto-commit enabled.
As an experiment, I disconnected my consumer's connection to Kafka for 30 seconds while the system was idle (no new messages in the topic, no messages being processed).
After reconnecting I got a few messages like so:
Asynchronous auto-commit of offsets {cs-1915-2553221872080030-0=OffsetAndMetadata{offset=19, leaderEpoch=0, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.
First, I don't understand what is there to commit? The system was idle (all previous messages were already committed).
Second, the disconnection time was 30 seconds, much less than the 5 minutes (300000 ms) max.poll.interval.ms
Third, in an uncontrolled failure of Kafka I got at least 30K messages of this type, which was resolved by restarting the process. Why is this happening?
I'm listing here my consumer configuration:
allow.auto.create.topics = true
auto.commit.interval.ms = 100
auto.offset.reset = latest
bootstrap.servers = [kafka1-eu.dev.com:9094, kafka2-eu.dev.com:9094, kafka3-eu.dev.com:9094]
check.crcs = true
client.dns.lookup = default
client.id =
client.rack =
connections.max.idle.ms = 540000
default.api.timeout.ms = 60000
enable.auto.commit = true
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = feature-cs-1915-2553221872080030
group.instance.id = null
heartbeat.interval.ms = 3000
interceptor.classes = []
internal.leave.group.on.close = true
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 300000
max.poll.records = 500
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = SSL
send.buffer.bytes = 131072
session.timeout.ms = 15000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = https
ssl.key.password = [hidden]
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = /home/me/feature-2553221872080030.keystore
ssl.keystore.password = [hidden]
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = /home/me/feature-2553221872080030.truststore
ssl.truststore.password = [hidden]
ssl.truststore.type = JKS
value.deserializer = class org.springframework.kafka.support.serializer.ErrorHandlingDeserializer2
First, I don't understand what is there to commit?
You are right, there is nothing new to commit if no new data is flowing. However, with enable.auto.commit set to true and your consumer still running (even without being able to connect to the broker), the poll method is still responsible for the following steps:
Fetch messages from assigned partitions
Trigger partition assignment (if necessary)
Commit offsets if auto offset commit is enabled
Together with your interval of 100 ms (see auto.commit.interval.ms), the consumer still tries to asynchronously commit its (unchanged) offset position.
Second, the disconnection time was 30 seconds, much less than the 5 minutes (300000 ms) max.poll.interval.ms
It is not max.poll.interval.ms that is causing the rebalance but rather the combination of your heartbeat.interval.ms and session.timeout.ms settings. Your consumer sends heartbeats from a background thread at the configured interval, in your case every 3 seconds. If the broker receives no heartbeat before the session timeout expires (in your case 15 seconds), it removes this client from the group and initiates a rebalance.
A more detailed description of the configurations I mentioned is given in the Kafka documentation on Consumer Configs.
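As a rough sketch of that reasoning, the check below plugs in the values from the question. The class and method names are hypothetical, and the model is deliberately simplified: it assumes no heartbeat reaches the broker for the whole outage and ignores reconnect timing.

```java
import java.time.Duration;

final class SessionExpirySketch {
    /** True if the outage lasts longer than the session timeout, so the member is removed. */
    static boolean sessionExpired(Duration outage, Duration sessionTimeout) {
        return outage.compareTo(sessionTimeout) > 0;
    }

    /** How many whole intervals of the given length fit into the outage. */
    static long intervalsElapsed(Duration outage, Duration interval) {
        return outage.toMillis() / interval.toMillis();
    }

    public static void main(String[] args) {
        Duration outage = Duration.ofSeconds(30);              // disconnection in the experiment
        Duration sessionTimeout = Duration.ofMillis(15000);    // session.timeout.ms
        Duration heartbeatInterval = Duration.ofMillis(3000);  // heartbeat.interval.ms
        Duration maxPollInterval = Duration.ofMillis(300000);  // max.poll.interval.ms
        Duration autoCommitInterval = Duration.ofMillis(100);  // auto.commit.interval.ms

        System.out.println(sessionExpired(outage, sessionTimeout));       // prints true
        System.out.println(outage.compareTo(maxPollInterval) > 0);        // prints false
        System.out.println(intervalsElapsed(outage, heartbeatInterval));  // prints 10
        System.out.println(intervalsElapsed(outage, autoCommitInterval)); // prints 300
    }
}
```

The 30-second outage is twice the session timeout but only a tenth of max.poll.interval.ms, which is why the rebalance is explained by missed heartbeats even though the error text mentions max.poll.interval.ms. The last figure counts elapsed auto-commit intervals, not log messages.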
Third, in an uncontrolled failure of Kafka I got at least 30K messages of this type, which was resolved by restarting the process. Why is this happening?
That seems to be a combination of the first two questions: heartbeats cannot be sent, and the consumer still tries to commit through the continuously called poll method.
As @GaryRussell mentioned in his comment, I would be careful about using enable.auto.commit and would rather take control of offset management yourself.

How to improve the performance to read from kafka and forward to kafka with kafka Stream Application

I have a Kafka Streams application using the 1.0.0 Kafka Streams API. I have a single 0.10.2.0 Kafka broker and a single topic with a single partition. All configurable parameters are unchanged except the producer request.timeout.ms. I configured the producer request.timeout.ms to 5 minutes to fix the "Kafka Streams program is throwing exceptions when producing" issue.
In my stream application, I read events from Kafka, process them, and forward them to another topic on the same Kafka broker.
After collecting statistics, I observed that processing takes 5% of the time and the remaining 95% is spent reading and writing.
Even though I have tens of millions of events in Kafka, sometimes a Kafka poll returns a single-digit number of records and sometimes it returns thousands of records.
Sometimes context forward takes more time to send fewer records to Kafka, and sometimes it takes less time to send more records.
I tried to improve read performance by increasing the max.poll.records and poll.ms values, but no luck.
How can I improve performance while reading and forwarding? How do Kafka poll and forward work? Which parameters help to improve performance?
The following are a few important producer config parameters in my application:
acks = 1
batch.size = 16384
buffer.memory = 33554432
compression.type = none
connections.max.idle.ms = 540000
enable.idempotence = false
linger.ms = 100
max.block.ms = 60000
max.in.flight.requests.per.connection = 5
max.request.size = 1048576
metadata.max.age.ms = 300000
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 240000
retries = 10
retry.backoff.ms = 100
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
transaction.timeout.ms = 60000
transactional.id = null
The following are a few important consumer config parameters in my application:
auto.commit.interval.ms = 5000
auto.offset.reset = earliest
check.crcs = true
connections.max.idle.ms = 540000
enable.auto.commit = false
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
heartbeat.interval.ms = 3000
internal.leave.group.on.close = false
isolation.level = read_uncommitted
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 2147483647
max.poll.records = 10000
metadata.max.age.ms = 300000
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 305000
retry.backoff.ms = 100
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 10000
The following are a few important stream config parameters in my application:
application.server =
buffered.records.per.partition = 1000
cache.max.bytes.buffering = 10485760
commit.interval.ms = 30000
connections.max.idle.ms = 540000
key.serde = null
metadata.max.age.ms = 300000
num.standby.replicas = 0
num.stream.threads = 1
poll.ms = 1000
processing.guarantee = at_least_once
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
replication.factor = 1
request.timeout.ms = 40000
retry.backoff.ms = 100
rocksdb.config.setter = null
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
state.cleanup.delay.ms = 600000
timestamp.extractor = null
value.serde = null
windowstore.changelog.additional.retention.ms = 86400000
zookeeper.connect =
You can introduce parallelism in your operation by controlling the key and increasing the number of partitions of the topic.
This increases the number of Kafka Streams tasks that can be processed in parallel, which you can take advantage of by increasing the number of threads of the Kafka Streams application.
You can also create multiple Kafka consumers in different threads and assign them to the same consumer group. They will consume messages in parallel and will not lose messages.
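A small sketch of why the partition count matters here. It assumes a simple topology in which Kafka Streams creates one task per input partition, so threads beyond the number of partitions have nothing to do. The class and method names are hypothetical.

```java
final class StreamsParallelismSketch {
    /** Threads that can be busy at once: the task count (one per partition) caps the useful threads. */
    static int busyThreads(int inputPartitions, int instances, int threadsPerInstance) {
        return Math.min(inputPartitions, instances * threadsPerInstance);
    }

    public static void main(String[] args) {
        // The setup in the question: one partition, one instance, num.stream.threads = 1.
        System.out.println(busyThreads(1, 1, 1)); // prints 1

        // Adding threads without adding partitions does not help.
        System.out.println(busyThreads(1, 1, 4)); // prints 1

        // With 8 partitions, 4 threads on one instance can all be busy.
        System.out.println(busyThreads(8, 1, 4)); // prints 4
    }
}
```

In other words, with a single partition the first step is to repartition the topic; only then does raising num.stream.threads add throughput.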
How do you send messages?
With Kafka you can send messages in a Fire-and-Forget way: it improves the throughput.
producer.send(record);
The acks parameter controls how many partition replicas must receive the record before the producer can consider the write successful.
If you set acks=0, the producer will not wait for a reply from the broker before assuming the message was sent successfully. This means that a failed send goes unnoticed and the message is lost. On the other hand, because the producer is not waiting for any response from the server, it can send messages as fast as the network will support, so this setting can be used to achieve very high throughput.
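For illustration, such a producer configuration might be assembled as below. This is a sketch using plain java.util.Properties. The class name is hypothetical and the values are examples rather than recommendations: linger.ms is taken from the question, the batch.size is an arbitrary larger value, and acks=0 trades delivery guarantees for throughput.

```java
import java.util.Properties;

final class ThroughputProducerProps {
    /** Builds example producer settings that favor throughput over delivery guarantees. */
    static Properties build() {
        Properties props = new Properties();
        props.setProperty("acks", "0");           // do not wait for any broker acknowledgement
        props.setProperty("linger.ms", "100");    // wait up to 100 ms for a batch to fill
        props.setProperty("batch.size", "65536"); // example value, four times the 16384 default
        return props;
    }

    public static void main(String[] args) {
        Properties props = build();
        System.out.println("acks=" + props.getProperty("acks"));
        System.out.println("linger.ms=" + props.getProperty("linger.ms"));
        System.out.println("batch.size=" + props.getProperty("batch.size"));
    }
}
```

If losing records is not acceptable, keeping acks=1 and tuning only the batching settings is the more cautious variant.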

Kafka Consumer Rebalancing takes too long

I have a Kafka Streams Application which takes data from few topics and joins the data and puts it in another topic.
Kafka Configuration:
5 kafka brokers
Kafka Topics - 15 partitions and 3 replication factor.
Note: I am running Kafka Streams Applications on the same machines where my Kafka Brokers are running.
Few millions of records are consumed/produced every hour.
Whenever I take any Kafka broker down, the application goes into rebalancing, and the rebalance takes approx. 30 minutes or sometimes even more.
Does anyone have any idea how to solve this rebalancing issue in the Kafka consumer?
Also, it often throws an exception while rebalancing.
This is stopping us from going live in production environment with this setup. Any help would be appreciated.
Caused by: org.apache.kafka.clients.consumer.CommitFailedException: ?
Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.sendOffsetCommitRequest(ConsumerCoordinator.java:725)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:604)
at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1173)
at org.apache.kafka.streams.processor.internals.StreamTask.commitOffsets(StreamTask.java:307)
at org.apache.kafka.streams.processor.internals.StreamTask.access$000(StreamTask.java:49)
at org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:268)
at org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:187)
at org.apache.kafka.streams.processor.internals.StreamTask.commitImpl(StreamTask.java:259)
at org.apache.kafka.streams.processor.internals.StreamTask.suspend(StreamTask.java:362)
at org.apache.kafka.streams.processor.internals.StreamTask.suspend(StreamTask.java:346)
at org.apache.kafka.streams.processor.internals.StreamThread$3.apply(StreamThread.java:1118)
at org.apache.kafka.streams.processor.internals.StreamThread.performOnStreamTasks(StreamThread.java:1448)
at org.apache.kafka.streams.processor.internals.StreamThread.suspendTasksAndState(StreamThread.java:1110)
Kafka Streams Config:
bootstrap.servers=kafka-1:9092,kafka-2:9092,kafka-3:9092,kafka-4:9092,kafka-5:9092
max.poll.records = 100
request.timeout.ms=40000
The ConsumerConfig it creates internally is:
auto.commit.interval.ms = 5000
auto.offset.reset = earliest
bootstrap.servers = [kafka-1:9092, kafka-2:9092, kafka-3:9092, kafka-4:9092, kafka-5:9092]
check.crcs = true
client.id = conversion-live-StreamThread-1-restore-consumer
connections.max.idle.ms = 540000
enable.auto.commit = false
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id =
heartbeat.interval.ms = 3000
interceptor.classes = null
internal.leave.group.on.close = false
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 2147483647
max.poll.records = 100
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 40000
retry.backoff.ms = 100
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
I would recommend configuring StandbyTasks via the parameter num.standby.replicas=1 (default is 0). This should help to reduce the rebalance time significantly.
Furthermore, I would recommend upgrading your application to Kafka 0.11. Note that Streams API 0.11 is backward compatible with 0.10.1 and 0.10.2 brokers, so you don't need to upgrade your brokers for this. Rebalance behavior was heavily improved in 0.11 and will be further improved in the upcoming 1.0 release (cf. https://cwiki.apache.org/confluence/display/KAFKA/KIP-167%3A+Add+interface+for+the+state+store+restoration+process), so upgrading your application to the latest version is always an improvement for rebalancing.
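A minimal sketch of the first suggestion, using plain java.util.Properties so it stays self-contained. The class name is hypothetical, and the application.id value is an assumption inferred from the client.id prefix in the question's config dump.

```java
import java.util.Properties;

final class StandbyReplicaProps {
    /** Builds example Kafka Streams settings with one standby replica per state store. */
    static Properties build() {
        Properties props = new Properties();
        props.setProperty("application.id", "conversion-live"); // assumed from the client.id prefix
        props.setProperty("bootstrap.servers",
                "kafka-1:9092,kafka-2:9092,kafka-3:9092,kafka-4:9092,kafka-5:9092");
        props.setProperty("num.standby.replicas", "1"); // default is 0
        return props;
    }

    public static void main(String[] args) {
        System.out.println("num.standby.replicas=" + build().getProperty("num.standby.replicas"));
    }
}
```

A standby replica keeps a copy of each state store on another instance, so after a failure the new owner of a task has less state to restore from the changelog topic.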
In my experience:
First, your max.poll.records is too small given your workload (a few million records consumed/produced every hour). If max.poll.records is too small, say 1, then rebalancing takes very long. I don't know the reason.
Second, make sure the input topics of your stream app all have the same number of partitions.
For example, if APP-1 has two input topics A and B, and A has 4 partitions while B has 2, then rebalancing takes very long. However, if A and B both have 4 partitions, even if some partitions are idle, then rebalancing time is good.
Hope it helps.

Why can't I increase session.timeout.ms?

I want to increase session.timeout.ms to allow more time for processing the messages received between poll() calls. However, when I change session.timeout.ms to a value higher than 30000, the Consumer object fails to be created and the error below is thrown.
Could anyone tell me why I can't increase the session.timeout.ms value, or whether I am missing something?
0 [main] INFO org.apache.kafka.clients.consumer.ConsumerConfig - ConsumerConfig values:
request.timeout.ms = 40000
check.crcs = true
retry.backoff.ms = 100
ssl.truststore.password = null
ssl.keymanager.algorithm = SunX509
receive.buffer.bytes = 262144
ssl.cipher.suites = null
ssl.key.password = null
sasl.kerberos.ticket.renew.jitter = 0.05
ssl.provider = null
sasl.kerberos.service.name = null
session.timeout.ms = 40000
sasl.kerberos.ticket.renew.window.factor = 0.8
bootstrap.servers = [server-name:9092]
client.id =
fetch.max.wait.ms = 500
fetch.min.bytes = 50000
key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
sasl.kerberos.kinit.cmd = /usr/bin/kinit
auto.offset.reset = latest
value.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
partition.assignment.strategy = [org.apache.kafka.clients.consumer.RangeAssignor]
ssl.endpoint.identification.algorithm = null
max.partition.fetch.bytes = 2097152
ssl.keystore.location = null
ssl.truststore.location = null
ssl.keystore.password = null
metrics.sample.window.ms = 30000
metadata.max.age.ms = 300000
security.protocol = PLAINTEXT
auto.commit.interval.ms = 5000
ssl.protocol = TLS
sasl.kerberos.min.time.before.relogin = 60000
connections.max.idle.ms = 540000
ssl.trustmanager.algorithm = PKIX
group.id = test7
enable.auto.commit = false
metric.reporters = []
ssl.truststore.type = JKS
send.buffer.bytes = 131072
reconnect.backoff.ms = 50
metrics.num.samples = 2
ssl.keystore.type = JKS
heartbeat.interval.ms = 3000
Exception in thread "main" org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
    at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:624)
    at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:518)
    at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:500)
These conditions need to be kept in mind when changing session.timeout.ms:
group.max.session.timeout.ms in server.properties > session.timeout.ms in consumer.properties.
group.min.session.timeout.ms in server.properties < session.timeout.ms in consumer.properties.
request.timeout.ms > session.timeout.ms + fetch.max.wait.ms
(session.timeout.ms)/3 > heartbeat.interval.ms
session.timeout.ms > worst-case processing time of consumer records per consumer poll (ms).
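The conditions above can be turned into a small sanity check. This is only a sketch: the class and method names are made up for illustration, the broker bounds are treated as inclusive (an assumption), and the request.timeout.ms rule is encoded exactly as stated in the list rather than taken from the client source. The sample call uses the values from the question's config dump (session.timeout.ms = 40000, request.timeout.ms = 40000, fetch.max.wait.ms = 500, heartbeat.interval.ms = 3000), broker bounds of 6000 and 30000 ms, and a hypothetical worst-case processing time of 35000 ms.

```java
import java.util.ArrayList;
import java.util.List;

public class SessionTimeoutCheck {

    // Returns one message per violated condition; an empty list means all checks pass.
    static List<String> violations(long sessionMs, long groupMinMs, long groupMaxMs,
                                   long requestMs, long fetchMaxWaitMs,
                                   long heartbeatMs, long worstCaseProcessingMs) {
        List<String> problems = new ArrayList<>();
        // Broker-side range check (bounds assumed inclusive).
        if (sessionMs > groupMaxMs) {
            problems.add("session.timeout.ms exceeds group.max.session.timeout.ms");
        }
        if (sessionMs < groupMinMs) {
            problems.add("session.timeout.ms is below group.min.session.timeout.ms");
        }
        // Client-side relations, as listed in the answer.
        if (requestMs <= sessionMs + fetchMaxWaitMs) {
            problems.add("request.timeout.ms must exceed session.timeout.ms + fetch.max.wait.ms");
        }
        if (heartbeatMs * 3 >= sessionMs) {
            problems.add("heartbeat.interval.ms must be below session.timeout.ms / 3");
        }
        if (sessionMs <= worstCaseProcessingMs) {
            problems.add("session.timeout.ms must exceed the worst-case processing time per poll");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Values from the question's config dump; 35000 ms processing time is hypothetical.
        List<String> problems = violations(40000, 6000, 30000, 40000, 500, 3000, 35000);
        problems.forEach(System.out::println);
    }
}
```

With the question's values, the check flags both the broker maximum and the request.timeout.ms relation, so raising session.timeout.ms alone would not be enough under these rules.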
The allowed range for the consumer session timeout is controlled by the broker settings group.max.session.timeout.ms (default 30s) and group.min.session.timeout.ms (default 6s).
You should increase group.max.session.timeout.ms on the broker side first; otherwise you will get "The session timeout is not within an acceptable range.".
I am using spring-kafka.
I had added the following config, but the consumer still did not come up:
buildProperties.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, env.getProperty("kafka.user-events-min-bytes"));
buildProperties.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, env.getProperty("kafka.user-events-wait-time-ms") );
buildProperties.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, env.getProperty("kafka.user-events-wait-time-ms") );
buildProperties.put(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG, env.getProperty("kafka.user-events-request-timeout-ms"));
buildProperties.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, env.getProperty("kafka.user-events-wait-time-ms"));
I figured out it was failing because the container's poll timeout was 1000.
Adding the following config helped:
factory.getContainerProperties().setPollTimeout(Integer.parseInt(env.getProperty("kafka.user-events-wait-time-ms")));
While the other answers to this question correctly describe the error and how to increase session.timeout.ms, there is a better and more direct way to address the original goal:
allow longer time for processing the messages received between poll() calls
The best way to achieve this in modern Kafka versions is to directly set max.poll.interval.ms in the consumer configuration to a higher value.
Most contemporary client libraries are based on librdkafka, which has a background thread sending heartbeats. The librdkafka CONFIGURATION documentation describes session.timeout.ms as:
Client group session and failure detection timeout. The consumer sends periodic heartbeats (heartbeat.interval.ms) to indicate its liveness to the broker. If no hearts are received by the broker for a group member within the session timeout, the broker will remove the consumer from the group and trigger a rebalance.
Whereas max.poll.interval.ms (which defaults to 300000 ms, or 5 minutes) is described as:
Maximum allowed time between calls to consume messages (e.g., rd_kafka_consumer_poll()) for high-level consumers. If this interval is exceeded the consumer is considered failed and the group will rebalance in order to reassign the partitions to another consumer group member. Warning: Offset commits may be not possible at this point. Note: It is recommended to set enable.auto.offset.store=false for long-time processing applications and then explicitly store offsets (using offsets_store()) after message processing, to make sure offsets are not auto-committed prior to processing has finished. The interval is checked two times per second. See KIP-62 for more information.
Background-thread heartbeat support (KIP-62) was added to Kafka in version 0.10.1. The reason this is better than increasing session.timeout.ms is that the broker can distinguish between a consumer client disappearing entirely (e.g., crashing, network interruptions) and long processing time. In the former case, the broker can rebalance to another consumer faster.
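For the Java client, the same idea might look like the sketch below: session.timeout.ms is left at its default and only max.poll.interval.ms is raised. The class and method names are invented for illustration, the 10-minute value is an arbitrary example, and the server address and group id are copied from the question's config dump. The keys are plain strings so the snippet compiles without the Kafka jars; max.poll.interval.ms only exists in clients from 0.10.1 onward.

```java
import java.util.Properties;

public class LongProcessingConsumerConfig {

    // Builds consumer properties that allow long gaps between poll() calls.
    // session.timeout.ms is deliberately not set, so the client default applies.
    static Properties build(String bootstrapServers, String groupId, long maxPollIntervalMs) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", groupId);
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Upper bound on the time between two poll() calls before the group rebalances.
        props.put("max.poll.interval.ms", Long.toString(maxPollIntervalMs));
        return props;
    }

    public static void main(String[] args) {
        // 600000 ms (10 minutes) is an example value, not a recommendation.
        Properties props = build("server-name:9092", "test7", 600_000L);
        System.out.println("max.poll.interval.ms = " + props.getProperty("max.poll.interval.ms"));
    }
}
```

These properties would be passed to the KafkaConsumer constructor in place of the configuration that raised session.timeout.ms.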
How to set max.poll.records in Kafka-Connect API
This was solved by adding the configuration below to connect-avro-standalone.properties:
group.id=mygroup
consumer.max.poll.records=1000