Spring Kafka auto commit offset, commits same offset at commit interval even if there are no records to read? - apache-kafka

Spring Kafka with the following configuration,
enable.auto.commit = true
auto.commit.interval.ms = 1000
The topic receives no messages for a few days. Will my Kafka consumer commit the same offset again and again at the interval provided?
Will that keep the offset writes to the broker active, so that the offsets.retention.minutes = 1440 (Kafka 0.10.2 default) property doesn't kick in and reset the offsets?
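For reference, a minimal plain-Java sketch of that consumer configuration (the bootstrap address, group id, and topic name are placeholders, not taken from the question):

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class AutoCommitConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                // placeholder
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true");
        props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));        // placeholder
            while (true) {
                // Auto-commit is driven by poll(): as long as the consumer keeps polling,
                // the current position is committed at the configured interval,
                // even when no new records arrive.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                records.forEach(r -> System.out.println(r.offset() + ": " + r.value()));
            }
        }
    }
}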

Related

While producing messages, does a broker breaking down cause any exception on the Kafka producer side?

I am testing the following scenario.
I am producing messages to a sink, which is a Kafka cluster with three brokers.
If some brokers go down, does the producing side run into any issue because of the broker being down?
When I tested it locally using Flink, I generated messages and sinked them to Kafka, and I had three Kafka brokers. When I reduced the number of brokers to 2, there were no problems. And obviously, when all the brokers go down, the producer-side app throws an exception.
So, based on these facts, I think the producer-side app can stay alive without any errors as long as one broker remains. Is my assumption correct?
Below is my producer-side configuration.
acks = 1
batch.size = 16384
compression.type = lz4
connections.max.idle.ms = 540000
delivery.timeout.ms = 120000
enable.idempotence = false
key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
linger.ms = 0
partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
The replication factor is two and I have three partitions for each topic.
Any help will be appreciated.
Thanks.
It all depends on your requirements and your producer configuration. At the moment, yes you can have 2 out of 3 brokers alive and your producer will continue as normal.
This is because you have acks=1 which means only the leader has to acknowledge the message before it is considered successful. The followers don't have to acknowledge the message.
You should also check whether you have changed min.insync.replicas at the broker or topic level configuration. The default is 1, meaning only 1 in-sync replica is needed for a broker to allow acks=all requests.
Side note: you have replication=2; I'd change this so that partitions are replicated across all 3 brokers.
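If you want to check the effective min.insync.replicas for a topic programmatically rather than from the CLI, something like the following AdminClient sketch should work (the bootstrap address and topic name are placeholders):

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.common.config.ConfigResource;
import java.util.Collections;
import java.util.Properties;

public class CheckMinInsyncReplicas {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic"); // placeholder
            Config config = admin.describeConfigs(Collections.singleton(topic)).all().get().get(topic);
            // Shows the effective value: the topic override if present, otherwise the broker default (1)
            System.out.println("min.insync.replicas = " + config.get("min.insync.replicas").value());
        }
    }
}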
I'm not sure if I understood your question, but in the Kafka client API there are some retriable exceptions (like "not leader for partition", or an unreachable/unknown host).
So your producer will retry until it reaches the first of these two limits:
retries : https://kafka.apache.org/documentation/#producerconfigs_retries
delivery.timeout.ms : https://kafka.apache.org/documentation/#producerconfigs_delivery.timeout.ms
So, using the default values,
retries > 2 billion and
delivery.timeout.ms = 2 minutes,
your producer will keep retrying, but only for 2 minutes, and then fail.
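To make the retry behaviour concrete, here is a rough producer sketch; the topic name, bootstrap address, and explicitly set values are illustrative (they just restate the defaults and the acks setting mentioned above), not taken from your configuration:

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import java.util.Properties;

public class RetryingProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "1");
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);   // default: effectively unbounded
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120000);  // default: 2 minutes

        try (Producer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "hello".getBytes()), (metadata, exception) -> {
                // Retriable errors (leader changes, unreachable brokers) are retried internally;
                // the callback only sees an exception once delivery.timeout.ms has expired.
                if (exception != null) {
                    System.err.println("Send failed after retries: " + exception);
                }
            });
        }
    }
}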

Kafka topic partition has missing offsets

I have a Flink streaming application which is consuming data from a Kafka topic that has 3 partitions. Even though the application is continuously running and working without any obvious errors, I see a lag in the consumer group for the Flink app on all 3 partitions.
./kafka-consumer-groups.sh --bootstrap-server $URL --all-groups --describe
GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
group-1 topic-test 0 9566 9568 2 - - -
group-1 topic-test 1 9672 9673 1 - - -
group-1 topic-test 2 9508 9509 1 - - -
If I send new records, they get processed, but the lag still exists. I tried to view the last few records for partition 0 and this is what I got (omitting the message part):
./kafka-console-consumer.sh --topic topic-test --bootstrap-server $URL --property print.offset=true --partition 0 --offset 9560
Offset:9560
Offset:9561
Offset:9562
Offset:9563
Offset:9564
Offset:9565
The log-end-offset value is at 9568 and the current offset is at 9566. Why are these offsets not available in the console consumer and why does this lag exist?
There were a few instances where I noticed missing offsets. Example -
Offset:2344
Offset:2345
Offset:2347
Offset:2348
Why did the offset jump from 2345 to 2347 (skipping 2346)? Does this have something to do with how the producer is writing to the topic?
You can describe your topic to see any configuration that was added when it was created. If log compaction is enabled through log.cleanup.policy=compact, the behaviour at runtime is different: the lag you see can come from the log cleaner's compaction-lag settings, and the missing offsets may be due to messages produced with a key but a null value (tombstones) having been compacted away (see the small example after the excerpt below).
Configuring The Log Cleaner
The log cleaner is enabled by default. This will start the pool of cleaner threads. To enable log cleaning on a particular topic, add the log-specific property log.cleanup.policy=compact.
The log.cleanup.policy property is a broker configuration setting defined in the broker's server.properties file; it affects all of the topics in the cluster that do not have a configuration override in place. The log cleaner can be configured to retain a minimum amount of the uncompacted "head" of the log. This is enabled by setting the compaction time lag log.cleaner.min.compaction.lag.ms.
This can be used to prevent messages newer than a minimum message age from being subject to compaction. If not set, all log segments are eligible for compaction except for the last segment, i.e. the one currently being written to. The active segment will not be compacted even if all of its messages are older than the minimum compaction time lag.
The log cleaner can be configured to ensure a maximum delay after which the uncompacted "head" of the log becomes eligible for log compaction log.cleaner.max.compaction.lag.ms.
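To illustrate the tombstone case mentioned above, this is roughly what producing such a record looks like (the bootstrap address and key are placeholders; the topic name matches the question):

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;

public class TombstoneSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // A key with a null value is a tombstone: on a compacted topic the cleaner
            // eventually removes earlier records with this key (and later the tombstone itself),
            // which leaves gaps in the offsets a console consumer can read.
            producer.send(new ProducerRecord<>("topic-test", "some-key", null));
        }
    }
}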
The lag is calculated from the latest offset committed by the Kafka consumer (lag = log-end offset - latest committed offset). In general, Flink commits Kafka offsets only when it completes a checkpoint, so there is always some lag if you check it with the consumer-groups command.
That doesn't mean that Flink hasn't consumed and processed all the messages in the topic/partition; it just means that it has not committed them yet.
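As a sketch of the relevant Flink setting (the interval is an example value, and the Kafka source and the rest of the pipeline are omitted):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointCommitSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Offsets are committed back to Kafka only when a checkpoint completes,
        // so the lag reported by kafka-consumer-groups.sh trails what Flink has processed.
        env.enableCheckpointing(60_000); // example: checkpoint (and commit offsets) every 60 s

        // ... Kafka source and the rest of the pipeline go here ...

        env.execute("checkpoint-commit-sketch");
    }
}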

__consumer_offset is unable to sync

I am using MirrorMaker 2 (mm2) with the properties below.
The source (A) and sink (B) clusters each have their own separate ZooKeeper.
I consume some data from topic test in source A.
Then I stop the consumer and start the mirroring process.
When I point a consumer with the same group id at the sink, it starts consuming from the beginning. I expect it to start in the sink from where it left off in the source.
###############
A.bootstrap.servers = localhost:9092
B.bootstrap.servers = localhost:9093
A->B.enabled = true
A->B.topics = test
#B->A.enabled = true
#B->A.topics = .*
checkpoints.topic.replication.factor=1
heartbeats.topic.replication.factor=1
offset-syncs.topic.replication.factor=1
offset.storage.replication.factor=1
status.storage.replication.factor=1
config.storage.replication.factor=1
Since Kafka 2.7, MirrorMaker can automatically mirror consumer group offsets by setting sync.group.offsets.enabled=true.
In your example:
A->B.sync.group.offsets.enabled=true
Before 2.7, MirrorMaker does not automatically commit consumer group offsets, and you need to use RemoteClusterUtils to translate the offsets yourself, as in the sketch below.
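A rough sketch of that translation step, assuming connect-mirror-client is on the classpath; the group id and bootstrap address are placeholders, and the cluster alias A matches your configuration:

import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.connect.mirror.RemoteClusterUtils;
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

public class OffsetTranslationSketch {
    public static void main(String[] args) throws Exception {
        // Connection details for the target (sink) cluster B
        Map<String, Object> props = new HashMap<>();
        props.put("bootstrap.servers", "localhost:9093"); // placeholder

        // Translate the offsets that MirrorMaker mirrored from cluster A for this group
        Map<TopicPartition, OffsetAndMetadata> translated =
                RemoteClusterUtils.translateOffsets(props, "A", "my-group", Duration.ofSeconds(30));

        // The translated offsets can then be committed on cluster B, or a consumer can
        // seek() to them before it starts polling.
        translated.forEach((tp, om) -> System.out.println(tp + " -> " + om.offset()));
    }
}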

What if kafka offset manager is down

A Confluence doc shows how to fetch consumer offsets stored in Kafka: https://cwiki.apache.org/confluence/display/KAFKA/Committing+and+fetching+consumer+offsets+in+Kafka
It seems one broker is assigned as the offset manager, and all offset fetches and commits go to this broker. But what if this broker is down?
Broker offsetManager = metadataResponse.coordinator();
// if the coordinator is different from the above channel's host, then reconnect
channel.disconnect();
channel = new BlockingChannel(offsetManager.host(), offsetManager.port(),
BlockingChannel.UseDefaultBufferSize(),
BlockingChannel.UseDefaultBufferSize(),
5000 /* read timeout in millis */);
channel.connect();
By configuring:
1. offsets.topic.num.partitions: the number of partitions of the offsets commit topic, and
2. offsets.topic.replication.factor: the replication factor of the offsets topic
in the server.properties file, the offsets topic ends up with one broker acting as leader for each partition and the rest as followers, so it follows the same leader-failover mechanism as any other Kafka topic.
Hence, when the offset manager that handles offset commits goes down, the controller eventually elects one of the in-sync replicas as the new leader, and that broker becomes the new offset manager.
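On the modern client API (the BlockingChannel code above is from the old, long-deprecated API), you can ask for the current coordinator with the AdminClient; a minimal sketch, with the bootstrap address and group id as placeholders:

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ConsumerGroupDescription;
import java.util.Collections;
import java.util.Properties;

public class FindCoordinatorSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            ConsumerGroupDescription description = admin
                    .describeConsumerGroups(Collections.singletonList("my-group")) // placeholder group id
                    .describedGroups().get("my-group").get();
            // The coordinator is the broker currently leading the group's partition of
            // __consumer_offsets; if it goes down, a replica takes over and a later
            // call returns the new coordinator.
            System.out.println("Coordinator: " + description.coordinator());
        }
    }
}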

Kafka Consumer - receiving messages Inconsistently

I can send and receive messages on the command line against a local Kafka installation. I can also send messages through Java code, and those messages show up in a Kafka command prompt. I also have Java code for the Kafka consumer. The code received messages yesterday; it doesn't receive any messages this morning, however. The code has not been changed. I am wondering whether the property configuration is quite right or not. Here is my configuration:
The Producer:
bootstrap.servers - localhost:9092
group.id - test
key.serializer - StringSerializer.class.getName()
value.serializer - StringSerializer.class.getName()
and the ProducerRecord is set as
ProducerRecord<String, String>("test", "mykey", "myvalue")
The Consumer:
zookeeper.connect - "localhost:2181"
group.id - "test"
zookeeper.session.timeout.ms - 500
zookeeper.sync.time.ms - 250
auto.commit.interval.ms - 1000
key.deserializer - org.apache.kafka.common.serialization.StringDeserializer
value.deserializer - org.apache.kafka.common.serialization.StringDeserializer
and for Java code:
// Old (pre-0.9) high-level consumer API: one stream for the "test" topic
Map<String, Integer> topicCount = new HashMap<>();
topicCount.put("test", 1);
Map<String, List<KafkaStream<byte[], byte[]>>> consumerStreams =
        consumer.createMessageStreams(topicCount);
List<KafkaStream<byte[], byte[]>> streams = consumerStreams.get("test");
What is missing?
A number of things could be going on.
First, your consumer's ZooKeeper session timeout is very low, which means the consumer may be experiencing many "soft failures" due to garbage collection pauses. When this happens, the consumer group will rebalance, which can pause consumption. And if this is happening very frequently, the consumer could get into a state where it never consumes messages because it's constantly being rebalanced. I suggest increasing the ZooKeeper session timeout to 30 seconds to see if this resolves the issue. If so, you can experiment with setting it lower.
Second, can you confirm new messages are being produced to the "test" topic? Your consumer will only consume new messages that it hasn't committed yet. It's possible the topic doesn't have any new messages.
Third, do you have other consumers in the same consumer group that could be processing the messages? If one consumer is experiencing frequent soft failures, other consumers will be assigned its partitions.
Finally, you're using the "old" consumer, which will eventually be removed. If possible, I suggest moving to the "new" consumer (KafkaConsumer.java), which became available in Kafka 0.9 (a minimal sketch follows below), although I can't promise this will resolve your issue.
Hope this helps.
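For reference, a minimal new-consumer equivalent of the configuration above might look like this (the bootstrap address, group id, and topic name follow the question; the poll interval and auto.offset.reset choice are just examples):

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class NewConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // talks to the broker, not ZooKeeper
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "test");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); // read from the start if no committed offset exists

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                records.forEach(r -> System.out.println(r.key() + " = " + r.value()));
            }
        }
    }
}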