Explain replication-offset-checkpoint and recovery-point-offset-checkpoint in Kafka

Can someone explain what these files mean? They are present inside the Kafka broker log directory.
root@a2md23297l:/tmp/kafka-logs-1# cat recovery-point-offset-checkpoint
0
5
my-topic 0 0
kafkatopic_R2P1_1 0 0
my-topic 1 0
kafkatopic_R2P1 0 0
test 0 0
root@a2md23297l:/tmp/kafka-logs-1# cat replication-offset-checkpoint
0
5
my-topic 0 0
kafkatopic_R2P1_1 0 2
my-topic 1 0
kafkatopic_R2P1 0 2
test 0 57
FYI: my-topic, kafkatopic_R2P1_1, kafkatopic_R2P1, and test are the topics that were created (my-topic has two partitions).
Thanks in advance.

AFAIK: recovery-point-offset-checkpoint is the internal broker file in which Kafka tracks, for each partition, the offset up to which the log has been successfully checkpointed (flushed) to disk.
replication-offset-checkpoint is the internal broker file in which Kafka tracks, for each partition, the offset up to which messages have been successfully replicated to the other brokers (the high watermark).
For more details you can take a deeper look at: kafka/core/src/main/scala/kafka/server/LogOffsetMetadata.scala and ReplicaManager.scala. The code is commented pretty well.

Marko is spot on.
The starting two numbers: the first (0) is the version of the checkpoint file format; the second (5) is the number of partition entries that follow, i.e. the partitions present in that log directory.
The number next to each topic name (0) is the partition number of the topic.
The last number is the offset that was flushed to disk (in recovery-point-offset-checkpoint); in replication-offset-checkpoint it is the last offset up to which the replicas have successfully replicated the data.
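
As an illustration, here is a minimal sketch (Java, with a hypothetical file path) that reads a checkpoint file in the format described above: a version line, an entry-count line, then one "topic partition offset" entry per line.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class CheckpointReader {
    public static void main(String[] args) throws IOException {
        // Hypothetical path; point this at your broker's log directory.
        List<String> lines = Files.readAllLines(
            Paths.get("/tmp/kafka-logs-1/replication-offset-checkpoint"));
        int version = Integer.parseInt(lines.get(0).trim()); // first line: file format version
        int entries = Integer.parseInt(lines.get(1).trim()); // second line: number of entries
        System.out.printf("version=%d, entries=%d%n", version, entries);
        for (int i = 2; i < 2 + entries; i++) {
            // Each remaining line is: topic partition offset
            String[] parts = lines.get(i).trim().split("\\s+");
            System.out.printf("topic=%s partition=%s offset=%s%n",
                              parts[0], parts[1], parts[2]);
        }
    }
}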

Related

how to drain records in a kafka topic

During planned application maintenance activities, there is a need to drain all the messages in a Kafka topic.
In MQ, we can monitor the queue depth and start the maintenance activities once all the messages are consumed. In Kafka, do we have a similar mechanism to find out whether all messages in the topic have been consumed, so that it is safe to shut down the producer and consumer?
Using the following command you can monitor the LAG of your consumer group; once the lag is 0, there are no more messages in the topic to consume:
./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group count_errors --describe
GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG OWNER
count_errors logs 2 2908278 2908278 0 consumer-1_/10.8.0.55
count_errors logs 3 2907501 2907501 0 consumer-1_/10.8.0.43
count_errors logs 4 2907541 2907541 0 consumer-1_/10.8.0.177
count_errors logs 1 2907499 2907499 0 consumer-1_/10.8.0.115
count_errors logs 0 2907469 2907469 0 consumer-1_/10.8.0.126
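
If you prefer to check this programmatically rather than parse the CLI output, here is a rough sketch using the Java AdminClient (listConsumerGroupOffsets is available from client 2.0 onward) that computes the total lag of a group. The group name count_errors and the broker address come from the example above; adapt them to your setup.

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;
import java.util.Map;
import java.util.Properties;

public class GroupLag {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());
        props.put("group.id", "lag-checker"); // throwaway group, used only to read end offsets

        try (AdminClient admin = AdminClient.create(props);
             KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            // Committed offsets of the group being drained.
            Map<TopicPartition, OffsetAndMetadata> committed =
                admin.listConsumerGroupOffsets("count_errors")
                     .partitionsToOffsetAndMetadata().get();
            // Log-end offsets of the same partitions.
            Map<TopicPartition, Long> endOffsets = consumer.endOffsets(committed.keySet());
            long totalLag = 0;
            for (Map.Entry<TopicPartition, OffsetAndMetadata> e : committed.entrySet()) {
                totalLag += endOffsets.get(e.getKey()) - e.getValue().offset();
            }
            System.out.println("total lag = " + totalLag); // 0 means the topic is drained
        }
    }
}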

Kafka console consumer commits wrong offset when using --max-messages

I have a Kafka console consumer, version 1.1.0, that I use to get messages from Kafka.
When I use the kafka-console-consumer.sh script with the option --max-messages, it seems to be committing wrong offsets.
I've created a topic and a consumer group and read some messages:
/kafka_2.11-1.1.0/bin/kafka-consumer-groups.sh --bootstrap-server 192.168.1.23:9092 --describe --group my-consumer-group
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
test.offset 1 374 374 0 - - -
test.offset 0 0 375 375 - - -
Then I read 10 messages like this:
/kafka_2.11-1.1.0/bin/kafka-console-consumer.sh --bootstrap-server 192.168.1.23:9092 --topic test.offset --timeout-ms 1000 --max-messages 10 --consumer.config /kafka_2.11-1.1.0/config/consumer.properties
1 var_1
3 var_3
5 var_5
7 var_7
9 var_9
11 var_11
13 var_13
15 var_15
17 var_17
19 var_19
Processed a total of 10 messages
But now the offsets show that it read all the messages in the topic:
/kafka_2.11-1.1.0/bin/kafka-consumer-groups.sh --bootstrap-server 192.168.1.23:9092 --describe --group my-consumer-group
Note: This will not show information about old Zookeeper-based consumers.
Consumer group 'my-consumer-group' has no active members.
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
test.offset 1 374 374 0 - - -
test.offset 0 375 375 0 - - -
And now, when I want to read more messages, I get an error saying there are no more messages in the topic:
/kafka_2.11-1.1.0/bin/kafka-console-consumer.sh --bootstrap-server 192.168.1.23:9092 --topic test.offset --timeout-ms 1000 --max-messages 10 --consumer.config /kafka_2.11-1.1.0/config/consumer.properties
[2020-02-28 08:27:54,782] ERROR Error processing message, terminating consumer process: (kafka.tools.ConsoleConsumer$)
kafka.consumer.ConsumerTimeoutException
at kafka.consumer.NewShinyConsumer.receive(BaseConsumer.scala:98)
at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:129)
at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:84)
at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:54)
at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
Processed a total of 0 messages
What am I doing wrong? Why did the offset move to the last message in the topic instead of advancing by just 10 messages?
This is about the auto-commit feature of the Kafka consumer. As mentioned in this link:
The easiest way to commit offsets is to allow the consumer to do it
for you. If you configure enable.auto.commit=true, then every five
seconds the consumer will commit the largest offset your client
received from poll(). The five-second interval is the default and is
controlled by setting auto.commit.interval.ms. Just like everything
else in the consumer, the automatic commits are driven by the poll
loop. Whenever you poll, the consumer checks if it is time to commit,
and if it is, it will commit the offsets it returned in the last poll.
So in your case, when your consumer polls, it receives up to 500 messages (the default value of max.poll.records), and after 5 seconds it commits the largest offset returned from the last poll (375 in your case), even though you specified --max-messages as 10.
--max-messages: The maximum number of messages to
consume before exiting. If not set,
consumption is continual.
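
A commonly suggested workaround for the console tool is to set max.poll.records to match --max-messages in the --consumer.config file, so the consumer never fetches more than it commits. The more robust route is to take control of commits yourself; below is a rough sketch with the Java client (topic, group, and broker address taken from the question) that disables auto-commit and commits exactly the offsets it has processed:

import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import java.time.Duration;
import java.util.*;

public class ReadTenAndCommit {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "192.168.1.23:9092");
        props.put("group.id", "my-consumer-group");
        props.put("enable.auto.commit", "false"); // commit manually instead
        props.put("max.poll.records", "10");      // fetch at most 10 records per poll
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test.offset"));
            // On clients older than 2.0 (such as 1.1.0), use poll(long) instead.
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            Map<TopicPartition, OffsetAndMetadata> toCommit = new HashMap<>();
            for (ConsumerRecord<String, String> record : records) {
                System.out.println(record.key() + " " + record.value());
                // Remember the offset *after* each processed record, per partition.
                toCommit.put(new TopicPartition(record.topic(), record.partition()),
                             new OffsetAndMetadata(record.offset() + 1));
            }
            consumer.commitSync(toCommit); // commits only what was actually processed
        }
    }
}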

Upgrading consumer from Kafka 8 to 10 with no code changes fails in ZookeeperConsumerConnector.RebalanceListener

I changed my Maven pom.xml to use the 0.10.1.0 client jar, and without changing any of the client code I ran both a producer and consumer.
The producer added messages to the Kafka 10 cluster fine (verified by kafka-consumer-offset-checker.sh), but the consumers that should have covered the 10 partitions in the topic did not seem to register at all. All partitions are unowned.
The consumer offset and owner output:
kafka-consumer-offset-checker.sh --zookeeper localhost:2181 --topic eddude-default-topic --group optimizer-group
[2017-06-28 12:56:06,493] WARN WARNING: ConsumerOffsetChecker is deprecated and will be dropped in releases following 0.9.0. Use ConsumerGroupCommand instead. (kafka.tools.ConsumerOffsetChecker$)
Group Topic Pid Offset logSize Lag Owner
optimizer-group eddude-default-topic 0 28 28 0 none
optimizer-group eddude-default-topic 1 2 2 0 none
optimizer-group eddude-default-topic 2 87 87 0 none
optimizer-group eddude-default-topic 3 0 0 0 none
optimizer-group eddude-default-topic 4 0 0 0 none
optimizer-group eddude-default-topic 5 2 5 3 none
optimizer-group eddude-default-topic 6 80 80 0 none
optimizer-group eddude-default-topic 7 29 29 0 none
optimizer-group eddude-default-topic 8 15 15 0 none
optimizer-group eddude-default-topic 9 0 0 0 none
And here is the relevant consumer client error from my app log:
2017-06-28 12:55:24,702 ERROR [ConnectorManagerEventPool 1] An error occurred starting KafkaTopicSet 4:eddude-default-topic
kafka.common.ConsumerRebalanceFailedException: optimizer-group_L-SEA-10002721-1498679709599-7154a218 can't rebalance after 4 retries
at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:670) ~[kafka_2.10-0.10.1.0.jar:na]
at kafka.consumer.ZookeeperConsumerConnector.kafka$consumer$ZookeeperConsumerConnector$$reinitializeConsumer(ZookeeperConsumerConnector.scala:977) ~[kafka_2.10-0.10.1.0.jar:na]
at kafka.consumer.ZookeeperConsumerConnector.consume(ZookeeperConsumerConnector.scala:264) ~[kafka_2.10-0.10.1.0.jar:na]
at kafka.javaapi.consumer.ZookeeperConsumerConnector.createMessageStreams(ZookeeperConsumerConnector.scala:85) ~[kafka_2.10-0.10.1.0.jar:na]
at kafka.javaapi.consumer.ZookeeperConsumerConnector.createMessageStreams(ZookeeperConsumerConnector.scala:97) ~[kafka_2.10-0.10.1.0.jar:na]
at com.ebay.traffic.messaging.optimizer.impl.kafka.KafkaTopicSet.start(KafkaTopicSet.java:160) ~[classes/:na]
I am just using the same Kafka 8 client code I already had and ignoring the deprecation warnings for now. Shouldn't it work as-is?
I could also post details like the configuration properties and code establishing the actual producer and consumer, but I thought I'd first simply ask in case it is an obvious answer.

Kafka Consumer Attached to partition but not consuming messages

I am new to Kafka. I have a single-node Kafka broker (v 0.10.2) and a ZooKeeper (3.4.9). I am using the new Kafka consumer APIs. One strange thing I observed: when I start multiple Kafka consumers for multiple topics placed in a single group and run ./kafka-consumer-groups.sh for the group, a few of the consumers are attached to the group but do not consume any messages.
Below are the stats of group command.
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST
topic1 0 288 288 0 consumer-8-c9487cd3-573b-4c97-87c1-ddf2063ab5ae /<serverip> consumer-8
topic1 1 283 283 0 consumer-8-c9487cd3-573b-4c97-87c1-ddf2063ab5ae /<serverip> consumer-8
topic1 2 279 279 0 consumer-8-c9487cd3-573b-4c97-87c1-ddf2063ab5ae /<serverip> consumer-8
topic2 0 - 9 - consumer-1-b0476dc8-099c-4a62-a68c-e9dc9c0a5bed /<serverip> consumer-1
topic2 1 - 2 - consumer-1-b0476dc8-099c-4a62-a68c-e9dc9c0a5bed /<serverip> consumer-1
topic3 0 450 450 0 consumer-3-63c07703-17d0-471b-8c5f-17347699f108 /<serverip> consumer-3
topic4 1 - 54 - consumer-2-94dcc209-8377-45ce-8473-9ab0d85951c4 /<serverip>
topic2 2 441 441 0 consumer-5-bcfffc99-5915-41f4-b3e4-970baa204c14 /<serverip>
So can someone help me understand why, for topic2 partition 0, CURRENT-OFFSET shows - and LAG shows -, while messages are still there on the server (LOG-END-OFFSET shows 9)?
This is happening very frequently and restarting the consumers solves the issue temporarily.
Any help will be appreciated.

ZooKeeper node counter?

I have a ZooKeeper cluster with just 2 nodes; each zoo.cfg has the following:
# Servers
server.1=10.138.0.8:2888:3888
server.2=10.138.0.9:2888:3888
the same two lines are present in both configs
[root@zk1-prod supervisor.d]# echo mntr | nc 10.138.0.8 2181
zk_version 3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
zk_avg_latency 0
zk_max_latency 0
zk_min_latency 0
zk_packets_received 5
zk_packets_sent 4
zk_num_alive_connections 1
zk_outstanding_requests 0
zk_server_state follower
zk_znode_count 4
zk_watch_count 0
zk_ephemerals_count 0
zk_approximate_data_size 27
zk_open_file_descriptor_count 28
zk_max_file_descriptor_count 4096
[root@zk1-prod supervisor.d]# echo mntr | nc 10.138.0.9 2181
zk_version 3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
zk_avg_latency 0
zk_max_latency 0
zk_min_latency 0
zk_packets_received 3
zk_packets_sent 2
zk_num_alive_connections 1
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count 4
zk_watch_count 0
zk_ephemerals_count 0
zk_approximate_data_size 27
zk_open_file_descriptor_count 29
zk_max_file_descriptor_count 4096
zk_followers 1
zk_synced_followers 1
zk_pending_syncs 0
So why is zk_znode_count == 4?
Znodes are not ZooKeeper servers.
From the Hadoop Definitive Guide:
Zookeeper doesn’t have files and directories, but a unified concept of
a node, called a znode, which acts both as a container of data (like a
file) and a container of other znodes (like a directory).
zk_znode_count refers to the number of znodes stored on that ZooKeeper server. Since the data tree is replicated across the ensemble, each of your servers reports the same four znodes.
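
If you want to see exactly which znodes make up that count, you can walk the tree with the ZooKeeper Java client. Here is a small sketch (the connection string is a placeholder; a production program should also wait for the connection event before issuing requests):

import org.apache.zookeeper.ZooKeeper;
import java.util.List;

public class ZnodeWalker {
    public static void main(String[] args) throws Exception {
        // Placeholder connection string; a no-op watcher is enough for a one-off listing.
        ZooKeeper zk = new ZooKeeper("10.138.0.8:2181", 10000, event -> {});
        System.out.println("total znodes = " + count(zk, "/"));
        zk.close();
    }

    // Recursively count a znode and all of its children, printing each path.
    static int count(ZooKeeper zk, String path) throws Exception {
        System.out.println(path);
        int n = 1;
        List<String> children = zk.getChildren(path, false);
        for (String child : children) {
            n += count(zk, path.equals("/") ? "/" + child : path + "/" + child);
        }
        return n;
    }
}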