Related
I am using #RetryableTopic to implement retry logic in kafka consumer.
I gave config as below:
#RetryableTopic(
attempts = "4",
backoff = #Backoff(delay = 300000, multiplier = 10.0),
autoCreateTopics = "false",
topicSuffixingStrategy = SUFFIX_WITH_INDEX_VALUE
)
However, instead of retrying for 4 times, it retries infinitely, and that too in no delay time. Can someone please help me with the code?
I want the message to be retried 4 times with first time delay- after 5 mins, then 2nd delay after 10 mins, third delay after 20 mins...
Code is below:
int i = 1;
#RetryableTopic(
attempts = "4",
backoff = #Backoff(delay = 300000, multiplier = 10.0),
autoCreateTopics = "false",
topicSuffixingStrategy = SUFFIX_WITH_INDEX_VALUE
)
#KafkaListener(topics = "topic_string_data", containerFactory = "default")
public void consume(#Payload String message , #Header(KafkaHeaders.RECEIVED_TOPIC) String topic) {
String prachi = null;
System.out.println("current time: " + new Date());
System.out.println("retry method invoked -> " + i++ + " times from topic: " + topic);
System.out.println("current time: " + new Date());
prachi.equals("abc");
}
#DltHandler
public void listenDlt(String in, #Header(KafkaHeaders.RECEIVED_TOPIC) String topic,
#Header(KafkaHeaders.OFFSET) long offset) {
System.out.println("current time dlt: " + new Date());
System.out.println("DLT Received: " + in + " from " + topic + " offset " + offset + " -> " + i++ + " times");
System.out.println("current time dlt: " + new Date());
//dump event to dlt queue
}
Kafka config:
#Bean
public ConsumerFactory<String, String> consumerFactory() {
Map<String, Object> config = new HashMap<>();
config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
config.put(ConsumerConfig.GROUP_ID_CONFIG, "grp_STRING");
return new DefaultKafkaConsumerFactory<>(config);
//inject consumer factory to kafka listener consumer factory
}
#Bean(name = "default")
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory factory = new ConcurrentKafkaListenerContainerFactory();
factory.setConsumerFactory(consumerFactory());
return factory;
}
Logs when running app: this is not the complete logs:
32:9092 (id: 0 rack: null)], epoch=0}}
2022-02-23 13:58:40.186 INFO 96675 --- [4-retry-1-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : [Consumer clientId=consumer-grp_STRING-6, groupId=grp_STRING] Setting offset for partition topic_string_data-retry-1-0 to the committed offset FetchPosition{offset=1, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[197mumnb29632:9092 (id: 0 rack: null)], epoch=0}}
2022-02-23 13:58:40.187 INFO 96675 --- [ner#6-dlt-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : [Consumer clientId=consumer-grp_STRING-7, groupId=grp_STRING] Setting offset for partition topic_string_data-dlt-1 to the committed offset FetchPosition{offset=0, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[197mumnb29632:9092 (id: 0 rack: null)], epoch=0}}
2022-02-23 13:58:40.187 INFO 96675 --- [3-retry-0-0-C-1] o.s.k.l.KafkaMessageListenerContainer : grp_STRING: partitions assigned: [topic_string_data-retry-0-0]
2022-02-23 13:58:40.187 INFO 96675 --- [ntainer#2-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : [Consumer clientId=consumer-grp_STRING-2, groupId=grp_STRING] Setting offset for partition topic_string_data-1 to the committed offset FetchPosition{offset=27, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[197mumnb29632:9092 (id: 0 rack: null)], epoch=0}}
2022-02-23 13:58:40.187 INFO 96675 --- [4-retry-1-0-C-1] o.s.k.l.KafkaMessageListenerContainer : grp_STRING: partitions assigned: [topic_string_data-retry-1-0]
2022-02-23 13:58:40.187 INFO 96675 --- [5-retry-2-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : [Consumer clientId=consumer-grp_STRING-1, groupId=grp_STRING] Setting offset for partition topic_string_data-retry-2-0 to the committed offset FetchPosition{offset=1, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[197mumnb29632:9092 (id: 0 rack: null)], epoch=0}}
2022-02-23 13:58:40.188 INFO 96675 --- [ntainer#2-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : [Consumer clientId=consumer-grp_STRING-2, groupId=grp_STRING] Setting offset for partition topic_string_data-0 to the committed offset FetchPosition{offset=24, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[197mumnb29632:9092 (id: 0 rack: null)], epoch=0}}
2022-02-23 13:58:40.188 INFO 96675 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : [Consumer clientId=consumer-grp_STRING-4, groupId=grp_STRING] Setting offset for partition topic_string-1 to the committed offset FetchPosition{offset=0, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[197mumnb29632:9092 (id: 0 rack: null)], epoch=0}}
2022-02-23 13:58:40.188 INFO 96675 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : [Consumer clientId=consumer-grp_STRING-4, groupId=grp_STRING] Setting offset for partition topic_string-0 to the committed offset FetchPosition{offset=0, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[197mumnb29632:9092 (id: 0 rack: null)], epoch=0}}
2022-02-23 13:58:40.188 INFO 96675 --- [5-retry-2-0-C-1] o.s.k.l.KafkaMessageListenerContainer : grp_STRING: partitions assigned: [topic_string_data-retry-2-0]
2022-02-23 13:58:40.188 INFO 96675 --- [ntainer#2-0-C-1] o.s.k.l.KafkaMessageListenerContainer : grp_STRING: partitions assigned: [topic_string_data-1, topic_string_data-0]
2022-02-23 13:58:40.189 INFO 96675 --- [ntainer#0-0-C-1] o.s.k.l.KafkaMessageListenerContainer : grp_STRING: partitions assigned: [topic_string-1, topic_string-0]
2022-02-23 13:58:40.188 INFO 96675 --- [ner#6-dlt-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : [Consumer clientId=consumer-grp_STRING-7, groupId=grp_STRING] Setting offset for partition topic_string_data-dlt-2 to the committed offset FetchPosition{offset=0, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[197mumnb29632:9092 (id: 0 rack: null)], epoch=0}}
2022-02-23 13:58:40.190 INFO 96675 --- [ner#6-dlt-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : [Consumer clientId=consumer-grp_STRING-7, groupId=grp_STRING] Setting offset for partition topic_string_data-dlt-0 to the committed offset FetchPosition{offset=14, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[197mumnb29632:9092 (id: 0 rack: null)], epoch=0}}
2022-02-23 13:58:40.191 INFO 96675 --- [ner#6-dlt-0-C-1] o.s.k.l.KafkaMessageListenerContainer : grp_STRING: partitions assigned: [topic_string_data-dlt-0, topic_string_data-dlt-1, topic_string_data-dlt-2]
2022-02-23 13:58:40.196 INFO 96675 --- [4-retry-1-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-6, groupId=grp_STRING] Seeking to offset 1 for partition topic_string_data-retry-1-0
2022-02-23 13:58:40.196 WARN 96675 --- [4-retry-1-0-C-1] essageListenerContainer$ListenerConsumer : Seek to current after exception; nested exception is org.springframework.kafka.listener.ListenerExecutionFailedException: Listener failed; nested exception is org.springframework.kafka.listener.KafkaBackoffException: Partition 0 from topic topic_string_data-retry-1 is not ready for consumption, backing off for approx. 26346 millis.
current time: Wed Feb 23 13:58:40 IST 2022
retry method invoked -> 3 times from topic: topic_string_data-retry-0
current time: Wed Feb 23 13:58:40 IST 2022
2022-02-23 13:58:40.713 INFO 96675 --- [3-retry-0-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-3, groupId=grp_STRING] Seeking to offset 2 for partition topic_string_data-retry-0-0
current time: Wed Feb 23 13:58:40 IST 2022
retry method invoked -> 4 times from topic: topic_string_data-retry-0
current time: Wed Feb 23 13:58:40 IST 2022
2022-02-23 13:58:41.228 INFO 96675 --- [3-retry-0-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-3, groupId=grp_STRING] Seeking to offset 3 for partition topic_string_data-retry-0-0
current time: Wed Feb 23 13:58:41 IST 2022
retry method invoked -> 5 times from topic: topic_string_data-retry-0
current time: Wed Feb 23 13:58:41 IST 2022
2022-02-23 13:58:41.740 INFO 96675 --- [3-retry-0-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-3, groupId=grp_STRING] Seeking to offset 4 for partition topic_string_data-retry-0-0
current time: Wed Feb 23 13:58:41 IST 2022
retry method invoked -> 6 times from topic: topic_string_data-retry-0
current time: Wed Feb 23 13:58:41 IST 2022
2022-02-23 13:58:42.254 INFO 96675 --- [3-retry-0-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-3, groupId=grp_STRING] Seeking to offset 5 for partition topic_string_data-retry-0-0
current time: Wed Feb 23 13:58:42 IST 2022
retry method invoked -> 7 times from topic: topic_string_data-retry-0
current time: Wed Feb 23 13:58:42 IST 2022
2022-02-23 13:58:42.777 INFO 96675 --- [3-retry-0-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-3, groupId=grp_STRING] Seeking to offset 6 for partition topic_string_data-retry-0-0
current time: Wed Feb 23 13:58:42 IST 2022
retry method invoked -> 8 times from topic: topic_string_data-retry-0
current time: Wed Feb 23 13:58:42 IST 2022
2022-02-23 13:58:43.298 INFO 96675 --- [3-retry-0-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-3, groupId=grp_STRING] Seeking to offset 7 for partition topic_string_data-retry-0-0
current time: Wed Feb 23 13:58:43 IST 2022
retry method invoked -> 9 times from topic: topic_string_data-retry-0
current time: Wed Feb 23 13:58:43 IST 2022
2022-02-23 13:58:43.809 INFO 96675 --- [3-retry-0-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-3, groupId=grp_STRING] Seeking to offset 8 for partition topic_string_data-retry-0-0
current time: Wed Feb 23 13:58:43 IST 2022
retry method invoked -> 10 times from topic: topic_string_data-retry-0
current time: Wed Feb 23 13:58:43 IST 2022
current time: Wed Feb 23 13:59:10 IST 2022
retry method invoked -> 11 times from topic: topic_string_data-retry-1
current time: Wed Feb 23 13:59:10 IST 2022
2022-02-23 13:59:10.733 INFO 96675 --- [4-retry-1-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-6, groupId=grp_STRING] Seeking to offset 2 for partition topic_string_data-retry-1-0
2022-02-23 13:59:10.736 INFO 96675 --- [5-retry-2-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-1, groupId=grp_STRING] Seeking to offset 1 for partition topic_string_data-retry-2-0
2022-02-23 13:59:10.737 WARN 96675 --- [5-retry-2-0-C-1] essageListenerContainer$ListenerConsumer : Seek to current after exception; nested exception is org.springframework.kafka.listener.ListenerExecutionFailedException: Listener failed; nested exception is org.springframework.kafka.listener.KafkaBackoffException: Partition 0 from topic topic_string_data-retry-2 is not ready for consumption, backing off for approx. 29483 millis.
current time: Wed Feb 23 13:59:10 IST 2022
retry method invoked -> 12 times from topic: topic_string_data-retry-1
current time: Wed Feb 23 13:59:10 IST 2022
2022-02-23 13:59:11.249 INFO 96675 --- [4-retry-1-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-6, groupId=grp_STRING] Seeking to offset 3 for partition topic_string_data-retry-1-0
current time: Wed Feb 23 13:59:11 IST 2022
retry method invoked -> 13 times from topic: topic_string_data-retry-1
current time: Wed Feb 23 13:59:11 IST 2022
2022-02-23 13:59:11.769 INFO 96675 --- [4-retry-1-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-6, groupId=grp_STRING] Seeking to offset 4 for partition topic_string_data-retry-1-0
current time: Wed Feb 23 13:59:11 IST 2022
retry method invoked -> 14 times from topic: topic_string_data-retry-1
current time: Wed Feb 23 13:59:11 IST 2022
2022-02-23 13:59:12.286 INFO 96675 --- [4-retry-1-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-6, groupId=grp_STRING] Seeking to offset 5 for partition topic_string_data-retry-1-0
current time: Wed Feb 23 13:59:12 IST 2022
retry method invoked -> 15 times from topic: topic_string_data-retry-1
current time: Wed Feb 23 13:59:12 IST 2022
2022-02-23 13:59:12.805 INFO 96675 --- [4-retry-1-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-6, groupId=grp_STRING] Seeking to offset 6 for partition topic_string_data-retry-1-0
current time: Wed Feb 23 13:59:12 IST 2022
retry method invoked -> 16 times from topic: topic_string_data-retry-1
current time: Wed Feb 23 13:59:12 IST 2022
2022-02-23 13:59:13.339 INFO 96675 --- [4-retry-1-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-6, groupId=grp_STRING] Seeking to offset 7 for partition topic_string_data-retry-1-0
current time: Wed Feb 23 13:59:13 IST 2022
retry method invoked -> 17 times from topic: topic_string_data-retry-1
current time: Wed Feb 23 13:59:13 IST 2022
2022-02-23 13:59:13.856 INFO 96675 --- [4-retry-1-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-6, groupId=grp_STRING] Seeking to offset 8 for partition topic_string_data-retry-1-0
current time: Wed Feb 23 13:59:13 IST 2022
retry method invoked -> 18 times from topic: topic_string_data-retry-1
current time: Wed Feb 23 13:59:13 IST 2022
2022-02-23 13:59:14.372 INFO 96675 --- [4-retry-1-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-6, groupId=grp_STRING] Seeking to offset 9 for partition topic_string_data-retry-1-0
current time: Wed Feb 23 13:59:14 IST 2022
retry method invoked -> 19 times from topic: topic_string_data-retry-1
current time: Wed Feb 23 13:59:14 IST 2022
current time: Wed Feb 23 13:59:40 IST 2022
retry method invoked -> 20 times from topic: topic_string_data-retry-2
current time: Wed Feb 23 13:59:40 IST 2022
2022-02-23 13:59:40.846 INFO 96675 --- [5-retry-2-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-1, groupId=grp_STRING] Seeking to offset 2 for partition topic_string_data-retry-2-0
current time dlt: Wed Feb 23 13:59:40 IST 2022
DLT Received: prachi from topic_string_data-dlt offset 14 -> 21 times
current time dlt: Wed Feb 23 13:59:46 IST 2022
current time: Wed Feb 23 13:59:46 IST 2022
retry method invoked -> 22 times from topic: topic_string_data-retry-2
current time: Wed Feb 23 13:59:46 IST 2022
2022-02-23 13:59:47.466 INFO 96675 --- [5-retry-2-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-1, groupId=grp_STRING] Seeking to offset 3 for partition topic_string_data-retry-2-0
current time dlt: Wed Feb 23 13:59:47 IST 2022
DLT Received: prachi from topic_string_data-dlt offset 15 -> 23 times
current time dlt: Wed Feb 23 13:59:47 IST 2022
current time: Wed Feb 23 13:59:47 IST 2022
retry method invoked -> 24 times from topic: topic_string_data-retry-2
current time: Wed Feb 23 13:59:47 IST 2022
2022-02-23 13:59:47.981 INFO 96675 --- [5-retry-2-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-1, groupId=grp_STRING] Seeking to offset 4 for partition topic_string_data-retry-2-0
current time dlt: Wed Feb 23 13:59:47 IST 2022
DLT Received: prachisharma from topic_string_data-dlt offset 16 -> 25 times
current time dlt: Wed Feb 23 13:59:47 IST 2022
current time: Wed Feb 23 13:59:47 IST 2022
retry method invoked -> 26 times from topic: topic_string_data-retry-2
current time: Wed Feb 23 13:59:47 IST 2022
2022-02-23 13:59:48.493 INFO 96675 --- [5-retry-2-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-1, groupId=grp_STRING] Seeking to offset 5 for partition topic_string_data-retry-2-0
current time dlt: Wed Feb 23 13:59:48 IST 2022
DLT Received: hie from topic_string_data-dlt offset 17 -> 27 times
current time dlt: Wed Feb 23 13:59:48 IST 2022
current time: Wed Feb 23 13:59:48 IST 2022
retry method invoked -> 28 times from topic: topic_string_data-retry-2
current time: Wed Feb 23 13:59:48 IST 2022
2022-02-23 13:59:49.011 INFO 96675 --- [5-retry-2-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-1, groupId=grp_STRING] Seeking to offset 6 for partition topic_string_data-retry-2-0
current time dlt: Wed Feb 23 13:59:49 IST 2022
DLT Received: hieeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee from topic_string_data-dlt offset 18 -> 29 times
current time dlt: Wed Feb 23 13:59:49 IST 2022
current time: Wed Feb 23 13:59:49 IST 2022
retry method invoked -> 30 times from topic: topic_string_data-retry-2
current time: Wed Feb 23 13:59:49 IST 2022
2022-02-23 13:59:49.527 INFO 96675 --- [5-retry-2-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-1, groupId=grp_STRING] Seeking to offset 7 for partition topic_string_data-retry-2-0
current time dlt: Wed Feb 23 13:59:49 IST 2022
DLT Received: hie from topic_string_data-dlt offset 19 -> 31 times
current time dlt: Wed Feb 23 13:59:49 IST 2022
current time: Wed Feb 23 13:59:49 IST 2022
retry method invoked -> 32 times from topic: topic_string_data-retry-2
current time: Wed Feb 23 13:59:49 IST 2022
current time dlt: Wed Feb 23 13:59:50 IST 2022
DLT Received: hi from topic_string_data-dlt offset 20 -> 33 times
current time dlt: Wed Feb 23 13:59:50 IST 2022
2022-02-23 13:59:50.039 INFO 96675 --- [5-retry-2-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-1, groupId=grp_STRING] Seeking to offset 8 for partition topic_string_data-retry-2-0
current time: Wed Feb 23 13:59:50 IST 2022
retry method invoked -> 34 times from topic: topic_string_data-retry-2
current time: Wed Feb 23 13:59:50 IST 2022
2022-02-23 13:59:50.545 INFO 96675 --- [5-retry-2-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-grp_STRING-1, groupId=grp_STRING] Seeking to offset 9 for partition topic_string_data-retry-2-0
current time dlt: Wed Feb 23 13:59:50 IST 2022
DLT Received: hi from topic_string_data-dlt offset 21 -> 35 times
current time dlt: Wed Feb 23 13:59:50 IST 2022
current time: Wed Feb 23 13:59:50 IST 2022
retry method invoked -> 36 times from topic: topic_string_data-retry-2
current time: Wed Feb 23 13:59:50 IST 2022
current time dlt: Wed Feb 23 13:59:51 IST 2022
DLT Received: hi from topic_string_data-dlt offset 22 -> 37 times
current time dlt: Wed Feb 23 13:59:51 IST 2022
It seems there are two separate problems.
One is that you seem to already have records in the topics, and if you have it configured to earliest the app will read all those records when it starts up. You can either set ConsumerConfig.AUTO_OFFSET_RESET_CONFIG to latest, or, if you're running locally on docker, you can stop the Kafka container and prune the volumes with something like docker system prune --volumes (note that this will erase data from all your stopped containers - use wisely).
Can you try one of these and test again?
The other problem is that the framework is wrongly setting the default maxDelay of 30s even though the annotation states the default is to ignore. I'll open an issue for that and add the link here.
For now You can set a maxDelay such as #Backoff(delay = 600000, multiplier = 3.0, maxDelay = 5400000), then the application should have the correct delays for 10, 30 and 90 minutes as you wanted.
Let me know if that works out for you, or if you have any other problems related to this issue.
EDIT: Issue opened, you can follow the development there https://github.com/spring-projects/spring-kafka/issues/2137
It should be fixed in the next release.
EDIT 2: Actually the phrasing in the #BackOff annotation is rather ambiguous, but seems like the behavior is correct and you should explicitly set a larger maxDelay.
The documentation should clarify this behavior in the next release.
EDIT 3: To answer your question in the comments, the way retryable topics work is the partition is paused for the duration of the delay, but the consumer keeps polling the broker, so longer delays don't trigger rebalancing.
From your logs the rebalancing is from the main topic's partitions, so it's unlikely it has anything to do with this feature.
EDIT 4: The Retryable Topics feature was released in Spring for Apache Kafka 2.7.0, which uses kafka-clients 2.7.0. However, there have been several improvements to the feature, so I recommend using the latest Spring Kafka version (currently 2.8.3) if possible to benefit from those.
I have a Spring Cloud Stream Kafka Stream application that reads from a topic (event) and performs a simple processing:
#Configuration
class EventKStreamConfiguration {
private val logger = LoggerFactory.getLogger(javaClass)
#StreamListener
fun process(#Input("event") eventStream: KStream<String, EventReceived>) {
eventStream.foreach { key, value ->
logger.info("--------> Processing Event {}", value)
// Save in DB
}
}
}
This application is using a Kafka environment from Confluent Cloud, with an event topic with 6 partitions. The full configuration is:
spring:
application:
name: events-processor
cloud:
stream:
schema-registry-client:
endpoint: ${schema-registry-url:http://localhost:8081}
kafka:
streams:
binder:
brokers: ${kafka-brokers:localhost}
configuration:
application:
id: ${spring.application.name}
default:
key:
serde: org.apache.kafka.common.serialization.Serdes$StringSerde
schema:
registry:
url: ${spring.cloud.stream.schema-registry-client.endpoint}
value:
subject:
name:
strategy: io.confluent.kafka.serializers.subject.RecordNameStrategy
processing:
guarantee: exactly_once
bindings:
event:
consumer:
valueSerde: io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde
bindings:
event:
destination: event
data:
mongodb:
uri: ${mongodb-uri:mongodb://localhost/test}
server:
port: 8085
logging:
level:
org.springframework.kafka.config: debug
---
spring:
profiles: confluent-cloud
cloud:
stream:
kafka:
streams:
binder:
autoCreateTopics: false
configuration:
retry:
backoff:
ms: 500
security:
protocol: SASL_SSL
sasl:
mechanism: PLAIN
jaas:
config: xxx
basic:
auth:
credentials:
source: USER_INFO
schema:
registry:
basic:
auth:
user:
info: yyy
Messages are being correctly processed by the KStream. If I restart the application they are not reprocessed. Note: I don’t want them to be reprocessed, so this behaviour is ok.
However the startup logs show some strange bits:
First it displays the creation of a restore consumer client. with auto offset reset none:
2019-07-19 10:20:17.120 INFO 82473 --- [ main] o.a.k.s.p.internals.StreamThread : stream-thread [events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f-StreamThread-1] Creating restore consumer client
2019-07-19 10:20:17.123 INFO 82473 --- [ main] o.a.k.clients.consumer.ConsumerConfig : ConsumerConfig values:
auto.commit.interval.ms = 5000
auto.offset.reset = none
Then it creates a consumer client with auto offset reset earliest.
2019-07-19 10:20:17.235 INFO 82473 --- [ main] o.a.k.s.p.internals.StreamThread : stream-thread [events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f-StreamThread-1] Creating consumer client
2019-07-19 10:20:17.241 INFO 82473 --- [ main] o.a.k.clients.consumer.ConsumerConfig : ConsumerConfig values:
auto.commit.interval.ms = 5000
auto.offset.reset = earliest
The final traces of the startup log show an offset reset to 0. This happens on every restart of the application:
2019-07-19 10:20:31.577 INFO 82473 --- [-StreamThread-1] o.a.k.s.p.internals.StreamThread : stream-thread [events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f-StreamThread-1] State transition from PARTITIONS_ASSIGNED to RUNNING
2019-07-19 10:20:31.578 INFO 82473 --- [-StreamThread-1] org.apache.kafka.streams.KafkaStreams : stream-client [events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f] State transition from REBALANCING to RUNNING
2019-07-19 10:20:31.669 INFO 82473 --- [events-processor] o.a.k.c.consumer.internals.Fetcher : [Consumer clientId=events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f-StreamThread-1-consumer, groupId=events-processor] Resetting offset for partition event-3 to offset 0.
2019-07-19 10:20:31.669 INFO 82473 --- [events-processor] o.a.k.c.consumer.internals.Fetcher : [Consumer clientId=events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f-StreamThread-1-consumer, groupId=events-processor] Resetting offset for partition event-0 to offset 0.
2019-07-19 10:20:31.669 INFO 82473 --- [events-processor] o.a.k.c.consumer.internals.Fetcher : [Consumer clientId=events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f-StreamThread-1-consumer, groupId=events-processor] Resetting offset for partition event-1 to offset 0.
2019-07-19 10:20:31.669 INFO 82473 --- [events-processor] o.a.k.c.consumer.internals.Fetcher : [Consumer clientId=events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f-StreamThread-1-consumer, groupId=events-processor] Resetting offset for partition event-5 to offset 0.
2019-07-19 10:20:31.670 INFO 82473 --- [events-processor] o.a.k.c.consumer.internals.Fetcher : [Consumer clientId=events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f-StreamThread-1-consumer, groupId=events-processor] Resetting offset for partition event-4 to offset 0.
What's the reason why there are two consumers configured?
Why does the second one have auto.offset.reset = earliest when I haven't configured it explicitly and the Kafka default is latest?
I want the default (auto.offset.reset = latest) behaviour and it seems to be working fine. However, doesn't it contradict what I see in the logs?
UPDATE:
I would rephrase the third question like this: Why do the logs show that the partitions are being reseted to 0 on every restart and in spite of it no messages are redelivered to the KStream?
UPDATE 2:
I've simplified the scenario, this time with a native Kafka Streams application. The behaviour is exactly the same as observed with Spring Cloud Stream. However, inspecting the consumer-group and the partitions I've found it kind of makes sense.
KStream:
fun main() {
val props = Properties()
props[StreamsConfig.APPLICATION_ID_CONFIG] = "streams-wordcount"
props[StreamsConfig.BOOTSTRAP_SERVERS_CONFIG] = "localhost:9092"
props[StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG] = 0
props[StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG] = Serdes.String().javaClass.name
props[StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG] = Serdes.String().javaClass.name
val builder = StreamsBuilder()
val source = builder.stream<String, String>("streams-plaintext-input")
source.foreach { key, value -> println("$key $value") }
val streams = KafkaStreams(builder.build(), props)
val latch = CountDownLatch(1)
// attach shutdown handler to catch control-c
Runtime.getRuntime().addShutdownHook(object : Thread("streams-wordcount-shutdown-hook") {
override fun run() {
streams.close()
latch.countDown()
}
})
try {
streams.start()
latch.await()
} catch (e: Throwable) {
exitProcess(1)
}
exitProcess(0)
}
This is what I've seen:
1) With an empty topic, the startup shows a resetting of all partitions to offset 0:
07:55:03.885 [streams-wordcount-3549a54e-49db-4490-bd9f-7156e972021a-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=streams-wordcount-3549a54e-49db-4490-bd9f-7156e972021a-StreamThread-1-consumer, groupId=streams-wordcount] Resetting offset for partition streams-plaintext-input-2 to offset 0.
07:55:03.886 [streams-wordcount-3549a54e-49db-4490-bd9f-7156e972021a-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=streams-wordcount-3549a54e-49db-4490-bd9f-7156e972021a-StreamThread-1-consumer, groupId=streams-wordcount] Resetting offset for partition streams-plaintext-input-3 to offset 0.
07:55:03.886 [streams-wordcount-3549a54e-49db-4490-bd9f-7156e972021a-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=streams-wordcount-3549a54e-49db-4490-bd9f-7156e972021a-StreamThread-1-consumer, groupId=streams-wordcount] Resetting offset for partition streams-plaintext-input-0 to offset 0.
07:55:03.886 [streams-wordcount-3549a54e-49db-4490-bd9f-7156e972021a-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=streams-wordcount-3549a54e-49db-4490-bd9f-7156e972021a-StreamThread-1-consumer, groupId=streams-wordcount] Resetting offset for partition streams-plaintext-input-1 to offset 0.
07:55:03.886 [streams-wordcount-3549a54e-49db-4490-bd9f-7156e972021a-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=streams-wordcount-3549a54e-49db-4490-bd9f-7156e972021a-StreamThread-1-consumer, groupId=streams-wordcount] Resetting offset for partition streams-plaintext-input-4 to offset 0.
07:55:03.886 [streams-wordcount-3549a54e-49db-4490-bd9f-7156e972021a-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=streams-wordcount-3549a54e-49db-4490-bd9f-7156e972021a-StreamThread-1-consumer, groupId=streams-wordcount] Resetting offset for partition streams-plaintext-input-5 to offset 0
2) I put one message in the topic and inspect the consumer group, seeing that the record is in partition 4:
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
streams-plaintext-input 0 - 0 - streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer-905a307a-4c49-4d8b-ac2e-5525ba2e8a8e /127.0.0.1 streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer
streams-plaintext-input 5 - 0 - streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer-905a307a-4c49-4d8b-ac2e-5525ba2e8a8e /127.0.0.1 streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer
streams-plaintext-input 1 - 0 - streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer-905a307a-4c49-4d8b-ac2e-5525ba2e8a8e /127.0.0.1 streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer
streams-plaintext-input 2 - 0 - streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer-905a307a-4c49-4d8b-ac2e-5525ba2e8a8e /127.0.0.1 streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer
streams-plaintext-input 3 - 0 - streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer-905a307a-4c49-4d8b-ac2e-5525ba2e8a8e /127.0.0.1 streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer
streams-plaintext-input 4 1 1 0 streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer-905a307a-4c49-4d8b-ac2e-5525ba2e8a8e /127.0.0.1 streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer
3) I restart the application. Now the resetting only affects the empty partitions (0, 1, 2, 3, 5):
07:57:39.477 [streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer, groupId=streams-wordcount] Resetting offset for partition streams-plaintext-input-2 to offset 0.
07:57:39.478 [streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer, groupId=streams-wordcount] Resetting offset for partition streams-plaintext-input-3 to offset 0.
07:57:39.478 [streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer, groupId=streams-wordcount] Resetting offset for partition streams-plaintext-input-0 to offset 0.
07:57:39.479 [streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer, groupId=streams-wordcount] Resetting offset for partition streams-plaintext-input-1 to offset 0.
07:57:39.479 [streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=streams-wordcount-b1565eca-7d80-4550-97d2-e78ead62a840-StreamThread-1-consumer, groupId=streams-wordcount] Resetting offset for partition streams-plaintext-input-5 to offset 0.
4) I insert another message, inspect the consumer group state and the same thing happens: the record is in partition 2 and when restarting the application it only resets the empty partitions (0, 1, 3, 5):
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
streams-plaintext-input 0 - 0 - streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1-consumer-cb04e2bd-598f-455f-b913-1370b4144dd6 /127.0.0.1 streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1-consumer
streams-plaintext-input 5 - 0 - streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1-consumer-cb04e2bd-598f-455f-b913-1370b4144dd6 /127.0.0.1 streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1-consumer
streams-plaintext-input 1 - 0 - streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1-consumer-cb04e2bd-598f-455f-b913-1370b4144dd6 /127.0.0.1 streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1-consumer
streams-plaintext-input 2 1 1 0 streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1-consumer-cb04e2bd-598f-455f-b913-1370b4144dd6 /127.0.0.1 streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1-consumer
streams-plaintext-input 3 - 0 - streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1-consumer-cb04e2bd-598f-455f-b913-1370b4144dd6 /127.0.0.1 streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1-consumer
streams-plaintext-input 4 1 1 0 streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1-consumer-cb04e2bd-598f-455f-b913-1370b4144dd6 /127.0.0.1 streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1-consumer
08:00:42.313 [streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1-consumer, groupId=streams-wordcount] Resetting offset for partition streams-plaintext-input-3 to offset 0.
08:00:42.314 [streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1-consumer, groupId=streams-wordcount] Resetting offset for partition streams-plaintext-input-0 to offset 0.
08:00:42.314 [streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1-consumer, groupId=streams-wordcount] Resetting offset for partition streams-plaintext-input-1 to offset 0.
08:00:42.314 [streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=streams-wordcount-addb08ed-62ce-47f9-a446-f2ee0592c53d-StreamThread-1-consumer, groupId=streams-wordcount] Resetting offset for partition streams-plaintext-input-5 to offset 0.
What's the reason why there are two consumers configured?
Restore Consumer Client is a dedicated consumer for fault tolerance and state management. It is the responsible for restoring the state from the changelog topics. It is displayed seperately from the application consumer client. You can find more information here :
https://docs.confluent.io/current/streams/monitoring.html#kafka-restore-consumer-client-id
Why does the second one have auto.offset.reset = earliest when I haven't configured it explicitly and the Kafka default is latest?
You are right, auto.offset.reset default value is latest in Kafka Consumer. But in Spring Cloud Stream, default value for consumer startOffset is earliest. Hence it shows earliest in second consumer. Also it depends on spring.cloud.stream.bindings.<channelName>.group binding. If it is set explicitly, then startOffset is set to earliest, otherwise it is set to latest for anonymous consumer.
Reference : Spring Cloud Stream Kafka Consumer Properties
I want the default (auto.offset.reset = latest) behaviour and it
seems to be working fine. However, doesn't it contradict what I see in
the logs?
In case of anonymous consumer group, the default value for startOffset will be latest.
When I run my Java application and instantiate the KafkaConsumer object (fed with the minimum required properties: key and value deserializer and group_id); I see lots of INFO messages on the StdOut (If I provide unsupported properties, I also see WARNING messages).
I want to see when fetch events take place. I assume that by increasing the loglevel to DEBUG I will be able to see that. Unfortunately, I am not able to increase it.
I tried to feed the log4j.properties file in multiple ways (placing the file at specific paths and also passing it as parameter (-Dlog4j.configuration). The output remains the same.
cd /Users/user/git/kafka/toys; JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_191.jdk/Contents/Home "/Applications/NetBeans/NetBeans 8.2.app/Contents/Resources/NetBeans/java/maven/bin/mvn" "-Dexec.args=-classpath %classpath ch.demo.toys.CarthusianConsumer" -Dexec.executable=/Library/Java/JavaVirtualMachines/jdk1.8.0_191.jdk/Contents/Home/bin/java -Dexec.classpathScope=runtime -DskipTests=true org.codehaus.mojo:exec-maven-plugin:1.2.1:exec
Running NetBeans Compile On Save execution. Phase execution is skipped and output directories of dependency projects (with Compile on Save turned on) will be used instead of their jar artifacts.
Scanning for projects...
------------------------------------------------------------------------
Building toys 1.0-SNAPSHOT
------------------------------------------------------------------------
--- exec-maven-plugin:1.2.1:exec (default-cli) # toys ---
Jul 10, 2019 2:52:00 PM org.apache.kafka.common.config.AbstractConfig logAll
INFO: ConsumerConfig values:
allow.auto.create.topics = true
auto.commit.interval.ms = 5000
auto.offset.reset = earliest
bootstrap.servers = [kafka-server:9090, kafka-server:9091, kafka-server:9092]
check.crcs = true
client.dns.lookup = default
client.id =
client.rack =
connections.max.idle.ms = 540000
default.api.timeout.ms = 60000
enable.auto.commit = true
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = carthusian-consumer
group.instance.id = null
heartbeat.interval.ms = 3000
interceptor.classes = []
internal.leave.group.on.close = true
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.IntegerDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 300000
max.poll.records = 100000
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = DEBUG
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = https
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
Jul 10, 2019 2:52:01 PM org.apache.kafka.common.utils.AppInfoParser$AppInfo <init>
INFO: Kafka version: 2.3.0
Jul 10, 2019 2:52:01 PM org.apache.kafka.common.utils.AppInfoParser$AppInfo <init>
INFO: Kafka commitId: fc1aaa116b661c8a
Jul 10, 2019 2:52:01 PM org.apache.kafka.common.utils.AppInfoParser$AppInfo <init>
INFO: Kafka startTimeMs: 1562763121219
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.KafkaConsumer subscribe
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Subscribed to topic(s): sequence
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.Metadata update
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Cluster ID: REIXp5FySKGPHlRyfTALLQ
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.AbstractCoordinator$FindCoordinatorResponseHandler onSuccess
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Discovered group coordinator kafka-tds:9091 (id: 2147483646 rack: null)
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.ConsumerCoordinator onJoinPrepare
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Revoking previously assigned partitions []
Revoke event: []
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.AbstractCoordinator sendJoinGroupRequest
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] (Re-)joining group
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.AbstractCoordinator sendJoinGroupRequest
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] (Re-)joining group
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.AbstractCoordinator$1 onSuccess
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Successfully joined group with generation 96
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.ConsumerCoordinator onJoinComplete
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Setting newly assigned partitions: sequence-1, sequence-0
Assignment event: [sequence-1, sequence-0]
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.SubscriptionState lambda$requestOffsetReset$3
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Seeking to EARLIEST offset of partition sequence-1
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.SubscriptionState lambda$requestOffsetReset$3
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Seeking to EARLIEST offset of partition sequence-0
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.SubscriptionState maybeSeekUnvalidated
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Resetting offset for partition sequence-0 to offset 0.
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.SubscriptionState maybeSeekUnvalidated
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Resetting offset for partition sequence-1 to offset 0.
Loaded 9804 records from [sequence-0] partitions
Loaded 9804 records from [sequence-1] partitions
Loaded 9799 records from [sequence-0] partitions
Loaded 9799 records from [sequence-1] partitions
Loaded 9799 records from [sequence-0] partitions
Loaded 9799 records from [sequence-1] partitions
Loaded 9799 records from [sequence-0] partitions
Loaded 9799 records from [sequence-1] partitions
Loaded 9799 records from [sequence-0] partitions
Solved by placing the following (simple) log4j.properties under src/main/resources and running the app straight from console (rather than from the IDE). Fetching messages are now shown.
# Root logger option
log4j.rootLogger=DEBUG, stdout
# Direct log messages to stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1} - %m%n
At the moment I do not know which class is generating the messages I am looking for, hence the DEBUG setting on the rootLogger.
I execute shell script kafka-topics.sh to alter partition count from 3 to 10.
./bin/kafka-topics.sh --zookeeper xxx.xx.xxx.xx:2181/my-broker-0 --alter --topic target-topic-example-1 --partitions 10
As soon as I execute the command, Kafka cluster occurred errors.
[2018-11-20 14:45:29,954] INFO Created log for partition [target-topic-example-1,5] in /data/kafka/brokers with properties {flush.messages -> 9223372036854775807, message.timestamp.type -> CreateTime, segment.bytes -> 536870912, preallocate -> false, cleanup.policy -> [delete], delete.retention.ms -> 86400000, segment.ms -> 604800000, min.insync.replicas -> 1, file.delete.delay.ms -> 60000, retention.ms -> 259200000, max.message.bytes -> 10000000, message.format.version -> 0.10.1-IV2, index.interval.bytes -> 4096, segment.index.bytes -> 10485760, retention.bytes -> -1, segment.jitter.ms -> 0, min.cleanable.dirty.ratio -> 0.5, compression.type -> producer, unclean.leader.election.enable -> true, message.timestamp.difference.max.ms -> 9223372036854775807, min.compaction.lag.ms -> 0, flush.ms -> 9223372036854775807}. (kafka.log.LogManager)
[2018-11-20 14:45:29,955] INFO Partition [target-topic-example-1,5] on broker 8: No checkpointed highwatermark is found for partition [target-topic-example-1,5] (kafka.cluster.Partition)
[2018-11-20 14:45:29,957] INFO Completed load of log target-topic-example-1-4 with 1 log segments and log end offset 0 in 1 ms (kafka.log.Log)
[2018-11-20 14:45:29,957] INFO Created log for partition [target-topic-example-1,4] in /data/kafka/brokers with properties {flush.messages -> 9223372036854775807, message.timestamp.type -> CreateTime, segment.bytes -> 536870912, preallocate -> false, cleanup.policy -> [delete], delete.retention.ms -> 86400000, segment.ms -> 604800000, min.insync.replicas -> 1, file.delete.delay.ms -> 60000, retention.ms -> 259200000, max.message.bytes -> 10000000, message.format.version -> 0.10.1-IV2, index.interval.bytes -> 4096, segment.index.bytes -> 10485760, retention.bytes -> -1, segment.jitter.ms -> 0, min.cleanable.dirty.ratio -> 0.5, compression.type -> producer, unclean.leader.election.enable -> true, message.timestamp.difference.max.ms -> 9223372036854775807, min.compaction.lag.ms -> 0, flush.ms -> 9223372036854775807}. (kafka.log.LogManager)
[2018-11-20 14:45:29,958] INFO Partition [target-topic-example-1,4] on broker 8: No checkpointed highwatermark is found for partition [target-topic-example-1,4] (kafka.cluster.Partition)
[2018-11-20 14:45:29,958] INFO [ReplicaFetcherManager on broker 8] Removed fetcher for partitions target-topic-example-1-5,target-topic-example-1-4 (kafka.server.ReplicaFetcherManager)
[2018-11-20 14:45:29,958] INFO Truncating log target-topic-example-1-5 to offset 0. (kafka.log.Log)
[2018-11-20 14:45:29,958] INFO Truncating log target-topic-example-1-4 to offset 0. (kafka.log.Log)
[2018-11-20 14:45:29,961] INFO [ReplicaFetcherManager on broker 8] Added fetcher for partitions List([target-topic-example-1-5, initOffset 0 to broker BrokerEndPoint(1,My-broker02-172.cm.skp,9092)] , [target-topic-example-1-4, initOffset 0 to broker BrokerEndPoint(0,My- broker01-172.cm.skp,9092)] ) (kafka.server.ReplicaFetcherManager)
[2018-11-20 14:45:30,062] ERROR [ReplicaFetcherThread-0-1], Error for partition [other-topic-1,3] to broker 1:org.apache.kafka.common.errors.UnknownServerException: The server experienced an unexpected error when processing the request (kafka.server.ReplicaFetcherThread)
[2018-11-20 14:45:30,070] ERROR [ReplicaFetcherThread-0-1], Error for partition [other-topic-2,0] to broker 1:org.apache.kafka.common.errors.UnknownServerException: The server experienced an unexpected error when processing the request (kafka.server.ReplicaFetcherThread)
[2018-11-20 14:45:30,070] ERROR [ReplicaFetcherThread-0-1], Error for partition [other-topic-3,3] to broker 1:org.apache.kafka.common.errors.UnknownServerException: The server experienced an unexpected error when processing the request (kafka.server.ReplicaFetcherThread)
[2018-11-20 14:45:30,070] ERROR [ReplicaFetcherThread-0-1], Error for partition [other-topic-4,0] to broker 1:org.apache.kafka.common.errors.UnknownServerException: The server experienced an unexpected error when processing the request (kafka.server.ReplicaFetcherThread)
My command to change the number of partitions worked well as a result.
However, I do not know what the above errors are like.
(For reference, there are a total of 10 brokers, and more than 500 topics.)
Just altering the number of partitions isn't enough, you need to re-balance the cluster. The newer partitions are only active when the consumers are informed about these new partitions.
Please refer to my answer for process of adding partitions: https://stackoverflow.com/a/58293572/8702415
I would also recommend to conduct preferred replica election. ./bin/kafka-preferred-replica-election.sh --zookeeper localhost:9092
The command:
./bin/kafka-consumer-perf-test.sh --topic test_topic --messages 500000 --zookeeper zookeepernode --show-detailed-stats --consumer.config ./conf/connect-distributed.properties
leads to
{metadata.broker.list=kafkanode1:9093,kafkanode2:9093,av3l338p.kafkanode3:9093, request.timeout.ms=30000, client.id=perf-consumer-68473, security.protocol=SASL_SSL}
WARN Fetching topic metadata with correlation id 0 for topics [Set(test_topic)] from broker [BrokerEndPoint(1003,kafkanode3,9093)] failed (kafka.client.ClientUtils$)
java.io.EOFException
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:99)
[...]
WARN Fetching topic metadata with correlation id 0 for topics [Set(test_topic)] from broker [BrokerEndPoint(1001,kafkanode1,9093)] failed (kafka.client.ClientUtils$)
java.nio.channels.ClosedByInterruptException
at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
[...]
WARN Fetching topic metadata with correlation id 0 for topics [Set(test_topic)] from broker [BrokerEndPoint(1002,kafkanode2,9093)] failed (kafka.client.ClientUtils$)
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:122)
Try to add the flag "--new-consumer":
./bin/kafka-consumer-perf-test.sh --new-consumer --broker-list <broker-host>:9093 --topic <topic-read-from> --messages <number-of-messages> --show-detailed-stats --consumer.config consumer.properties --zookeeper <zookeeper-host>:2181
The consumer.properties looks like (including SASL_SSL and Kerberos configuration):
zookeeper.connect=<zookeeper-host>:2181
# timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000
#consumer group id
group.id=<consumer-group>
security.protocol=SASL_SSL
ssl.keystore.location=<path-to-keystore-file>
ssl.keystore.password=<password>
ssl.truststore.location=<path-to-truststore-file>
ssl.truststore.password=<password>
The output should look like:
time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec
2017-06-01 11:29:15:194, 0, 1,6726, 4,1816, 5000, 12500,0000
2017-06-01 11:29:15:207, 0, 3,3916, 859,4761, 10000, 2500000,0000
2017-06-01 11:29:15:208, 0, 5,1009, 1709,3201, 15000, 5000000,0000
2017-06-01 11:29:15:209, 0, 6,7850, 1684,1164, 20000, 5000000,0000
2017-06-01 11:29:15:291, 0, 8,5101, 21,2977, 25000, 61728,3951
2017-06-01 11:29:15:460, 0, 10,2138, 10,0808, 30000, 29585,7988
2017-06-01 11:29:15:462, 0, 11,9258, 1711,9637, 35000, 5000000,0000
2017-06-01 11:29:15:463, 0, 13,6348, 1709,0416, 40000, 5000000,0000
2017-06-01 11:29:15:673, 0, 15,3255, 8,0511, 45000, 23809,5238
2017-06-01 11:29:15:962, 0, 17,0417, 5,9384, 50000, 17301,0381
2017-06-01 11:29:15:963, 0, 18,7520, 1710,2280, 55000, 5000000,0000
2017-06-01 11:29:15:963, 0, 20,4568, Infinity, 60000, Infinity
2017-06-01 11:29:16:090, 0, 22,1223, 13,1138, 65000, 39370,0787
2017-06-01 11:29:16:282, 0, 23,8131, 8,8062, 70000, 26041,6667