Spark Streaming from Kafka topic throws offset out of range with no option to restart the stream - scala

I have a streaming job running on Spark 2.1.1, polling Kafka 0.10. I am using the Spark KafkaUtils class to create a DStream, and everything is working fine until I have data that ages out of the topic because of the retention policy. My problem comes when I stop my job to make some changes if any data has aged out of the topic I get an error saying that my offsets are out of range. I have done a lot of research including looking at the spark source code, and I see lots of comments like the comments in this issue: SPARK-19680 - basically saying that data should not be lost silently - so auto.offset.reset is ignored by spark. My big question, though, is what can I do now? My topic will not poll in spark - it dies on startup with the offsets exception. I don't know how to reset the offsets so my job will just get started again. I have not enabled checkpoints since I read that those are unreliable for this use. I used to have a lot of code to manage offsets, but it appears that spark ignores requested offsets if there are any committed, so I am currently managing offsets like this:
val stream = KafkaUtils.createDirectStream[String, T](
ssc,
PreferConsistent,
Subscribe[String, T](topics, kafkaParams))
stream.foreachRDD { (rdd, batchTime) =>
val offsets = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
Log.debug("processing new batch...")
val values = rdd.map(x => x.value())
val incomingFrame: Dataset[T] = SparkUtils.sparkSession.createDataset(values)(consumer.encoder()).persist
consumer.processDataset(incomingFrame, batchTime)
stream.asInstanceOf[CanCommitOffsets].commitAsync(offsets)
}
ssc.start()
ssc.awaitTermination()
As a workaround I have been changing my group ids but that is really lame. I know this is expected behavior and should not happen, I just need to know how to get the stream running again. Any help would be appreciated.

Here is a block of code I wrote to get by this until a real solution is introduced to spark-streaming-kafka. It basically resets the offsets for the partitions that have aged out based on the OffsetResetStrategy you set. Just give it the same Map params, _params, you provide to KafkaUtils. Call this before calling KafkaUtils.create****Stream() from your driver.
final OffsetResetStrategy offsetResetStrategy = OffsetResetStrategy.valueOf(_params.get(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG).toString().toUpperCase(Locale.ROOT));
if(OffsetResetStrategy.EARLIEST.equals(offsetResetStrategy) || OffsetResetStrategy.LATEST.equals(offsetResetStrategy)) {
LOG.info("Going to reset consumer offsets");
final KafkaConsumer<K,V> consumer = new KafkaConsumer<>(_params);
LOG.debug("Fetching current state");
final List<TopicPartition> parts = new LinkedList<>();
final Map<TopicPartition, OffsetAndMetadata> currentCommited = new HashMap<>();
for(String topic: this.topics()) {
List<PartitionInfo> info = consumer.partitionsFor(topic);
for(PartitionInfo i: info) {
final TopicPartition p = new TopicPartition(topic, i.partition());
final OffsetAndMetadata m = consumer.committed(p);
parts.add(p);
currentCommited.put(p, m);
}
}
final Map<TopicPartition, Long> begining = consumer.beginningOffsets(parts);
final Map<TopicPartition, Long> ending = consumer.endOffsets(parts);
LOG.debug("Finding what offsets need to be adjusted");
final Map<TopicPartition, OffsetAndMetadata> newCommit = new HashMap<>();
for(TopicPartition part: parts) {
final OffsetAndMetadata m = currentCommited.get(part);
final Long begin = begining.get(part);
final Long end = ending.get(part);
if(m == null || m.offset() < begin) {
LOG.info("Adjusting partition {}-{}; OffsetAndMeta={} Begining={} End={}", part.topic(), part.partition(), m, begin, end);
final OffsetAndMetadata newMeta;
if(OffsetResetStrategy.EARLIEST.equals(offsetResetStrategy)) {
newMeta = new OffsetAndMetadata(begin);
} else if(OffsetResetStrategy.LATEST.equals(offsetResetStrategy)) {
newMeta = new OffsetAndMetadata(end);
} else {
newMeta = null;
}
LOG.info("New offset to be {}", newMeta);
if(newMeta != null) {
newCommit.put(part, newMeta);
}
}
}
consumer.commitSync(newCommit);
consumer.close();
}

auto.offset.reset=latest/earliest will be applied only when consumer starts first time.
there is Spark JIRA to resolve this issue, till then we need live with work arounds.
https://issues.apache.org/jira/browse/SPARK-19680

Try
auto.offset.reset=latest
Or
auto.offset.reset=earliest
earliest: automatically reset the offset to the earliest offset
latest: automatically reset the offset to the latest offset
none: throw exception to the consumer if no previous offset is found for the consumer's group
anything else: throw exception to the consumer.
One more thing that affects what offset value will correspond to smallest and largest configs is log retention policy. Imagine you have a topic with retention configured to 1 hour. You produce 10 messages, and then an hour later you post 10 more messages. The largest offset will still remain the same but the smallest one won't be able to be 0 because Kafka will already remove these messages and thus the smallest available offset will be 10.

This problem was solved in the stream structuring structure by including "failOnDataLoss" = "false". It is unclear why there is no such option in the spark DStream framework.
This is a BIG quesion for spark developers!
In our projects, we tried to solve this problem by resetting the offsets form ealiest + 5 minutes ... it helps in most cases.

Related

Aug 2019 - Kafka Consumer Lag programmatically

Is there any way we can programmatically find lag in the Kafka Consumer.
I don't want external Kafka Manager tools to install and check on dashboard.
We can list all the consumer group and check for lag for each group.
Currently we do have command to check the lag and it requires the relative path where the Kafka resides.
Spring-Kafka, kafka-python, Kafka Admin client or using JMX - is there any way we can code and find out the lag.
We were careless and didn't monitor the process, the consumer was in zombie state and the lag went to 50,000 which resulted in lot of chaos.
Only when the issue arises we think of these cases as we were monitoring the script but didn't knew it will be result in zombie process.
Any thoughts are extremely welcomed!!
you can get this using kafka-python, run this on each broker or loop through list of brokers, it will give all topic partitions consumer lag.
BOOTSTRAP_SERVERS = '{}'.format(socket.gethostbyname(socket.gethostname()))
client = BrokerConnection(BOOTSTRAP_SERVERS, 9092, socket.AF_INET)
client.connect_blocking()
list_groups_request = ListGroupsRequest_v1()
future = client.send(list_groups_request)
while not future.is_done:
for resp, f in client.recv():
f.success(resp)
for group in future.value.groups:
if group[1] == 'consumer':
#print(group[0])
list_mebers_in_groups = DescribeGroupsRequest_v1(groups=[(group[0])])
future = client.send(list_mebers_in_groups)
while not future.is_done:
for resp, f in client.recv():
#print resp
f.success(resp)
(error_code, group_id, state, protocol_type, protocol, members) = future.value.groups[0]
if len(members) !=0:
for member in members:
(member_id, client_id, client_host, member_metadata, member_assignment) = member
member_topics_assignment = []
for (topic, partitions) in MemberAssignment.decode(member_assignment).assignment:
member_topics_assignment.append(topic)
for topic in member_topics_assignment:
consumer = KafkaConsumer(
bootstrap_servers=BOOTSTRAP_SERVERS,
group_id=group[0],
enable_auto_commit=False
)
consumer.topics()
for p in consumer.partitions_for_topic(topic):
tp = TopicPartition(topic, p)
consumer.assign([tp])
committed = consumer.committed(tp)
consumer.seek_to_end(tp)
last_offset = consumer.position(tp)
if last_offset != None and committed != None:
lag = last_offset - committed
print "group: {} topic:{} partition: {} lag: {}".format(group[0], topic, p, lag)
consumer.close(autocommit=False)
Yes. We can get consumer lag in kafka-python. Not sure if this is best way to do it. But this works.
Currently we are giving our consumer manually, you also get consumers from kafka-python, but it gives only the list of active consumers. So if one of your consumers is down. It may not show up in the list.
First establish client connection
from kafka import BrokerConnection
from kafka.protocol.commit import *
import socket
#This takes in only one broker at a time. So to use multiple brokers loop through each one by giving broker ip and port.
def establish_broker_connection(server, port, group):
'''
Client Connection to each broker for getting consumer offset info
'''
bc = BrokerConnection(server, port, socket.AF_INET)
bc.connect_blocking()
fetch_offset_request = OffsetFetchRequest_v3(group, None)
future = bc.send(fetch_offset_request)
Next we need to get the current offset for each topic the consumer is subscribed to. Pass the above future and bc here.
from kafka import SimpleClient
from kafka.protocol.offset import OffsetRequest, OffsetResetStrategy
from kafka.common import OffsetRequestPayload
def _get_client_connection():
'''
Client Connection to the cluster for getting topic info
'''
# Give comma seperated info of kafka broker "broker1:port1, broker2:port2'
client = SimpleClient(BOOTSTRAP_SEREVRS)
return client
def get_latest_offset_for_topic(self, topic):
'''
To get latest offset for a topic
'''
partitions = self.client.topic_partitions[topic]
offset_requests = [OffsetRequestPayload(topic, p, -1, 1) for p in partitions.keys()]
client = _get_client_connection()
offsets_responses = client.send_offset_request(offset_requests)
latest_offset = offsets_responses[0].offsets[0]
return latest_offset # Gives latest offset for topic
def get_current_offset_for_consumer_group(future, bc):
'''
Get current offset info for a consumer group
'''
while not future.is_done:
for resp, f in bc.recv():
f.success(resp)
# future.value.topics -- This will give all the topics in the form of a list.
for topic in self.future.value.topics:
latest_offset = self.get_latest_offset_for_topic(topic[0])
for partition in topic[1]:
offset_difference = latest_offset - partition[1]
offset_difference gives the difference between the last offset produced in the topic and the last offset (or message) consumed by your consumer.
If you are not getting current offset for a consumer for a topic then it means your consumer is probably down.
So you can raise alerts or send mail if the offset difference is above a threshold you want or if you get empty offsets for your consumer.
The java client exposes the lag for its consumers over JMX; in this example we have 5 partitions...
Spring Boot can publish these to micrometer.
I'm writing code in scala but use only native java API from KafkaConsumer and KafkaProducer.
You need only know the name of Consumer Group and Topics.
it's possible to avoid pre-defined topic, but then you will get Lag only for Consumer Group which exist and which state is stable not rebalance, this can be a problem for alerting.
So all that you really need to know and use are:
KafkaConsumer.commited - return latest committed offset for TopicPartition
KafkaConsumer.assign - do not use subscribe, because it causes to CG rebalance. You definitely do not want that your monitoring process to influence on the subject of monitoring.
kafkaConsumer.endOffsets - return latest produced offset
Consumer Group Lag - is a difference between the latest committed and latest produced
import java.util.{Properties, UUID}
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.clients.producer.KafkaProducer
import org.apache.kafka.common.TopicPartition
import org.apache.kafka.common.serialization.{StringDeserializer, StringSerializer}
import scala.collection.JavaConverters._
import scala.util.Try
case class TopicPartitionInfo(topic: String, partition: Long, currentPosition: Long, endOffset: Long) {
val lag: Long = endOffset - currentPosition
override def toString: String = s"topic=$topic,partition=$partition,currentPosition=$currentPosition,endOffset=$endOffset,lag=$lag"
}
case class ConsumerGroupInfo(consumerGroup: String, topicPartitionInfo: List[TopicPartitionInfo]) {
override def toString: String = s"ConsumerGroup=$consumerGroup:\n${topicPartitionInfo.mkString("\n")}"
}
object ConsumerLag {
def consumerGroupInfo(bootStrapServers: String, consumerGroup: String, topics: List[String]) = {
val properties = new Properties()
properties.put("bootstrap.servers", bootStrapServers)
properties.put("auto.offset.reset", "latest")
properties.put("group.id", consumerGroup)
properties.put("key.deserializer", classOf[StringDeserializer])
properties.put("value.deserializer", classOf[StringDeserializer])
properties.put("key.serializer", classOf[StringSerializer])
properties.put("value.serializer", classOf[StringSerializer])
properties.put("client.id", UUID.randomUUID().toString)
val kafkaProducer = new KafkaProducer[String, String](properties)
val kafkaConsumer = new KafkaConsumer[String, String](properties)
val assignment = topics
.map(topic => kafkaProducer.partitionsFor(topic).asScala)
.flatMap(partitions => partitions.map(p => new TopicPartition(p.topic, p.partition)))
.asJava
kafkaConsumer.assign(assignment)
ConsumerGroupInfo(consumerGroup,
kafkaConsumer.endOffsets(assignment).asScala
.map { case (tp, latestOffset) =>
TopicPartitionInfo(tp.topic,
tp.partition,
Try(kafkaConsumer.committed(tp)).map(_.offset).getOrElse(0), // TODO Warn if Null, Null mean Consumer Group not exist
latestOffset)
}
.toList
)
}
def main(args: Array[String]): Unit = {
println(
consumerGroupInfo(
bootStrapServers = "kafka-prod:9092",
consumerGroup = "not-exist",
topics = List("events", "anotherevents")
)
)
println(
consumerGroupInfo(
bootStrapServers = "kafka:9092",
consumerGroup = "consumerGroup1",
topics = List("events", "anotehr events")
)
)
}
}
if anyone is looking for consumer lag in confluent cloud here is simple script
BOOTSTRAP_SERVERS = "<>.aws.confluent.cloud"
CCLOUD_API_KEY = "{{ ccloud_apikey }}"
CCLOUD_API_SECRET = "{{ ccloud_apisecret }}"
ENVIRONMENT = "dev"
CLUSTERID = "dev"
CACERT = "/usr/local/lib/python{{ python3_version }}/site-packages/certifi/cacert.pem"
def main():
client = KafkaAdminClient(bootstrap_servers=BOOTSTRAP_SERVERS,
ssl_cafile=CACERT,
security_protocol='SASL_SSL',
sasl_mechanism='PLAIN',
sasl_plain_username=CCLOUD_API_KEY,
sasl_plain_password=CCLOUD_API_SECRET)
for group in client.list_consumer_groups():
if group[1] == 'consumer':
consumer = KafkaConsumer(
bootstrap_servers=BOOTSTRAP_SERVERS,
ssl_cafile=CACERT,
group_id=group[0],
enable_auto_commit=False,
api_version=(0,10),
security_protocol='SASL_SSL',
sasl_mechanism='PLAIN',
sasl_plain_username=CCLOUD_API_KEY,
sasl_plain_password=CCLOUD_API_SECRET
)
list_members_in_groups = client.list_consumer_group_offsets(group[0])
for (topic,partition) in list_members_in_groups:
consumer.topics()
tp = TopicPartition(topic, partition)
consumer.assign([tp])
committed = consumer.committed(tp)
consumer.seek_to_end(tp)
last_offset = consumer.position(tp)
if last_offset != None and committed != None:
lag = last_offset - committed
print("group: {} topic:{} partition: {} lag: {}".format(group[0], topic, partition, lag))
consumer.close(autocommit=False)

Kafka consumer is reading last committed offset on re-start (Java)

I have a kakfa consumer for which enable.auto.commit is set to false. Whenever I re-start my consumer application, it always reads the last committed offset again and then the next offsets.
For ex. Last committed offset is 50. When I restart consumer, it again reads offset 50 first and then the next offsets.
I am performing commitsync as shown below.
Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
offsets.put(new TopicPartition("sometopic", partition), new OffsetAndMetadata(offset));
kafkaconsumer.commitSync(offsets);
I tried setting auto.offset.reset to earliest and latest but it is not changing the behavior.
Am I missing something here in consumer configuration ?
config.put(ConsumerConfig.CLIENT_ID_CONFIG, "CLIENT_ID");
config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
config.put(ConsumerConfig.GROUP_ID_CONFIG, "GROUP_ID");
config.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,StringDeserializer.class.getName());
config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,CustomDeserializer.class.getName());
config.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
If you want to use commitSync(offset) you have to be careful and read its Javadoc:
The committed offset should be the next message your application will consume, i.e. lastProcessedMessageOffset + 1.
If you don't add + 1 to the offset, it is expected that on next restart, the consumer will consume again the last message. As mentioned in the other answer, if you use commitSync() without any argument, you don't have to worry about that
It looks like you're trying to commit using new OffsetAndMetadta(offset). That's not the typical usage.
Here's an example from the documentation, under Manual Offset Control:
List<ConsumerRecord<String, String>> buffer = new ArrayList<>();
while (true) {
ConsumerRecords<String, String> records = consumer.poll(100);
for (ConsumerRecord<String, String> record : records) {
buffer.add(record);
}
if (buffer.size() >= minBatchSize) {
insertIntoDb(buffer);
consumer.commitSync();
buffer.clear();
}
}
https://kafka.apache.org/21/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
Notice how the consumer.commitSync() call is performed without any parameters. It simply consumes, and it will commit to whatever was consumed up to that point.

Reading last Kafka partition's record that's transactionally written

I'm trying to read the last record of a topic's partition. The topic's producer is writing transactionally. The consumer is set up with isolation.level=read_committed. I manage the offsets manually. Here's my consumer code:
// Fetch the end offset of a partition
Map<TopicPartition, Long> endOffsets = consumer.endOffsets(Collections.singleton(topicPartition));
Long endOffset = endOffsets.get(topicPartition);
// Try to read the last record
if (endOffset > 0) {
consumer.seek(topicPartition, Math.max(0, endOffset - 5));
List records = consumer.poll(10 * 1000).records(topicPartition);
if (records.isEmpty()) {
throw new IllegalStateException("How can this be?");
} else {
// Return the last record
return records.get(records.size() - 1);
}
}
So to read the last record I ask for the end offset, then seek to endOffset - 5 (because of Kafka skipping offsets in exactly-once mode, like I've seen in this question: Kafka Streams does not increment offset by 1 when producing to topic), and start polling.
It's been working well before, but now I've got an exception telling me zero records are polled. What can be a reason for that? I can't reproduce it and I think I'm lost.

java KafkaConsumer never get results

I'm new to kafka, I have the following sample code :
KafkaConsumer<String,String> kc = new KafkaConsumer<String, String>(props);
while(true) {
List<String> topicNames = Arrays.asList(topics.split(","));
if (!kc.assignment().isEmpty()) {
kc.unsubscribe();
}
kc.subscribe(topicNames);
ConsumerRecords<String, String> recv = kc.poll(1000L);
if (!recv.isEmpty()) {
System.out.println("NOT EMPTY");
}
}
The recv is always empty but if I try to increment the pool timeout the records are returned, also if I cut off the unsubscribe part.
I've taken this piece of code from an integration proprietary software and I cannot modify it.
So my question is: Is this only a timing problem or there is more?
There is a lot that happens when a consumer (re)subscribes to a topic.
Very roughly and as far as I remember the consumer will:
request cluster information
request consumer group metadata
make a JOIN_GROUP request
be assigned certain partitions
The underlying mechanisms are even more complicated if there are more consumers within the same group. That's because the partitions should be reassigned between all the consumers within the group.
That is why:
1000 millis might not be enough for all this and you didn't poll anything in time
you polled something when you increased the timeout because Kafka managed to perform all of these bootstrapping operations
you polled something when you removed the unsubscription to the topics because most likely your consumer was already subscribed
So there is a timing issue. And I think that there is something more - un/subscribing to a topic within an infinite loop makes no sense to me (see the other answer).
You should subscribe to your topics only once at the beginning. Like this:
final KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
while (true) {
final ConsumerRecords<String, String> records = consumer.poll(100);
for (ConsumerRecord<String, String> record : records)
System.out.printf("offset = %d, key = %s, value = %s%n", record.offset(), record.key(), record.value());
}

How to check if Kafka Consumer is ready

I have Kafka commit policy set to latest and missing first few messages. If I give a sleep of 20 seconds before starting to send the messages to the input topic, everything is working as desired. I am not sure if the problem is with consumer taking long time for partition rebalancing. Is there a way to know if the consumer is ready before starting to poll ?
You can use consumer.assignment(), it will return set of partitions and verify whether all of the partitions are assigned which are available for that topic.
If you are using spring-kafka project, you can include spring-kafka-test dependancy and use below method to wait for topic assignment , but you need to have container.
ContainerTestUtils.waitForAssignment(Object container, int partitions);
You can do the following:
I have a test that reads data from kafka topic.
So you can't use KafkaConsumer in multithread environment, but you can pass parameter "AtomicReference assignment", update it in consumer-thread, and read it in another thread.
For example, snipped of working code in project for testing:
private void readAvro(String readFromKafka,
AtomicBoolean needStop,
List<Event> events,
String bootstrapServers,
int readTimeout) {
// print the topic name
AtomicReference<Set<TopicPartition>> assignment = new AtomicReference<>();
new Thread(() -> readAvro(bootstrapServers, readFromKafka, needStop, events, readTimeout, assignment)).start();
long startTime = System.currentTimeMillis();
long maxWaitingTime = 30_000;
for (long time = System.currentTimeMillis(); System.currentTimeMillis() - time < maxWaitingTime;) {
Set<TopicPartition> assignments = Optional.ofNullable(assignment.get()).orElse(new HashSet<>());
System.out.println("[!kafka-consumer!] Assignments [" + assignments.size() + "]: "
+ assignments.stream().map(v -> String.valueOf(v.partition())).collect(Collectors.joining(",")));
if (assignments.size() > 0) {
break;
}
try {
Thread.sleep(1_000);
} catch (InterruptedException e) {
e.printStackTrace();
needStop.set(true);
break;
}
}
System.out.println("Subscribed! Wait summary: " + (System.currentTimeMillis() - startTime));
}
private void readAvro(String bootstrapServers,
String readFromKafka,
AtomicBoolean needStop,
List<Event> events,
int readTimeout,
AtomicReference<Set<TopicPartition>> assignment) {
KafkaConsumer<String, byte[]> consumer = (KafkaConsumer<String, byte[]>) queueKafkaConsumer(bootstrapServers, "latest");
System.out.println("Subscribed to topic: " + readFromKafka);
consumer.subscribe(Collections.singletonList(readFromKafka));
long started = System.currentTimeMillis();
while (!needStop.get()) {
assignment.set(consumer.assignment());
ConsumerRecords<String, byte[]> records = consumer.poll(1_000);
events.addAll(CommonUtils4Tst.readEvents(records));
if (readTimeout == -1) {
if (events.size() > 0) {
break;
}
} else if (System.currentTimeMillis() - started > readTimeout) {
break;
}
}
needStop.set(true);
synchronized (MainTest.class) {
MainTest.class.notifyAll();
}
consumer.close();
}
P.S.
needStop - global flag, to stop all running thread if any in case of failure of success
events - list of object, that i want to check
readTimeout - how much time we will wait until read all data, if readTimeout == -1, then stop when we read anything
Thanks to Alexey (I have also voted up), I seemed to have resolved my issue essentially following the same idea.
Just want to share my experience... in our case we using Kafka in request & response way, somewhat like RPC. Request is being sent on one topic and then waiting for response on another topic. Running into a similar issue i.e. missing out first response.
I have tried ... KafkaConsumer.assignment(); repeatedly (with Thread.sleep(100);) but doesn't seem to help. Adding a KafkaConsumer.poll(50); seems to have primed the consumer (group) and receiving the first response too. Tested few times and it consistently working now.
BTW, testing requires stopping application & deleting Kafka topics and, for a good measure, restarted Kafka too.
PS: Just calling poll(50); without assignment(); fetching logic, like Alexey mentioned, may not guarantee that consumer (group) is ready.
You can modify an AlwaysSeekToEndListener (listens only to new messages) to include a callback:
public class AlwaysSeekToEndListener<K, V> implements ConsumerRebalanceListener {
private final Consumer<K, V> consumer;
private Runnable callback;
public AlwaysSeekToEndListener(Consumer<K, V> consumer) {
this.consumer = consumer;
}
public AlwaysSeekToEndListener(Consumer<K, V> consumer, Runnable callback) {
this.consumer = consumer;
this.callback = callback;
}
#Override
public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
}
#Override
public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
consumer.seekToEnd(partitions);
if (callback != null) {
callback.run();
}
}
}
and subscribe with a latch callback:
CountDownLatch initLatch = new CountDownLatch(1);
consumer.subscribe(singletonList(topic), new AlwaysSeekToEndListener<>(consumer, () -> initLatch.countDown()));
initLatch.await(); // blocks until consumer is ready and listening
then proceed to start your producer.
If your policy is set to latest - which takes effect if there are no previously committed offsets - but you have no previously committed offsets, then you should not worry about 'missing' messages, because you're telling Kafka not to care about messages that were sent 'previously' to your consumers being ready.
If you care about 'previous' messages, you should set the policy to earliest.
In any case, whatever the policy, the behaviour you are seeing is transient, i.e. once committed offsets are saved in Kafka, on every restart the consumers will pick up where they left previoulsy
I needed to know if a kafka consumer was ready before doing some testing, so i tried with consumer.assignment(), but it only returned the set of partitions assigned, but there was a problem, with this i cannot see if this partitions assigned to the group had offset setted, so later when i tried to use the consumer it didnĀ“t have offset setted correctly.
The solutions was to use committed(), this will give you the last commited offsets of the given partitions that you put in the arguments.
So you can do something like: consumer.committed(consumer.assignment())
If there is no partitions assigned yet it will return:
{}
If there is partitions assigned, but no offset yet:
{name.of.topic-0=null, name.of.topic-1=null}
But if there is partitions and offset:
{name.of.topic-0=OffsetAndMetadata{offset=5197881, leaderEpoch=null, metadata=''}, name.of.topic-1=OffsetAndMetadata{offset=5198832, leaderEpoch=null, metadata=''}}
With this information you can use something like:
consumer.committed(consumer.assignment()).isEmpty();
consumer.committed(consumer.assignment()).containsValue(null);
And with this information you can be sure that the kafka consumer is ready.