Kafka consumer returns empty iterator - apache-kafka

In my sample program I try to publish a file and consume it immediately, but my consumer iterator returns null.
Any idea what I'm doing wrong?
Test
main() {
KafkaMessageProducer producer = new KafkaMessageProducer(topic, file);
producer.generateMessgaes();
MessageListener listener = new MessageListener(topic);
listener.start();
}
MessageListener
public void start() {
Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
topicCountMap.put(topic, new Integer(CoreConstants.THREAD_SIZE));
Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumerConnector
.createMessageStreams(topicCountMap);
List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(topic);
executor = Executors.newFixedThreadPool(CoreConstants.THREAD_SIZE);
for (KafkaStream<byte[], byte[]> stream : streams) {
System.out.println("The stream is --"+ stream.iterator().makeNext().topic());
executor.submit(new ListenerThread(stream));
}
try { // without this wait the subsequent shutdown happens immediately before any messages are delivered
Thread.sleep(10000);
} catch (InterruptedException ie) {
}
if (consumerConnector != null) {
consumerConnector.shutdown();
}
if (executor != null) {
executor.shutdown();
}
}
ListenerThread
public class ListenerThread implements Runnable {
private KafkaStream<byte[], byte[]> stream;
public ListenerThread(KafkaStream<byte[], byte[]> msgStream) {
this.stream = msgStream;
System.out.println("----------" + stream.iterator().makeNext().topic());
}
public void run() {
try {
ConsumerIterator<byte[], byte[]> it = stream.iterator();
while (it.hasNext()) {
// MessageAndMetadata<byte[], byte[]> messageAndMetadata =
// it.makeNext();
// String topic = messageAndMetadata.topic();
// byte[] message = messageAndMetadata.message();
System.out.println("111111111111111111111111111");
FileProcessor processor = new FileProcessor();
processor.processFile("LOB_TOPIC", it.next().message());
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
In the above code it never enters the while loop, since the iterator returns nothing. But I'm sure I'm publishing a single message to the same topic and the consumer listens on that topic.
Any help would be appreciated.
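For reference, here is a minimal sketch of a plain iteration loop with the old kafka.consumer high-level API (class names taken from the question). Note that makeNext() is an internal method of the iterator; calling it up front, as the code above does in start() and in the ListenerThread constructor, may block or pull a message outside the normal hasNext()/next() protocol, so the sketch sticks to the public calls only:

import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.message.MessageAndMetadata;

// Sketch only: a thread that drains one KafkaStream using hasNext()/next().
public class SimpleListenerThread implements Runnable {
    private final KafkaStream<byte[], byte[]> stream;

    public SimpleListenerThread(KafkaStream<byte[], byte[]> stream) {
        this.stream = stream;
    }

    public void run() {
        ConsumerIterator<byte[], byte[]> it = stream.iterator();
        // hasNext() blocks until a message arrives (or throws
        // ConsumerTimeoutException if consumer.timeout.ms is set).
        while (it.hasNext()) {
            MessageAndMetadata<byte[], byte[]> mm = it.next();
            System.out.println("Received from " + mm.topic()
                    + " partition " + mm.partition()
                    + " offset " + mm.offset()
                    + ": " + new String(mm.message()));
        }
    }
}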

I was having this same issue yesterday. After trying to work with it for a while, I couldn't get it to read from my current topic, so I took the following steps:
a. stopped my consumer,
b. stopped the producer,
c. stopped the Kafka server
bin/kafka-server-stop.sh
d. stopped ZooKeeper
bin/zookeeper-server-stop.sh config/zookeeper.properties
After that I deleted my topic:
bin/kafka-topics.sh --delete --zookeeper localhost:2181 --topic test
I also deleted the files that were created while following "Setting up a multi-broker cluster", but I don't think that caused the issue.
a. started ZooKeeper
b. started Kafka
c. started the producer and sent some messages to Kafka
It started to work again. I am not sure if this will help you or not, but it seems my producer had somehow got disconnected from the consumer. Hope this helps.

Related

Kafka retry mechanism doesn't stop even when the previous retry attempt was successful

I have a Kafka retry mechanism in place which retries 2 times, waiting 30 seconds between attempts. I noticed that even though the first retry attempt was successful, it still performs the second retry. This results in duplicate messages in the Kafka topic. Is there any way to stop Kafka from doing unnecessary retries when the previous retry attempt was successful?
Here is my listener configuration
@Bean
@ConditionalOnMissingBean(name = "kafkaListenerContainerFactory")
public ConcurrentKafkaListenerContainerFactory<String, SpecificRecord>
kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, SpecificRecord> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(testConsumerFactory());
factory.getContainerProperties().setAckOnError(false);
factory.getContainerProperties().setAckMode(AckMode.RECORD);
SeekToCurrentErrorHandler errorHandler =
new SeekToCurrentErrorHandler((record, exception) -> {
LOGGER.error("Error while processing the record {}", exception.getCause().getMessage());
}, new FixedBackOff(30000L, 2L));
factory.setErrorHandler(errorHandler);
return factory;
}
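As a side note (not from the original post), here is a small sketch of how that FixedBackOff is interpreted, assuming Spring's org.springframework.util.backoff API: the error handler only redelivers when the listener throws, and FixedBackOff(30000L, 2L) allows at most two redeliveries, 30 seconds apart, after the original attempt.

import org.springframework.util.backoff.BackOffExecution;
import org.springframework.util.backoff.FixedBackOff;

public class BackOffDemo {
    public static void main(String[] args) {
        // Same back-off as the factory above: 30 s interval, max 2 attempts.
        BackOffExecution execution = new FixedBackOff(30000L, 2L).start();
        System.out.println(execution.nextBackOff()); // 30000 -> first redelivery
        System.out.println(execution.nextBackOff()); // 30000 -> second redelivery
        System.out.println(execution.nextBackOff()); // -1 (BackOffExecution.STOP) -> give up
    }
}

A redelivery only happens when the listener throws, so a duplicate after a retry that appeared to succeed usually means an exception still escaped the listener after the publish to topicB had already gone out.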
Here is my listener logic and the flow:
ConsumerA consumes the data from topicA and makes a call to a microservice for the data.
After getting the data, a producer publishes the data to topicB.
ConsumerB consumes the data from topicB and makes a call to a microservice to persist the data.
Once the data is persisted, a new message gets published to topicB.
Consumer logic for topicA
@KafkaListener(topics = "${test.topicA.name}",
containerFactory = "kafkaListenerContainerFactory")
public void topicListener(ConsumerRecord<String, SpecificRecord> record) {
LOGGER.info("Consumed {} topic from partition {} ", record.topic(), record.partition());
testService.getData(record);
}
Consumer logic for topicB
@KafkaListener(topics = "${test.topicB.name}",
containerFactory = "kafkaListenerContainerFactory")
public void topicListener(ConsumerRecord<String, SpecificRecord> record) {
LOGGER.info("Consumed {} topic from partition {} ", record.topic(), record.partition());
testService2.persistDetails(record);
}

InvalidStateStoreException on KStream join using GlobalKtables

I have a Kafka Streams application where I am joining a KStream that reads
from "topic1" with a GlobalKTable that reads from "topic2" and then with
another GlobalKTable that reads from "topic3".
When I try to push messages to all 3 topics at the same time, I get the following exception:
org.apache.kafka.streams.errors.InvalidStateStoreException
If I push messages to these topics one by one, i.e. first to topic2, then to topic3 and then to topic1, I do not get this exception.
I have also added a StateListener before starting KafkaStreams:
KafkaStreams.StateListener stateListener = new KafkaStreams.StateListener() {
@Override
public void onChange(KafkaStreams.State newState, KafkaStreams.State oldState) {
if(newState == KafkaStreams.State.REBALANCING) {
try {
Thread.sleep(1000);
}
catch (InterruptedException e) {
e.printStackTrace();
}
}
}
};
streams.setStateListener(stateListener);
streams.start();
I also wait until the store is queryable after the stream has started, by calling the following method:
public static <T> T waitUntilStoreIsQueryable(final String storeName,
final QueryableStoreType<T> queryableStoreType,
final KafkaStreams streams) throws InterruptedException {
while (true) {
try {
return streams.store(storeName, queryableStoreType);
} catch (final InvalidStateStoreException ignored) {
// store not yet ready for querying
Thread.sleep(100);
}
}
}
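A hypothetical call site (not in the original post), assuming the store name from the topology below and the standard QueryableStoreTypes helper:

// imports assumed: org.apache.kafka.streams.state.QueryableStoreTypes,
//                  org.apache.kafka.streams.state.ReadOnlyKeyValueStore
ReadOnlyKeyValueStore<String, GenericRecord> topic2Store =
        waitUntilStoreIsQueryable(
                "topic2-global-store",
                QueryableStoreTypes.<String, GenericRecord>keyValueStore(),
                streams);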
Following is the Kafka Streams and GlobalKTable join code:
KStream<String, GenericRecord> topic1KStream =
builder.stream(
"topic1",
Consumed.with(Serdes.String(), genericRecordSerde)
);
GlobalKTable<String, GenericRecord> topic2KTable =
builder.globalTable(
"topic2",
Consumed.with(Serdes.String(), genericRecordSerde),
Materialized.<String, GenericRecord, KeyValueStore<Bytes, byte[]>>as("topic2-global-store")
.withKeySerde(Serdes.String())
.withValueSerde(genericRecordSerde)
);
GlobalKTable<String, GenericRecord> topic3KTable =
builder.globalTable(
"topic3",
Consumed.with(Serdes.String(), genericRecordSerde),
Materialized.<String, GenericRecord, KeyValueStore<Bytes, byte[]>>as("topic3-global-store")
.withKeySerde(Serdes.String())
.withValueSerde(genericRecordSerde)
);
KStream<String, MergedObj> stream_topic1_topic2 = topic1KStream.join(
topic2KTable,
(topic2Id, topic1Obj) -> topic1Obj.get("id").toString(),
(topic1Obj, topic2Obj) -> new MergedObj(topic1Obj, topic2Obj)
);
final KStream<String, GenericRecord> enrichedStream =
stream_topic1_topic2.join(
topic3KTable,
(topic2Id, mergedObj) -> mergedObj.topic3Id(),
(mergedObj, topic3Obj) -> new Enriched(
mergedObj.topic1Obj,
mergedObj.topic2Obj,
topic3Obj
).enrich()
);
enrichedStream.to("enrichedStreamTopic", Produced.with(Serdes.String(),getGenericRecordSerde()));
The above code is very similar to this.
When I try to push messages to all 3 topics at the same time, I get the following exception:
org.apache.kafka.streams.errors.StreamsException: Exception caught in process. taskId=0_1, processor=KSTREAM-SOURCE-0000000000, topic=topic1, partition=1, offset=61465,
stacktrace=org.apache.kafka.streams.errors.InvalidStateStoreException: Store topic2-global-store is currently closed.
at org.apache.kafka.streams.state.internals.WrappedStateStore.validateStoreOpen(WrappedStateStore.java:66)
at org.apache.kafka.streams.state.internals.CachingKeyValueStore.get(CachingKeyValueStore.java:150)
at org.apache.kafka.streams.state.internals.CachingKeyValueStore.get(CachingKeyValueStore.java:37)
at org.apache.kafka.streams.state.internals.MeteredKeyValueStore.get(MeteredKeyValueStore.java:135)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl$KeyValueStoreReadOnlyDecorator.get(ProcessorContextImpl.java:245)
at org.apache.kafka.streams.kstream.internals.KTableSourceValueGetterSupplier$KTableSourceValueGetter.get(KTableSourceValueGetterSupplier.java:49)
at org.apache.kafka.streams.kstream.internals.KStreamKTableJoinProcessor.process(KStreamKTableJoinProcessor.java:71)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:117)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:183)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:162)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:122)
at org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:87)
at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:364)
at org.apache.kafka.streams.processor.internals.AssignedStreamsTasks.process(AssignedStreamsTasks.java:199)
at org.apache.kafka.streams.processor.internals.TaskManager.process(TaskManager.java:420)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:890)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774)
I fixed the issue. In my code I had auto.register.schemas=false because I had manually registered schemas for all my topics. After I set auto.register.schemas=true and re-ran the Streams application, it worked fine. I think it needs this flag for its internal topics.
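For anyone hitting the same thing: auto.register.schemas is a Confluent Avro serde setting, so it is configured on the serde rather than on Kafka Streams itself. A hedged sketch of where it would go (GenericAvroSerde from the Confluent serde artifact; the registry URL is a placeholder):

import java.util.HashMap;
import java.util.Map;
import io.confluent.kafka.streams.serdes.avro.GenericAvroSerde;

// Sketch only: placeholder schema registry address.
Map<String, Object> serdeConfig = new HashMap<>();
serdeConfig.put("schema.registry.url", "http://localhost:8081");
// Was false because schemas had been registered manually; Kafka Streams also
// needs to register schemas for its internal/changelog topics.
serdeConfig.put("auto.register.schemas", "true");

GenericAvroSerde genericRecordSerde = new GenericAvroSerde();
genericRecordSerde.configure(serdeConfig, false); // false = configure as a value serde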

Unable to get number of messages in kafka topic

I am fairly new to Kafka. I have created a sample producer and consumer in Java. Using the producer, I was able to send data to a Kafka topic, but I am not able to get the number of records in the topic using the following consumer code.
public class ConsumerTests {
public static void main(String[] args) throws Exception {
BasicConfigurator.configure();
String topicName = "MobileData";
String groupId = "TestGroup";
Properties properties = new Properties();
properties.put("bootstrap.servers", "localhost:9092");
properties.put("group.id", groupId);
properties.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
properties.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> kafkaConsumer = new KafkaConsumer<>(properties);
kafkaConsumer.subscribe(Arrays.asList(topicName));
try {
while (true) {
ConsumerRecords<String, String> consumerRecords = kafkaConsumer.poll(100);
System.out.println("Record count is " + consumerRecords.count());
}
} catch (WakeupException e) {
// ignore for shutdown
} finally {
kafkaConsumer.close();
}
}
}
I don't get any exception in the console, but consumerRecords.count() always returns 0, even if there are messages in the topic. Please let me know if I am missing something to get the record details.
The poll(...) call should normally be in a loop. It's always possible for the initial poll(...) to return no data (depending on the timeout) while the partition assignment is in progress. Here's an example:
try {
while (true) {
ConsumerRecords<String, String> records = consumer.poll(100);
System.out.println("Record count is " + records.count());
}
} catch (WakeupException e) {
// ignore for shutdown
} finally {
consumer.close();
}
For more info see this relevant article:
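As an aside, if the goal is the total number of records currently in the topic rather than consuming them, a hedged alternative (assuming a client of 0.10.1 or later, where beginningOffsets/endOffsets exist) is to diff end and beginning offsets per partition:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.TopicPartition;

public class TopicMessageCount {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "count-check");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            List<TopicPartition> partitions = new ArrayList<>();
            for (PartitionInfo info : consumer.partitionsFor("MobileData")) {
                partitions.add(new TopicPartition(info.topic(), info.partition()));
            }
            Map<TopicPartition, Long> begin = consumer.beginningOffsets(partitions);
            Map<TopicPartition, Long> end = consumer.endOffsets(partitions);
            long total = 0;
            for (TopicPartition tp : partitions) {
                total += end.get(tp) - begin.get(tp);
            }
            // Approximate count of records still retained in the topic.
            System.out.println("Record count is " + total);
        }
    }
}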

Kafka Consumer committing manually based on a condition

A @KafkaListener consumer commits only once a specific condition is met. Let's say a topic gets the following data from a producer:
"Message 0" at offset[0]
"Message 1" at offset[1]
They are received at the consumer and committed with the help of acknowledgment.acknowledge().
Then the messages below arrive on the topic:
"Message 2" at offset[2]
"Message 3" at offset[3]
The running consumer receives the above data. Here the condition fails and the above offsets are not committed.
Even when new data arrives on the topic, "Message 2" and "Message 3" should still be picked up by any consumer from the same consumer group, since they are not committed. But this is not happening: the consumer picks up the new message instead.
When I restart my consumer I get back "Message 2" and "Message 3". This should have happened while the consumers were running.
The code is as follows.
KafkaConsumerConfig file
@Configuration
@EnableKafka
public class KafkaConsumerConfig {
@Bean
KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory());
factory.setConcurrency(3);
factory.setBatchListener(true);
factory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL_IMMEDIATE);
factory.getContainerProperties().setSyncCommits(true);
return factory;
}
@Bean
public ConsumerFactory<String, String> consumerFactory() {
return new DefaultKafkaConsumerFactory<>(consumerConfigs());
}
@Bean
public Map<String, Object> consumerConfigs() {
Map<String, Object> propsMap = new HashMap<>();
propsMap.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
propsMap.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
propsMap.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "100");
propsMap.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "15000");
propsMap.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
propsMap.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
propsMap.put(ConsumerConfig.GROUP_ID_CONFIG, "group1");
propsMap.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
propsMap.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG,"1");
return propsMap;
}
@Bean
public Listener listener() {
return new Listener();
}
}
Listener class
public class Listener {
public CountDownLatch countDownLatch0 = new CountDownLatch(3);
private Logger LOGGER = LoggerFactory.getLogger(Listener.class);
static int count0 =0;
@KafkaListener(topics = "abcdefghi", group = "group1", containerFactory = "kafkaListenerContainerFactory")
public void listenPartition0(String data, @Header(KafkaHeaders.RECEIVED_PARTITION_ID) List<Integer> partitions,
@Header(KafkaHeaders.OFFSET) List<Long> offsets, Acknowledgment acknowledgment) throws InterruptedException {
count0 = count0 + 1;
LOGGER.info("start consumer 0");
LOGGER.info("received message via consumer 0='{}' with partition-offset='{}'", data, partitions + "-" + offsets);
if (count0%2 ==0)
acknowledgment.acknowledge();
LOGGER.info("end of consumer 0");
}
}
How can I achieve my desired result?
That's correct. The offset is a number which is easy enough to keep track of in memory on the consumer instance. Committed offsets are needed for consumers newly arriving in the group that take over the same partitions. That's why it works as expected when you restart the application or when a rebalance happens for the group.
To make it work the way you would like, consider implementing ConsumerSeekAware in your listener and calling ConsumerSeekCallback.seek() with the offset you would like to start consuming from on the next poll cycle.
http://docs.spring.io/spring-kafka/docs/2.0.0.M2/reference/html/_reference.html#seek:
public class Listener implements ConsumerSeekAware {
private final ThreadLocal<ConsumerSeekCallback> seekCallBack = new ThreadLocal<>();
@Override
public void registerSeekCallback(ConsumerSeekCallback callback) {
this.seekCallBack.set(callback);
}
@KafkaListener()
public void listen(...) {
this.seekCallBack.get().seek(topic, partition, 0);
}
}

Kafka Producer Consumer API Issue

I am using Kafka v0.10.0.0 and have created producer and consumer Java code, but the code is stuck on producer.send without any exception in the logs.
Can anyone please help? Thanks in advance.
I am using/modifying the "mapr - kafka sample program". You can look at the full code here:
https://github.com/panwars87/kafka-sample-programs
Important: I changed the kafka-client version to 0.10.0.0 in the Maven dependencies and am running Kafka 0.10.0.0 locally.
public class Producer {
public static void main(String[] args) throws IOException {
// set up the producer
KafkaProducer<String, String> producer;
System.out.println("Starting Producers....");
try (InputStream props = Resources.getResource("producer.props").openStream()) {
Properties properties = new Properties();
properties.load(props);
producer = new KafkaProducer<>(properties);
System.out.println("Property loaded successfully ....");
}
try {
for (int i = 0; i < 20; i++) {
// send lots of messages
System.out.println("Sending record one by one....");
producer.send(new ProducerRecord<String, String>("fast-messages","sending message - "+i+" to fast-message."));
System.out.println(i+" message sent....");
// every so often send to a different topic
if (i % 2 == 0) {
producer.send(new ProducerRecord<String, String>("fast-messages","sending message - "+i+" to fast-message."));
producer.send(new ProducerRecord<String, String>("summary-markers","sending message - "+i+" to summary-markers."));
producer.flush();
System.out.println("Sent msg number " + i);
}
}
} catch (Throwable throwable) {
System.out.printf("%s", throwable.getStackTrace());
throwable.printStackTrace();
} finally {
producer.close();
}
}
}
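One hedged debugging step (not from the original post): block on the Future returned by send(), so that metadata or broker problems surface as an exception instead of the send appearing to hang; the 30-second timeout below is an arbitrary choice.

import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

// Sketch only: reuses the 'producer' variable from the code above.
try {
    RecordMetadata metadata = producer
            .send(new ProducerRecord<String, String>("fast-messages", "probe message"))
            .get(30, TimeUnit.SECONDS); // fails loudly instead of hanging
    System.out.println("Written to partition " + metadata.partition()
            + " at offset " + metadata.offset());
} catch (ExecutionException | TimeoutException | InterruptedException e) {
    // e.g. a TimeoutException if the topic metadata can never be fetched
    e.printStackTrace();
}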
public class Consumer {
public static void main(String[] args) throws IOException {
// and the consumer
KafkaConsumer<String, String> consumer;
try (InputStream props = Resources.getResource("consumer.props").openStream()) {
Properties properties = new Properties();
properties.load(props);
if (properties.getProperty("group.id") == null) {
properties.setProperty("group.id", "group-" + new Random().nextInt(100000));
}
consumer = new KafkaConsumer<>(properties);
}
consumer.subscribe(Arrays.asList("fast-messages", "summary-markers"));
int timeouts = 0;
//noinspection InfiniteLoopStatement
while (true) {
// read records with a short timeout. If we time out, we don't really care.
ConsumerRecords<String, String> records = consumer.poll(200);
if (records.count() == 0) {
timeouts++;
} else {
System.out.printf("Got %d records after %d timeouts\n", records.count(), timeouts);
timeouts = 0;
}
for (ConsumerRecord<String, String> record : records) {
switch (record.topic()) {
case "fast-messages":
System.out.println("Record value for fast-messages is :"+ record.value());
break;
case "summary-markers":
System.out.println("Record value for summary-markers is :"+ record.value());
break;
default:
throw new IllegalStateException("Shouldn't be possible to get message on topic ");
}
}
}
}
}
The code you're running is for a demo of MapR, which is not Kafka. MapR claims API compatibility with Kafka 0.9, but even then MapR treats message offsets differently than Kafka does (offsets are byte offsets of messages rather than incremental offsets), etc. The MapR implementation is also very different, to say the least. This means that if you're lucky, a Kafka 0.9 app might just happen to run on MapR and vice versa. There is no such guarantee for other releases.
Thank you everyone for all your input. I resolved this by tweaking the MapR code and referring to a few other posts. Link to the solution API:
https://github.com/panwars87/hadoopwork/tree/master/kafka/kafka-api