KAFKA + FLINK 1.1.2 consumer group not working as expected - apache-kafka

I have one topic with 3 partitions and 3 FlinkKafkaConsumer09 instances consuming from that topic, all using the same Kafka consumer group properties, as below.
props.setProperty("group.id", "myGroup");
props.setProperty("auto.offset.reset", "latest");
But all 3 consumers still receive all the data. According to the consumer group concept, each record should be delivered to only one consumer inside the consumer group.
It works correctly with a plain Java consumer. Is this an issue with FlinkKafkaConsumer09?

This issue can be solved by writing your own Flink consumer.
Steps: 1. Pass the partition index to the Flink consumer as the "partitions" property.
Caveat: with this approach you have one consumer per partition.
public class YourConsumer<T> extends FlinkKafkaConsumerBase<T>
{
public static final long DEFAULT_POLL_TIMEOUT = 100L;
private final long pollTimeout;
// Constructors must carry the name of the enclosing class (YourConsumer).
public YourConsumer(String topic, DeserializationSchema<T> valueDeserializer, Properties props) {
this(Collections.singletonList(topic), valueDeserializer, props);
}
public YourConsumer(String topic, KeyedDeserializationSchema<T> deserializer, Properties props) {
this(Collections.singletonList(topic), deserializer, props);
}
public YourConsumer(List<String> topics, DeserializationSchema<T> deserializer, Properties props) {
this(topics, new KeyedDeserializationSchemaWrapper<>(deserializer), props);
}
public YourConsumer(List<String> topics, KeyedDeserializationSchema<T> deserializer, Properties props) {
super(topics, deserializer);
this.properties = checkNotNull(props, "props");
setDeserializer(this.properties);
// configure the polling timeout
try {
if (properties.containsKey(KEY_POLL_TIMEOUT)) {
this.pollTimeout = Long.parseLong(properties.getProperty(KEY_POLL_TIMEOUT));
} else {
this.pollTimeout = DEFAULT_POLL_TIMEOUT;
}
}
catch (Exception e) {
throw new IllegalArgumentException("Cannot parse poll timeout for '" + KEY_POLL_TIMEOUT + '\'', e);
}
}
@Override
protected AbstractFetcher<T, ?> createFetcher(
SourceContext<T> sourceContext,
List<KafkaTopicPartition> thisSubtaskPartitions,
SerializedValue<AssignerWithPeriodicWatermarks<T>> watermarksPeriodic,
SerializedValue<AssignerWithPunctuatedWatermarks<T>> watermarksPunctuated,
StreamingRuntimeContext runtimeContext) throws Exception {
boolean useMetrics = !Boolean.valueOf(properties.getProperty(KEY_DISABLE_METRICS, "false"));
return new Kafka09Fetcher<>(sourceContext, thisSubtaskPartitions,
watermarksPeriodic, watermarksPunctuated,
runtimeContext, deserializer,
properties, pollTimeout, useMetrics);
}
@Override
protected List<KafkaTopicPartition> getKafkaPartitions(List<String> topics) {
// read the partitions that belong to the listed topics
final List<KafkaTopicPartition> partitions = new ArrayList<>();
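// index of the single partition this consumer should read, taken from the "partitions" property
int partition = Integer.parseInt(this.properties.getProperty("partitions"));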
try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(this.properties)) {
for (final String topic: topics) {
// get partitions for each topic
List<PartitionInfo> partitionsForTopic = consumer.partitionsFor(topic);
// for non existing topics, the list might be null.
if (partitionsForTopic != null) {
partitions.addAll(convertToFlinkKafkaTopicPartition(partitionsForTopic, partition));
}
}
}
if (partitions.isEmpty()) {
throw new RuntimeException("Unable to retrieve any partitions for the requested topics " + topics);
}
// we now have a list of partitions which is the same for all parallel consumer instances.
LOG.info("Got {} partitions from these topics: {}", partitions.size(), topics);
if (LOG.isInfoEnabled()) {
logPartitionInfo(LOG, partitions);
}
return partitions;
}
private static List<KafkaTopicPartition> convertToFlinkKafkaTopicPartition(List<PartitionInfo> partitions,int partition) {
checkNotNull(partitions);
List<KafkaTopicPartition> ret = new ArrayList<>(partitions.size());
//for (PartitionInfo pi : partitions) {
ret.add(new KafkaTopicPartition(partitions.get(partition).topic(), partitions.get(partition).partition()));
// }
return ret;
}
private static void setDeserializer(Properties props) {
final String deSerName = ByteArrayDeserializer.class.getCanonicalName();
Object keyDeSer = props.get(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG);
Object valDeSer = props.get(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG);
if (keyDeSer != null && !keyDeSer.equals(deSerName)) {
LOG.warn("Ignoring configured key DeSerializer ({})", ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG);
}
if (valDeSer != null && !valDeSer.equals(deSerName)) {
LOG.warn("Ignoring configured value DeSerializer ({})", ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG);
}
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, deSerName);
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, deSerName);
}
}
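The partition-pinning idea in the answer above can be tried out without any Flink or Kafka classes. The following is only a minimal sketch: the class PartitionPinning and the method pinnedPartition are hypothetical names, and partitions are modelled as plain integers. It assumes, as the answer does, that each consumer instance receives its partition index through the "partitions" property.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

public class PartitionPinning {
    // Hypothetical helper: returns a one-element list holding the partition
    // selected by the "partitions" property, failing loudly on bad input.
    public static List<Integer> pinnedPartition(List<Integer> allPartitions, Properties props) {
        String raw = props.getProperty("partitions");
        if (raw == null) {
            throw new IllegalArgumentException("missing 'partitions' property");
        }
        int index = Integer.parseInt(raw.trim());
        if (index < 0 || index >= allPartitions.size()) {
            throw new IllegalArgumentException("partition index " + index + " out of range");
        }
        List<Integer> result = new ArrayList<>();
        result.add(allPartitions.get(index));
        return result;
    }

    public static void main(String[] args) {
        // Three partitions, three consumers, each pinned to a different index.
        List<Integer> topicPartitions = Arrays.asList(0, 1, 2);
        for (int i = 0; i < topicPartitions.size(); i++) {
            Properties props = new Properties();
            props.setProperty("partitions", String.valueOf(i));
            System.out.println("consumer " + i + " reads " + pinnedPartition(topicPartitions, props));
        }
    }
}
```

The bounds check matters here because the answer's code indexes directly into the partition list; a wrong property value would otherwise surface as an IndexOutOfBoundsException at startup.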

Related

Kafka multiple consumer groups different threads not working as expected

So I am fairly familiar with Kafka and how consumer groups work: 2 consumers in different consumer groups that subscribe to the same topic should each get their own copy of the messages published to that topic.
Consumer per process
This holds true when using 2 different processes for the 2 consumers. I use this code for the producer:
using System;
using System.IO;
using System.Reflection;
using System.Threading.Tasks;
using Confluent.Kafka;
using Confluent.Kafka.Admin;
using NLog;
namespace Producer
{
class Program
{
private static ILogger _logger = LogManager.GetLogger("Global");
private static string topicName = "insane6";
public static async Task Main(string[] args)
{
var config = new ProducerConfig
{
Acks = Acks.Leader,
BootstrapServers = "XXXXXXXX"
};
using (var adminClient = new AdminClientBuilder(new AdminClientConfig { BootstrapServers = config.BootstrapServers }).Build())
{
try
{
adminClient.CreateTopicsAsync(new TopicSpecification[] {
new TopicSpecification { Name = topicName, ReplicationFactor = 1, NumPartitions = 1 } }).ConfigureAwait(false).GetAwaiter().GetResult();
}
catch (CreateTopicsException e)
{
_logger.Error($"An error occurred creating topic {e.Results[0].Topic}: {e.Results[0].Error.Reason}");
}
}
// If serializers are not specified, default serializers from
// `Confluent.Kafka.Serializers` will be automatically used where
// available. Note: by default strings are encoded as UTF8.
using (var p = new ProducerBuilder<Null, string>(config)
// Note: All handlers are called on the main .Consume thread.
.SetErrorHandler((_, e) => _logger.Error($"Error: {e.Reason}"))
.SetStatisticsHandler((_, json) => _logger.Debug($"Statistics: {json}"))
.SetLogHandler((consumer, message) => _logger.Debug($"{message.Level} {message.Message}"))
.Build())
{
Console.WriteLine("Type 'Q' to quit");
while (true)
{
try
{
var dr = await p.ProduceAsync(topicName,
new Message<Null, string>
{ Value =DateTime.UtcNow.ToString("O") });
_logger.Debug($"Delivered '{dr.Value}' to '{dr.TopicPartitionOffset}'");
}
catch (ProduceException<Null, string> e)
{
_logger.Error($"Delivery failed: {e.Error.Reason}");
}
var key = Console.ReadKey();
if (key.Key == ConsoleKey.Q)
{
break;
}
}
}
Console.ReadLine();
}
}
}
And I have this consumer code
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Confluent.Kafka;
namespace Consumer
{
class Program
{
private CancellationTokenSource cts = new CancellationTokenSource();
private KafkaSubscriber kafkaSubscriber;
private string topicName = "insane6";
public static void Main(string[] args)
{
var p = new Program(args[0]);
}
public Program(string consumerGroup)
{
kafkaSubscriber = new KafkaSubscriber(new ConsumerSettings()
{
ConsumerConfig = CreateConfig(consumerGroup),
Topic = topicName,
});
kafkaSubscriber.ReceiveError += KafkaSubscriber_ReceiveError;
kafkaSubscriber.CreateConsumer(cts.Token);
Console.ReadLine();
}
private void KafkaSubscriber_ReceiveError(object sender, System.IO.ErrorEventArgs e)
{
cts = new CancellationTokenSource();
kafkaSubscriber.CreateConsumer(cts.Token);
}
public ConsumerConfig CreateConfig(string consumerGroup)
{
var conf = new ConsumerConfig
{
GroupId = consumerGroup,
BootstrapServers = "XXXXXX",
AutoOffsetReset = AutoOffsetReset.Earliest,
EnableAutoCommit = false,
ClientId = Guid.NewGuid().ToString("N")
};
return conf;
}
}
}
Where the actual subscriber code looks like this
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using Confluent.Kafka;
using NLog;
namespace Consumer
{
class KafkaSubscriber
{
private readonly ConsumerSettings _consumerSettings;
private static ILogger _logger = LogManager.GetLogger("Global");
public KafkaSubscriber(ConsumerSettings consumerSettings)
{
_consumerSettings = consumerSettings;
}
public event EventHandler<ErrorEventArgs> ReceiveError;
protected virtual void OnReceiveError(ErrorEventArgs e)
{
ReceiveError?.Invoke(this, e);
}
private void ErrorHandler(IConsumer<Ignore, string> consumer, Error error)
{
_logger.Error("Kafka ErrorHandler", error);
if (error.IsFatal || error.Code == ErrorCode.Local_TimedOut)
{
_logger.Error("Throwing fatal error code as exception");
throw new KafkaException(error);
}
}
public void CreateConsumer(CancellationToken ct)
{
using (var c = new ConsumerBuilder<Ignore, string>(_consumerSettings.ConsumerConfig)
.SetErrorHandler(ErrorHandler)
.SetStatisticsHandler((_, json) => _logger.Debug($"Statistics: {json}"))
.SetLogHandler((consumer, message) => _logger.Debug($"{message.Level} {message.Message}"))
.SetPartitionsAssignedHandler((c, partitions) =>
{
_logger.Info($"Assigned partitions: [{string.Join(", ", partitions)}]");
})
.SetPartitionsRevokedHandler((c, partitions) =>
{
_logger.Info($"Revoking assignment: [{string.Join(", ", partitions)}]");
})
.Build())
{
c.Subscribe(_consumerSettings.Topic);
//c.Assign();
try
{
var count = 0;
var offsets = new List<ConsumeResult<Ignore, string>>();
while (true)
{
try
{
var cr = c.Consume(ct);
c.Commit(cr);
_logger.Debug(
$"\r\n{_consumerSettings.ConsumerConfig.GroupId} Consumed message '{cr.Message.Value}' at: '{cr.TopicPartitionOffset}'.\r\n");
}
catch (ConsumeException e)
{
_logger.Error($"Error occurred: {e.Error.Reason}");
OnReceiveError(new ErrorEventArgs(e));
}
catch (InvalidProgramException e)
{
_logger.Error($"Error occurred: {e.Message}");
OnReceiveError(new ErrorEventArgs(e));
}
catch (KafkaException kex)
{
_logger.Error($"Error occurred: {kex.Message}");
OnReceiveError(new ErrorEventArgs(kex));
}
}
}
catch (OperationCanceledException)
{
c.Close();
}
}
}
}
}
So if I run a single producer and 2 consumer processes (run from the command line as Consumer.exe "cg1" and Consumer.exe "cg2"), everything works as expected: both consumers get the messages that the producer publishes on the topic.
All good so far, but according to every other StackOverflow answer or Kafka doc I have seen, it should be possible to have a consumer per thread.
Consumer per thread
So I adjusted my bootstrap consumer code to the following. It should be equivalent to running the 2 separate processes: each consumer uses a new consumer group name and runs in its own thread, so there really should be no difference.
public static void Main(string[] args)
{
var p = new Program(new [] { "cat","dog"});
}
public Program(string[] consumerGroups)
{
foreach (var consumerGroup in consumerGroups)
{
var thread = new Thread((x) =>
{
kafkaSubscriber = new KafkaSubscriber(new ConsumerSettings()
{
ConsumerConfig = CreateConfig(consumerGroup),
Topic = topicName,
});
kafkaSubscriber.ReceiveError += KafkaSubscriber_ReceiveError;
kafkaSubscriber.CreateConsumer(cts.Token);
});
thread.Start();
}
Console.ReadLine();
}
Yet when running this code, only 1 of the consumers actually picks up messages from the topic, which is not the behavior I expected at all.
I really can't see anything weird, and I don't think I have missed anything; it all seems correct. Yet only 1 consumer (the "dog" consumer group in this case) sees the produced messages. I would expect the "cat" consumer group to see them as well.
What am I doing wrong?
I am using the official Confluent.Kafka C# client: https://docs.confluent.io/clients-confluent-kafka-dotnet/current/overview.html
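I was being a complete spanner: the consumer variable was not private to the thread (kafkaSubscriber is a field shared by both threads, so the second thread can overwrite it before the first thread uses it). All OK now. Phew.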
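The fix amounts to keeping each subscriber in a variable that is local to its thread. The sketch below illustrates this in Java rather than C#, with no Kafka client involved: Subscriber is a hypothetical stand-in for the KafkaSubscriber class in the question, and "consuming" is reduced to recording the group name.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;

public class PerThreadSubscribers {
    // Hypothetical stand-in for the KafkaSubscriber in the question.
    static final class Subscriber {
        private final String group;
        Subscriber(String group) { this.group = group; }
        String group() { return group; }
    }

    // Starts one thread per consumer group and returns the groups that actually ran.
    public static Set<String> runAll(String[] groups) {
        Set<String> active = new ConcurrentSkipListSet<>();
        Thread[] threads = new Thread[groups.length];
        for (int i = 0; i < groups.length; i++) {
            final String group = groups[i];
            threads[i] = new Thread(() -> {
                // Local variable: no other thread can overwrite it,
                // unlike a field on the enclosing object.
                Subscriber local = new Subscriber(group);
                active.add(local.group());
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for consumers", e);
            }
        }
        return active;
    }

    public static void main(String[] args) {
        System.out.println(runAll(new String[] {"cat", "dog"}));
    }
}
```

With a shared field instead of the local variable, both threads could end up calling into whichever subscriber was assigned last, which matches the symptom of only one group receiving messages.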

How do I configure a Kafka consumer to not resume from where it left off?

I have a Kafka consumer in Golang. I don't want to resume from where I left off last time, but rather start from the current message. How can I do it?
You can set enable.auto.commit to false and auto.offset.reset to latest for your consumer group id. This means Kafka will not automatically commit your offsets.
With auto commit disabled, your consumer group's progress is not saved (unless you commit manually). So whenever the consumer is restarted for whatever reason, it does not find any saved progress and resets to the latest offset.
Set a new group.id for your consumer.
Then use auto.offset.reset to define the behavior of this new consumer group; in your case: latest.
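The two suggestions above can be combined into one set of consumer properties. This is a minimal sketch in Java using only java.util.Properties; the class name LatestOnlyConfig and the "fresh-" group prefix are made up for illustration. The property keys themselves are standard Kafka consumer settings, but how they are passed to a Go client depends on the library in use.

```java
import java.util.Properties;
import java.util.UUID;

public class LatestOnlyConfig {
    // Builds consumer properties for a group that has no committed offsets
    // and therefore starts at the latest offset on every run.
    public static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // A fresh group id per run means there is never saved progress to resume from.
        props.setProperty("group.id", "fresh-" + UUID.randomUUID());
        // Do not save progress automatically.
        props.setProperty("enable.auto.commit", "false");
        // With no committed offset, start from the end of each partition.
        props.setProperty("auto.offset.reset", "latest");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("localhost:9092");
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + "=" + props.getProperty(key));
        }
    }
}
```

Note that auto.offset.reset only takes effect when the group has no committed offset for a partition, which is why it is paired with either a new group id or disabled commits.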
The Apache Kafka consumer API provides a method called kafkaConsumer.seekToEnd(), which can be used to skip the existing messages and consume only messages published after the consumer has started, without changing the consumer's current group ID.
Below is an implementation of this. The program takes 3 arguments: topic name, group ID, and starting offset (0 to start from the beginning, -1 to receive only messages published after the consumer has started, -2 to leave the offset alone; any other value tells the consumer to start from that offset).
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.errors.WakeupException;
import java.util.*;
public class Consumer {
private static Scanner in;
public static void main(String[] argv)throws Exception{
if (argv.length != 3) {
System.err.printf("Usage: %s <topicName> <groupId> <startingOffset>\n",
Consumer.class.getSimpleName());
System.exit(-1);
}
in = new Scanner(System.in);
String topicName = argv[0];
String groupId = argv[1];
final long startingOffset = Long.parseLong(argv[2]);
ConsumerThread consumerThread = new ConsumerThread(topicName,groupId,startingOffset);
consumerThread.start();
String line = "";
while (!line.equals("exit")) {
line = in.next();
}
consumerThread.getKafkaConsumer().wakeup();
System.out.println("Stopping consumer .....");
consumerThread.join();
}
private static class ConsumerThread extends Thread{
private String topicName;
private String groupId;
private long startingOffset;
private KafkaConsumer<String,String> kafkaConsumer;
public ConsumerThread(String topicName, String groupId, long startingOffset){
this.topicName = topicName;
this.groupId = groupId;
this.startingOffset=startingOffset;
}
public void run() {
Properties configProperties = new Properties();
configProperties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// keys are declared as String in KafkaConsumer<String,String>, so deserialize them as strings
configProperties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
configProperties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
configProperties.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
configProperties.put(ConsumerConfig.CLIENT_ID_CONFIG, "offset123");
configProperties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG,false);
configProperties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG,"earliest");
//Figure out where to start processing messages from
kafkaConsumer = new KafkaConsumer<String, String>(configProperties);
kafkaConsumer.subscribe(Arrays.asList(topicName), new ConsumerRebalanceListener() {
public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
System.out.printf("%s topic-partitions are revoked from this consumer\n", Arrays.toString(partitions.toArray()));
}
public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
System.out.printf("%s topic-partitions are assigned to this consumer\n", Arrays.toString(partitions.toArray()));
Iterator<TopicPartition> topicPartitionIterator = partitions.iterator();
while(topicPartitionIterator.hasNext()){
TopicPartition topicPartition = topicPartitionIterator.next();
System.out.println("Current offset is " + kafkaConsumer.position(topicPartition) + " committed offset is ->" + kafkaConsumer.committed(topicPartition) );
if(startingOffset == -2) {
System.out.println("Leaving it alone");
}else if(startingOffset ==0){
System.out.println("Setting offset to beginning");
kafkaConsumer.seekToBeginning(topicPartition);
}else if(startingOffset == -1){
System.out.println("Setting it to the end ");
kafkaConsumer.seekToEnd(topicPartition);
}else {
System.out.println("Resetting offset to " + startingOffset);
kafkaConsumer.seek(topicPartition, startingOffset);
}
}
}
});
//Start processing messages
try {
while (true) {
ConsumerRecords<String, String> records = kafkaConsumer.poll(100);
for (ConsumerRecord<String, String> record : records) {
System.out.println(record.value());
}
if(startingOffset == -2)
kafkaConsumer.commitSync();
}
}catch(WakeupException ex){
System.out.println("Exception caught " + ex.getMessage());
}finally{
kafkaConsumer.close();
System.out.println("After closing KafkaConsumer");
}
}
public KafkaConsumer<String,String> getKafkaConsumer(){
return this.kafkaConsumer;
}
}
}

Unable to get number of messages in kafka topic

I am fairly new to Kafka. I have created a sample producer and consumer in Java. Using the producer, I was able to send data to a Kafka topic, but I am not able to get the number of records in the topic using the following consumer code.
public class ConsumerTests {
public static void main(String[] args) throws Exception {
BasicConfigurator.configure();
String topicName = "MobileData";
String groupId = "TestGroup";
Properties properties = new Properties();
properties.put("bootstrap.servers", "localhost:9092");
properties.put("group.id", groupId);
properties.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
properties.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> kafkaConsumer = new KafkaConsumer<>(properties);
kafkaConsumer.subscribe(Arrays.asList(topicName));
try {
while (true) {
ConsumerRecords<String, String> consumerRecords = kafkaConsumer.poll(100);
System.out.println("Record count is " + consumerRecords.count());
}
} catch (WakeupException e) {
// ignore for shutdown
} finally {
kafkaConsumer.close();
}
}
}
I don't get any exception in the console, but consumerRecords.count() always returns 0, even if there are messages in the topic. Please let me know if I am missing something needed to get the record count.
The poll(...) call should normally be in a loop. It's always possible for the initial poll(...) to return no data (depending on the timeout) while the partition assignment is in progress. Here's an example:
try {
while (true) {
ConsumerRecords<String, String> records = consumer.poll(100);
System.out.println("Record count is " + records.count());
}
} catch (WakeupException e) {
// ignore for shutdown
} finally {
consumer.close();
}
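Note that records.count() reports only the size of one poll batch, not the size of the topic. One way to estimate the number of records in a topic is to subtract each partition's beginning offset from its end offset and sum the results. The sketch below shows only that arithmetic on plain maps; TopicSize and countRecords are hypothetical names. It assumes the offsets were fetched separately, for example with the Java client's KafkaConsumer.beginningOffsets and KafkaConsumer.endOffsets in client versions that provide them. On compacted topics the result is an upper bound rather than an exact count.

```java
import java.util.HashMap;
import java.util.Map;

public class TopicSize {
    // Sums (end - beginning) over all partitions; the maps are keyed by partition id.
    public static long countRecords(Map<Integer, Long> beginning, Map<Integer, Long> end) {
        long total = 0L;
        for (Map.Entry<Integer, Long> entry : end.entrySet()) {
            // A partition missing from the beginning map is treated as starting at 0.
            long start = beginning.getOrDefault(entry.getKey(), 0L);
            total += Math.max(0L, entry.getValue() - start);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Integer, Long> beginning = new HashMap<>();
        Map<Integer, Long> end = new HashMap<>();
        beginning.put(0, 0L);
        end.put(0, 5L);    // partition 0 holds offsets 0..4
        beginning.put(1, 2L);
        end.put(1, 10L);   // partition 1 holds offsets 2..9
        System.out.println("Approximate record count: " + countRecords(beginning, end));
    }
}
```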

Kafka Consumer committing manually based on a condition.

My @KafkaListener consumer commits only once a specific condition is met. Let us say a topic gets the following data from a producer:
"Message 0" at offset[0]
"Message 1" at offset[1]
They are received by the consumer and committed with the help of acknowledgment.acknowledge().
Then the messages below arrive on the topic:
"Message 2" at offset[2]
"Message 3" at offset[3]
The running consumer receives the above data. Here the condition fails and the above offsets are not committed.
Even if new data arrives on the topic, "Message 2" and "Message 3" should still be picked up by some consumer in the same consumer group, because they are not committed. But this is not happening; the consumer picks up the new message instead.
When I restart my consumer, I get Message 2 and Message 3 back. This should have happened while the consumers were running.
The code is as follows:
KafkaConsumerConfig file
@Configuration
@EnableKafka
public class KafkaConsumerConfig {
@Bean
KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory());
factory.setConcurrency(3);
factory.setBatchListener(true);
factory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL_IMMEDIATE);
factory.getContainerProperties().setSyncCommits(true);
return factory;
}
@Bean
public ConsumerFactory<String, String> consumerFactory() {
return new DefaultKafkaConsumerFactory<>(consumerConfigs());
}
@Bean
public Map<String, Object> consumerConfigs() {
Map<String, Object> propsMap = new HashMap<>();
propsMap.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
propsMap.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
propsMap.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "100");
propsMap.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "15000");
propsMap.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
propsMap.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
propsMap.put(ConsumerConfig.GROUP_ID_CONFIG, "group1");
propsMap.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
propsMap.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG,"1");
return propsMap;
}
@Bean
public Listener listener() {
return new Listener();
}
}
Listener class
public class Listener {
public CountDownLatch countDownLatch0 = new CountDownLatch(3);
private Logger LOGGER = LoggerFactory.getLogger(Listener.class);
static int count0 =0;
@KafkaListener(topics = "abcdefghi", group = "group1", containerFactory = "kafkaListenerContainerFactory")
public void listenPartition0(String data, @Header(KafkaHeaders.RECEIVED_PARTITION_ID) List<Integer> partitions,
@Header(KafkaHeaders.OFFSET) List<Long> offsets, Acknowledgment acknowledgment) throws InterruptedException {
count0 = count0 + 1;
LOGGER.info("start consumer 0");
LOGGER.info("received message via consumer 0='{}' with partition-offset='{}'", data, partitions + "-" + offsets);
if (count0%2 ==0)
acknowledgment.acknowledge();
LOGGER.info("end of consumer 0");
}
}
How can I achieve my desired result?
That's correct. The offset is just a number, which the consumer instance easily tracks in memory, so a running consumer keeps moving forward whether or not it commits. Committed offsets are needed for consumers that newly take over the same partitions in the group. That's why you get the uncommitted messages back when you restart the application or when a rebalance happens for the group.
To make it work the way you would like, consider implementing ConsumerSeekAware in your listener and calling ConsumerSeekCallback.seek() with the offset you would like to start consuming from in the next poll cycle.
http://docs.spring.io/spring-kafka/docs/2.0.0.M2/reference/html/_reference.html#seek:
public class Listener implements ConsumerSeekAware {
private final ThreadLocal<ConsumerSeekCallback> seekCallBack = new ThreadLocal<>();
@Override
public void registerSeekCallback(ConsumerSeekCallback callback) {
this.seekCallBack.set(callback);
}
@KafkaListener()
public void listen(...) {
this.seekCallBack.get().seek(topic, partition, 0);
}
}
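The difference between the in-memory position and the committed offset can be illustrated with a small model. This is a simplified sketch, not Kafka or Spring code: OffsetModel and its methods are hypothetical, and a single partition is reduced to two counters.

```java
public class OffsetModel {
    private long position;   // next offset to fetch; lives only in the running consumer
    private long committed;  // offset stored for the group; survives restarts

    // Delivers the record at the current position and advances, committed or not.
    public long poll() {
        return position++;
    }

    // Stores the current position as the group's progress.
    public void commit() {
        committed = position;
    }

    // Moves the in-memory position, as a seek callback would.
    public void seek(long offset) {
        position = offset;
    }

    // Models a restart or rebalance: the position is reloaded from the committed offset.
    public void restart() {
        position = committed;
    }

    public long committed() {
        return committed;
    }

    public static void main(String[] args) {
        OffsetModel consumer = new OffsetModel();
        consumer.poll();   // Message 0
        consumer.poll();   // Message 1
        consumer.commit(); // acknowledged
        consumer.poll();   // Message 2, not committed
        consumer.poll();   // Message 3, not committed
        System.out.println("next without seek: " + consumer.poll());
        consumer.seek(consumer.committed());
        System.out.println("next after seek:   " + consumer.poll());
    }
}
```

In this model, skipping the commit never rewinds the position on its own; only restart() or an explicit seek() does, which mirrors the behavior described in the question.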

Kafka Producer Consumer API Issue

I am using Kafka v0.10.0.0 and have written producer and consumer code in Java, but the code gets stuck on producer.send without any exception in the logs.
Can anyone please help? Thanks in advance.
I am using/modifying the "MapR Kafka sample programs". You can look at the full code here:
https://github.com/panwars87/kafka-sample-programs
Important: I changed the Kafka client version to 0.10.0.0 in the Maven dependencies and am running Kafka 0.10.0.0 locally.
public class Producer {
public static void main(String[] args) throws IOException {
// set up the producer
KafkaProducer<String, String> producer;
System.out.println("Starting Producers....");
try (InputStream props = Resources.getResource("producer.props").openStream()) {
Properties properties = new Properties();
properties.load(props);
producer = new KafkaProducer<>(properties);
System.out.println("Property loaded successfully ....");
}
try {
for (int i = 0; i < 20; i++) {
// send lots of messages
System.out.println("Sending record one by one....");
producer.send(new ProducerRecord<String, String>("fast-messages","sending message - "+i+" to fast-message."));
System.out.println(i+" message sent....");
// every so often send to a different topic
if (i % 2 == 0) {
producer.send(new ProducerRecord<String, String>("fast-messages","sending message - "+i+" to fast-message."));
producer.send(new ProducerRecord<String, String>("summary-markers","sending message - "+i+" to summary-markers."));
producer.flush();
System.out.println("Sent msg number " + i);
}
}
} catch (Throwable throwable) {
System.out.printf("%s", throwable.getStackTrace());
throwable.printStackTrace();
} finally {
producer.close();
}
}
}
public class Consumer {
public static void main(String[] args) throws IOException {
// and the consumer
KafkaConsumer<String, String> consumer;
try (InputStream props = Resources.getResource("consumer.props").openStream()) {
Properties properties = new Properties();
properties.load(props);
if (properties.getProperty("group.id") == null) {
properties.setProperty("group.id", "group-" + new Random().nextInt(100000));
}
consumer = new KafkaConsumer<>(properties);
}
consumer.subscribe(Arrays.asList("fast-messages", "summary-markers"));
int timeouts = 0;
//noinspection InfiniteLoopStatement
while (true) {
// read records with a short timeout. If we time out, we don't really care.
ConsumerRecords<String, String> records = consumer.poll(200);
if (records.count() == 0) {
timeouts++;
} else {
System.out.printf("Got %d records after %d timeouts\n", records.count(), timeouts);
timeouts = 0;
}
for (ConsumerRecord<String, String> record : records) {
switch (record.topic()) {
case "fast-messages":
System.out.println("Record value for fast-messages is :"+ record.value());
break;
case "summary-markers":
System.out.println("Record value for summary-markers is :"+ record.value());
break;
default:
throw new IllegalStateException("Shouldn't be possible to get message on topic " + record.topic());
}
}
}
}
}
The code you're running is for a demo of MapR, which is not Kafka. MapR claims API compatibility with Kafka 0.9, but even then MapR treats message offsets differently than Kafka does (offsets are byte offsets of messages rather than incremental offsets), among other differences. The MapR implementation is also very, very different, to say the least. This means that, if you're lucky, a Kafka 0.9 app might just happen to run on MapR and vice versa. There is no such guarantee for other releases.
Thank you, everyone, for all your input. I resolved this by tweaking the MapR code and referring to a few other posts. Link to the solution API:
https://github.com/panwars87/hadoopwork/tree/master/kafka/kafka-api