I have been using Kafka 0.8 and am trying to move to Kafka 0.10.
We have a topic with 10 partitions and used to create a consumer group with 10 consumers, as shown below.
public void run(int a_numThreads) {
    Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
    topicCountMap.put(topic, new Integer(a_numThreads));
    Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer.createMessageStreams(topicCountMap);
    List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(topic);
    // now launch all the threads
    //
    executor = Executors.newFixedThreadPool(a_numThreads);
    // now create an object to consume the messages
    //
    int threadNumber = 0;
    for (final KafkaStream stream : streams) {
        executor.execute(new ConsumerTest(stream, threadNumber));
        threadNumber++;
    }
}
Here we passed the number of threads based on the number of partitions.
But with the Kafka 0.10 consumer, I am not sure whether there is anything like that; it doesn't return streams per partition.
public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "192.168.33.10:9092");
    props.put("group.id", "group-1");
    props.put("enable.auto.commit", "true");
    props.put("auto.commit.interval.ms", "1000");
    props.put("auto.offset.reset", "earliest");
    props.put("session.timeout.ms", "30000");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    KafkaConsumer<String, String> kafkaConsumer = new KafkaConsumer<>(props);
    kafkaConsumer.subscribe(Arrays.asList("HelloKafkaTopic"));
    while (true) {
        ConsumerRecords<String, String> records = kafkaConsumer.poll(100);
        for (ConsumerRecord<String, String> record : records) {
            System.out.printf("offset = %d, value = %s", record.offset(), record.value());
            System.out.println();
        }
    }
}
Thanks in Advance
The new consumer is designed so that a single thread can handle all of its I/O simply and efficiently, which is quite different from the old consumer. See this blog post for further details:
https://www.confluent.io/blog/tutorial-getting-started-with-the-new-apache-kafka-0-9-consumer-client/
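With the new consumer there is no call that hands back one stream per partition. The usual replacement is to start one KafkaConsumer per thread, all with the same group.id, and let the group split the partitions among them. As a rough illustration of what that split looks like for a single topic, here is a minimal sketch of range-style arithmetic (contiguous blocks, with the first few consumers taking one extra partition). The class and method names are made up for this example, and it only models the arithmetic, not the Kafka client itself; which assignment strategy is actually in effect depends on the client configuration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Kafka): models how a range-style split
// would divide the partitions of one topic across the consumers of a group.
final class RangeSplitSketch {

    // Returns, for each consumer index, the list of partition ids it would own.
    static List<List<Integer>> assign(int numPartitions, int numConsumers) {
        if (numPartitions < 0 || numConsumers <= 0) {
            throw new IllegalArgumentException("need numPartitions >= 0 and numConsumers > 0");
        }
        int base = numPartitions / numConsumers;   // every consumer gets at least this many
        int extra = numPartitions % numConsumers;  // the first 'extra' consumers get one more
        List<List<Integer>> result = new ArrayList<>();
        int next = 0;
        for (int c = 0; c < numConsumers; c++) {
            int count = base + (c < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                owned.add(next++);
            }
            result.add(owned);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(assign(10, 10)); // one partition per consumer
        System.out.println(assign(10, 3));  // uneven split
    }
}
```

With 10 partitions and 10 consumers, each consumer ends up with exactly one partition, which matches the old one-thread-per-stream setup. Any consumers beyond the partition count would sit idle.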
I have a Kafka consumer:
consumer.subscribe(statusTopicList);
try {
    ConsumerRecords<String, String> consumerRecords =
            consumer.poll(Duration.ofSeconds(60));
    System.out.println("SIZE IS: " + consumerRecords.count());
    for (ConsumerRecord<String, String> record : consumerRecords) {
        System.out.println("Record is: " + record.value());
    }
} catch (Exception e) {
    e.printStackTrace();
} finally {
    consumer.close();
}
And in my unit tests:
((MockConsumer<String, String>) consumer)
.addRecord(new ConsumerRecord<>(
topic, 1, 0, "test-application1", record1));
((MockConsumer<String, String >) consumer)
.addRecord( new ConsumerRecord<>(
topic, 1, 0, "test-application2", record2));
consumerRecords still has a size of 1 even though I’m adding 2 records. How can I read both messages in one poll?
My consumer properties are:
private static Consumer<String, SecondaryJoinStatus> createStatusConsumer() {
    Properties props = new Properties();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
            bootstrapServers);
    props.put(ConsumerConfig.GROUP_ID_CONFIG,
            groupId);
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
            StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
            StringDeserializer.class);
    props.put("schema.registry.url",
            schemaRegistryUrl);
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG,
            "earliest");
    props.put("max.partition.fetch.bytes", 5242880);
    return new KafkaConsumer<>(props);
}
Use the MAX_POLL_RECORDS_CONFIG setting to control how many records a single poll can return:
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 5);
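Separately from max.poll.records, it may be worth checking the offsets used in the test: both records are added to partition 1 with offset 0. If the mock tracks a position per partition and only returns records at or beyond it (this is an assumption about MockConsumer's behaviour, not something confirmed in this thread), the second record would be skipped. A toy model of that rule, with made-up class and method names and no Kafka dependency:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only, NOT the Kafka MockConsumer: a single partition whose poll()
// returns records at or beyond the current position and then advances it.
final class OffsetPositionSketch {

    // Minimal stand-in for a consumer record: just an offset and a value.
    static final class Rec {
        final long offset;
        final String value;

        Rec(long offset, String value) {
            this.offset = offset;
            this.value = value;
        }
    }

    private final List<Rec> added = new ArrayList<>();
    private long position = 0; // next offset this "consumer" expects to read

    void addRecord(long offset, String value) {
        added.add(new Rec(offset, value));
    }

    // Returns the values of all buffered records whose offset is >= position.
    List<String> poll() {
        List<String> out = new ArrayList<>();
        for (Rec r : added) {
            if (r.offset >= position) {
                out.add(r.value);
                position = r.offset + 1; // a repeated offset now falls behind the position
            }
        }
        added.clear();
        return out;
    }

    public static void main(String[] args) {
        OffsetPositionSketch sameOffsets = new OffsetPositionSketch();
        sameOffsets.addRecord(0, "record1");
        sameOffsets.addRecord(0, "record2");
        System.out.println(sameOffsets.poll().size());

        OffsetPositionSketch distinctOffsets = new OffsetPositionSketch();
        distinctOffsets.addRecord(0, "record1");
        distinctOffsets.addRecord(1, "record2");
        System.out.println(distinctOffsets.poll().size());
    }
}
```

If that is what is happening, giving the second record offset 1 in the unit test should make both records show up in one poll.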
In Kafka, I need to consume a topic with two partitions from two consumers (partition 1 to consumer 1 and partition 2 to consumer 2) using Java.
This is my producer code:
public class KafkaClientOperationProducer {
    KafkaClientOperationConsumer kac = new KafkaClientOperationConsumer();

    public void initiateProducer(ClientOperation clientOperation,
            ClientOperationManager activityManager, Logger logger) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092,localhost:9093,localhost:9094");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        Producer<String, ClientOperation> producer = new KafkaProducer<>(props);
        try {
            ProducerRecord<String, ClientOperation> record = new ProducerRecord<String, ClientOperation>(
                    topicName, key, clientOperation);
            producer.send(record);
        } finally {
            producer.flush();
            producer.close();
            kac.initiateConsumer(activityManager); // Calling Consumer
        }
    }
}
This is my consumer code:
public class KafkaClientOperationConsumer {
    String topicName = "CA_Topic";
    String groupName = "CA_TopicGroup";

    public void initiateConsumer(ClientOperationManager activityManager) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092,localhost:9093,localhost:9094");
        props.put("group.id", groupName);
        props.put("enable.auto.commit", "true");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        KafkaConsumer<String, ClientOperation> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Arrays.asList(topicName));
        ConsumerRecords<String, ClientOperation> records = consumer.poll(100);
        try {
            for (ConsumerRecord<String, ClientOperation> record : records) {
                activityManager.save(record.value()); // saves data in database
            }
        } finally {
            consumer.close();
        }
    }
}
The above code works fine for a single consumer, but not for multiple consumers.
clientOperation is an object that holds data about a client operation.
The topic has three partitions (which you can see from the code). When I tried to call initiateConsumer from threads (using an ExecutorService executor), I got duplicate values in the database.
Please change my code so that I can consume CA_Topic using two consumers. I can't use two JVMs due to memory constraints. Thanks in advance.
I think you need to use the KafkaConsumer.assign method. Here is a small example:
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "ip:port");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "group_id");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.ByteArrayDeserializer");
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.ByteArrayDeserializer");
final Consumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
// Topic name and partition id to assign to this consumer.
// Each other consumer must be given a different partition id (anything other than 0).
TopicPartition topicPartition = new TopicPartition("topic", 0);
List<TopicPartition> partitionList = new ArrayList<TopicPartition>();
partitionList.add(topicPartition);
consumer.assign(partitionList); // assigns partition 0 to this consumer
See the Kafka documentation for details: https://kafka.apache.org/0110/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#assign(java.util.Collection)
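Because assign bypasses the group's automatic balancing, each consumer has to be told its partitions explicitly. For the case in the question (three partitions, two consumers in one JVM), one simple split is round-robin by partition id. A minimal sketch of that bookkeeping, using a hypothetical helper and plain integers; each resulting id would then be wrapped in a TopicPartition and passed to assign:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Kafka): decides which partition ids each
// manually assigned consumer should take, round-robin by partition id.
final class ManualPartitionSplit {

    // Partition ids for the consumer at consumerIndex (0-based) out of numConsumers.
    static List<Integer> partitionsFor(int consumerIndex, int numConsumers, int numPartitions) {
        if (numConsumers <= 0 || consumerIndex < 0 || consumerIndex >= numConsumers || numPartitions < 0) {
            throw new IllegalArgumentException("invalid consumer index or counts");
        }
        List<Integer> owned = new ArrayList<>();
        // Step through the partition ids in strides of numConsumers so no id is shared.
        for (int p = consumerIndex; p < numPartitions; p += numConsumers) {
            owned.add(p);
        }
        return owned;
    }

    public static void main(String[] args) {
        // Three partitions shared by two consumers, as in the question.
        System.out.println("consumer 0 -> " + partitionsFor(0, 2, 3));
        System.out.println("consumer 1 -> " + partitionsFor(1, 2, 3));
    }
}
```

Since a KafkaConsumer instance is not safe to share between threads, each consumer should be created, assigned, and polled on its own thread. Keeping the partition sets disjoint, as above, is what prevents two consumers from reading the same records.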
The Kafka consumer poll API does not return records when the timeout is low.
If I increase the timeout value passed to poll, the records do arrive.
I am not able to understand this behaviour. Please help; the code follows:
public ConsumerRecords<String, Map<String, String>> subscribeToQueue(String topic, QueueListener q) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "com.intuit.eventcollection.queue.KafkaJsonDeserializer");
    props.put("group.id", "test");
    props.put("enable.auto.commit", "true");
    props.put("auto.commit.interval.ms", "1000");
    props.put("session.timeout.ms", "30000");
    props.put("auto.offset.reset", "earliest");
    // Figure out where to start processing messages from
    KafkaConsumer<String, Map<String, String>> kafkaConsumer = new KafkaConsumer<String, Map<String, String>>(
            props);
    kafkaConsumer.subscribe(Arrays.asList(topic));
    ConsumerRecords<String, Map<String, String>> records = null;
    // Start processing messages
    try {
        records = kafkaConsumer.poll(100);
poll returns an empty result if no new, unconsumed messages become available within the time period passed as the timeout to poll(timeout).
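A single poll with a short timeout can also come back empty simply because the consumer may still be joining the group, looking up its starting offsets, or waiting on its first fetch, so the usual pattern is to call poll in a loop rather than once. A minimal sketch of that loop follows; pollOnce is a stand-in for a call such as kafkaConsumer.poll(100), and the helper name and attempt limit are invented for this example.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper (not part of Kafka): keeps polling until data arrives
// or the attempt budget runs out.
final class PollLoopSketch {

    // pollOnce stands in for something like kafkaConsumer.poll(100).
    static <T> List<T> pollUntilData(Supplier<List<T>> pollOnce, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            List<T> batch = pollOnce.get();
            if (!batch.isEmpty()) {
                return batch; // first non-empty batch wins
            }
        }
        return Collections.emptyList(); // nothing arrived within the budget
    }

    public static void main(String[] args) {
        // Fake poll: empty on the first two calls, one record on the third.
        int[] calls = {0};
        Supplier<List<String>> fakePoll = () ->
                ++calls[0] < 3 ? Collections.<String>emptyList()
                               : Collections.singletonList("event-1");
        System.out.println(pollUntilData(fakePoll, 10));
    }
}
```

In a long-running consumer the loop would normally run until shutdown rather than stop at the first batch; the attempt limit here just keeps the sketch finite.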
I have implemented a high level consumer per the example page: https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
When the code runs, it only consumes half of the messages produced. I have a basic 3-node ZooKeeper cluster and 2 Kafka brokers. When I run the simple consumer code (not the high-level consumer), all the messages are consumed. Any ideas would be appreciated.
Consumer code
public void run() {
    Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
    topicCountMap.put("test", 2);
    Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer.createMessageStreams(topicCountMap);
    List<KafkaStream<byte[], byte[]>> streams = consumerMap.get("test");
    executor = Executors.newFixedThreadPool(2);
    int threadNumber = 0;
    for (final KafkaStream stream : streams) {
        executor.submit(new Consumer(stream, threadNumber));
        threadNumber++;
    }
}

private static ConsumerConfig createConsumerConfig() {
    Properties props = new Properties();
    props.put("zookeeper.connect", "zookeeper01:2181,zookeeper02:2181,zookeeper03:2181");
    props.put("group.id", "Consumers");
    props.put("zookeeper.session.timeout.ms", "10000");
    props.put("enable.auto.commit", "true");
    props.put("zookeeper.sync.time.ms", "1000");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("auto.commit.interval.ms", "500");
    return new ConsumerConfig(props);
}
I am trying to run this Kafka consumer code in Java for a particular topic, but it is not receiving any messages from that topic. The server is running on a different Windows machine. Please help me out.
{
    Properties props = new Properties();
    props.put("bootstrap.servers", "10.100.144.157:2181");
    props.put("group.id", "test");
    props.put("enable.auto.commit", "false");
    props.put("auto.commit.interval.ms", "1000");
    props.put("session.timeout.ms", "30000");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    consumer.subscribe(Arrays.asList("test"));
    final int minBatchSize = 200;
    List<ConsumerRecord<String, String>> buffer = new ArrayList<>();
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(100);
        for (ConsumerRecord<String, String> record : records) {
            buffer.add(record);
        }
        if (buffer.size() >= minBatchSize) {
            insertIntoDb(buffer);
            consumer.commitSync();
            buffer.clear();
        }
    }
}