Is there a way to read only new (unread) messages in a Kafka consumer?

When my consumer subscribes to a topic and starts consuming messages, it reads the messages from the beginning. Is there any way to read only unread messages? This is the code I used for consuming the messages.
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "test");
props.put("enable.auto.commit", "true");
props.put("auto.commit.interval.ms", "1000");
props.put("session.timeout.ms", "30000");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer","org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
// The Kafka consumer subscribes to a list of topics here.
consumer.subscribe(Arrays.asList(topicName));
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        System.out.printf("offset = %d, key = %s, value= %s\n", record.offset(), record.key(), record.value());
    }
}

Turn off enable.auto.commit and manually commit each message's offset after your app has successfully processed it. That way, if the app crashes, it will restart at exactly the last committed offset.
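The commit-after-processing idea in this answer can be sketched without a broker. The class below is a toy in-memory model, not the Kafka client API: the list stands in for one partition, the `committed` field stands in for the group's stored offset, and all names are hypothetical. One detail it tries to show is that the committed value is the offset of the next record to read (last processed offset + 1), which is also what you would pass to `commitSync` in a real consumer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: no Kafka classes are involved, all names are hypothetical.
final class CommitAfterProcessDemo {
    private final List<String> log;   // stands in for one topic partition
    private long committed = 0;       // stands in for the group's committed offset

    CommitAfterProcessDemo(List<String> log) {
        this.log = log;
    }

    // Processes records starting at the committed offset. Stopping after
    // maxRecords simulates a crash; a later call simulates a restart.
    List<String> run(int maxRecords) {
        List<String> processed = new ArrayList<>();
        for (long offset = committed;
             offset < log.size() && processed.size() < maxRecords;
             offset++) {
            processed.add(log.get((int) offset));
            committed = offset + 1;   // commit the NEXT offset to read
        }
        return processed;
    }

    long committedOffset() {
        return committed;
    }

    public static void main(String[] args) {
        CommitAfterProcessDemo consumer =
                new CommitAfterProcessDemo(Arrays.asList("a", "b", "c", "d"));
        System.out.println(consumer.run(2));   // first run, "crashes" after two records
        System.out.println(consumer.run(10));  // restart resumes at the committed offset
    }
}
```

In this model the second run sees only the records that were not processed before the simulated crash, which is the behaviour the question asks for.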

I tried your code, and it reads only unread messages. What are your client and server versions?

Related

Spring kafka manual offset commit does not work as expected

In my Kafka consumer application, the configuration is as below, and I am using Spring Kafka version 2.8.
Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaConfig.getBootStrapServer());
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.IntegerDeserializer");
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "io.confluent.kafka.serializers.KafkaAvroDeserializer");
props.put(ConsumerConfig.GROUP_ID_CONFIG, CONSUMER_GROUP_ID);
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, MAX_POLL_RECORDS);
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
The Kafka listener container factory is configured as below, and I am using MANUAL_IMMEDIATE acknowledgement.
@Bean
public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<Integer, Order_Response>> kafkaListenerContainerFactory() throws SSMUtilFailedException {
    ConcurrentKafkaListenerContainerFactory<Integer, Order_Response> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.setConcurrency(1);
    factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE);
    factory.getContainerProperties().setSyncCommits(true);
    factory.getContainerProperties().setCommitLogLevel(LogIfLevelEnabled.Level.INFO);
    return factory;
}
My Kafka listener looks like this. Here I manually acknowledge all the consumed records.
@KafkaListener(topics = KAFKA_CONSUME_TOPIC)
public void listenForOrderResponses(ConsumerRecord<Integer, record> consumedRecord, Acknowledgment ack) {
    ack.acknowledge();
}
When I forcefully crash the consumer application (the JVM) and start it again, the consumer does not fetch records from the last committed offset. It misses some of the messages, and the offset has increased. I want to consume from the last committed offset. Could you please tell me what is missing here?

Kafka Transactional read committed Consumer

I have a transactional producer and a normal producer in my application, both of which write to the topic kafka-topic as below.
Configuration for transactional Kafka Producer
@Bean
public Map<String, Object> producerConfigs() {
    Map<String, Object> props = new HashMap<>();
    // List of host:port pairs used for establishing the initial connections to the Kafka cluster
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    props.put(ProducerConfig.RETRIES_CONFIG, 5);
    /* The amount of time to wait before attempting to retry a failed request to a given topic partition.
     * This avoids repeatedly sending requests in a tight loop under some failure scenarios. */
    props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 3);
    /* The maximum amount of time the client will wait for the response of a request.
     * If the response is not received before the timeout elapses, the client will resend
     * the request if necessary, or fail the request if retries are exhausted. */
    props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 1);
    /* To avoid duplicate messages */
    props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
    /* Wait for acknowledgement from the broker and all replicas */
    props.put(ProducerConfig.ACKS_CONFIG, "all");
    /* Kafka transactional properties */
    props.put(ProducerConfig.CLIENT_ID_CONFIG, "transactional-producer");
    props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "test-transactional-id"); // set transaction id
    return props;
}

@Bean
public KafkaProducer<String, String> kafkaProducer() {
    return new KafkaProducer<>(producerConfigs());
}
The normal producer config is the same, except that ProducerConfig.CLIENT_ID_CONFIG and ProducerConfig.TRANSACTIONAL_ID_CONFIG are not set.
Consumer config is as below
@Bean
public Map<String, Object> consumerConfigs() {
    Map<String, Object> props = new HashMap<>();
    // List of host:port pairs used for establishing the initial connections to the Kafka cluster
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    // Allows a pool of processes to divide the work of consuming and processing records
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "kafka_group");
    // Automatically reset the offset to the earliest offset
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    // Auto commit is set to false; offsets are committed manually
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
    /* Kafka transactional property: controls how to read messages written transactionally
     * read_committed - poll only transactional messages which have been committed
     * read_uncommitted - return all messages, even transactional messages
     * default is read_uncommitted
     */
    props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
    return props;
}

@Bean
public ConsumerFactory<String, String> consumerFactory() {
    return new DefaultKafkaConsumerFactory<>(consumerConfigs());
}
Since I am setting isolation.level to read_committed, I expected the consumer to consume only transactional messages from the subscribed topic.
But it is consuming both transactional and non-transactional messages from the topic.
Am I missing any configuration that would make the consumer consume only transactional messages from the subscribed topic?
Thanks in advance :-)
It doesn't work that way. isolation.level only pertains to records committed by transactional producers. All consumers see records published by non-transactional producers.
You need to use two different topics to get the behavior you desire.
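To make the answer concrete, here is a small in-memory model of which records each isolation level returns. It is only a sketch: the `Kind` enum, the `Rec` class, and `visible` are hypothetical names, no Kafka classes are used, and transactions that are still open are deliberately left out of the model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical types, not part of the Kafka client.
final class IsolationLevelDemo {
    enum Kind { NON_TRANSACTIONAL, TXN_COMMITTED, TXN_ABORTED }

    static final class Rec {
        final String value;
        final Kind kind;

        Rec(String value, Kind kind) {
            this.value = value;
            this.kind = kind;
        }
    }

    // read_committed hides only records from aborted transactions;
    // records from non-transactional producers are returned either way.
    static List<String> visible(List<Rec> log, boolean readCommitted) {
        List<String> out = new ArrayList<>();
        for (Rec r : log) {
            boolean hidden = readCommitted && r.kind == Kind.TXN_ABORTED;
            if (!hidden) {
                out.add(r.value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> log = Arrays.asList(
                new Rec("plain", Kind.NON_TRANSACTIONAL),
                new Rec("tx-ok", Kind.TXN_COMMITTED),
                new Rec("tx-aborted", Kind.TXN_ABORTED));
        System.out.println(visible(log, true));   // read_committed
        System.out.println(visible(log, false));  // read_uncommitted
    }
}
```

Neither setting filters out the non-transactional record, which is why a separate topic is the suggested way to keep the two kinds of traffic apart.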

Getting the number of partitions in the producer in kafka 0.8.0 using partitionsFor method

Is the partitionsFor method supported in the producer in Kafka version 0.8.0? I want to use this method to get the number of partitions for a given Kafka topic.
If this method is not available in Kafka 0.8.0, what is the easiest way to get the number of partitions in the producer in this specific version of Kafka?
You can also use the listTopics() method:
ArrayList<Topics> topicList = new ArrayList<Topics>();
Properties props = new Properties();
Map<String, List<PartitionInfo>> topics;
Topics topic;
InputStream input = getClass().getClassLoader().getResourceAsStream("kafkaCluster.properties");
try {
    props.load(input);
    props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, props.getProperty("BOOTSTRAP_SERVERS_CONFIG"));
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
    // Maps each topic name to the list of its partitions
    topics = consumer.listTopics();
    // System.out.println(topics.get(topics));
    for (Map.Entry<String, List<PartitionInfo>> entry : topics.entrySet()) {
        System.out.println("Key = " + entry.getKey() + ", Value = " + entry.getValue());
        topic = new Topics();
        topic.setTopic_name(entry.getKey());
        // The partition count is the size of the PartitionInfo list
        topic.setPartitions(Integer.toString(entry.getValue().size()));
        topicList.add(topic);
    }
} catch (IOException e) {
    e.printStackTrace();
}
Why not try the following approach?
https://stackoverflow.com/a/35458605/5922904
ZkUtils also has a getPartitionsForTopics method that can be used, although I have not tried or tested it myself.

Kafka consumer does not start from latest message

I want to have a Kafka Consumer which starts from the latest message in a topic.
Here is the java code:
private static Properties properties = new Properties();
private static KafkaConsumer<String, String> consumer;

static {
    properties.setProperty("bootstrap.servers", "localhost");
    properties.setProperty("enable.auto.commit", "true");
    properties.setProperty("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    properties.setProperty("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    properties.setProperty("group.id", "test");
    properties.setProperty("auto.offset.reset", "latest");
    consumer = new KafkaConsumer<>(properties);
    consumer.subscribe(Collections.singletonList("mytopic"));
}

@Override
public StreamHandler call() throws Exception {
    while (true) {
        ConsumerRecords<String, String> consumerRecords = consumer.poll(200);
        Iterable<ConsumerRecord<String, String>> records = consumerRecords.records("mytopic");
        for (ConsumerRecord<String, String> rec : records) {
            System.out.println(rec.value());
        }
    }
}
Although the value for auto.offset.reset is latest, the consumer starts from messages that are 2 days old and then catches up with the latest messages.
What am I missing?
Have you run this same code before with the same group.id? The auto.offset.reset parameter is only used if there is not an existing offset already stored for your consumer. So if you've run the example previously, say two days ago, and then you run it again, it will start from the last consumed position.
Use seekToEnd() if you would like to manually go to the end of the topic.
See https://stackoverflow.com/a/32392174/1392894 for a slightly more thorough discussion of this.
If you want to manually control the position of your offsets you need to set enable.auto.commit = false.
If you want to position all offsets to the end of each partition then call seekToEnd()
https://kafka.apache.org/0102/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#seekToEnd(java.util.Collection)
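The rule described in the answers above (a stored offset wins, and auto.offset.reset only applies when there is none) can be written down as a small pure function. This is a sketch of the decision, not Kafka code: `startPosition` and its parameters are hypothetical names, only the earliest and latest policies are modelled, and the case of a stored offset that has fallen out of range is ignored.

```java
import java.util.OptionalLong;

// Toy model only: mirrors the rule from the answers, not a Kafka API.
final class StartPositionDemo {

    static long startPosition(OptionalLong committed, String autoOffsetReset,
                              long beginningOffset, long endOffset) {
        if (committed.isPresent()) {
            // A stored offset for this group exists, so the reset policy is not consulted.
            return committed.getAsLong();
        }
        switch (autoOffsetReset) {
            case "earliest":
                return beginningOffset;
            case "latest":
                return endOffset;
            default:
                throw new IllegalArgumentException("policy not modelled: " + autoOffsetReset);
        }
    }

    public static void main(String[] args) {
        // Group "test" ran two days ago and left offset 42 behind.
        System.out.println(startPosition(OptionalLong.of(42), "latest", 0, 1000));
        // A brand-new group.id has no stored offset, so "latest" applies.
        System.out.println(startPosition(OptionalLong.empty(), "latest", 0, 1000));
    }
}
```

Under this model, either using a fresh group.id or calling seekToEnd() is what moves the starting point to the end of the partition when an old offset is already stored.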

How to pass Integer value to kafka producer and read it back on kafka consumer console using IntegerSerializer in Kafka

I am trying to send an Integer value through a Kafka producer using the Kafka-provided IntegerSerializer, but the integer value is not parsed correctly and is displayed as random unknown symbols on the Kafka consumer console.
public static void main(String[] args) throws Exception {
    // Check that a topic name was passed as an argument
    if (args.length == 0) {
        System.out.println("Enter topic name");
        return;
    }
    // Assign the topic name to a string variable
    String topicName = args[0];
    // Create a Properties instance to hold the producer configs
    Properties props = new Properties();
    // Broker address
    props.put("bootstrap.servers", "localhost:9092");
    // Set acknowledgements for producer requests
    props.put("acks", "all");
    // Number of automatic retries if a request fails (0 disables retrying)
    props.put("retries", "0");
    // Batch size in bytes
    props.put("batch.size", "16384");
    // Wait up to 1 ms so that records can be batched, reducing the number of requests
    props.put("linger.ms", "1");
    // buffer.memory controls the total amount of memory available to the producer for buffering
    props.put("buffer.memory", "33554432");
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer", "org.apache.kafka.common.serialization.IntegerSerializer");
    KafkaProducer<String, Integer> producerRcrd = new KafkaProducer<String, Integer>(props);
    producerRcrd.send(new ProducerRecord<String, Integer>(topicName, "Key1", 100));
    System.out.println("Message sent successfully");
    producerRcrd.flush();
    producerRcrd.close();
}
} // closes the enclosing class, whose declaration is not shown
With this code, the Kafka consumer console does not show 100.
Append the following to your kafka-console-consumer.sh command so that the console message formatter knows how to deserialize the message body:
--property key.deserializer=org.apache.kafka.common.serialization.StringDeserializer --property value.deserializer=org.apache.kafka.common.serialization.IntegerDeserializer
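The reason the value shows up as odd symbols can be reproduced in plain Java. IntegerSerializer writes an int as 4 big-endian bytes, and a console that treats those bytes as text prints three NUL characters followed by the character with code 100. The sketch below uses only `ByteBuffer` to mimic that layout; the class and method names are hypothetical and no Kafka classes are involved.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Plain-Java illustration; names are hypothetical and no Kafka classes are used.
final class IntegerWireFormatDemo {

    // Encodes an int as 4 big-endian bytes, the layout IntegerSerializer produces.
    static byte[] encode(int value) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
    }

    // Decodes 4 big-endian bytes back into an int, as an integer deserializer would.
    static int decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt();
    }

    // What a reader gets when the same bytes are interpreted as UTF-8 text.
    static String asText(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] payload = encode(100);
        System.out.println(Arrays.toString(payload));      // [0, 0, 0, 100]
        System.out.println(decode(payload));               // 100
        System.out.println((int) asText(payload).charAt(3)); // 100, i.e. the letter 'd'
    }
}
```

Decoding with the matching integer deserializer recovers 100, while the text interpretation yields unprintable characters plus a stray `d`.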