@KafkaListener vs ConsumerFactory groupId - apache-kafka

I followed the "Intro to Apache Kafka with Spring" tutorial on baeldung.com.
I set up a KafkaConsumerConfig class with the kafkaConsumerFactory method:
private ConsumerFactory<String, String> kafkaConsumerFactory(String groupId) {
    Map<String, Object> props = new HashMap<>();
    ...
    props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
    ...
    return new DefaultKafkaConsumerFactory<>(props);
}
and two "custom" factories:
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> fooKafkaListenerContainerFactory() {
    return kafkaListenerContainerFactory("foo");
}
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> barKafkaListenerContainerFactory() {
    return kafkaListenerContainerFactory("bar");
}
In the MessageListener class, I used the @KafkaListener annotation to register consumers with a given groupId on a topic:
@KafkaListener(topics = "${message.topic.name}", groupId = "foo", containerFactory = "fooKafkaListenerContainerFactory")
public void listenGroupFoo(String message) {
    System.out.println("Received Message in group 'foo': " + message);
    ...
}
@KafkaListener(topics = "${message.topic.name}", groupId = "bar", containerFactory = "barKafkaListenerContainerFactory")
public void listenGroupBar(String message) {
    System.out.println("Received Message in group 'bar': " + message);
    ...
}
This way there are two groups of consumers: those with groupId "foo" and those with groupId "bar".
Now suppose I change the container factory for the "foo" consumers from fooKafkaListenerContainerFactory to barKafkaListenerContainerFactory, like this:
@KafkaListener(topics = "${message.topic.name}", groupId = "foo", containerFactory = "barKafkaListenerContainerFactory")
public void listenGroupFoo(String message) {
    ...
}
There now seems to be a mismatch between the groupId of the @KafkaListener and the groupId of the container factory, yet nothing changes.
So what I'm trying to understand is what the props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId); property does and why it seems to be ignored.

The factory groupId is a default which is only used if there is no groupId (or id) on the @KafkaListener.
In early versions, it was only possible to set the groupId on the factory, which meant you needed a separate factory for each listener if different groups are needed, which defeats the idea of a factory that can be used for multiple listeners.
See the javadocs...
/**
 * Override the {@code group.id} property for the consumer factory with this value
 * for this listener only.
 * <p>SpEL {@code #{...}} and property place holders {@code ${...}} are supported.
 * @return the group id.
 * @since 1.3
 */
String groupId() default "";
/**
 * When {@link #groupId() groupId} is not provided, use the {@link #id() id} (if
 * provided) as the {@code group.id} property for the consumer. Set to false, to use
 * the {@code group.id} from the consumer factory.
 * @return false to disable.
 * @since 1.3
 */
boolean idIsGroup() default true;
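To make the precedence concrete, here is a small plain-Java model of the rule in those javadocs. It is only a sketch: the class and method names (GroupIdResolution, resolve) are made up for illustration, and this is not Spring Kafka's actual code. It assumes the order stated in the javadocs: the annotation's groupId first, then the annotation's id when idIsGroup is true, and the factory's group.id last.

```java
public class GroupIdResolution {

    /**
     * Picks the effective group.id for one listener.
     * Order modeled here (an assumption taken from the javadocs):
     * annotation groupId, then annotation id if idIsGroup is true,
     * then the consumer factory default.
     */
    public static String resolve(String annotationGroupId, String annotationId,
                                 boolean idIsGroup, String factoryGroupId) {
        if (annotationGroupId != null && !annotationGroupId.isEmpty()) {
            return annotationGroupId;
        }
        if (idIsGroup && annotationId != null && !annotationId.isEmpty()) {
            return annotationId;
        }
        return factoryGroupId;
    }

    public static void main(String[] args) {
        // groupId = "foo" on the listener, factory configured with "bar"
        System.out.println(resolve("foo", "", true, "bar"));         // foo
        // no groupId, id = "myListener", idIsGroup left at its default
        System.out.println(resolve("", "myListener", true, "bar"));  // myListener
        // no groupId, id present, idIsGroup = false
        System.out.println(resolve("", "myListener", false, "bar")); // bar
    }
}
```

The first call is the situation from the question: the listener declares groupId = "foo" but points at the factory built with "bar", and it still joins group foo, which is why nothing appears to change.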

Related

multiple kafka consumers with same group id

I'm new to Kafka. I have created a Kafka consumer with Spring Boot (spring-kafka dependency). In my app I use ConsumerFactory and ProducerFactory beans for configuration, and I created the consumer like this:
@RetryableTopic(
    attempts = "3",
    backoff = @Backoff(delay = 1000, multiplier = 2.0),
    autoCreateTopics = "false")
@KafkaListener(topics = "myTopic", groupId = "myGroupId")
public void consume(@Payload(required = false) String message) {
    processMessage(message);
}
My configuration is as follows:
@Bean
public ConsumerFactory<String, Object> consumerFactory() {
    Map<String, Object> config = new HashMap<>();
    config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, env.getProperty("kafka.consumer.bootstrap.servers"));
    config.put(ConsumerConfig.GROUP_ID_CONFIG, env.getProperty("kafka.consumer.group"));
    config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    return new DefaultKafkaConsumerFactory<>(config);
}
@Bean
public ConcurrentKafkaListenerContainerFactory<String, Object> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, Object> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.getContainerProperties().setCommitLogLevel(LogIfLevelEnabled.Level.DEBUG);
    factory.getContainerProperties().setMissingTopicsFatal(false);
    return factory;
}
@Bean
public ProducerFactory<String, String> producerFactory() {
    Map<String, Object> config = new HashMap<>();
    config.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, env.getProperty("kafka.consumer.bootstrap.servers"));
    config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    return new DefaultKafkaProducerFactory<>(config);
}
@Bean
public KafkaTemplate<String, String> kafkaTemplate() {
    return new KafkaTemplate<>(producerFactory());
}
I want to consume in parallel, since I may receive a lot of messages. From what I have read, parallel consumption requires multiple partitions for the topic and a consumer for each partition. Say I have 10 partitions for my topic; then I can have 10 consumers in the same consumer group, each reading one partition. I understand this behavior. My concern is how to create several consumers in my application.
Do I have to write multiple Kafka consumers using @KafkaListener with the same functionality? In that case, do I have to write the method below X times if I need X identical consumers?
@RetryableTopic(
    attempts = "3",
    backoff = @Backoff(delay = 1000, multiplier = 2.0),
    autoCreateTopics = "false")
@KafkaListener(topics = "myTopic", groupId = "myGroupId")
public void consume(@Payload(required = false) String message) {
    processMessage(message);
}
What options or configs do I need to achieve parallel consumption with multiple consumers?
Thank you in advance.
@KafkaListener has this option:
/**
 * Override the container factory's {@code concurrency} setting for this listener. May
 * be a property placeholder or SpEL expression that evaluates to a {@link Number}, in
 * which case {@link Number#intValue()} is used to obtain the value.
 * <p>SpEL {@code #{...}} and property place holders {@code ${...}} are supported.
 * @return the concurrency.
 * @since 2.2
 */
String concurrency() default "";
See more in docs: https://docs.spring.io/spring-kafka/reference/html/#kafka-listener-annotation
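For the listener in the question that would mean adding concurrency = "10" to the existing @KafkaListener rather than copying the method; the container then runs that many consumers in the same group. As a rough illustration of how partitions end up spread over those consumers, here is a plain-Java sketch. It assumes the group's partitions are divided as evenly as possible, which is what Kafka's standard assignors do for a single topic. The class and method names are hypothetical and nothing here calls Kafka.

```java
import java.util.ArrayList;
import java.util.List;

public class ConcurrencySketch {

    /**
     * Number of partitions each of `concurrency` consumers would own,
     * assuming an even spread (an assumption, not a Kafka API call).
     */
    public static List<Integer> partitionsPerConsumer(int partitions, int concurrency) {
        List<Integer> counts = new ArrayList<>();
        int base = partitions / concurrency;   // every consumer gets at least this many
        int extra = partitions % concurrency;  // the first `extra` consumers get one more
        for (int i = 0; i < concurrency; i++) {
            counts.add(base + (i < extra ? 1 : 0));
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(partitionsPerConsumer(10, 10)); // [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
        System.out.println(partitionsPerConsumer(10, 4));  // [3, 3, 2, 2]
        System.out.println(partitionsPerConsumer(10, 12)); // ten 1s followed by two 0s
    }
}
```

With more consumers than partitions the surplus ones sit idle, so a concurrency above the partition count buys nothing. If several instances of the application run at once, the partitions are shared across all of their consumers.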

How to pass groupId value in @KafkaListener from database?

I want to connect my Spring MVC application to a Kafka server to consume Kafka messages. For this I have written the KafkaConsumer class below.
@Service
public class KafkaConsumer {
    @KafkaListener(groupId = "my-group-id", topicPattern = "VID.*", containerFactory = SystemParameterConstants.KAFKA_LISTENER_CONTAINER_FACTORY)
    public void receivedMessage(@Payload String message) {
        logger.info("================ receivedMessage() ==================");
        logger.info("::: Message received from kafka ::: {}", message);
        ObjectMapper objectMapper = new ObjectMapper();
        try {
            ...
        } catch (JsonProcessingException e) {
            e.printStackTrace();
        }
    }
}
Here I have hard-coded the group id "my-group-id", but I want to read it from the DB so that I can have a different groupId for each environment.
Please suggest a solution. Thanks!
See its JavaDocs:
/**
 * Override the {@code group.id} property for the consumer factory with this value
 * for this listener only.
 * <p>SpEL {@code #{...}} and property place holders {@code ${...}} are supported.
 * @return the group id.
 * @since 1.3
 */
String groupId() default "";
If you have a bean that reads the value from the DB, you can reference it like this:
groupId = "#{myDbBean.groupIdFromDb}"
See also docs: https://docs.spring.io/spring-kafka/docs/current/reference/html/#annotation-properties
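A minimal sketch of what such a bean could look like, in plain Java. Everything here is an assumption for illustration: the class name MyDbBean, the getter getGroupIdFromDb, and the Supplier standing in for the real database query. In a Spring application the class would be registered as a bean named myDbBean, and SpEL resolves a property reference on that bean through the matching getter, so the property name in the expression has to match whatever getter the bean really exposes.

```java
import java.util.function.Supplier;

public class MyDbBean {

    // Hypothetical stand-in for the real database lookup.
    private final Supplier<String> groupIdQuery;

    public MyDbBean(Supplier<String> groupIdQuery) {
        this.groupIdQuery = groupIdQuery;
    }

    // A SpEL reference to the property "groupIdFromDb" on this bean calls this getter.
    public String getGroupIdFromDb() {
        return groupIdQuery.get();
    }

    public static void main(String[] args) {
        // In the application the supplier would run the actual query.
        MyDbBean bean = new MyDbBean(() -> "my-group-id-dev");
        System.out.println(bean.getGroupIdFromDb()); // my-group-id-dev
    }
}
```

The expression on @KafkaListener is evaluated when the listener is registered at startup, so a later change in the DB should not move an already running listener to another group.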

Is it possible to assign one Spring Kafka consumer to one instance and another consumer to another instance of the same service?

I have two Kafka listeners like below:
// topics takes an array; a single "foo1, foo2" string would be read as one topic name
@KafkaListener(topics = {"foo1", "foo2"}, groupId = foo.id, id = "foo")
public void fooTopics(@Header(KafkaHeaders.RECEIVED_TOPIC) String topic, String message, Acknowledgment acknowledgment) {
    //processing
}
@KafkaListener(topics = {"Bar1", "Bar2"}, groupId = bar.id, id = "bar")
public void barTopics(@Header(KafkaHeaders.RECEIVED_TOPIC) String topic, String message, Acknowledgment acknowledgment) {
    //processing
}
The same application runs on two instances, inc1 and inc2. Is there a way to assign the foo listener to inc1 and the bar listener to inc2, so that if one instance goes down, both listeners (foo and bar) are assigned to the running instance?
You can use the @KafkaListener property autoStartup, available since 2.2.
When an instance dies, you can start its listener on the other instance like so:
@Autowired
private KafkaListenerEndpointRegistry registry;
...
@KafkaListener(topics = {"foo1", "foo2"}, groupId = foo.id, id = "foo", autoStartup = "false")
public void fooTopics(@Header(KafkaHeaders.RECEIVED_TOPIC) String topic, String message, Acknowledgment acknowledgment) {
    //processing
}
// Once your start-up condition is met:
registry.getListenerContainer("foo").start();

Prioritizing Kafka topic

I need to read all messages from topic1 and only then read the messages from topic2. Messages arrive in these topics once a day. I managed to hold off reading topic2 until all messages in topic1 have been read, but this works only once, when the server starts. Can someone help me with this scenario?
ListenerConfig code
@EnableKafka
@Configuration
public class ListenerConfig {
    @Value("${spring.kafka.bootstrap-servers}")
    private String bootstrapServers;
    @Bean
    public Map<String, Object> consumerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "batch");
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "5");
        return props;
    }
    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        return new DefaultKafkaConsumerFactory<>(consumerConfigs());
    }
    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        factory.setBatchListener(true);
        return factory;
    }
    @Bean("kafkaListenerContainerTopic1Factory")
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerTopic1Factory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        factory.getContainerProperties().setIdleEventInterval(60000L);
        factory.setBatchListener(true);
        return factory;
    }
    @Bean("kafkaListenerContainerTopic2Factory")
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerTopic2Factory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        factory.setBatchListener(true);
        return factory;
    }
}
Listener code
@Service
public class Listener {
    private static final Logger LOG = LoggerFactory.getLogger(Listener.class);
    @Autowired
    private KafkaListenerEndpointRegistry registry;
    @KafkaListener(id = "first-listener", topics = "topic1", containerFactory = "kafkaListenerContainerTopic1Factory")
    public void receive(@Payload List<String> messages,
            @Header(KafkaHeaders.RECEIVED_PARTITION_ID) List<Integer> partitions,
            @Header(KafkaHeaders.OFFSET) List<Long> offsets) {
        for (int i = 0; i < messages.size(); i++) {
            LOG.info("received first='{}' with partition-offset='{}'",
                    messages.get(i), partitions.get(i) + "-" + offsets.get(i));
        }
    }
    @KafkaListener(id = "second-listener", topics = "topic2", containerFactory = "kafkaListenerContainerTopic2Factory", autoStartup = "false")
    public void receiveRel(@Payload List<String> messages,
            @Header(KafkaHeaders.RECEIVED_PARTITION_ID) List<Integer> partitions,
            @Header(KafkaHeaders.OFFSET) List<Long> offsets) {
        for (int i = 0; i < messages.size(); i++) {
            LOG.info("received second='{}' with partition-offset='{}'",
                    messages.get(i), partitions.get(i) + "-" + offsets.get(i));
        }
    }
    @EventListener()
    public void eventHandler(ListenerContainerIdleEvent event) {
        LOG.info("Inside event");
        this.registry.getListenerContainer("second-listener").start();
    }
}
Kindly help me resolve this, as the cycle should happen every day: read topic1 completely, then read topic2.
You are already using an idle event listener to start the second listener - it should also stop the first listener.
When the second listener goes idle, stop it.
You should be checking which container the event is for to decide which container to stop and/or start.
Then, using a TaskScheduler, schedule a start() of the first listener at the next time you want it to start.
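The start/stop decisions above can be sketched in plain Java as follows. This is only a model of the control flow, not Spring code: Container is a hypothetical stand-in for the listener containers you get from the KafkaListenerEndpointRegistry, and onIdle would be called from your idle event handler with the listener id carried by the event. The prefix match assumes that the id reported for a concurrent container's child carries a suffix, such as "first-listener-0".

```java
import java.util.ArrayList;
import java.util.List;

public class IdleCycleSketch {

    /** Hypothetical stand-in for a listener container's lifecycle methods. */
    public interface Container {
        void start();
        void stop();
    }

    /** Test double that records the calls it receives. */
    public static class Recording implements Container {
        private final String name;
        private final List<String> log;

        public Recording(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void start() {
            log.add(name + ".start");
        }

        @Override
        public void stop() {
            log.add(name + ".stop");
        }
    }

    private final Container first;
    private final Container second;

    public IdleCycleSketch(Container first, Container second) {
        this.first = first;
        this.second = second;
    }

    /** Decide what to stop and start based on which container went idle. */
    public void onIdle(String listenerId) {
        if (listenerId.startsWith("first-listener")) {
            first.stop();    // topic1 is drained for this cycle
            second.start();  // now read topic2
        } else if (listenerId.startsWith("second-listener")) {
            second.stop();   // topic2 is drained; wait for the next cycle
        }
    }

    /** Called by the scheduler when the next daily cycle is due. */
    public void onSchedule() {
        first.start();
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        IdleCycleSketch cycle = new IdleCycleSketch(
                new Recording("first", log), new Recording("second", log));
        cycle.onIdle("first-listener-0");
        cycle.onIdle("second-listener-0");
        cycle.onSchedule();
        System.out.println(log); // [first.stop, second.start, second.stop, first.start]
    }
}
```

Note that the Topic2 factory in the question sets no idle event interval, so it needs setIdleEventInterval as well before the second container will publish idle events. Also, if I remember the reference docs correctly, stop() should not be called on the thread that delivers the idle event; hand it off to another thread.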
A topic in Kafka is an abstraction to which a stream of records is published. Streams are naturally unbounded: they have a start but no defined end. For your case, you first need to define clearly what counts as the end of topic1 and of topic2, so that you can stop/resume your consumers when needed. Maybe you know how many messages you will process for each topic, in which case you can use position or committed to stop one consumer and resume the other at that moment. Or, if you are using a streaming framework, it usually offers session windows, where the framework groups elements by sessions of activity. You may also prefer to put that logic on the application side so that you don't need to stop/start any consumer threads.

Spring Kafka get assigned partitions

I know I can find out which partition a record comes from, but is there any way to find out dynamically which partitions are assigned to the consumers at a specific moment? Maybe I need to implement some listener to detect and track partition assignment?
I am using spring-kafka 1.3.2 with ConcurrentKafkaListenerContainerFactory and @KafkaListener.
Yes, you can do:
@Bean
public ConsumerAwareRebalanceListener rebalanceListener() {
    return new ConsumerAwareRebalanceListener() {
        @Override
        public void onPartitionsAssigned(Consumer<?, ?> consumer, Collection<TopicPartition> partitions) {
            // partitions holds the partitions just assigned to this consumer
        }
    };
}
And then add it, for example, to ConcurrentKafkaListenerContainerFactory:
@Bean
public ConcurrentKafkaListenerContainerFactory<Object, Object> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<Object, Object> factory = new ConcurrentKafkaListenerContainerFactory<>();
    ContainerProperties props = factory.getContainerProperties();
    props.setConsumerRebalanceListener(rebalanceListener());
    return factory;
}
I did it a different way, using KafkaListenerEndpointRegistry:
// partitions is a Collection<Integer> declared elsewhere
for (MessageListenerContainer messageListenerContainer : kafkaListenerEndpointRegistry.getListenerContainers()) {
    List<KafkaMessageListenerContainer> containers = ((ConcurrentMessageListenerContainer) messageListenerContainer).getContainers();
    List<TopicPartition> topicPartitions = (List<TopicPartition>) containers.stream().flatMap(kafkaMessageListenerContainer ->
            kafkaMessageListenerContainer.getAssignedPartitions().stream()).collect(Collectors.toList());
    partitions.addAll(topicPartitions.stream().map(TopicPartition::partition).collect(Collectors.toList()));
}