Spring Kafka get assigned partitions - apache-kafka

I know I can find out which partition a record came from, but I wonder: is there any way to dynamically get which partitions are assigned to the consumers at a specific moment? Maybe I need to implement some listener to detect and track partition assignment info?
I am using spring-kafka 1.3.2 with ConcurrentKafkaListenerContainerFactory and @KafkaListener.

Yes, you can do:
@Bean
public ConsumerAwareRebalanceListener rebalanceListener() {
    return new ConsumerAwareRebalanceListener() {

        @Override
        public void onPartitionsAssigned(Consumer<?, ?> consumer, Collection<TopicPartition> partitions) {
            // the partitions assigned to this consumer are available here
        }

    };
}
And then add it, for example, to the ConcurrentKafkaListenerContainerFactory:
@Bean
public ConcurrentKafkaListenerContainerFactory<Object, Object> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<Object, Object> factory = new ConcurrentKafkaListenerContainerFactory<>();
    ContainerProperties props = factory.getContainerProperties();
    props.setConsumerRebalanceListener(rebalanceListener());
    return factory;
}
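If you also want to look the assignment up later, not just react to the rebalance, one option is to stash it from the callback. A minimal sketch, assuming a simple field on the configuration class is enough for your case (the field name is illustrative and not part of the original answer):
// keeps the most recent assignment reported by the rebalance listener
private volatile Collection<TopicPartition> currentlyAssigned = Collections.emptyList();

@Override
public void onPartitionsAssigned(Consumer<?, ?> consumer, Collection<TopicPartition> partitions) {
    currentlyAssigned = partitions; // overwritten on every rebalance
}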

I did it in a different way, using the KafkaListenerEndpointRegistry:
for (MessageListenerContainer messageListenerContainer : kafkaListenerEndpointRegistry.getListenerContainers()) {
    List<TopicPartition> topicPartitions = ((ConcurrentMessageListenerContainer<?, ?>) messageListenerContainer)
            .getContainers()
            .stream()
            .flatMap(container -> container.getAssignedPartitions().stream())
            .collect(Collectors.toList());
    partitions.addAll(topicPartitions.stream()
            .map(TopicPartition::partition)
            .collect(Collectors.toList()));
}

Related

Multiple Kafka consumers with the same group id

I'm new to Kafka. I have created a Kafka consumer with Spring Boot (the spring-kafka dependency). In my app I have used consumerFactory and producerFactory beans for config. So in my application I have created the Kafka consumer like below.
@RetryableTopic(
        attempts = "3",
        backoff = @Backoff(delay = 1000, multiplier = 2.0),
        autoCreateTopics = "false")
@KafkaListener(topics = "myTopic", groupId = "myGroupId")
public void consume(@Payload(required = false) String message) {
    processMessage(message);
}
My configs are like below
@Bean
public ConsumerFactory<String, Object> consumerFactory() {
    Map<String, Object> config = new HashMap<>();
    config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, env.getProperty("kafka.consumer.bootstrap.servers"));
    config.put(ConsumerConfig.GROUP_ID_CONFIG, env.getProperty("kafka.consumer.group"));
    config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    return new DefaultKafkaConsumerFactory<>(config);
}

@Bean
public ConcurrentKafkaListenerContainerFactory<String, Object> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, Object> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.getContainerProperties().setCommitLogLevel(LogIfLevelEnabled.Level.DEBUG);
    factory.getContainerProperties().setMissingTopicsFatal(false);
    return factory;
}

@Bean
public ProducerFactory<String, String> producerFactory() {
    Map<String, Object> config = new HashMap<>();
    config.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, env.getProperty("kafka.consumer.bootstrap.servers"));
    config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    return new DefaultKafkaProducerFactory<>(config);
}

@Bean
public KafkaTemplate<String, String> kafkaTemplate() {
    return new KafkaTemplate<>(producerFactory());
}
So I want to consume in parallel, since I may get more messages. What I found about consuming topics in parallel is that I need to create multiple partitions for the topic and a consumer for each partition. Let's say I have 10 partitions for my topic; then I can have 10 consumers in the same consumer group, each reading one partition. I understand this behavior. But my concern is how I can create several consumers in my application.
Do I have to write multiple Kafka consumers using @KafkaListener with the same functionality? In that case, do I have to write the method below X times if I need X identical consumers?
@RetryableTopic(
        attempts = "3",
        backoff = @Backoff(delay = 1000, multiplier = 2.0),
        autoCreateTopics = "false")
@KafkaListener(topics = "myTopic", groupId = "myGroupId")
public void consume(@Payload(required = false) String message) {
    processMessage(message);
}
What are the options or configs that I need to achieve parallel consuming with multiple consumers?
Thank you in advance.
The @KafkaListener annotation has this option:
/**
 * Override the container factory's {@code concurrency} setting for this listener. May
 * be a property placeholder or SpEL expression that evaluates to a {@link Number}, in
 * which case {@link Number#intValue()} is used to obtain the value.
 * <p>SpEL {@code #{...}} and property placeholders {@code ${...}} are supported.
 * @return the concurrency.
 * @since 2.2
 */
String concurrency() default "";
See more in docs: https://docs.spring.io/spring-kafka/reference/html/#kafka-listener-annotation
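So, assuming your topic already has enough partitions, a sketch of the listener from the question with three concurrent consumers could look like this (the value "3" is just an example and could also come from a property placeholder):
@RetryableTopic(
        attempts = "3",
        backoff = @Backoff(delay = 1000, multiplier = 2.0),
        autoCreateTopics = "false")
@KafkaListener(topics = "myTopic", groupId = "myGroupId", concurrency = "3")
public void consume(@Payload(required = false) String message) {
    processMessage(message);
}
Each of the three container threads joins the same consumer group, so Kafka spreads the topic's partitions across them; concurrency higher than the partition count just leaves the extra consumers idle.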

Kafka Listener and Consumer not invoking

I'm building a simple Kafka application with a producer and a consumer. I'm sending a string through Postman and pushing it through the topic. The topic is receiving the message, but the consumer isn't consuming it.
KafkaConsumerConfig.java
@EnableKafka
@Configuration
@ConditionalOnProperty(name = "kafka.enabled", havingValue = "true")
public class KafkaConsumerConfig {

    @Bean
    public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        return factory;
    }

    @Bean
    public Map<String, Object> config() {
        Map<String, Object> config = new HashMap<>();
        config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
        config.put(ConsumerConfig.GROUP_ID_CONFIG, "group_Id");
        config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        config.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        config.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        config.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "100");
        config.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000");
        return config;
    }

    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        return new DefaultKafkaConsumerFactory<>(config());
    }
}
KafkaConsumerService.java
@Service
@ConditionalOnProperty(name = "kafka.enabled", havingValue = "true")
@Component
public class KafkaConsumerService {

    private static final Logger log = LoggerFactory.getLogger(KafkaConsumerService.class);

    private static final String TOPIC = "Kafka_Test";

    @KafkaListener(topics = TOPIC, groupId = "group_Id")
    public void consumeOTP(String otp) {
        log.debug("The OTP Sent to Kafka is:" + otp);
    }
}
Based on your question I am assuming you're using spring-kafka with Spring Boot. For a simple example, with this setup you can avoid all the bean configuration and use the default beans Spring Boot auto-configures for Spring Kafka, so you can basically do the whole setup in the application.yml file. There's a better explanation in this post, but basically:
Producer:
@Service
public class SimpleProducer {

    private KafkaTemplate<String, String> simpleProducer;

    public SimpleProducer(KafkaTemplate<String, String> simpleProducer) {
        this.simpleProducer = simpleProducer;
    }

    public void send(String message) {
        simpleProducer.send("simple-message", message);
    }
}
Consumer:
@Slf4j
@Service
public class SimpleConsumer {

    @KafkaListener(id = "simple-consumer", topics = "simple-message")
    public void consumeMessage(String message) {
        log.info("Consumer got message: {}", message);
    }
}
An API so you can trigger producing a message:
@RestController
@RequestMapping("/api")
public class MessageApi {

    private final SimpleProducer simpleProducer;

    public MessageApi(SimpleProducer simpleProducer) {
        this.simpleProducer = simpleProducer;
    }

    @PostMapping("/message")
    public ResponseEntity<String> message(@RequestBody String message) {
        simpleProducer.send(message);
        return ResponseEntity.ok("Message received: " + message);
    }
}
Because you're using the defaults with String as key and String as value, you don't even have to add any specific configuration to the Spring Boot properties or YAML files.
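If you do need to override anything (for example the broker address), a minimal application.yml sketch could look like this (the address and group id below are placeholders, not values taken from your setup):
spring:
  kafka:
    bootstrap-servers: localhost:9092
    consumer:
      group-id: simple-consumer-group
      auto-offset-reset: earliest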

Topology with no input topics will create no stream threads and no global thread

I am writing a Kafka Streams application, and I would like to include two application ids in this application, but I keep getting an error saying "Topology with no input topics will create no stream threads and no global thread, must subscribe to at least one source topic or global table." Could you please let me know where I made a mistake? Thank you so much!
public class KafkaStreamsConfigurations {
    ...

    @Bean(name = KafkaStreamsDefaultConfiguration.DEFAULT_STREAMS_CONFIG_BEAN_NAME)
    @Primary
    public KafkaStreamsConfiguration kStreamsConfigs() {
        Map<String, Object> props = new HashMap<>();
        setDefaults(props);
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "default");
        return new KafkaStreamsConfiguration(props);
    }

    public void setDefaults(Map<String, Object> props) {...}

    @Bean("snowplowStreamBuilder")
    public StreamsBuilderFactoryBean streamsBuilderFactoryBean() {
        Map<String, Object> props = new HashMap<>();
        setDefaults(props);
        ...
        props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 0);
        props.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 1);
        Properties properties = new Properties();
        props.forEach(properties::put);
        StreamsBuilderFactoryBean streamsBuilderFactoryBean = new StreamsBuilderFactoryBean();
        streamsBuilderFactoryBean.setStreamsConfiguration(properties);
        return streamsBuilderFactoryBean;
    }
}
Here is my application class.
public class SnowplowStreamsApp {

    @Bean("snowplowStreamsApp")
    public KStream<String, String>[] startProcessing(
            @Qualifier("snowplowStreamBuilder") StreamsBuilder builder) {
        KStream<String, String>[] branches = builder
                .stream(inputTopicPubsubSnowplow, Consumed.with(Serdes.String(), Serdes.String()))
                .mapValues(snowplowEnrichedGoodDataFormatter::formatEnrichedData)
                .branch(...);
        return branches;
    }
}
Name your factory bean KafkaStreamsDefaultConfiguration.DEFAULT_STREAMS_BUILDER_BEAN_NAME instead of snowplowStreamBuilder - otherwise, the default factory bean will be started with no streams defined on it.
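A rough sketch of that change, keeping the body of your factory bean as it is in the question (the constructor-based configuration below is just one way to pass the props, and the application id value is a placeholder):
@Bean(name = KafkaStreamsDefaultConfiguration.DEFAULT_STREAMS_BUILDER_BEAN_NAME)
public StreamsBuilderFactoryBean streamsBuilderFactoryBean() {
    Map<String, Object> props = new HashMap<>();
    setDefaults(props);
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "snowplow");
    props.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 1);
    // pass the config directly; the @Qualifier in SnowplowStreamsApp then needs to be removed or updated
    return new StreamsBuilderFactoryBean(new KafkaStreamsConfiguration(props));
}
You may also want to revisit NUM_STREAM_THREADS_CONFIG set to 0, since with zero stream threads nothing will process the topology even once the builder is wired up correctly.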

Spring Kafka - deserialise Pojos without type information

I am working on a distributed microservices application that uses Kafka for internal communication. The applications exchange POJOs over topics. When a producer sends a message to a consumer, a header is added by default indicating the package name and class name of the object in the payload. The consumer application then uses this information to deserialise the payload. But this requires me to define the exact same class in the same package in both applications, which does not result in a good design for me. If I set the configuration (JsonSerializer.ADD_TYPE_INFO_HEADERS) on the producer side to not send the type in the header, it results in an error on the consumer side. Also, I don't want to use a default type on the consumer application as it has multiple listeners that expect different types of objects. Why can't the @KafkaListener simply deserialise the JSON payload to the object type given in the argument? Why does it need the header?
To work around this I defined a consumerFactory with a 'BytesDeserializer' and a KafkaListenerContainerFactory with a 'BytesJsonMessageConverter' on the consumer application. With this it worked on the consumer side, but I am not sure how to make this work on the producer side while using a ReplyingKafkaTemplate and deserialising the reply from the consumer.
Below are my configurations -
// producer configs
@Bean
public Map<String, Object> producerConfigs() {
    Map<String, Object> props = new HashMap<>();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer.class);
    props.put(JsonSerializer.TYPE_MAPPINGS, "cat:com.common.adapter.model.response.AccountResponse");
    return props;
}

@Bean
public ProducerFactory<String, Object> replyProducerFactory() {
    return new DefaultKafkaProducerFactory<>(producerConfigs());
}

@Bean
public KafkaTemplate<String, Object> replyTemplate() {
    return new KafkaTemplate<>(replyProducerFactory());
}
// consumer configs
@Bean
public ReplyingKafkaTemplate<String, Object, Object> replyingKafkaTemplate() {
    ReplyingKafkaTemplate<String, Object, Object> replyingKafkaTemplate =
            new ReplyingKafkaTemplate<>(requestProducerFactory(), replyListenerContainer());
    replyingKafkaTemplate.setReplyTimeout(10000);
    replyingKafkaTemplate.setMessageConverter(converter());
    return replyingKafkaTemplate;
}

@Bean
public KafkaMessageListenerContainer<String, Object> replyListenerContainer() {
    ContainerProperties containerProperties = new ContainerProperties(replyTopic);
    return new KafkaMessageListenerContainer<>(replyConsumerFactory(), containerProperties);
}

@Bean
public ConsumerFactory<String, Object> replyConsumerFactory() {
    JsonDeserializer<Object> jsonDeserializer = new JsonDeserializer<>();
    jsonDeserializer.addTrustedPackages("*");
    return new DefaultKafkaConsumerFactory<>(consumerConfigs(), new StringDeserializer(), jsonDeserializer);
}

@Bean
public Map<String, Object> consumerConfigs() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
    props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
    props.put(JsonDeserializer.TYPE_MAPPINGS, "cat:com.trader.account.model.response.AccountResponse");
    return props;
}
You can use type mapping.
The producer maps com.acme.Foo to foo and the consumer maps foo to com.other.Bar.
The types must be compatible at the JSON level.
If you only receive one type, you can configure the deserializer to use that instead of looking for headers with type information.
https://docs.spring.io/spring-kafka/docs/2.5.2.RELEASE/reference/html/#serdes-json-config
JsonDeserializer.KEY_DEFAULT_TYPE: Fallback type for deserialization of keys if no header information is present.
JsonDeserializer.VALUE_DEFAULT_TYPE: Fallback type for deserialization of values if no header information is present.
Starting with version 2.5, you can add a function which will be called by the deserializer so you can introspect the data to determine the type.
See Using Methods to Determine Types.
These (and type mapping) are the only ways to handle multiple types with the replying template. On the consumer side, we can infer the type from the method parameter (which is the correct mechanism to use there - it is not a "work around").
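For the single-reply-type case, a hedged sketch of what that could look like in the consumerConfigs() from the question (the class name reuses the one from your type mapping; these properties take effect when the JsonDeserializer is created from the class property, not when you pass a pre-built instance to the factory):
@Bean
public Map<String, Object> consumerConfigs() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
    props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
    // fall back to this type when the record carries no type headers
    props.put(JsonDeserializer.VALUE_DEFAULT_TYPE, "com.trader.account.model.response.AccountResponse");
    // ignore type headers even if the producer still sends them
    props.put(JsonDeserializer.USE_TYPE_INFO_HEADERS, false);
    props.put(JsonDeserializer.TRUSTED_PACKAGES, "*");
    return props;
}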

Prioritizing Kafka topic

I need to read messages from topic1 completely and then read messages from topic2. I will be receiving messages on these topics once every day. I managed to stop reading messages from topic2 before reading all the messages in topic1, but this happens only once, when the server is started. Can someone help me with this scenario?
ListenerConfig code
@EnableKafka
@Configuration
public class ListenerConfig {

    @Value("${spring.kafka.bootstrap-servers}")
    private String bootstrapServers;

    @Bean
    public Map<String, Object> consumerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "batch");
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "5");
        return props;
    }

    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        return new DefaultKafkaConsumerFactory<>(consumerConfigs());
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        factory.setBatchListener(true);
        return factory;
    }

    @Bean("kafkaListenerContainerTopic1Factory")
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerTopic1Factory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        factory.getContainerProperties().setIdleEventInterval(60000L);
        factory.setBatchListener(true);
        return factory;
    }

    @Bean("kafkaListenerContainerTopic2Factory")
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerTopic2Factory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        factory.setBatchListener(true);
        return factory;
    }
}
Listener code
@Service
public class Listener {

    private static final Logger LOG = LoggerFactory.getLogger(Listener.class);

    @Autowired
    private KafkaListenerEndpointRegistry registry;

    @KafkaListener(id = "first-listener", topics = "topic1", containerFactory = "kafkaListenerContainerTopic1Factory")
    public void receive(@Payload List<String> messages,
            @Header(KafkaHeaders.RECEIVED_PARTITION_ID) List<Integer> partitions,
            @Header(KafkaHeaders.OFFSET) List<Long> offsets) {
        for (int i = 0; i < messages.size(); i++) {
            LOG.info("received first='{}' with partition-offset='{}'",
                    messages.get(i), partitions.get(i) + "-" + offsets.get(i));
        }
    }

    @KafkaListener(id = "second-listener", topics = "topic2", containerFactory = "kafkaListenerContainerTopic2Factory", autoStartup = "false")
    public void receiveRel(@Payload List<String> messages,
            @Header(KafkaHeaders.RECEIVED_PARTITION_ID) List<Integer> partitions,
            @Header(KafkaHeaders.OFFSET) List<Long> offsets) {
        for (int i = 0; i < messages.size(); i++) {
            LOG.info("received second='{}' with partition-offset='{}'",
                    messages.get(i), partitions.get(i) + "-" + offsets.get(i));
        }
    }

    @EventListener
    public void eventHandler(ListenerContainerIdleEvent event) {
        LOG.info("Inside event");
        this.registry.getListenerContainer("second-listener").start();
    }
}
Kindly help me in resolving this, as this cycle should happen every day: reading topic1's messages completely and then reading messages from topic2.
You are already using an idle event listener to start the second listener - it should also stop the first listener.
When the second listener goes idle, stop it.
You should be checking which container the event is for to decide which container to stop and/or start.
Then, using a TaskScheduler, schedule a start() of the first listener at the next time you want it to start.
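A rough sketch of what that event handler could look like, assuming the listener ids from your code and a TaskScheduler bean that you still need to define (the one-day delay is only an example, and in some versions you may want to stop containers on a separate thread rather than on the event thread itself):
@Autowired
private TaskScheduler taskScheduler;

@EventListener
public void eventHandler(ListenerContainerIdleEvent event) {
    String listenerId = event.getListenerId();
    if (listenerId.startsWith("first-listener")) {
        // topic1 is drained for now: stop it and let topic2 start
        registry.getListenerContainer("first-listener").stop();
        registry.getListenerContainer("second-listener").start();
    }
    else if (listenerId.startsWith("second-listener")) {
        // topic2 is drained: stop it and schedule topic1 to start again tomorrow
        registry.getListenerContainer("second-listener").stop();
        taskScheduler.schedule(() -> registry.getListenerContainer("first-listener").start(),
                Instant.now().plus(Duration.ofDays(1)));
    }
}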
A topic in Kafka is an abstraction where a stream of records is published. Streams are naturally unbounded, so they have a start but no defined end. For your case, you first need to clearly define what the end of topic1 and topic2 is, so that you can stop/resume your consumers when needed. Maybe you know how many messages you will process for each topic, so you can use position or committed offsets to stop one consumer and resume the other one at that moment. Or, if you are using a streaming framework, they usually have a session window where the framework detects and groups elements by sessions of activity. You could also prefer to put that logic on the application side so that you don't need to stop/start any consumer threads.
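For example, a rough sketch of the position idea: a helper that checks whether a consumer has caught up to the current end of a topic (the method name is illustrative; one way to get hold of the Consumer is to add a Consumer<?, ?> parameter to the @KafkaListener method):
// returns true when the consumer's position has reached the current end offset
// for every partition of the topic it is assigned, i.e. nothing is left to read right now
private boolean caughtUp(Consumer<?, ?> consumer, String topic) {
    Set<TopicPartition> assigned = consumer.assignment().stream()
            .filter(tp -> tp.topic().equals(topic))
            .collect(Collectors.toSet());
    Map<TopicPartition, Long> endOffsets = consumer.endOffsets(assigned);
    return !assigned.isEmpty()
            && assigned.stream().allMatch(tp -> consumer.position(tp) >= endOffsets.get(tp));
}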