Spring Kafka Idempotence Producer configuration - apache-kafka

For the native Java Kafka client, there is a Kafka configuration called, enable.idempotence and we can set it to be true to enable idempotence producer.
However, for Spring Kafka, I can't find similar idempotence property in KafkaProperties class.
So I am wondering, if I manually set in my Spring Kafka configuration file, whether this property will take effect or Spring will totally ignore this config for Spring Kafka?

There are two ways to specify this property
application.properties You can use this property to specify any additional properties on producer
spring.kafka.producer.properties.*= # Additional producer-specific properties used to configure the client.
If you have any additional common config between Producer and Consumer
spring.kafka.properties.*= # Additional properties, common to producers and consumers, used to configure the client.
Through Code You can also override and customize the configs
#Bean
public ProducerFactory<String, String> producerFactory() {
Map<String, Object> configProps = new HashMap<>();
configProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,bootstrapAddress);
configProps.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
configProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
StringSerializer.class);
configProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
StringSerializer.class);
return new DefaultKafkaProducerFactory<>(configProps);
}
#Bean
public KafkaTemplate<String, String> kafkaTemplate() {
return new KafkaTemplate<>(producerFactory());
}
}

You're trying to add features that are not handled by Spring KafkaProperties, if you look at the documentation, you can do as the following:
Only a subset of the properties supported by Kafka are available directly through the KafkaProperties class.
If you wish to configure the producer or consumer with additional properties that are not directly supported, use the following properties:
spring.kafka.properties.prop.one=first
spring.kafka.admin.properties.prop.two=second
spring.kafka.consumer.properties.prop.three=third
spring.kafka.producer.properties.prop.four=fourth
spring.kafka.streams.properties.prop.five=fifth
https://docs.spring.io/spring-boot/docs/current/reference/html/boot-features-messaging.html#boot-features-kafka-extra-props
Yannick

You can find it with ProducerConfig as it is producer configuration. In order to enable this, you need to add below line in producerConfigs:
Properties producerProperties = new Properties();
producerProperties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
producerProperties.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);

Related

Spring Integration Kafka - Exception Handling by using SeekToCurrentErrorHandler

Problem Statement :
Need to handle exceptions occur while consuming messages in kafka
Commit failed offset
Seek to the next unprocessed offset, so that next polling start from this offset.
Seems all these are handled as part of SeekToCurrentErrorHandler.java in Spring-Kafka.
How to leverage this functionality in Spring-Integration-Kafka ?
Please help with this.
Versions used :
Spring-Integration-Kafka - 3.3.1
Spring for apache kafka - 2.5.x
#Bean(name ="kafkaConsumerFactory")
public ConsumerFactory consumerFactory0(
HashMap<String, String> properties = new HashMap<>();
properties.put("bootstrap.servers", "kafkaServerl");
properties.put("key.deserializer", StringDeserializer.class);
properties.put("value.deserializer", StringDeserializer.class);
properties.put("auto.offset.reset", "earliest");
} return new DefaultKafkaConsumerFactoryo(properties); I
#Bean("customKafkalistenerContainer")
public ConcurrentMessagelistenerContainerCtring, AddAccountReqRes> customKafkaListenerContainer() (
ContainerProperties containerProps = new ContainerProperties("Topici");
containerProps.setGroupld("Groupldl");
return (ConcurrentMessagelistenerContainerCtring, CustomReqRes>) new ConcurrentMessageListenerContainer<>(
} kafkaConsumerFactory, containerProps);
IntegrationFlows.from(Kafka.messageDrivenChannelAdapter(customKafkalistenerContainer, KafkaMessageDrivenChannelAdapter.ListenerMode.record)
.errorChannel(errorChannel()))
.handle(transformationProcessor, "process")
.channel("someChannel")
.get();
spring-integration-kafka uses spring-kafka underneath, so you just need to configure the adapter's container with the error handler.
spring-integration-kafka was moved to spring-integration starting with 5.4 (it was an extension previously). So, the current versions of both jars is 5.4.2.

How to setup Kafka consume application dependency on batch process

I am developing a consumer application using spring-kafka. I am planning to keep it running by 24*7 using pod. But, recently I got to know there are some batch processes which would be running in between. And, when those batches are running, our processing shouldn't occur. So, probably, somehow I have to stop polling for records and when the batches are finished then I can resume my processing. But, I have no clue how to achieve this..
Whether the batches are running or not, I can query and get the details from table, by looking into some flag. But, how can I stop polling for records ? and will it not cause re balancing if I just keep consumer application running without processing anything ?
Config class :
#Bean
public ConsumerFactory<String, GenericRecord> consumerFactory(){
Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,KAFKA_BROKERS);
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, OFFSET_RESET);
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, KafkaAvroDeserializer.class.getName());
props.put(ConsumerConfig.GROUP_ID_CONFIG, GROUP_ID_CONFIG);
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, MAX_POLL_RECORDS);
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, MAX_POLL_INTERVAL);
props.put(KafkaAvroDeserializerConfig.SCHEMA_REGISTRY_URL_CONFIG, SCHEMA_REGISTRY_URL);
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, SSL_PROTOCOL);
props.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG,SSL_TRUSTSTORE_LOCATION_FILE_NAME);
props.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, SSL_TRUSTSTORE_SECURE);
props.put(SslConfigs.SSL_KEYSTORE_LOCATION_CONFIG,SSL_KEYSTORE_LOCATION_FILE_NAME);
props.put(SslConfigs.SSL_KEYSTORE_PASSWORD_CONFIG, SSL_KEYSTORE_SECURE);
props.put(SslConfigs.SSL_KEY_PASSWORD_CONFIG, SSL_KEY_SECURE);
return new DefaultKafkaConsumerFactory<>(props);
}
#Bean
ConcurrentKafkaListenerContainerFactory<String, GenericRecord>
kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, GenericRecord> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory());
factory.setConcurrency(KAFKA_CONCURRENCY);
factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE); // manual async comm
return factory;
}
Code :
#KafkaListener(topics = "${app.topic}", groupId = "${app.group_id_config}")
public void run(ConsumerRecord<String, GenericRecord> record, Acknowledgment acknowledgement) throws Exception {
try {
int flag = getBatchIndquery ();
// How to stop and resume based on the flag value---?
// business logic process once the consumer resumes
processRecords();
InsertDb();
acknowledgement.acknowledge();
}catch (Exception ex) {
System.out.println(record);
System.out.println(ex.getMessage());
}
}
Use the endpoint registry to stop and start the container...
#KafkaListener(id = "myListener" ...)
#Autowired
KafkaListenerEndpointRegistry registry;
...
registry.getListenerContainer("myListener").stop();
See #KafkaListener Lifecycle Management.
The listener containers created for #KafkaListener annotations are not beans in the application context. Instead, they are registered with an infrastructure bean of type KafkaListenerEndpointRegistry. This bean is automatically declared by the framework and manages the containers' lifecycles; it will auto-start any containers that have autoStartup set to true. All containers created by all container factories must be in the same phase. See Listener Container Auto Startup for more information. You can manage the lifecycle programmatically by using the registry. Starting or stopping the registry will start or stop all the registered containers. Alternatively, you can get a reference to an individual container by using its id attribute. You can set autoStartup on the annotation, which overrides the default setting configured into the container factory. You can get a reference to the bean from the application context, such as auto-wiring, to manage its registered containers.

Spring / Kafka - configuration properties priorities - application.properties, bean(ProducerFactory)

Kafka configuration properties:
Can i have the same property "key"(and maybe different "value") in
(1) application.properties,
(2) bean(ProducerFactory/ProducerConfig) and
If yes, who is the "last-win"?
P.S Yes, i know, test it! But it will also be handy to have this question/answer on SO.
EDIT:
Example:
(1) spring.kafka.producer.properties.enable.idempotence=true
(2) props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "false");
With props defined as:
#Configuration
public class KafkaProducerConfiguration {
#Bean
public ProducerFactory<Object, Object> producerFactory() {
Map<String, Object> props = new HashMap<>();
The boot property (1) is only used when Boot auto-configures the producer factory for you. Since you are defining your own producer factory #Bean (2), Boot's is disabled and the properties are ignored.
If you want to use the Boot application.properties, simply remove your producerFactory #Bean and let Boot configure the producer factory for you.
I have no idea what config/producer.properties (3) is.

Kafka producer and consumer in seperate EC2 instance

I have 2 ec2 instances one for Kafka broker and the other for Kafka Consumer. May i know how to connect both the ec2 instance to communicate with each other. If i produce a message in my broker i need to get it in the consumer.
Basically, i am looking for that part of configuration where i need to give the consumer information in the broker ec2 instance and vice versa (whichever way it works) . Do i need to use some api or something ?
I have tried in a single node cluster and it worked.
It does not matter you are hosting your broker in ec-2 or elsewhere as long as it is accessible to consumer.
A sample consumer in Java using StringDeserializer for both key and value. You need to use the KafkaConsumer API if you are accessing from a Java program
Properties props = new Properties();
props.put("bootstrap.servers", "YOUR_KAFKA_BROKER_ADDRESS");
props.put("group.id", "test");
props.put("enable.auto.commit", "true");
props.put("auto.commit.interval.ms", "1000");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
while (true) {
ConsumerRecords<String, String> records = consumer.poll(100);
for (ConsumerRecord<String, String> record : records)
System.out.printf("offset = %d, key = %s, value = %s%n", record.offset(), record.key(), record.value());
}
https://kafka.apache.org/10/javadoc/?org/apache/kafka/clients/consumer/KafkaConsumer.html
If you're using Kafka across machines, you need to configure the listeners correctly. This article explains how: https://rmoff.net/2018/08/02/kafka-listeners-explained/
where i need to give the consumer information in the broker
Brokers don't push messages to consumers, so you wouldn't give the information for the consumer to any broker
Any code that works against a single broker should work for more than one, assuming network settings are configured properly

Kafka spout integration

I am using kafka 0.10.1.1 and storm 1.0.2. In the storm documentation for kafka integration , i can see that offsets are still maintained using zookeeper as we are initializing kafka spout using zookeeper servers.
How can i bootstrap the spout using kafka servers .Is there any example for this .
Example from storm docs
BrokerHosts hosts = new ZkHosts(zkConnString);
SpoutConfig spoutConfig = new SpoutConfig(hosts, topicName, "/" + topicName, UUID.randomUUID().toString());
spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());
KafkaSpout kafkaSpout = new KafkaSpout(spoutConfig);
This option using zookeeper is working fine and is consuming the messages . but i was not able to see the consumer group or storm nodes as consumers in kafkamanager ui .
Alternate approach tried is this .
KafkaSpoutConfig<String, String> kafkaSpoutConfig = newKafkaSpoutConfig();
KafkaSpout<String, String> spout = new KafkaSpout<>(kafkaSpoutConfig);
private static KafkaSpoutConfig<String, String> newKafkaSpoutConfig() {
Map<String, Object> props = new HashMap<>();
props.put(KafkaSpoutConfig.Consumer.BOOTSTRAP_SERVERS, bootstrapServers);
props.put(KafkaSpoutConfig.Consumer.GROUP_ID, GROUP_ID);
props.put(KafkaSpoutConfig.Consumer.KEY_DESERIALIZER,
"org.apache.kafka.common.serialization.StringDeserializer");
props.put(KafkaSpoutConfig.Consumer.VALUE_DESERIALIZER,
"org.apache.kafka.common.serialization.StringDeserializer");
props.put(KafkaSpoutConfig.Consumer.ENABLE_AUTO_COMMIT, "true");
String[] topics = new String[1];
topics[0] = topicName;
KafkaSpoutStreams kafkaSpoutStreams =
new KafkaSpoutStreamsNamedTopics.Builder(new Fields("message"), topics).build();
KafkaSpoutTuplesBuilder<String, String> tuplesBuilder =
new KafkaSpoutTuplesBuilderNamedTopics.Builder<>(new TuplesBuilder(topicName)).build();
KafkaSpoutConfig<String, String> spoutConf =
new KafkaSpoutConfig.Builder<>(props, kafkaSpoutStreams, tuplesBuilder).build();
return spoutConf;
}
But this solution is showing CommitFailedException after reading few messages from kafka.
Storm-kafka writes consumer information in a different location and different format in zookeeper with common kafka client. So you can't see it in kafkamanager ui.
You can find some other monitor tools, like
https://github.com/keenlabs/capillary.
On your alternate approach, you're likely getting CommitFailedException due to:
props.put(KafkaSpoutConfig.Consumer.ENABLE_AUTO_COMMIT, "true");
Up to Storm 2.0.0-SNAPSHOT (and since 1.0.6) -
KafkaConsumer autocommit is unsupported
From the docs:
Note that KafkaConsumer autocommit is unsupported. The
KafkaSpoutConfig constructor will throw an exception if the
"enable.auto.commit" property is set, and the consumer used by the
spout will always have that property set to false. You can configure
similar behavior to autocommit through the setProcessingGuarantee
method on the KafkaSpoutConfig builder.
References:
http://storm.apache.org/releases/2.0.0-SNAPSHOT/storm-kafka-client.html
http://storm.apache.org/releases/1.0.6/storm-kafka-client.html