Problem Statement:
I need to handle exceptions that occur while consuming messages in Kafka:
Commit the offset of the failed record
Seek to the next unprocessed offset, so that the next poll starts from this offset.
It seems all of this is handled by SeekToCurrentErrorHandler.java in Spring-Kafka.
How can I leverage this functionality in Spring-Integration-Kafka?
Please help with this.
Versions used :
Spring-Integration-Kafka - 3.3.1
Spring for Apache Kafka - 2.5.x
@Bean(name = "kafkaConsumerFactory")
public ConsumerFactory<String, String> consumerFactory() {
    Map<String, Object> properties = new HashMap<>();
    properties.put("bootstrap.servers", "kafkaServer1");
    properties.put("key.deserializer", StringDeserializer.class);
    properties.put("value.deserializer", StringDeserializer.class);
    properties.put("auto.offset.reset", "earliest");
    return new DefaultKafkaConsumerFactory<>(properties);
}
@Bean("customKafkaListenerContainer")
public ConcurrentMessageListenerContainer<String, CustomReqRes> customKafkaListenerContainer() {
    ContainerProperties containerProps = new ContainerProperties("Topic1");
    containerProps.setGroupId("GroupId1");
    return new ConcurrentMessageListenerContainer<>(kafkaConsumerFactory, containerProps);
}
IntegrationFlows.from(Kafka.messageDrivenChannelAdapter(customKafkaListenerContainer, KafkaMessageDrivenChannelAdapter.ListenerMode.record)
.errorChannel(errorChannel()))
.handle(transformationProcessor, "process")
.channel("someChannel")
.get();
spring-integration-kafka uses spring-kafka underneath, so you just need to configure the adapter's container with the error handler.
spring-integration-kafka was moved into spring-integration starting with 5.4 (previously it was an extension). So the current version of both jars is 5.4.2.
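For example, the error handler can be set directly on the container bean that backs the adapter. This is a minimal sketch, assuming the container bean from the question; the back-off values and the log-and-skip recoverer are assumptions, not the only option:

```java
import org.springframework.kafka.listener.ConcurrentMessageListenerContainer;
import org.springframework.kafka.listener.ContainerProperties;
import org.springframework.kafka.listener.SeekToCurrentErrorHandler;
import org.springframework.util.backoff.FixedBackOff;

@Bean("customKafkaListenerContainer")
public ConcurrentMessageListenerContainer<String, CustomReqRes> customKafkaListenerContainer() {
    ContainerProperties containerProps = new ContainerProperties("Topic1");
    containerProps.setGroupId("GroupId1");
    ConcurrentMessageListenerContainer<String, CustomReqRes> container =
            new ConcurrentMessageListenerContainer<>(kafkaConsumerFactory, containerProps);
    // Retry each failed record twice with a 1s back-off, then hand it to the
    // recoverer (here: just log it); unprocessed records are re-seeked and re-polled.
    container.setErrorHandler(new SeekToCurrentErrorHandler(
            (record, exception) -> System.err.println("Giving up on " + record),
            new FixedBackOff(1000L, 2L)));
    return container;
}
```

The adapter then simply wraps this container, so the handler applies to every record the flow consumes.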
This question already has answers here: How can I send large messages with Kafka (over 15MB)? (9 answers). Closed 6 months ago.
I have researched different configs, including ones from Stack Overflow, but have been stuck on this for several days, so I created a separate question. I am trying to configure Kafka to send large messages (10-50 MB). I run Kafka in Docker (image confluentinc/cp-kafka:7.2.1). I also understand that Kafka is not the best tool for this. I am configuring Kafka from Java as shown below, and I restarted my Kafka Docker instance, but I still see this error:
org.apache.kafka.common.errors.RecordTooLargeException: The request included a message larger than the max message size the server will accept.
And below is the config I use (collected from Google and Stack Overflow).
Here are my Producer, Consumer, and KafkaAdmin Java classes:
KafkaAdminConfig.java:
@Bean
public KafkaAdmin kafkaAdmin() {
Map<String, Object> configProps = new HashMap<>();
configProps.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapAddress);
configProps.put("max.message.bytes", String.valueOf(maxFileSize));
configProps.put("max.request.size", maxFileSize);
configProps.put("replica.fetch.max.bytes", maxFileSize);
configProps.put("message.max.bytes", maxFileSize);
configProps.put("max.message.bytes", maxFileSize);
configProps.put("max.message.max.bytes", maxFileSize);
configProps.put("max.partition.fetch.bytes", maxFileSize);
return new KafkaAdmin(configProps);
}
ProducerConfig.java
@Bean
public ProducerFactory<String, Byte[]> producerFactoryLargeFiles() {
Map<String, Object> configProps = new HashMap<>();
configProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapAddress);
configProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
configProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
// required to allow Kafka to process files <= 20 MB
configProps.put("buffer.memory", maxFileSize);
configProps.put("max.request.size", maxFileSize);
configProps.put("replica.fetch.max.bytes", maxFileSize);
configProps.put("message.max.bytes", maxFileSize);
configProps.put("max.message.bytes", maxFileSize);
configProps.put("acks", "all");
configProps.put("retries", 0);
configProps.put("batch.size", 16384);
configProps.put("linger.ms", 1);
return new DefaultKafkaProducerFactory<>(configProps);
}
ConsumerConfig.java
@Bean
public ConsumerFactory<String, String> consumerFactoryLargeFiles() {
Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapAddress);
props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
// required to allow Kafka to process files <= 20 MB
props.put("fetch.message.max.bytes", maxFileSize);
return new DefaultKafkaConsumerFactory<>(props);
}
maxFileSize is 104857600, which is 100 MB. And I am trying to send a message of about 3 MB.
I also added the following env variables to my docker-compose:
KAFKA_MAX_REQUEST_SIZE: 104857600
KAFKA_PRODUCER_MAX_REQUEST_SIZE: 104857600
CONNECT_PRODUCER_MAX_REQUEST_SIZE: 104857600
I will be happy to provide additional information or logs if needed.
I suggest uploading the file by other means and sending the file name via Kafka.
I resolved it myself - maybe this will be helpful for other people. I followed this Baeldung guide - https://www.baeldung.com/java-kafka-send-large-message - which covers:
Kafka topic config via Java
Kafka broker config via an env variable, since I am using Docker:
KAFKA_MESSAGE_MAX_BYTES: 20971520
Kafka producer config via Java
Kafka consumer config via Java
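A minimal sketch of the Java side of that approach. The topic name and the 20 MB limit are assumptions for illustration; the broker side still needs KAFKA_MESSAGE_MAX_BYTES as shown above, and all four limits must be at least as large as your biggest message:

```java
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.config.TopicConfig;
import org.springframework.kafka.config.TopicBuilder;

// Topic config: allow messages up to 20 MB on this topic.
@Bean
public NewTopic largeMessageTopic() {
    return TopicBuilder.name("large-message-topic")
            .config(TopicConfig.MAX_MESSAGE_BYTES_CONFIG, "20971520")
            .build();
}

// Producer config: raise the client-side request size limit to match.
configProps.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, "20971520");

// Consumer config: raise the per-partition fetch limit to match.
props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, "20971520");
```

Note that broker-side and consumer-side properties (replica.fetch.max.bytes, message.max.bytes) set on a producer or admin client, as in the question, are silently ignored; each property has to be set on the component that owns it.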
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> dcmContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, String> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerEventFactory());
Map<String, String> micrometerTags = new HashMap<>();
micrometerTags.put(KafkaCommonConfig.CONSUMER_TAG, TAG_VALUE);
factory.getContainerProperties().setMicrometerTags(micrometerTags);
return factory;
}
The consumerEventFactory() referenced above is defined below:
@Bean
public ConsumerFactory<String, String> consumerEventFactory() {
Map<String, Object> config = new HashMap<>();
config.put(
ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
CCMReader.getCcmDcmConsumerConfig().getBootStrapServers());
config.put(
ConsumerConfig.GROUP_ID_CONFIG, "randommmmm");
config.put(
ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG,
CCMReader.getCcmDcmConsumerConfig().getMaxPollIntervalMs());
config.put(
ConsumerConfig.MAX_POLL_RECORDS_CONFIG,
CCMReader.getCcmDcmConsumerConfig().getMaxPollRecords());
config.put(
ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG,
CCMReader.getCcmDcmConsumerConfig().getRequestTimeOutMs());
config.put(
ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG,
CCMReader.getCcmDcmConsumerConfig().getHeartbeatIntervalMs());
config.put(
ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG,
"true");
config.put(
ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG,
CCMReader.getCcmDcmConsumerConfig().getAutoCommitIntervalMs());
config.put(
ConsumerConfig.AUTO_OFFSET_RESET_CONFIG,
"earliest");
config.put(JsonDeserializer.TRUSTED_PACKAGES, "*");
return new DefaultKafkaConsumerFactory<>(
config, new StringDeserializer(), new StringDeserializer());
}
Since there is no error handler, I expected all the failing records to be rejected.
However, the failing records are being redelivered again and again (seemingly infinitely).
Not sure what is happening... my only conclusion so far is black magic. Please help!
What's happening when the exception occurs? Is the consumer process getting restarted?
Since there is no error handler, all the exceptions should be rejected. - No, I don't think so.
It's normal for a Kafka consumer to exit when an exception is not handled within the process method. So once the consumer comes up again (via your Docker restart policy, maybe), it starts reading the same events, since those events were never committed back to the brokers as successfully processed by this client id.
I'm not sure about Spring Kafka, but I feel you should define exception handling to make sure those events are handled properly and committed back as normal events, rather than killing the consumer and running into an infinite loop.
Since there is no error handler,...
The default error handler (2.8 and later) is the DefaultErrorHandler (SeekToCurrentErrorHandler for earlier versions).
Both of these error handlers will, by default, retry 9 times (10 delivery attempts in total) before giving up.
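If you want different behavior, you can configure the handler explicitly on the container factory. A sketch for 2.8+, assuming the factory from the question; the back-off values and the log-and-skip recoverer are assumptions:

```java
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.listener.DefaultErrorHandler;
import org.springframework.util.backoff.FixedBackOff;

@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> dcmContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerEventFactory());
    // Retry each failed record twice with a 1s back-off, then log and skip it
    // instead of redelivering it indefinitely.
    factory.setCommonErrorHandler(new DefaultErrorHandler(
            (record, exception) -> System.err.println("Skipping " + record + ": " + exception),
            new FixedBackOff(1000L, 2L)));
    return factory;
}
```

For versions before 2.8, the equivalent is factory.setErrorHandler(new SeekToCurrentErrorHandler(...)) with the same recoverer/back-off arguments.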
I am developing a consumer application using spring-kafka. I plan to keep it running 24*7 in a pod. But I recently learned that some batch processes will run in between, and while those batches are running, our processing shouldn't occur. So I probably have to somehow stop polling for records, and resume processing once the batches are finished. But I have no clue how to achieve this.
Whether or not the batches are running, I can query the details from a table by looking at a flag. But how can I stop polling for records? And will it not cause rebalancing if I just keep the consumer application running without processing anything?
Config class :
@Bean
public ConsumerFactory<String, GenericRecord> consumerFactory(){
Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,KAFKA_BROKERS);
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, OFFSET_RESET);
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, KafkaAvroDeserializer.class.getName());
props.put(ConsumerConfig.GROUP_ID_CONFIG, GROUP_ID_CONFIG);
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, MAX_POLL_RECORDS);
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, MAX_POLL_INTERVAL);
props.put(KafkaAvroDeserializerConfig.SCHEMA_REGISTRY_URL_CONFIG, SCHEMA_REGISTRY_URL);
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, SSL_PROTOCOL);
props.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG,SSL_TRUSTSTORE_LOCATION_FILE_NAME);
props.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, SSL_TRUSTSTORE_SECURE);
props.put(SslConfigs.SSL_KEYSTORE_LOCATION_CONFIG,SSL_KEYSTORE_LOCATION_FILE_NAME);
props.put(SslConfigs.SSL_KEYSTORE_PASSWORD_CONFIG, SSL_KEYSTORE_SECURE);
props.put(SslConfigs.SSL_KEY_PASSWORD_CONFIG, SSL_KEY_SECURE);
return new DefaultKafkaConsumerFactory<>(props);
}
@Bean
ConcurrentKafkaListenerContainerFactory<String, GenericRecord>
kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, GenericRecord> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory());
factory.setConcurrency(KAFKA_CONCURRENCY);
factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE); // manual immediate ack
return factory;
}
Code :
@KafkaListener(topics = "${app.topic}", groupId = "${app.group_id_config}")
public void run(ConsumerRecord<String, GenericRecord> record, Acknowledgment acknowledgement) throws Exception {
try {
int flag = getBatchIndquery();
// How to stop and resume based on the flag value---?
// business logic process once the consumer resumes
processRecords();
InsertDb();
acknowledgement.acknowledge();
}catch (Exception ex) {
System.out.println(record);
System.out.println(ex.getMessage());
}
}
Use the endpoint registry to stop and start the container...
@KafkaListener(id = "myListener" ...)
@Autowired
KafkaListenerEndpointRegistry registry;
...
registry.getListenerContainer("myListener").stop();
See @KafkaListener Lifecycle Management.
The listener containers created for @KafkaListener annotations are not beans in the application context. Instead, they are registered with an infrastructure bean of type KafkaListenerEndpointRegistry. This bean is automatically declared by the framework and manages the containers' lifecycles; it will auto-start any containers that have autoStartup set to true. All containers created by all container factories must be in the same phase. See Listener Container Auto Startup for more information.
You can manage the lifecycle programmatically by using the registry. Starting or stopping the registry will start or stop all the registered containers. Alternatively, you can get a reference to an individual container by using its id attribute. You can set autoStartup on the annotation, which overrides the default setting configured into the container factory. You can get a reference to the bean from the application context, such as by auto-wiring, to manage its registered containers.
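Regarding the rebalancing concern: stopping the container makes the consumer leave the group, while pausing it keeps the consumer polling (returning no records), so no rebalance is triggered. A sketch of a flag-driven check; the schedule interval, listener id, and the meaning of the flag value are assumptions:

```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.kafka.config.KafkaListenerEndpointRegistry;
import org.springframework.kafka.listener.MessageListenerContainer;
import org.springframework.scheduling.annotation.Scheduled;

@Autowired
private KafkaListenerEndpointRegistry registry;

// Check the batch flag periodically; pause() keeps the consumer in the group,
// so max.poll.interval.ms is not exceeded and no rebalance occurs.
@Scheduled(fixedDelay = 30000)
public void checkBatchFlag() {
    MessageListenerContainer container = registry.getListenerContainer("myListener");
    boolean batchRunning = getBatchIndquery() == 1; // the flag query from the listener
    if (batchRunning && !container.isPauseRequested()) {
        container.pause();
    } else if (!batchRunning && container.isPauseRequested()) {
        container.resume();
    }
}
```

pause() takes effect before the next poll, so a record already in flight will still be processed once.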
For the native Java Kafka client, there is a configuration property called enable.idempotence, and we can set it to true to enable the idempotent producer.
However, for Spring Kafka, I can't find a similar idempotence property in the KafkaProperties class.
So I am wondering: if I manually set it in my Spring Kafka configuration file, will this property take effect, or will Spring ignore it?
There are two ways to specify this property.
application.properties - you can use this to specify any additional producer properties:
spring.kafka.producer.properties.*= # Additional producer-specific properties used to configure the client.
If you have additional config common to both producers and consumers:
spring.kafka.properties.*= # Additional properties, common to producers and consumers, used to configure the client.
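For the idempotence setting specifically, following the pattern above, that would be:

```
spring.kafka.producer.properties.enable.idempotence=true
```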
Through code - you can also override and customize the configs:
@Bean
public ProducerFactory<String, String> producerFactory() {
Map<String, Object> configProps = new HashMap<>();
configProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,bootstrapAddress);
configProps.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
configProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
StringSerializer.class);
configProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
StringSerializer.class);
return new DefaultKafkaProducerFactory<>(configProps);
}
@Bean
public KafkaTemplate<String, String> kafkaTemplate() {
return new KafkaTemplate<>(producerFactory());
}
You're trying to set properties that are not handled directly by Spring's KafkaProperties. If you look at the documentation, you can do the following:
Only a subset of the properties supported by Kafka are available directly through the KafkaProperties class.
If you wish to configure the producer or consumer with additional properties that are not directly supported, use the following properties:
spring.kafka.properties.prop.one=first
spring.kafka.admin.properties.prop.two=second
spring.kafka.consumer.properties.prop.three=third
spring.kafka.producer.properties.prop.four=fourth
spring.kafka.streams.properties.prop.five=fifth
https://docs.spring.io/spring-boot/docs/current/reference/html/boot-features-messaging.html#boot-features-kafka-extra-props
Yannick
You can find it in ProducerConfig, as it is a producer configuration. To enable it, add the line below to your producer configs:
Properties producerProperties = new Properties();
producerProperties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
producerProperties.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
I am using Kafka 0.10.1.1 and Storm 1.0.2. In the Storm documentation for Kafka integration, I can see that offsets are still maintained using ZooKeeper, as we initialize the Kafka spout using ZooKeeper servers.
How can I bootstrap the spout using Kafka servers? Is there any example of this?
Example from the Storm docs:
BrokerHosts hosts = new ZkHosts(zkConnString);
SpoutConfig spoutConfig = new SpoutConfig(hosts, topicName, "/" + topicName, UUID.randomUUID().toString());
spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());
KafkaSpout kafkaSpout = new KafkaSpout(spoutConfig);
This option using ZooKeeper works fine and consumes the messages, but I was not able to see the consumer group or the Storm nodes as consumers in the KafkaManager UI.
The alternate approach I tried is this:
KafkaSpoutConfig<String, String> kafkaSpoutConfig = newKafkaSpoutConfig();
KafkaSpout<String, String> spout = new KafkaSpout<>(kafkaSpoutConfig);
private static KafkaSpoutConfig<String, String> newKafkaSpoutConfig() {
Map<String, Object> props = new HashMap<>();
props.put(KafkaSpoutConfig.Consumer.BOOTSTRAP_SERVERS, bootstrapServers);
props.put(KafkaSpoutConfig.Consumer.GROUP_ID, GROUP_ID);
props.put(KafkaSpoutConfig.Consumer.KEY_DESERIALIZER,
"org.apache.kafka.common.serialization.StringDeserializer");
props.put(KafkaSpoutConfig.Consumer.VALUE_DESERIALIZER,
"org.apache.kafka.common.serialization.StringDeserializer");
props.put(KafkaSpoutConfig.Consumer.ENABLE_AUTO_COMMIT, "true");
String[] topics = new String[1];
topics[0] = topicName;
KafkaSpoutStreams kafkaSpoutStreams =
new KafkaSpoutStreamsNamedTopics.Builder(new Fields("message"), topics).build();
KafkaSpoutTuplesBuilder<String, String> tuplesBuilder =
new KafkaSpoutTuplesBuilderNamedTopics.Builder<>(new TuplesBuilder(topicName)).build();
KafkaSpoutConfig<String, String> spoutConf =
new KafkaSpoutConfig.Builder<>(props, kafkaSpoutStreams, tuplesBuilder).build();
return spoutConf;
}
But this solution throws a CommitFailedException after reading a few messages from Kafka.
Storm-kafka writes consumer information to a different location, and in a different format, in ZooKeeper than the common Kafka client does, so you can't see it in the KafkaManager UI.
You can find some other monitoring tools, like https://github.com/keenlabs/capillary.
On your alternate approach, you're likely getting CommitFailedException due to:
props.put(KafkaSpoutConfig.Consumer.ENABLE_AUTO_COMMIT, "true");
As of Storm 2.0.0-SNAPSHOT (and since 1.0.6), KafkaConsumer autocommit is unsupported.
From the docs:
Note that KafkaConsumer autocommit is unsupported. The
KafkaSpoutConfig constructor will throw an exception if the
"enable.auto.commit" property is set, and the consumer used by the
spout will always have that property set to false. You can configure
similar behavior to autocommit through the setProcessingGuarantee
method on the KafkaSpoutConfig builder.
References:
http://storm.apache.org/releases/2.0.0-SNAPSHOT/storm-kafka-client.html
http://storm.apache.org/releases/1.0.6/storm-kafka-client.html
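So the fix is to drop the enable.auto.commit property and, if autocommit-like behavior is wanted, use the builder's setProcessingGuarantee method instead. A sketch based on the storm-kafka-client docs quoted above; the builder calls are per those docs, and the servers/topic/group variables are the ones from the question:

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.storm.kafka.spout.KafkaSpoutConfig;

KafkaSpoutConfig<String, String> spoutConf = KafkaSpoutConfig
        .builder(bootstrapServers, topicName)
        .setProp(ConsumerConfig.GROUP_ID_CONFIG, GROUP_ID)
        // NO_GUARANTEE commits offsets periodically, similar to autocommit;
        // AT_LEAST_ONCE (the default) commits only acked offsets.
        .setProcessingGuarantee(KafkaSpoutConfig.ProcessingGuarantee.NO_GUARANTEE)
        .build();
```

With the default AT_LEAST_ONCE guarantee instead, the spout commits offsets itself as tuples are acked, which also avoids the CommitFailedException.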