flume-ng throws Kafka topic must be specified - apache-kafka

I'm trying to pull data off my Kafka topic and write it to HDFS. My Flume conf appears identical to what I've seen in several examples, but I can't get around the error below. I can consume from the topic through Python, so I know I'm OK there. I'm on Flume version 1.6.0 and Java 9.0.1. What am I doing wrong that makes it not accept the Kafka topic?
09 Jul 2018 17:17:26,973 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:145) -Creating channels
09 Jul 2018 17:17:26,984 INFO [conf-file-poller-0] (org.apache.flume.channel.DefaultChannelFactory.create:42) - Creating instance of channel kafka_hdfs_channel type memory
09 Jul 2018 17:17:26,989 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:200) - Created channel kafka_hdfs_channel
09 Jul 2018 17:17:26,989 INFO [conf-file-poller-0] (org.apache.flume.source.DefaultSourceFactory.create:41) - Creating instance of source kafka_source, type org.apache.flume.source.kafka.KafkaSource
09 Jul 2018 17:17:26,993 ERROR [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadSources:361) - Source kafka_source has been removed due to an error during configuration
org.apache.flume.conf.ConfigurationException: Kafka topic must be specified.
at org.apache.flume.source.kafka.KafkaSource.configure(KafkaSource.java:180)
at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
at org.apache.flume.node.AbstractConfigurationProvider.loadSources(AbstractConfigurationProvider.java:326)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:97)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:514)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:300)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
at java.base/java.lang.Thread.run(Thread.java:844)
And here is my flume config:
agentCDIS.sources = kafka_source
agentCDIS.channels = kafka_hdfs_channel
agentCDIS.sinks = hdfs_sink
agentCDIS.sources.kafka_source.type = org.apache.flume.source.kafka.KafkaSource
agentCDIS.sources.kafka_source.kafka.bootstrap.servers = 10.4.3.61:9092, 10.4.3.62:9092, 10.4.3.63:9092
agentCDIS.sources.kafka_source.kafka.topic = test
agentCDIS.sources.kafka_source.kafka.consumer.group.id = cn_flume_group
agentCDIS.sources.kafka_source.channels = kafka_hdfs_channel
agentCDIS.sources.kafka_source.interceptors = i1
agentCDIS.sources.kafka_source.interceptors.i1.type = timestamp
agentCDIS.sources.kafka_source.kafka.consumer.timeout.ms = 1000
agentCDIS.channels.kafka_hdfs_channel.type = memory
agentCDIS.channels.kafka_hdfs_channel.capacity = 10000
agentCDIS.channels.kafka_hdfs_channel.transactionCapacity = 1000
agentCDIS.sinks.hdfs_sink.type = hdfs
agentCDIS.sinks.hdfs_sink.hdfs.path = hdfs://10.4.16.16:8020/user/cnelson/kafka/%{topic}/%y-%m-%d
agentCDIS.sinks.hdfs_sink.hdfs.rollInterval = 5
agentCDIS.sinks.hdfs_sink.hdfs.rollSize = 0
agentCDIS.sinks.hdfs_sink.fileType = DataStream
agentCDIS.sinks.hdfs_sink.channel = kafka_hdfs_channel
agentCDIS.sinks.loggerSink.type = logger
agentCDIS.sinks.loggerSink.kafka_hdfs_channel = memoryChannel
agentCDIS.channels.memoryChannel.type = memory
agentCDIS.channels.memoryChannel.capacity = 100

I went through the post and the config a few times. You mention that you are on Flume 1.6, and per the documentation for that version the Kafka source property names are slightly different. Could you please try the following:
Instead of agentCDIS.sources.kafka_source.kafka.bootstrap.servers, try agentCDIS.sources.kafka_source.zookeeperConnect. Its value is the ZooKeeper URI used by your Kafka cluster.
Instead of agentCDIS.sources.kafka_source.kafka.topic = test, try agentCDIS.sources.kafka_source.topic = test
Instead of agentCDIS.sources.kafka_source.kafka.consumer.group.id = cn_flume_group, try agentCDIS.sources.kafka_source.groupId = cn_flume_group
The three kafka.* properties used in your config file were introduced in version 1.7, so Flume 1.6 does not read them and reports that the topic is missing.
I hope this helps!
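Put together, the source block for Flume 1.6 might look like the sketch below. The ZooKeeper address is a placeholder (the question only lists broker addresses), so replace it with the URI your Kafka cluster actually uses.

```properties
agentCDIS.sources.kafka_source.type = org.apache.flume.source.kafka.KafkaSource
# Placeholder value: the real ZooKeeper URI is not given in the question
agentCDIS.sources.kafka_source.zookeeperConnect = zk-host:2181
agentCDIS.sources.kafka_source.topic = test
agentCDIS.sources.kafka_source.groupId = cn_flume_group
agentCDIS.sources.kafka_source.channels = kafka_hdfs_channel
```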

Related

Apache flume with kafka source, kafka sink and memory channel - throwing UNKNOWN_TOPIC_OR_PARTITION

I am new to Apache Flume https://flume.apache.org/. For one of my use cases, I need to move data from a Kafka topic on one cluster (bootstrap: bootstrap1, topic: topic1) to a topic with a different name on a different cluster (bootstrap: bootstrap2, topic: topic2). Other use cases in the same project are a good fit for Flume, so I need to use the same Flume pipeline for this one, even though there are other options for copying from Kafka to Kafka.
I tried the configs below; the result of each option is noted next to it.
#1: telnet to kafka sink (bootstrap2, topic2) --> works perfect.
configs:
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = topic2
a1.sinks.k1.kafka.bootstrap.servers = bootstrap2
a1.sinks.k1.kafka.flumeBatchSize = 100
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 1
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
#2: kafka as source(bootstrap1, topic1) and logger as sink --> works perfect.
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.batchSize = 10
a1.sources.r1.batchDurationMillis = 2000
a1.sources.r1.kafka.bootstrap.servers = bootstrap1
a1.sources.r1.kafka.topics = topic1
a1.sources.r1.kafka.consumer.group.id = flume-gis-consumer
a1.sources.r1.backoffSleepIncrement = 1000
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
#3: kafka as source (bootstrap1, topic1) and kafka as sink(bootstrap2, topic2) --> gives error as mentioned below the config.
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.batchSize = 10
a1.sources.r1.batchDurationMillis = 2000
a1.sources.r1.kafka.bootstrap.servers = bootstrap1
a1.sources.r1.kafka.topics = topic1
a1.sources.r1.kafka.consumer.group.id = flume-gis-consumer1
a1.sources.r1.backoffSleepIncrement = 1000
# Describe the sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = topic2
a1.sinks.k1.kafka.bootstrap.servers = bootstrap2
a1.sinks.k1.kafka.flumeBatchSize = 100
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 1
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 100
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
Error:
(kafka-producer-network-thread | producer-1) [WARN - org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.handleCompletedMetadataResponse(NetworkClient.java:968)] [Producer clientId=producer-1] Error while fetching metadata with correlation id 85 : {topic1=UNKNOWN_TOPIC_OR_PARTITION}
The warning above repeats continuously.
Error upon terminating the flume-ng command:
(SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:158)] Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Failed to publish events
at org.apache.flume.sink.kafka.KafkaSink.process(KafkaSink.java:268)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flume.EventDeliveryException: Could not send event
at org.apache.flume.sink.kafka.KafkaSink.process(KafkaSink.java:234)
... 3 more
I am seeking help from the Stack Overflow community on:
Which config is wrong here? The Kafka topics exist in their respective clusters. (Options 1 and 2 work fine, and I can see messages flowing from source to sink.)
Why is the producer thread trying to produce to the source Kafka topic?
I encountered the same issue today. My case is even worse because I host two topics on a single Kafka cluster.
It is really misleading that the producer thread in Kafka sink is producing back to the Kafka source topic.
I fixed the issue by setting allowTopicOverride to false for Kafka sink.
Quoting the Kafka sink section of the Flume documentation:
allowTopicOverride: Default is true. When set, the sink will allow a message to be produced into a topic specified by the topicHeader property (if provided).
topicHeader: When set in conjunction with allowTopicOverride will produce a message into the value of the header named using the value of this property. Care should be taken when using in conjunction with the Kafka Source topicHeader property to avoid creating a loopback.
And in Kafka source part:
setTopicHeader: Default is true. When set to true, stores the topic of the retrieved message into a header, defined by the topicHeader property.
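So by default, the Flume Kafka source stores the source topic in the header named by topicHeader for each event, and the Kafka sink by default writes each event to the topic named in that header. Together the two defaults send events back to the source topic name, which is why the producer asks the sink's cluster for topic1.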
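As a sketch against the config from the question (agent a1, source r1, sink k1), the fix is one extra line on the sink. The commented-out source-side line is an alternative suggested by the documentation quotes above, not something the answer tested.

```properties
# Sink side: ignore the per-event topic header and always write to kafka.topic
a1.sinks.k1.allowTopicOverride = false
# Source-side alternative (untested assumption): do not set the topic header at all
# a1.sources.r1.setTopicHeader = false
```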

Http source configuration doesn't work for flume

I am a beginner with Apache Flume. I am trying to pull data from a REST API, take it through Flume, and send it to a Kafka topic, but it is not working so far. The configuration I tried is shown below. There is a test GET API at localhost:8080/kafka/publish/ on the system, and I am trying to get data from it. I took the configuration below from the Flume documentation.
a1.sources = r1
a1.channels = c1
a1.sources.r1.type = http
a1.sources.r1.port = 8080
a1.sources.r1.channels = c1
a1.sources.r1.handler = org.apache.flume.source.http.JSONHandler
a1.sources.r1.handler.nickname = random props
a1.sources.r1.HttpConfiguration.sendServerVersion = false
a1.sources.r1.ServerConnector.idleTimeout = 300
a1.sinks.k1.channel = c1
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = simple
a1.sinks.k1.kafka.bootstrap.servers = localhost:9092
a1.sinks.k1.kafka.flumeBatchSize = 20
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 1
a1.sinks.k1.kafka.producer.compression.type = snappy
Can anyone help me solve this? What is the problem here?
The logs are added below.
2020-12-03 11:16:17,696 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateConfigFilterSet(FlumeConfiguration.java:623)] Agent configuration for 'a1' has no configfilters.
2020-12-03 11:16:17,713 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:373)] Agent configuration for 'a1' does not contain any valid channels. Marking it as invalid.
2020-12-03 11:16:17,714 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:154)] Agent configuration invalid for agent 'a1'. It will be removed.
2020-12-03 11:16:17,715 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:163)] Post-validation flume configuration contains configuration for agents: []
2020-12-03 11:16:17,718 (conf-file-poller-0) [WARN - org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:139)] No configuration found for this host:a1
2020-12-03 11:16:17,730 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:162)] Starting new configuration:{ sourceRunners:{} sinkRunners:{} channels:{} }
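The warning "does not contain any valid channels" is consistent with the config above: c1 is listed under a1.channels but never given a type, and k1 is never listed under a1.sinks. A minimal sketch of the missing lines follows; the memory channel and its sizes are assumptions, not values from the question.

```properties
a1.sources = r1
a1.channels = c1
# The sink must be declared here before its a1.sinks.k1.* properties are used
a1.sinks = k1
# Assumed channel type and sizes; any valid channel type would satisfy the validator
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
```

Note also that the HTTP source listens on its port for events posted to it; it does not poll an existing API, and port 8080 is already used by the test API on the same host.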

File reader configuration doesn't work for Flume

I am new to Flume and was trying my first experiment with it. I am trying to read data from a file using Flume and send it to a Kafka topic.
The configuration is taken from a tutorial website and is shown below.
a1.sources = r1
a1.sinks = sample
a1.channels = sample-channel
a1.sources.r1.type = exec
a1.sources.r1.command = tail -f \data.txt
a1.sources.r1.logStdErr = true
a1.channels.sample-channel.type = memory
a1.channels.sample-channel.capacity = 1000
a1.channels.sample-channel.transactionCapacity = 100
a1.sources.r1.channels = sample-channel
a1.sinks.sample.topic = sample
a1.sinks.sample.brokerList = 127.0.0.1:9092
a1.sinks.sample.requiredAcks = 1
a1.sinks.sample.batchSize = 20
a1.sinks.sample.channel = sample-channel
But this does not do anything. It isn't throwing any errors, only a few warnings. The log is shown below.
2020-12-03 12:01:17,265 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateConfigFilterSet(FlumeConfiguration.java:623)] Agent configuration for 'a1' has no configfilters.
2020-12-03 12:01:17,291 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSinks(FlumeConfiguration.java:884)] Could not configure sink sample due to: Component has no type. Cannot configure. sample
org.apache.flume.conf.ConfigurationException: Component has no type. Cannot configure. sample
at org.apache.flume.conf.ComponentConfiguration.configure(ComponentConfiguration.java:76)
at org.apache.flume.conf.sink.SinkConfiguration.configure(SinkConfiguration.java:44)
at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSinks(FlumeConfiguration.java:867)
at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:383)
at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.access$000(FlumeConfiguration.java:228)
at org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:153)
at org.apache.flume.conf.FlumeConfiguration.<init>(FlumeConfiguration.java:133)
at org.apache.flume.node.PropertiesFileConfigurationProvider.getFlumeConfiguration(PropertiesFileConfigurationProvider.java:194)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:97)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:145)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:835)
How can I solve this?
As the error says, the sink named sample has no type, so Flume doesn't know what kind of sink to create (the source r1 already has type = exec).
You're missing
a1.sinks.sample.type = org.apache.flume.sink.kafka.KafkaSink
Personally, I'd suggest Filebeat or Telegraf over Flume for taking files to Kafka
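The log's "Component has no type" message names the sink sample, so a sink block with an explicit type might look like the sketch below. It uses the kafka.* property names from the other Kafka sink configs on this page; the values are carried over from the question.

```properties
a1.sinks.sample.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.sample.kafka.topic = sample
a1.sinks.sample.kafka.bootstrap.servers = 127.0.0.1:9092
a1.sinks.sample.kafka.producer.acks = 1
a1.sinks.sample.flumeBatchSize = 20
a1.sinks.sample.channel = sample-channel
```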

Telegraf inputs.tail with zimbra.log

I have a question: how can I set up telegraf.conf to collect logs from the zimbra.log file?
I tried the config below, but it does not work.
I want to send these logs to Grafana.
Here is one line from zimbra.log as an example:
Oct 1 10:20:46 webmail postfix/smtp[7677]: BD5BAE9999: to=user#mail.com, relay=mo94.cloud.mail.com[92.97.907.14]:25, delay=0.73, delays=0.09/0.01/0.58/0.19, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 4C25fk2pjFz32N5)
I also do not understand exactly how "grok_patterns =" works.
[[inputs.tail]]
files = ["/var/log/zimbra.log"]
from_beginning = false
grok_patterns = ['%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGHOST} %{DATA:program}(?:\[%{POSINT}\])?: %{GREEDYDATA:message}']
name_override = "zimbra_access_log"
grok_custom_pattern_files = []
grok_custom_patterns = '''
TS_UNIX %{MONTH}%{SPACE}%{MONTHDAY}%{SPACE}%{HOUR}:%{MINUTE}:%{SECOND}
TS_CUSTOM %{MONTH}%{SPACE}%{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND}
'''
grok_timezone = "Local"
data_format = "grok"
I copied your example line into a log file called Prueba.txt, which contains the following lines:
Oct 3 00:52:32 webmail postfix/smtp[7677]: BD5BAE9999: to=user#mail.com, relay=mo94.cloud.mail.com[92.97.907.14]:25, delay=0.73, delays=0.09/0.01/0.58/0.19, dsn=2.0.0, status=sent (250 2.0$
Oct 13 06:25:01 webmail systemd-logind[949]: New session 229478 of user zimbra.
Oct 13 06:25:02 webmail zmconfigd[27437]: Shutting down. Received signal 15
Oct 13 06:25:02 webmail systemd-logind[949]: Removed session c296.
Oct 13 06:25:03 webmail sshd[28005]: Failed password for invalid user julianne from 120.131.2.210 port 10570 ssh2
I was able to parse the data with this configuration of the inputs.tail plugin:
[[inputs.tail]]
files = ["Prueba.txt"]
from_beginning = true
data_format = "grok"
grok_patterns = ['%{TIMESTAMP_ZIMBRA} %{GREEDYDATA:source} %{DATA:program}(?:\[%{POSINT}\])?: %{GREEDYDATA:message}']
grok_custom_patterns = '''
TIMESTAMP_ZIMBRA (\w{3} \d{1,2} \d{2}:\d{2}:\d{2})
'''
name_override = "log_frames"
You need to match the input string with regular expressions. For that there are predefined patterns, such as GREEDYDATA = .*, that you can use to match your input (other examples are NUMBER = (?:%{BASE10NUM}) and BASE16NUM = (?<![0-9A-Fa-f])(?:[+-]?(?:0x)?(?:[0-9A-Fa-f]+))). You can also define your own patterns in grok_custom_patterns. Take a look at this website, which lists some patterns: https://streamsets.com/documentation/datacollector/latest/help/datacollector/UserGuide/Apx-GrokPatterns/GrokPatterns_title.html
In this case I defined a TIMESTAMP_ZIMBRA pattern that matches inputs like Oct 3 00:52:32 and Oct 03 00:52:33.
Here is the metric as collected by Prometheus:
# HELP log_frames_delay Telegraf collected metric
# TYPE log_frames_delay untyped
log_frames_delay{delays="0.09/0.01/0.58/0.19",dsn="2.0.0",host="localhost.localdomain",message="BD5BAE9999:",path="Prueba.txt",program="postfix/smtp",relay="mo94.cloud.mail.com[92.97.907.14]:25",source="webmail",status="sent (250 2.0.0 Ok: queued as 4C25fk2pjFz32N5)",to="user#mail.com"} 0.73
P.S.: Ensure that Telegraf has access to the log files.

org.apache.kafka.common.errors.RecordTooLargeException in Flume Kafka Sink

I am reading data from a JMS source and pushing it into a Kafka topic. After a few hours I observed that the rate of writes to the Kafka topic dropped to almost zero, and after some initial analysis I found the following exception in the Flume logs.
28 Feb 2017 16:35:44,758 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.SinkRunner$PollingRunner.run:158) - Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Failed to publish events
at org.apache.flume.sink.kafka.KafkaSink.process(KafkaSink.java:252)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.RecordTooLargeException: The message is 1399305 bytes when serialized which is larger than the maximum request size you have configured with the max.request.size configuration.
at org.apache.kafka.clients.producer.KafkaProducer$FutureFailure.<init>(KafkaProducer.java:686)
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:449)
at org.apache.flume.sink.kafka.KafkaSink.process(KafkaSink.java:212)
... 3 more
Caused by: org.apache.kafka.common.errors.RecordTooLargeException: The message is 1399305 bytes when serialized which is larger than the maximum request size you have configured with the max.request.size configuration.
My Flume logs show the current value of max.request.size as 1048576, which is clearly less than 1399305. Increasing max.request.size may eliminate these exceptions, but I am unable to find the correct place to update that value.
My flume.config:
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.channels.c1.type = file
a1.channels.c1.transactionCapacity = 1000
a1.channels.c1.capacity = 100000000
a1.channels.c1.checkpointDir = /data/flume/apache-flume-1.7.0-bin/checkpoint
a1.channels.c1.dataDirs = /data/flume/apache-flume-1.7.0-bin/data
a1.sources.r1.type = jms
a1.sources.r1.interceptors.i1.type = timestamp
a1.sources.r1.interceptors.i1.preserveExisting = true
a1.sources.r1.channels = c1
a1.sources.r1.initialContextFactory = some context urls
a1.sources.r1.connectionFactory = some_queue
a1.sources.r1.providerURL = some_url
#a1.sources.r1.providerURL = some_url
a1.sources.r1.destinationType = QUEUE
a1.sources.r1.destinationName = some_queue_name
a1.sources.r1.userName = some_user
a1.sources.r1.passwordFile= passwd
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = some_kafka_topic
a1.sinks.k1.kafka.bootstrap.servers = some_URL
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.flumeBatchSize = 1
a1.sinks.k1.channel = c1
Any help will be really appreciated !!
This change has to be done at Kafka.
Update the Kafka producer configuration file producer.properties with a larger value like
max.request.size=10000000
It seems I have resolved my issue. As suspected, increasing max.request.size eliminated the exception. To update Kafka sink (producer) properties like this one, Flume provides the constant prefix kafka.producer., and any Kafka producer property can be appended to that prefix.
So mine goes as: a1.sinks.k1.kafka.producer.max.request.size = 5271988
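Written as a fragment of the sink config from the question, that looks like the sketch below. The value is the one reported above; whether it is large enough depends on the biggest JMS message, which is not known here.

```properties
# Anything after the kafka.producer. prefix is handed to the Kafka producer
a1.sinks.k1.kafka.producer.max.request.size = 5271988
```

If the broker then rejects the larger records, the broker-side message.max.bytes (or the topic-level max.message.bytes) would probably need raising as well; the broker settings are not shown in the question, so this is an assumption about the cluster.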