How can you set the max.message.bytes of a state store changelog topic? - apache-kafka

I have a Kafka Streams application with messages up to 10MiB. I want to persist these messages in a state store, but Kafka Streams fails to produce to the internal changelog topic:
2017-11-17 08:36:19,792 ERROR RecordCollectorImpl - task [4_5] Error sending record to topic appid-statestorename-state-store-changelog. No more offsets will be recorded for this task and the exception will eventually be thrown
org.apache.kafka.common.errors.RecordTooLargeException: The request included a message larger than the max message size the server will accept.
2017-11-17 08:36:20,583 ERROR StreamThread - stream-thread [StreamThread-1] Failed while executing StreamTask 4_5 due to flush state:
By adding some logging, it looks like the default max.message.bytes setting of an internal topic is 1MiB.
The default max.message.bytes for the cluster is set to 50MiB.
Is it possible to tweak the configuration of internal topics of Kafka Streams applications?
A work-around is to start the streams application, let it create the topics, and afterwards alter the topic config. But this feels like a dirty hack.
./kafka-topics.sh --zookeeper ... \
--alter --topic appid-statestorename-state-store-changelog \
--config max.message.bytes=10485760

Kafka 1.0 allows to specify custom topic properties for internal topics via StreamsConfig.
You prefix those configs with "topic." and can use any configs as defined in TopicConfig.
See the original KIP for more details:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-173%3A+Add+prefix+to+StreamsConfig+to+enable+setting+default+internal+topic+configs

Related

Error while fetching metadata kafka {test=LEADER_NOT_AVAILABLE}?

I get the error after running those simple commands -
I started the Zookeeper and the Kafka servers,
I execute the command:
./kafka-topics --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
and execute the command:
./kafka-console-producer --broker-list localhost:9092 --topic test
I obtain a list of WARN like:
[2019-12-08 21:36:13,024] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 37 : {test=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
what did I do wrong?
Thanks
If your broker has the auto.create.topics.enable set to true, then this error will be transient and you should be able to produce message without any further error.
It happens just because the producer is asking for metadata about the topic it wants to write to but that topic doesn't exist in the cluster and the partition leader (where the producer wants to write) doesn't exist yet.
If you retry, the broker will create the topic and the command will work fine.
If the above configuration is set to false, then the broker doesn't create the topic automatically on the first request from a client so you have to create it upfront.
Finally, but it's not your case, the above error could even happen when the topic exist but, for example, the broker which is leader for the specific topic partition is down and a new leader election is in progress.
I wanted to add a comment, but its seems I can't. Just go through this link. Someone had a similar problem and it seems the problem is not what you have done but can be somethings different.
Link: https://grokbase.com/t/kafka/users/134qvay38q/leadernotavailable-exception

Cannot Restart Kafka Consumer Application, Failing due to OffsetOutOfRangeException

Currently, my Kafka Consumer streaming application is manually committing the offsets into Kafka with enable.auto.commit set to false.
The application failed when I tried restarting it throwing below exception:
org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets out of range with no configured reset policy for partitions:{partition-12=155555555}
Assuming the above error is due to the message not present/partition deleted due to retention period, I tried below method:
I disabled the manual commit and enabled auto commit(enable.auto.commit=true and auto.offset.reset=earliest)
Still it fails with the same error
org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets out of range with no configured reset policy for partitions:{partition-12=155555555}
Please suggest ways to restart the job so that it can successfully read the correct offset for which message/partition is present
You are trying to read offset 155555555 from partition 12 of topic partition, but -most probably- it might have already been deleted due to your retention policy.
You can either use Kafka Streams Application Reset Tool in order to reset your Kafka Streams application's internal state, such that it can reprocess its input data from scratch
$ bin/kafka-streams-application-reset.sh
Option (* = required) Description
--------------------- -----------
* --application-id <id> The Kafka Streams application ID (application.id)
--bootstrap-servers <urls> Comma-separated list of broker urls with format: HOST1:PORT1,HOST2:PORT2
(default: localhost:9092)
--intermediate-topics <list> Comma-separated list of intermediate user topics
--input-topics <list> Comma-separated list of user input topics
--zookeeper <url> Format: HOST:POST
(default: localhost:2181)
or start your consumer using a fresh consumer group ID.
I met the same problem and I use package org.apache.spark.streaming.kafka010 in my application.In the begining,I suscepted the auto.offset.reset strategy take no effect,but when I read the description of the method fixKafkaParams in the object KafkaUtils,i found the configuration has been overwrited.I guess the reason why it tweak the configuration ConsumerConfig.AUTO_OFFSET_RESET_CONFIG for executor is to keep consistent offset obtained by driver and executor.

Setting log.retentions.hours for broker in Kafka 0.10.2.x

I am trying to set log.retenton.hours for broker level configuration for kafka 0.10.2x. But I am getting this error for below command.
kafka-configs.sh --zookeeper zookeeper:2181 --entity-type brokers --entity-name 0 --alter --add-config log.retention.hours=-1
Error while executing config command requirement failed: Unknown Dynamic Configuration 'log.retention.hours'.
java.lang.IllegalArgumentException: requirement failed: Unknown Dynamic Configuration 'log.retention.hours'.
at scala.Predef$.require(Predef.scala:277)
at kafka.server.DynamicConfig$.$anonfun$validate$1(DynamicConfig.scala:101)
at kafka.server.DynamicConfig$.$anonfun$validate$1$adapted(DynamicConfig.scala:100)
at scala.collection.Iterator.foreach(Iterator.scala:929)
at scala.collection.Iterator.foreach$(Iterator.scala:929)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1406)
at kafka.server.DynamicConfig$.kafka$server$DynamicConfig$$validate(DynamicConfig.scala:100)
at kafka.server.DynamicConfig$Broker$.validate(DynamicConfig.scala:59)
at kafka.admin.AdminUtils$.changeBrokerConfig(AdminUtils.scala:555)
at kafka.admin.ConfigCommand$.alterConfig(ConfigCommand.scala:105)
at kafka.admin.ConfigCommand$.main(ConfigCommand.scala:68)
at kafka.admin.ConfigCommand.main(ConfigCommand.scala)
As the error says, that property is not dynamic (cannot be modified while the broker is running)
Plus, that feature shouldn't be possible with your version
From Kafka version 1.1 onwards, some of the broker configs can be updated without restarting the broker
You can set retention per topic level, otherwise, you need to edit the server.properties file of every broker and gracefully reboot them
I'm sure you have a good reason for "disabling" retention, but I would suggest trying compacted topics first
log.retention.hours is read-only property at broker level, so it can't be changed using kafka-config.sh dynamically.
Change it in server.properties and restart the brokers.
Here are the details for readonly or dynamic broker config.
https://kafka.apache.org/documentation/#dynamicbrokerconfigs

Kafka - how to use #KafkaListener(topicPattern="${kafka.topics}") where property kafka.topics is 'sss.*'?

I'm trying to implement Kafka consumer with topic names as a pattern. E.g. #KafkaListener(topicPattern="${kafka.topics}") where property kafka.topics is 'sss.*'. Now when I send message to topic 'sss.test' or any other topic name like 'sss.xyz', 'sss.pqr', it's throwing error as below:
WARN o.apache.kafka.clients.NetworkClient - Error while fetching metadata with correlation id 12 : {sss.xyz-topic=LEADER_NOT_AVAILABLE}
I tried to enable listeners & advertised.listeners in the server.properties file but when I re-start Kafka it consumes messages from all old topics which were tried. The moment I use new topic name, it throws above error.
Kafka doesn't support pattern matching? Or there's some configuration which I'm missing? Please suggest.

Kafka unrecoverable if broker dies

We have a kafka cluster with three brokers (node ids 0,1,2) and a zookeeper setup with three nodes.
We created a topic "test" on this cluster with 20 partitions and replication factor 2. We are using Java producer API to send messages to this topic. One of the kafka broker intermittently goes down after which it is unrecoverable. To simulate the case, we killed one of the broker manually. As per the kafka arch, it is supposed to self recover, but which is not happening. When I describe the topic on the console, I see the number of ISR's reduced to one for few of the partitions as one of the broker killed. Now, whenever we are trying to push messages via the producer API (either Java client or console producer), we are encountering SocketTimeoutException.. One quick look into the logs says, "Unable to fetch the metadata"
WARN [2015-07-01 22:55:07,590] [ReplicaFetcherThread-0-3][] kafka.server.ReplicaFetcherThread - [ReplicaFetcherThread-0-3],
Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 23711; ClientId: ReplicaFetcherThread-0-3;
ReplicaId: 0; MaxWait: 500 ms; MinBytes: 1 bytes; RequestInfo: [zuluDelta,2] -> PartitionFetchInfo(11409,1048576),[zuluDelta,14] -> PartitionFetchInfo(11483,1048576).
Possible cause: java.nio.channels.ClosedChannelException
[2015-07-01 23:37:40,426] WARN Fetching topic metadata with correlation id 0 for topics [Set(test)] from broker [id:1,host:abc-0042.yy.xxx.com,port:9092] failed (kafka.client.ClientUtils$)
java.net.SocketTimeoutException
at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:201)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:86)
at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:221)
at kafka.utils.Utils$.read(Utils.scala:380)
at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
at kafka.network.BlockingChannel.receive(BlockingChannel.scala:111)
at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:75)
at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:72)
at kafka.producer.SyncProducer.send(SyncProducer.scala:113)
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:58)
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:93)
at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
Any leads will be appreciated...
From your error Unable to fetch metadata it could mostly be because you could have set the bootstrap.servers in the producer to the broker that has died.
Ideally, you must have more than one broker in the bootstrap.servers list because if one of the broker fails (or is unreachable) then the other could give you the metadata.
FYI: Metadata is the information about a particular topic that tells how many number of partitions it has, their leader brokers, follower brokers etc.
So, when a key is produced to a partition, its corresponding leader broker will be the one to whom the messages will be sent to.
From your question, your ISR set has only one broker. You could try setting the bootstrap.server to this broker.