Working with Kafka version 2.0.1 & kafka-streams-scala version 2.0.1.
DEBUG log messages such as:
DEBUG 2019-05-08 09:57:53,322 [he.kafka.clients.NetworkClient] [ ] [ ]: [Consumer clientId=XXX-bd6b071d-a44f-4253-a3a5-539d60a72dd3-StreamThread-1-consumer, groupId=XXX] Disconnecting from node YYY due to request timeout.
led me to increase the request.timeout.ms value:
private val config: Properties = new Properties
config.put(StreamsConfig.REQUEST_TIMEOUT_MS_CONFIG, "240000")
...
private val streams: KafkaStreams = new KafkaStreams(topology, config)
However, this sets the new value of 240000ms for both the AdminClientConfig and the ConsumerConfig (the default request.timeout.ms values for AdminClientConfig and ConsumerConfig are actually different: 120000ms and 40000ms respectively).
Is there any way to set the Kafka Streams configuration values for either the AdminClientConfig or the ConsumerConfig without overriding the values of both?
You can prefix any config with consumer. or admin. to apply it to only one client.
There are also main.consumer., restore.consumer., and global.consumer. prefixes to distinguish the different consumers further; using the consumer. prefix applies the config to all consumers.
Finally, there is also a producer. prefix (mentioned just for completeness).
Compare the docs: https://docs.confluent.io/current/streams/developer-guide/config-streams.html#naming
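For example, a minimal sketch of the prefixed overrides, assuming the StreamsConfig prefix helper methods (consumerPrefix, adminClientPrefix, mainConsumerPrefix); the literal string form of each prefix is shown in the comments:

Properties config = new Properties();

// consumer-only override (applies to all consumers); literal form: "consumer.request.timeout.ms"
config.put(StreamsConfig.consumerPrefix(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG), "240000");

// admin-client-only override; literal form: "admin.request.timeout.ms"
config.put(StreamsConfig.adminClientPrefix(AdminClientConfig.REQUEST_TIMEOUT_MS_CONFIG), "120000");

// or target only the main consumer; literal form: "main.consumer.request.timeout.ms"
config.put(StreamsConfig.mainConsumerPrefix(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG), "240000");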
We got strange errors from Kafka Streams while starting the app:
java.lang.IllegalArgumentException: Illegal base64 character 7b
at java.base/java.util.Base64$Decoder.decode0(Base64.java:743)
at java.base/java.util.Base64$Decoder.decode(Base64.java:535)
at java.base/java.util.Base64$Decoder.decode(Base64.java:558)
at org.apache.kafka.streams.processor.internals.StreamTask.decodeTimestamp(StreamTask.java:985)
at org.apache.kafka.streams.processor.internals.StreamTask.initializeTaskTime(StreamTask.java:303)
at org.apache.kafka.streams.processor.internals.StreamTask.initializeMetadata(StreamTask.java:265)
at org.apache.kafka.streams.processor.internals.AssignedTasks.initializeNewTasks(AssignedTasks.java:71)
at org.apache.kafka.streams.processor.internals.TaskManager.updateNewAndRestoringTasks(TaskManager.java:385)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:769)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:698)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:671)
and, as a result, an error about the failed stream: ERROR KafkaStreams - stream-client [xxx] All stream threads have died. The instance will be in error state and should be closed.
According to the code inside org.apache.kafka.streams.processor.internals.StreamTask, the failure happened due to an error while decoding the timestamp metadata (StreamTask.decodeTimestamp()). It happened in production, and we can't reproduce it on staging.
What could be the root cause of such errors?
Extra info: our app uses Kafka Streams and consumes messages from several Kafka brokers using the same application.id and state.dir (actually we switched from one broker to another, but for some period we were connected to both brokers, so we had two Kafka Streams instances, one per broker). As I understand it, the consumer group lives on the broker side (so it shouldn't be a problem), but the state dir is on the client side. Maybe some race condition occurred because the same state.dir was used by two Kafka Streams instances? Could that be the root cause?
We use kafka-streams v.2.4.0, kafka-clients v.2.4.0, Kafka Broker v.1.1.1, with the following configs:
default.key.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
default.value.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
default.timestamp.extractor: org.apache.kafka.streams.processor.WallclockTimestampExtractor
default.deserialization.exception.handler: org.apache.kafka.streams.errors.LogAndContinueExceptionHandler
commit.interval.ms: 5000
num.stream.threads: 1
auto.offset.reset: latest
Finally, we figured out the root cause of the corrupted metadata in some consumer groups.
It was one of our internal monitoring tools (written with pykafka) that corrupted the metadata of temporarily inactive consumer groups.
The metadata was plain, unencoded text and contained invalid data like the following: {"consumer_id": "", "hostname": "monitoring-xxx"}.
In order to see what exactly we had in the consumer metadata, we could use the following code:
Map<String, Object> config = Map.of("group.id", "...", "bootstrap.servers", "...");
String topicName = "...";

Consumer<byte[], byte[]> kafkaConsumer =
        new KafkaConsumer<>(config, new ByteArrayDeserializer(), new ByteArrayDeserializer());

Set<TopicPartition> topicPartitions = kafkaConsumer.partitionsFor(topicName).stream()
        .map(partitionInfo -> new TopicPartition(topicName, partitionInfo.partition()))
        .collect(Collectors.toSet());

kafkaConsumer.committed(topicPartitions).forEach((key, value) ->
        System.out.println("Partition: " + key + " metadata: " + (value != null ? value.metadata() : null)));
Several options to fix already corrupted metadata:
Change the consumer group to a new one. Be careful: you might lose or duplicate messages depending on the latest or earliest offset reset policy, so for some cases this option might not be acceptable.
Overwrite the metadata manually (the timestamp is encoded according to the logic inside StreamTask.decodeTimestamp()):
Map<TopicPartition, OffsetAndMetadata> updatedTopicPartitionToOffsetMetadataMap =
        kafkaConsumer.committed(topicPartitions).entrySet().stream()
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        entry -> new OffsetAndMetadata(entry.getValue().offset(), "AQAAAXGhcf01")));
kafkaConsumer.commitSync(updatedTopicPartitionToOffsetMetadataMap);
Or specify the metadata as Af////////// which means NO_TIMESTAMP in Kafka Streams.
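For reference, a minimal sketch of how such a metadata string can be produced, assuming the encoding that StreamTask.decodeTimestamp() reads (a one-byte version magic followed by an 8-byte big-endian timestamp, where -1 means NO_TIMESTAMP); the class name is illustrative:

import java.nio.ByteBuffer;
import java.util.Base64;

public class TimestampMetadata {
    // One version byte (0x01) followed by the timestamp as an 8-byte big-endian long,
    // Base64-encoded, is what the commit metadata appears to contain.
    static String encode(long timestampMs) {
        ByteBuffer buffer = ByteBuffer.allocate(9);
        buffer.put((byte) 1);
        buffer.putLong(timestampMs);
        return Base64.getEncoder().encodeToString(buffer.array());
    }

    public static void main(String[] args) {
        System.out.println(encode(-1L)); // prints Af//////////, i.e. NO_TIMESTAMP
    }
}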
I have a Spring Boot based microservice, and I use Kafka to produce/consume data from different systems.
Now my question: I have two different topics, and based on the topic I have two different consumer classes to consume the data.
How do I define multiple consumer properties in the application.yml file?
I configured one consumer in application.yml like below:
spring:
  kafka:
    consumer:
      bootstrapservers: http://199.968.98.101:9092
      group-id: groupid-QA-02
      auto-offset-reset: latest
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
I am using @KafkaListener in my consumer classes.
Example of a consumer method I used in the code:
@KafkaListener(topics = "${app.topic.b2b_tf_ta_req}", groupId = "${app.topic.groupoId}")
public void consume(String message) throws Exception {
}
As far as I know, bootstrap-servers accepts a comma-separated list of servers,
i.e. if you set it to server1:9092,server2:9092, Kafka should connect to all of them.
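If the yml-level defaults are not enough for two separately configured consumer classes, one common approach (a sketch, not from the original post; the bean names, second group id, and broker address are assumptions) is to declare a second ConsumerFactory and listener container factory in Java config and point the second listener at it:

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;

@Configuration
public class SecondConsumerConfig {

    @Bean
    public ConsumerFactory<String, String> secondConsumerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "199.968.98.101:9092"); // assumed broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "groupid-QA-03");                // assumed second group id
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        return new DefaultKafkaConsumerFactory<>(props);
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> secondKafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(secondConsumerFactory());
        return factory;
    }
}

The second listener then selects it explicitly, e.g. @KafkaListener(topics = "${app.topic.other}", containerFactory = "secondKafkaListenerContainerFactory"), while the first listener keeps using the defaults from application.yml.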
I can send and receive messages on the command line against a local Kafka installation. I can also send messages through Java code, and those messages show up in a Kafka command prompt. I also have Java code for the Kafka consumer. The code received messages yesterday. This morning, however, it doesn't receive any messages. The code has not been changed. I am wondering whether the property configuration isn't quite right or not. Here is my configuration:
The Producer:
bootstrap.servers - localhost:9092
group.id - test
key.serializer - StringSerializer.class.getName()
value.serializer - StringSerializer.class.getName()
and the ProducerRecord is set as
ProducerRecord<String, String>("test", "mykey", "myvalue")
The Consumer:
zookeeper.connect - "localhost:2181"
group.id - "test"
zookeeper.session.timeout.ms - 500
zookeeper.sync.time.ms - 250
auto.commit.interval.ms - 1000
key.deserializer - org.apache.kafka.common.serialization.StringDeserializer
value.deserializer - org.apache.kafka.common.serialization.StringDeserializer
and for Java code:
Map<String, Integer> topicCount = new HashMap<>();
topicCount.put("test", 1);
Map<String, List<KafkaStream<byte[], byte[]>>> consumerStreams = consumer
.createMessageStreams(topicCount);
List<KafkaStream<byte[], byte[]>> streams = consumerStreams.get(topic);
What is missing?
A number of things could be going on.
First, your consumer's ZooKeeper session timeout is very low, which means the consumer may be experiencing many "soft failures" due to garbage collection pauses. When this happens, the consumer group will rebalance, which can pause consumption. And if this is happening very frequently, the consumer could get into a state where it never consumes messages because it's constantly being rebalanced. I suggest increasing the ZooKeeper session timeout to 30 seconds to see if this resolves the issue. If so, you can experiment with setting it lower.
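A minimal sketch of what that change could look like with the old high-level consumer config (the 30-second value is a suggestion, not from the original post):

Properties props = new Properties();
props.put("zookeeper.connect", "localhost:2181");
props.put("group.id", "test");
props.put("zookeeper.session.timeout.ms", "30000"); // was 500 in the question
props.put("zookeeper.sync.time.ms", "250");
props.put("auto.commit.interval.ms", "1000");
ConsumerConfig config = new ConsumerConfig(props); // kafka.consumer.ConsumerConfig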
Second, can you confirm new messages are being produced to the "test" topic? Your consumer will only consume new messages that it hasn't committed yet. It's possible the topic doesn't have any new messages.
Third, do you have other consumers in the same consumer group that could be processing the messages? If one consumer is experiencing frequent soft failures, other consumers will be assigned its partitions.
Finally, you're using the "old" consumer, which will eventually be removed. If possible, I suggest moving to the "new" consumer (KafkaConsumer.java), which was made available in Kafka 0.9, although I can't promise this will resolve your issue.
Hope this helps.
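For reference, a minimal sketch of the new consumer using the same topic and group id as in the question (the auto.offset.reset choice and class name are assumptions):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class NewConsumerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "test");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("auto.offset.reset", "earliest"); // read from the start if the group has no committed offset

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d, key=%s, value=%s%n",
                            record.offset(), record.key(), record.value());
                }
            }
        }
    }
}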
I'm trying to build a pipeline with Apache Flume:
spooldir -> kafka channel -> hdfs sink
Events go to the Kafka topic without problems and I can see them with a kafkacat request. But the Kafka channel can't write files to HDFS via the sink. The error is:
Timed out while waiting for data to come from Kafka
Full log:
2016-02-26 18:25:17,125 (SinkRunner-PollingRunner-DefaultSinkProcessor-SendThread(zoo02:2181)) [DEBUG - org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:717)] Got ping response for sessionid: 0x2524a81676d02aa after 0ms
2016-02-26 18:25:19,127 (SinkRunner-PollingRunner-DefaultSinkProcessor-SendThread(zoo02:2181)) [DEBUG - org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:717)] Got ping response for sessionid: 0x2524a81676d02aa after 1ms
2016-02-26 18:25:21,129 (SinkRunner-PollingRunner-DefaultSinkProcessor-SendThread(zoo02:2181)) [DEBUG - org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:717)] Got ping response for sessionid: 0x2524a81676d02aa after 0ms
2016-02-26 18:25:21,775 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.apache.flume.channel.kafka.KafkaChannel$KafkaTransaction.doTake(KafkaChannel.java:327)] Timed out while waiting for data to come from Kafka
kafka.consumer.ConsumerTimeoutException
    at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:69)
    at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:33)
    at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
    at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
    at org.apache.flume.channel.kafka.KafkaChannel$KafkaTransaction.doTake(KafkaChannel.java:306)
    at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
    at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:95)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:374)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:745)
My Flume config is:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c2
# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /home/alex/spoolFlume
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://10.12.0.1:54310/logs/flumetest/
a1.sinks.k1.hdfs.filePrefix = flume-
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text
a1.channels.c2.type = org.apache.flume.channel.kafka.KafkaChannel
a1.channels.c2.capacity = 10000
a1.channels.c2.transactionCapacity = 1000
a1.channels.c2.brokerList=kafka10:9092,kafka11:9092,kafka12:9092
a1.channels.c2.topic=flume_test_001
a1.channels.c2.zookeeperConnect=zoo00:2181,zoo01:2181,zoo02:2181
# Bind the source and sink to the channel
a1.sources.r1.channels = c2
a1.sinks.k1.channel = c2
With a memory channel instead of the Kafka channel everything works fine.
Thanks for any ideas in advance!
ConsumerTimeoutException means there was no new message for a long time; it doesn't mean a connection timeout to Kafka.
http://kafka.apache.org/documentation.html
consumer.timeout.ms (default: -1): Throw a timeout exception to the consumer if no message is available for consumption after the specified interval
Kafka's ConsumerConfig class has the "consumer.timeout.ms" configuration property, which Kafka sets by default to -1. Any new Kafka Consumer is expected to override the property with a suitable value.
Below is a reference from the Kafka documentation:
consumer.timeout.ms -1
By default, this value is -1 and a consumer blocks indefinitely if no new message is available for consumption. By setting the value to a positive integer, a timeout exception is thrown to the consumer if no message is available for consumption after the specified timeout value.
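For a standalone (non-Flume) consumer using the old high-level API, overriding this default would look roughly like the sketch below (the values are assumptions, not from the original post):

Properties props = new Properties();
props.put("zookeeper.connect", "zookeeper:2181");
props.put("group.id", "flume");
props.put("consumer.timeout.ms", "5000"); // throw ConsumerTimeoutException after 5s without messages
ConsumerConfig config = new ConsumerConfig(props); // kafka.consumer.ConsumerConfig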
When Flume creates a Kafka channel, it sets consumer.timeout.ms to 100, as seen in the Flume logs at the INFO level. That explains why we see a ton of these ConsumerTimeoutExceptions.
level: INFO Post-validation flume configuration contains configuration for agents: [agent]
level: INFO Creating channels
level: DEBUG Channel type org.apache.flume.channel.kafka.KafkaChannel is a custom type
level: INFO Creating instance of channel c1 type org.apache.flume.channel.kafka.KafkaChannel
level: DEBUG Channel type org.apache.flume.channel.kafka.KafkaChannel is a custom type
level: INFO Group ID was not specified. Using flume as the group id.
level: INFO {metadata.broker.list=kafka:9092, request.required.acks=-1, group.id=flume, zookeeper.connect=zookeeper:2181, consumer.timeout.ms=100, auto.commit.enable=false}
level: INFO Created channel c1
Going by the Flume user guide on Kafka channel settings, I tried to override this value by specifying the below, but that doesn't seem to work:
agent.channels.c1.kafka.consumer.timeout.ms=5000
Also, we did a load test, pounding data through the channels constantly, and this exception didn't occur during the tests.
I read Flume's source code and found that Flume reads the value of the key "timeout" for "consumer.timeout.ms".
So you can configure the value for "consumer.timeout.ms" like this:
agent1.channels.kafka_channel.timeout=-1
I installed Kafka on my server and want to learn how to use it.
I found sample code written in Scala; below is part of it:
def createConsumerConfig(zookeeper: String, groupId: String): ConsumerConfig = {
  val props = new Properties()
  props.put("zookeeper.connect", zookeeper)
  props.put("group.id", groupId)
  props.put("auto.offset.reset", "largest")
  props.put("zookeeper.session.timeout.ms", "400")
  props.put("zookeeper.sync.time.ms", "200")
  props.put("auto.commit.interval.ms", "1000")
  val config = new ConsumerConfig(props)
  config
}
But I don't know how to find the group id on my server.
The group id is something you define yourself for your consumer by providing a string id for it. All consumers started with the same id will "cooperate" and read topics in a coordinated way, where each consumer instance handles a subset of the messages in a topic. Providing a non-existent group id is treated as a new consumer group and creates a new entry in ZooKeeper where committed offsets will be stored.
You could get a Zookeeper shell and list path where Kafka stores consumers' offsets like this:
./bin/zookeeper-shell.sh localhost:2181
ls /consumers
You'll get a list of all groups.
EDIT: I missed the part where you said that you're setting this up yourself so I thought that you want to list the consumer groups of an existing cluster.
Lundahl is right, this is a property that you define, which is used to coordinate consumer threads so that they don't consume "each other's" messages (each consumes a subset). If you, for example, use 2 consumers with different groups, they'll each consume the whole topic.
/kafkadir/kafka-consumer-groups.sh --all-topics --bootstrap-server hostname:port --list
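As an alternative to the shell tools, a hedged sketch that lists consumer groups programmatically via the AdminClient API (available in newer Kafka clients; it only shows groups whose offsets are stored in Kafka, not old ZooKeeper-based consumers; the broker address is a placeholder):

import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

public class ListGroups {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "hostname:port"); // placeholder broker address
        try (AdminClient adminClient = AdminClient.create(props)) {
            // Print the id of every consumer group known to the brokers
            adminClient.listConsumerGroups().all().get()
                    .forEach(listing -> System.out.println(listing.groupId()));
        }
    }
}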