Reactor Kafka: keep getting TimeoutException while producing a message - apache-kafka

I keep getting the following error in the log when trying to produce a message to Kafka:
reactor.core.Exceptions$ErrorCallbackNotImplemented: org.apache.kafka.common.errors.TimeoutException: Topic topic not present in metadata after 60000 ms.
Caused by: org.apache.kafka.common.errors.TimeoutException: Topic topic not present in metadata after 60000 ms.
I have already made sure that the Jackson Core, Jackson Databind and Kafka clients dependencies are present in the producer project. Also, how do I pass the security protocol in Reactor Kafka SenderOptions?

Topic topic not present in metadata after 60000 ms.
You have to create the topic before you can use it, either with the command-line tools or with an AdminClient.
You can set any ProducerConfig property, including security.protocol, in the map passed into SenderOptions.create():
/**
 * Creates a sender options instance with the specified config overrides for the underlying
 * Kafka {@link Producer}.
 * @return new instance of sender options
 */
@NonNull
static <K, V> SenderOptions<K, V> create(@NonNull Map<String, Object> configProperties) {
    return new ImmutableSenderOptions<>(configProperties);
}
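As a minimal sketch of that second point, the security protocol is just one more entry in the config map. The snippet below uses only the JDK so that it runs on its own. The class name SenderProps, the build method, the broker address and the SASL_SSL value are placeholders made up for illustration. The call to SenderOptions.create(...) is shown as a comment because it needs reactor-kafka on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Reactor Kafka: it only assembles the config map.
final class SenderProps {

    // Builds the producer config map that would be handed to SenderOptions.create(...).
    static Map<String, Object> build(String bootstrapServers, String securityProtocol) {
        Map<String, Object> props = new HashMap<>();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Same key as ProducerConfig.SECURITY_PROTOCOL_CONFIG.
        props.put("security.protocol", securityProtocol);
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values; replace with the real broker address and protocol.
        Map<String, Object> props = build("localhost:9092", "SASL_SSL");
        // With reactor-kafka available, the map is passed straight through:
        // SenderOptions<String, String> options = SenderOptions.create(props);
        System.out.println(props);
    }
}
```

Depending on the protocol, the broker will likely need further entries in the same map (for example sasl.mechanism and sasl.jaas.config for SASL, or the ssl.truststore.* settings for SSL).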

Related

Kafka streams fail on decoding timestamp metadata inside StreamTask

We got strange errors from Kafka Streams while starting the app:
java.lang.IllegalArgumentException: Illegal base64 character 7b
at java.base/java.util.Base64$Decoder.decode0(Base64.java:743)
at java.base/java.util.Base64$Decoder.decode(Base64.java:535)
at java.base/java.util.Base64$Decoder.decode(Base64.java:558)
at org.apache.kafka.streams.processor.internals.StreamTask.decodeTimestamp(StreamTask.java:985)
at org.apache.kafka.streams.processor.internals.StreamTask.initializeTaskTime(StreamTask.java:303)
at org.apache.kafka.streams.processor.internals.StreamTask.initializeMetadata(StreamTask.java:265)
at org.apache.kafka.streams.processor.internals.AssignedTasks.initializeNewTasks(AssignedTasks.java:71)
at org.apache.kafka.streams.processor.internals.TaskManager.updateNewAndRestoringTasks(TaskManager.java:385)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:769)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:698)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:671)
and, as a result, an error about the failed stream: ERROR KafkaStreams - stream-client [xxx] All stream threads have died. The instance will be in error state and should be closed.
According to the code inside org.apache.kafka.streams.processor.internals.StreamTask, the failure happened while decoding the timestamp metadata (StreamTask.decodeTimestamp()). It happened on prod, and we can't reproduce it on stage.
What could be the root cause of such errors?
Extra info: our app uses Kafka Streams and consumes messages from several Kafka brokers using the same application.id and state.dir (we are actually switching from one broker to another, but for some period we were connected to both, so we have two Kafka Streams instances, one per broker). As I understand it, the consumer group lives on the broker side (so it shouldn't be a problem), but the state dir is on the client side. Maybe a race condition occurred because the two Kafka Streams instances use the same state.dir? Could that be the root cause?
We use kafka-streams v.2.4.0, kafka-clients v.2.4.0, Kafka Broker v.1.1.1, with the following configs:
default.key.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
default.value.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
default.timestamp.extractor: org.apache.kafka.streams.processor.WallclockTimestampExtractor
default.deserialization.exception.handler: org.apache.kafka.streams.errors.LogAndContinueExceptionHandler
commit.interval.ms: 5000
num.stream.threads: 1
auto.offset.reset: latest
Finally, we figured out the root cause of the corrupted metadata in some consumer groups.
One of our internal monitoring tools (written with pykafka) was corrupting the metadata of temporarily inactive consumer groups.
The metadata was unencrypted (not the Base64 string that StreamTask.decodeTimestamp() expects) and contained invalid data like the following: {"consumer_id": "", "hostname": "monitoring-xxx"}.
To see exactly what is stored in the consumer metadata, you can use the following code:
Map<String, Object> config = Map.of("group.id", "...", "bootstrap.servers", "...");
String topicName = "...";
Consumer<byte[], byte[]> kafkaConsumer = new KafkaConsumer<>(config, new ByteArrayDeserializer(), new ByteArrayDeserializer());
Set<TopicPartition> topicPartitions = kafkaConsumer.partitionsFor(topicName).stream()
        .map(partitionInfo -> new TopicPartition(topicName, partitionInfo.partition()))
        .collect(Collectors.toSet());
kafkaConsumer.committed(topicPartitions).forEach((key, value) ->
        System.out.println("Partition: " + key + " metadata: " + (value != null ? value.metadata() : null)));
There are several options for fixing metadata that is already corrupted:
1. Change the consumer group to a new one. Be aware that you might lose or duplicate messages, depending on whether the offset reset policy is latest or earliest, so in some cases this option might not be acceptable.
2. Overwrite the metadata manually (the timestamp is encoded according to the logic that StreamTask.decodeTimestamp() reverses):
Map<TopicPartition, OffsetAndMetadata> updatedTopicPartitionToOffsetMetadataMap = kafkaConsumer.committed(topicPartitions).entrySet().stream()
        .collect(Collectors.toMap(Map.Entry::getKey, entry -> new OffsetAndMetadata(entry.getValue().offset(), "AQAAAXGhcf01")));
kafkaConsumer.commitSync(updatedTopicPartitionToOffsetMetadataMap);
Alternatively, specify the metadata as Af//////////, which means NO_TIMESTAMP in Kafka Streams.
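For reference, the two metadata strings above can be reproduced and checked with the JDK alone. This is only a sketch: it assumes the Kafka Streams 2.4 layout of one version byte (1) followed by an 8-byte big-endian timestamp, all Base64-encoded. The class TimestampMetadata and its methods are made up for illustration and are not part of Kafka.

```java
import java.nio.ByteBuffer;
import java.util.Base64;

// Hypothetical helper mirroring the assumed metadata layout: [version byte][8-byte timestamp].
final class TimestampMetadata {
    private static final byte VERSION = 1; // assumed version ("magic") byte

    // Encodes a timestamp in milliseconds into the Base64 metadata string.
    static String encode(long timestampMs) {
        ByteBuffer buffer = ByteBuffer.allocate(9);
        buffer.put(VERSION);
        buffer.putLong(timestampMs);
        return Base64.getEncoder().encodeToString(buffer.array());
    }

    // Decodes a Base64 metadata string back into the timestamp.
    // Throws IllegalArgumentException if the input is not valid Base64.
    static long decode(String metadata) {
        ByteBuffer buffer = ByteBuffer.wrap(Base64.getDecoder().decode(metadata));
        byte version = buffer.get();
        if (version != VERSION) {
            throw new IllegalArgumentException("Unknown version byte: " + version);
        }
        return buffer.getLong();
    }

    public static void main(String[] args) {
        System.out.println(encode(-1L));             // -1 is the NO_TIMESTAMP value
        System.out.println(decode("AQAAAXGhcf01"));  // timestamp in ms since the epoch
        try {
            // JSON written by another tool starts with '{' (0x7b), which is not a Base64 character.
            decode("{\"consumer_id\": \"\", \"hostname\": \"monitoring-xxx\"}");
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Under that assumption, encode(-1L) yields the Af////////// string mentioned above, and feeding the JSON metadata to decode fails in the Base64 decoder, which matches the stack trace in the question.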

How to configure multiple Kafka consumers in the application.yml file

I have a Spring Boot based microservice, and I use Kafka to produce and consume data from different systems.
I have two different topics and, for each topic, a separate consumer class that consumes its data.
How do I define properties for multiple consumers in the application.yml file?
I configured one consumer in application.yml like this:
spring:
  kafka:
    consumer:
      bootstrapservers: http://199.968.98.101:9092
      group-id: groupid-QA-02
      auto-offset-reset: latest
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
I am using @KafkaListener in my consumer classes.
Here is an example of a consumer method from my code:
@KafkaListener(topics = "${app.topic.b2b_tf_ta_req}", groupId = "${app.topic.groupoId}")
public void consume(String message) throws Exception {
}
As far as I know, bootstrap-servers accepts a comma-separated list of servers,
i.e. if you set it to server1:9092,server2:9092, Kafka should connect to all of them.

Kafka - how to use @KafkaListener(topicPattern="${kafka.topics}") where property kafka.topics is 'sss.*'?

I'm trying to implement a Kafka consumer with topic names given as a pattern, e.g. @KafkaListener(topicPattern="${kafka.topics}") where the property kafka.topics is 'sss.*'. When I send a message to the topic 'sss.test', or to any other topic such as 'sss.xyz' or 'sss.pqr', it throws the error below:
WARN o.apache.kafka.clients.NetworkClient - Error while fetching metadata with correlation id 12 : {sss.xyz-topic=LEADER_NOT_AVAILABLE}
I tried enabling listeners and advertised.listeners in the server.properties file, but when I restart Kafka it consumes messages from all the old topics that were tried before. The moment I use a new topic name, it throws the error above.
Does Kafka not support pattern matching, or is there some configuration I'm missing? Please suggest.

AdminUtils.createTopic API throws kafka.admin.AdminOperationException

I am using the Confluent 3.0.1 platform on Windows. I followed the installation guide and the developer guide to install everything and develop my topology.
I started ZooKeeper, then the Kafka server, and tried to run my topology, but I get the error below on the Kafka server. Even if I create the topic manually and then run the topology, I see the same error.
INFO Topic creation {"version":1,"partitions":{"0":[0]}} (kafka.admin.AdminUtils$)
[2016-09-21 17:20:08,807] INFO [KafkaApi-0] Auto creation of topic Text4 with 1 partitions and replication factor 1 is successful (kafka.server.KafkaApis)
[2016-09-21 17:20:09,436] ERROR [KafkaApi-0] Error when handling request {group_id=my-first-streams-application1} (kafka.server.KafkaApis)
kafka.admin.AdminOperationException: replication factor: 3 larger than available brokers: 1
at kafka.admin.AdminUtils$.assignReplicasToBrokers(AdminUtils.scala:117)
at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:403)
at kafka.server.KafkaApis.kafka$server$KafkaApis$$createTopic(KafkaApis.scala:629)
at kafka.server.KafkaApis.kafka$server$KafkaApis$$createGroupMetadataTopic(KafkaApis.scala:651)
at kafka.server.KafkaApis$$anonfun$getOrCreateGroupMetadataTopic$1.apply(KafkaApis.scala:657)
at kafka.server.KafkaApis$$anonfun$getOrCreateGroupMetadataTopic$1.apply(KafkaApis.scala:657)
at scala.Option.getOrElse(Option.scala:121)
at kafka.server.KafkaApis.getOrCreateGroupMetadataTopic(KafkaApis.scala:657)
at kafka.server.KafkaApis.handleGroupCoordinatorRequest(KafkaApis.scala:818)
at kafka.server.KafkaApis.handle(KafkaApis.scala:86)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
at java.lang.Thread.run(Thread.java:745)
And my topology code is below:
public class MetricTopology implements InitializingBean {
    @Autowired
    @Qualifier("getStreamsConfig")
    private Properties properties;

    // Method to build topology.
    public void buildTopology() {
        System.out.println("MetricTopology.buildTopology()");
        TopologyBuilder builder = new TopologyBuilder();
        // add the source processor node that takes Kafka topic "Text4" as input
        builder.addSource("Source", "Text4")
               // add the MetricsProcessor node which takes the source processor as its upstream processor
               .addProcessor("Process", () -> new MetricsProcessor(), "Source");
        // Building Stream.
        KafkaStreams streams = new KafkaStreams(builder, properties);
        streams.start();
    }

    // Called after all properties are set.
    public void afterPropertiesSet() throws Exception {
        buildTopology();
    }
}
Below are the properties I am using; they are defined in a different Java source file.
Properties settings = new Properties();
// Set a few key parameters. These properties will be picked from the property file.
settings.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-first-streams-application1");
settings.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
settings.put(StreamsConfig.ZOOKEEPER_CONNECT_CONFIG, "localhost:2181");
settings.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
settings.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
settings.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, "1");
kafka.admin.AdminOperationException: replication factor: 3 larger than available brokers: 1
In order to create a topic with replication factor 3, you need at least 3 running brokers.

flink kafka consumer groupId not working

I am using Kafka with Flink.
In a simple program, I used Flink's FlinkKafkaConsumer09 and assigned a group id to it.
According to Kafka's behavior, when I run 2 consumers on the same topic with the same group.id, it should work like a message queue. I think it's supposed to work like this:
If 2 messages are sent to Kafka, the Flink programs together should process each message only once (say, 2 lines of output in total).
But the actual result is that each program receives both messages.
I tried the consumer client that came with the Kafka server download. It worked in the documented way (2 messages processed).
I tried using 2 Kafka consumers in the same main function of a Flink program: 4 messages were processed in total.
I also tried running 2 instances of Flink, each with the same Kafka consumer program: 4 messages.
Any ideas?
This is the output I expect:
1> Kafka and Flink2 says: element-65
2> Kafka and Flink1 says: element-66
Here's the wrong output I always get:
1> Kafka and Flink2 says: element-65
1> Kafka and Flink1 says: element-65
2> Kafka and Flink2 says: element-66
2> Kafka and Flink1 says: element-66
And here is the segment of code:
public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    ParameterTool parameterTool = ParameterTool.fromArgs(args);
    DataStream<String> messageStream = env.addSource(new FlinkKafkaConsumer09<>(parameterTool.getRequired("topic"), new SimpleStringSchema(), parameterTool.getProperties()));
    messageStream.rebalance().map(new MapFunction<String, String>() {
        private static final long serialVersionUID = -6867736771747690202L;

        @Override
        public String map(String value) throws Exception {
            return "Kafka and Flink1 says: " + value;
        }
    }).print();
    env.execute();
}
I have tried running it twice, and also another way:
creating 2 data streams and calling env.execute() for each one in the main function.
There was a quite similar question on the Flink user mailing list today, but I can't find the link to post it here. So here is part of the answer:
"Internally, the Flink Kafka connectors don’t use the consumer group
management functionality because they are using lower-level APIs
(SimpleConsumer in 0.8, and KafkaConsumer#assign(…) in 0.9) on each
parallel instance for more control on individual partition
consumption. So, essentially, the “group.id” setting in the Flink
Kafka connector is only used for committing offsets back to ZK / Kafka
brokers."
Maybe that clarifies things for you.
Also, there is a blog post about working with Flink and Kafka that may help you (https://data-artisans.com/blog/kafka-flink-a-practical-how-to).
Since the group.id of the Flink Kafka consumer is not used for much other than committing offsets to ZooKeeper, is there any way to monitor offsets for a Flink Kafka consumer? I can see there is a way to do it for console consumers (with the help of consumer-groups/consumer-offset-checker), but not for Flink Kafka consumers.
We want to see how far our Flink Kafka consumer is lagging behind the Kafka topic size (the total number of messages in the topic at a given point in time); it is fine to have this at partition level.