Apache Kafka: Fetching topic metadata with correlation id 0

I sent a single message to my Kafka cluster using the following code:
def getHealthSink(kafkaHosts: String, zkHosts: String) = {
  val kafkaHealth: Subscriber[String] = kafka.publish(ProducerProperties(
    brokerList = kafkaHosts,
    topic = "health_check",
    encoder = new StringEncoder()
  ))
  Sink.fromSubscriber(kafkaHealth).runWith(Source.single("test"))
}

val kafkaHealth = getHealthSink(kafkaHosts, zkHosts)
and I got the following error message:
ERROR kafka.utils.Utils$ fetching topic metadata for topics
[Set(health_check)] from broker
[ArrayBuffer(id:0,host:****,port:9092)] failed
kafka.common.KafkaException: fetching topic metadata for topics
[Set(health_check)] from broker
[ArrayBuffer(id:0,host:****,port:9092)] failed
Do you have any idea what can be the problem?

The error message is incredibly unclear, but basically "Fetching topic metadata" is the first thing the producer does, which means this is where it is first establishing a connection to Kafka.
There's a good chance that either the broker you are trying to connect to is down, or there is another connectivity issue (ports, firewalls, dns, etc).
In unrelated news: you seem to be using the old, deprecated Scala producer. We recommend moving to the new Java producer (org.apache.kafka.clients.producer.KafkaProducer).
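For reference, a minimal sketch of sending the same health-check message with the Java producer, called from Scala (kafkaHosts is assumed to be the same comma-separated broker list as above):

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}
import org.apache.kafka.common.serialization.StringSerializer

val props = new Properties()
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaHosts)
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)

val producer = new KafkaProducer[String, String](props)
// get() blocks until the broker acknowledges the send, so connectivity problems
// surface here as an exception instead of being logged asynchronously
producer.send(new ProducerRecord[String, String]("health_check", "test")).get()
producer.close()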

Related

Consuming from a Kafka topic that requires authentication using reactor kafka

I have a micro-service that consumes from a Kafka topic that requires authentication. Below is the code I wrote for that. I am fetching the username and password from environment variables, which I am sure are working as expected.
val receiverOptions = ReceiverOptions.create<ByteBuffer, ByteBuffer>(defaultKafkaBrokerConfig.getAsProperties())
val kafkaJaasConfig = String.format(
    "org.apache.kafka.common.security.scram.ScramLoginModule required username='%s' password='%s';",
    kafkaUsername,
    kafkaPassword
)
val schedulerKafkaConsumer = Schedulers.newSingle("consumer")

val options = receiverOptions
    .subscription(topicConfig.topics)
    .pollTimeout(Duration.ofMillis(topicConfig.pollWaitTimeoutMs.toLong()))
    .consumerProperty(SaslConfigs.SASL_MECHANISM, "SCRAM-SHA-512")
    .consumerProperty(SaslConfigs.SASL_JAAS_CONFIG, kafkaJaasConfig)
    .consumerProperty(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_PLAINTEXT")

return KafkaReceiver.create(options)
    .receive()
    .subscribeOn(schedulerKafkaConsumer)
    .map { record: ReceiverRecord<ByteBuffer, ByteBuffer> -> handleConsumerRecord(record) }
    .onErrorContinue { throwable: Throwable?, _: Any? ->
        log.error("Error consuming and deserializing messages", throwable)
    }
The code works fine when I run it locally. However, in the GCP development environment, I get the following error:
Bootstrap broker <<some_ip>>:9093 (id: -2 rack: null) disconnected
org.apache.kafka.clients.NetworkClient : [Consumer clientId=consumer-service_gcp-edge_6ef14fcc-3443-4782-aa38-0910a5aea9b9-2, groupId=service_gcp-edge_6ef14fcc-3443-4782-aa38-0910a5aea9b9] Connection to node -1 (kafka0-data-europe-west4-kafka.internal/<<some_ip>>:9093) terminated during authentication. This may happen due to any of the following reasons: (1) Authentication failed due to invalid credentials with brokers older than 1.0.0, (2) Firewall blocking Kafka TLS traffic (eg it may only allow HTTPS traffic), (3) Transient network issue.
After getting a shell into the cluster, I could connect to the topic and consume messages with a command-line tool, which rules out any infra-related issue.
Can someone please help me figure out what I am doing wrong here and how I can fix it?

Kafka Streams fails on decoding timestamp metadata inside StreamTask

We got strange errors from Kafka Streams while starting the app:
java.lang.IllegalArgumentException: Illegal base64 character 7b
at java.base/java.util.Base64$Decoder.decode0(Base64.java:743)
at java.base/java.util.Base64$Decoder.decode(Base64.java:535)
at java.base/java.util.Base64$Decoder.decode(Base64.java:558)
at org.apache.kafka.streams.processor.internals.StreamTask.decodeTimestamp(StreamTask.java:985)
at org.apache.kafka.streams.processor.internals.StreamTask.initializeTaskTime(StreamTask.java:303)
at org.apache.kafka.streams.processor.internals.StreamTask.initializeMetadata(StreamTask.java:265)
at org.apache.kafka.streams.processor.internals.AssignedTasks.initializeNewTasks(AssignedTasks.java:71)
at org.apache.kafka.streams.processor.internals.TaskManager.updateNewAndRestoringTasks(TaskManager.java:385)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:769)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:698)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:671)
and, as a result, an error about the failed stream: ERROR KafkaStreams - stream-client [xxx] All stream threads have died. The instance will be in error state and should be closed.
According to the code inside org.apache.kafka.streams.processor.internals.StreamTask, the failure happened due to an error in decoding the timestamp metadata (StreamTask.decodeTimestamp()). It happened on prod, and we can't reproduce it on stage.
What could be the root cause of such errors?
Extra info: our app uses Kafka Streams and consumes messages from several Kafka brokers using the same application.id and state.dir (actually we are switching from one broker to another, but during some period we were connected to both brokers, so we had two Kafka Streams instances, one per broker). As I understand it, the consumer group lives on the broker side (so it shouldn't be a problem), but the state dir is on the client side. Maybe some race condition occurred due to using the same state.dir for two Kafka Streams instances? Could that be the root cause?
We use kafka-streams v.2.4.0, kafka-clients v.2.4.0, Kafka Broker v.1.1.1, with the following configs:
default.key.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
default.value.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
default.timestamp.extractor: org.apache.kafka.streams.processor.WallclockTimestampExtractor
default.deserialization.exception.handler: org.apache.kafka.streams.errors.LogAndContinueExceptionHandler
commit.interval.ms: 5000
num.stream.threads: 1
auto.offset.reset: latest
Finally, we figured out the root cause of the corrupted metadata for some consumer groups.
It was one of our internal monitoring tools (written with pykafka) that corrupted the metadata of temporarily inactive consumer groups.
The metadata was stored as plain, unencoded text and contained invalid data like the following: {"consumer_id": "", "hostname": "monitoring-xxx"}.
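To see why this breaks the decoder (an illustration, not from the original post): 0x7b is the ASCII code of '{', so handing that plain JSON to the Base64 decoder used by StreamTask.decodeTimestamp() fails with exactly the error from the stack trace above:

import java.util.Base64

// plain JSON instead of Base64: the very first character '{' (0x7b) is rejected
Base64.getDecoder.decode("""{"consumer_id": "", "hostname": "monitoring-xxx"}""")
// => java.lang.IllegalArgumentException: Illegal base64 character 7b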
In order to see what exactly is stored in the consumer metadata, we could use the following code:
Map<String, Object> config = Map.of("group.id", "...", "bootstrap.servers", "...");
String topicName = "...";

Consumer<byte[], byte[]> kafkaConsumer =
    new KafkaConsumer<byte[], byte[]>(config, new ByteArrayDeserializer(), new ByteArrayDeserializer());

Set<TopicPartition> topicPartitions = kafkaConsumer.partitionsFor(topicName).stream()
    .map(partitionInfo -> new TopicPartition(topicName, partitionInfo.partition()))
    .collect(Collectors.toSet());

kafkaConsumer.committed(topicPartitions).forEach((key, value) ->
    System.out.println("Partition: " + key + " metadata: " + (value != null ? value.metadata() : null)));
Several options to fix already corrupted metadata:
Change the consumer group to a new one. Caution: you might lose or duplicate messages depending on the latest or earliest offset reset policy, so for some cases this option might not be acceptable.
Overwrite the metadata manually (the timestamp is encoded according to the logic inside StreamTask.decodeTimestamp()):
Map<TopicPartition, OffsetAndMetadata> updatedTopicPartitionToOffsetMetadataMap =
    kafkaConsumer.committed(topicPartitions).entrySet().stream()
        .collect(Collectors.toMap(Map.Entry::getKey,
            entry -> new OffsetAndMetadata(entry.getValue().offset(), "AQAAAXGhcf01")));
kafkaConsumer.commitSync(updatedTopicPartitionToOffsetMetadataMap);
Or specify the metadata as Af//////////, which means NO_TIMESTAMP in Kafka Streams.

Alpakka Akka Stream unable to read from Kafka

I have built a very simple Akka stream based on the Alpakka project, but it doesn't read anything from Kafka even though it connects and creates a consumer group. I have created an implicit ActorSystem and Materializer for the stream.
val done = Consumer.committableSource(consumerSettings, Subscriptions.topics(kafkaTopic))
  .map(msg => msg.committableOffset)
  .mapAsync(1) { offset =>
    offset.commitScaladsl()
  }
  .runWith(Sink.ignore)
The [stream.actor.dispatcher] sends this message to the KafkaConsumerActor: "Requesting messages, requestId: 1, partitions: Set(kafka-topic-0)".
The KafkaConsumerActor doesn't seem to receive the message, but when the supervisor asks the actor to shut down, it does receive that message and shuts down.
Any lead on why it fails to read from Kafka without an error or exception?
I couldn't figure out why my Akka stream wasn't consuming messages from the Kafka broker, but when I implemented the same stream as a RunnableGraph, it worked.
Examples that I used - https://www.programcreek.com/scala/akka.stream.scaladsl.RunnableGraph
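For what it's worth, a rough sketch of the same pipeline expressed as a RunnableGraph, assuming the same consumerSettings, kafkaTopic, and implicit materializer as above, and the pre-1.0 alpakka-kafka API used in the question:

import akka.kafka.Subscriptions
import akka.kafka.scaladsl.Consumer
import akka.stream.scaladsl.{RunnableGraph, Sink}

// describe the graph without running it; materializing it yields the Consumer.Control
val graph: RunnableGraph[Consumer.Control] =
  Consumer.committableSource(consumerSettings, Subscriptions.topics(kafkaTopic))
    .mapAsync(1)(msg => msg.committableOffset.commitScaladsl())
    .to(Sink.ignore)

val control: Consumer.Control = graph.run()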

Kafka - org.apache.kafka.common.errors.NetworkException

I have Kafka client code which connects to Kafka brokers (server 0.10.1 and client 0.10.2). There are 2 topics with 2 different consumer groups in the code, and there is also a producer. We get a NetworkException from the producer code once in a while (once in 2 days, once in 5 days, ...). We see consumer group (re)joining info in the logs for both consumer groups, followed by the NetworkException from the producer future.get() call. Not sure why we are getting this error.
Code:
final Future<RecordMetadata> futureResponse =
    producer.send(new ProducerRecord<>("ping_topic", "ping"));
futureResponse.get();
Exception:
org.apache.kafka.common.errors.NetworkException: The server disconnected before a response was received.
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.NetworkException: The server disconnected before a response was received.
at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:70)
at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:57)
at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:25)
The Kafka API definition for NetworkException:
"A misc. network-related IOException occurred when making a request.
This could be because the client's metadata is out of date and it is
making a request to a node that is now dead."
Thanks
I had the same error while testing the Kafka Consumer. I was using a sender template for it.
In the consumer configuration I additionally set the following properties:
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 15000);
After sending the message, I added a thread sleep:
ListenableFuture<SendResult<String, String>> future =
    senderTemplate.send(MyConsumer.TOPIC_NAME, jsonPayload);
Thread.sleep(10000);
It was necessary to make the test work, but maybe not suitable for your case.
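Separately, the NetworkException definition quoted in the question describes a transient, retriable condition, so another option (an assumption on my part, not part of the original answer) is to let the producer retry such sends before they surface from future.get(); the values below are purely illustrative:

import java.util.Properties
import org.apache.kafka.clients.producer.ProducerConfig

val producerProps = new Properties()
// retry transient network failures a few times before failing the returned Future
producerProps.put(ProducerConfig.RETRIES_CONFIG, "5")
// back off between retries so a briefly unavailable broker has time to recover
producerProps.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, "500")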

Kafka: Error from SyncGroup, The request timed out

Recently we have been frequently experiencing "Error from SyncGroup: The request timed out" with the Java Kafka APIs.
This issue usually happens with a few topics or consumer groups in the Kafka cluster. Can anyone provide some pointers about this error?
As a workaround, if I change the consumer group name I don't see the error.
Broker version: 0.9.0
Kafka client version: 0.9.0.1
Exception in thread "main" org.apache.kafka.common.KafkaException: Unexpected error from SyncGroup: The request timed out.
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupRequestHandler.handle(AbstractCoordinator.java:444)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupRequestHandler.handle(AbstractCoordinator.java:411)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:665)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:644)
at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167)
at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:380)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:274)
We have had the same problem recently. It happens because some Kafka Streams messages have a meta-information footprint that is larger than a regular one (when you don't use Kafka Streams). To fix the issue, go to the __consumer_offsets topic settings and set the max.message.bytes param higher than its default. For example, in our case we have max.message.bytes = 20971520. That will completely solve your problem.
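For reference, a rough sketch of applying that override with the kafka-configs tool (the ZooKeeper address is a placeholder, and the exact invocation depends on your broker version):

bin/kafka-configs.sh --zookeeper localhost:2181 --alter --entity-type topics --entity-name __consumer_offsets --add-config max.message.bytes=20971520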