node-rdkafka - debug set to all but I only see broker transport failure

I am trying to connect to a Kafka server. Authentication is based on GSSAPI. My client fails with:
/opt/app-root/src/server/node_modules/node-rdkafka/lib/error.js:411
return new LibrdKafkaError(e);
^
Error: broker transport failure
at Function.createLibrdkafkaError (/opt/app-root/src/server/node_modules/node-rdkafka/lib/error.js:411:10)
at /opt/app-root/src/server/node_modules/node-rdkafka/lib/client.js:350:28
This is my test_kafka.js:
const Kafka = require('node-rdkafka');

const kafkaConf = {
  'group.id': 'espdev2',
  'enable.auto.commit': true,
  'metadata.broker.list': 'br01',
  'security.protocol': 'SASL_SSL',
  'sasl.kerberos.service.name': 'kafka',
  'sasl.kerberos.keytab': 'svc_esp_kafka_nonprod.keytab',
  'sasl.kerberos.principal': 'svc_esp_kafka_nonprod@INT.LOCAL',
  'debug': 'all',
  'enable.ssl.certificate.verification': true,
  //'ssl.certificate.location': 'some-root-ca.cer',
  'ssl.ca.location': 'some-root-ca.cer',
  //'ssl.key.location': 'svc_esp_kafka_nonprod.keytab',
};

const topics = 'hello1';
console.log(Kafka.features);

let readStream = new Kafka.KafkaConsumer.createReadStream(kafkaConf, { 'auto.offset.reset': 'earliest' }, { topics });

readStream.on('data', function (message) {
  const messageString = message.value.toString();
  console.log(`Consumed message on Stream: ${messageString}`);
});

You can look at this issue for the explanation of this error:
https://github.com/edenhill/librdkafka/issues/1987
Taken from @edenhill:
As a general rule for librdkafka-based clients: given that the cluster and client are correctly configured, all errors can be ignored as they are most likely temporary and librdkafka will attempt to recover automatically. In this specific case; if a group coordinator request fails it will be retried (using any broker in state Up) within 500ms. The current assignment and group membership will not be affected, if a new coordinator is found before the missing heartbeats times out the membership (session.timeout.ms).
Auto offset commits will be stalled until a new coordinator is found. In a future version we'll extend the error type to include a severity, allowing applications to happily ignore non-terminal errors. At this time an application should consider all errors informational, and not terminal.
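In that spirit, the crash above can usually be avoided by attaching error listeners, so transient failures are logged instead of escalating to an unhandled 'error' event. A minimal sketch, reusing kafkaConf from the question ('event.error' and 'event.log' are node-rdkafka's emitter events; without an 'event.log' listener the 'debug': 'all' output typically goes nowhere, which may be why only the final error is visible):

const Kafka = require('node-rdkafka');

const stream = Kafka.KafkaConsumer.createReadStream(
  kafkaConf,
  { 'auto.offset.reset': 'earliest' },
  { topics: ['hello1'] }
);

// An unhandled 'error' event throws; logging it keeps the process alive
// through transient broker transport failures.
stream.on('error', (err) => console.error('Stream error (often transient):', err));
stream.consumer.on('event.error', (err) => console.error('Consumer event.error:', err));

// librdkafka debug output (enabled by 'debug': 'all') is delivered here.
stream.consumer.on('event.log', (log) =>
  console.log(`${log.severity} ${log.fac}: ${log.message}`));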

Related

MqttBrowserClient fails to connect due to missing CONNACK packet

I am trying to build a web app with Flutter that connects to the HiveMQ broker. I took the broker name from the official website, set the port number to 8000 just as mentioned there, and still get the error message below:
error is mqtt-client::NoConnectionException: The maximum allowed connection attempts ({1}) were exceeded. The broker is not responding to the connection request message (Missing Connection Acknowledgement?
I really have no clue how to proceed. Can someone please help?
Below is my code:
MqttBrowserClient mq = MqttBrowserClient(
    'wss://broker.mqttdashboard.com:8000', '',
    maxConnectionAttempts: 1);
/*
MqttBrowserClient mq = MqttBrowserClient('ws://test.mosquitto.org', 'client-1',
    maxConnectionAttempts: 1);
*/

class mqttService {
  Future<MqttBrowserClient?> connectToServer() async {
    try {
      final connMess = MqttConnectMessage()
          .withClientIdentifier('clientz5tWzoydVL')
          .authenticateAs('a14guguliye', 'z5tWzoydVL')
          .withWillTopic('willtopic')
          .withWillMessage('My Will message')
          .startClean() // Non-persistent session for testing
          .withWillQos(MqttQos.atLeastOnce);
      mq.port = 1883;
      mq.keepAlivePeriod = 50;
      mq.connectionMessage = connMess;
      mq.websocketProtocols = MqttClientConstants.protocolsSingleDefault;
      mq.onConnected = onConnected;
      var status = await mq.connect();
      return mq;
    } catch (e) {
      print("error is " + e.toString());
      mq.disconnect();
      return null;
    }
  }
}
That port 8000 may be open but the HiveMQ broker may not be listening.
Make sure that the broker is fully booted and binds to that IP:Port combo.
In the HiveMQ broker startup output, you should see something similar to:
Started Websocket Listener on address 0.0.0.0 and on port 8000
If needed, consult the HiveMQ broker configuration documentation.
You can use the public HiveMQ MQTT Websocket demo client to test your connection to make sure it's not a local code issue.
As a last option, use Wireshark to monitor the MQTT traffic with a display filter of tcp.port == 8000 and mqtt.
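For reference, here is a minimal browser-client sketch against the public dashboard broker, assuming the mqtt_client Dart package and that the demo broker's websocket listener is at ws://broker.mqttdashboard.com:8000/mqtt (the /mqtt path is an assumption based on common HiveMQ websocket configurations, not something stated above). Note that the question's code overrides the port with mq.port = 1883 after putting 8000 in the URL, so whichever port the listener actually uses must be set consistently.

import 'package:mqtt_client/mqtt_client.dart';
import 'package:mqtt_client/mqtt_browser_client.dart';

Future<void> testConnect() async {
  // Client id is illustrative; the port must match the broker's websocket listener.
  final client = MqttBrowserClient('ws://broker.mqttdashboard.com/mqtt', 'clientz5tWzoydVL');
  client.port = 8000;
  client.websocketProtocols = MqttClientConstants.protocolsSingleDefault;
  try {
    final status = await client.connect();
    print('connection status: $status');
  } catch (e) {
    print('connect failed: $e');
    client.disconnect();
  }
}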

Clickhouse: kafka broken messages error handling

ClickHouse version 21.12.3.32. I'm following this PR (https://github.com/ClickHouse/ClickHouse/pull/21850) to handle incorrect messages from a Kafka topic, but after some investigation I've found that if a single message contains broken data, the whole batch of received messages cannot be parsed, which can lead to data loss.
Kafka engine table:
CREATE TABLE default.kafka_engine (message String)
ENGINE = Kafka
SETTINGS
    kafka_broker_list = 'kafka:9092',
    kafka_topic_list = 'topic',
    kafka_group_name = 'group',
    kafka_format = 'JSONAsString',
    kafka_row_delimiter = '\n',
    kafka_num_consumers = 1,
    kafka_handle_error_mode = 'stream';
Example of broken message: [object Object]
First message error: JSON object must begin with '{'.: (at row 1).
Other messages error: Cannot parse input: expected ']' at end of stream..
Is it possible to skip just the broken message and correctly parse the other messages in a batch received from the Kafka topic?
Changing the kafka_format to RawBLOB fixed my issue.
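For completeness, the stream error-handling mode from the linked PR can also be used to route broken rows aside instead of aborting the batch: with kafka_handle_error_mode = 'stream', the Kafka engine exposes the virtual columns _raw_message and _error. A sketch with illustrative target tables (it may still hit the batch-level JSONAsString failure described above):

-- Consume parseable rows into the main target table.
CREATE MATERIALIZED VIEW default.kafka_good TO default.messages AS
SELECT message
FROM default.kafka_engine
WHERE length(_error) = 0;

-- Capture broken rows and their parse errors separately.
CREATE MATERIALIZED VIEW default.kafka_bad TO default.parse_errors AS
SELECT _raw_message AS raw, _error AS error
FROM default.kafka_engine
WHERE length(_error) > 0;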

Getting "java.lang.IllegalStateException: Tried to lookup lag for unknown task 3_0" after upgrading Kafka Stream from 2.5.1 to 2.6.2

I just upgraded our Kafka Streams application from 2.5.1 to 2.6.2. It used to work; now it doesn't.
Here is the troublesome topology (I have omitted the irrelevant Serdes):
val builder = new StreamsBuilder()

val contractEventStream: KStream[TariffId, ContractEvent] =
  builder.stream[String, ContractUpsertAvro](settings.contractsTopicName)
    .flatMap { (_, contractAvro) =>
      ContractEvent.from(contractAvro)
        .map(contractEvent => (contractEvent.tariffId, contractEvent))
    }

val tariffsTable: KTable[TariffId, Tariff] =
  builder.stream[String, TariffUpdateEventAvro](settings.tariffTopicName)
    .flatMapValues(Tariff.fromAvro(_))
    .selectKey((_, tariff) => tariff.tariffId)
    .toTable(Materialized.`with`(tariffIdSerde, tariffSerde)) // Materialized.as also throws the same IllegalStateException

contractEventStream
  .join(tariffsTable)(JourneyStep.from(_, _).asInstanceOf[ContractCreated])(Joined.`with`(tariffIdSerde, contractEventSerde, tariffSerde))
  .selectKey((_, contractUpdated) => contractUpdated.accountId)
  .foreach((_, journeyStep) => println(journeyStep))
The join gives the following exception:
java.lang.IllegalStateException: Tried to lookup lag for unknown task 3_0
at org.apache.kafka.streams.processor.internals.assignment.ClientState.lagFor(ClientState.java:306)
at java.util.Comparator.lambda$comparingLong$6043328a$1(Comparator.java:511)
at java.util.Comparator.lambda$thenComparing$36697e65$1(Comparator.java:216)
at java.util.TreeMap.compare(TreeMap.java:1295)
at java.util.TreeMap.put(TreeMap.java:538)
at java.util.TreeSet.add(TreeSet.java:255)
at java.util.AbstractCollection.addAll(AbstractCollection.java:344)
at java.util.TreeSet.addAll(TreeSet.java:312)
at org.apache.kafka.streams.processor.internals.StreamsPartitionAssignor.getPreviousTasksByLag(StreamsPartitionAssignor.java:1275)
at org.apache.kafka.streams.processor.internals.StreamsPartitionAssignor.assignTasksToThreads(StreamsPartitionAssignor.java:1189)
at org.apache.kafka.streams.processor.internals.StreamsPartitionAssignor.computeNewAssignment(StreamsPartitionAssignor.java:940)
at org.apache.kafka.streams.processor.internals.StreamsPartitionAssignor.assign(StreamsPartitionAssignor.java:399)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.performAssignment(ConsumerCoordinator.java:589)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onJoinLeader(AbstractCoordinator.java:684)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.access$1000(AbstractCoordinator.java:111)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$JoinGroupResponseHandler.handle(AbstractCoordinator.java:597)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$JoinGroupResponseHandler.handle(AbstractCoordinator.java:560)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:1160)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:1135)
at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:206)
at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:169)
at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:129)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:602)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:412)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:297)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:236)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1296)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1237)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1210)
at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:767)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:624)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:551)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:510)
I can't see what I am doing wrong. The code above works with Kafka 2.5.1. Does anyone have any idea what is going on?
The problem is caused by the local state that Kafka Streams keeps on disk. This cache is specific to the Kafka version and to the Kafka Streams topology you use (i.e. a change in your topology could also lead to this error).
The cache is usually found in /tmp, or elsewhere if you passed the "state.dir" property to Kafka Streams. Clear the directory with the cache and you should be able to start cleanly again.
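If you prefer to do this programmatically, KafkaStreams#cleanUp() removes this instance's local state directory; it must be called while the instance is not running. A sketch, assuming the builder from the question (the property values are illustrative):

import java.util.Properties
import org.apache.kafka.streams.{KafkaStreams, StreamsConfig}

val props = new Properties()
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "contracts-tariffs-joiner") // illustrative
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
props.put(StreamsConfig.STATE_DIR_CONFIG, "/var/lib/kafka-streams") // defaults under /tmp

val streams = new KafkaStreams(builder.build(), props)
streams.cleanUp() // wipes this instance's state.dir; call before start() or after close()
streams.start()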

Producer lost some message on kafka restart

Kafka Client : 0.11.0.0-cp1
Kafka Broker :
During a Kafka broker rolling restart, our application lost some messages while sending to the broker. I believe a rolling restart should not cause any message loss. These are the producer settings we are using (asynchronous send() without a callback/future):
val acksConfig: String = "all",
val retriesConfig: Int = Int.MAX_VALUE,
val retriesBackOffConfig: Int = 1000,
val batchSize: Int = 32768,
val lingerTime: Int = 1,
val maxBlockTime: Int = Int.MAX_VALUE,
val requestTimeOut: Int = 420000,
val bufferMemory: Int = 33_554_432,
val compressionType: String = "gzip",
val keySerializer: Class<StringSerializer> = StringSerializer::class.java,
val valueSerializer: Class<ByteArraySerializer> = ByteArraySerializer::class.java
I am seeing these exceptions in the logs
2019-03-19 17:30:59,224 [org.apache.kafka.clients.producer.internals.Sender] [kafka-producer-network-thread | producer-1] (Sender.java:511) WARN org.apache.kafka.clients.producer.internals.Sender - Got error produce response with correlation id 1105790 on topic-partition catapult_on_entitlement_updates_prod-67, retrying (2147483643 attempts left). Error: NOT_LEADER_FOR_PARTITION
But the log says there are retry attempts left, so I am curious why it didn't retry. Does anyone have any idea?
Two things to note:
What is the replication factor of the topic you are producing to, and what is the required min.insync.replicas?
What do you mean by "producer lost some messages"? If the producer cannot successfully produce to min.insync.replicas brokers, it will throw an exception and fail (for synchronous production). It is up to the producer/client to retry in case of failure (synchronous or asynchronous production).
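Since the question uses an asynchronous send() without a callback, failed records are invisible to the application once retries are exhausted. A sketch of making them visible (the producer construction and error handling are illustrative; the topic name is taken from the log above):

import org.apache.kafka.clients.producer.KafkaProducer
import org.apache.kafka.clients.producer.ProducerRecord

fun sendWithVisibility(producer: KafkaProducer<String, ByteArray>, key: String, value: ByteArray) {
    val record = ProducerRecord("catapult_on_entitlement_updates_prod", key, value)
    producer.send(record) { metadata, exception ->
        if (exception != null) {
            // Retries exhausted or a non-retriable error: the record is lost
            // unless the application re-enqueues it here.
            System.err.println("Produce failed for key=$key: $exception")
        } else {
            println("Produced to ${metadata.topic()}-${metadata.partition()}@${metadata.offset()}")
        }
    }
}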

Gracefully restart a Reactive-Kafka Consumer Stream on failure

Problem
When I restart/complete/stop the stream, the old consumer does not die/shut down:
[INFO ] a.a.RepointableActorRef -
Message [akka.kafka.KafkaConsumerActor$Internal$Stop$]
from Actor[akka://ufo-sightings/deadLetters]
to Actor[akka://ufo-sightings/system/kafka-consumer-1#1896610594]
was not delivered. [1] dead letters encountered.
Description
I'm building a service that receives a message from a Kafka topic and sends it to an external service via an HTTP request.
The connection with the external service can break, and my service needs to retry the request.
Additionally, if there is an error in the stream, the entire stream needs to restart.
Finally, sometimes I don't need the stream and its corresponding Kafka consumer, and I would like to shut down the entire stream.
So I have a Stream:
Consumer.committableSource(customizedSettings, subscriptions)
  .flatMapConcat(sourceFunction)
  .toMat(Sink.ignore)
  .run
The HTTP request is sent in sourceFunction.
I followed the new Kafka consumer restart instructions in the new documentation:
RestartSource.withBackoff(
  minBackoff = 20.seconds,
  maxBackoff = 5.minutes,
  randomFactor = 0.2
) { () =>
  Consumer.committableSource(customizedSettings, subscriptions)
    .watchTermination() {
      case (consumerControl, streamComplete) =>
        logger.info(s"Started watching Kafka consumer id = ${consumer.id} termination: is shutdown: ${consumerControl.isShutdown}, is f completed: ${streamComplete.isCompleted}")
        consumerControl.isShutdown.map(_ => logger.info(s"Shutdown of consumer finally happened id = ${consumer.id} at ${DateTime.now}"))
        streamComplete
          .flatMap { _ =>
            consumerControl.shutdown().map(_ -> logger.info(s"3.consumer id = ${consumer.id} SHUTDOWN at ${DateTime.now} GRACEFULLY:CLOSED FROM UPSTREAM"))
          }
          .recoverWith {
            case _ =>
              consumerControl.shutdown().map(_ -> logger.info(s"3.consumer id = ${consumer.id} SHUTDOWN at ${DateTime.now} ERROR:CLOSED FROM UPSTREAM"))
          }
    }
    .flatMapConcat(sourceFunction)
}
  .viaMat(KillSwitches.single)(Keep.right)
  .toMat(Sink.ignore)(Keep.left)
  .run
There is an open issue that discusses this non-terminating consumer in a complex Akka stream, but there is no solution yet.
Is there a workaround that forces the Kafka consumer to terminate?
How about wrapping the consumer in an Actor and registering a KillSwitch? See: https://doc.akka.io/docs/akka/2.5/stream/stream-dynamic.html#dynamic-stream-handling
Then, in the Actor's postStop method, you can terminate the stream.
By wrapping the Actor in a BackoffSupervisor, you get the exponential backoff.
Example actor: https://github.com/tradecloud/kafka-akka-extension/blob/master/src/main/scala/nl/tradecloud/kafka/KafkaSubscriberActor.scala#L27
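A sketch of that pattern, using customizedSettings, subscriptions, and sourceFunction from the question (the BackoffSupervisor wiring is omitted): the stream lives and dies with the actor, so stopping the actor tears the consumer down.

import akka.actor.Actor
import akka.kafka.scaladsl.Consumer
import akka.stream.scaladsl.{Keep, Sink}
import akka.stream.{ActorMaterializer, KillSwitches, UniqueKillSwitch}

class KafkaStreamActor extends Actor {
  private implicit val materializer: ActorMaterializer = ActorMaterializer()(context.system)
  private var killSwitch: Option[UniqueKillSwitch] = None

  override def preStart(): Unit = {
    // Materialize the stream with a KillSwitch so it can be torn down later.
    val switch = Consumer
      .committableSource(customizedSettings, subscriptions)
      .flatMapConcat(sourceFunction)
      .viaMat(KillSwitches.single)(Keep.right)
      .toMat(Sink.ignore)(Keep.left)
      .run()
    killSwitch = Some(switch)
  }

  override def postStop(): Unit =
    killSwitch.foreach(_.shutdown()) // completes the stream, shutting down the Kafka consumer

  override def receive: Receive = Actor.emptyBehavior
}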