Kafka Message loss when Kafkaserver goes down

Kafka Message loss when Kafkaserver goes down - apache-kafka

I have a scenario that when Kafka server or Zookeeper both goes down, but the producer application trying to send a message to Kafka broker.
When the Kafka broker or Zookeeper comes online, messages is getting loss, and i see the producer delivery result offset has Offset: -1001 and Partition [Any]
Below are my configuration to reproduce
new ProducerConfig
{
BootstrapServers = <BROKET_END_POINT>,
MessageMaxBytes = 20971520,
MessageCopyMaxBytes = 20971520,
LogConnectionClose = false,
BrokerAddressFamily = BrokerAddressFamily.V4,
ConnectionsMaxIdleMs = 0,
//https://github.com/Azure/azure-event-hubs-for-kafka/issues/139
SocketKeepaliveEnable = true,
MetadataMaxAgeMs = 180000,
//https://learn.microsoft.com/en-us/azure/event-hubs/apache-kafka-configurations
RequestTimeoutMs = 60000,
MessageTimeoutMs = 0,
MessageSendMaxRetries = int.MaxValue,
RetryBackoffMs = 100,
EnableIdempotence = true,
Acks = Acks.All,
BatchSize = 2000000,
LingerMs = 0,
CompressionType = CompressionType.Gzip
SocketNagleDisable = true
};
I used the above setting to produce the message using Confluent kafka nuget package version 1.8.2 from .Net code.
Appreciate your inputs in advance

Related

Cannot use csvSftpConnector with schema registry

I'm trying to configure a standalone producer to read csv files from a sftp server and send data to a topic on the cloud.
So far I succeeded in reading my csv data from the file and parsing it according to my value.schema.
But now instead of using a fixed configuration schema, I'd like to use the schema registry. So I configured an AVRO schema for my test topic on the confluent cloud, generated the API key/secret and updated my config files.
I can see that connection is working fine, no authentication errors, via cli I can access the test schema, but when I try to run the producer I get the following error:
[2021-09-20 16:39:53,442] INFO SftpCsvSourceConnectorConfig values:
batch.size = 1000
behavior.on.error = IGNORE
cleanup.policy = NONE
csv.case.sensitive.field.names = false
csv.escape.char = 92
csv.file.charset = UTF-8
csv.first.row.as.header = false
csv.ignore.leading.whitespace = true
csv.ignore.quotations = false
csv.keep.carriage.return = false
csv.null.field.indicator = NEITHER
csv.quote.char = 34
csv.rfc.4180.parser.enabled = false
csv.separator.char = 44
csv.skip.lines = 0
csv.strict.quotes = false
csv.verify.reader = true
empty.poll.wait.ms = 250
error.path = /home/alberto/opt/confluent-6.2.0/sftp2/error
file.minimum.age.ms = 0
finished.path = /home/alberto/opt/confluent-6.2.0/sftp2/finished
input.file.pattern = .*.csv
input.path = /home/alberto/opt/confluent-6.2.0/sftp2/data
kafka.topic = testSchema
kerberos.keytab.path =
kerberos.user.principal =
key.schema = {"name" : "com.example.users.UserKey","type" : "STRUCT","isOptional" : true,"fieldSchemas" : {"material" : {"type" : "STRING","isOptional" : true}}}
parser.timestamp.date.formats = [yyyy-MM-dd'T'HH:mm:ss, yyyy-MM-dd' 'HH:mm:ss]
parser.timestamp.timezone = UTC
processing.file.extension = .PROCESSING
proxy.password = [hidden]
proxy.username =
schema.generation.enabled = false
schema.generation.key.fields = []
schema.generation.key.name = defaultkeyschemaname
schema.generation.value.name = defaultvalueschemaname
sftp.host = 192.168.1.6
sftp.password = [hidden]
sftp.port = 22
sftp.proxy.url =
sftp.username = user
timestamp.field =
timestamp.mode = PROCESS_TIME
tls.passphrase = [hidden]
tls.pemfile =
tls.private.key = [hidden]
tls.public.key = [hidden]
value.schema =
...
Caused by: org.apache.kafka.common.config.ConfigException: Both configs key.schema and value.schema must be set if schema.generation.enabled is false, but key.schema was not null and value.schema was null.
at io.confluent.connect.sftp.source.SftpSourceConnectorConfig.validateSchema(SftpSourceConnectorConfig.java:181)
at io.confluent.connect.sftp.source.SftpSourceConnectorConfig.<init>(SftpSourceConnectorConfig.java:121)
at io.confluent.connect.sftp.source.SftpCsvSourceConnectorConfig.<init>(SftpCsvSourceConnectorConfig.java:157)
at io.confluent.connect.sftp.SftpCsvSourceConnector.start(SftpCsvSourceConnector.java:44)
at org.apache.kafka.connect.runtime.WorkerConnector.doStart(WorkerConnector.java:184)
at org.apache.kafka.connect.runtime.WorkerConnector.start(WorkerConnector.java:209)
at org.apache.kafka.connect.runtime.WorkerConnector.doTransitionTo(WorkerConnector.java:348)
at org.apache.kafka.connect.runtime.WorkerConnector.doTransitionTo(WorkerConnector.java:331)
... 7 more
If I set schema.generation.enabled to true, it seems that it creates an empty schema:
value.schema = {"type":"STRUCT","isOptional":false,"fieldSchemas":{}}
and then I get:
org.apache.kafka.common.config.ConfigException: Failed to access Avro data from topic testSchema : Schema being registered is incompatible with an earlier schema for subject "testSchema-value"; error code: 409; error code: 409
as if it's trying to register a schema, except that it's not what I want, I just need to fetch the schema from the registry and use it.
If anyone need any addition information regarding the configuration I'll happy to provide.

node-rdkafka - debug set to all but I only see broker transport failure

I am trying to connect to kafka server. Authentication is based on GSSAPI.
/opt/app-root/src/server/node_modules/node-rdkafka/lib/error.js:411
return new LibrdKafkaError(e);
^
Error: broker transport failure
at Function.createLibrdkafkaError (/opt/app-root/src/server/node_modules/node-rdkafka/lib/error.js:411:10)
at /opt/app-root/src/server/node_modules/node-rdkafka/lib/client.js:350:28
This my test_kafka.js:
const Kafka = require('node-rdkafka');
const kafkaConf = {
'group.id': 'espdev2',
'enable.auto.commit': true,
'metadata.broker.list': 'br01',
'security.protocol': 'SASL_SSL',
'sasl.kerberos.service.name': 'kafka',
'sasl.kerberos.keytab': 'svc_esp_kafka_nonprod.keytab',
'sasl.kerberos.principal': 'svc_esp_kafka_nonprod#INT.LOCAL',
'debug': 'all',
'enable.ssl.certificate.verification': true,
//'ssl.certificate.location': 'some-root-ca.cer',
'ssl.ca.location': 'some-root-ca.cer',
//'ssl.key.location': 'svc_esp_kafka_nonprod.keytab',
};
const topics = 'hello1';
console.log(Kafka.features);
let readStream = new Kafka.KafkaConsumer.createReadStream(kafkaConf, { "auto.offset.reset": "earliest" }, { topics })
readStream.on('data', function (message) {
const messageString = message.value.toString();
console.log(`Consumed message on Stream: ${messageString}`);
});

You can look at this issue for the explanation of this error:
https://github.com/edenhill/librdkafka/issues/1987
Taken from #edenhill:
As a general rule for librdkafka-based clients: given that the cluster and client are correctly configured, all errors can be ignored as they are most likely temporary and librdkafka will attempt to recover automatically. In this specific case; if a group coordinator request fails it will be retried (using any broker in state Up) within 500ms. The current assignment and group membership will not be affected, if a new coordinator is found before the missing heartbeats times out the membership (session.timeout.ms).
Auto offset commits will be stalled until a new coordinator is found. In a future version we'll extend the error type to include a severity, allowing applications to happily ignore non-terminal errors. At this time an application should consider all errors informational, and not terminal.

Creating partition for topic in kafka-node

I have a created a HighLevelProducer to publish messages to a topic stream that will be consumed by a ConsumerGroupStream using kafka-node. When I create multiple consumers from the same ConsumerGroup to consume from that same topic only one partition is created and only one consumer is consuming. I have also tried to define the number of partitions for that topic although I'm not sure if is required to define it upon creating the topic and if so how many partitions will I need in advance. In addition, is it possible to push an object to the Transform stream and not a string (I currently used JSON.stringify because otherwise I got [Object object] in the consumer.
const myProducerStream = ({ kafkaHost, highWaterMark, topic }) => {
const kafkaClient = new KafkaClient({ kafkaHost });
const producer = new HighLevelProducer(kafkaClient);
const options = {
highWaterMark,
kafkaClient,
producer
};
kafkaClient.refreshMetadata([topic], err => {
if (err) throw err;
});
return new ProducerStream(options);
};
const transfrom = topic => new Transform({
objectMode: true,
decodeStrings: true,
transform(obj, encoding, cb) {
console.log(`pushing message ${JSON.stringify(obj)} to topic "${topic}"`);
cb(null, {
topic,
messages: JSON.stringify(obj)
});
}
});
const publisher = (topic, kafkaHost, highWaterMark) => {
const myTransfrom = transfrom(topic);
const producer = myProducerStream({ kafkaHost, highWaterMark, topic });
myTransfrom.pipe(producer);
return myTransform;
};
The consumer:
const createConsumerStream = (sourceTopic, kafkaHost, groupId) => {
const consumerOptions = {
kafkaHost,
groupId,
protocol: ['roundrobin'],
encoding: 'utf8',
id: uuidv4(),
fromOffset: 'latest',
outOfRangeOffset: 'earliest',
};
const consumerGroupStream = new ConsumerGroupStream(consumerOptions, sourceTopic);
consumerGroupStream.on('connect', () => {
console.log(`Consumer id: "${consumerOptions.id}" is connected!`);
});
consumerGroupStream.on('error', (err) => {
console.error(`Consumer id: "${consumerOptions.id}" encountered an error: ${err}`);
});
return consumerGroupStream;
};
const publisher = (func, destTopic, consumerGroupStream, kafkaHost, highWaterMark) => {
const messageTransform = new AsyncMessageTransform(func, destTopic);
const resultProducerStream = myProducerStream({ kafkaHost, highWaterMark, topic: destTopic })
consumerGroupStream.pipe(messageTransform).pipe(resultProducerStream);
};

For the first question:
The maximum working consumers in a group are equal to the number of partitions.
So if you have TopicA with 1 partition and you have 5 consumers in your consumer group, 4 of them will be idle.
If you have TopicA with 5 partitions and you have 5 consumers in your consumer group, all of them will be active and consuming messages from your topic.
To specify the number of partitions, you should create the topic from CLI instead of expecting Kafka to create it when you first publish messages.
To create a topic with a specific number of partitions:
bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 3 --partitions 3 --topic test
To alter the number of partitions in an already existed topic:
bin/kafka-topics.sh --zookeeper zk_host:port/chroot --alter --topic test
--partitions 40
Please note that you can only increase the number of partitions, you can not decrease them.
Please refer to Kafka Docs https://kafka.apache.org/documentation.html
Also if you'd like to understand more about Kafka please check the free book https://www.confluent.io/resources/kafka-the-definitive-guide/

Producer lost some message on kafka restart

Kafka Client : 0.11.0.0-cp1
Kafka Broker :
On Kafka broker rolling restart, our application lost some messages while sending to broker. I believe with rolling restart there should not be any loss of message. These are the producer (Using Producer with asynchronous send() and not using callback/future etc) settings we are using :
val acksConfig: String = "all",
val retriesConfig: Int = Int.MAX_VALUE,
val retriesBackOffConfig: Int = 1000,
val batchSize: Int = 32768,
val lingerTime: Int = 1,
val maxBlockTime: Int = Int.MAX_VALUE,
val requestTimeOut: Int = 420000,
val bufferMemory: Int = 33_554_432,
val compressionType: String = "gzip",
val keySerializer: Class<StringSerializer> = StringSerializer::class.java,
val valueSerializer: Class<ByteArraySerializer> = ByteArraySerializer::class.java
I am seeing these exceptions in the logs
2019-03-19 17:30:59,224 [org.apache.kafka.clients.producer.internals.Sender] [kafka-producer-network-thread | producer-1] (Sender.java:511) WARN org.apache.kafka.clients.producer.internals.Sender - Got error produce response with correlation id 1105790 on topic-partition catapult_on_entitlement_updates_prod-67, retrying (2147483643 attempts left). Error: NOT_LEADER_FOR_PARTITION
But log says retry attempt left, i am curious why didnt it retry then? Let me know if anyone has any idea?

Two things to note:
What is the replication factor of the topic you are producing and what is the required number of min.insync.replicas?
What do you mean by "producer lost some messages". The producer if it cannot successfully produce to #min.insync.replicas brokers it will throw an exception and fail (for synchronous production). It is up to the producer/ client to retry in case of failure (synchronous or asynchronous production).

log.debug("Coordinator discovery failed for group {}, refreshing metadata", groupId) with kafka 0.11.0.x

I'm using Kafka (version 0.11.0.2) server API to start a kafka broker in localhost.As it run without any problem.Producer also can send messages success.But consumer can't get messages and there is nothing error log in console.So I debugged the code and it looping for "refreshing metadata".
Here is the source code
while (coordinatorUnknown()) {
RequestFuture<Void> future = lookupCoordinator();
client.poll(future, remainingMs);
if (future.failed()) {
if (future.isRetriable()) {
remainingMs = timeoutMs - (time.milliseconds() - startTimeMs);
if (remainingMs <= 0)
break;
log.debug("Coordinator discovery failed for group {}, refreshing metadata", groupId);
client.awaitMetadataUpdate(remainingMs);
} else
throw future.exception();
} else if (coordinator != null && client.connectionFailed(coordinator)) {
// we found the coordinator, but the connection has failed, so mark
// it dead and backoff before retrying discovery
coordinatorDead();
time.sleep(retryBackoffMs);
}
remainingMs = timeoutMs - (time.milliseconds() - startTimeMs);
if (remainingMs <= 0)
break;
}
Adddtion: I change the Kafka version to 0.10.x，its run OK.
Here is my Kafka server code.
private static void startKafkaLocal() throws Exception {
final File kafkaTmpLogsDir = File.createTempFile("zk_kafka", "2");
if (kafkaTmpLogsDir.delete() && kafkaTmpLogsDir.mkdir()) {
Properties props = new Properties();
props.setProperty("host.name", KafkaProperties.HOSTNAME);
props.setProperty("port", String.valueOf(KafkaProperties.KAFKA_SERVER_PORT));
props.setProperty("broker.id", String.valueOf(KafkaProperties.BROKER_ID));
props.setProperty("zookeeper.connect", KafkaProperties.ZOOKEEPER_CONNECT);
props.setProperty("log.dirs", kafkaTmpLogsDir.getAbsolutePath());
//advertised.listeners=PLAINTEXT://xxx.xx.xx.xx:por
// flush every message.
// flush every 1ms
props.setProperty("log.default.flush.scheduler.interval.ms", "1");
props.setProperty("log.flush.interval", "1");
props.setProperty("log.flush.interval.messages", "1");
props.setProperty("replica.socket.timeout.ms", "1500");
props.setProperty("auto.create.topics.enable", "true");
props.setProperty("num.partitions", "1");
KafkaConfig kafkaConfig = new KafkaConfig(props);
KafkaServerStartable kafka = new KafkaServerStartable(kafkaConfig);
kafka.startup();
System.out.println("start kafka ok "+kafka.serverConfig().numPartitions());
}
}
Thanks.

With kafka 0.11, if you set num.partitions to 1 you also need to set the following 3 settings:
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
That should be obvious from your server logs when running 0.11.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Kafka Message loss when Kafkaserver goes down - apache-kafka

Related

Cannot use csvSftpConnector with schema registry

node-rdkafka - debug set to all but I only see broker transport failure

Creating partition for topic in kafka-node

Producer lost some message on kafka restart

log.debug("Coordinator discovery failed for group {}, refreshing metadata", groupId) with kafka 0.11.0.x

Categories

Resources