We have an 8-node Kafka cluster with Kafka Manager installed.
We are monitoring via New Relic.
Both New Relic and Kafka Manager report that Kafka is rejecting bytes, and I am not able to find the cause.
There are no error lines in the broker logs.
JMX bean - JMX/kafka.server/BrokerTopicMetrics/BytesRejectedPerSec/OneMinuteRate
Kafka Config -
auto.create.topics.enable=false
auto.leader.rebalance.enable=true
broker.id=180
controlled.shutdown.enable=true
controlled.shutdown.max.retries=3
default.replication.factor=1
delete.topic.enable=true
kafka.http.metrics.host=0.0.0.0
kafka.http.metrics.port=24042
kafka.log4j.dir=/logs/kafka
kerberos.auth.enable=false
leader.imbalance.check.interval.seconds=300
leader.imbalance.per.broker.percentage=10
log.cleaner.dedupe.buffer.size=134217728
log.cleaner.delete.retention.ms=604800000
log.cleaner.enable=true
log.cleaner.min.cleanable.ratio=0.5
log.cleaner.threads=1
log.dirs=/kafka/data
log.retention.bytes=5368709120
log.retention.check.interval.ms=300000
log.retention.hours=72
log.retention.ms=259200000
log.roll.hours=168
log.segment.bytes=1073741824
message.max.bytes=3145728
min.insync.replicas=1
num.io.threads=8
num.partitions=1
num.replica.fetchers=6
offsets.topic.num.partitions=50
offsets.topic.replication.factor=3
port=9092
quota.consumer.default=52428800
quota.producer.default=26214400
replica.fetch.max.bytes=4194304
replica.lag.max.messages=6000
replica.lag.time.max.ms=60000
unclean.leader.election.enable=false
zookeeper.session.timeout.ms=6000
zookeeper.connect=zookeeper01.prod.***.com:2181,zookeeper02.prod.***.com:2181,zookeeper03.prod.***.com:2181
security.inter.broker.protocol=PLAINTEXT
listeners=PLAINTEXT://kafka01.prod.***.com:9092,
broker.id.generation.enable=false
sasl.kerberos.service.name=kafka
listeners=PLAINTEXT://:9092
num.network.threads=8
Examining the Kafka sources (ref1, ref2), it seems that the only thing counted into BytesRejectedPerSec (bytesRejectedRate) is messages whose size exceeds config.maxMessageSize (the broker's message.max.bytes, 3145728 in your config).
Note: recompression and message format conversion may also grow a message beyond the size the producer sent.
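As a concrete illustration, a record whose serialized size exceeds message.max.bytes is refused by the broker and counted into this metric. Below is a minimal sketch that should trigger it against the cluster above; the topic name is a placeholder, and max.request.size is raised so the producer itself does not reject the record first:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class OversizeProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka01.prod.***.com:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        // Raise the client-side cap so the broker, not the producer, rejects the record.
        props.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, 8 * 1024 * 1024);
        try (Producer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            byte[] payload = new byte[4 * 1024 * 1024]; // 4 MB > message.max.bytes=3145728
            producer.send(new ProducerRecord<>("test-topic", payload), (metadata, exception) -> {
                // Expect RecordTooLargeException here; each such rejection bumps BytesRejectedPerSec.
                if (exception != null) System.err.println("Rejected: " + exception);
            });
            producer.flush();
        }
    }
}

If the metric is nonzero in production, a likely culprit is some producer (or a recompressed/converted batch) occasionally crossing the 3 MB limit.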
Related
I have a problem using Debezium. I searched on the internet but I can't find a solution.
I'm using Windows 11 and Kafka 3.1.
Here are my config values:
zookeeper.properties:
dataDir=C:/debezium/kafka/data/zookeper
clientPort=2181
maxClientCnxns=0
admin.enableServer=false
server.properties
broker.id=0
listeners=PLAINTEXT://localhost:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
num.partitions=1
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=18000
group.initial.rebalance.delay.ms=0
connect-standalone.properties
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.storage.file.filename=C:/debezium/kafka/connect/connect.offsets
offset.flush.interval.ms=10000
offset.reset=latest
plugin.path=C:/debezium/kafka/connect
and transaction_connector.properties
name=wallet-transaction-connector
connector.class=io.debezium.connector.sqlserver.SqlServerConnector
database.hostname= {MY_HOSTNAME}
database.port=1433
database.user=sa
database.password= {SQL_PASSWORD}
database.server.name= {REMOTE_SQL_SERVER}
database.dbname=WalletDB
table.include.list=dbo.TxOpenProvision
database.history.kafka.bootstrap.servers=localhost:9092
database.history.kafka.topic=dbhistory.TxOpenProvision
include.schema.changes=true
I run ZooKeeper, Kafka, and Connect with the commands below:
ZooKeeper: .\bin\windows\zookeeper-server-start.bat .\config\zookeeper.properties
Kafka: .\bin\windows\kafka-server-start.bat .\config\server.properties
Connect: .\bin\windows\connect-standalone.bat .\config\connect-standalone.properties .\config\wallet_connector.properties
My SQL Server is on a remote server.
I'm getting the error below and can't resolve it. How can I solve it?
ERROR [wallet-transaction-connector|task-0]
WorkerSourceTask{id=wallet-transaction-connector-0} Task threw an
uncaught and unrecoverable exception. Task is being killed and will
not recover until manually restarted
(org.apache.kafka.connect.runtime.WorkerTask:195)
org.apache.kafka.common.config.ConfigException: Invalid value earl²est
for configuration auto.offset.reset: String must be one of: latest,
earliest, none
at org.apache.kafka.common.config.ConfigDef$ValidString.ensureValid(ConfigDef.java:961)
at org.apache.kafka.common.config.ConfigDef.parseValue(ConfigDef.java:499)
at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:483)
at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:113)
at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:133)
at org.apache.kafka.clients.consumer.ConsumerConfig.<init>(ConsumerConfig.java:630)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:664)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:645)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:625)
at io.debezium.relational.history.KafkaDatabaseHistory.storageExists(KafkaDatabaseHistory.java:356)
at io.debezium.relational.HistorizedRelationalDatabaseSchema.initializeStorage(HistorizedRelationalDatabaseSchema.java:80)
at io.debezium.connector.sqlserver.SqlServerConnectorTask.start(SqlServerConnectorTask.java:81)
at io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:130)
at org.apache.kafka.connect.runtime.WorkerSourceTask.initializeAndStart(WorkerSourceTask.java:225)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:186)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
As you can see in the logs, there is a special character ² in the value: Invalid value earl²est.
Also, in connect-standalone.properties, offset.reset is not a valid config; the consumer property is named auto.offset.reset.
Either way, Debezium is a producer (source connector), so setting auto.offset.reset doesn't make sense for it.
Also worth pointing out that Kafka's Windows support is very lacking; try using WSL2 instead.
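Since the root cause here is a single near-invisible byte in a config file, a quick way to locate it is to scan the properties files for anything outside 7-bit ASCII. A minimal sketch (pass the file to scan as the first argument; scanning raw bytes avoids charset-decoding failures):

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class FindNonAscii {
    public static void main(String[] args) throws Exception {
        // Point this at the suspect file, e.g. config\connect-standalone.properties.
        Path path = Paths.get(args[0]);
        byte[] bytes = Files.readAllBytes(path);
        int line = 1, col = 1;
        for (byte b : bytes) {
            if (b == '\n') { line++; col = 1; continue; }
            if ((b & 0xFF) > 127) {
                // The stray '²' shows up as 0xB2 (ANSI) or the pair 0xC2 0xB2 (UTF-8).
                System.out.printf("non-ASCII byte 0x%02X at line %d, col %d%n", b & 0xFF, line, col);
            }
            col++;
        }
    }
}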
I have an application which consumes messages from 4 Kafka topics. For simplicity's sake, let's call the topics a, b, c, and d. Each new version of the application uses a new consumer group id (basically a Docker image ID).
Today, I had a problem where a new version of the application launched with a new consumer group which connected to topics a, b, and d, but not c. Looking in Kafka Manager, the new consumer group had no entry for topic c.
I can see an error in the client error logs:
Consumer clientId=indexer, groupId=650-c6ac848] Node 331 sent an invalid full fetch response with extra=(a-28, response=(c-28","logger_name":"org.apache.kafka.clients.FetchSessionHandler","thread_name":"kafka-coordinator-heartbeat-thread
I suspect it may be an infrastructure or configuration issue, but I can't be certain. I'm a developer and not very familiar with Kafka, so I don't know where to look. The application code changes were minimal and shouldn't have impacted the consumer group setup.
The log message suggests to me something related to the heartbeat, and that topics a and c have had their wires crossed somehow.
server.properties:
advertised.listeners=PLAINTEXT://kafka1.dub1.cloud:9092
auto.create.topics.enable=false
broker.id=16
broker.rack=dub1-zone4
default.replication.factor=3
delete.topic.enable=true
group.initial.rebalance.delay.ms=3
log.dirs=/var/lib/kafka
log.retention.check.interval.ms=300000
log.retention.hours=168
log.segment.bytes=1073741824
min.insync.replicas=2
num.io.threads=8
num.network.threads=3
num.partitions=30
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=3
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
socket.send.buffer.bytes=102400
transaction.state.log.min.isr=2
transaction.state.log.replication.factor=3
unclean.leader.election.enable=false
zookeeper.connect=zookeeper1:2181,zookeeper2:2181,zookeeper3:2181
zookeeper.connection.timeout.ms=6000
Looking at the source code isn't helpful to the uninitiated:
https://github.com/apache/kafka/blob/b8a99be7847c61d7792689b71fda5b283f8340a8/clients/src/main/java/org/apache/kafka/clients/FetchSessionHandler.java#L394
Any suggestions on how to further diagnose this problem would be greatly appreciated.
Turns out topic c had no messages, and that seems to be the reason for the errors I saw.
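For anyone hitting the same thing, one way to check whether a topic is empty is to compare beginning and end offsets for each partition; if they are equal everywhere, the topic currently holds no readable messages. A minimal sketch, assuming the topic name c and the broker host from the question's advertised.listeners:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class TopicEmptyCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka1.dub1.cloud:9092");
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());
        // No group.id needed: we only query metadata and offsets, never subscribe.
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            List<TopicPartition> parts = new ArrayList<>();
            consumer.partitionsFor("c").forEach(p -> parts.add(new TopicPartition(p.topic(), p.partition())));
            Map<TopicPartition, Long> begin = consumer.beginningOffsets(parts);
            Map<TopicPartition, Long> end = consumer.endOffsets(parts);
            boolean empty = parts.stream().allMatch(tp -> begin.get(tp).equals(end.get(tp)));
            System.out.println("topic c empty: " + empty);
        }
    }
}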
I have a Kafka cluster of 3 brokers on 3 different servers.
Let's assume the three servers are:
99.99.99.1
99.99.99.2
99.99.99.3
All 3 servers have a shared path on which Kafka resides.
I have created 3 server properties files named:
server1.properties
server2.properties
server3.properties
server1.properties looks like this:
broker.id=1
port=9094
listeners=SSL://99.99.99.1:9094
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=3
transaction.state.log.min.isr=3
zookeeper.connect=99.99.99.1:2181,99.99.99.2:2182,99.99.99.3:2183
ssl.keystore.location=xyz.jks
ssl.keystore.password=password
ssl.key.password=password
ssl.truststore.location=xyz.jks
ssl.truststore.password=password
ssl.client.auth=required
security.inter.broker.protocol=SSL
The other two server properties files look similar.
Issues/Queries:
I need consumers and producers to connect using SSL, and all the brokers to connect to each other using SSL as well. Is my configuration right for this?
I keep getting the error below; is this usual?
WARN Failed to send SSL Close message
(org.apache.kafka.common.network.SslTransportLayer)
java.io.IOException: Broken pipe
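On the first question: the broker side looks right for SSL-only operation (an SSL listener plus security.inter.broker.protocol=SSL), but clients also need matching SSL settings or they will attempt PLAINTEXT and fail. A minimal producer sketch, reusing the keystore/truststore values from the question (the topic name is a placeholder); since the brokers set ssl.client.auth=required, the client keystore is mandatory:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SslClientSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "99.99.99.1:9094");
        props.put("security.protocol", "SSL");           // without this, clients default to PLAINTEXT
        props.put("ssl.truststore.location", "xyz.jks"); // values from the broker config above
        props.put("ssl.truststore.password", "password");
        // Required because the brokers enforce ssl.client.auth=required.
        props.put("ssl.keystore.location", "xyz.jks");
        props.put("ssl.keystore.password", "password");
        props.put("ssl.key.password", "password");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test-topic", "hello over SSL"));
            producer.flush();
        }
    }
}

On the warning: "Failed to send SSL Close message ... Broken pipe" is commonly logged when a peer closes the TCP connection without completing the SSL close handshake (for example a non-SSL client or a health check probing the port); on its own it is usually harmless.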
I'm trying to build a system that collects data from agents and pushes it into Kafka servers (via Logstash). After configuring the Kafka server, I tested with a consumer and producer (on Kafka) locally and everything was fine. But when I try to push data from a Logstash agent (on a remote server) into Kafka and monitor with a consumer, nothing happens; no data is pushed.
Please give me some hints.
My config is below:
Logstash config:
output {
  stdout { codec => rubydebug }
  kafka {
    bootstrap_servers => "public_IP_remote_server"
    topic_id => "iis"
  }
}
consumer.properties (Kafka server):
zookeeper.connect=*public_IP_remote_server*:2181
zookeeper.connection.timeout.ms=6000
group.id=test-consumer-group
producer.properties (Kafka server):
bootstrap.servers=*public_IP_remote_server*:9092
compression.type=none
server.properties (Kafka server):
broker.id=0
advertised.host.name=*public_IP_remote_server*
advertised.port=9092
listeners=PLAINTEXT://*public_IP_remote_server*:9092
delete.topic.enable=true
advertised.listeners=PLAINTEXT://*public_IP_remote_server*:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connection.timeout.ms=6000
zookeeper.connect=localhost:2181
zookeeper.properties (on the Kafka server):
dataDir=/tmp/zookeeper
clientPort=2181
maxClientCnxns=0
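One way to isolate whether the problem is Logstash or basic network reachability is to push a single test record to the public listener directly from the remote host and check the send callback. A minimal sketch (the address is the same placeholder used above; note that Kafka clients expect host:port in bootstrap.servers, so it is worth checking that the port is present in the Logstash bootstrap_servers as well):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class RemoteReachabilityTest {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder: substitute the broker's public address, including the port.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "public_IP_remote_server:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("iis", "probe"), (metadata, exception) -> {
                // Connectivity and advertised-listener problems surface as the exception here.
                if (exception != null) exception.printStackTrace();
                else System.out.println("wrote to " + metadata.topic() + "-" + metadata.partition() + "@" + metadata.offset());
            });
            producer.flush();
        }
    }
}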
As part of our current Kafka cluster, high-availability (HA) testing is being done. The objective is: while a producer job is pushing data to a particular partition of a topic, all the brokers in the Kafka cluster are restarted sequentially (stop the first broker, restart it, and after it comes back up, do the same for the second broker, and so on). The producer job pushes around 7 million records over about 30 minutes while this test is going on. At the end of the job, we noticed that around 1000 records were missing.
Below are the specifics of our Kafka cluster (kafka_2.10-0.8.2.0):
- 3 Kafka brokers, each with two 100 GB mounts
The topic was created with:
- Replication factor of 3
- min.insync.replicas=2
server.properties:
broker.id=1
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dirs=/drive1,/drive2
num.partitions=1
num.recovery.threads.per.data.dir=1
log.flush.interval.messages=10000
log.retention.hours=1
log.segment.bytes=1073741824
log.retention.check.interval.ms=1800000
log.cleaner.enable=false
zookeeper.connect=ZK1:2181,ZK2:2181,ZK3:2181
zookeeper.connection.timeout.ms=10000
advertised.host.name=XXXX
auto.leader.rebalance.enable=true
auto.create.topics.enable=false
queued.max.requests=500
delete.topic.enable=true
controlled.shutdown.enable=true
unclean.leader.election=false
num.replica.fetchers=4
controller.message.queue.size=10
producer.properties (async producer with the new producer API):
bootstrap.servers=broker1:9092,broker2:9092,broker3:9092
acks=all
buffer.memory=33554432
compression.type=snappy
batch.size=32768
linger.ms=5
max.request.size=1048576
block.on.buffer.full=true
reconnect.backoff.ms=10
retry.backoff.ms=100
key.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
Can someone share any info about Kafka clusters and HA to ensure that data is not lost while rolling-restarting Kafka brokers?
Also, here is my producer code. This is a fire-and-forget kind of producer; we are not handling failures explicitly as of now. It works fine for millions of records; I see the problem only when Kafka brokers are restarted as explained above.
public void sendMessage(List<byte[]> messages, String destination, Integer partition, String kafkaDBKey) {
    // Fire-and-forget: the Future returned by send() is ignored and no callback is
    // registered, so records that fail during a broker restart are silently dropped.
    for (byte[] message : messages) {
        producer.send(new ProducerRecord<byte[], byte[]>(destination, partition, kafkaDBKey.getBytes(), message));
    }
}
By increasing the default retries value from 0 to 4000 on the producer side, we are able to send data successfully without losing any:
retries=4000
Due to this setting, there is a possibility of sending the same message twice, and messages may be out of sequence by the time the consumer receives them (the second message might arrive before the first). For our current problem that is not an issue; it is handled on the consumer side to ensure everything is in order.
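For reference, here is a sketch of the producer settings described in this thread (broker list taken from the question). As a known trade-off, when retries are enabled, capping max.in.flight.requests.per.connection at 1 prevents retries from reordering messages, at some cost in throughput; that line is an optional addition, not part of the original setup:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;

public class RetryingProducerConfig {
    public static Producer<byte[], byte[]> create() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
        props.put("acks", "all");   // wait for all in-sync replicas before acknowledging
        props.put("retries", 4000); // the fix from this answer
        // Optional: stops retries from reordering messages; duplicates remain possible.
        props.put("max.in.flight.requests.per.connection", 1);
        props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        return new KafkaProducer<>(props);
    }
}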