Multi-node Kafka cluster: producer and consumer not working - apache-kafka

I have a Kafka cluster consisting of two machines. This is my server.properties:
broker.id=2
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://a.b.c.d:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=2
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=a.b.c.d:2181,a.b.c.e:2181
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0
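The file above is from one machine; each broker must have a unique broker.id and advertise its own address. A minimal sketch of how the two files would differ (the second file's id and host are assumptions, inferred from the one shown):
# server.properties on a.b.c.d (the file shown above)
broker.id=2
advertised.listeners=PLAINTEXT://a.b.c.d:9092
# server.properties on a.b.c.e
broker.id=1
advertised.listeners=PLAINTEXT://a.b.c.e:9092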
And this is my zookeeper.properties:
dataDir=/tmp/zookeeper
clientPort=2181
maxClientCnxns=0
tickTime=2000
server.1=a.b.c.d:2888:3888
server.2=a.b.c.e:2888:3888
initLimit=20
syncLimit=10
(a.b.c.d and a.b.c.e stand for the IPs these machines have, e.g. 192.168.....)
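One detail worth checking with a multi-server ensemble like this: ZooKeeper also requires a myid file in dataDir on each node, matching that node's server.N line. A minimal sketch, assuming the dataDir above:
# on a.b.c.d (server.1)
echo 1 > /tmp/zookeeper/myid
# on a.b.c.e (server.2)
echo 2 > /tmp/zookeeper/myid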
I start the zookeeper server on both machines using:
bin/zookeeper-server-start config/zookeeper.properties
I then start the Kafka servers on both nodes. After this, I am able to create a new topic and get its details with --describe. However, I am unable to consume or produce messages. I run these with:
bin/kafka-console-consumer --bootstrap-server a.b.c.d:9092,a.b.c.e:9092 --topic randomTopic --from-beginning
bin/kafka-console-producer --broker-list a.b.c.d:9092,a.b.c.e:9092 --topic randomTopic
When I run the producer, the prompt (>) appears and I can type into it. However, the consumer cannot read anything and the screen stays blank.
How do I make the consumer read the data in the topics, or the producer able to write data to them?
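A quick way to narrow this down is to verify that each advertised listener is actually reachable from the other machine; kafka-broker-api-versions performs a full client handshake (script name without the .sh suffix, matching the commands above):
bin/kafka-broker-api-versions --bootstrap-server a.b.c.d:9092
bin/kafka-broker-api-versions --bootstrap-server a.b.c.e:9092
If either call times out, the problem is listener reachability rather than the console clients.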

Related

Configure Apache KAFKA with external and internal listeners and SASL Authentication for external publish/subscribe

I want to configure Kafka authentication (just authentication, no encryption is needed for now) using 2 listeners:
one for interbroker private comunication with PLAINTEXT security
one for consumer/producers public communication with SASL_PLAINTEXT and SCRAM-SHA-256
I have one Kafka cluster with just one broker (for testing purposes) and a ZooKeeper cluster with 2 nodes.
The steps I've done are:
Create 'admin' and 'test-user' users in ZooKeeper:
kafka-configs.sh --zookeeper zk:2181 --alter --add-config 'SCRAM-SHA-256=[iterations=8192,password=test-secret]' \
--entity-type users --entity-name test-user
kafka-configs.sh --zookeeper zk:2181 --alter --add-config 'SCRAM-SHA-256=[password=admin-secret]' \
--entity-type users --entity-name admin
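To confirm the credentials actually landed in ZooKeeper, the same tool can describe them:
kafka-configs.sh --zookeeper zk:2181 --describe --entity-type users --entity-name test-user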
Configure server properties as follows:
############################# Server Basics #############################
broker.id=1
############################# Socket Server Settings #############################
listeners=EXTERNAL://0.0.0.0:9095,INTERNAL://:9092
advertised.listeners=EXTERNAL://172.20.30.40:9095,INTERNAL://:9092
listener.security.protocol.map=INTERNAL:PLAINTEXT, EXTERNAL:SASL_PLAINTEXT
inter.broker.listener.name=INTERNAL
sasl.mechanism.inter.broker.protocol=SCRAM-SHA-256
sasl.enabled.mechanisms=PLAIN, SCRAM-SHA-256
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
############################# Log Basics #############################
log.dirs=/opt/kafka/logs
num.partitions=1
num.recovery.threads.per.data.dir=1
delete.topic.enable=false
auto.create.topics.enable=true
default.replication.factor=1
############################# Log Flush Policy #############################
#log.flush.interval.messages=10000
#log.flush.interval.ms=1000
############################# Log Retention Policy #############################
log.retention.hours=168
#log.retention.bytes=1073741824
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
log.cleaner.enable=true
############################# Offset Retention #############################
offsets.retention.minutes=1440
############################# Connect Policy #############################
zookeeper.connect=10.42.203.74:2181,10.42.214.116:2181
zookeeper.connection.timeout.ms=6000
Create a file kafka_server_jaas.conf and pass it to Kafka during boot using:
-Djava.security.auth.login.config=/opt/kafka/config/kafka_server_jaas.conf
The file contains:
internal.KafkaServer {
  org.apache.kafka.common.security.plain.PlainLoginModule required
  username="admin"
  password="admin-secret";
};
external.KafkaServer {
  org.apache.kafka.common.security.scram.ScramLoginModule required;
};
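For reference, a common way to pass that flag at boot with the stock launcher is through KAFKA_OPTS (a sketch, assuming the standard start script):
export KAFKA_OPTS="-Djava.security.auth.login.config=/opt/kafka/config/kafka_server_jaas.conf"
kafka-server-start.sh config/server.properties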
Create a test-topic to publish/subscribe to:
kafka-topics.sh --create --zookeeper zk:2181 --replication-factor 1 --partitions 3 --topic test-topic
Create a client-secure.properties file for publishing as test-user with its credentials:
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
username="test-user" \
password="test-secret";
security.protocol=SASL_PLAINTEXT
sasl.mechanism=SCRAM-SHA-256
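The same properties file should also work on the consuming side, e.g.:
kafka-console-consumer.sh --bootstrap-server 172.20.30.40:9095 --topic test-topic \
--consumer.config client-secure.properties --from-beginning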
And finally, try publishing to the previously created 'test-topic' via the EXTERNAL listener, authenticating as 'test-user':
kafka-console-producer.sh --broker-list 172.20.30.40:9095 --topic test-topic \
--producer.config client-secure.properties
I always get the following error:
ERROR [Producer clientId=console-producer] Connection to node -1 failed authentication due to:
Client SASL mechanism 'SCRAM-SHA-256' not enabled in the server, enabled mechanisms are [PLAIN]
(org.apache.kafka.clients.NetworkClient)
Why is the SCRAM-SHA-256 mechanism not enabled on the server? Shouldn't it be enabled by the 'sasl.enabled.mechanisms=PLAIN, SCRAM-SHA-256' property in 'server.properties', together with the SCRAM config for the external listener defined in 'kafka_server_jaas.conf'?
I've already spent 2 days in a row fighting with this, applying different configurations without any success. Any help would be very much appreciated.
Thanks in advance.
After days of struggling with it, I found the solution.
I didn't mention in the post that I'm running Kafka as a container in Rancher, and port 9095 of the EXTERNAL listener was not mapped in Rancher, so it wasn't mapped in the Docker container either.
Even though I was doing the tests from inside the container: if the port of the listener you're publishing/subscribing on is not mapped, it doesn't work.
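In other words, the listener's port has to be published when the container is started. A minimal sketch with plain Docker (the image name is hypothetical; Rancher has its own port-mapping configuration):
# publish the EXTERNAL SASL listener (9095) alongside the internal one (9092)
docker run -d --name kafka -p 9095:9095 -p 9092:9092 some-kafka-image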

Kafka Broker Issue (Replica Manager with max size)

I am seeing the following errors in my Kafka environment. It works for a few hours and then chokes.
20200224;21:01:38: [2020-02-24 21:01:38,615] ERROR [ReplicaManager broker=0] Error processing fetch with max size 1048576 from consumer on partition SANDBOX.BROKER.NEWORDER-0: (fetchOffset=211886, logStartOffset=-1, maxBytes=1048576, currentLeaderEpoch=Optional.empty) (kafka.server.ReplicaManager)
20200224;21:01:38: org.apache.kafka.common.errors.CorruptRecordException: Found record size 0 smaller than minimum record overhead (14) in file /data/tmp/kafka-topic-logs/SANDBOX.BROKER.NEWORDER-0/00000000000000000000.log.
20200224;21:05:48: [2020-02-24 21:05:48,711] INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 1 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
20200224;21:10:22: [2020-02-24 21:10:22,204] INFO [GroupCoordinator 0]: Member xxxxxxxx_011-9e61d2c9-ce5a-4231-bda1-f04e6c260dc0-StreamThread-1-consumer-27768816-ee87-498f-8896-191912282d4f in group yyyyyyyyy_011 has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
Setup:
1. Kafka broker (kafka_2.12-2.1.1)
2. ZooKeeper
Kafka config:
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
group.initial.rebalance.delay.ms=0
delete.topic.enable=true
log.dirs=/data/tmp/kafka-topic-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.flush.interval.ms=1000
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000
ZooKeeper config:
dataDir=/data/tmp/zookeeper
clientPort=2181
maxClientCnxns=0
delete.topic.enable=true
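The CorruptRecordException names the exact segment file, which can be inspected with Kafka's DumpLogSegments tool (path copied from the error above):
bin/kafka-run-class.sh kafka.tools.DumpLogSegments --deep-iteration \
--files /data/tmp/kafka-topic-logs/SANDBOX.BROKER.NEWORDER-0/00000000000000000000.log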

Kafka Broker leader change without effect

I have 3 Kafka brokers, with 3 partitions:
broker.id 1001: 10.18.0.73:9092 LEADER
broker.id 1002: 10.18.0.73:9093
broker.id 1005: 10.18.0.73:9094
ZooKeeper is set to 127.0.0.1:2181.
Launch with:
1001 -> .\kafka-server-start.bat ..\..\config\server.properties
1002 -> .\kafka-server-start.bat ..\..\config\server1.properties
1005 -> .\kafka-server-start.bat ..\..\config\server2.properties
This is server.properties
broker.id=-1
listeners=PLAINTEXT://10.18.0.73:9092
advertised.listeners=PLAINTEXT://10.18.0.73:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=10.18.0.73:2181
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0
advertised.port=9092
advertised.host.name=10.18.0.73
port=9092
This is server1.properties
broker.id=-1
listeners=PLAINTEXT://10.18.0.73:9093
advertised.listeners=PLAINTEXT://10.18.0.73:9093
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs4
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0
advertised.port=9093
advertised.host.name=10.18.0.73
port=9093
This is server2.properties
broker.id=-1
listeners=PLAINTEXT://10.18.0.73:9094
advertised.listeners=PLAINTEXT://10.18.0.73:9094
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs2
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0
advertised.port=9094
advertised.host.name=10.18.0.73
port=9094
All of these files are in C:\kafka_2.12-2.4.0\config.
Run all three brokers.
Run Producer
.\kafka-console-producer.bat --broker-list 10.18.0.73:9092,10.18.0.73:9093,10.18.0.73:9094 --topic clinicaleventmanager
Run Consumer
.\kafka-console-consumer.bat --bootstrap-server 10.18.0.73:9092,10.18.0.73:9093,10.18.0.73:9094 --topic clinicaleventmanager
I send a test message.
It is received OK!
Now I shut down broker 1001 (the leader).
The new leader is 1002.
In the consumer, this message appeared for about a second, which I imagine is the time needed to elect the new leader:
[2020-01-16 15:33:35,802] WARN [Consumer clientId=consumer-console-consumer-56669-1, groupId=console-consumer-56669] Connection to node 2147482646 (/10.18.0.73:9092) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
If I try to send another message, it is not read by the consumer.
The new leader, 1002, does not appear to be serving messages.
Why?
If I start broker 1001 again, everything works.
Thanks
First, Kafka never "sends (pushes) messages"; the consumer asks for them.
Second, it would seem you've changed nothing but the listeners, port, and log dir.
You don't explicitly create any topic, so you end up with the defaults of one partition and one replica, both for your topic and for the internal consumer offsets topic.
If a partition's only replica is on the broker you stopped, it is offline, and no other process can read from (or write to) that partition, regardless of which broker is the controller.
So, change the offsets (and transactions) replication factor to 3 and try again.
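Concretely, that means something like this in each broker's server.properties (the min.isr value here is a judgment call):
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=3
transaction.state.log.min.isr=2
Note that the internal topics are created once, on first use, so an existing __consumer_offsets topic keeps its old replication factor until its partitions are reassigned. For your own topic, delete the auto-created one and recreate it with three replicas (Kafka 2.4 accepts --bootstrap-server here):
.\kafka-topics.bat --create --bootstrap-server 10.18.0.73:9092 --replication-factor 3 --partitions 3 --topic clinicaleventmanager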

Why does the Kafka cluster report the error "Number of alive brokers '0' does not meet the required replication factor"?

I have 2 Kafka brokers and 1 ZooKeeper. The brokers' server.properties files:
Broker 1:
auto.create.topics.enable=true
broker.id=1
delete.topic.enable=true
group.initial.rebalance.delay.ms=0
listeners=PLAINTEXT://5.1.2.3:9092
log.dirs=/opt/kafka_2.12-2.1.0/logs
log.retention.check.interval.ms=300000
log.retention.hours=168
log.segment.bytes=1073741824
max.message.bytes=105906176
message.max.bytes=105906176
num.io.threads=8
num.network.threads=3
num.partitions=10
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
replica.fetch.max.bytes=105906176
socket.receive.buffer.bytes=102400
socket.request.max.bytes=105906176
socket.send.buffer.bytes=102400
transaction.state.log.min.isr=1
transaction.state.log.replication.factor=1
zookeeper.connect=5.1.3.6:2181
zookeeper.connection.timeout.ms=6000
Broker 2:
auto.create.topics.enable=true
broker.id=2
delete.topic.enable=true
group.initial.rebalance.delay.ms=0
listeners=PLAINTEXT://18.4.6.6:9092
log.dirs=/opt/kafka_2.12-2.1.0/logs
log.retention.check.interval.ms=300000
log.retention.hours=168
log.segment.bytes=1073741824
max.message.bytes=105906176
message.max.bytes=105906176
num.io.threads=8
num.network.threads=3
num.partitions=10
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
replica.fetch.max.bytes=105906176
socket.receive.buffer.bytes=102400
socket.request.max.bytes=105906176
socket.send.buffer.bytes=102400
transaction.state.log.min.isr=1
transaction.state.log.replication.factor=1
zookeeper.connect=5.1.3.6:2181
zookeeper.connection.timeout.ms=6000
If I query ZooKeeper like this:
echo dump | nc zook_IP 2181
I get:
SessionTracker dump:
Session Sets (3):
0 expire at Sun Jan 04 03:40:27 MSK 1970:
1 expire at Sun Jan 04 03:40:30 MSK 1970:
0x1000bef9152000b
1 expire at Sun Jan 04 03:40:33 MSK 1970:
0x1000147d4b40003
ephemeral nodes dump:
Sessions with Ephemerals (2):
0x1000147d4b40003:
/controller
/brokers/ids/2
0x1000bef9152000b:
/brokers/ids/1
This looks fine, but it doesn't work :(. ZooKeeper sees 2 brokers, but the first Kafka broker logs this error:
ERROR [KafkaApi-1] Number of alive brokers '0' does not meet the required replication factor '1' for the offsets topic (configured via 'offsets.topic.replication.factor'). This error can be ignored if the cluster is starting up and not all brokers are up yet. (kafka.server.KafkaApis)
We also use kafka_exporter for Prometheus, and it logs this error:
Cannot get oldest offset of topic Some.TOPIC partition 9: kafka server: Request was for a topic or partition that does not exist on this broker." source="kafka_exporter.go:296
Please help! Where is the mistake in my config?
Are your clocks working? ZooKeeper thinks it's 1970:
Sun Jan 04 03:40:27 MSK 1970
You may want to look at the rest of the logs, or check that Kafka and ZooKeeper are actually running and their ports are open.
In your first message, after starting a fresh cluster, you see this, so it's not a true error:
This error can be ignored if the cluster is starting up and not all brokers are up yet. (kafka.server.KafkaApis)
The properties you show, though, have listeners on entirely different subnets, and you're not using advertised.listeners.
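For example, binding on all interfaces and advertising each broker's own address (IPs taken from the question):
# broker 1
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://5.1.2.3:9092
# broker 2
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://18.4.6.6:9092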
A changed Kafka broker.id may also cause this problem. Clean up the Kafka metadata under ZooKeeper; note that Kafka data will be lost.
I got this error message in this situation:
Cluster talking in SSL
Every broker is a container
Updated the certificate with a new password inside ALL brokers
Rolling update
After the first broker rebooted, it spammed this error message, and the controller reported that "a new broker connected but password verification failed".
Solutions:
Set the new certificate's password to the old password
Take your entire cluster down and bring it back up all at once
(not tested yet) Change the certificate on one broker, reboot it, then move to the next broker until all of them have been rebooted (ending with the controller)

Kafka producer and consumer on separate computers aren't communicating

I'm using kafka_2.11-1.1.0. This is my server.properties file:
broker.id=1
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=192.168.1.110:2181,192.168.1.64:2181
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0
On the second computer, broker.id=2. I got the IP addresses for the zookeeper.connect line by typing ipconfig into the command prompt and using the IPv4 address listed under "Ethernet adapter Local Area Connection" on one computer and the one under "Wireless LAN adapter Wi-Fi" on the other.
I ran these commands on each computer (to anyone following along: run the first on both computers before running the second):
bin\windows\zookeeper-server-start.bat config\zookeeper.properties
bin\windows\kafka-server-start.bat config\server.properties
On the first computer, I created a topic and started a producer console:
bin\windows\kafka-topics.bat --create --zookeeper 192.168.1.110:2181 --replication-factor 2 --partitions 1 --topic test
bin\windows\kafka-console-producer.bat --broker-list 192.168.1.110:2181 --topic test
On the second one, I started a consumer console:
bin\windows\kafka-console-consumer.bat --bootstrap-server 192.168.1.64:2181 --topic test
When I tried sending a message, the consumer did not receive it. The ZooKeeper server console on each computer looped through the following messages about once a second, with each IP value corresponding to its respective PC and the port number increasing by one on each loop:
INFO Accepted socket connection from /192.168.1.110:55371 (org.apache.zookeeper.server.NIOServerCnxnFactory)
WARN Exception causing close of session 0x0 due to java.io.EOFException (org.apache.zookeeper.server.NIOServerCnxn)
INFO Closed socket connection for client /192.168.1.110:55371 (no session established for client) (org.apache.zookeeper.server.NIOServerCnxn)
In the producer console, this error appeared after a minute:
ERROR Error when sending message to topic test with key: null, value: 6 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
How do I fix this? Any help would be appreciated.
UPDATE - SOLVED - courtesy of Victor:
Change:
bin\windows\kafka-console-producer.bat --broker-list 192.168.1.110:2181 --topic test
and
bin\windows\kafka-console-consumer.bat --bootstrap-server 192.168.1.64:2181 --topic test
To:
bin\windows\kafka-console-producer.bat --broker-list 192.168.1.110:9092 --topic test
and
bin\windows\kafka-console-consumer.bat --bootstrap-server 192.168.1.64:9092 --topic test
UPDATE 2
For anyone following this to set up two computers with Kafka: I found that this method doesn't always work. The permanent solution I later found was to use the same IP for both computers. I used the IP of the computer with the Ethernet connection, which happened to be the one running the producer.
I believe you have to pass the producer a list of Kafka brokers, not a ZooKeeper quorum.
So change this:
bin\windows\kafka-console-producer.bat --broker-list 192.168.1.110:2181
To something like this:
bin\windows\kafka-console-producer.bat --broker-list 192.168.1.110:9092
(I'm assuming you are running your Kafka server there)
I got a similar error writing to Kafka with Spark Streaming; see:
Error connecting to Zookeeper with Spark Streaming Structured