Kafka TOPIC_AUTHORIZATION_FAILED - apache-kafka

I'm currently setting up simple Kafka authentication using SASL/PLAIN and adding ACL authorization, but I run into a problem when I try to consume data:
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version : 0.10.0.0
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId : b8642491e78c5a13
[main] WARN org.apache.kafka.clients.NetworkClient - Error while fetching metadata with correlation id 1 : {test-topic=TOPIC_AUTHORIZATION_FAILED}
[main] WARN org.apache.kafka.clients.NetworkClient - Error while fetching metadata with correlation id 2 : {test-topic=TOPIC_AUTHORIZATION_FAILED}
[main] WARN org.apache.kafka.clients.NetworkClient - Error while fetching metadata with correlation id 3 : {test-topic=TOPIC_AUTHORIZATION_FAILED}
[main] WARN org.apache.kafka.clients.NetworkClient - Error while fetching metadata with correlation id 4 : {test-topic=TOPIC_AUTHORIZATION_FAILED}
[main] WARN org.apache.kafka.clients.NetworkClient - Error while fetching metadata with correlation id 5 : {test-topic=TOPIC_AUTHORIZATION_FAILED}
[main] WARN org.apache.kafka.clients.NetworkClient - Error while fetching metadata with correlation id 6 : {test-topic=TOPIC_AUTHORIZATION_FAILED}
[main] WARN org.apache.kafka.clients.NetworkClient - Error while fetching metadata with correlation id 7 : {test-topic=TOPIC_AUTHORIZATION_FAILED}
[main] WARN org.apache.kafka.clients.NetworkClient - Error while fetching metadata with correlation id 8 : {test-topic=TOPIC_AUTHORIZATION_FAILED}
[main] WARN org.apache.kafka.clients.NetworkClient - Error while fetching metadata with correlation id 9 : {test-topic=TOPIC_AUTHORIZATION_FAILED}
[main] WARN org.apache.kafka.clients.NetworkClient - Error while fetching metadata with correlation id 10 : {test-topic=TOPIC_AUTHORIZATION_FAILED}
Next, you can see my configuration files.
server.properties
listeners=SASL_PLAINTEXT://localhost:9092
security.inter.broker.protocol=SASL_PLAINTEXT
sasl.mechanism.inter.broker.protocol=PLAIN
sasl.enabled.mechanisms=PLAIN
broker.id=0
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000
authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer
producer.properties
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
bootstrap.servers=localhost:9092
compression.type=none
consumer.properties
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
zookeeper.connect=127.0.0.1:2181
zookeeper.connection.timeout.ms=6000
group.id=test-consumer-group
kafka_server_jaas.conf
KafkaServer {
org.apache.kafka.common.security.plain.PlainLoginModule required
username="admin"
password="admin-secret"
user_admin="admin-secret"
user_alice="alice-secret";
};
KafkaClient {
org.apache.kafka.common.security.plain.PlainLoginModule required
username="alice"
password="alice-secret";
};
Environment variable:
export KAFKA_OPTS="-Djava.security.auth.login.config=/home/user/kafka_2.10-0.10.0.1/kafka_server_jaas.conf"
Commands
Set ACL:
bin/kafka-acls.sh --authorizer kafka.security.auth.SimpleAclAuthorizer --authorizer-properties zookeeper.connect=localhost:2181 --add --allow-principal User:alice --operation All --group test-consumer-group --topic test-topic
Start Kafka server:
./bin/kafka-server-start.sh config/server.properties
Start Producer:
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test-topic --producer.config=config/producer.properties
Start Consumer:
bin/kafka-console-consumer.sh --new-consumer --zookeeper localhost:2181 --topic test-topic --from-beginning --consumer.config=config/consumer.properties --bootstrap-server=localhost:9092
When I try to start the consumer, I have the issue described above. Also, in the kafka logs, I have this:
[2016-10-22 20:17:14,091] ERROR [KafkaApi-0] Error when handling request {group_id=test-consumer-group} (kafka.server.KafkaApis)
kafka.admin.AdminOperationException: replication factor: 3 larger than available brokers: 1
at kafka.admin.AdminUtils$.assignReplicasToBrokers(AdminUtils.scala:117)
at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:403)
at kafka.server.KafkaApis.kafka$server$KafkaApis$$createTopic(KafkaApis.scala:629)
at kafka.server.KafkaApis.kafka$server$KafkaApis$$createGroupMetadataTopic(KafkaApis.scala:651)
at kafka.server.KafkaApis$$anonfun$getOrCreateGroupMetadataTopic$1.apply(KafkaApis.scala:657)
at kafka.server.KafkaApis$$anonfun$getOrCreateGroupMetadataTopic$1.apply(KafkaApis.scala:657)
at scala.Option.getOrElse(Option.scala:121)
at kafka.server.KafkaApis.getOrCreateGroupMetadataTopic(KafkaApis.scala:657)
at kafka.server.KafkaApis.handleGroupCoordinatorRequest(KafkaApis.scala:818)
at kafka.server.KafkaApis.handle(KafkaApis.scala:86)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
at java.lang.Thread.run(Thread.java:745)
How can I fix this?

I fixed the issue by splitting the JAAS configuration into separate client and server files.
kafka_server_jaas.conf
KafkaServer {
org.apache.kafka.common.security.plain.PlainLoginModule required
username="admin"
password="admin-secret"
user_admin="admin-secret"
user_alice="alice-secret";
};
kafka_client_jaas.conf
KafkaClient {
org.apache.kafka.common.security.plain.PlainLoginModule required
username="alice"
password="alice-secret";
};
In one terminal, export the server JAAS file and start the Kafka broker:
$ export KAFKA_OPTS="-Djava.security.auth.login.config=/home/user/kafka_2.10-0.10.0.1/kafka_server_jaas.conf"
$ ./bin/kafka-server-start.sh config/server.properties
In a client terminal, export the client JAAS file and start the consumer:
$ export KAFKA_OPTS="-Djava.security.auth.login.config=/home/user/kafka_2.10-0.10.0.1/kafka_client_jaas.conf"
$ ./bin/kafka-console-consumer.sh --new-consumer --zookeeper localhost:2181 --topic test-topic --from-beginning --consumer.config=config/consumer.properties --bootstrap-server=localhost:9092
If you also want to produce, do this in another terminal window:
$ export KAFKA_OPTS="-Djava.security.auth.login.config=/home/user/kafka_2.10-0.10.0.1/kafka_client_jaas.conf"
$ ./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test-topic --producer.config=config/producer.properties
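As a minimal sketch of the same idea, a small shell helper can pick the JAAS file per process, so the broker and the clients never share one KAFKA_OPTS value. The function name jaas_opts and the JAAS_DIR variable are hypothetical; the file names follow the ones used above.

```shell
# Directory holding kafka_server_jaas.conf and kafka_client_jaas.conf.
# JAAS_DIR is an assumption; point it at your own installation.
JAAS_DIR="${JAAS_DIR:-/home/user/kafka_2.10-0.10.0.1}"

# Print the JVM option that selects the JAAS file for a role.
# Accepts "server" (broker) or "client" (console producer/consumer).
jaas_opts() {
  case "$1" in
    server|client)
      printf '%s\n' "-Djava.security.auth.login.config=${JAAS_DIR}/kafka_$1_jaas.conf"
      ;;
    *)
      echo "usage: jaas_opts server|client" >&2
      return 1
      ;;
  esac
}

# Broker terminal:
#   KAFKA_OPTS="$(jaas_opts server)" ./bin/kafka-server-start.sh config/server.properties
# Client terminal:
#   KAFKA_OPTS="$(jaas_opts client)" ./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test-topic --producer.config=config/producer.properties
```

Setting KAFKA_OPTS only for the one command, as in the usage comments, avoids accidentally reusing the broker's value in a client shell.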

I faced a similar issue when using ACLs in Kafka 0.10. I found this discussion helpful, especially the advice to enable the authorization log so you can compare the principal that arrives with each request against the principal named in your ACLs.
First, check that the server principal, admin, has all the authorization it needs. The server principal must be allowed to perform all operations on all topics and groups as well as on the cluster. It is better to declare admin as a super user (super.users) in the server.properties file. If this doesn't resolve the issue, enable the authorization log to find out which principal is being denied which operation.
The authorization log can be enabled by modifying log4j.properties in the config folder. In that file, change WARN to DEBUG on the following line and restart the Kafka brokers:
log4j.logger.kafka.authorizer.logger=DEBUG, authorizerAppender
This helped me in sorting out my issue. Hope that helps.
PS: The authorization logs generated will be very lengthy and consume a lot of space. So, remember to turn this off when done with debugging.
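Once DEBUG is on, the log grows quickly, so a filter helps. The sketch below counts the distinct denied requests. It assumes the audit lines contain the words "is Denied" followed by the operation and resource, which is how SimpleAclAuthorizer phrases them as far as I know; the function name denied_requests and the log path in the usage comment are assumptions.

```shell
# Summarize denied requests recorded in the authorizer log.
# Assumption: DEBUG audit lines look like
#   "... Principal = User:x is Denied Operation = ... on resource = ..."
# Adjust the pattern if your authorizer words them differently.
# Reads the files given as arguments, or standard input when none are given.
denied_requests() {
  grep 'is Denied' "$@" \
    | sed -E 's/^.*(Principal = .*)$/\1/' \
    | sort | uniq -c | sort -rn
}

# Example (path assumed from the stock log4j.properties):
#   denied_requests logs/kafka-authorizer.log
```

Each output line starts with a count, so the most frequently denied principal and operation pair appears first.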

It seems you have created a topic with a replication factor of 3, but you only have 1 broker running. Try creating the topic with "--replication-factor 1". If you are creating topics automatically, you might also want to change the default replication factor to 1 (default.replication.factor in config/server.properties).
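For a single-broker development box, creating the topic explicitly could look like the sketch below. KAFKA_HOME, ZK_CONNECT and the function name create_single_broker_topic are assumptions for illustration; the kafka-topics.sh flags are the ones the 0.10 tooling accepts.

```shell
# Installation directory and ZooKeeper address are assumptions; adjust them.
KAFKA_HOME="${KAFKA_HOME:-/home/user/kafka_2.10-0.10.0.1}"
ZK_CONNECT="${ZK_CONNECT:-localhost:2181}"

# Create a topic whose replication factor matches the single running broker.
create_single_broker_topic() {
  "$KAFKA_HOME/bin/kafka-topics.sh" --create \
    --zookeeper "$ZK_CONNECT" \
    --replication-factor 1 \
    --partitions 1 \
    --topic "$1"
}

# Example:
#   create_single_broker_topic test-topic
```

One more thing to check: the stack trace in the question goes through createGroupMetadataTopic, so the topic being created there is the internal offsets topic rather than test-topic. Its replication factor comes from offsets.topic.replication.factor, so on a one-broker setup it may be worth setting that to 1 in server.properties as well.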

Related

How to fix kafka SCRAM authentication failure

Confluent Platform version: 5.4.1
I followed the documentation and a previous question to set up SCRAM authentication:
https://docs.confluent.io/current/kafka/authentication_sasl/authentication_sasl_scram.html#
kafka SASL/SCRAM Failed authentication
After I modified my configuration, SASL authentication against the ZooKeeper server succeeds, but the Kafka server still fails to authenticate. The log messages and the relevant configuration are shown below; please advise.
zookeeper server output:
[2020-07-18 23:53:42,917] INFO Successfully authenticated client: authenticationID=adminuser; authorizationID=adminuser. (org.apache.zookeeper.server.auth.SaslServerCallbackHandler)
[2020-07-18 23:53:43,143] INFO Setting authorizedID: adminuser (org.apache.zookeeper.server.auth.SaslServerCallbackHandler)
[2020-07-18 23:53:43,143] INFO adding SASL authorization for authorizationID: adminuser (org.apache.zookeeper.server.ZooKeeperServer)
[2020-07-18 23:53:51,162] INFO Successfully authenticated client: authenticationID=adminuser; authorizationID=adminuser. (org.apache.zookeeper.server.auth.SaslServerCallbackHandler)
[2020-07-18 23:53:51,162] INFO Setting authorizedID: adminuser (org.apache.zookeeper.server.auth.SaslServerCallbackHandler)
[2020-07-18 23:53:51,162] INFO adding SASL authorization for authorizationID: adminuser (org.apache.zookeeper.server.ZooKeeperServer)
kafka server error message:
org.apache.kafka.common.errors.DisconnectException: Cancelled fetchMetadata request with correlation id 11 due to node -1 being disconnected
[2020-07-19 00:23:59,921] INFO [SocketServer brokerId=0] Failed authentication with /192.168.20.10 (Unexpected Kafka request of type METADATA during SASL handshake.) (org.apache.kafka.common.network.Selector)
[2020-07-19 00:24:00,095] WARN [Producer clientId=confluent-metrics-reporter] Bootstrap broker 192.168.20.10:9092 (id: -1 rack: null) disconnected (org.apache.kafka.clients.NetworkClient)
[2020-07-19 00:24:00,403] INFO [SocketServer brokerId=0] Failed authentication with /192.168.20.10 (Unexpected Kafka request of type METADATA during SASL handshake.) (org.apache.kafka.common.network.Selector)
[2020-07-19 00:24:00,597] INFO [SocketServer brokerId=0] Failed authentication with /192.168.20.10 (Unexpected Kafka request of type METADATA during SASL handshake.) (org.apache.kafka.common.network.Selector)
[2020-07-19 00:24:00,805] INFO [SocketServer brokerId=0] Failed authentication with /192.168.20.10 (Unexpected Kafka request of type METADATA during SASL handshake.) (org.apache.kafka.common.network.Selector)
zookeeper_server_jaas.conf:
Server {
org.apache.zookeeper.server.auth.DigestLoginModule required
user_adminuser="adminuserpwd";
};
zookeeper.properties:
server.001=192.168.20.10:2888:3888
authProvider.001=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
requireClientAuthScheme=sasl
zookeeper-server-start:
...
export ZK_AUTH_ARGS=$base_dir/../data/zookeeper_server_jaas.conf
exec $base_dir/kafka-run-class $EXTRA_ARGS -Djava.security.auth.login.config=$ZK_AUTH_ARGS org.apache.zookeeper.server.quorum.QuorumPeerMain "$@"
Added user:
bin/kafka-configs --zookeeper 192.168.20.10:2181 --alter --add-config 'SCRAM-SHA-256=[password=adminuserpwd],SCRAM-SHA-512=[password=adminuserpwd]' --entity-type users --entity-name adminuser
bin/kafka-configs --zookeeper 192.168.20.10:2181 --describe --entity-type users --entity-name adminuser
Configs for user-principal 'adminuser' are SCRAM-SHA-512=salt=MTdxamZocWJlY2F2dDFhZGc0dmluZm5hcmo=,stored_key=o21ptVzTVZoR/hafmOgTSYmr2F1TORPo6xDaZGAph+6OncE1pw/AyLRwduCx0Qx97bKoPWmlYShfXtbug6u8kg==,server_key=1B/1/CzPTpMBO9MpfKZb504JFLZUia0D6LatAllSYkrTa8XWbaISDGQ29Yf4UU+jQmo+iQgK0jX+KaV+fUV6XA==,iterations=4096,SCRAM-SHA-256=salt=MWlrZGs5dHd4dDhiZmdqZGxnN2cwOGpuaGs=,stored_key=vSJ83eDvilj4JyQyehPaGmG3EZISRRfo3j8iY8uiWLU=,server_key=Bu/KfHnv6bSay/n4dO/h55O9WLLaAjiLtJQzfpr4cs0=,iterations=4096
kafka_server_jaas.conf:
KafkaServer {
org.apache.kafka.common.security.scram.ScramLoginModule required
username="adminuser"
password="adminuserpwd";
};
Client {
org.apache.zookeeper.server.auth.DigestLoginModule required
username="adminuser"
password="adminuserpwd";
};
kafka server.properties:
...
listeners=SASL_PLAINTEXT://192.168.20.10:9092
security.inter.broker.protocol=SASL_PLAINTEXT
sasl.mechanism.inter.broker.protocol=SCRAM-SHA-256
sasl.enabled.mechanisms=SCRAM-SHA-256
advertised.listeners=SASL_PLAINTEXT://192.168.20.10:9092
zookeeper.connect=192.168.20.10:2181
authorizer.class.name=io.confluent.kafka.security.authorizer.ConfluentServerAuthorizer
super.users=User:adminuser
allow.everyone.if.no.acl.found=false
...
kafka-server-start:
...
KAFKA_AUTH_ARGS=$base_dir/../data/kafka_server_jaas.conf
exec $base_dir/kafka-run-class $EXTRA_ARGS -Djava.security.auth.login.config=$KAFKA_AUTH_ARGS io.confluent.support.metrics.SupportedKafka "$@"

Bootstrap broker disconnected => KeeperErrorCode = NoNode for /brokers/topics/xxxx/partitions/2/state + Key not found

I am trying to run kafka-console-consumer.sh for a topic XXX:
sh kafka-console-consumer.sh --bootstrap-server abcd:9092,bcde:9092,cdef:9092 --topic XXX
The following errors appear:
WARN clients.NetworkClient: Bootstrap broker abcd:9092 disconnected
WARN clients.NetworkClient: Bootstrap broker bcde:9092 disconnected
WARN clients.NetworkClient: Bootstrap broker cdef:9092 disconnected
When I check the broker log, there are no errors, only the warnings below:
WARN nl.techop.kafka.dao.zookeeper.KafkaZkClient: KeeperErrorCode = NoNode for /brokers/topics/XXX/partitions/2/state
WARN com.yammer.metrics.reporting.MetricsServlet: Error evaluating gauge
java.util.NoSuchElementException: key not found: [XXX,0]
What I have tried:
I created a testing123 topic successfully with the command below:
sh kafka-topics.sh --create --zookeeper defg:2181,dfde:2181,cdef:2181 --replication-factor 3 --partitions 3 --topic testing123
However, when I try kafka-console-producer.sh, the same error happens:
sh kafka-console-producer.sh --broker-list abcd:9092,bcde:9092,cdef:9092
WARN clients.NetworkClient: Bootstrap broker abcd:9092 disconnected
WARN clients.NetworkClient: Bootstrap broker bcde:9092 disconnected
WARN clients.NetworkClient: Bootstrap broker cdef:9092 disconnected
Thanks all.
The cluster is Kerberos-enabled.
I resolved the issue by adding the following to my consumer.config properties file:
group.id = flume
security.protocol=SASL_PLAINTEXT
sasl.kerberos.service.name=kafka
sasl.mechanism=GSSAPI
ssl.client.auth=none
and then running the following:
sh kafka-console-consumer.sh --bootstrap-server abcd:9092,bcde:9092,cdef:9092 --topic XXX --consumer.config /xxx/xxx/xxx/consumer.config.properties
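Two details are easy to get wrong here: a properties file takes one key=value pair per line, and the broker list has to reach the tool as a single comma-separated word, because a space after a comma makes the shell split it into separate arguments. A minimal sketch that writes the settings from this answer to a file follows; the function name write_consumer_config is hypothetical.

```shell
# Write the Kerberos (GSSAPI) consumer settings from this answer,
# one property per line, to the path given as the first argument.
write_consumer_config() {
  cat > "$1" <<'EOF'
group.id=flume
security.protocol=SASL_PLAINTEXT
sasl.kerberos.service.name=kafka
sasl.mechanism=GSSAPI
ssl.client.auth=none
EOF
}

# Example with the placeholder hosts from the question
# (note: no spaces inside the broker list):
#   write_consumer_config /xxx/xxx/xxx/consumer.config.properties
#   sh kafka-console-consumer.sh \
#     --bootstrap-server abcd:9092,bcde:9092,cdef:9092 \
#     --topic XXX --consumer.config /xxx/xxx/xxx/consumer.config.properties
```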

Not able to access kafka(confluent) installed on Azure VM using public IP

I have installed confluent-oss-5.0.0 on an Azure VM and opened all the necessary ports so that it can be reached via the public IP address.
I tried the following changes to etc/kafka/server.properties, but none of them worked:
Approach - 1
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://<publicIP>:9092
--------------------------------------
Approach - 2
advertised.listeners=PLAINTEXT://<publicIP>:9092
--------------------------------------
Approach - 3
listeners=PLAINTEXT://<publicIP>:9092
I get the errors below:
pj@pj-HP-EliteBook-840-G1:~/confluent-kafka/confluent-oss-5.0.0/bin$ kafka-console-producer --broker-list <publicIp>:9092 --topic pj_test123
>dfsds
[2019-03-25 19:13:38,784] WARN [Producer clientId=console-producer] Connection to node -1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
pj@pj-HP-EliteBook-840-G1:~/confluent-kafka/confluent-oss-5.0.0/bin$ kafka-console-producer --broker-list <publicIp>:9092 --topic pj_test123
>message1
>message2
>[2019-03-25 19:20:13,216] ERROR Error when sending message to topic pj_test123 with key: null, value: 3 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Expiring 2 record(s) for pj_test123-0: 1503 ms has passed since batch creation plus linger time
[2019-03-25 19:20:13,218] ERROR Error when sending message to topic pj_test123 with key: null, value: 3 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
pj@pj-HP-EliteBook-840-G1:~/confluent-kafka/confluent-oss-5.0.0/bin$ kafka-console-consumer --bootstrap-server <publicIp>:9092 --topic pj_test123 --from-beginning
[2019-03-25 19:29:27,742] WARN [Consumer clientId=consumer-1, groupId=console-consumer-42352] Error while fetching metadata with correlation id 2 : {pj_test123=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
pj@pj-HP-EliteBook-840-G1:~/confluent-kafka/confluent-oss-5.0.0/bin$ kafka-console-consumer --bootstrap-server <publicIp>:9092 --topic pj_test123 --from-beginning
[2019-03-25 19:27:06,589] WARN [Consumer clientId=consumer-1, groupId=console-consumer-33252] Connection to node 0 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
All other services, such as ZooKeeper, Kafka Connect and the REST API, work fine via <PublicIP>:<port>.
kafka-topics --zookeeper 13.71.115.20:2181 --list --- This is working
Ref:
Not able to access messages from confluent kafka on EC2
https://kafka.apache.org/documentation/#brokerconfigs
Why I cannot connect to Kafka from outside?
Solutions
Thanks, @Robin Moffatt, it works for me. I made the changes below, along with allowing all Kafka-related ports in Azure networking:
kafka@kafka:~/confluent-oss-5.0.0$ sudo vi etc/kafka/server.properties
listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:19092
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
advertised.listeners=INTERNAL://<privateIp>:9092,EXTERNAL://<publicIp>:19092
inter.broker.listener.name=INTERNAL
You need to configure both internal and external listeners for your broker. This article details how: https://rmoff.net/2018/08/02/kafka-listeners-explained/.
You will also have to give public access to port 9092 (your broker). To do that:
Go to your virtual machine in the Azure portal
Select Networking under Settings in the left menu
Add an inbound port rule
Make port 9092 accessible from anywhere
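To make the internal/external split concrete, here is a small sketch that renders the four listener properties for a given pair of addresses. The function name render_listeners and the sample IPs in the usage comment are hypothetical; the listener names and ports are the ones from the working configuration above.

```shell
# Print the listener settings for server.properties.
#   $1 = private IP that other brokers and in-network clients use
#   $2 = public IP that outside clients use
render_listeners() {
  cat <<EOF
listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:19092
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
advertised.listeners=INTERNAL://${1}:9092,EXTERNAL://${2}:19092
inter.broker.listener.name=INTERNAL
EOF
}

# Example (documentation-range addresses, not real hosts):
#   render_listeners 10.0.0.4 203.0.113.10
```

With this layout, outside clients would bootstrap against <publicIp>:19092, so that is the port that needs an inbound rule; 9092 only needs to be reachable from inside the network.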

LoggingMessageFormatter with kafka-avro-console-consumer

I am trying to print avro messages on a kafka topic using kafka-avro-console-consumer in a log4j format.
For that I use the following kafka-avro-console-consumer command:
bin/kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic avro-test -property print.key=true --formatter kafka.tools.LoggingMessageFormatter
I have exported KAFKA_OPTS via the following command:
export $KAFKA_OPTS= -Dlog4j.configuration=file:/path/to/file/kafka-console-consumer-log4j.properties
Now, if I run the regular kafka-console-consumer using the following command:
bin/kafka-console-consumer --bootstrap-server localhost:9092 --topic avro-test -property print.key=true --formatter kafka.tools.LoggingMessageFormatter
I get log4j-formatted output:
[2018-07-17 19:09:40,514] INFO [Consumer clientId=consumer-1, groupId=console-consumer-10597] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2018-07-17 19:09:40,522] INFO [Consumer clientId=consumer-1, groupId=console-consumer-10597] Successfully joined group with generation 1 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2018-07-17 19:09:40,523] INFO [Consumer clientId=consumer-1, groupId=console-consumer-10597] Setting newly assigned partitions [avro-test-0] (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-07-17 19:09:40,531] INFO [Consumer clientId=consumer-1, groupId=console-consumer-10597] Resetting offset for partition avro-test-0 to offset 23. (org.apache.kafka.clients.consumer.internals.Fetcher)
However, this formatting option does not kick in if I use the Avro consumer with the following command:
bin/kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic avro-test -property print.key=true --formatter kafka.tools.LoggingMessageFormatter
It just resorts to a default formatter.
Is there something I may be missing here?
I think that if you override --formatter, you won't get Avro messages anymore, as kafka.tools.LoggingMessageFormatter doesn't know how to deserialize Avro.
Ref - source code
DEFAULT_AVRO_FORMATTER="--formatter io.confluent.kafka.formatter.AvroMessageFormatter"
...
for OPTION in "$@"
do
  case $OPTION in
    --formatter)
      DEFAULT_AVRO_FORMATTER=""
...
exec $(dirname $0)/schema-registry-run-class kafka.tools.ConsoleConsumer $DEFAULT_AVRO_FORMATTER ...
So it should run kafka.tools.ConsoleConsumer --formatter kafka.tools.LoggingMessageFormatter, as expected, because the default is unset, and schema-registry-run-class defines KAFKA_OPTS. However, the export line must not contain spaces or a dollar sign:
export KAFKA_OPTS='-Dlog4j.configuration=file:/path/to/file/kafka-console-consumer-log4j.properties'
bin/kafka-avro-console-consumer ...
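To see why passing --formatter changes the behaviour, here is a simplified, standalone re-creation of the wrapper logic quoted above. It is not the actual Confluent script; pick_formatter is a hypothetical name used only for illustration.

```shell
# Print the formatter arguments the wrapper would add on its own.
# If the caller already passed --formatter, the Avro default is dropped
# and nothing is printed.
pick_formatter() (
  formatter='--formatter io.confluent.kafka.formatter.AvroMessageFormatter'
  for option in "$@"; do
    case "$option" in
      --formatter) formatter='' ;;
    esac
  done
  printf '%s\n' "$formatter"
)

# Examples:
#   pick_formatter --topic avro-test
#     prints the Avro formatter default
#   pick_formatter --topic avro-test --formatter kafka.tools.LoggingMessageFormatter
#     prints an empty line, so only the caller's formatter is used
```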

Correlation Id errors for Kafka console producer and consumer

I have 2 Kafka brokers backed by 3 ZK nodes. I want to test the brokers by running kafka-console-producer and kafka-console-consumer locally on each node.
So I SSH into one of my Kafka brokers using 2 different terminals. In terminal #1 I run the consumer like so:
/opt/kafka/bin/kafka-console-consumer.sh --zookeeper a.b.c.d:2181 --topic test1
Where a.b.c.d is the private IP of one of my 3 ZK nodes.
Then in terminal #2 I run the producer like so:
/opt/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test1
I am able to start both the consumer and producer just fine without any issues.
However, in the producer terminal, if I "fire" a message at the test1 topic by entering some text (such as "hello") and hitting the ENTER key, I immediately begin seeing this:
[2017-01-17 19:45:57,353] WARN Error while fetching metadata with correlation id 0 : {test1=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2017-01-17 19:45:57,372] WARN Error while fetching metadata with correlation id 1 : {test1=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2017-01-17 19:45:57,477] WARN Error while fetching metadata with correlation id 2 : {test1=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2017-01-17 19:45:57,582] WARN Error while fetching metadata with correlation id 3 : {test1=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
...and it keeps going!
And, in the consumer terminal, even though I don't get any errors when I start the consumer, after about 30 seconds I get the following warning message:
[2017-01-17 19:46:07,292] WARN Fetching topic metadata with correlation id 1 for topics [Set(test1)] from broker [BrokerEndPoint(1,ip-x-y-z-w.ec2.internal,9092)] failed (kafka.client.ClientUtils$)
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:110)
at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:80)
at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:79)
at kafka.producer.SyncProducer.send(SyncProducer.scala:124)
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:59)
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:94)
at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
Interestingly, ip-x-y-z-w.ec2.internal is the private DNS for the other Kafka broker, so perhaps this is some kind of failure during interbroker communication?
Any ideas as to what is going on here and what I can do to troubleshoot?
Update
Here's my entire server.properties file for both Kafka nodes:
listeners=PLAINTEXT://0.0.0.0:9092
advertised.host.name=<private-aws-ec2-ip-addr>.ec2.internal
advertised.listeners=PLAINTEXT://0.0.0.0:9092
broker.id=1
port=9092
num.partitions=4
zookeeper.connect=zkA:2181,zkB:2181,zkC:2181
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
log.dirs=/tmp/kafka-logs
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connection.timeout.ms=6000
offset.metadata.max.bytes=4096
Please let me know if anything looks like a config smell.