Kafka: Connection issues - "All servers failed to process request" - apache-kafka

Setup
We have a 3-node Kafka cluster processing messages that come in through nginx. Nginx hands each request to a PHP script, which in turn forks a Python process that calls KafkaClient, SimpleProducer and send_messages.
ZooKeeper runs on the same hosts as Kafka; nginx is on a separate host. Ports 2181, 2182, 3888 and 9092 are all open. ZooKeeper and Kafka start without errors. The whole setup is on AWS in the same VPC.
Kafka and ZooKeeper run as the kafka user, nginx runs as nginx, and php-fpm runs as apache.
Versions
Kafka: 0.8.2
Python: 2.7.5
Relevant snippets from property files.
zookeeper.properties
dataDir=/data/zookeeper
clientPort=2181
maxClientCnxns=100
tickTime=2000
initLimit=5
syncLimit=2
server.1=172.31.41.78:2888:3888
server.2=172.31.45.245:2888:3888
server.3=172.31.23.101:2888:3888
producer.properties
metadata.broker.list=172.31.41.78:9092,172.31.45.245:9092,172.31.23.101:9092
consumer.properties
zookeeper.connect=172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181
zookeeper.connection.timeout.ms=6000
server.properties (setup with appropriate IP on other machines)
port=9092
advertised.host.name=172.31.41.78
host.name=172.31.41.78
zookeeper.connect=172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181
zookeeper.connection.timeout.ms=60000
php code
function sendDataToKafka($_data,$_srcType) {
    try {
        $pyKafka = "/usr/bin/python /etc/nginx/html/postMon.py ".$_srcType;
        // stdin and stdout are pipes; stderr is discarded
        $dspec = array(
            0 => array("pipe","r"),
            1 => array("pipe","w"),
            2 => array("file","/dev/null", "a")
        );
        $process = proc_open($pyKafka,$dspec,$pipes);
        if (is_resource($process)) {
            if (fwrite($pipes[0],$_data) == true) {
                fclose($pipes[0]);
                echo stream_get_contents($pipes[1]);
                fclose($pipes[1]);
                proc_close($process);
                echo "Process completed";
            }
        }
    } catch (Exception $e) {
        echo $e->getMessage();
    }
}
python code
import sys,json,time,ConfigParser
import traceback
sys.path.append("/etc/nginx/html/kafka_python-0.9.4-py2.7.egg")
from kafka import KafkaClient,SimpleProducer
try:
    srcMap = {
        'Alert' : 'alerts'
    }
    topic = srcMap.get(sys.argv[1],'events')
    data = ''
    data = 'Testing static Kafka message'
    print 'Host: 172.31.23.101:9092'
    kafka = KafkaClient("172.31.23.101:9092")
    producer = SimpleProducer(kafka,random_start=True)
    producer.send_messages(topic,data)
except Exception as e: # most generic exception you can catch
    print str(e)
Scenarios
Scenario 1:
Running the console producer in one shell:
bin/kafka-console-producer.sh --zookeeper 172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181 --topic alerts
and the console consumer in another:
./kafka-console-consumer.sh --zookeeper 172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181 --topic alerts
we are able to view messages.
Scenario 2:
Running the python code command line (from the nginx host), able to view messages from the consumer
Scenario 3:
Running the php code command line (from nginx host), able to view messages from the consumer
Scenario 4:
Running from a REST client (as POSTMAN)/ CURL using the REST URL, get the following message:
<html>
<body>
Host: 172.31.23.101:9092
All servers failed to process request
<pre>Process completed</pre></body>
<html>
This shows that traffic reaches nginx and that nginx executes the PHP and Python scripts, but the script errors out at the first call to Kafka (KafkaClient). Somehow the Python process is unable to access Kafka.
We don't know whether this is a user permission problem or a silly config mistake.
Also......
We have a similar working setup in another vpc
The security groups, config files, codebase properties etc. are consistent
Upgrade options are not a possibility in near term
Any pointers, help, or a fresh pair of eyes would really help us get going.
Thanks !

Finally figured out that the apache user did not have the "right" permissions.
selinuxconlist apache helped fix the issue.
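One way to tell a permission problem from a network problem is to look at the errno of a plain TCP connect. The following is a minimal sketch, not the poster's code: `probe` is a hypothetical helper, and it assumes Python 3 is available on the nginx host. A policy denial (for example from SELinux) typically surfaces as EACCES, while a closed port gives ECONNREFUSED and a blocked route usually times out.

```python
import errno
import socket


def probe(host, port, timeout=3.0):
    """Try a TCP connect and return (ok, detail).

    `probe` is a hypothetical helper name, not part of any Kafka library.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.timeout:
        # No answer at all: often a security group or routing issue
        return False, "timed out"
    except OSError as exc:
        # Map the numeric errno to its symbolic name (EACCES, ECONNREFUSED, ...)
        name = errno.errorcode.get(exc.errno, "unknown")
        return False, "%s (%s)" % (name, exc)
    sock.close()
    return True, "connected"
```

To be meaningful, the check has to run in the same context as the failing script, i.e. invoked through the same php-fpm path as postMon.py, for example `probe("172.31.23.101", 9092)`. Running it from an interactive shell may succeed even when the web server's process is denied.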

Related

Produce/Consume to Remote Kafka Does not Work

I have set up an AWS EC2 instance running Apache Kafka 0.8 via a Bitnami AMI image. The server properties are pretty much default (Kafka located at localhost:9092 and ZooKeeper located at localhost:2181).
When I SSH into the machine, I can produce/consume data using the scripts provided by Kafka, located at kafka/bin.
To produce I run the following command:
./kafka-console-producer.sh --broker-list localhost:9092 --topic test
To Consume:
./kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
This works correctly, thus I have determined that Kafka is functioning correctly. Next I attempted to produce/consume from my machine, using the python library pykafka:
import datetime
import sys
from pykafka import KafkaClient
# KAFKA_HOST: broker address string, defined elsewhere
client = KafkaClient(hosts = KAFKA_HOST)
topic = client.topics[sys.argv[1]]
try:
    with topic.get_producer(max_queued_messages=1, auto_start=True) as producer:
        while True:
            for i in range(10):
                message = "Test message sent on: " + str(datetime.datetime.now().strftime("%I:%M%p on %B %d, %Y"))
                encoded_message = message.encode("utf-8")
                mess = producer.produce(encoded_message)
except Exception as error:
    print('Something went wrong; printing exception:')
    print(error)
And I consume as follows:
client = KafkaClient(hosts = KAFKA_HOST)
topic = client.topics[sys.argv[1]]
try:
    while True:
        consumer = topic.get_simple_consumer(auto_start=True)
        for message in consumer:
            if message is not None:
                print (message.offset, message.value)
except Exception as error:
    print('Something went wrong; printing exception:')
    print(error)
These snippets run without errors or exceptions, but no messages are produced or consumed, not even the ones created via the local scripts.
I have confirmed that both ports 9092 and 2181 are open via telnet.
My questions are as follows:
Is there a way to debug such problems and find the root cause? I would expect the library to throw an exception if there are connectivity issues.
What is going on?
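A first debugging step that usually helps with silent failures like this is to turn on debug logging for the client library before creating the client. This is a minimal sketch using only the standard logging module; `enable_debug_logging` is a hypothetical helper, and the logger name "pykafka" is an assumption based on the library logging under its package name.

```python
import logging
import sys


def enable_debug_logging(logger_name="pykafka"):
    """Send DEBUG-level records of the named logger (and its children) to stderr.

    The default name "pykafka" is an assumption; pass another package name
    if the library logs under a different one.
    """
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Calling `enable_debug_logging()` at the top of the producer or consumer script should then show which broker addresses the client tries after the initial bootstrap connection, which is where advertised-address problems tend to show up.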

Kafka Remote Producer - advertised.listeners

I am running Kafka 0.10.0 on CDH 5.9; the cluster is Kerberized.
What I am trying to do is to write messages from a remote machine to my Kafka broker.
The cluster (where Kafka is installed) has internal as well as external IP addresses.
The machines' hostnames within the cluster resolve to the private IPs; the remote machine resolves the same hostnames to the public IP addresses.
I opened the necessary port 9092 (I am using SASL_PLAINTEXT protocol) from remote machine to Kafka Broker, verified that using telnet.
First Step - in addition to the standard properties for the Kafka Broker, I configured the following:
listeners=SASL_PLAINTEXT://0.0.0.0:9092
advertised.listeners=SASL_PLAINTEXT://<hostname>:9092
I am able to start the console consumer with
kafka-console-consumer --new-consumer --topic <topicname> --from-beginning --bootstrap-server <hostname>:9092 --consumer.config consumer.properties
I am able to use my custom producer from another machine within the cluster.
Relevant excerpt of producer properties:
security.protocol=SASL_PLAINTEXT
bootstrap.servers=<hostname>:9092
I am not able to use my custom producer from the remote machine:
Exception org.apache.kafka.common.errors.TimeoutException: Batch containing 1 record(s) expired due to timeout while requesting metadata from brokers for <topicname>-<partition>
using the same producer properties. I am able to telnet the Kafka Broker from the machine and /etc/hosts includes hostnames and public IPs.
Second Step - I modified server.properties:
listeners=SASL_PLAINTEXT://0.0.0.0:9092
advertised.listeners=SASL_PLAINTEXT://<kafkaBrokerInternalIP>:9092
Consumer and producer within the same cluster still run fine (bootstrap servers are now the internal IP with port 9092).
As expected, the remote producer fails (which is obvious, given that it is not aware of the internal IP addresses).
Third Step - where it gets hairy :(
listeners=SASL_PLAINTEXT://0.0.0.0:9092
advertised.listeners=SASL_PLAINTEXT://<kafkaBrokerPublicIP>:9092
starting my consumer with
kafka-console-consumer --new-consumer --topic <topicname> --from-beginning --bootstrap-server <hostname>:9092 --consumer.config consumer.properties
gives me a warning, but I don't think this is right...
WARN clients.NetworkClient: Error while fetching metadata with correlation id 1 : {<topicname>=LEADER_NOT_AVAILABLE}
starting my consumer with
kafka-console-consumer --new-consumer --topic <topicname> --from-beginning --bootstrap-server <KafkaBrokerPublicIP>:9092 --consumer.config consumer.properties
just hangs after those log messages:
INFO utils.AppInfoParser: Kafka version : 0.10.0-kafka-2.1.0
INFO utils.AppInfoParser: Kafka commitId : unknown
seems like it cannot find a coordinator as in the normal flow this would be the next log:
INFO internals.AbstractCoordinator: Discovered coordinator <hostname>:9092 (id: <someNumber> rack: null) for group console-consumer-<someNumber>.
starting the producer on a cluster node with bootstrap.servers=:9092
I observe the same as with the producer:
WARN NetworkClient:600 - Error while fetching metadata with correlation id 0 : {<topicname>=LEADER_NOT_AVAILABLE}
starting the producer on a cluster node with bootstrap.servers=:9092 I get
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
starting the producer on my remote machine with either bootstrap.servers=:9092 or bootstrap.servers=:9092 I get
NetworkClient:600 - Error while fetching metadata with correlation id 0 : {<topicname>=LEADER_NOT_AVAILABLE}
I have been struggling for the past three days to get this to work, however I am out of ideas :/ My understanding is that advertised.hostnames serves for exactly this purpose, however either I am doing something wrong, or there is something wrong in the machine setup.
Any hints are very much appreciated!
I ran into this issue recently.
In my case, I had enabled Kafka ACLs; after disabling them by commenting out these two settings, the problem went away.
authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer
super.users=User:kafka
This gist may also help you:
https://gist.github.com/jorisdevrede/a7933a99251452bb1867
It mentions at the end:
If you only use a SASL_PLAINTEXT listener on the Kafka Broker, you
have to make sure that you have set the
security.inter.broker.protocol=SASL_PLAINTEXT too, otherwise you will
get a LEADER_NOT_AVAILABLE error in the client.
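The condition quoted above can be checked mechanically against a server.properties file. The following is a rough sketch under stated assumptions: `parse_properties` and `inter_broker_mismatch` are hypothetical helpers, the listener prefix is taken to be the security protocol (as in the configs shown in this thread), and security.inter.broker.protocol is assumed to default to PLAINTEXT when unset.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def inter_broker_mismatch(props):
    """Return True if no listener speaks the inter-broker protocol."""
    listeners = props.get("listeners", "")
    # "SASL_PLAINTEXT://0.0.0.0:9092" -> "SASL_PLAINTEXT"
    protocols = {
        item.split("://", 1)[0].strip()
        for item in listeners.split(",")
        if "://" in item
    }
    # Assumed default when the property is not set
    inter = props.get("security.inter.broker.protocol", "PLAINTEXT")
    return bool(protocols) and inter not in protocols
```

For the configuration in the question (only a SASL_PLAINTEXT listener and no security.inter.broker.protocol line), `inter_broker_mismatch` returns True, which matches the situation the gist warns about.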

Problems publishing messages with kafka running on mesos DCOS

I have a small cluster running DCOS. I'm able to successfully install kafka following this guide. running
$ dcos kafka connection
gives
{
"address": [
"10.131.17.126:9475",
"10.131.24.6:9655",
"10.131.14.192:9181"
],
"zookeeper": "master.mesos:2181/dcos-service-kafka",
"dns": [
"broker-0.kafka.mesos:9475",
"broker-1.kafka.mesos:9655",
"broker-2.kafka.mesos:9181"
]
}
I can create topics and I've examined zookeeper with the cli tool and the state appears to be good
get /dcos-service-kafka/brokers/ids/0
{"jmx_port":-1,"timestamp":"1474206074029","endpoints":["PLAINTEXT://10.131.17.126:9475"],"host":"10.131.17.126","version":3,"port":9475}
get /dcos-service-kafka/brokers/ids/1
{"jmx_port":-1,"timestamp":"1474206120002","endpoints":["PLAINTEXT://10.131.24.6:9655"],"host":"10.131.24.6","version":3,"port":9655}
get /dcos-service-kafka/brokers/ids/2
{"jmx_port":-1,"timestamp":"1474206122985","endpoints":["PLAINTEXT://10.131.14.192:9181"],"host":"10.131.14.192","version":3,"port":9181}
However when I try publishing
echo "Hello, World." | ./kafka-console-producer.sh --broker-list 10.131.17.126:9475, 10.131.24.6:9655, 10.131.14.192:9181 --topic topic1
I get
[2016-09-18 18:49:32,909] ERROR Error when sending message to topic topic1 with key: null, value: 13 bytes with error: Failed to update metadata after 60000 ms. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
I suspect it might be something to do with private vs. public ip addresses and perhaps host.name in server.properties.
Can anyone give some suggestions as to how I might debug (and hopefully fix!) the problem so I can successfully publish messages?
Thanks
AJ
Update - this does appear to have been caused by missing entries in /etc/hosts. I've updated my terraform script to write these during setup and your example above now works as expected.
Thanks for your help
Edit: For anyone looking in the future. This was a problem in /etc/hosts caused by a terraform script.
Your suspicion is correct. Those are private IP addresses which are not addressable from outside the cluster. In order to communicate with Kafka you will either have to setup a VPN such that those IP addresses become reachable, or run your publishing command on a machine in the cluster.
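A quick way to confirm which advertised broker addresses are private is Python's standard ipaddress module. This is a small sketch; `private_brokers` is a hypothetical helper, and it only handles IPv4 literals in host:port form (DNS names are skipped because they would need to be resolved first).

```python
import ipaddress


def private_brokers(addresses):
    """Return the host:port entries whose host is a private IP address."""
    result = []
    for addr in addresses:
        host = addr.rsplit(":", 1)[0]
        try:
            ip = ipaddress.ip_address(host)
        except ValueError:
            # Not an IP literal (e.g. a DNS name); skip it
            continue
        if ip.is_private:
            result.append(addr)
    return result
```

Feeding it the "address" list from `dcos kafka connection` in the question returns all three entries, since every broker is in the 10.0.0.0/8 range.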
Also, it looks like you're running a DC/OS version earlier than 1.8. If you use 1.8, you'll get an easier broker endpoint to use, regardless of the dynamically assigned IP addresses. You can use the named VIP broker.kafka.l4lb.thisdcos.directory:9092; however, this is only addressable from machines in the cluster.
Setting up haproxy or nginx to point to the named VIP is also a way to get easy external access to a service (in this case Kafka) running on a DC/OS cluster. You would want to ensure that these proxies run on a public Agent. See here for more details.
Here is an example of installing, producing and consuming from the default Kafka installation:
~ $ dcos package install kafka
Installing Marathon app for package [kafka] version [1.1.11-0.10.0.0]
Installing CLI subcommand for package [kafka] version [1.1.11-0.10.0.0]
New command available: dcos kafka
DC/OS Kafka Service is being installed.
Documentation: https://docs.mesosphere.com/usage/services/kafka/
Issues: https://docs.mesosphere.com/support/
~ $ dcos kafka connection
{
"address": [
"10.0.3.64:9951",
"10.0.3.68:9795",
"10.0.3.66:9531"
],
"zookeeper": "master.mesos:2181/dcos-service-kafka",
"dns": [
"broker-0.kafka.mesos:9951",
"broker-1.kafka.mesos:9795",
"broker-2.kafka.mesos:9531"
],
"vip": "broker.kafka.l4lb.thisdcos.directory:9092"
}
~ $ dcos kafka topic create topic0
{
"message": "Output: Created topic \"topic0\".\n"
}
~ $ dcos node ssh --master-proxy --leader
core@ip-10-0-7-56 ~ $ wget http://download.nextag.com/apache/kafka/0.10.0.1/kafka_2.11-0.10.0.1.tgz
core@ip-10-0-7-56 ~ $ tar xf kafka_2.11-0.10.0.1.tgz
core@ip-10-0-7-56 ~ $ cd kafka_2.11-0.10.0.1
core@ip-10-0-7-56 ~/kafka_2.11-0.10.0.1 $ bin/kafka-console-producer.sh --broker-list broker.kafka.l4lb.thisdcos.directory:9092 --topic topic0
This is a message
This is another message
^Ccore@ip-10-0-7-56 ~/kafka_2.11-0.10.0.1 $ bin/kafka-console-consumer.sh --zookeeper master.mesos:2181/dcos-service-kafka --topic topic0 --from-beginning
This is a message
This is another message
^CProcessed a total of 2 messages
$ bin/kafka-console-producer.sh --broker-list 10.0.3.64:9951,10.0.3.68:9795,10.0.3.66:9531 --topic topic0
foo
bar
baz
^Ccore@ip-10-0-7-56 ~/kafka_2.11-0.10.0.1 $ bin/kafka-console-consumer.sh --zookeeper master.mesos:2181/dcos-service-kafka --topic topic0 --from-beginning
This is a message
This is another message
foo
bar
baz
^CProcessed a total of 5 messages

kafka-python: producer is not able to connect

kafka-python (1.0.0) throws error while connecting to the broker.
At the same time /usr/bin/kafka-console-producer and /usr/bin/kafka-console-consumer work fine.
Python application used to work well also, but after zookeeper restart, it no longer can connect.
I am using bare bones example from the docs:
from kafka import KafkaProducer
from kafka.common import KafkaError
producer = KafkaProducer(bootstrap_servers=['hostname:9092'])
# Asynchronous by default
future = producer.send('test-topic', b'raw_bytes')
I am getting this error:
Traceback (most recent call last):
  File "pp.py", line 4, in <module>
    producer = KafkaProducer(bootstrap_servers=['hostname:9092'])
  File "/usr/lib/python2.6/site-packages/kafka/producer/kafka.py", line 246, in __init__
    self.config['api_version'] = client.check_version()
  File "/usr/lib/python2.6/site-packages/kafka/client_async.py", line 629, in check_version
    connect(node_id)
  File "/usr/lib/python2.6/site-packages/kafka/client_async.py", line 592, in connect
    raise Errors.NodeNotReadyError(node_id)
kafka.common.NodeNotReadyError: 0
Exception AttributeError: "'KafkaProducer' object has no attribute '_closed'" in <bound method KafkaProducer.__del__ of <kafka.producer.kafka.KafkaProducer object at 0x7f6171294c50>> ignored
When stepping through ( /usr/lib/python2.6/site-packages/kafka/client_async.py) I noticed that line 270 evaluates as false:
270 if not self._metadata_refresh_in_progress and not self.cluster.ttl() == 0:
271     if self._can_send_request(node_id):
272         return True
273 return False
In my case self._metadata_refresh_in_progress is False, but the ttl() = 0;
At the same time kafka-console-* are happily pushing messages around:
/usr/bin/kafka-console-producer --broker-list hostname:9092 --topic test-topic
hello again
hello2
Any advice?
I had the same problem, and none of the solutions above worked. Then I read the exception messages, and it seems it's mandatory to specify api_version, so
producer = KafkaProducer(bootstrap_servers=['localhost:9092'],api_version=(0,1,0))
works fine (at least it completes without exceptions; now I have to convince it to accept messages ;) )
Note: the tuple should match the broker's Kafka version, e.g. (1,0,0) for Kafka 1.0.0.
I had a similar problem. In my case, broker hostname was unresolvable on the client side . Try to explicitly set advertised.host.name in the configuration file.
I had the same problem.
I solved the problem with hint of user3503929.
The kafka server was installed on windows.
server.properties
...
host.name = 0.0.0.0
...
.
producer = KafkaProducer(bootstrap_servers='192.168.1.3:9092',
value_serializer=str.encode)
producer.send('test', value='aaa')
producer.close()
print("DONE.")
There was no problem with the processing in the windows kafka client.
However, when I send a message to topic using kafka-python in ubuntu, a NoBrokersAvailable exception is raised.
Add the following settings to server.properties.
...
advertised.host.name = 192.168.1.3
...
After that, the same code runs successfully.
I spent three hours because of this.
Thanks
A host can have multiple DNS aliases. Any of them will work for an ssh or ping test. However, the Kafka connection should use the alias that matches advertised.host.name in the broker's server.properties file.
I was using a different alias in the bootstrap_servers parameter, hence the error. Once I changed the call to use the advertised.host.name value, the problem was solved.
Install kafka-python using pip install kafka-python
Steps to create a Kafka data pipeline:
1. Run ZooKeeper using the shell command below, or install zookeeperd using
sudo apt-get install zookeeperd
This runs ZooKeeper as a daemon, listening on port 2181 by default.
2. Run the Kafka server.
3. Run producer.py and consumer.py in separate consoles to see the live data.
Here are the commands to run:
cd kafka-directory
./bin/zookeeper-server-start.sh ./config/zookeeper.properties
./bin/kafka-server-start.sh ./config/server.properties
Now that you have zookeeper and kafka server running, Run the producer.py script and consumer.py
Producer.py:
from kafka import KafkaProducer
from kafka.errors import KafkaError
import time
producer = KafkaProducer(bootstrap_servers=['localhost:9092'])
topic = 'test'
lines = ["1","2","3","4","5","6","7","8"]
for line in lines:
    try:
        # Block until the broker acknowledges the message (or 10 s elapse)
        producer.send(topic, bytes(line, "UTF-8")).get(timeout=10)
    except KafkaError as e:
        print(e)
        continue
Consumer.py:-
from kafka import KafkaConsumer
topic = 'test'
consumer = KafkaConsumer(topic, bootstrap_servers=['localhost:9092'])
for message in consumer:
    # message value and key are raw bytes -- decode if necessary!
    # e.g., for unicode: `message.value.decode('utf-8')`
    # print ("%s:%d:%d: key=%s value=%s" % (message.topic, message.partition,
    #                                       message.offset, message.key,
    #                                       message.value))
    print(message)
Now run the producer.py and consumer.py in separate terminals to see the live data..!
Note: the producer.py script above runs only once. To run it forever, wrap the sending code in a while loop and use the time module to pause between iterations.
I had a similar problem and removing the port from the bootstrap_servers helped.
consumer = KafkaConsumer('my_topic',
#group_id='x',
bootstrap_servers='kafka.com')
In your server.properties file, make sure the listener IP is set to your box's IP address, one that is accessible to the remote machine. By default it is localhost.
Update this line in your server.properties:
listeners=PLAINTEXT://<Your-IP-address>:9092
Also make sure you don't have a firewall that might be blocking other IP addresses from reaching you. If you have sudo privileges, try disabling the firewall:
sudo systemctl stop firewalld

Kafka QuickStart, advertised.host.name gives kafka.common.LeaderNotAvailableException

I am able to get a simple one-node Kafka (kafka_2.11-0.8.2.1) working locally on one linux machine, but when I try to run a producer remotely I'm getting some confusing errors.
I'm following the quickstart guide at http://kafka.apache.org/documentation.html#quickstart. I stopped the Kafka processes and deleted all the ZooKeeper and Kafka files in /tmp. I am on a local 10.0.0.0/24 network NAT-ed with an external IP address, so I modified server.properties to tell ZooKeeper how to broadcast my external address, as per https://medium.com/@thedude_rog/running-kafka-in-a-hybrid-cloud-environment-17a8f3cfc284:
advertised.host.name=MY.EXTERNAL.IP
Then I'm running this:
$ bin/zookeeper-server-start.sh config/zookeeper.properties
--> ...
$ export KAFKA_HEAP_OPTS="-Xmx256M -Xms128M" # small test server!
$ bin/kafka-server-start.sh config/server.properties
--> ...
I opened up the firewall for my producer on the remote machine, and created a new topic and verified it:
$ bin/kafka-topics.sh --create --zookeeper MY.EXTERNAL.IP:2181 --replication-factor 1 --partitions 1 --topic test123
--> Created topic "test123".
$ bin/kafka-topics.sh --list --zookeeper MY.EXTERNAL.IP:2181
--> test123
However, the producer I'm running remotely gives me errors:
$ bin/kafka-console-producer.sh --broker-list MY.EXTERNAL.IP:9092 --topic test123
--> [2015-06-16 14:41:19,757] WARN Property topic is not valid (kafka.utils.VerifiableProperties)
My Test Message
--> [2015-06-16 14:42:43,347] WARN Error while fetching metadata [{TopicMetadata for topic test123 ->
No partition metadata for topic test123 due to kafka.common.LeaderNotAvailableException}] for topic [test123]: class kafka.common.LeaderNotAvailableException (kafka.producer.BrokerPartitionInfo)
--> (repeated several times)
(I disabled the whole firewall to make sure that wasn't the problem.)
The Kafka startup stdout repeats this message: [2015-06-16 20:42:42,768] INFO Closing socket connection to /MY.EXTERNAL.IP. (kafka.network.Processor)
And the controller.log gives me this, several times:
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
[2015-06-16 20:44:08,128] INFO [Controller-0-to-broker-0-send-thread], Controller 0 connected to id:0,host:MY.EXTERNAL.IP,port:9092 for sending state change requests (kafka.controller.RequestSendThread)
[2015-06-16 20:44:08,428] WARN [Controller-0-to-broker-0-send-thread], Controller 0 epoch 1 fails to send request Name:LeaderAndIsrRequest;Version:0;Controller:0;ControllerEpoch:1;CorrelationId:7;ClientId:id_0-host_null-port_9092;Leaders:id:0,host:MY.EXTERNAL.IP,port:9092;PartitionState:(test123,0) -> (LeaderAndIsrInfo:(Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:0) to broker id:0,host:MY.EXTERNAL.IP,port:9092. Reconnecting to broker. (kafka.controller.RequestSendThread)
Running this seems to indicate that there is a leader at 0:
$ ./bin/kafka-topics.sh --zookeeper MY.EXTERNAL.IP:2181 --describe --topic test123
--> Topic:test123 PartitionCount:1 ReplicationFactor:1 Configs:
Topic: test123 Partition: 0 Leader: 0 Replicas: 0 Isr: 0
I reran this test and my server.log indicates that there is a leader at 0:
...
[2015-06-16 21:58:04,498] INFO 0 successfully elected as leader (kafka.server.ZookeeperLeaderElector)
[2015-06-16 21:58:04,642] INFO Registered broker 0 at path /brokers/ids/0 with address MY.EXTERNAL.IP:9092. (kafka.utils.ZkUtils$)
[2015-06-16 21:58:04,670] INFO [Kafka Server 0], started (kafka.server.KafkaServer)
[2015-06-16 21:58:04,736] INFO New leader is 0 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
I see this error in the logs when I send a message from the producer:
[2015-06-16 22:18:24,584] ERROR [KafkaApi-0] error when handling request Name: TopicMetadataRequest; Version: 0; CorrelationId: 7; ClientId: console-producer; Topics: test123 (kafka.server.KafkaApis)
kafka.admin.AdminOperationException: replication factor: 1 larger than available brokers: 0
at kafka.admin.AdminUtils$.assignReplicasToBrokers(AdminUtils.scala:70)
I assume this means that the broker can't be found for some reason? I'm confused what this means...
For recent versions of Kafka (0.10.0 as of this writing), you don't want to use advertised.host.name at all. In fact, the documentation states that advertised.host.name is deprecated. Moreover, Kafka uses it not only as the "advertised" host name for producers and consumers, but for other brokers as well (in a multi-broker environment), which is kind of a pain if you're using a different (perhaps internal) DNS for the brokers, and you really don't want to get into the business of adding entries to the individual /etc/hosts of the brokers (ew!)
So, basically, you would want the brokers to use the internal name, but use the external FQDNs for the producers and consumers only. To do this, you will update advertised.listeners instead.
Set advertised.host.name to a host name, not an IP address. The default is to return a FQDN using getCanonicalHostName(), but this is only best effort and falls back to an IP. See the java docs for getCanonicalHostName().
The trick is to get that host name to always resolve to the correct IP. For small environments I usually setup all of the hosts with all of their internal IPs in /etc/hosts. This way all machines know how to talk to each other over the internal network, by name. In fact, configure your Kafka clients by name now too, not by IP. If managing all the /etc/hosts files is a burden then setup an internal DNS server to centralize it, but internal DNS should return internal IPs. Either of these options should be less work than having IP addresses scattered throughout various configuration files on various machines.
Once everything is communicating by name all that's left is to configure external DNS with the external IPs and everything just works. This includes configuring Kafka clients with the server names, not IPs.
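Since the approach above depends on every name resolving to the right IP from each vantage point, it can help to run a small resolution check on both a broker and a client machine and compare the results. This is a minimal sketch using only the standard library; `resolve_all` is a hypothetical helper, not part of Kafka.

```python
import socket


def resolve_all(names):
    """Map each host name to the sorted list of IPs it resolves to here.

    A name that fails to resolve maps to None.
    """
    results = {}
    for name in names:
        try:
            infos = socket.getaddrinfo(name, None)
        except socket.gaierror:
            results[name] = None
            continue
        # Each entry's sockaddr starts with the IP string; dedupe and sort
        results[name] = sorted({info[4][0] for info in infos})
    return results
```

Running it with the broker host names on an internal machine should yield internal IPs, and on an external client the external ones; a None or an unexpected address points at the /etc/hosts or DNS entry to fix.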
So to summarize, the solution to this was to add a route via NAT so that the machine can access its own external IP address.
Zookeeper uses the address it finds in advertised.host.name both to tell clients where to find the broker as well as to communicate with the broker itself. The error that gets reported doesn't make this very clear, and it's confusing because a client has no problem opening a TCP connection.
Taking a cue from the above: for my single node (while still learning), I set advertised.host.name in server.properties to 127.0.0.1, so it finally looks like this:
advertised.host.name=127.0.0.1
The producer still shows a warning on startup, but it now works and I can see messages arriving on the consumer terminal.
On your machine where Kafka is installed, check if it is up and running. The error states, 0 brokers are available that means Kafka is not up and running.
On linux machine you can use the netstat command to check if the service is running.
netstat -an|grep port_kafka_is_Listening ( default is 9092)
conf/server.properties:
host.name
DEPRECATED: only used when listeners is not set. Use listeners instead. hostname of broker. If this is set, it will only bind to this address. If this is not set, it will bind to all interfaces