Unable to connect to kafka broker of HDP Sandbox from Host machine - apache-kafka

I am unable to connect to kafka broker (http://127.0.0.1:6667) of Hortonworks Sandbox 2.6 from my host machine. Every time I connect, it is saying the site is not reachable. What I am doing wrong.
FYI, I have enabled port forwarding for 6667 for my sandbox.

have you checked this: https://de.hortonworks.com/tutorial/sandbox-port-forwarding-guide/section/1/#add-ports-to-the-docker-script
Essentially you also need to enable port forwarding for the Docker Sandbox, just Host -> VM wont suffice, you need Host -> VM -> Docker container.
Helped in my case for Sandbox HDP 2.6+

Kafka is not an HTTP server; your browser will never be able to reach that address.
You will need to download Kafka CLI tools or SSH into the machine to connect to Kafka.

Related

unable to connect to kafka broker (via zookeeper) using Conduktor client

Able to connect successfully to local kafka broker/cluster running locally (dockerized) using Conduktor, but when trying to connect to Kafka cluster running on Unix VM, getting below error.
Error:
"The broker [...] is reachable but Kafka can't connect. Ensure you have access to the advertised listeners of the the brokers and the proper authorization"
Appreciate any assistance.
running locally (dockerized)
When running in docker, you need to ensure that the ports are accessible from outside of your container. To verify this, try doing a telnet <ip> <port> and check if you are able to connect.
Since the error message says, the broker is reachable, I suppose you would be able to successfully telnet to the broker.
Next, check your broker config called advertised.listeners. Here you need to mention your IP:Port combination where IP is what you will be giving in your client program i.e. Conduktor.
An example for that would be
advertised.listeners=PLAINTEXT://1.2.3.4:9092
and then restart your broker and reconnect. If you are using ssl then you need to provide some extra configuration. See Configuring Kafka brokers for more.
Try to add in /etc/hosts (Unix-like) or C:\Windows\System32\drivers\etc\hosts (windows-like) the Kafka server in such manner kafka_server_ip kafka_server_name_in_dns (e.g. 10.10.0.1 kafka).

How to reach Cloudera Kafka Broker on private network from outside?

I have a cluster inside a VPN which contains a server with private IP. I'm trying to set up a Kafka communication between an external server to my private server. My approach is to set an IP table where a public IP is pointing my private IP. Also, I opened the port 9092 and 9093 to make it reachable from outside. Now I am available to connect successfully to my server with the public IP from the external server.
telnet <public_ip> 9092
Connected to <public_ip>
My kafka broker is under a cloudera cluster and I created it with Cloudera Manager. The configuration is the following:
kafka.properties:
listeners=PLAINTEXT://<private_ip>:9092,SSL://<private_ip>:9093
advertised.listeners=PLAINTEXT://<private_ip>:9092,SSL://<private_ip>:9093
advertised.host.name:
<public_ip>
Using this broker configuration the comunication works perfectly inside the cluster either using the public_ip or private_ip of the kafka broker host.
What I see now is that I have a working broker that can be used with a public_ip and a external server that is able to reach the public_ip and it's required ports. But when I try to connect to the broker from a external server, I have the following error:
NO BROKERS AVAILABLE
There's no more information of the error. On my external server I have the kafka python package where I configure the producer as:
"bootstrap_servers": ["<publi_ip>:9092"]
on a existing TOPIC of my kafka broker.
Especifications:
private host
cloudera: CDH 5.12.0
kafka: kafka 2.2.0-1.2.2.0
zookeeper: Zookeeper 3.4.5
external host
kafka Python package: kafka-python==1.4.2
The problem is very similar to this post. But in this case he uses a forwarded port with public ip. Is any possibility to do it with ip tables? Anyone has managed to do it on a cloudera cluster?
Thank you in advance.
The question isn't specific to Cloudera or Python. And I don't think Cloudera Manager has some setting that'll set this up for you.
advertised.listeners will have to be a publicly resolvable address that can be used to access each broker individually by clients (e.g two brokers cannot have the same listener setting and be used from a port forward from the public address to the internal address)
Your setup is very similar to Kafka running in Docker or Cloud providers such as AWS, in that you're interacting over two networks, so refer to this blog for more information
Also, unless you setup some other firewall settings to prevent random access, don't expose brokers in the plaintext protocol

Produce Messages in Kafka in HDF3.0.2 from local machine

I have Hortonworks DataFlow (HDF3.0.2) running in VmWare on my mac. Kafka broker is running on port 6667 and the IP address of sandbox is 172.17.0.2
In java program, running locally on my mac, I have bootstrap server configured as below:
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "172.17.0.2:6667");
Java program just hangs cannot produce message in kafka topic. I have tried disabling the firewall, added entry in /etc/hosts as:
172.17.0.2 sandbox-hdf.hortonworks.com
and changed the bootstrap servers config entry to use sandbox-hdf.hortonworks.com, but no luck.
telnet command to 172.17.0.2 6667 hangs as well and it gives connection timeout error.
Any help to produce message in Kafka running in HDF 3.0.2 from outside of Vmware is highly appreciated. Please let me know if I am missing anything.
Thanks for your time and help.

Consume from a Kafka Cluster through SSH Tunnel

We are trying to consume from a Kafka Cluster using the Java Client. The Cluster is a behind a Jump host and hence the only way to access is through a SSH Tunnel. But we are not able read because once the consumer fetches metadata it uses the original hosts to connect to brokers. Can this behaviour be overridden? Can we ask Kafka Client to not use the metadata?
Not as far as I know.
The trick I used when I needed to do something similar was:
setup a virtual interface for each Kafka broker
open a tunnel to each broker so that broker n is bound to virtual interface n
configure your /etc/hosts file so that the advertised hostname of broker n is resolved to the ip of the virtual interface n.
Es.
Kafka brokers:
broker1 (advertised as broker1.mykafkacluster)
broker2 (advertised as broker2.mykafkacluster)
Virtual interfaces:
veth1 (192.168.1.1)
veth2 (192.168.1.2)
Tunnels:
broker1: ssh -L 192.168.1.1:9092:broker1.mykafkacluster:9092 jumphost
broker2: ssh -L 192.168.1.2:9092:broker1.mykafkacluster:9092 jumphost
/etc/hosts:
192.168.1.1 broker1.mykafkacluster
192.168.1.2 broker2.mykafkacluster
If you configure your system like this you should be able reach all the brokers in your Kafka cluster.
Note: if you configured your Kafka brokers to advertise an ip address instead of a hostname the procedure can still work but you need to configure the virtual interfaces with the same ip address that the broker advertises.
You don't actually have to add virtual interfaces to acces the brokers via SSH tunnel if they advertise a hostname. It's enough to add a hosts entry in /etc/hosts of your client and bind the tunnel to the added name.
Assuming broker.kafkacluster is the advertised.hostname of your broker:
/etc/hosts:
127.0.2.1 broker.kafkacluster
Tunnel:
ssh -L broker.kafkacluster:9092:broker.kafkacluster:9092 <brokerhostip/name>
Try sshuttle like this:
sshuttle -r user#host broker-1-ip:port broker-2-ip:port broker-3-ip:port
Of course, the list of broker depends on advertised listeners broker setting.
Absolutely best solution for me was to use kafkatunnel (https://github.com/simple-machines/kafka-tunnel). Worked like a charm.
Changing the /etc/hosts file is NOT the right way.
Quoting Confluent blog post:
I saw a Stack Overflow answer suggesting to just update my hosts file…isn’t that easier?
This is nothing more than a hack to work around a misconfiguration instead of actually fixing it.
You need to set advertised.listeners (or KAFKA_ADVERTISED_LISTENERS if you’re using Docker images) to the external address (host/IP) so that clients can correctly connect to it. Otherwise, they’ll try to connect to the internal host address—and if that’s not reachable, then problems ensue.
Confluent blog post
Additionally you can have a look at this Pull Request on GitHub where I wrote an integration test to connect to Kafka via SSH. It should be easy to understand even if you don't know Golang.
There you have a full client and server example (see TestSSH). The test is bringing up actual Docker containers and it runs assertions against them.
TL;DR I had to configure the KAFKA_ADVERTISED_LISTENERS when connecting over SSH so that the host advertised by each broker would be one reachable from the SSH host. This is because the client connects to the SSH host first and then from there it connects to a Kafka broker. So the host in the advertised.listeners must be reachable from the SSH server.

Having Kafka connected with ip as well as service name - Openshift

In our Openshift ecosystem, we have a kafka instance sourced from wurstmeister/kafka. As of now I am able to have the kafka accessible withing the Openshift system using the below parameters,
KAFKA_LISTENERS=PLAINTEXT://:9092
KAFKA_ADVERTISED_HOST_NAME=kafka_service_name
And ofcourse, the params for port and zookeper is there.
I am able to access the kafka from the pods within the openshift system. But I am unable to access kafka service from the host machine. Eventhough I am able to access the kafka pod using its IP and able to telnet the pod using, telnet Pod_IP 9092
When I am trying to connect using the kafka producer from the host machine, I am getting the below error,
2017-08-07 07:45:13,925] WARN Error while fetching metadata with
correlation id 2 : {tls21=LEADER_NOT_AVAILABLE}
(org.apache.kafka.clients.NetworkClient)
And When I try to connect from Kafka consumer from the host machine using IP, it is blank.
Note: As of now, its a single openshift server. And the use case is for dev testing.
Maybe you want to take a look at this POC for having Kafka on OpenShift ?
https://github.com/EnMasseProject/barnabas