Kafka connect error when trying to use multiple listeners for kafka server - apache-kafka

I am deploying a single-node Confluent setup, so I am only manipulating connect-standalone.properties and server.properties.
I am trying to connect a remote producer to my local set-up, so I have the following overrides in server.properties:
listeners=PLAINTEXT://10.20.23.105:9092,EXTERNAL://10.20.23.105:29092
advertised.listeners=PLAINTEXT://10.20.23.105:9092,EXTERNAL://localhost:29092
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,EXTERNAL:PLAINTEXT
After checking using Offset Explorer, I can see that Kafka is still working and I am successfully getting the remote stream. However, Connect fails upon trying to start the service.
[2023-02-13 10:26:02,992] ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectDistributed:85)
org.apache.kafka.connect.errors.ConnectException: Failed to connect to and describe Kafka cluster. Check worker's broker connection and security properties.
at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:79)
at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:60)
at org.apache.kafka.connect.cli.ConnectDistributed.startConnect(ConnectDistributed.java:96)
at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:79)
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: listNodes
at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165)
at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:73)
What are the possible fixes for this problem?
I have checked out this question, Kafka-connect, Bootstrap broker disconnected, but since I am still using PLAINTEXT for my external listener, there shouldn't need to be any changes to the workers, right?

Kafka Connect isn't the problem. Start debugging with kafka-console-producer, for example.
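For example (assuming a topic named test already exists and reusing the broker address from your question; both are placeholders to adjust), try producing from the remote machine directly:
kafka-console-producer --bootstrap-server 10.20.23.105:9092 --topic test
Depending on your Kafka version, the flag may be --broker-list instead of --bootstrap-server. If this also times out, the problem is the broker/listener setup or the network, not Connect.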
listeners should not be hard-coded to any one IP.
Use bind addresses to allow connections from all interfaces.
listeners=PLAINTEXT://0.0.0.0:9092,EXTERNAL://0.0.0.0:29092
For advertised listeners, "external" addresses should use a LAN IP. It is not clear what your "localhost" listener is needed for here, since any connection to that IP from the same machine would route back to itself by default. More importantly, you don't need two ports open for the same protocol.
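As a rough sketch (using 10.20.23.105 from your question as the LAN address; substitute your own), the broker side could be reduced to:
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://10.20.23.105:9092
With a single PLAINTEXT listener like this, the EXTERNAL listener and the extra listener.security.protocol.map entry aren't needed at all.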
You've not shown your connect worker properties, but if it is running on an external machine, make sure there is no firewall interfering with the connection, and that you are using the correct IP/hostname and ports.

Related

unable to connect to kafka broker (via zookeeper) using Conduktor client

I am able to connect successfully to a local Kafka broker/cluster running locally (dockerized) using Conduktor, but when trying to connect to a Kafka cluster running on a Unix VM, I get the error below.
Error:
"The broker [...] is reachable but Kafka can't connect. Ensure you have access to the advertised listeners of the the brokers and the proper authorization"
Appreciate any assistance.
running locally (dockerized)
When running in docker, you need to ensure that the ports are accessible from outside of your container. To verify this, try doing a telnet <ip> <port> and check if you are able to connect.
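For example, with the placeholder address 1.2.3.4 and port 9092:
telnet 1.2.3.4 9092
or, if telnet isn't installed, nc -vz 1.2.3.4 9092 should tell you the same thing.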
Since the error message says the broker is reachable, I suppose you would be able to telnet to the broker successfully.
Next, check your broker config called advertised.listeners. Here you need to specify the IP:port combination, where the IP is the one you will use in your client program, i.e. Conduktor.
An example for that would be
advertised.listeners=PLAINTEXT://1.2.3.4:9092
and then restart your broker and reconnect. If you are using SSL, then you need to provide some extra configuration. See Configuring Kafka brokers for more.
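As a very rough sketch of the SSL side (all paths, passwords and the 1.2.3.4 address below are placeholders, not values from your setup):
listeners=SSL://0.0.0.0:9093
advertised.listeners=SSL://1.2.3.4:9093
ssl.keystore.location=/var/private/ssl/kafka.server.keystore.jks
ssl.keystore.password=changeit
ssl.key.password=changeit
ssl.truststore.location=/var/private/ssl/kafka.server.truststore.jks
ssl.truststore.password=changeit
The client (Conduktor) then needs a matching truststore configured as well.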
Try adding the Kafka server to /etc/hosts (Unix-like) or C:\Windows\System32\drivers\etc\hosts (Windows) in the form kafka_server_ip kafka_server_name_in_dns (e.g. 10.10.0.1 kafka).

Reload kafka producer's bootstrap.servers config on broker restart

We have a Kafka broker set up on an internal cloud. We find the actual URL using Zookeeper and provide it in the bootstrap.servers config.
Now the problem is that when the broker restarts, the internal cloud restarts it on a dynamically allotted machine with a new host and port. The host and port which I initially gave in the producer config are then no longer valid.
The question is: how can I reload this config without a restart?
Note: I know it is bad design to host the broker where it can restart on a different machine, but this is how it is right now.
I think you can use a domain name instead of an IP in the bootstrap.servers config.
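For example (kafka.internal.example.com is a placeholder for whatever DNS name your internal cloud keeps stable across restarts):
bootstrap.servers=kafka.internal.example.com:9092
As long as that DNS entry is updated when the broker moves, the producer can reconnect without a config change; note this only helps if the port stays the same.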

Is there a way to start a Zookeeper server using my static ip instead of localhost

I've started learning some big data tools for a new project, and right now I'm on Kafka and Zookeeper.
I have them both installed on my local machine, and I can start them up and start producing and consuming messages just fine. Now, I want to try it with two machines: one with a Kafka broker, Zookeeper and a producer, and the other with a consumer. Let's call them Machine A and Machine B.
Machine A runs the Zookeeper server, the broker and a producer. Machine B runs a consumer. From what I think I understand, I should be able to set up the consumer to listen to a topic from the producer on Machine A, using Zookeeper. Since both machines are on the same network (i.e. my local home network), I thought I could change the Kafka broker server.properties to use my static IP address for Machine A, and then have the consumer on Machine B connect to it.
My problem is that Zookeeper keeps spinning up on localhost, and connecting to 0.0.0.0/0.0.0.0:2181, so when my broker tries to connect to it using my static IP address (i.e. 192.168.x.x), it times out. I have looked all over for a solution, but I cannot find anything that tells me how to configure the Zookeeper server to start on a different IP address.
Maybe my understanding of these technologies is simply wrong, but I thought this would be a fairly simple thing to do. Does anyone know any way to resolve this? Or, if I'm doing it completely wrong, what is the correct approach?
zookeeper keeps spinning up on localhost, and connecting to 0.0.0.0/0.0.0.0:2181
Well, that is the bind address.
You need to also (preferably) have a static IP for Zookeeper, then set zookeeper.connect within the server.properties file of Kafka to point at that other machine's external address.
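For example, if the Zookeeper machine had the static address 192.168.1.50 (a placeholder), server.properties on the Kafka machine would contain:
zookeeper.connect=192.168.1.50:2181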
From the Zookeeper configuration file, you would make sure you have the myid file and have a line in the property file that looks like this (without the double brackets)
server.{{ myid }}={{ ip_address }}:2888:3888
You wouldn't find this in the Kafka documentation, but it is in the Zookeeper documentation.
However, if Kafka and Zookeeper are on the same machine, this isn't necessary.
Your external consumer should be setting the bootstrap.servers property to the Kafka IP address(es) with port 9092.
Your problem might be related instead to the advertised.listeners setting within Kafka.
For example, start with listeners=PLAINTEXT://:9092
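A sketch of the relevant server.properties lines on Machine A, assuming its static address is 192.168.1.10 (a placeholder for your actual 192.168.x.x address):
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://192.168.1.10:9092
The consumer on Machine B would then use bootstrap.servers=192.168.1.10:9092.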
As of Zookeeper 3.3.0 (see Advanced Configuration):
clientPortAddress: New in 3.3.0: the address (ipv4, ipv6 or hostname) to listen for client connections; that is, the address that clients attempt to connect to. This is optional, by default we bind in such a way that any connection to the clientPort for any address/interface/nic on the server will be accepted
So you could use:
clientPortAddress=127.0.0.1

Error running multiple kafka standalone hdfs connectors

We are trying to launch multiple standalone kafka hdfs connectors on a given node.
For each connector, we are setting the rest.port and offset.storage.file.filename to different ports and path respectively.
Also, the Kafka broker JMX port is 9999.
When I start the kafka standalone connector, I get the error
Error: Exception thrown by the agent : java.rmi.server.ExportException: Port already in use: 9999; nested exception is:
java.net.BindException: Address already in use (Bind failed)
Though the rest.port is set to 9100
kafka version: 2.12-0.10.2.1
kafka-connect-hdfs version: 3.2.1
Please help.
We are trying to launch multiple standalone kafka hdfs connectors on a given node.
Have you considered running these multiple connectors within a single instance of Kafka Connect? This might make things easier.
Kafka Connect itself can handle running multiple connectors within a single worker process. Kafka Connect in distributed mode can run on a single node, or across multiple ones.
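For example, in standalone mode a single worker can be given several connector property files at once (the file names here are hypothetical):
connect-standalone worker.properties hdfs-sink-1.properties hdfs-sink-2.properties
That way there is only one REST port and one offset storage file to manage.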
For those who are trying to use the rest.port flag and are still getting the Address already in use error: that flag was marked as deprecated in KIP-208 and finally removed in a PR.
From that point on, listeners can be used to change the default REST port.
Examples from the Javadoc:
listeners=HTTP://myhost:8083
listeners=HTTP://:8083
Configuring and Running Workers - Standalone mode
You may have Kafka Connect processes still running that you don't know about. You can check this with:
ps -ef | grep connect
If you find any, kill those processes.

Kafka scheduler in Vertica 7.2 is running and working, but produces errors

When I run /opt/vertica/packages/kafka/bin/vkconfig launch, I get this warning:
Unable to determine hostname, defaulting to 'unknown' in scheduler history
But the scheduler continues working fine and consuming messages from Kafka. What does it mean?
The next strange thing is that I find the following records in /home/dbadmin/events/dbLog (I think it is the Kafka consumer log file):
%3|14470569%3|1446726706.945|FAIL|vertica#consumer-1|
localhost:4083/bootstrap: Failed to connect to broker at
[localhost]:4083: Connection refused
%3|1446726706.945|ERROR|vertica#consumer-1| localhost:4083/bootstrap:
Failed to connect to broker at [localhost]:4083: Connection refused
%3|1446726610.267|ERROR|vertica#consumer-1| 1/1 brokers are down
As I mentioned, the scheduler does finally start, but these records periodically appear in the logs. What is this localhost:4083? Normally my broker runs on port 9092 on a separate server, which is described in the kafka_config.kafka_scheduler table.
In the scheduler history table it attempts to get the hostname using Java:
InetAddress.getLocalHost().getHostAddress();
This will sometimes result in an UnknownHostException for various reasons (you can check documentation here: https://docs.oracle.com/javase/7/docs/api/java/net/UnknownHostException.html)
If this occurs, the hostname will default to "unknown" in that table. Luckily, the schedulers work by locking with your Vertica database, so knowing exactly which host the scheduler ran on is unnecessary for functionality (it is only useful for monitoring).
The Kafka-related logging in dbLog is probably the standard output from rdkafka (https://github.com/edenhill/librdkafka). I'm not sure what is going on with that log message, unfortunately. Vertica should only be using the configured broker list.