Kafka Snowflake ConnectStandalone - Error while starting the Snowflake connector - apache-kafka

[SF_KAFKA_CONNECTOR] SnowflakeSinkTask[ID:0]:start. Time: 0 seconds (com.snowflake.kafka.connector.SnowflakeSinkTask:154)
[2021-09-07 23:19:44,145] INFO WorkerSinkTask{id=snowflakeslink-0} Sink task finished initialization and start (org.apache.kafka.connect.runtime.WorkerSinkTask:309)
[2021-09-07 23:19:44,169] WARN [Consumer clientId=connector-consumer-snowflakeslink-0, groupId=connect-snowflakeslink] Connection to node -1 (localhost/127.0.0.1:9092) terminated during authentication. This may happen due to any of the following reasons: (1) Authentication failed due to invalid credentials with brokers older than 1.0.0, (2) Firewall blocking Kafka TLS traffic (eg it may only allow HTTPS traffic), (3) Transient network issue. (org.apache.kafka.clients.NetworkClient:769)
[2021-09-07 23:19:44,170] WARN [Consumer clientId=connector-consumer-snowflakeslink-0, groupId=connect-snowflakeslink] Bootstrap broker localhost:9092 (id: -1 rack: null) disconnected (org.apache.kafka.clients.NetworkClient:1060)

Connection ... terminated during authentication
You need to remove consumer.security.protocol=SSL from your connect-standalone.properties, since your broker's server.properties listener is not using SSL.
Your next error
Failed to find any class that implements Connector and which name matches com.snowflake.kafka.connector.SnowflakeSinkConnector, available connectors are: PluginDesc{klass=class org.apache.kafka.connect.file.FileStreamSinkConnector, name='org.apache.kafka.connect.file.FileStreamSinkConnector
Look at the list: the class indeed doesn't exist, which means you have not correctly extracted the Snowflake connector libraries into the plugin.path. That path should be a folder external to Kafka's internal lib folder, for example plugin.path=/opt/kafka-connectors/, with a snowflake subfolder containing all of its required JARs. This way it will not conflict with the actual classpath of the broker and the other Kafka/ZooKeeper CLI tools that rely on the lib folder.
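As a sketch, the standalone worker configuration and plugin layout could look like this (the path and JAR names are illustrative placeholders, not the exact contents of the Snowflake distribution):

```properties
# connect-standalone.properties (fragment)
# plugin.path must point OUTSIDE Kafka's own libs/ directory
plugin.path=/opt/kafka-connectors

# Expected directory layout (one subfolder per connector):
#   /opt/kafka-connectors/
#   └── snowflake/
#       ├── snowflake-kafka-connector-<version>.jar
#       └── ... (all of the connector's dependency JARs)
```

With this layout, the Connect worker scans each subfolder in an isolated classloader, so the connector's dependencies cannot clash with the broker's own classpath.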

Related

Connection terminates between Mule 4 and Confluent Cloud with Apache Kafka Connector 4.5.0 but connects with 3.0.7

Setting up a (very simple) POC with Mule 4 and Confluent Cloud:
I have been unable to establish a successful connection using the latest version of the Mule 4 Apache Kafka Connector (4.5.0). If I downgrade it to 3.0.7 and use the same configuration it works fine. Why is this?
The working 3.0.7 configuration (for a basic producer) looks like this:
<kafka:kafka-producer-config name="Apache_Kafka_Producer_configuration" doc:name="Apache Kafka Producer configuration" doc:id="2ba6262d-2ff8-4282-910e-5c9e3d347d50" >
<kafka:basic-kafka-producer-connection bootstrapServers="${kafka.bootstrapserver}" >
<kafka:additional-properties >
<kafka:additional-property key="sasl.jaas.config" value="org.apache.kafka.common.security.plain.PlainLoginModule required username='${kafka.key}' password='${kafka.secret}';" />
<kafka:additional-property key="ssl.endpoint.identification.algorithm" value="https" />
<kafka:additional-property key="security.protocol" value="SASL_SSL" />
<kafka:additional-property key="sasl.mechanism" value="PLAIN" />
<kafka:additional-property key="serviceName" value="kafka" />
</kafka:additional-properties>
</kafka:basic-kafka-producer-connection>
</kafka:kafka-producer-config>
And the failing 4.5.0 configuration (also for a basic producer) looks like this:
<kafka:producer-config name="Apache_Kafka_Producer_configuration" doc:name="Apache Kafka Producer configuration" doc:id="7aa22dcc-7895-4254-ba51-e8bc5e2e9c2e" >
<kafka:producer-sasl-plain-connection username="${kafka.key}" password="${kafka.secret}" endpointIdentificationAlgorithm="https">
<kafka:bootstrap-servers >
<kafka:bootstrap-server value="${kafka.bootstrapserver}" />
</kafka:bootstrap-servers>
</kafka:producer-sasl-plain-connection>
</kafka:producer-config>
You can see that they both:
Use a SASL plain-text connection
Have an SSL endpoint identification algorithm of HTTPS
Specify the same bootstrap server, API key, and secret
There is very little else in the flow other than an HTTP listener and a Set Payload.
Messages sent with the earlier connector version arrive on the Confluent Cloud topic fine; with 4.5.0, however, the application fails to start and repeatedly prints errors such as:
org.apache.kafka.common.security.authenticator.SaslClientAuthenticator: [Producer clientId=producer-1] Set SASL client state to RECEIVE_APIVERSIONS_RESPONSE
org.apache.kafka.clients.NetworkClient: [Producer clientId=producer-1] Completed connection to node -1. Fetching API versions.
org.apache.kafka.clients.NetworkClient: [Producer clientId=producer-1] Found least loaded connecting node pkc-4vndj.australia-southeast1.gcp.confluent.cloud:9092 (id: -1 rack: null)
org.mule.runtime.module.extension.internal.runtime.config.LifecycleAwareConfigurationInstance.testConnectivity:179 #23ad5b4f] [processor: ; event: ] org.apache.kafka.clients.NetworkClient: [Consumer clientId=consumer-connectivity-1, groupId=connectivity] Node -1 disconnected.
org.mule.runtime.module.extension.internal.runtime.config.LifecycleAwareConfigurationInstance.testConnectivity:179 #23ad5b4f] [processor: ; event: ] org.apache.kafka.clients.NetworkClient: [Consumer clientId=consumer-connectivity-1, groupId=connectivity] Connection to node -1 (xxxx.australia-southeast1.gcp.confluent.cloud/35.244.90.132:9092) terminated during authentication. This may happen due to any of the following reasons: (1) Authentication failed due to invalid credentials with brokers older than 1.0.0, (2) Firewall blocking Kafka TLS traffic (eg it may only allow HTTPS traffic), (3) Transient network issue.
org.apache.kafka.clients.NetworkClient: [Consumer clientId=consumer-connectivity-1, groupId=connectivity] Bootstrap broker pkc-4vndj.australia-southeast1.gcp.confluent.cloud:9092 (id: -1 rack: null) disconnected
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient: [Consumer clientId=consumer-connectivity-1, groupId=connectivity] Cancelled request with header RequestHeader(apiKey=METADATA, apiVersion=9, clientId=consumer-connectivity-1, correlationId=17) due to node -1 being disconnected
org.apache.kafka.common.network.Selector: [Producer clientId=producer-1] Connection with xxxxx.australia-southeast1.gcp.confluent.cloud/35.244.90.132 disconnected
And a stack trace with an EOFException:
java.io.EOFException: null
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:120) ~[kafka-clients-2.7.0.jar:?]
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.receiveResponseOrToken(SaslClientAuthenticator.java:470) ~[kafka-clients-2.7.0.jar:?]
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.receiveKafkaResponse(SaslClientAuthenticator.java:560) ~[kafka-clients-2.7.0.jar:?]
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.authenticate(SaslClientAuthenticator.java:248) ~[kafka-clients-2.7.0.jar:?]
at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:176) ~[kafka-clients-2.7.0.jar:?]
Which (looking at the Apache source code) looks like a zero-byte message response.
Version 4.5.0 may not be constructing and instantiating a proper org.apache.kafka.common.security.plain.PlainLoginModule that is required for Confluent Cloud to authenticate requests.
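One way to isolate whether the credentials or the connector are at fault is to try the same settings with the plain Kafka CLI tools, outside Mule entirely. A sketch of an equivalent client configuration (the property names are standard Kafka client settings; the placeholder values are assumptions taken from the Mule configs above):

```properties
# client.properties (sketch) - same auth settings the Mule configs use
bootstrap.servers=<kafka.bootstrapserver>
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
ssl.endpoint.identification.algorithm=https
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
  username="<kafka.key>" password="<kafka.secret>";
```

If, for example, kafka-console-producer --broker-list <kafka.bootstrapserver> --topic <topic> --producer.config client.properties succeeds with these values, the problem is confined to how the 4.5.0 connector builds its SASL/JAAS configuration rather than to the credentials or network path.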

kafka listening on multiple interfaces

I have a requirement as below:
Kafka needs to listen on two interfaces, one external and one internal; all other components in the system will connect to Kafka on the internal interface.
At installation time the internal IPs of the other hosts are not reachable; some configuration, which we do not control, is required afterwards to make them reachable. So assume that while Kafka is coming up, the nodes cannot reach each other's internal IPs.
Scenario:
I have two nodes in cluster:
node1 (External IP: 10.10.10.4, Internal IP: 5.5.5.4)
node2 (External IP: 10.10.10.5, Internal IP: 5.5.5.5)
Now, during installation, 10.10.10.4 can ping 10.10.10.5 and vice versa, but 5.5.5.4 cannot reach 5.5.5.5. That connectivity will only exist after the Kafka installation is done and someone applies further configuration; before Kafka installation, we cannot make the internal IPs reachable.
The requirement is that the Kafka brokers exchange messages on the 10.10.10.x interface, so that the cluster can form, while clients send messages on the 5.5.5.x interface.
What I tried was as below:
listeners=USERS://0.0.0.0:9092,REPLICATION://0.0.0.0:9093
advertised.listeners=USERS://5.5.5.5:9092,REPLICATION://5.5.5.5:9093
Where 5.5.5.5 is the internal IP address.
But with this, while restarting Kafka, I see the logs below:
{"log":"[2020-06-23 19:05:34,923] INFO Creating /brokers/ids/2 (is it secure? false) (kafka.zk.KafkaZkClient)\n","stream":"stdout","time":"2020-06-23T19:05:34.923403973Z"}
{"log":"[2020-06-23 19:05:34,925] INFO Result of znode creation at /brokers/ids/2 is: OK (kafka.zk.KafkaZkClient)\n","stream":"stdout","time":"2020-06-23T19:05:34.925237419Z"}
{"log":"[2020-06-23 19:05:34,926] INFO Registered broker 2 at path /brokers/ids/2 with addresses: ArrayBuffer(EndPoint(5.5.5.5,9092,ListenerName(USERS),PLAINTEXT), EndPoint(5.5.5.5,9093,ListenerName(REPLICATION),PLAINTEXT)) (kafka.zk.KafkaZkClient)\n","stream":"stdout","time":"2020-06-23T19:05:34.926127438Z"}
.....
{"log":"[2020-06-23 19:05:35,078] INFO Kafka version : 1.1.0 (org.apache.kafka.common.utils.AppInfoParser)\n","stream":"stdout","time":"2020-06-23T19:05:35.078444509Z"}
{"log":"[2020-06-23 19:05:35,078] INFO Kafka commitId : fdcf75ea326b8e07 (org.apache.kafka.common.utils.AppInfoParser)\n","stream":"stdout","time":"2020-06-23T19:05:35.078471358Z"}
{"log":"[2020-06-23 19:05:35,079] INFO [KafkaServer id=2] started (kafka.server.KafkaServer)\n","stream":"stdout","time":"2020-06-23T19:05:35.079436798Z"}
{"log":"[2020-06-23 19:05:35,136] ERROR [KafkaApi-2] Number of alive brokers '0' does not meet the required replication factor '2' for the offsets topic (configured via 'offsets.topic.replication.factor'). This error can be ignored if the cluster is starting up and not all brokers are up yet. (kafka.server.KafkaApis)\n","stream":"stdout","time":"2020-06-23T19:05:35.136792119Z"}
And after that, this message repeats continuously:
{"log":"[2020-06-23 19:05:35,166] ERROR [KafkaApi-2] Number of alive brokers '0' does not meet the required replication factor '2' for the offsets topic (configured via 'offsets.topic.replication.factor'). This error can be ignored if the cluster is starting up and not all brokers are up yet. (kafka.server.KafkaApis)\n","stream":"stdout","time":"2020-06-23T19:05:35.166895344Z"}
Is there any way we can achieve that?
With regards,
-M-
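One possible sketch (not verified in this environment): advertise the REPLICATION listener on the external interface so the brokers can reach each other during installation, and pin inter-broker traffic to that listener. The listener.security.protocol.map and inter.broker.listener.name settings require Kafka 0.10.2+, so they should be available on the 1.1.0 broker shown in the logs; the per-node values below are assumptions for node2:

```properties
# server.properties on node2 (sketch; use 10.10.10.4 / 5.5.5.4 on node1)
listeners=USERS://0.0.0.0:9092,REPLICATION://0.0.0.0:9093
# clients are told the internal address; brokers advertise the external one
advertised.listeners=USERS://5.5.5.5:9092,REPLICATION://10.10.10.5:9093
listener.security.protocol.map=USERS:PLAINTEXT,REPLICATION:PLAINTEXT
inter.broker.listener.name=REPLICATION
```

Note that inter.broker.listener.name and security.inter.broker.protocol are mutually exclusive, so remove the latter if it is set. With this layout the "Number of alive brokers '0'" error should stop, because replication traffic is advertised on an address the peer broker can actually reach at install time.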

Kafka giving warn failed to send SSL close message

I have a kafka cluster of 3 kafka brokers on 3 different servers.
Let's assume the three server addresses are:
99.99.99.1
99.99.99.2
99.99.99.3
All three servers share a path on which Kafka resides.
I have created three server properties files named:
server1.properties
server2.properties
server3.properties
The server1.properties look like below:
broker.id=1
port=9094
listeners=SSL://99.99.99.1:9094
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=3
transaction.state.log.min.isr=3
zookeeper.connect=99.99.99.1:2181,99.99.99.2:2182,99.99.99.3:2183
ssl.keystore.location=xyz.jks
ssl.keystore.password=password
ssl.key.password=password
ssl.truststore.location=xyz.jks
ssl.truststore.password=password
ssl.client.auth=required
security.inter.broker.protocol=SSL
The other two server properties files look similar.
Issues/Query:
I need consumers and producers to connect using SSL, and all the brokers to connect to each other using SSL. Is my configuration right for this?
I keep getting the error below; is this usual?
WARN Failed to send SSL Close message
(org.apache.kafka.common.network.SslTransportLayer)
java.io.IOException: Broken pipe
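For the first query, the broker side looks plausible; the clients also need their own SSL settings. A sketch of a matching producer/consumer configuration (the paths and passwords simply mirror the broker example above and are placeholders):

```properties
# client-ssl.properties (sketch) - pass via --producer.config / --consumer.config
security.protocol=SSL
ssl.truststore.location=xyz.jks
ssl.truststore.password=password
# the keystore entries are needed because the brokers set ssl.client.auth=required
ssl.keystore.location=xyz.jks
ssl.keystore.password=password
ssl.key.password=password
```

The "Failed to send SSL Close message / Broken pipe" warning is typically benign: it appears when the peer closes the TCP connection before the SSL close_notify can be sent, e.g. on idle-connection reaping, and does not by itself indicate a misconfiguration.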

Kafka Remote Producer - advertised.listeners

I am running Kafka 0.10.0 on CDH 5.9, cluster is kerborized.
What I am trying to do is to write messages from a remote machine to my Kafka broker.
The cluster (where Kafka is installed) has internal as well as external IP addresses.
The machines' hostnames within the cluster resolve to the private IPs; the remote machine resolves the same hostnames to the public IP addresses.
I opened the necessary port 9092 (I am using the SASL_PLAINTEXT protocol) from the remote machine to the Kafka broker, and verified that using telnet.
First Step - in addition to the standard properties for the Kafka Broker, I configured the following:
listeners=SASL_PLAINTEXT://0.0.0.0:9092
advertised.listeners=SASL_PLAINTEXT://<hostname>:9092
I am able to start the console consumer with
kafka-console-consumer --new-consumer --topic <topicname> --from-beginning --bootstrap-server <hostname>:9092 --consumer.config consumer.properties
I am able to use my custom producer from another machine within the cluster.
Relevant excerpt of producer properties:
security.protocol=SASL_PLAINTEXT
bootstrap.servers=<hostname>:9092
I am not able to use my custom producer from the remote machine:
Exception org.apache.kafka.common.errors.TimeoutException: Batch containing 1 record(s) expired due to timeout while requesting metadata from brokers for <topicname>-<partition>
using the same producer properties. I am able to telnet the Kafka Broker from the machine and /etc/hosts includes hostnames and public IPs.
Second Step - I modified server.properties:
listeners=SASL_PLAINTEXT://0.0.0.0:9092
advertised.listeners=SASL_PLAINTEXT://<kafkaBrokerInternalIP>:9092
consumer & producer within the same cluster still run fine (bootstrap servers are now the internal IP with port 9092)
as expected, the remote producer fails (but that is obvious given that it is not aware of the internal IP addresses)
Third Step - where it gets hairy :(
listeners=SASL_PLAINTEXT://0.0.0.0:9092
advertised.listeners=SASL_PLAINTEXT://<kafkaBrokerPublicIP>:9092
starting my consumer with
kafka-console-consumer --new-consumer --topic <topicname> --from-beginning --bootstrap-server <hostname>:9092 --consumer.config consumer.properties
gives me a warning, but I don't think this is right...
WARN clients.NetworkClient: Error while fetching metadata with correlation id 1 : {<topicname>=LEADER_NOT_AVAILABLE}
starting my consumer with
kafka-console-consumer --new-consumer --topic <topicname> --from-beginning --bootstrap-server <KafkaBrokerPublicIP>:9092 --consumer.config consumer.properties
just hangs after those log messages:
INFO utils.AppInfoParser: Kafka version : 0.10.0-kafka-2.1.0
INFO utils.AppInfoParser: Kafka commitId : unknown
It seems it cannot find a coordinator; in the normal flow this would be the next log line:
INFO internals.AbstractCoordinator: Discovered coordinator <hostname>:9092 (id: <someNumber> rack: null) for group console-consumer-<someNumber>.
starting the producer on a cluster node with bootstrap.servers=<hostname>:9092 I observe the same as with the consumer:
WARN NetworkClient:600 - Error while fetching metadata with correlation id 0 : {<topicname>=LEADER_NOT_AVAILABLE}
starting the producer on a cluster node with bootstrap.servers=<kafkaBrokerPublicIP>:9092 I get
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
starting the producer on my remote machine with either bootstrap.servers=<hostname>:9092 or bootstrap.servers=<kafkaBrokerPublicIP>:9092 I get
NetworkClient:600 - Error while fetching metadata with correlation id 0 : {<topicname>=LEADER_NOT_AVAILABLE}
I have been struggling for the past three days to get this to work, but I am out of ideas. My understanding is that advertised.listeners serves exactly this purpose; however, either I am doing something wrong, or there is something wrong in the machine setup.
Any hints are very much appreciated!
I met this issue recently.
In my case, I had enabled Kafka ACLs; after disabling them by commenting out these two configurations, the problem was worked around:
authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer
super.users=User:kafka
And a thread that may help you, I think:
https://gist.github.com/jorisdevrede/a7933a99251452bb1867
What it mentions at the end:
If you only use a SASL_PLAINTEXT listener on the Kafka broker, you have to make sure that you have set security.inter.broker.protocol=SASL_PLAINTEXT too, otherwise you will get a LEADER_NOT_AVAILABLE error in the client.
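On brokers that support named listeners (0.10.2 and later, so not the 0.10.0 broker in the question itself), the internal/external split described above is usually expressed with two listeners rather than a single advertised address. A sketch with placeholder names:

```properties
# server.properties (sketch; hostnames are placeholders)
listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9093
advertised.listeners=INTERNAL://<internalHostname>:9092,EXTERNAL://<publicHostname>:9093
listener.security.protocol.map=INTERNAL:SASL_PLAINTEXT,EXTERNAL:SASL_PLAINTEXT
inter.broker.listener.name=INTERNAL
```

Remote producers then bootstrap against port 9093 and are handed back the public hostname, while brokers and in-cluster clients keep using the internal one. On 0.10.0 there is only one advertised address, so it must be resolvable from both networks, which is why the /etc/hosts or split-DNS approaches below are needed.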

Kafka QuickStart, advertised.host.name gives kafka.common.LeaderNotAvailableException

I am able to get a simple one-node Kafka (kafka_2.11-0.8.2.1) working locally on one linux machine, but when I try to run a producer remotely I'm getting some confusing errors.
I'm following the quickstart guide at http://kafka.apache.org/documentation.html#quickstart. I stopped the Kafka processes and deleted all the ZooKeeper & Kafka files in /tmp. I am on a local 10.0.0.0/24 network NAT-ed with an external IP address, so I modified server.properties to tell ZooKeeper how to broadcast my external address, as per https://medium.com/#thedude_rog/running-kafka-in-a-hybrid-cloud-environment-17a8f3cfc284:
advertised.host.name=MY.EXTERNAL.IP
Then I'm running this:
$ bin/zookeeper-server-start.sh config/zookeeper.properties
--> ...
$ export KAFKA_HEAP_OPTS="-Xmx256M -Xms128M" # small test server!
$ bin/kafka-server-start.sh config/server.properties
--> ...
I opened up the firewall for my producer on the remote machine, and created a new topic and verified it:
$ bin/kafka-topics.sh --create --zookeeper MY.EXTERNAL.IP:2181 --replication-factor 1 --partitions 1 --topic test123
--> Created topic "test123".
$ bin/kafka-topics.sh --list --zookeeper MY.EXTERNAL.IP:2181
--> test123
However, the producer I'm running remotely gives me errors:
$ bin/kafka-console-producer.sh --broker-list MY.EXTERNAL.IP:9092 --topic test123
--> [2015-06-16 14:41:19,757] WARN Property topic is not valid (kafka.utils.VerifiableProperties)
My Test Message
--> [2015-06-16 14:42:43,347] WARN Error while fetching metadata [{TopicMetadata for topic test123 ->
No partition metadata for topic test123 due to kafka.common.LeaderNotAvailableException}] for topic [test123]: class kafka.common.LeaderNotAvailableException (kafka.producer.BrokerPartitionInfo)
--> (repeated several times)
(I disabled the whole firewall to make sure that wasn't the problem.)
The stdout errors in the Kafka startup are repeated: [2015-06-16 20:42:42,768] INFO Closing socket connection to /MY.EXTERNAL.IP. (kafka.network.Processor)
And the controller.log gives me this, several times:
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
[2015-06-16 20:44:08,128] INFO [Controller-0-to-broker-0-send-thread], Controller 0 connected to id:0,host:MY.EXTERNAL.IP,port:9092 for sending state change requests (kafka.controller.RequestSendThread)
[2015-06-16 20:44:08,428] WARN [Controller-0-to-broker-0-send-thread], Controller 0 epoch 1 fails to send request Name:LeaderAndIsrRequest;Version:0;Controller:0;ControllerEpoch:1;CorrelationId:7;ClientId:id_0-host_null-port_9092;Leaders:id:0,host:MY.EXTERNAL.IP,port:9092;PartitionState:(test123,0) -> (LeaderAndIsrInfo:(Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:0) to broker id:0,host:MY.EXTERNAL.IP,port:9092. Reconnecting to broker. (kafka.controller.RequestSendThread)
Running this seems to indicate that there is a leader at 0:
$ ./bin/kafka-topics.sh --zookeeper MY.EXTERNAL.IP:2181 --describe --topic test123
--> Topic:test123 PartitionCount:1 ReplicationFactor:1 Configs:
Topic: test123 Partition: 0 Leader: 0 Replicas: 0 Isr: 0
I reran this test and my server.log indicates that there is a leader at 0:
...
[2015-06-16 21:58:04,498] INFO 0 successfully elected as leader (kafka.server.ZookeeperLeaderElector)
[2015-06-16 21:58:04,642] INFO Registered broker 0 at path /brokers/ids/0 with address MY.EXTERNAL.IP:9092. (kafka.utils.ZkUtils$)
[2015-06-16 21:58:04,670] INFO [Kafka Server 0], started (kafka.server.KafkaServer)
[2015-06-16 21:58:04,736] INFO New leader is 0 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
I see this error in the logs when I send a message from the producer:
[2015-06-16 22:18:24,584] ERROR [KafkaApi-0] error when handling request Name: TopicMetadataRequest; Version: 0; CorrelationId: 7; ClientId: console-producer; Topics: test123 (kafka.server.KafkaApis)
kafka.admin.AdminOperationException: replication factor: 1 larger than available brokers: 0
at kafka.admin.AdminUtils$.assignReplicasToBrokers(AdminUtils.scala:70)
I assume this means that the broker can't be found for some reason? I'm confused what this means...
For recent versions of Kafka (0.10.0 as of this writing), you don't want to use advertised.host.name at all. In fact, the documentation states that advertised.host.name is already deprecated. Moreover, Kafka will use this not only as the "advertised" host name for producers and consumers, but for the other brokers as well (in a multi-broker environment), which is kind of a pain if you're using a different (perhaps internal) DNS for the brokers, and you really don't want to get into the business of adding entries to the individual /etc/hosts files of the brokers (ew!).
So, basically, you would want the brokers to use the internal name, but use the external FQDNs for the producers and consumers only. To do this, you will update advertised.listeners instead.
Set advertised.host.name to a host name, not an IP address. The default is to return a FQDN using getCanonicalHostName(), but this is only best effort and falls back to an IP. See the java docs for getCanonicalHostName().
The trick is to get that host name to always resolve to the correct IP. For small environments I usually setup all of the hosts with all of their internal IPs in /etc/hosts. This way all machines know how to talk to each other over the internal network, by name. In fact, configure your Kafka clients by name now too, not by IP. If managing all the /etc/hosts files is a burden then setup an internal DNS server to centralize it, but internal DNS should return internal IPs. Either of these options should be less work than having IP addresses scattered throughout various configuration files on various machines.
Once everything is communicating by name all that's left is to configure external DNS with the external IPs and everything just works. This includes configuring Kafka clients with the server names, not IPs.
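A concrete sketch of that split (all names and addresses below are made up): every broker and internal client carries the internal mapping, while external DNS maps the same names to the public IPs:

```
# /etc/hosts on each broker and internal client (internal IPs)
10.0.0.11   kafka1.example.com
10.0.0.12   kafka2.example.com

# external DNS zone for remote clients (public IPs, same names):
#   kafka1.example.com.  A  203.0.113.11
#   kafka2.example.com.  A  203.0.113.12
```

Because both sides resolve kafka1.example.com, the single advertised host name works for everyone; each network simply resolves it to the address it can actually reach.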
So to summarize, the solution to this was to add a route via NAT so that the machine can access its own external IP address.
ZooKeeper uses the address it finds in advertised.host.name both to tell clients where to find the broker and to communicate with the broker itself. The error that gets reported doesn't make this very clear, and it's confusing because a client has no problem opening a TCP connection.
Taking a cue from the above: for my single node (while still learning) I modified the server.properties file, setting advertised.host.name to 127.0.0.1. So finally it looks like this:
advertised.host.name=127.0.0.1
While starting the producer it still shows the warning, but it is now at least working and I can see the messages coming through perfectly on the consumer terminal.
On the machine where Kafka is installed, check that it is up and running. The error states that 0 brokers are available, which means Kafka is not up and running.
On a Linux machine you can use the netstat command to check whether the service is listening:
netstat -an | grep <kafka_port>   (the default port is 9092)
From the conf/server.properties documentation:
host.name
DEPRECATED: only used when listeners is not set. Use listeners instead. Hostname of the broker. If this is set, it will only bind to this address; if it is not set, it will bind to all interfaces.