How to use Kafka Connect to send data to a Kafka broker on another machine? - apache-kafka

I'm using Kafka Connect in Confluent Platform 3.2.1, and everything works fine in my local environment. The problem appeared when I tried to use a Kafka source connector to send data to another machine.
I deployed the Kafka JDBC source connector on machine A to capture data from database A, and a Kafka broker B (along with ZooKeeper and Schema Registry) on machine B. The source connector cannot send data to broker B and throws the following exceptions:
[2017-05-19 16:37:22,709] ERROR Failed to commit offsets for WorkerSourceTask{id=test-multi-0} (org.apache.kafka.connect.runtime.SourceTaskOffsetCommitter:112)
[2017-05-19 16:38:27,711] ERROR Failed to flush WorkerSourceTask{id=test-multi-0}, timed out while waiting for producer to flush outstanding 3 messages (org.apache.kafka.connect.runtime.WorkerSourceTask:304)
I tried configuring server.properties on broker B like this:
listeners=PLAINTEXT://:9092
and left the advertised.listeners setting commented out.
Then I used
bootstrap.servers=192.168.19.234:9092
in my source connector, where 192.168.19.234 is the IP of machine B. Machines A and B are in the same subnet.
I suspect this has something to do with my server.properties.
How should I configure this to get it working? Thanks in advance.
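A possible cause, offered as an assumption since the post does not include a confirmed fix: with advertised.listeners unset and no host in listeners, broker B advertises the host name it resolves for itself, and machine A may not be able to resolve that name. The producer then reaches the bootstrap address but fails on the advertised one, which would be consistent with the flush timeouts above. A minimal server.properties sketch for broker B, reusing the IP from the question:

```properties
# server.properties on broker B (sketch; assumes 192.168.19.234 is reachable from machine A)

# Bind on all interfaces, port 9092.
listeners=PLAINTEXT://:9092

# Address returned to clients in metadata responses. Clients use it for every
# connection after the initial bootstrap, so it must be resolvable from machine A.
advertised.listeners=PLAINTEXT://192.168.19.234:9092
```

Broker B needs a restart for the change to take effect. It is also worth checking where bootstrap.servers is set: Kafka Connect normally reads it from the worker properties file rather than from the connector configuration, so the worker on machine A should point at 192.168.19.234:9092.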

Related

KafkaJS reconnecting after initial connection with kafka and kafka connect, throwing error?

I am trying to connect my Kafka microservice to Kafka using KafkaJS, with this connection configuration:
[image: connection configuration kafkajs]
but it throws the following errors, even after the initial connection:
[image: errors even after initial connection]
My best guess is that it has something to do with my Kafka environment, which currently looks like this:
[image: kafka setup server]
It could also be related to how Kafka works when it is configured to work with Kafka Connect. Does its behaviour or environment change in that case?

Using Amazon MSK and Debezium SQL Server Connector. Error while fetching metadata with correlation id 7 : {TestKafkaDB= UNKNOWN_TOPIC_OR_PARTITION}

I was trying to connect my RDS MS SQL Server instance to the Debezium SQL Server Connector to stream changes to a Kafka cluster on Amazon MSK.
I configured the connector and the Kafka Connect worker, then ran Connect with
bin/connect-standalone.sh ../worker.properties connect/dbzmmssql.properties
and got: WARN [Producer clientId=producer-1] Error while fetching metadata with correlation id 10 : {TestKafkaDB=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient:1031)
I've solved this problem and want to share my solution with others who are new to Kafka.
TestKafkaDB=UNKNOWN_TOPIC_OR_PARTITION basically means the connector didn't find a usable topic named TestKafkaDB on the Kafka broker. In my case, the reason was that the Kafka broker did not automatically create a new topic for the stream.
To solve this, I changed the cluster configuration in the AWS MSK console, setting auto.create.topics.enable from its default of false to true, and applied the updated configuration to the cluster. That solved my problem.
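The change described above amounts to a one-line entry in the MSK cluster configuration. A sketch of that entry, assuming every other setting in the configuration keeps its existing value:

```properties
# Custom MSK cluster configuration (sketch)

# Let brokers create a topic the first time a client asks for it.
# MSK ships with this set to false, which is why the connector's
# topic (TestKafkaDB in the question) was never created.
auto.create.topics.enable=true
```

Auto-created topics get the broker's default partition count and replication factor, so creating the topic explicitly ahead of time is a reasonable alternative when those defaults are not what you want.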

Configuring kafka connect with multi brokers

Steps
I have two Kafka brokers, and I have started the ZooKeeper, Kafka server and Kafka Connect services.
I have one source connector that reads data from a database.
If I start the connector [connector 1] through the REST API, the request goes through a load balancer to one of the Kafka servers [server 1]. Server 1 then stores and runs the connector, but server 2 does not know about connector 1, which is running on server 1.
Expectation
If Kafka server 1 goes down, Kafka server 2 should be able to take over and run the connector from the failed server 1.
When a connector is started, every Kafka server should know which connectors are running, so that if one broker fails, another server can continue the job.
Reality
Kafka server 2 does not take over the job as required.
Is there any configuration in Kafka that makes this work?
Any suggestions would be appreciated.
Kafka Server 1
Kafka Server 2
It appears that you have started all processes in a single pod.
You should run Kafka, ZooKeeper, and Connect as separate services in different pods.
I suggest you refer to the Confluent or Strimzi sites to find Kafka Kubernetes Helm charts / operators.
But to answer the question: you can list one or more brokers in the bootstrap.servers value of connect-distributed.properties. The worker then connects to the Kafka cluster through any of the listed brokers, and can reconnect through another one if a broker is unavailable.
Note that "Kafka servers" (brokers) do not run connectors; Connect workers do.
If you want to run a cluster of Connect workers, give them the same group.id and also set each worker's advertised REST address (rest.advertised.host.name and, if needed, rest.advertised.port) so that the workers can communicate with each other.
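As a rough illustration of that advice, here is a fragment of connect-distributed.properties for one worker. The host names, group id and topic names are hypothetical, and other required settings (converters, for example) are omitted:

```properties
# connect-distributed.properties, worker 1 (sketch; all names are placeholders)

# Listing both brokers lets the worker bootstrap even if one of them is down.
bootstrap.servers=broker1:9092,broker2:9092

# Workers that share a group.id form one Connect cluster; connectors and
# tasks are rebalanced onto the remaining workers when one worker fails.
group.id=connect-cluster

# Internal topics where the cluster keeps connector configs, offsets and status.
# Every worker in the cluster must use the same three names.
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-status

# Address other workers use to reach this worker's REST API.
# This is the only part that differs from worker to worker.
rest.advertised.host.name=connect-worker-1
rest.advertised.port=8083
```

With this layout a connector created through any worker's REST API is stored in the shared config topic, so the surviving workers know about it and can pick it up.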

Kafka and Kafka Connect deployment environment

If I already have Kafka running on premises, is Kafka Connect just a configuration on top of my existing Kafka, or does Kafka Connect require its own server/environment separate from that of my existing Kafka?
Kafka Connect is part of Apache Kafka, but it runs as a separate process, called a Kafka Connect Worker. Except in a sandbox environment, you would usually deploy it on a separate machine/node from your Kafka brokers.
This diagram shows conceptually how it runs, separate from your brokers:
You can run Kafka Connect on a single node, or as part of a cluster (for throughput and redundancy).
You can read more here about installation and configuration and architecture of Kafka Connect.
Kafka Connect is its own configuration on top of your bootstrap-server's configuration.
For Kafka Connect you can choose between a standalone server or distributed connect servers and you'll have to update the corresponding properties file to point to your currently running Kafka server(s).
Look under {kafka-root}/config and you'll see
You'll basically update connect-standalone or connect-distributed properties based on your need.
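For the standalone case, the edit usually comes down to a few lines. A sketch, assuming a single broker at a hypothetical address and the JSON converter that ships with Kafka:

```properties
# connect-standalone.properties (sketch; the broker address is a placeholder)

# Point the worker at the Kafka cluster you already run.
bootstrap.servers=kafka-host:9092

# How record keys and values are serialized to and from Kafka.
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter

# Standalone mode keeps source offsets in a local file rather than a Kafka topic.
offset.storage.file.filename=/tmp/connect.offsets
```

The distributed file replaces the offset file with Kafka topics and adds a group.id, which is what allows several workers to share the load.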

Setup kafka-connect to fetch data from remote brokers

I'm trying to set up a Kafka Connect sink connector, which runs inside a Kafka Connect worker (confluent-3.2.0). I have a Kafka broker (confluent-3.2.0) up and running on machine A. I want to set up the sink connector on another machine B to consume messages, using a custom Kafka Connect sink connector jar. Assume that the Kafka broker and ZooKeeper ports on machine A are open to machine B.
So should I install confluent-3.2.0 on machine B (since Kafka Connect is part of the Kafka package), set the classpath to include the sink connector jar, and run the following command?
./bin/connect-distributed.sh worker.properties
Yes. What you describe will work and is the easiest way to set up this system, even though on machine B you really only need the start script, the configuration properties file, the jars for Kafka Connect, and the jars for the custom connector.
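For reference, the part of worker.properties that ties machine B to machine A could look roughly like this. The host name and group id are hypothetical, the converter choice is an assumption that has to match how the messages were written, and the storage topic settings required in distributed mode are left out:

```properties
# worker.properties on machine B (sketch; "machine-a" is a placeholder host name)

# The worker talks only to the broker. It does not connect to ZooKeeper,
# so the broker port on machine A is the one that matters.
bootstrap.servers=machine-a:9092

# Identifies the Connect cluster this worker belongs to.
group.id=connect-sink-cluster

# Must match the format of the data already in the topics being consumed.
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
```

The custom connector jar is added to the classpath outside this file, as the question describes, before running connect-distributed.sh.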