I have a Kafka Connect source and sink connector for putting data into Kafka and taking it back out.
I am running Kafka and Kafka Connect using docker-compose, which runs Connect in distributed mode. I can see that it finds my plugin when Connect starts up, but it doesn't actually do anything unless I do a POST to the /connectors API, including the configuration in JSON.
I have a properties file with the configuration in it and I've tried putting it under /etc where I find similar properties files for the other plugins that are installed.
Am I missing a step when installing my plugin, or is it required to register the connector via the REST API before it will be assigned to workers?
Yes, you have to configure Kafka Connect using the REST API when using distributed mode.
It's possible to script the creation of connectors though, using a Docker Compose like this:
command:
- bash
- -c
- |
/etc/confluent/docker/run &
echo "Waiting for Kafka Connect to start listening on kafka-connect ⏳"
while [ $$(curl -s -o /dev/null -w %{http_code} http://kafka-connect:8083/connectors) -eq 000 ] ; do
echo -e $$(date) " Kafka Connect listener HTTP state: " $$(curl -s -o /dev/null -w %{http_code} http://kafka-connect:8083/connectors) " (waiting for 200)"
sleep 5
done
nc -vz kafka-connect 8083
echo -e "\n--\n+> Creating Kafka Connect Elasticsearch sink"
/scripts/create-es-sink.sh
sleep infinity
where /scripts/create-es-sink.sh contains the curl REST call, in a file mounted from the local machine into the container.
(source: https://rmoff.net/2018/12/15/docker-tips-and-tricks-with-kafka-connect-ksqldb-and-kafka/)
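For reference, here is a minimal sketch of what such a script might contain; the connector name, topic, and Elasticsearch URL are illustrative placeholders, not taken from the setup above:

#!/bin/bash
# Hypothetical create-es-sink.sh: registers an Elasticsearch sink via the Connect REST API.
# PUT against /connectors/<name>/config is idempotent, so re-running the script is safe.
curl -s -X PUT -H "Content-Type:application/json" \
  http://kafka-connect:8083/connectors/es-sink/config \
  -d '{
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "topics": "my-topic",
        "connection.url": "http://elasticsearch:9200",
        "key.ignore": "true"
      }'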
You can install a Kafka connector before you start the distributed Connect worker by using "confluent-hub install", as shown here: Install Kafka connector manually. However, I'm not sure what the equivalent steps are if you aren't using confluent-hub.
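For what it's worth, without confluent-hub the usual approach is to unpack the connector's JARs into their own subdirectory of a directory listed in the worker's plugin.path, then restart the worker. A rough sketch, with illustrative paths:

# Unpack the connector into its own subdirectory of the plugin path
mkdir -p /usr/share/kafka/plugins/my-connector
unzip my-connector.zip -d /usr/share/kafka/plugins/my-connector

# Make sure the worker config points at the parent directory, then restart the worker
grep plugin.path /etc/kafka/connect-distributed.properties
# plugin.path=/usr/share/kafka/plugins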
Related
I use Kafka and Kafka Connect (image: confluentinc/cp-kafka-connect).
When you use Kafka in a Docker container, if you want to operate Kafka, you have to go into the container (like 'docker exec -it kafka' or 'docker exec -it kafka-connect' ----> this is another question I want to ask), right?
I tried putting some connectors (a JDBC connector, a MySQL connector) into the kafka-connect container, but it didn't work.
So my question is:
After docker-compose up, if I want to run some connectors ('./bin/connect-distributed.sh ./etc/kafka/connect-distributed.properties'), which container do I have to go into?
And if I set the plugin path, where should I write it (kafka? kafka-connect?)?
Sorry if this is difficult to read.
No, you don't need to exec anywhere unless you cannot download Kafka on your host machine to get the CLI scripts. And you'd only exec for kafka-topics, the console producer/consumer, kafka-consumer-groups, etc., not for any of the Connect scripts.
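For example, the broker-side CLI tools can be run either by exec'ing into the broker container or from a local Kafka download on the host; the container name 'kafka' here is an assumption based on your compose file:

# Run a CLI tool inside the broker container (the cp-kafka image has the scripts on its PATH)
docker exec -it kafka kafka-topics --bootstrap-server localhost:9092 --list

# ...or from a Kafka download on the host, pointing at the broker's advertised listener
bin/kafka-topics.sh --bootstrap-server localhost:9092 --list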
The Connect container automatically runs the distributed-mode script, and you simply set CONNECT_PLUGIN_PATH as an environment variable pointing at any directory in the container you want to use for the plugins (I like /opt/connectors if I mount a volume, but that's not where confluent-hub installs to for that image). That variable doesn't do anything for the broker image, only for Connect.
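A minimal sketch of how that looks in Docker Compose; the /opt/connectors mount point is just the convention mentioned above:

kafka-connect:
  image: confluentinc/cp-kafka-connect
  environment:
    # Comma-separated list of directories the worker scans for plugins at startup
    CONNECT_PLUGIN_PATH: "/usr/share/java,/usr/share/confluent-hub-components,/opt/connectors"
  volumes:
    # Host directory with the connector JARs, one subdirectory per plugin
    - ./connectors:/opt/connectors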
Related: How to install connectors to the docker image of apache kafka connect
If your requirement is to stand up a Kafka Connect environment,
you can use the basic guide published by Confluent, "Build Your Own Apache Kafka® Demos".
Basically, you need to execute the following instructions:
git clone https://github.com/confluentinc/cp-all-in-one.git
cd cp-all-in-one/cp-all-in-one
git checkout 7.1.1-post
docker-compose up -d
This has Control Center at http://localhost:9021
If you need to install a connector, you can go to https://www.confluent.io/hub and select your specific connector.
Then, you can create your own Docker image for a specific Kafka Connect server.
1.- Write a Dockerfile.
vim Dockerfile
2.- Add connector "example JDBC" from Confluent Hub.
FROM confluentinc/cp-kafka-connect
ENV MYSQL_DRIVER_VERSION 5.1.39
RUN confluent-hub install --no-prompt confluentinc/kafka-connect-jdbc:10.5.0
RUN curl -k -SL "https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-${MYSQL_DRIVER_VERSION}.tar.gz" \
| tar -xzf - -C /usr/share/confluent-hub-components/confluentinc-kafka-connect-jdbc/lib \
--strip-components=1 mysql-connector-java-${MYSQL_DRIVER_VERSION}/mysql-connector-java-${MYSQL_DRIVER_VERSION}-bin.jar
3.- Build the docker image.
docker build . -t my-kafka-connect-jdbc:1.0.0
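Optionally, you can verify the connector landed in the image before wiring it into Compose; confluent-hub installs into /usr/share/confluent-hub-components in this image:

# List the connector plugins baked into the new image
docker run --rm my-kafka-connect-jdbc:1.0.0 ls /usr/share/confluent-hub-components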
4.- Then, you can edit your docker-compose.yml and change line 57
from:
image: cnfldemos/cp-server-connect-datagen:0.5.3-7.1.0
to:
image: my-kafka-connect-jdbc:1.0.0
5.- Finally, stop and start your Confluent Platform local environment:
docker-compose down
docker-compose up
Verify your Docker containers are running:
docker ps
Test your Connect server:
curl --location --request GET 'http://localhost:8083/connectors'
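If that call returns an empty list [], the worker is up but no connectors are registered yet. You can also confirm the JDBC plugin was picked up:

# Lists every connector plugin class the worker discovered on its plugin path
curl --location --request GET 'http://localhost:8083/connector-plugins'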
I have an Ubuntu machine, where I followed these steps in order to run Confluent Platform with Docker:
https://docs.confluent.io/platform/current/quickstart/ce-docker-quickstart.html
I can produce and subscribe to messages just fine.
I'm trying to add a MongoDB Sink Connector, in order to sync data with a mongo database.
I've downloaded this zip file https://www.confluent.io/hub/hpgrahsl/kafka-connect-mongodb
I've edited the etc/MongoDbSinkConnector.properties file with the correct mongo endpoint
I've uploaded the zip to my Ubuntu machine
I've created a file Dockerfile with the following content
FROM confluentinc/cp-kafka-connect-base
COPY hpgrahsl-kafka-connect-mongodb-1.4.0.zip /tmp/hpgrahsl-kafka-connect-mongodb-1.4.0.zip
RUN confluent-hub install --no-prompt /tmp/hpgrahsl-kafka-connect-mongodb-1.4.0.zip
I've executed the following command: docker build . -t my-custom-image:1.0.0
Sending build context to Docker daemon 15.03MB
Step 1/3 : FROM confluentinc/cp-kafka-connect-base
---> 8fe065fffe44
Step 2/3 : COPY hpgrahsl-kafka-connect-mongodb-1.4.0.zip /tmp/hpgrahsl-kafka-connect-mongodb-1.4.0.zip
---> Using cache
---> ed2e4ec7ff97
Step 3/3 : RUN confluent-hub install --no-prompt /tmp/hpgrahsl-kafka-connect-mongodb-1.4.0.zip
---> Using cache
---> 034f82e2e136
Successfully built 034f82e2e136
Successfully tagged my-custom-image:1.0.0
Am I missing something? My Mongo database does not get updated.
Do I have to edit docker-compose.yml as well?
How do I debug this connector? Does it have logs?
When you run Kafka Connect under Docker (including with the cp-kafka-connect-base image), it is usually in distributed mode. To create a connector configuration in this mode you use a REST call; it won't load the configuration from a flat file (as in standalone mode).
You can either launch the container that you've created and then manually create the connector with a REST call, or you can automate that REST call - here's an example of doing it within a Docker Compose:
kafka-connect-01:
image: confluentinc/cp-kafka-connect-base:6.2.0
depends_on:
- kafka
ports:
- 8083:8083
environment:
CONNECT_BOOTSTRAP_SERVERS: "kafka:29092"
[…]
command:
- bash
- -c
- |
#
echo "Installing connector plugins"
confluent-hub install --no-prompt hpgrahsl/kafka-connect-mongodb:1.4.0
#
echo "Launching Kafka Connect worker"
/etc/confluent/docker/run &
#
echo "Waiting for Kafka Connect to start listening on localhost ⏳"
while : ; do
curl_status=$$(curl -s -o /dev/null -w %{http_code} http://localhost:8083/connectors)
echo -e $$(date) " Kafka Connect listener HTTP state: " $$curl_status " (waiting for 200)"
if [ $$curl_status -eq 200 ] ; then
break
fi
sleep 5
done
echo -e "\n--\n+> Creating connector"
curl -s -X PUT -H "Content-Type:application/json" http://localhost:8083/connectors/mongo-connector/config \
-d '{
[… connector JSON config goes here …]
}'
sleep infinity
References:
https://rmoff.net/2018/12/15/docker-tips-and-tricks-with-kafka-connect-ksqldb-and-kafka/
https://developer.confluent.io/learn-kafka/kafka-connect/docker/
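To the debugging question: the worker logs go to the container's stdout, and each connector exposes a status endpoint over REST. A sketch, where the connector name, connection URI, and container name are illustrative (check the hpgrahsl connector's own docs for the exact property names in your version):

# Illustrative registration of the MongoDB sink via the REST API
curl -s -X PUT -H "Content-Type:application/json" \
  http://localhost:8083/connectors/mongo-connector/config \
  -d '{
        "connector.class": "at.grahsl.kafka.connect.mongodb.MongoDbSinkConnector",
        "topics": "my-topic",
        "mongodb.connection.uri": "mongodb://mongo:27017/mydb"
      }'

# Follow the Connect worker's log output
docker logs -f kafka-connect-01

# Check the connector and its tasks; FAILED tasks include a stack trace in the "trace" field
curl -s http://localhost:8083/connectors/mongo-connector/status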
I set up Kafka and Zookeeper on my local machine and I would like to use Kafdrop as a UI. I tried running it with the docker command below:
docker run -d --rm -p 9000:9000 \
-e KAFKA_BROKERCONNECT=<localhost:9092> \
-e JVM_OPTS="-Xms32M -Xmx64M" \
-e SERVER_SERVLET_CONTEXTPATH="/" \
obsidiandynamics/kafdrop
and I get -bash: https://localhost:9092: No such file or directory
When I remove the KAFKA_BROKERCONNECT parameter, the application runs but I get the error below:
2020-07-22 09:39:29.108 WARN 1 [| kafdrop-admin] o.a.k.c.NetworkClient : [AdminClient clientId=kafdrop-admin] Connection to node -1 (localhost/127.0.0.1:9092) could not be established. Broker may not be available.
I did set the Kafka server's listener setting to this, but it did not help:
listeners=PLAINTEXT://localhost:9092
I found this similar issue on GitHub but couldn't understand most of the answers.
Kafka is not HTTP-based. You do not need a protocol scheme in the address you use to connect to Kafka, and the angle brackets should not be used.
You also cannot use localhost, as that refers to the Kafdrop container itself, not Kafka.
I suggest you use Docker Compose with Kafdrop and Kafka.
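As a sketch of a working docker run, assuming Docker Desktop, where host.docker.internal resolves to the host machine (on Linux you'd add an --add-host mapping or put both containers on one Compose network); the broker must also advertise a listener the container can reach:

docker run -d --rm -p 9000:9000 \
  -e KAFKA_BROKERCONNECT=host.docker.internal:9092 \
  -e JVM_OPTS="-Xms32M -Xmx64M" \
  -e SERVER_SERVLET_CONTEXTPATH="/" \
  obsidiandynamics/kafdrop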
I followed what @OneCricketeer said in his answer and everything worked perfectly.
Here is what I did:
I downloaded the compose file from the GitHub link above.
I ran the compose file by going to the directory where the file exists and running docker-compose up.
I stopped my local Kafka server and ZooKeeper, because everything needed is downloaded and started by the docker-compose command.
Then I went to http://localhost:9000/ and voilà.
Is there a way to automatically load (multiple) Kafka Connect connectors upon the start of Kafka Connect (e.g. in Confluent Platform)?
What I've found out so far:
The Confluent docs state to use the bin/connect-standalone command for standalone mode, with a properties file for the worker and for every single connector.
For Distributed Mode you have to run the connector via REST API.
https://docs.confluent.io/current/connect/userguide.html#standalone-mode, https://docs.confluent.io/current/connect/managing/configuring.html#standalone-example
Is there another method, e.g. including all connectors that should be run in the connect-[standalone|distributed].properties file (similar to providing a KSQL queries file in ksql-server.properties), so that they are loaded automatically upon the start of Kafka Connect (e.g. in Confluent Platform)?
Or are the connectors loaded "manually" as described above even in production environments?
Normally, you'd have to use the REST API when running Kafka Connect in distributed mode. However, you can use Docker Compose to script the creation of connectors.
Robin Moffatt has written a nice article about this:
kafka-connect:
image: confluentinc/cp-kafka-connect:5.1.2
environment:
CONNECT_REST_PORT: 18083
CONNECT_REST_ADVERTISED_HOST_NAME: "kafka-connect"
[…]
volumes:
- $PWD/scripts:/scripts
command:
- bash
- -c
- |
/etc/confluent/docker/run &
echo "Waiting for Kafka Connect to start listening on kafka-connect ⏳"
while [ $$(curl -s -o /dev/null -w %{http_code} http://kafka-connect:8083/connectors) -eq 000 ] ; do
echo -e $$(date) " Kafka Connect listener HTTP state: " $$(curl -s -o /dev/null -w %{http_code} http://kafka-connect:8083/connectors) " (waiting for 200)"
sleep 5
done
nc -vz kafka-connect 8083
echo -e "\n--\n+> Creating Kafka Connect Elasticsearch sink"
/scripts/create-es-sink.sh
sleep infinity
Notes:
In the command section, $ is replaced with $$ to avoid the error Invalid interpolation format for "command" option.
sleep infinity is necessary, because we've sent the /etc/confluent/docker/run process to the background (&), and so the container would exit if the main command finished.
The actual script to configure the connector is a curl call in a separate file. You could build this into the Docker Compose, but it feels a bit yucky.
You could combine both this and the technique above if you wanted to install a custom connector plugin before launching Kafka Connect, e.g.:
confluent-hub install --no-prompt confluentinc/kafka-connect-gcs:5.0.0
/etc/confluent/docker/run &
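For standalone mode, by contrast, every connector properties file passed on the command line after the worker config is loaded automatically at startup, so multiple connectors can be started in one go (the file names here are illustrative):

# Worker config first, then one properties file per connector to auto-load
bin/connect-standalone.sh config/connect-standalone.properties \
  config/my-source.properties \
  config/my-sink.properties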
While going through the official Apache quickstart page
https://kafka.apache.org/quickstart
A text file is created as
echo -e "foo\nbar" > test.txt
And to use Kafka Connect, the following command is used:
bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties
But when the above command gets executed, it shows a message that kafka-connect stopped.
Something else is using the same port that Kafka Connect wants to use.
You can use netstat -plnt to identify the other program (you'll need to run it as root if the process is owned by a different user).
If you want to get Kafka Connect to use a different port, edit config/connect-standalone.properties to add:
rest.port=18083
Where 18083 is an available port.
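To find what is already bound to Connect's default REST port (8083) before changing it, a quick check; sudo is needed if the process belongs to another user:

# Show the PID/program listening on port 8083
sudo netstat -plnt | grep 8083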