I'm trying to do a proof of concept using Kafka Connect with the RabbitMQ connector. I have two simple Spring Boot applications: a RabbitMQ producer and a Kafka consumer. The consumer can't handle the messages from the connector because my JSON message is being transformed somewhere along the way; RabbitMQ sends {"transaction": "PAYMENT", "amount": "$125.0"} and kafka-connect prints X{"transaction": "PAYMENT", "amount": "$125.0"}. Please note the X at the beginning. If I add a field, say "foo": "bar", then that letter becomes a t or some other character.
Dockerfile (connector):
FROM confluentinc/cp-kafka-connect-base:5.3.2
RUN confluent-hub install --no-prompt confluentinc/kafka-connect-rabbitmq:latest
Build the image with docker build . -t rabbit-connector so it can be referenced in the docker-compose file as rabbit-connector.
docker-compose.yml:
version: '2'

networks:
  kafka-connect-network:
    driver: bridge

services:
  zookeeper:
    image: confluentinc/cp-zookeeper:5.3.2
    networks:
      - kafka-connect-network
    ports:
      - '31000:31000'
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
      KAFKA_JMX_HOSTNAME: "localhost"
      KAFKA_JMX_PORT: 31000

  kafka:
    image: confluentinc/cp-enterprise-kafka:5.3.2
    networks:
      - kafka-connect-network
    ports:
      - '9092:9092'
      - '31001:31001'
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
      KAFKA_METRIC_REPORTERS: io.confluent.metrics.reporter.ConfluentMetricsReporter
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 100
      CONFLUENT_METRICS_REPORTER_BOOTSTRAP_SERVERS: kafka:29092
      CONFLUENT_METRICS_REPORTER_ZOOKEEPER_CONNECT: zookeeper:2181
      CONFLUENT_METRICS_REPORTER_TOPIC_REPLICAS: 1
      CONFLUENT_METRICS_ENABLE: 'false'
      CONFLUENT_SUPPORT_CUSTOMER_ID: 'anonymous'
      KAFKA_JMX_HOSTNAME: "localhost"
      KAFKA_JMX_PORT: 31001

  schema-registry:
    image: confluentinc/cp-schema-registry:5.3.2
    depends_on:
      - zookeeper
      - kafka
    networks:
      - kafka-connect-network
    ports:
      - '8081:8081'
      - '31002:31002'
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: zookeeper:2181
      SCHEMA_REGISTRY_JMX_HOSTNAME: "localhost"
      SCHEMA_REGISTRY_JMX_PORT: 31002

  rabbitmq:
    image: rabbitmq
    environment:
      RABBITMQ_DEFAULT_USER: guest
      RABBITMQ_DEFAULT_PASS: guest
      RABBITMQ_DEFAULT_VHOST: "/"
    networks:
      - kafka-connect-network
    ports:
      - '15672:15672'
      - '5672:5672'

  kafka-connect:
    image: rabbit-connector
    networks:
      - kafka-connect-network
    ports:
      - '8083:8083'
      - '31004:31004'
    environment:
      CONNECT_BOOTSTRAP_SERVERS: "kafka:29092"
      CONNECT_REST_PORT: 8083
      CONNECT_GROUP_ID: compose-connect-group
      CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
      CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
      CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
      CONNECT_KEY_CONVERTER: org.apache.kafka.connect.storage.StringConverter
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schema-registry:8081'
      CONNECT_INTERNAL_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_INTERNAL_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_REST_ADVERTISED_HOST_NAME: "kafka-connect"
      CONNECT_LOG4J_ROOT_LOGLEVEL: "ERROR"
      CONNECT_LOG4J_LOGGERS: "org.apache.kafka.connect.runtime.rest=WARN,org.reflections=ERROR"
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_PLUGIN_PATH: /usr/share/java,/usr/share/confluent-hub-components
      KAFKA_JMX_HOSTNAME: "localhost"
      KAFKA_JMX_PORT: 31004
    depends_on:
      - zookeeper
      - kafka
      - schema-registry
      - rabbitmq

  rest-proxy:
    image: confluentinc/cp-kafka-rest:5.3.2
    depends_on:
      - zookeeper
      - kafka
      - schema-registry
    networks:
      - kafka-connect-network
    ports:
      - '8082:8082'
      - '31005:31005'
    environment:
      KAFKA_REST_HOST_NAME: rest-proxy
      KAFKA_REST_BOOTSTRAP_SERVERS: 'kafka:29092'
      KAFKA_REST_LISTENERS: "http://0.0.0.0:8082"
      KAFKA_REST_SCHEMA_REGISTRY_URL: 'http://schema-registry:8081'
      KAFKAREST_JMX_HOSTNAME: "localhost"
      KAFKAREST_JMX_PORT: 31005
schema.avsc:
{
  "type": "record",
  "name": "CustomMessage",
  "namespace": "com.poc.model",
  "fields": [
    {
      "name": "transaction",
      "type": "string"
    },
    {
      "name": "amount",
      "type": "string"
    }
  ]
}
So here I am using the StringConverter for my key (which I don't care about, to be honest) and the AvroConverter for the value. Maybe I am missing something, or I'm misconfiguring my kafka-connect worker.
My connector configuration is (connector-config.json):
{
  "name": "rabbit_to_kafka_poc",
  "config": {
    "connector.class": "io.confluent.connect.rabbitmq.RabbitMQSourceConnector",
    "tasks.max": "1",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://schema-registry:8081",
    "kafka.topic": "spectrum-message",
    "rabbitmq.queue": "spectrum-queue",
    "rabbitmq.username": "guest",
    "rabbitmq.password": "guest",
    "rabbitmq.host": "rabbitmq",
    "rabbitmq.port": "5672",
    "rabbitmq.virtual.host": "/"
  }
}
To register my connector I run curl -i -X POST -H "Accept:application/json" -H "Content-Type:application/json" http://localhost:8083/connectors/ -d @connector-config.json.
Once I configure everything, I run the following command to print out my messages:
kafka-avro-console-consumer --bootstrap-server localhost:9092 \
--topic spectrum-message \
--from-beginning
And the JSON starts with a letter, so my question is: why is this happening? Something seems to be encoding my message, even though my RabbitMQ producer sends plain JSON. I can confirm this by testing with a RabbitMQ consumer and by debugging my application up to the point where the message is sent out.
You need to use the ByteArrayConverter. The connector just pulls bytes from RabbitMQ; it won't try to coerce them to a schema. Even if you serialise it to Avro, the schema is just a single bytes field:
$ curl -s -XGET localhost:8081/subjects/rabbit-test-avro-00-value/versions/1 | jq '.'
{
  "subject": "rabbit-test-avro-00-value",
  "version": 1,
  "id": 1,
  "schema": "\"bytes\""
}
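A minimal sketch of what the connector config from the question would look like with the ByteArrayConverter for the value (same topic and queue names as above; untested):
{
  "name": "rabbit_to_kafka_poc",
  "config": {
    "connector.class": "io.confluent.connect.rabbitmq.RabbitMQSourceConnector",
    "tasks.max": "1",
    "kafka.topic": "spectrum-message",
    "rabbitmq.queue": "spectrum-queue",
    "rabbitmq.host": "rabbitmq",
    "rabbitmq.port": "5672",
    "rabbitmq.username": "guest",
    "rabbitmq.password": "guest",
    "rabbitmq.virtual.host": "/",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter"
  }
}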
If you want to write it to a topic in Avro with a proper schema (which is a good idea), then use something like Kafka Streams or ksqlDB to do this, applying a stream processor to the source topic that Kafka Connect writes to with the ByteArrayConverter.
For example in ksqlDB you would do:
-- Inspect the topic - ksqlDB recognises the format as JSON
ksql> PRINT 'rabbit-test-00' FROM BEGINNING;
Format:JSON
{"ROWTIME":1578477403591,"ROWKEY":"null","transaction":"PAYMENT","amount":"$125.0"}
{"ROWTIME":1578477598555,"ROWKEY":"null","transaction":"PAYMENT","amount":"$125.0"}
-- Declare the schema
CREATE STREAM rabbit (transaction VARCHAR,
                      amount      VARCHAR)
  WITH (KAFKA_TOPIC='rabbit-test-00',
        VALUE_FORMAT='JSON');

-- Reserialise to Avro
CREATE STREAM TRANSACTIONS
  WITH (VALUE_FORMAT='AVRO',
        KAFKA_TOPIC='reserialised_data') AS
  SELECT *
  FROM rabbit
  EMIT CHANGES;
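As a quick check (assuming the statements above ran successfully), printing the new topic should now show ksqlDB detecting the value format as Avro:
ksql> PRINT 'reserialised_data' FROM BEGINNING;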
For more details, see this blog that I've written up.
You don't have JSON messages; you have Avro messages coming out of Kafka, because of the AvroConverter.
That letter is not actually a letter, but your terminal's UTF-8 rendering of the first 5 bytes of the binary data. This commonly happens when using the regular console consumer rather than kafka-avro-console-consumer, which knows how to parse the bytes in the topic as Avro data.
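Concretely, the Confluent Avro wire format prefixes every message with five bytes, which is what renders as that stray character:
byte 0     magic byte, always 0x00
bytes 1-4  schema ID as a big-endian 32-bit integer (e.g. 0x00000001 for schema ID 1)
bytes 5+   the Avro-encoded payload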
If you want JSON throughout, use the JsonConverter instead.
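A sketch of the worker settings for plain JSON with no embedded schema (these would replace the Avro converter settings in the compose file above):
CONNECT_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE: "false"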
Related
I am new to Kafka and am using Debezium to track changes in my Postgres table. Here is my docker-compose.yml:
version: '3.8'

volumes:
  shared-workspace:
    name: "hadoop-distributed-file-system"
    driver: local

services:
  postgres:
    restart: always
    image: debezium/postgres
    container_name: postgres
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_PASSWORD=mosip123
      - POSTGRES_DB=anonprofile
    # to activate WAL
    # command: postgres -c wal_level=logical -c archive_mode=on -c max_wal_senders=5
    volumes:
      - shared-workspace:/opt/workspace
      - ./PostgresDB:/docker-entrypoint-initdb.d/

  zookeeper:
    image: debezium/zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
      - "2888:2888"
      - "3888:3888"
    volumes:
      - shared-workspace:/opt/workspace

  kafka:
    image: debezium/kafka
    container_name: kafka
    ports:
      - "9092:9092"
      - "29092:29092"
    depends_on:
      - zookeeper
    environment:
      - ZOOKEEPER_CONNECT=zookeeper:2181
      - KAFKA_ADVERTISED_LISTENERS=LISTENER_EXT://localhost:29092,LISTENER_INT://kafka:9092
      - KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=LISTENER_INT:PLAINTEXT,LISTENER_EXT:PLAINTEXT
      - KAFKA_LISTENERS=LISTENER_INT://0.0.0.0:9092,LISTENER_EXT://0.0.0.0:29092
      - KAFKA_INTER_BROKER_LISTENER_NAME=LISTENER_INT
    volumes:
      - shared-workspace:/opt/workspace

  connect:
    image: debezium/connect
    container_name: connect
    ports:
      - "8083:8083"
    environment:
      - BOOTSTRAP_SERVERS=kafka:9092
      - GROUP_ID=1
      - CONFIG_STORAGE_TOPIC=my_connect_configs
      - OFFSET_STORAGE_TOPIC=my_connect_offsets
      - STATUS_STORAGE_TOPIC=my_connect_statuses
    depends_on:
      - zookeeper
      - kafka
    volumes:
      - shared-workspace:/opt/workspace
The shell script inside the postgres container. Note that the column datatype is JSON, in case that is the source of the error:
#!/bin/bash
apt-get update && apt-get install -y postgresql-13-pgoutput
psql -U postgres -d anonprofile <<-EOSQL
CREATE TABLE IF NOT EXISTS anon_profiles (id SERIAL PRIMARY KEY, profiledata JSON );
ALTER TABLE anon_profiles REPLICA IDENTITY USING INDEX anon_profiles_pkey;
ALTER SYSTEM SET wal_level to 'logical';
EOSQL
The connector JSON file:
{ "name": "anonprofile-connector",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"tasks.max": "1",
"database.hostname": "postgres",
"database.port": "5432",
"database.user": "user",
"database.password": "mosip123",
"database.dbname" : "anonprofile",
"database.server.name": "MOSIP",
"key.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"key.converter.schemas.enable": "false",
"value.converter.schemas.enable": "false",
"database.history.kafka.bootstrap.servers": "kafka:29092",
"database.history.kafka.topic": "schema-changes.anon_profiles",
"plugin.name": "pgoutput",
"publication.autocreate.mode": "all_tables",
"publication.name": "my_publication",
"snapshot.mode": "always"
}
}
After setting everything up I don't see any errors, but examining the topics list shows that no topic is being created for the above Postgres connection. Am I missing something?
Topics list
$ docker exec -it \
$(docker ps | grep kafka | awk '{ print $1 }') \
/kafka/bin/kafka-topics.sh \
--bootstrap-server localhost:9092 --list
__consumer_offsets
my_connect_configs
my_connect_offsets
my_connect_statuses
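Assuming the standard Kafka Connect REST API, the connector and task state (including the stack trace of any failed task) can be inspected with:
curl -s http://localhost:8083/connectors/anonprofile-connector/status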
The issue was with database.user: the user needs to be able to create replication slots. Setting it to postgres did the trick. Alternatively, grant the user the required permissions (I guess that would mean making it a superuser).
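For the permissions route, a minimal sketch of the grant (assuming a role named myuser) would be:
-- Debezium's Postgres connector needs a role that can log in and create replication slots
ALTER ROLE myuser WITH REPLICATION LOGIN;
-- or, more bluntly, make it a superuser as suggested above:
-- ALTER ROLE myuser WITH SUPERUSER;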
I'm following a similar example as in this blog post:
https://rmoff.net/2019/11/12/running-dockerised-kafka-connect-worker-on-gcp/
except that I'm running the Kafka Connect worker locally rather than on GCP.
Everything seems fine: I run docker-compose up and Kafka Connect starts, but when I try to create an instance of the source connector via curl I get the following ambiguous message (note: literally nothing is output in the Kafka Connect logs):
{"error_code":400,"message":"Connector configuration is invalid and contains the following 1 error(s):\nUnable to connect to the server.\nYou can also find the above list of errors at the endpoint `/{connectorType}/config/validate`"}
I know I can connect to Confluent Cloud because I see the topics being created:
docker-connect-configs
docker-connect-offsets
docker-connect-status
My docker-compose.yml looks like this:
---
version: '2'

services:
  kafka-connect-01:
    image: confluentinc/cp-kafka-connect:5.4.0
    container_name: kafka-connect-01
    restart: always
    depends_on:
      # - zookeeper
      # - kafka
      - schema-registry
    ports:
      - 8083:8083
    environment:
      CONNECT_LOG4J_APPENDER_STDOUT_LAYOUT_CONVERSIONPATTERN: "[%d] %p %X{connector.context}%m (%c:%L)%n"
      CONNECT_BOOTSTRAP_SERVERS: "my-server-name.confluent.cloud:9092"
      CONNECT_REST_PORT: 8083
      CONNECT_REST_ADVERTISED_HOST_NAME: "kafka-connect-01"
      CONNECT_GROUP_ID: compose-connect-group
      CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
      CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
      CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
      #CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: 'http://my-server-name.confluent.cloud:8081'
      #CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://my-server-name.confluent.cloud:8081'
      CONNECT_INTERNAL_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_INTERNAL_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_LOG4J_ROOT_LOGLEVEL: "INFO"
      CONNECT_LOG4J_LOGGERS: "org.apache.kafka.connect.runtime.rest=WARN,org.reflections=ERROR"
      CONNECT_REPLICATION_FACTOR: "3"
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: "3"
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: "3"
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: "3"
      CONNECT_PLUGIN_PATH: '/usr/share/java'
      CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      # ENV VARS FOR CCLOUD CONNECTION
      CONNECT_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM: "https"
      CONNECT_SASL_MECHANISM: PLAIN
      CONNECT_SECURITY_PROTOCOL: SASL_SSL
      CONNECT_SASL_JAAS_CONFIG: "${SASL_JAAS_CONFIG}"
      CONNECT_CONSUMER_SECURITY_PROTOCOL: SASL_SSL
      CONNECT_CONSUMER_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM: https
      CONNECT_CONSUMER_SASL_MECHANISM: PLAIN
      CONNECT_CONSUMER_SASL_JAAS_CONFIG: "${SASL_JAAS_CONFIG}"
      CONNECT_PRODUCER_SECURITY_PROTOCOL: SASL_SSL
      CONNECT_PRODUCER_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM: https
      CONNECT_PRODUCER_SASL_MECHANISM: PLAIN
      CONNECT_PRODUCER_SASL_JAAS_CONFIG: "${SASL_JAAS_CONFIG}"
    volumes:
      - db-leach:/db-leach/
      - $PWD/connectors:/usr/share/java/kafka-connect-jdbc/jars/
    command:
      - /bin/bash
      - -c
I have dockerized mongo instances running and I want to create a mongo source connector. This is my curl request:
curl -X PUT http://localhost:8083/connectors/my-mongo-source-connector/config -H "Content-Type: application/json" -d '{
    "tasks.max": "1",
    "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
    "connection.uri": "mongodb://mongo1:27017,mongo2:27017,mongo3:27017",
    "topic.prefix": "topic.prefix",
    "topic.suffix": "mySuffix",
    "database": "myMongoDB",
    "collection": "myMongoCollection",
    "copy.existing": "true",
    "output.format.key": "json",
    "output.format.value": "json",
    "change.stream.full.document": "updateLookup",
    "publish.full.document.only": "false",
    "confluent.topic.bootstrap.servers": "'${CCLOUD_BROKER_HOST}':9092",
    "confluent.topic.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"'${CCLOUD_API_KEY}'\" password=\"'${CCLOUD_API_SECRET}'\";",
    "confluent.topic.security.protocol": "SASL_SSL",
    "confluent.topic.ssl.endpoint.identification.algorithm": "https",
    "confluent.topic.sasl.mechanism": "PLAIN"
}';
What am I missing?
I managed to get it to work; the configuration above is correct.
The message "Unable to connect to the server" appeared because I had wrongly deployed my mongo instance, so it's not related to kafka-connect or Confluent Cloud.
I'm going to leave this question as an example in case somebody struggles with this in the future. It took me a while to figure out how to configure docker-compose for a kafka-connect worker that connects to Confluent Cloud.
What I'm essentially trying to do is run multiple Kafka Connect instances with Docker Compose and have ksqlDB use this cluster. For now they all run on a single machine, but eventually I want to deploy this to a multi-node environment. My problem is that ksqlDB apparently can't find the Kafka Connect cluster. There is KSQL_KSQL_CONNECT_URL, which holds the URL of a single Kafka Connect instance; not providing this variable results in the default value, localhost:8083.
I found this docker-compose file, which I think does what I want to do: ksqlDB and multiple Kafka Connect instances. Unfortunately, it didn't help me much, since it uses an old version of KSQL Server. Here is my docker-compose file:
---
version: '3'

services:
  ksqldb-server-connect-test:
    image: confluentinc/ksqldb-server:0.15.0
    hostname: ksqldb-server-connect-test
    container_name: ksqldb-server-connect-test
    #ports:
    #  - "8088:8088"
    network_mode: "host"
    environment:
      KSQL_KSQL_SERVICE_ID: "default_"
      KSQL_LISTENERS: http://0.0.0.0:8088
      KSQL_BOOTSTRAP_SERVERS: localhost:9092
      KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true"
      KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true"
      KSQL_KSQL_SCHEMA_REGISTRY_URL: http://localhost:8081
      #KSQL_KSQL_CONNECT_URL: http://localhost:8083

  ksqldb-cli-connect-test:
    image: confluentinc/ksqldb-cli:0.15.0
    container_name: ksqldb-cli-connect-test
    network_mode: "host"
    depends_on:
      - ksqldb-server-connect-test
    entrypoint: /bin/sh
    tty: true

  schema-registry-connect-test:
    image: confluentinc/cp-schema-registry:6.0.1
    container_name: schema-registry-connect-test
    network_mode: "host"
    #ports:
    #  - "8081:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: localhost:9092
    restart: always

  kafka-connect-1:
    image: confluentinc/cp-kafka-connect-base:6.0.1
    container_name: kafka-connect-1
    network_mode: "host"
    environment:
      CONNECT_BOOTSTRAP_SERVERS: "localhost:9092"
      CONNECT_REST_PORT: 8082
      CONNECT_GROUP_ID: kafka-connect-test
      CONNECT_CONFIG_STORAGE_TOPIC: _connect-configs-test
      CONNECT_OFFSET_STORAGE_TOPIC: _connect-offsets-test
      CONNECT_STATUS_STORAGE_TOPIC: _connect-status-test
      CONNECT_KEY_CONVERTER: org.apache.kafka.connect.storage.StringConverter
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://localhost:8081'
      CONNECT_REST_ADVERTISED_HOST_NAME: "localhost"
      CONNECT_LOG4J_APPENDER_STDOUT_LAYOUT_CONVERSIONPATTERN: "[%d] %p %X{connector.context}%m (%c:%L)%n"
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_OFFSET_STORAGE_PARTITIONS: "25"
      CONNECT_STATUS_STORAGE_PARTITIONS: "5"
      CONNECT_PLUGIN_PATH: /usr/share/java,/usr/share/confluent-hub-components,/data/connect-jars
    volumes:
      - $PWD/data/connect-jars/:/usr/share/java/kafka-connect-jdbc/jars/
      - $PWD/jmx:/usr/app/

  kafka-connect-2:
    image: confluentinc/cp-kafka-connect-base:6.0.1
    container_name: kafka-connect-2
    network_mode: "host"
    environment:
      CONNECT_BOOTSTRAP_SERVERS: "localhost:9092"
      CONNECT_REST_PORT: 8084
      CONNECT_GROUP_ID: kafka-connect-test
      CONNECT_CONFIG_STORAGE_TOPIC: _connect-configs-test
      CONNECT_OFFSET_STORAGE_TOPIC: _connect-offsets-test
      CONNECT_STATUS_STORAGE_TOPIC: _connect-status-test
      CONNECT_KEY_CONVERTER: org.apache.kafka.connect.storage.StringConverter
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://localhost:8081'
      CONNECT_REST_ADVERTISED_HOST_NAME: "localhost"
      CONNECT_LOG4J_APPENDER_STDOUT_LAYOUT_CONVERSIONPATTERN: "[%d] %p %X{connector.context}%m (%c:%L)%n"
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_OFFSET_STORAGE_PARTITIONS: "25"
      CONNECT_STATUS_STORAGE_PARTITIONS: "5"
      CONNECT_PLUGIN_PATH: /usr/share/java,/usr/share/confluent-hub-components,/data/connect-jars
    volumes:
      - $PWD/data/connect-jars/:/usr/share/java/kafka-connect-jdbc/jars/
      - $PWD/jmx:/usr/app/
Note that I use network_mode: "host" because the Kafka cluster itself does not run in a Docker container, which eases communication with Kafka in my case.
Does anybody have an idea or a solution for getting ksqlDB connected to a Kafka Connect cluster using only docker-compose? What I need to achieve is fault tolerance.
OK, so what you need is more than one Kafka Connect worker within a single Kafka Connect group. That is what you've got with your configuration of the same storage topics and group.id 👍
So the question is how to get ksqlDB to connect to a cluster of Kafka Connect workers. Since Kafka Connect uses Kafka itself to hold its configuration, it doesn't matter which worker ksqlDB connects to. ksql.connect.url (and thus the KSQL_KSQL_CONNECT_URL environment variable in Docker) is the correct way to set this, but it's not clear from the docs whether you can specify multiple values.
If you can't, then I'm guessing you'd need to put a stateless load balancer in front of the workers and point ksqlDB at that.
Also, the hostname is going to be the name of the container (kafka-connect-1 / kafka-connect-2), not localhost.
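If multiple values turn out not to be supported, a hypothetical sketch of the load-balancer route (the service name, port, and nginx config below are assumptions, not a tested setup) would be an extra service in the compose file:
# connect-lb.conf is assumed to contain something like:
#   upstream connect { server localhost:8082; server localhost:8084; }
#   server { listen 8090; location / { proxy_pass http://connect; } }
connect-lb:
  image: nginx:alpine
  network_mode: "host"
  volumes:
    - $PWD/connect-lb.conf:/etc/nginx/conf.d/default.conf
ksqlDB would then point at the single balanced endpoint, e.g. KSQL_KSQL_CONNECT_URL: "http://localhost:8090".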
QUESTION:
How do I configure "securityMechanism=9, encryptionAlgorithm=2" for a db2 database connection in my docker-compose file?
NOTE: When running my local Kafka installation (kafka_2.13-2.6.0) to connect to a DB2 database on the network, I only had to modify the bin/connect-standalone.sh file by changing the existing "EXTRA_ARGS=" line like this:
(...)
EXTRA_ARGS=${EXTRA_ARGS-'-name connectStandalone -Ddb2.jcc.securityMechanism=9 -Ddb2.jcc.encryptionAlgorithm=2'}
(...)
It worked fine.
However, when I tried using the same idea for a containerized kafka/broker "service" (docker-compose.yml) by mounting a volume with the modified "connect-standalone" file (to replace the /usr/bin/connect-standalone file in the container), it did not work. I did verify that the container's file was changed.
...I receive this exception when I attempt to use a kafka-jdbc-source-connector to connect to the database:
Caused by: com.ibm.db2.jcc.am.SqlInvalidAuthorizationSpecException: [jcc][t4][201][11237][4.25.13] Connection authorization failure occurred.
Reason: Security mechanism not supported. ERRORCODE=-4214, SQLSTATE=28000
So, again, how do I configure the securityMechanism/encryptionAlgorithm setting in a docker-compose.yml?
Thx for any help
-sairn
Here is the docker-compose.yml; you can see I've tried mounting a volume with the modified "connect-standalone" file in both the broker (kafka) service and the kafka-connect service... neither achieved the desired effect:
version: '3.8'

services:
  zookeeper:
    image: confluentinc/cp-zookeeper:6.0.0
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000

  kafka:
    image: confluentinc/cp-enterprise-kafka:6.0.0
    container_name: kafka
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://kafka:9092
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
      KAFKA_METRIC_REPORTERS: io.confluent.metrics.reporter.ConfluentMetricsReporter
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 100
      CONFLUENT_METRICS_REPORTER_BOOTSTRAP_SERVERS: kafka:29092
      CONFLUENT_METRICS_REPORTER_ZOOKEEPER_CONNECT: zookeeper:2181
      CONFLUENT_METRICS_REPORTER_TOPIC_REPLICAS: 1
      CONFLUENT_METRICS_ENABLE: 'true'
      CONFLUENT_SUPPORT_CUSTOMER_ID: 'anonymous'
      JVM_OPTS: "-Ddb2.jcc.securityMechanism=9 -Ddb2.jcc.encryptionAlgorithm=2"
    volumes:
      - ./connect-standalone:/usr/bin/connect-standalone

  schema-registry:
    image: confluentinc/cp-schema-registry:6.0.0
    container_name: schema-registry
    hostname: schema-registry
    depends_on:
      - zookeeper
      - kafka
    ports:
      - "8081:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: 'zookeeper:2181'
      SCHEMA_REGISTRY_LISTENERS: http://schema-registry:8081

  kafka-connect:
    image: confluentinc/cp-kafka-connect:6.0.0
    container_name: kafka-connect
    hostname: kafka-connect
    depends_on:
      - kafka
      - schema-registry
    ports:
      - "8083:8083"
    environment:
      CONNECT_BOOTSTRAP_SERVERS: "kafka:29092"
      CONNECT_REST_ADVERTISED_HOST_NAME: "kafka-connect"
      CONNECT_REST_PORT: 8083
      CONNECT_GROUP_ID: kafka-connect
      CONNECT_CONFIG_STORAGE_TOPIC: kafka-connect-configs
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_FLUSH_INTERVAL_MS: 10000
      CONNECT_OFFSET_STORAGE_TOPIC: kafka-connect-offsets
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_STATUS_STORAGE_TOPIC: kafka-connect-status
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_KEY_CONVERTER: org.apache.kafka.connect.storage.StringConverter
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      CONNECT_INTERNAL_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_INTERNAL_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      CONNECT_PLUGIN_PATH: "/usr/share/java,/usr/share/confluent-hub-components"
      CONNECT_LOG4J_LOGGERS: org.apache.zookeeper=ERROR,org.I0Itec.zkclient=ERROR,org.reflections=ERROR
      JVM_OPTS: "-Ddb2.jcc.securityMechanism=9 -Ddb2.jcc.encryptionAlgorithm=2"
    volumes:
      - ./kafka-connect-jdbc-10.0.1.jar:/usr/share/java/kafka-connect-jdbc/kafka-connect-jdbc-10.0.1.jar
      - ./db2jcc-db2jcc4.jar:/usr/share/java/kafka-connect-jdbc/db2jcc-db2jcc4.jar
      - ./connect-standalone:/usr/bin/connect-standalone
FWIW, the connector looks similar to this:
curl -X POST http://localhost:8083/connectors -H "Content-Type: application/json" -d '{
    "name": "CONNECTOR01",
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:db2://THEDBURL:50000/XXXXX",
        "connection.user": "myuserid",
        "connection.password": "mypassword",
        "poll.interval.ms": "15000",
        "table.whitelist": "YYYYY.TABLEA",
        "topic.prefix": "tbl-",
        "mode": "timestamp",
        "timestamp.initial": "-1",
        "timestamp.column.name": "TIME_UPD"
    }
}'
Try to use KAFKA_OPTS instead of JVM_OPTS
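That is, something like this in the kafka-connect service; KAFKA_OPTS is appended to the JVM command line by kafka-run-class.sh, whereas JVM_OPTS is not one of the variables the stock Kafka scripts read:
kafka-connect:
  environment:
    # ...existing CONNECT_* settings as above...
    KAFKA_OPTS: "-Ddb2.jcc.securityMechanism=9 -Ddb2.jcc.encryptionAlgorithm=2"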
I have the following docker-compose.yml configuration for my Debezium connector:
version: '3.1'

services:
  db:
    image: mysql
    network_mode: host
    environment:
      MYSQL_ROOT_PASSWORD: 123456
    volumes:
      - ./mysql.cnf:/etc/mysql/conf.d/mysql.cnf
    ports:
      - 3306:3306
    container_name: mydb

  zookeeper:
    image: confluentinc/cp-zookeeper
    network_mode: host
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
    container_name: myzookeeper

  kafka:
    image: confluentinc/cp-kafka
    network_mode: host
    depends_on:
      - zookeeper
      - db
    ports:
      - "9092:9092"
    environment:
      KAFKA_ZOOKEEPER_CONNECT: localhost:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_BROKER_ID: 1
      KAFKA_MIN_INSYNC_REPLICAS: 1
    container_name: mykafka

  connector:
    image: debezium/connect:0.10
    network_mode: host
    ports:
      - "8083:8083"
    environment:
      GROUP_ID: 1
      CONFIG_STORAGE_TOPIC: my_connect_configs
      OFFSET_STORAGE_TOPIC: my_connect_offsets
      BOOTSTRAP_SERVERS: localhost:9092
      HOST_NAME: localhost
    depends_on:
      - zookeeper
      - db
      - kafka
    container_name: myconnector
When I try to create a connector, I receive the following error:
curl -i -X POST -H "Accept:application/json" -H "Content-Type:application/json" localhost:8083/connectors/ -d '{
  "name": "mystore-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "tasks.max": "1",
    "database.hostname": "localhost",
    "database.port": "3306",
    "database.user": "morteza",
    "database.password": "morteza_password",
    "database.server.id": "223344",
    "database.server.name": "dbserver1",
    "database.whitelist": "mystore",
    "database.history.kafka.bootstrap.servers": "localhost:9092",
    "database.history.kafka.topic": "dbhistory.mystore"
  }
}'
HTTP/1.1 400 Bad Request
Date: Fri, 08 Mar 2019 09:48:34 GMT
Content-Type: application/json
Content-Length: 255
Server: Jetty(9.4.12.v20180830)
{"error_code":400,"message":"Connector configuration is invalid and contains the following 1 error(s):\nUnable to connect: Public Key Retrieval is not allowed\nYou can also find the above list of errors at the endpoint `/{connectorType}/config/validate`"}
And the Debezium connector shows this error:
The connection password is empty [io.debezium.connector.mysql.MySqlConnector]
myconnector | 2019-03-08 09:38:26,755 INFO || Failed testing connection for jdbc:mysql://localhost:3306/?useInformationSchema=true&nullCatalogMeansCurrent=false&useSSL=false&useUnicode=true&characterEncoding=UTF-8&characterSetResults=UTF-8&zeroDateTimeBehavior=CONVERT_TO_NULL with user 'morteza' [io.debezium.connector.mysql.MySqlConnector]
I have set the value of the database.password field, as you can see, but I still get an error about the database password.
As Jiri Pechanec suggested, I changed image: mysql to image: mysql:5.7 in docker-compose.yml file and now it works fine.
This error (Public Key Retrieval is not allowed) can come about in Debezium 0.9 when the user in the source MySQL database has not been granted the required privileges. In Debezium 0.8 the error was Unable to connect: Communications link failure.
I wrote a quick blog about this here because I encountered this issue and the error messages are kinda non-obvious :)
tl;dr: ALTER USER 'debezium'@'%' IDENTIFIED WITH mysql_native_password BY 'dbz';
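If changing the user's authentication plugin isn't an option, another possibility (untested here) is to let the driver retrieve the key: Debezium passes unrecognised database.* properties through to the JDBC driver, so adding this to the connector config may also work:
"database.allowPublicKeyRetrieval": "true"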