ksqldb not failing over to standby schema registry - apache-kafka

I am trying to test a failover scenario for the Kafka Schema Registry.
I spun up two Schema Registry Docker containers (primary and standby), and I have a ksqlDB server running in a Docker container pointing to the primary schema registry. A source Kafka connector streams data from the database to Kafka topics. The ksqlDB server is able to validate the schema of the Kafka messages using the primary schema registry. Now I shut down the primary schema registry. The ksqlDB server does not fail over to the standby schema registry to validate the schema, so the ksqlDB server stops receiving data from the Kafka topics.
How should the ksqlDB server know which standby schema registry it needs to connect to when the primary is down?
Below is the docker-compose.yml file that I have used:
schema-registry:
  image: confluentinc/cp-schema-registry:${CP_VERSION}
  depends_on:
    - zookeeper
    - kafka
  ports:
    - "8081:8081"
  container_name: schema-registry
  environment:
    SCHEMA_REGISTRY_HOST_NAME: schema-registry
    SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: PLAINTEXT://kafka:9092
    SCHEMA_REGISTRY_ACCESS_CONTROL_ALLOW_ORIGIN: '*'
    SCHEMA_REGISTRY_ACCESS_CONTROL_ALLOW_METHODS: 'GET,POST,PUT,OPTIONS'
    SCHEMA_REGISTRY_LEADER_ELIGIBILITY: "true"
    SCHEMA_REGISTRY_GROUP_ID: "schema-registry-group"
    SCHEMA_REGISTRY_LISTENERS: http://0.0.0.0:8081
schema-registry-2:
  image: confluentinc/cp-schema-registry:${CP_VERSION}
  depends_on:
    - kafka
    - schema-registry
  ports:
    - "8082:8082"
  container_name: schema-registry-2
  environment:
    SCHEMA_REGISTRY_HOST_NAME: schema-registry-2
    SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: PLAINTEXT://kafka:9092
    SCHEMA_REGISTRY_ACCESS_CONTROL_ALLOW_ORIGIN: '*'
    SCHEMA_REGISTRY_ACCESS_CONTROL_ALLOW_METHODS: 'GET,POST,PUT,OPTIONS'
    SCHEMA_REGISTRY_LEADER_ELIGIBILITY: "true"
    SCHEMA_REGISTRY_GROUP_ID: "schema-registry-group"
    SCHEMA_REGISTRY_LISTENERS: http://0.0.0.0:8082
primary-ksqldb-server:
  image: ${KSQL_IMAGE_BASE}confluentinc/ksqldb-server:${KSQL_VERSION}
  hostname: primary-ksqldb-server
  container_name: primary-ksqldb-server
  depends_on:
    - kafka
    - schema-registry
  ports:
    - "8088:8088"
  environment:
    KSQL_CONFIG_DIR: "/etc/ksql"
    KSQL_LISTENERS: http://0.0.0.0:8088
    KSQL_BOOTSTRAP_SERVERS: kafka:9092
    KSQL_KSQL_ADVERTISED_LISTENER: http://localhost:8088
    KSQL_KSQL_SCHEMA_REGISTRY_URL: http://schema-registry:8081
    KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true"
    KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true"
    KSQL_KSQL_EXTENSION_DIR: "/usr/ksqldb/ext/"
    KSQL_KSQL_SERVICE_ID: "nrt_"
    KSQL_KSQL_STREAMS_NUM_STANDBY_REPLICAS: 1
    KSQL_KSQL_QUERY_PULL_ENABLE_STANDBY_READS: "true"
    KSQL_KSQL_HEARTBEAT_ENABLE: "true"
    KSQL_KSQL_LAG_REPORTING_ENABLE: "true"
    KSQL_KSQL_QUERY_PULL_MAX_ALLOWED_OFFSET_LAG: 100
    KSQL_LOG4J_APPENDER_KAFKA_APPENDER: "org.apache.kafka.log4jappender.KafkaLog4jAppender"
    KSQL_LOG4J_APPENDER_KAFKA_APPENDER_LAYOUT: "io.confluent.common.logging.log4j.StructuredJsonLayout"
    KSQL_LOG4J_APPENDER_KAFKA_APPENDER_BROKERLIST: localhost:9092
    KSQL_LOG4J_APPENDER_KAFKA_APPENDER_TOPIC: KSQL_LOG
    KSQL_LOG4J_LOGGER_IO_CONFLUENT_KSQL: INFO,kafka_appender
    KSQL_KSQL_QUERY_PULL_METRICS_ENABLED: "true"
    KSQL_JMX_OPTS: >
      -Djava.rmi.server.hostname=localhost
      -Dcom.sun.management.jmxremote
      -Dcom.sun.management.jmxremote.port=1099
      -Dcom.sun.management.jmxremote.authenticate=false
      -Dcom.sun.management.jmxremote.ssl=false
      -Dcom.sun.management.jmxremote.rmi.port=1099
When I stop the primary schema registry, ksqlDB is supposed to connect to the standby schema registry.

How would it know the other is available if you don't provide it?
KSQL_KSQL_SCHEMA_REGISTRY_URL: http://schema-registry:8081,http://schema-registry-2:8082
In other words, you shut down the schema-registry container, so it will simply not respond. It will not forward requests or redirect clients to another server... So, you need to provide a URL list, or you need to set up an external reverse proxy to round-robin the requests to the active instance.
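As a quick sanity check before and after testing failover (a sketch assuming the port mappings and container names above; /subjects is part of the Schema Registry REST API):

# both registries should answer and agree on the registered subjects
curl http://localhost:8081/subjects
curl http://localhost:8082/subjects
# stop the primary and confirm the standby still answers;
# ksqlDB can then keep working if it was given the URL list above
docker stop schema-registry
curl http://localhost:8082/subjects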

Related

How to set up local Kafka to validate schema?

I read that schema validation is available only in Kafka Confluent Server. But maybe someone has found a solution since then? Does anyone know how I can test message validity when running Kafka locally?
How can I force the broker to validate producers' input?
Kafka doesn't do this. Confluent Server can, yes, but it is enterprise licensed. There is no alternative to server-side validation without forking Kafka like Confluent did. Otherwise, you would need to write your own Serializer class, but that won't scale to every producer client you may use.
You can still use Docker as the other answer shows, and you can use the confluentinc/cp-server image; then you simply add an environment variable to enable schema validation.
https://docs.confluent.io/platform/current/schema-registry/schema-validation.html
Otherwise, simply using JsonSchemaSerializer, for example, will validate that each record adheres to a schema (client-side validation), but it won't stop anyone else from sending garbage data into the topic (which is what server-side validation prevents).
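As a rough sketch of that cp-server option (not tested here; the KAFKA_CONFLUENT_SCHEMA_REGISTRY_URL variable name follows the usual KAFKA_ + upper-cased-property convention of the Confluent images, so verify it against the docs linked above), the broker needs to know the registry URL and each topic has to opt in:

broker:
  image: confluentinc/cp-server:7.1.0   # enterprise-licensed image, not cp-kafka
  environment:
    # ...same listener/ZooKeeper settings as the cp-kafka example below...
    KAFKA_CONFLUENT_SCHEMA_REGISTRY_URL: http://schema-registry:8081

Then validation is enabled per topic (the topic name here is just a placeholder):

kafka-topics --bootstrap-server localhost:29092 --create --topic orders \
  --config confluent.value.schema.validation=true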
If you know a little bit of docker-compose, you can deploy a Kafka ecosystem for testing purposes.
---
version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.1.0
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  broker:
    image: confluentinc/cp-kafka:7.1.0
    hostname: broker
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - "29092:29092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
  schema-registry:
    image: confluentinc/cp-schema-registry:7.1.0
    hostname: schema-registry
    container_name: schema-registry
    depends_on:
      - broker
    ports:
      - "8081:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: 'broker:9092'
I got the docker-compose file from https://github.com/confluentinc/kafka-tutorials/blob/master/_includes/tutorials/aggregating-sum/ksql/code/docker-compose.yml, but for your testing the ksqlDB components are not needed and you can delete those.
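For a quick client-side validation test against this stack (a sketch; the topic name and schema are placeholders), the Avro console producer that ships in the cp-schema-registry image will reject records that don't match the given schema:

docker exec -it schema-registry kafka-avro-console-producer \
  --bootstrap-server broker:9092 \
  --topic test-avro \
  --property schema.registry.url=http://localhost:8081 \
  --property value.schema='{"type":"record","name":"Test","fields":[{"name":"id","type":"int"}]}'
# typing {"id": 1} is accepted, while {"id": "oops"} fails serialization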

Failed to collect cluster Default info java.lang.IllegalStateException: Error while creating AdminClient for Cluster Default

I would like to use network_mode: bridge for Kafka so that I can reach Kafka through localhost:9092 from another service.
I'm trying to use provectus/kafka-ui, but when I open the consumers menu I get the following error.
My docker-compose.yml file:
kafka-ui:
  container_name: kafka-ui
  image: provectuslabs/kafka-ui:latest
  ports:
    - 8080:8080
  depends_on:
    - kafka
  environment:
    KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: kafka:9092
    KAFKA_CLUSTERS_0_JMXPORT: 9997
kafka:
  image: johnnypark/kafka-zookeeper
  ports:
    - "2181:2181"
    - "9092:9092"
  network_mode: bridge
  environment:
    ADVERTISED_HOST: 127.0.0.1
    NUM_PARTITIONS: 1
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
log error:
2022-01-13 09:16:50,014 ERROR [parallel-5] c.p.k.u.s.MetricsService: Failed to collect cluster Default info
java.lang.IllegalStateException: Error while creating AdminClient for Cluster Default
provectus/kafka-ui
I was using the johnnypark/kafka-zookeeper image for both Kafka and ZooKeeper. I was able to solve this problem by using two separate images, as in the example below:
zookeeper1:
  image: confluentinc/cp-zookeeper:5.2.4
  environment:
    ZOOKEEPER_CLIENT_PORT: 2181
    ZOOKEEPER_TICK_TIME: 2000
kafka1:
  image: confluentinc/cp-kafka:5.3.1
  depends_on:
    - zookeeper1
  ports:
    - 9093:9093
    - 9998:9998
  environment:
    KAFKA_BROKER_ID: 1
    KAFKA_ZOOKEEPER_CONNECT: zookeeper1:2181
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka1:29092,PLAINTEXT_HOST://localhost:9093
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
    KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
    KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    JMX_PORT: 9998
    KAFKA_JMX_OPTS: -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=kafka1 -Dcom.sun.management.jmxremote.rmi.port=9998
being able to reach kafka through localhost:9092 from another service
You can't use localhost to reach Kafka since that would be the Kafka UI container itself.
Changing ADVERTISED_HOST to kafka and using kafka:9092 from other containers is correct for a bridge network. However, this has the side effect of preventing any access to Kafka from outside the Docker network, such as clients running directly on the host machine.
Internal and external clients can be configured separately; see bitnami/bitnami-docker-kafka.
Here's an example using Bitnami's Kafka image. It allows host clients to connect on port 9093 while kafka-ui connects on the default port 9092.
version: "3"
services:
  zookeeper:
    image: 'bitnami/zookeeper:latest'
    ports:
      - '2181:2181'
    environment:
      - ALLOW_ANONYMOUS_LOGIN=yes
  kafka:
    image: 'bitnami/kafka:latest'
    ports:
      - '9092:9092'
      - '9093:9093'
    environment:
      - KAFKA_BROKER_ID=1
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CLIENT:PLAINTEXT,EXTERNAL:PLAINTEXT
      - KAFKA_CFG_LISTENERS=CLIENT://:9092,EXTERNAL://:9093
      - KAFKA_CFG_ADVERTISED_LISTENERS=CLIENT://kafka:9092,EXTERNAL://localhost:9093
      - KAFKA_CFG_INTER_BROKER_LISTENER_NAME=CLIENT
      - ALLOW_PLAINTEXT_LISTENER=yes
    depends_on:
      - zookeeper
  kafka-ui:
    image: provectuslabs/kafka-ui
    container_name: kafka-ui
    ports:
      - "8081:8081"
    restart: always
    environment:
      - KAFKA_CLUSTERS_0_NAME=local
      - KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS=kafka:9092
      - SERVER_PORT=8081
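With that compose file, the split works roughly like this (a sketch; the topic name is just an example, and it assumes the Kafka CLI tools are installed on the host):

# from the host machine, use the EXTERNAL listener
kafka-console-producer --bootstrap-server localhost:9093 --topic demo
# from another container on the same compose network, use the CLIENT listener
kafka-console-consumer --bootstrap-server kafka:9092 --topic demo --from-beginning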

ksqldb, issue setting schema registry

I have set my schema registry in the /etc/ksqldb/ksql-server.properties file on ksqldb-server, as the documentation says:
ksql.schema.registry.url=http://myipaddress:8090
But when I go inside my ksqlDB container:
docker exec -it ksqldb-cli ksql http://ksqldb-server:8088
and try to run:
CREATE STREAM tracking WITH (KAFKA_TOPIC='tracking', VALUE_FORMAT='AVRO');
I get error:
Cannot create topic 'tracking' with format AVRO without configuring 'ksql.schema.registry.url'
I have also tried setting it with the following, even though it's not recommended:
SET 'ksql.schema.registry.url'='http://myipaddress:8090';
but I'm still getting the same error; I'm not sure what I'm doing wrong.
This is my docker-compose file:
version: "3.3"
services:
  # Kafka/Zookeeper container
  divolte-kafka:
    image: krisgeus/docker-kafka
    restart: always
    environment:
      ADVERTISED_HOST: divolte-kafka
      LOG_RETENTION_HOURS: 1
      AUTO_CREATE_TOPICS: "false"
      KAFKA_CREATE_TOPICS: tracking:4:1
      ADVERTISED_LISTENERS: PLAINTEXT://divolte-kafka:9092,INTERNAL://localhost:9093
      LISTENERS: PLAINTEXT://0.0.0.0:9092,INTERNAL://0.0.0.0:9093
      SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,INTERNAL:PLAINTEXT
      INTER_BROKER: INTERNAL
  # Schema Registry
  schema-registry:
    image: confluentinc/cp-schema-registry:5.5.3
    restart: always
    depends_on:
      - divolte-kafka
    ports:
      - 8090:8081
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: divolte-kafka:2181
  # ksql server
  ksqldb-server:
    image: confluentinc/ksqldb-server:0.20.0
    restart: always
    hostname: ksqldb-server
    container_name: ksqldb-server
    depends_on:
      - divolte-kafka
    ports:
      - "8088:8088"
    environment:
      KSQL_LISTENERS: http://0.0.0.0:8088
      KSQL_BOOTSTRAP_SERVERS: divolte-kafka:9092
      KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true"
      KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true"
  # ksql cli
  ksqldb-cli:
    image: confluentinc/ksqldb-cli:0.20.0
    restart: always
    container_name: ksqldb-cli
    depends_on:
      - divolte-kafka
      - ksqldb-server
    entrypoint: /bin/sh
    tty: true
I have set my schema registry in the /etc/ksqldb/ksql-server.properties file
Well, you're never using this file inside Docker.
You need to add an environment variable for it:
KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
Also, the registry itself should use Kafka, not the deprecated ZooKeeper connection, via the property SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS.
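A minimal sketch of both changes applied to the compose file above (only the changed lines are shown; the rest of each service stays the same):

schema-registry:
  image: confluentinc/cp-schema-registry:5.5.3
  environment:
    SCHEMA_REGISTRY_HOST_NAME: schema-registry
    # use the Kafka broker for storage instead of ZooKeeper
    SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: PLAINTEXT://divolte-kafka:9092
ksqldb-server:
  environment:
    # tell ksqlDB where the registry lives on the compose network
    KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"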

Confluent Schema Registry unable to connect to Kafka container

I'm trying to run a Kafka service with ZooKeeper, Kafdrop and Schema Registry. I made it work by installing Kafka, ZooKeeper and Kafdrop in one setup and Confluent Schema Registry in another (with its own Kafka, ZooKeeper, ksqlDB Server and REST Proxy). However, getting Kafdrop to read the schema registry did not work, so now I want to run just one stack with Kafka, ZooKeeper, Kafdrop and Schema Registry. Even though everything installs successfully, the Schema Registry keeps restarting every 10 seconds or so, and I cannot reach the service (localhost:8085) to add my schema. So I'm wondering if it's even possible to run the Confluent Schema Registry outside the Confluent suite of services. Here is my YAML file:
version: '2'
services:
  kafka:
    image: wurstmeister/kafka
    container_name: kafka
    ports:
      - "9092:9092"
    environment:
      - KAFKA_ADVERTISED_HOST_NAME=127.0.0.1
      - KAFKA_ADVERTISED_PORT=9092
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
      - KAFKA_LISTENERS=INTERNAL://:29092,EXTERNAL://:9092
      - KAFKA_ADVERTISED_LISTENERS=INTERNAL://kafka:29092,EXTERNAL://localhost:9092
      - KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      - KAFKA_INTER_BROKER_LISTENER_NAME=INTERNAL
      - KAFKA_SCHEMA_REGISTRY_URL=schemaregistry:8085
    depends_on:
      - zookeeper
  zookeeper:
    image: wurstmeister/zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      - KAFKA_ADVERTISED_HOST_NAME=zookeeper
  schemaregistry:
    image: confluentinc/cp-schema-registry:6.2.0
    restart: always
    depends_on:
      - zookeeper
    environment:
      SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: "zookeeper:2181"
      SCHEMA_REGISTRY_HOST_NAME: schemaregistry
      SCHEMA_REGISTRY_LISTENERS: "http://0.0.0.0:8085"
    ports:
      - 8085:8085
  kafdrop:
    image: obsidiandynamics/kafdrop
    container_name: kafdrop
    restart: "no"
    ports:
      - "9000:9000"
    environment:
      KAFKA_BROKERCONNECT: "kafka:29092"
      JVM_OPTS: "-Xms16M -Xmx48M -Xss180K -XX:-TieredCompilation -XX:+UseStringDeduplication -noverify"
      SCHEMAREGISTRY_CONNECT: schemaregistry:8085
    depends_on:
      - "kafka"
So it turned out that the Schema Registry couldn't connect because Kafka was not using 'PLAINTEXT' as the internal broker listener name. Here is the working YAML version, with Kafdrop also able to deserialize the messages as AVRO:
version: '2'
services:
  kafka:
    image: wurstmeister/kafka
    container_name: kafka
    ports:
      - "9092:9092"
    environment:
      - KAFKA_ADVERTISED_HOST_NAME=127.0.0.1
      - KAFKA_ADVERTISED_PORT=9092
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
      - KAFKA_LISTENERS=PLAINTEXT://:29092,EXTERNAL://:9092
      - KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka:29092,EXTERNAL://localhost:9092
      - KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,EXTERNAL:PLAINTEXT
      - KAFKA_INTER_BROKER_LISTENER_NAME=PLAINTEXT
      - KAFKA_SCHEMA_REGISTRY_URL=schemaregistry:8085
    depends_on:
      - zookeeper
  zookeeper:
    image: wurstmeister/zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      - KAFKA_ADVERTISED_HOST_NAME=zookeeper
  schemaregistry:
    image: confluentinc/cp-schema-registry:6.2.0
    restart: always
    depends_on:
      - zookeeper
    environment:
      SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: "zookeeper:2181"
      SCHEMA_REGISTRY_HOST_NAME: schemaregistry
      SCHEMA_REGISTRY_LISTENERS: "http://0.0.0.0:8085"
    ports:
      - 8085:8085
  kafdrop:
    image: obsidiandynamics/kafdrop
    container_name: kafdrop
    restart: "no"
    ports:
      - "9000:9000"
    environment:
      KAFKA_BROKERCONNECT: "kafka:29092"
      JVM_OPTS: "-Xms16M -Xmx48M -Xss180K -XX:-TieredCompilation -XX:+UseStringDeduplication -noverify"
      SCHEMAREGISTRY_CONNECT: http://schemaregistry:8085
    depends_on:
      - "kafka"
Both of your YAML files are incorrect.
In zookeeper you have:
KAFKA_ADVERTISED_HOST_NAME
And in the Schema Registry you give the ZooKeeper address (port 2181) for the kafkastore...
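A sketch of what that implies for the schemaregistry service above (pointing the store at the broker's internal listener instead of ZooKeeper; the rest of the service stays as-is):

schemaregistry:
  image: confluentinc/cp-schema-registry:6.2.0
  environment:
    SCHEMA_REGISTRY_HOST_NAME: schemaregistry
    SCHEMA_REGISTRY_LISTENERS: "http://0.0.0.0:8085"
    # use the Kafka broker rather than ZooKeeper for storage
    SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: PLAINTEXT://kafka:29092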

Create Kafka-Connect cluster with Docker Compose to be used by ksqlDB

What I essentially want to do is run multiple Kafka Connect instances with Docker Compose. I want ksqlDB to use this cluster. For now, they all run on a single machine, but eventually I want to deploy this to a multi-node environment. My problem is that ksqlDB apparently can't find the Kafka Connect cluster. There is the KSQL_KSQL_CONNECT_URL variable, which holds the URL of a single Kafka Connect instance. Not providing this variable results in the default value, which is localhost:8083.
I found this docker-compose file, which I think does what I want to do: ksqlDB and multiple Kafka Connect instances. Unfortunately, it didn't help me that much, since it uses an old version of KSQL Server. Here is my docker-compose file:
---
version: '3'
services:
  ksqldb-server-connect-test:
    image: confluentinc/ksqldb-server:0.15.0
    hostname: ksqldb-server-connect-test
    container_name: ksqldb-server-connect-test
    #ports:
    #  - "8088:8088"
    network_mode: "host"
    environment:
      KSQL_KSQL_SERVICE_ID: "default_"
      KSQL_LISTENERS: http://0.0.0.0:8088
      KSQL_BOOTSTRAP_SERVERS: localhost:9092
      KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true"
      KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true"
      KSQL_KSQL_SCHEMA_REGISTRY_URL: http://localhost:8081
      #KSQL_KSQL_CONNECT_URL: http://localhost:8083
  ksqldb-cli-connect-test:
    image: confluentinc/ksqldb-cli:0.15.0
    container_name: ksqldb-cli-connect-test
    network_mode: "host"
    depends_on:
      - ksqldb-server-connect-test
    entrypoint: /bin/sh
    tty: true
  schema-registry-connect-test:
    image: confluentinc/cp-schema-registry:6.0.1
    container_name: schema-registry-connect-test
    network_mode: "host"
    #ports:
    #  - "8081:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: localhost:9092
    restart: always
  kafka-connect-1:
    image: confluentinc/cp-kafka-connect-base:6.0.1
    container_name: kafka-connect-1
    network_mode: "host"
    environment:
      CONNECT_BOOTSTRAP_SERVERS: "localhost:9092"
      CONNECT_REST_PORT: 8082
      CONNECT_GROUP_ID: kafka-connect-test
      CONNECT_CONFIG_STORAGE_TOPIC: _connect-configs-test
      CONNECT_OFFSET_STORAGE_TOPIC: _connect-offsets-test
      CONNECT_STATUS_STORAGE_TOPIC: _connect-status-test
      CONNECT_KEY_CONVERTER: org.apache.kafka.connect.storage.StringConverter
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://localhost:8081'
      CONNECT_REST_ADVERTISED_HOST_NAME: "localhost"
      CONNECT_LOG4J_APPENDER_STDOUT_LAYOUT_CONVERSIONPATTERN: "[%d] %p %X{connector.context}%m (%c:%L)%n"
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_OFFSET_STORAGE_PARTITIONS: "25"
      CONNECT_STATUS_STORAGE_PARTITIONS: "5"
      CONNECT_PLUGIN_PATH: /usr/share/java,/usr/share/confluent-hub-components,/data/connect-jars
    volumes:
      - $PWD/data/connect-jars/:/usr/share/java/kafka-connect-jdbc/jars/
      - $PWD/jmx:/usr/app/
  kafka-connect-2:
    image: confluentinc/cp-kafka-connect-base:6.0.1
    container_name: kafka-connect-2
    network_mode: "host"
    environment:
      CONNECT_BOOTSTRAP_SERVERS: "localhost:9092"
      CONNECT_REST_PORT: 8084
      CONNECT_GROUP_ID: kafka-connect-test
      CONNECT_CONFIG_STORAGE_TOPIC: _connect-configs-test
      CONNECT_OFFSET_STORAGE_TOPIC: _connect-offsets-test
      CONNECT_STATUS_STORAGE_TOPIC: _connect-status-test
      CONNECT_KEY_CONVERTER: org.apache.kafka.connect.storage.StringConverter
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://localhost:8081'
      CONNECT_REST_ADVERTISED_HOST_NAME: "localhost"
      CONNECT_LOG4J_APPENDER_STDOUT_LAYOUT_CONVERSIONPATTERN: "[%d] %p %X{connector.context}%m (%c:%L)%n"
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_OFFSET_STORAGE_PARTITIONS: "25"
      CONNECT_STATUS_STORAGE_PARTITIONS: "5"
      CONNECT_PLUGIN_PATH: /usr/share/java,/usr/share/confluent-hub-components,/data/connect-jars
    volumes:
      - $PWD/data/connect-jars/:/usr/share/java/kafka-connect-jdbc/jars/
      - $PWD/jmx:/usr/app/
Note that I use network_mode: "host" because the Kafka cluster itself does not run in a Docker container, so this eases the communication to Kafka in my case.
Does anybody have an idea or a solution on how to get ksqlDB connected to a Kafka Connect cluster using only docker-compose?
what I need to achieve is fault tolerance.
OK, so what you need is >1 Kafka Connect worker, within a single Kafka Connect group. This is what you've got with your configuration of the same storage topics and group.id 👍
So the question is how to get ksqlDB to connect to a cluster of Kafka Connect workers. Since Kafka Connect uses Kafka itself to hold configuration, it doesn't matter which worker it connects to. ksql.connect.url (and thus the KSQL_KSQL_CONNECT_URL environment variable in Docker) is the correct way to do this, but it's not clear from the docs whether you can specify multiple values.
If you can't, then I'm guessing you'd need to stick a stateless load balancer in front of the workers and point ksqlDB at that.
Also, the hostname is going to be the name of the container (kafka-connect-1 / kafka-connect-2), not localhost.
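As a rough, untested sketch of that load-balancer idea in compose terms (the proxy port 8083 and the nginx.conf contents are assumptions, not part of the original setup; the upstream would round-robin over localhost:8082 and localhost:8084, the two CONNECT_REST_PORTs above):

connect-lb:
  image: nginx:alpine
  network_mode: "host"
  volumes:
    # nginx.conf is assumed to define an upstream over localhost:8082 and
    # localhost:8084, with nginx listening on localhost:8083
    - $PWD/nginx.conf:/etc/nginx/nginx.conf:ro
ksqldb-server-connect-test:
  environment:
    # point ksqlDB at the proxy instead of a single worker
    KSQL_KSQL_CONNECT_URL: http://localhost:8083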