Connect Kafka to Neo4j

I have created a simple Python script that generates a user_id, recipient_id and amount, and I have created a Kafka producer and consumer. The Python code returns the data as JSON. Now I am trying to connect this data to Neo4j through Kafka, but I am unable to do it.
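A minimal sketch of such a producer, assuming the kafka-python client, a broker on localhost:9092 and a topic named transactions (all of these names are illustrative, not taken from the original code):

import json
import random
import uuid

from kafka import KafkaProducer  # kafka-python

# JSON-serialize each event dict before it is written to the topic
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

event = {
    "user_id": str(uuid.uuid4()),
    "recipient_id": str(uuid.uuid4()),
    "amount": round(random.uniform(1, 1000), 2),
}
producer.send("transactions", value=event)
producer.flush()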
https://neo4j.com/docs/kafka/quickstart-connect/
I started to work through the documentation, but when I directly copy the docker-compose.yml:
---
version: '2'
services:
  neo4j:
    image: neo4j:4.0.3-enterprise
    hostname: neo4j
    container_name: neo4j
    ports:
      - "7474:7474"
      - "7687:7687"
    environment:
      NEO4J_kafka_bootstrap_servers: broker:9093
      NEO4J_AUTH: neo4j/connect
      NEO4J_dbms_memory_heap_max__size: 8G
      NEO4J_ACCEPT_LICENSE_AGREEMENT: yes
  zookeeper:
    image: confluentinc/cp-zookeeper
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  broker:
    image: confluentinc/cp-enterprise-kafka
    hostname: broker
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    expose:
      - "9093"
    environment:
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:9093,OUTSIDE://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,OUTSIDE:PLAINTEXT
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9093,OUTSIDE://0.0.0.0:9092
      CONFLUENT_METRICS_REPORTER_BOOTSTRAP_SERVERS: broker:9093
      # workaround if we change to a custom name the schema_registry fails to start
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_METRIC_REPORTERS: io.confluent.metrics.reporter.ConfluentMetricsReporter
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
      CONFLUENT_METRICS_REPORTER_ZOOKEEPER_CONNECT: zookeeper:2181
      CONFLUENT_METRICS_REPORTER_TOPIC_REPLICAS: 1
      CONFLUENT_METRICS_ENABLE: 'true'
      CONFLUENT_SUPPORT_CUSTOMER_ID: 'anonymous'
  schema_registry:
    image: confluentinc/cp-schema-registry
    hostname: schema_registry
    container_name: schema_registry
    depends_on:
      - zookeeper
      - broker
    ports:
      - "8081:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema_registry
      SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: 'zookeeper:2181'
  connect:
    image: confluentinc/cp-kafka-connect
    hostname: connect
    container_name: connect
    depends_on:
      - zookeeper
      - broker
      - schema_registry
    ports:
      - "8083:8083"
    volumes:
      - ./plugins:/tmp/connect-plugins
    environment:
      CONNECT_BOOTSTRAP_SERVERS: 'broker:9093'
      CONNECT_REST_ADVERTISED_HOST_NAME: connect
      CONNECT_REST_PORT: 8083
      CONNECT_GROUP_ID: compose-connect-group
      CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_FLUSH_INTERVAL_MS: 10000
      CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schema_registry:8081'
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schema_registry:8081'
      CONNECT_INTERNAL_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_INTERNAL_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      CONNECT_PLUGIN_PATH: /usr/share/java,/tmp/connect-plugins
      CONNECT_LOG4J_LOGGERS: org.apache.zookeeper=DEBUG,org.I0Itec.zkclient=DEBUG,org.reflections=ERROR
  control-center:
    image: confluentinc/cp-enterprise-control-center
    hostname: control-center
    container_name: control-center
    depends_on:
      - zookeeper
      - broker
      - schema_registry
      - connect
    ports:
      - "9021:9021"
    environment:
      CONTROL_CENTER_BOOTSTRAP_SERVERS: 'broker:9093'
      CONTROL_CENTER_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      CONTROL_CENTER_CONNECT_CLUSTER: 'connect:8083'
      CONTROL_CENTER_REPLICATION_FACTOR: 1
      CONTROL_CENTER_INTERNAL_TOPICS_PARTITIONS: 1
      CONTROL_CENTER_MONITORING_INTERCEPTOR_TOPIC_PARTITIONS: 1
      CONFLUENT_METRICS_TOPIC_REPLICATION: 1
      PORT: 9021
[screenshot: the running Docker containers]
I get this error from the schema-registry container:
===> User
uid=1000(appuser) gid=1000(appuser) groups=1000(appuser)
===> Configuring ...
===> Running preflight checks ... 
===> Check if Zookeeper is healthy ...
[2022-12-14 12:18:38,319] INFO Client environment:zookeeper.version=3.6.3--6401e4ad2087061bc6b9f80dec2d69f2e3c8660a, built on 04/08/2021 16:35 GMT (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:host.name=schema_registry (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:java.version=11.0.16.1 (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:java.vendor=Azul Systems, Inc. (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:java.home=/usr/lib/jvm/zulu11-ca (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:java.class.path=/usr/share/java/cp-base-new/disk-usage-agent-7.3.0.jar:/usr/share/java/cp-base-new/reload4j-1.2.19.jar:/usr/share/java/cp-base-new/kafka-server-common-7.3.0-ccs.jar:/usr/share/java/cp-base-new/jopt-simple-5.0.4.jar:/usr/share/java/cp-base-new/scala-logging_2.13-3.9.4.jar:/usr/share/java/cp-base-new/scala-java8-compat_2.13-1.0.2.jar:/usr/share/java/cp-base-new/zookeeper-3.6.3.jar:/usr/share/java/cp-base-new/json-simple-1.1.1.jar:/usr/share/java/cp-base-new/metrics-core-2.2.0.jar:/usr/share/java/cp-base-new/audience-annotations-0.5.0.jar:/usr/share/java/cp-base-new/kafka-storage-api-7.3.0-ccs.jar:/usr/share/java/cp-base-new/kafka-clients-7.3.0-ccs.jar:/usr/share/java/cp-base-new/slf4j-reload4j-1.7.36.jar:/usr/share/java/cp-base-new/snappy-java-1.1.8.4.jar:/usr/share/java/cp-base-new/commons-cli-1.4.jar:/usr/share/java/cp-base-new/scala-collection-compat_2.13-2.6.0.jar:/usr/share/java/cp-base-new/jackson-core-2.13.2.jar:/usr/share/java/cp-base-new/jmx_prometheus_javaagent-0.14.0.jar:/usr/share/java/cp-base-new/kafka-raft-7.3.0-ccs.jar:/usr/share/java/cp-base-new/jackson-module-scala_2.13-2.13.2.jar:/usr/share/java/cp-base-new/re2j-1.6.jar:/usr/share/java/cp-base-new/jose4j-0.7.9.jar:/usr/share/java/cp-base-new/snakeyaml-1.30.jar:/usr/share/java/cp-base-new/logredactor-metrics-1.0.10.jar:/usr/share/java/cp-base-new/logredactor-1.0.10.jar:/usr/share/java/cp-base-new/jackson-dataformat-yaml-2.13.2.jar:/usr/share/java/cp-base-new/kafka_2.13-7.3.0-ccs.jar:/usr/share/java/cp-base-new/kafka-storage-7.3.0-ccs.jar:/usr/share/java/cp-base-new/utility-belt-7.3.0.jar:/usr/share/java/cp-base-new/jackson-annotations-2.13.2.jar:/usr/share/java/cp-base-new/minimal-json-0.9.5.jar:/usr/share/java/cp-base-new/lz4-java-1.8.0.jar:/usr/share/java/cp-base-new/zookeeper-jute-3.6.3.jar:/usr/share/java/cp-base-new/zstd-jni-1.5.2-1.jar:/usr/share/java/cp-base-new/jackson-dataformat-csv-2.13.2.jar:/usr/share/java/cp-base-new/slf4j-api-1.7.36.jar:/usr/share/java/cp-base-new/jackson-databind-2.13.2.2.jar:/usr/share/java/cp-base-new/jolokia-jvm-1.7.1.jar:/usr/share/java/cp-base-new/paranamer-2.8.jar:/usr/share/java/cp-base-new/gson-2.9.0.jar:/usr/share/java/cp-base-new/metrics-core-4.1.12.1.jar:/usr/share/java/cp-base-new/kafka-metadata-7.3.0-ccs.jar:/usr/share/java/cp-base-new/jackson-datatype-jdk8-2.13.2.jar:/usr/share/java/cp-base-new/common-utils-7.3.0.jar:/usr/share/java/cp-base-new/scala-reflect-2.13.5.jar:/usr/share/java/cp-base-new/scala-library-2.13.5.jar:/usr/share/java/cp-base-new/argparse4j-0.7.0.jar:/usr/share/java/cp-base-new/jolokia-core-1.7.1.jar (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:java.library.path=/usr/java/packages/lib:/usr/lib64:/lib64:/lib:/usr/lib (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:java.io.tmpdir=/tmp (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:java.compiler=<NA> (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:os.name=Linux (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:os.arch=amd64 (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:os.version=5.10.104-linuxkit (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:user.name=appuser (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:user.home=/home/appuser (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:user.dir=/home/appuser (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:os.memory.free=51MB (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:os.memory.max=952MB (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:os.memory.total=60MB (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,326] INFO Initiating client connection, connectString=zookeeper:2181 sessionTimeout=40000 watcher=io.confluent.admin.utils.ZookeeperConnectionWatcher#3c0a50da (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,332] INFO Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation (org.apache.zookeeper.common.X509Util)
[2022-12-14 12:18:38,341] INFO jute.maxbuffer value is 1048575 Bytes (org.apache.zookeeper.ClientCnxnSocket)
[2022-12-14 12:18:38,351] INFO zookeeper.request.timeout value is 0. feature enabled=false (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:38,372] INFO Opening socket connection to server zookeeper/172.18.0.2:2181. (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:38,375] INFO SASL config status: Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:38,388] INFO Socket connection established, initiating session, client: /172.18.0.5:47172, server: zookeeper/172.18.0.2:2181 (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:38,542] INFO Session establishment complete on server zookeeper/172.18.0.2:2181, session id = 0x10000250f890000, negotiated timeout = 40000 (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:38,587] WARN An exception was thrown while closing send thread for session 0x10000250f890000. (org.apache.zookeeper.ClientCnxn)
EndOfStreamException: Unable to read additional data from server sessionid 0x10000250f890000, likely server has closed socket
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1290)
[2022-12-14 12:18:38,699] INFO Session: 0x10000250f890000 closed (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,699] INFO EventThread shut down for session: 0x10000250f890000 (org.apache.zookeeper.ClientCnxn)
Using log4j config /etc/schema-registry/log4j.properties
===> Check if Kafka is healthy ...
[2022-12-14 12:18:39,567] INFO Client environment:zookeeper.version=3.6.3--6401e4ad2087061bc6b9f80dec2d69f2e3c8660a, built on 04/08/2021 16:35 GMT (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,567] INFO Client environment:host.name=schema_registry (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:java.version=11.0.16.1 (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:java.vendor=Azul Systems, Inc. (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:java.home=/usr/lib/jvm/zulu11-ca (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:java.class.path=/usr/share/java/cp-base-new/disk-usage-agent-7.3.0.jar:/usr/share/java/cp-base-new/reload4j-1.2.19.jar:/usr/share/java/cp-base-new/kafka-server-common-7.3.0-ccs.jar:/usr/share/java/cp-base-new/jopt-simple-5.0.4.jar:/usr/share/java/cp-base-new/scala-logging_2.13-3.9.4.jar:/usr/share/java/cp-base-new/scala-java8-compat_2.13-1.0.2.jar:/usr/share/java/cp-base-new/zookeeper-3.6.3.jar:/usr/share/java/cp-base-new/json-simple-1.1.1.jar:/usr/share/java/cp-base-new/metrics-core-2.2.0.jar:/usr/share/java/cp-base-new/audience-annotations-0.5.0.jar:/usr/share/java/cp-base-new/kafka-storage-api-7.3.0-ccs.jar:/usr/share/java/cp-base-new/kafka-clients-7.3.0-ccs.jar:/usr/share/java/cp-base-new/slf4j-reload4j-1.7.36.jar:/usr/share/java/cp-base-new/snappy-java-1.1.8.4.jar:/usr/share/java/cp-base-new/commons-cli-1.4.jar:/usr/share/java/cp-base-new/scala-collection-compat_2.13-2.6.0.jar:/usr/share/java/cp-base-new/jackson-core-2.13.2.jar:/usr/share/java/cp-base-new/jmx_prometheus_javaagent-0.14.0.jar:/usr/share/java/cp-base-new/kafka-raft-7.3.0-ccs.jar:/usr/share/java/cp-base-new/jackson-module-scala_2.13-2.13.2.jar:/usr/share/java/cp-base-new/re2j-1.6.jar:/usr/share/java/cp-base-new/jose4j-0.7.9.jar:/usr/share/java/cp-base-new/snakeyaml-1.30.jar:/usr/share/java/cp-base-new/logredactor-metrics-1.0.10.jar:/usr/share/java/cp-base-new/logredactor-1.0.10.jar:/usr/share/java/cp-base-new/jackson-dataformat-yaml-2.13.2.jar:/usr/share/java/cp-base-new/kafka_2.13-7.3.0-ccs.jar:/usr/share/java/cp-base-new/kafka-storage-7.3.0-ccs.jar:/usr/share/java/cp-base-new/utility-belt-7.3.0.jar:/usr/share/java/cp-base-new/jackson-annotations-2.13.2.jar:/usr/share/java/cp-base-new/minimal-json-0.9.5.jar:/usr/share/java/cp-base-new/lz4-java-1.8.0.jar:/usr/share/java/cp-base-new/zookeeper-jute-3.6.3.jar:/usr/share/java/cp-base-new/zstd-jni-1.5.2-1.jar:/usr/share/java/cp-base-new/jackson-dataformat-csv-2.13.2.jar:/usr/share/java/cp-base-new/slf4j-api-1.7.36.jar:/usr/share/java/cp-base-new/jackson-databind-2.13.2.2.jar:/usr/share/java/cp-base-new/jolokia-jvm-1.7.1.jar:/usr/share/java/cp-base-new/paranamer-2.8.jar:/usr/share/java/cp-base-new/gson-2.9.0.jar:/usr/share/java/cp-base-new/metrics-core-4.1.12.1.jar:/usr/share/java/cp-base-new/kafka-metadata-7.3.0-ccs.jar:/usr/share/java/cp-base-new/jackson-datatype-jdk8-2.13.2.jar:/usr/share/java/cp-base-new/common-utils-7.3.0.jar:/usr/share/java/cp-base-new/scala-reflect-2.13.5.jar:/usr/share/java/cp-base-new/scala-library-2.13.5.jar:/usr/share/java/cp-base-new/argparse4j-0.7.0.jar:/usr/share/java/cp-base-new/jolokia-core-1.7.1.jar (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:java.library.path=/usr/java/packages/lib:/usr/lib64:/lib64:/lib:/usr/lib (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:java.io.tmpdir=/tmp (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:java.compiler=<NA> (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:os.name=Linux (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:os.arch=amd64 (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:os.version=5.10.104-linuxkit (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:user.name=appuser (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:user.home=/home/appuser (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:user.dir=/home/appuser (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,569] INFO Client environment:os.memory.free=50MB (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,569] INFO Client environment:os.memory.max=952MB (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,569] INFO Client environment:os.memory.total=60MB (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,574] INFO Initiating client connection, connectString=zookeeper:2181 sessionTimeout=40000 watcher=io.confluent.admin.utils.ZookeeperConnectionWatcher#221af3c0 (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,578] INFO Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation (org.apache.zookeeper.common.X509Util)
[2022-12-14 12:18:39,587] INFO jute.maxbuffer value is 1048575 Bytes (org.apache.zookeeper.ClientCnxnSocket)
[2022-12-14 12:18:39,597] INFO zookeeper.request.timeout value is 0. feature enabled=false (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,621] INFO Opening socket connection to server zookeeper/172.18.0.2:2181. (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,623] INFO SASL config status: Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,639] INFO Socket connection established, initiating session, client: /172.18.0.5:47176, server: zookeeper/172.18.0.2:2181 (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,658] INFO Session establishment complete on server zookeeper/172.18.0.2:2181, session id = 0x10000250f890001, negotiated timeout = 40000 (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,678] WARN An exception was thrown while closing send thread for session 0x10000250f890001. (org.apache.zookeeper.ClientCnxn)
EndOfStreamException: Unable to read additional data from server sessionid 0x10000250f890001, likely server has closed socket
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1290)
[2022-12-14 12:18:39,785] INFO Session: 0x10000250f890001 closed (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,785] INFO EventThread shut down for session: 0x10000250f890001 (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,785] INFO Initiating client connection, connectString=zookeeper:2181 sessionTimeout=40000 watcher=io.confluent.admin.utils.ZookeeperConnectionWatcher#55a1c291 (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,786] INFO jute.maxbuffer value is 1048575 Bytes (org.apache.zookeeper.ClientCnxnSocket)
[2022-12-14 12:18:39,786] INFO zookeeper.request.timeout value is 0. feature enabled=false (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,787] INFO Opening socket connection to server zookeeper/172.18.0.2:2181. (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,787] INFO SASL config status: Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,788] INFO Socket connection established, initiating session, client: /172.18.0.5:47178, server: zookeeper/172.18.0.2:2181 (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,799] INFO Session establishment complete on server zookeeper/172.18.0.2:2181, session id = 0x10000250f890002, negotiated timeout = 40000 (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,979] INFO Session: 0x10000250f890002 closed (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,979] INFO EventThread shut down for session: 0x10000250f890002 (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:40,122] INFO AdminClientConfig values: 
bootstrap.servers = [broker:9093]
client.dns.lookup = use_all_dns_ips
client.id = 
connections.max.idle.ms = 300000
default.api.timeout.ms = 60000
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retries = 2147483647
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.connect.timeout.ms = null
sasl.login.read.timeout.ms = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.login.retry.backoff.max.ms = 10000
sasl.login.retry.backoff.ms = 100
sasl.mechanism = GSSAPI
sasl.oauthbearer.clock.skew.seconds = 30
sasl.oauthbearer.expected.audience = null
sasl.oauthbearer.expected.issuer = null
sasl.oauthbearer.jwks.endpoint.refresh.ms = 3600000
sasl.oauthbearer.jwks.endpoint.retry.backoff.max.ms = 10000
sasl.oauthbearer.jwks.endpoint.retry.backoff.ms = 100
sasl.oauthbearer.jwks.endpoint.url = null
sasl.oauthbearer.scope.claim.name = scope
sasl.oauthbearer.sub.claim.name = sub
sasl.oauthbearer.token.endpoint.url = null
security.protocol = PLAINTEXT
security.providers = null
send.buffer.bytes = 131072
socket.connection.setup.timeout.max.ms = 30000
socket.connection.setup.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
ssl.endpoint.identification.algorithm = https
ssl.engine.factory.class = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.certificate.chain = null
ssl.keystore.key = null
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLSv1.3
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.certificates = null
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
 (org.apache.kafka.clients.admin.AdminClientConfig)
[2022-12-14 12:18:40,312] INFO Kafka version: 7.3.0-ccs (org.apache.kafka.common.utils.AppInfoParser)
[2022-12-14 12:18:40,312] INFO Kafka commitId: b8341813ae2b0444 (org.apache.kafka.common.utils.AppInfoParser)
[2022-12-14 12:18:40,312] INFO Kafka startTimeMs: 1671020320310 (org.apache.kafka.common.utils.AppInfoParser)
Using log4j config /etc/schema-registry/log4j.properties
===> Launching ... 
===> Launching schema-registry ... 
[2022-12-14 12:18:41,910] INFO SchemaRegistryConfig values: 
access.control.allow.headers = 
access.control.allow.methods = 
access.control.allow.origin = 
access.control.skip.options = true
authentication.method = NONE
authentication.realm = 
authentication.roles = [*]
authentication.skip.paths = []
avro.compatibility.level = 
compression.enable = true
connector.connection.limit = 0
csrf.prevention.enable = false
csrf.prevention.token.endpoint = /csrf
csrf.prevention.token.expiration.minutes = 30
csrf.prevention.token.max.entries = 10000
debug = false
dos.filter.delay.ms = 100
dos.filter.enabled = false
dos.filter.insert.headers = true
dos.filter.ip.whitelist = []
dos.filter.managed.attr = false
dos.filter.max.idle.tracker.ms = 30000
dos.filter.max.requests.ms = 30000
dos.filter.max.requests.per.connection.per.sec = 25
dos.filter.max.requests.per.sec = 25
dos.filter.max.wait.ms = 50
dos.filter.throttle.ms = 30000
dos.filter.throttled.requests = 5
host.name = schema_registry
http2.enabled = true
idle.timeout.ms = 30000
inter.instance.headers.whitelist = []
inter.instance.protocol = http
kafkastore.bootstrap.servers = []
kafkastore.checkpoint.dir = /tmp
kafkastore.checkpoint.version = 0
kafkastore.connection.url = zookeeper:2181
kafkastore.group.id = 
kafkastore.init.timeout.ms = 60000
kafkastore.sasl.kerberos.kinit.cmd = /usr/bin/kinit
kafkastore.sasl.kerberos.min.time.before.relogin = 60000
kafkastore.sasl.kerberos.service.name = 
kafkastore.sasl.kerberos.ticket.renew.jitter = 0.05
kafkastore.sasl.kerberos.ticket.renew.window.factor = 0.8
kafkastore.sasl.mechanism = GSSAPI
kafkastore.security.protocol = PLAINTEXT
kafkastore.ssl.cipher.suites = 
kafkastore.ssl.enabled.protocols = TLSv1.2,TLSv1.1,TLSv1
kafkastore.ssl.endpoint.identification.algorithm = 
kafkastore.ssl.key.password = [hidden]
kafkastore.ssl.keymanager.algorithm = SunX509
kafkastore.ssl.keystore.location = 
kafkastore.ssl.keystore.password = [hidden]
kafkastore.ssl.keystore.type = JKS
kafkastore.ssl.protocol = TLS
kafkastore.ssl.provider = 
kafkastore.ssl.trustmanager.algorithm = PKIX
kafkastore.ssl.truststore.location = 
kafkastore.ssl.truststore.password = [hidden]
kafkastore.ssl.truststore.type = JKS
kafkastore.timeout.ms = 500
kafkastore.topic = _schemas
kafkastore.topic.replication.factor = 3
kafkastore.topic.skip.validation = false
kafkastore.update.handlers = []
kafkastore.write.max.retries = 5
leader.eligibility = true
listener.protocol.map = []
listeners = []
master.eligibility = null
metric.reporters = []
metrics.jmx.prefix = kafka.schema.registry
metrics.num.samples = 2
metrics.sample.window.ms = 30000
metrics.tag.map = []
mode.mutability = true
nosniff.prevention.enable = false
port = 8081
proxy.protocol.enabled = false
reject.options.request = false
request.logger.name = io.confluent.rest-utils.requests
request.queue.capacity = 2147483647
request.queue.capacity.growby = 64
request.queue.capacity.init = 128
resource.extension.class = []
resource.extension.classes = []
resource.static.locations = []
response.http.headers.config = 
response.mediatype.default = application/vnd.schemaregistry.v1+json
response.mediatype.preferred = [application/vnd.schemaregistry.v1+json, application/vnd.schemaregistry+json, application/json]
rest.servlet.initializor.classes = []
schema.cache.expiry.secs = 300
schema.cache.size = 1000
schema.canonicalize.on.consume = []
schema.compatibility.level = backward
schema.providers = []
schema.registry.group.id = schema-registry
schema.registry.inter.instance.protocol = 
schema.registry.resource.extension.class = []
server.connection.limit = 0
shutdown.graceful.ms = 1000
ssl.cipher.suites = []
ssl.client.auth = false
ssl.client.authentication = NONE
ssl.enabled.protocols = []
ssl.endpoint.identification.algorithm = null
ssl.key.password = [hidden]
ssl.keymanager.algorithm = 
ssl.keystore.location = 
ssl.keystore.password = [hidden]
ssl.keystore.reload = false
ssl.keystore.type = JKS
ssl.keystore.watch.location = 
ssl.protocol = TLS
ssl.provider = 
ssl.trustmanager.algorithm = 
ssl.truststore.location = 
ssl.truststore.password = [hidden]
ssl.truststore.type = JKS
suppress.stack.trace.response = true
thread.pool.max = 200
thread.pool.min = 8
websocket.path.prefix = /ws
websocket.servlet.initializor.classes = []
 (io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig)
[2022-12-14 12:18:42,007] INFO Logging initialized #879ms to org.eclipse.jetty.util.log.Slf4jLog (org.eclipse.jetty.util.log)
[2022-12-14 12:18:42,066] INFO Initial capacity 128, increased by 64, maximum capacity 2147483647. (io.confluent.rest.ApplicationServer)
[2022-12-14 12:18:42,172] WARN DEPRECATION warning: `listeners` configuration is not configured. Falling back to the deprecated `port` configuration. (io.confluent.rest.ApplicationServer)
[2022-12-14 12:18:42,175] INFO Adding listener with HTTP/2: http://0.0.0.0:8081 (io.confluent.rest.ApplicationServer)
[2022-12-14 12:18:42,589] WARN DEPRECATION warning: `listeners` configuration is not configured. Falling back to the deprecated `port` configuration. (io.confluent.rest.ApplicationServer)
[2022-12-14 12:18:42,744] ERROR Server died unexpectedly:  (io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain)
org.apache.kafka.common.config.ConfigException: No supported Kafka endpoints are configured. kafkastore.bootstrap.servers must have at least one endpoint matching kafkastore.security.protocol.
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig.endpointsToBootstrapServers(SchemaRegistryConfig.java:666)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig.bootstrapBrokers(SchemaRegistryConfig.java:615)
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.kafkaClusterId(KafkaSchemaRegistry.java:1566)
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.<init>(KafkaSchemaRegistry.java:171)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.initSchemaRegistry(SchemaRegistryRestApplication.java:71)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.configureBaseApplication(SchemaRegistryRestApplication.java:90)
at io.confluent.rest.Application.configureHandler(Application.java:285)
at io.confluent.rest.ApplicationServer.doStart(ApplicationServer.java:270)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:44)
The server dies unexpectedly because no supported Kafka endpoints are configured. I found similar problems that were asked about six years ago, but those did not help.
I have searched the Confluent docs and tried different versions.

The error is telling you that you need to remove the (deprecated) property SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL and use SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS instead, set to broker:9093.
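For example, the schema_registry service then looks roughly like this (only the kafkastore variable changes; everything else in the compose file stays the same):

  schema_registry:
    image: confluentinc/cp-schema-registry
    hostname: schema_registry
    container_name: schema_registry
    depends_on:
      - zookeeper
      - broker
    ports:
      - "8081:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema_registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: 'broker:9093'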
You could start with a working compose file then add Neo4j to that.
Note: The Schema Registry is not a requirement to use Kafka Connect with or without Neo4j, or any Python library.
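Once Connect is healthy, the quickstart linked in the question goes on to register a Neo4j sink connector against the Connect REST API. For the JSON events described here, that request might look roughly like this (a sketch: the topic name, field names and Cypher statement are illustrative, and the exact properties depend on the Neo4j connector version placed in ./plugins):

curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  --data '{
    "name": "Neo4jSinkConnector",
    "config": {
      "topics": "transactions",
      "connector.class": "streams.kafka.connect.sink.Neo4jSinkConnector",
      "key.converter": "org.apache.kafka.connect.json.JsonConverter",
      "key.converter.schemas.enable": false,
      "value.converter": "org.apache.kafka.connect.json.JsonConverter",
      "value.converter.schemas.enable": false,
      "neo4j.server.uri": "bolt://neo4j:7687",
      "neo4j.authentication.basic.username": "neo4j",
      "neo4j.authentication.basic.password": "connect",
      "neo4j.topic.cypher.transactions": "MERGE (s:User {id: event.user_id}) MERGE (r:User {id: event.recipient_id}) MERGE (s)-[:SENT {amount: event.amount}]->(r)"
    }
  }'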

Related

Can't read data from Postgres to Kafka

I connected Kafka to my Postgres database with Debezium, in Docker. But when I run kafkacat in Docker to read from Postgres I get an error:
ERROR: Failed to format message in postgres.public.users [0] at offset 0: Avro/Schema-registry message deserialization: REST request failed (code -1): HTTP request failed: Couldn't resolve host name : terminating
I run kafkacat with this command:
docker run --tty --network pythonproject5_default confluentinc/cp-kafkacat kafkacat -b kafka:9092 -C -s key=s -s value=avro -r http://schema-regisrty:8081 -t postgres.public.users
The Debezium connector file looks like this:
{
  "name": "db-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "plugin.name": "pgoutput",
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "George",
    "database.password": "tech1337",
    "database.dbname": "tech_db",
    "database.server.name": "postgres",
    "table.include.list": "public.users"
  }
}
The schemas in the application look like this:
name: str
time_created: int
gender: str
age: int
last_name: str
ip: str
city: str
premium: bool = None
birth_day: str
balance: int
user_id: int
and the model like this:
class User(Base):
    __tablename__ = 'users'
    name = Column(String)
    time_created = Column(Integer)
    gender = Column(String)
    age = Column(Integer)
    last_name = Column(String)
    ip = Column(String)
    city = Column(String)
    premium = Column(Boolean)
    birth_day = Column(String)
    user_id = Column(Integer, primary_key=True, index=True)
    my_vet = relationship("VET", back_populates="owner")
docker compose file:
version: "3.7"
services:
postgres:
image: debezium/postgres:13
ports:
- 5432:5432
environment:
- POSTGRES_USER=goerge
- POSTGRES_PASSWORD=tech1337
- POSTGRES_DB=5_pm_db
zookeeper:
image: confluentinc/cp-zookeeper:5.5.3
environment:
ZOOKEEPER_CLIENT_PORT: 2181
broker:
image: confluentinc/cp-kafka:7.3.0
container_name: broker
ports:
- "5056:5056"
depends_on:
- zookeeper
environment:
KAFKA_BROKER_ID: 1
KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_INTERNAL:PLAINTEXT
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:9092 \
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
kafka:
image: confluentinc/cp-enterprise-kafka:5.5.3
depends_on: [zookeeper]
environment:
KAFKA_ZOOKEEPER_CONNECT: "zookeeper:2181"
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
KAFKA_BROKER_ID: 1
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_JMX_PORT: 9991
ports:
- 9092:9092
debezium:
image: debezium/connect:1.4
environment:
BOOTSTRAP_SERVERS: kafka:9092
GROUP_ID: 1
CONFIG_STORAGE_TOPIC: connect_configs
OFFSET_STORAGE_TOPIC: connect_offsets
KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: http://schema- registry:8081
CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
depends_on: [kafka]
ports:
- 8083:8083
schema-registry:
image: confluentinc/cp-schema-registry:5.5.3
environment:
- SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=zookeeper:2181
- SCHEMA_REGISTRY_HOST_NAME=schema-registry
- SCHEMA_REGISTRY_LISTENERS=http://schema- registry:8081,http://localhost:8081
ports:
- 8081:8081
depends_on: [zookeeper, kafka]
All operations with Postgres in the code are done through SQLAlchemy.
If anyone has had this error, please tell me how you handled it and how I can fix it:
Avro/Schema-registry message deserialization: REST request failed
SCHEMA_REGISTRY_LISTENERS=http://schema- registry:8081,http://localhost:8081
This needs to be http://0.0.0.0:8081, which is the default value.
Also, you need to remove the space in schema- registry in all the other places it is used.
Alternatively, don't use AvroConverters, and use something else that doesn't need the Registry, such as JSONConverter
KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
You can also remove the broker service, since it is never used in your Compose file.
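With the JsonConverter route, only the converter settings on the debezium service have to change; a sketch (the two schema registry URL variables can then be dropped, and kafkacat reads the topic without -s value=avro or the -r flag):

  debezium:
    image: debezium/connect:1.4
    environment:
      BOOTSTRAP_SERVERS: kafka:9092
      GROUP_ID: 1
      CONFIG_STORAGE_TOPIC: connect_configs
      OFFSET_STORAGE_TOPIC: connect_offsets
      KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
    depends_on: [kafka]
    ports:
      - 8083:8083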

Schema registry kafka connect Mongodb

I am trying to send data to MongoDB through Kafka using the Schema Registry.
My docker-compose looks like:
schema-registry:
  image: confluentinc/cp-schema-registry:latest
  hostname: schema-registry
  container_name: schema-registry
  depends_on:
    - zookeeper
    - broker
  ports:
    - "8081:8081"
  networks:
    - localnet
  environment:
    SCHEMA_REGISTRY_HOST_NAME: schema-registry
    SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: "PLAINTEXT://broker:29092"
    SCHEMA_REGISTRY_LISTENERS: http://0.0.0.0:8081
connect:
  image: quickstart-connect-1.7.0:1.0
  build:
    context: .
    dockerfile: Dockerfile-MongoConnect
  hostname: connect
  container_name: connect
  depends_on:
    - zookeeper
    - broker
    - schema-registry
  networks:
    - localnet
  environment:
    CONNECT_BOOTSTRAP_SERVERS: "broker:29092"
    CONNECT_REST_ADVERTISED_HOST_NAME: connect
    CONNECT_REST_PORT: 8083
    CONNECT_GROUP_ID: connect-cluster-group
    CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
    CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
    CONNECT_OFFSET_FLUSH_INTERVAL_MS: 10000
    CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
    CONNECT_KEY_CONVERTER: "io.confluent.connect.avro.AvroConverter"
    CONNECT_VALUE_CONVERTER: "io.confluent.connect.avro.AvroConverter"
    CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: "http://localhost:8081"
    CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: "http://localhost:8081"
    CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
    CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
    CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
    CONNECT_ZOOKEEPER_CONNECT: "zookeeper:2181"
    CONNECT_PLUGIN_PATH: "/usr/share/java,/usr/share/confluent-hub-components"
    CONNECT_AUTO_CREATE_TOPICS_ENABLE: "true"
    CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
    CONNECT_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
and my sink connector is:
curl -X POST \
  -H "Content-Type: application/json" \
  --data '
{
  "name": "mongo-sink-dinos",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
    "connection.uri": "mongodb://mongo1:27017/?replicaSet=rs0",
    "database": "quickstart",
    "collection": "abcd",
    "topics": "abcdf",
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://localhost:8081",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://localhost:8081"
  }
}
' \
http://connect:8083/connectors -w "\n"
I use the command:
kafka-avro-console-producer --topic dinosaurs --broker-list broker:29092 \
  --property schema.registry.url="http://localhost:8081" \
  --property value.schema="$(< abc.avsc)"
(I have a file abc.avsc in the schema registry)
But the data is not pushed to MongoDB; it is received by the consumer, though. When I check the Connect logs they show:
org.apache.kafka.common.config.ConfigException: Missing required configuration "schema.registry.url" which has no default value.
What might be the reason that the data is not pushed to MongoDB?

Broken DAG: [/usr/local/airflow/dags/my_dag.py] No module named 'airflow.operators.subdag'

I'm running Airflow inside a Docker container, pulling the Airflow image (puckel/docker-airflow:latest) from Docker Hub. I can access the Airflow UI through localhost:8080, but the DAG does not execute and I get the error mentioned in the subject above. I'm even adding pip commands to install apache-airflow in my Dockerfile. Here is what my Dockerfile, docker-compose.yml and dag.py look like:
Dockerfile:
FROM puckel/docker-airflow:latest
RUN pip install requests
RUN pip install pandas
RUN pip install 'apache-airflow'
docker-compose.yml:
version: '3.7'
services:
  redis:
    image: redis:5.0.5
    environment:
      REDIS_HOST: redis
      REDIS_PORT: 6379
    ports:
      - 6379:6379
  postgres:
    image: postgres:9.6
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
      - PGDATA=/var/lib/postgresql/data/pgdata
    volumes:
      - ./pgdata:/var/lib/postgresql/data/pgdata
    logging:
      options:
        max-size: 10m
        max-file: "3"
  webserver:
    build: ./dockerfiles
    restart: always
    depends_on:
      - postgres
      - redis
    environment:
      - LOAD_EX=n
      - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
      - EXECUTOR=Celery
    logging:
      options:
        max-size: 10m
        max-file: "3"
    volumes:
      - ./dags:/usr/local/airflow/dags
      - ./config/airflow.cfg:/usr/local/airflow/airflow.cfg
    ports:
      - "8080:8080"
    command: webserver
    healthcheck:
      test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
      interval: 30s
      timeout: 30s
      retries: 3
  flower:
    build: ./dockerfiles
    restart: always
    depends_on:
      - redis
    environment:
      - EXECUTOR=Celery
    ports:
      - "5555:5555"
    command: flower
  scheduler:
    build: ./dockerfiles
    restart: always
    depends_on:
      - webserver
    volumes:
      - ./dags:/usr/local/airflow/dags
    environment:
      - LOAD_EX=n
      - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
      - EXECUTOR=Celery
    command: scheduler
  worker:
    build: ./dockerfiles
    restart: always
    depends_on:
      - scheduler
    volumes:
      - ./dags:/usr/local/airflow/dags
    environment:
      - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
      - EXECUTOR=Celery
    command: worker
dag.py:
from airflow import DAG
from airflow.operators.subdag import SubDagOperator
from airflow.operators.python import PythonOperator, BranchPythonOperator
from airflow.operators.bash import BashOperator
from datetime import datetime
from random import randint

def _choosing_best_model(ti):
    accuracies = ti.xcom_pull(task_ids=[
        'training_model_A',
        'training_model_B',
        'training_model_C'
    ])
    if max(accuracies) > 8:
        return 'accurate'
    return 'inaccurate'

def _training_model(model):
    return randint(1, 10)

with DAG("test",
         start_date=datetime(2021, 1, 1),
         schedule_interval='@daily',
         catchup=False) as dag:
    training_model_tasks = [
        PythonOperator(
            task_id=f"training_model_{model_id}",
            python_callable=_training_model,
            op_kwargs={
                "model": model_id
            }
        ) for model_id in ['A', 'B', 'C']
    ]
    choosing_best_model = BranchPythonOperator(
        task_id="choosing_best_model",
        python_callable=_choosing_best_model
    )
    accurate = BashOperator(
        task_id="accurate",
        bash_command="echo 'accurate'"
    )
    inaccurate = BashOperator(
        task_id="inaccurate",
        bash_command=" echo 'inaccurate'"
    )
    training_model_tasks >> choosing_best_model >> [accurate, inaccurate]
Am I missing anything here? Please let me know if you can. Thanks :)

DynamoDB connector in degraded state

I'm trying to write Kafka topic data to a local DynamoDB. However, the connector is always in a degraded state. Below are my connector config properties.
{
  "key.converter.schemas.enable": "false",
  "value.converter.schemas.enable": "false",
  "name": "dynamo-sink-connector",
  "connector.class": "io.confluent.connect.aws.dynamodb.DynamoDbSinkConnector",
  "tasks.max": "1",
  "key.converter": "org.apache.kafka.connect.json.JsonConverter",
  "value.converter": "org.apache.kafka.connect.json.JsonConverter",
  "topics": [
    "KAFKA_STOCK"
  ],
  "aws.dynamodb.pk.hash": "value.companySymbol",
  "aws.dynamodb.pk.sort": "value.txTime",
  "aws.dynamodb.endpoint": "http://localhost:8000",
  "confluent.topic.bootstrap.servers": [
    "broker:29092"
  ]
}
I was referring to https://github.com/RWaltersMA/mongo-source-sink and replaced the Mongo sink with a DynamoDB sink.
Could someone provide a simple working example, please?
Below is a full example of using the AWS DynamoDB sink connector.
Thanks to @OneCricketeer for his suggestions.
Dockerfile-DynamoDBConnect, which is referred to in the docker-compose.yml below:
FROM confluentinc/cp-kafka-connect:latest
ENV CONNECT_PLUGIN_PATH="/usr/share/java,/usr/share/confluent-hub-components"
RUN confluent-hub install --no-prompt confluentinc/kafka-connect-aws-dynamodb:latest
docker-compose.yml
version: '3.6'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    networks:
      - localnet
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  broker:
    image: confluentinc/cp-kafka:latest
    hostname: broker
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - "19092:19092"
      - "9092:9092"
    networks:
      - localnet
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:19092,PLAINTEXT_HOST://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
      CONFLUENT_SUPPORT_CUSTOMER_ID: 'anonymous'
  schema-registry:
    image: confluentinc/cp-schema-registry:latest
    hostname: schema-registry
    container_name: schema-registry
    depends_on:
      - zookeeper
      - broker
    ports:
      - "8081:8081"
    networks:
      - localnet
    environment:
      SCHEMA_REGISTRY_HOST_NAME: localhost
      SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: 'zookeeper:2181'
  dynamodb-local:
    command: "-jar DynamoDBLocal.jar -sharedDb -dbPath ./data"
    image: "amazon/dynamodb-local:latest"
    container_name: dynamodb-local
    ports:
      - "8000:8000"
    networks:
      - localnet
    volumes:
      - "./docker/dynamodb:/home/dynamodblocal/data"
    working_dir: /home/dynamodblocal
  connect:
    image: confluentinc/cp-kafka-connect-base:latest
    build:
      context: .
      dockerfile: Dockerfile-DynamoDBConnect
    hostname: connect
    container_name: connect
    depends_on:
      - zookeeper
      - broker
      - schema-registry
    ports:
      - "8083:8083"
    networks:
      - localnet
    environment:
      CONNECT_BOOTSTRAP_SERVERS: 'broker:19092'
      CONNECT_REST_ADVERTISED_HOST_NAME: connect
      CONNECT_REST_PORT: 8083
      CONNECT_GROUP_ID: compose-connect-group
      CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_FLUSH_INTERVAL_MS: 10000
      CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.storage.StringConverter"
      CONNECT_VALUE_CONVERTER: "io.confluent.connect.avro.AvroConverter"
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schema-registry:8081'
      CONNECT_INTERNAL_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_INTERNAL_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_LOG4J_ROOT_LOGLEVEL: "INFO"
      CONNECT_LOG4J_LOGGERS: "org.apache.kafka.connect.runtime.rest=WARN,org.reflections=ERROR"
      CONNECT_PLUGIN_PATH: "/usr/share/java,/usr/share/confluent-hub-components"
      CONNECT_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      # Assumes image is based on confluentinc/kafka-connect-datagen:latest which is pulling 5.3.0 Connect image
      CLASSPATH: "/usr/share/java/monitoring-interceptors/monitoring-interceptors-5.3.0.jar"
      CONNECT_PRODUCER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor"
      CONNECT_CONSUMER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor"
    command: "bash -c 'if [ ! -d /usr/share/confluent-hub-components/confluentinc-kafka-connect-datagen ]; then echo \"WARNING: Did not find directory for kafka-connect-datagen (did you remember to run: docker-compose up -d --build ?)\"; fi ; /etc/confluent/docker/run'"
    volumes:
      - ../build/confluent/kafka-connect-aws-dynamodb:/usr/share/confluent-hub-components/confluentinc-kafka-connect-aws-dynamodb
      - $HOME/.aws/credentialstest:/home/appuser/.aws/credentials
      - $HOME/.aws/configtest:/home/appuser/.aws/config
  rest-proxy:
    image: confluentinc/cp-kafka-rest:5.3.0
    depends_on:
      - zookeeper
      - broker
      - schema-registry
    ports:
      - "8082:8082"
    hostname: rest-proxy
    container_name: rest-proxy
    networks:
      - localnet
    environment:
      KAFKA_REST_HOST_NAME: rest-proxy
      KAFKA_REST_BOOTSTRAP_SERVERS: 'broker:19092'
      KAFKA_REST_LISTENERS: "http://0.0.0.0:8082"
      KAFKA_REST_SCHEMA_REGISTRY_URL: 'http://schema-registry:8081'
  control-center:
    image: confluentinc/cp-enterprise-control-center:6.0.0
    hostname: control-center
    container_name: control-center
    networks:
      - localnet
    depends_on:
      - broker
      - schema-registry
      - connect
      - dynamodb-local
    ports:
      - "9021:9021"
    environment:
      CONTROL_CENTER_BOOTSTRAP_SERVERS: PLAINTEXT://broker:19092
      CONTROL_CENTER_KAFKA_CodeCamp_BOOTSTRAP_SERVERS: PLAINTEXT://broker:19092
      CONTROL_CENTER_REPLICATION_FACTOR: 1
      CONTROL_CENTER_INTERNAL_TOPICS_PARTITIONS: 1
      CONTROL_CENTER_MONITORING_INTERCEPTOR_TOPIC_REPLICATION: 1
      CONTROL_CENTER_MONITORING_INTERCEPTOR_TOPIC_PARTITIONS: 1
      CONTROL_CENTER_METRICS_TOPIC_REPLICATION: 1
      CONTROL_CENTER_METRICS_TOPIC_PARTITIONS: 1
      # Amount of heap to use for internal caches. Increase for better throughput
      CONTROL_CENTER_STREAMS_CACHE_MAX_BYTES_BUFFERING: 100000000
      CONTROL_CENTER_STREAMS_CONSUMER_REQUEST_TIMEOUT_MS: "960032"
      CONTROL_CENTER_STREAMS_NUM_STREAM_THREADS: 1
      # HTTP and HTTPS to Control Center UI
      CONTROL_CENTER_REST_LISTENERS: http://0.0.0.0:9021
      PORT: 9021
      # Connect
      CONTROL_CENTER_CONNECT_CONNECT1_CLUSTER: http://connect:8083
      # Schema Registry
      CONTROL_CENTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
networks:
  localnet:
    attachable: true
Java example to publish messages in Avro format (imports added; Stock is assumed to be the Avro-generated record class):
import java.lang.invoke.MethodHandles;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import io.confluent.kafka.serializers.AbstractKafkaSchemaSerDeConfig;
import io.confluent.kafka.serializers.KafkaAvroSerializer;

public class AvroProducer {
    public static void main(String[] args) {
        // Variables (bootstrap server, topic name, logger)
        final Logger logger = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
        final String bootstrapServers = "127.0.0.1:9092";
        final String topicName = "stockdata";
        // Properties declaration (bootstrap server, key serializer, value serializer)
        // Note use of the ProducerConfig object
        Properties properties = new Properties();
        properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class.getName());
        properties.setProperty(AbstractKafkaSchemaSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, "http://localhost:8081");
        // Create Producer object
        // Note the generics
        KafkaProducer<String, Stock> producer = new KafkaProducer<>(properties);
        Stock stock = Stock.newBuilder()
                .setStockCode("APPL")
                .setStockName("Apple")
                .setStockPrice(150.0)
                .build();
        // Create ProducerRecord object
        ProducerRecord<String, Stock> rec = new ProducerRecord<>(topicName, stock);
        // Send data to the producer (optional callback)
        producer.send(rec);
        // Call producer flush() and/or close()
        producer.flush();
        producer.close();
    }
}
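To register the sink against this stack, a config like the one in the question can be posted to the Connect REST API once the containers are up. A sketch (the hostnames follow the compose file above; the topic and key fields are illustrative and assume the Stock record from the Java example):

curl -X POST -H "Content-Type: application/json" \
  --data '{
    "name": "dynamo-sink-connector",
    "config": {
      "connector.class": "io.confluent.connect.aws.dynamodb.DynamoDbSinkConnector",
      "tasks.max": "1",
      "topics": "stockdata",
      "key.converter": "org.apache.kafka.connect.storage.StringConverter",
      "value.converter": "io.confluent.connect.avro.AvroConverter",
      "value.converter.schema.registry.url": "http://schema-registry:8081",
      "aws.dynamodb.pk.hash": "value.stockCode",
      "aws.dynamodb.pk.sort": "value.stockName",
      "aws.dynamodb.endpoint": "http://dynamodb-local:8000",
      "confluent.topic.bootstrap.servers": "broker:19092",
      "confluent.topic.replication.factor": "1"
    }
  }' \
  http://localhost:8083/connectors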

How to create a subject for ksqlDB from a Kafka topic

I use a MySQL database. Suppose I have a table for orders, and using the Debezium MySQL connector for Kafka, the orders topic has been created. But I have trouble creating a stream in ksqlDB.
CREATE STREAM orders WITH (
kafka_topic = 'myserver.mydatabase.orders',
value_format = 'avro'
);
My docker-compose file looks like this:
zookeeper:
  image: confluentinc/cp-zookeeper:latest
  container_name: zookeeper
  privileged: true
  ports:
    - "2181:2181"
  environment:
    ZOOKEEPER_CLIENT_PORT: 2181
    ZOOKEEPER_TICK_TIME: 2000
kafka:
  image: confluentinc/cp-kafka:latest
  container_name: kafka
  depends_on:
    - zookeeper
  ports:
    - '9092:9092'
  environment:
    KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
    KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
    KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
schema-registry:
  image: confluentinc/cp-schema-registry:latest
  container_name: schema-registry
  depends_on:
    - kafka
    - zookeeper
  ports:
    - "8081:8081"
  environment:
    SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: "zookeeper:2181"
    SCHEMA_REGISTRY_HOST_NAME: schema-registry
kafka-connect:
  hostname: kafka-connect
  image: confluentinc/cp-kafka-connect:latest
  container_name: kafka-connect
  ports:
    - 8083:8083
  depends_on:
    - schema-registry
  environment:
    CONNECT_BOOTSTRAP_SERVERS: kafka:9092
    CONNECT_REST_PORT: 8083
    CONNECT_GROUP_ID: "quickstart-avro"
    CONNECT_CONFIG_STORAGE_TOPIC: "quickstart-avro-config"
    CONNECT_OFFSET_STORAGE_TOPIC: "quickstart-avro-offsets"
    CONNECT_STATUS_STORAGE_TOPIC: "quickstart-avro-status"
    CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
    CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
    CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
    CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
    CONNECT_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
    CONNECT_INTERNAL_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
    CONNECT_INTERNAL_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
    CONNECT_REST_ADVERTISED_HOST_NAME: "kafka-connect"
    CONNECT_LOG4J_ROOT_LOGLEVEL: DEBUG
    CONNECT_PLUGIN_PATH: "/usr/share/java,/etc/kafka-connect/jars"
  volumes:
    - $PWD/kafka/jars:/etc/kafka-connect/jars
ksqldb-server:
  image: confluentinc/ksqldb-server:latest
  hostname: ksqldb-server
  container_name: ksqldb-server
  depends_on:
    - kafka
  ports:
    - "8088:8088"
  environment:
    KSQL_LISTENERS: http://0.0.0.0:8088
    KSQL_BOOTSTRAP_SERVERS: "kafka:9092"
    KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true"
    KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true"
    KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
    KSQL_CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
ksqldb-cli:
  image: confluentinc/ksqldb-cli:latest
  container_name: ksqldb-cli
  depends_on:
    - kafka
    - ksqldb-server
    - schema-registry
  entrypoint: /bin/sh
  tty: true
A subject must be created for this table first. What is the difference between Avro and JSON here?
Using debezium mysql connect for Kafka
You can set that to use AvroConverter, then the subject will be created automatically
Otherwise, you can have KSQL use VALUE_FORMAT=JSON and you need to manually specify all the field names. Unclear what difference you're asking about (they are different serialization formats), but from a KSQL perspective, JSON alone is seen as plain-text (similar to DELIMITED) and needs to be parsed, as compared to the other formats like Avro where the schema+fields are already known.
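A sketch of the JSON variant with illustrative column names (with VALUE_FORMAT='JSON' every column has to be declared by hand, whereas with Avro, as in the statement at the top of the question, ksqlDB reads them from the registered subject):

CREATE STREAM orders_json (
  order_id INT,
  customer_id INT,
  order_total DOUBLE,
  created_at VARCHAR
) WITH (
  kafka_topic = 'myserver.mydatabase.orders',
  value_format = 'JSON'
);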
I solved the issue. Using this configuration, you can send the MySQL table to the topic without the before and after state:
CREATE SOURCE CONNECTOR final_connector WITH (
'connector.class' = 'io.debezium.connector.mysql.MySqlConnector',
'database.hostname' = 'mysql',
'database.port' = '3306',
'database.user' = 'root',
'database.password' = 'mypassword',
'database.allowPublicKeyRetrieval' = 'true',
'database.server.id' = '184055',
'database.server.name' = 'db',
'database.whitelist' = 'mydb',
'database.history.kafka.bootstrap.servers' = 'kafka:9092',
'database.history.kafka.topic' = 'mydb',
'table.whitelist' = 'mydb.user',
'include.schema.changes' = 'false',
'transforms'= 'unwrap,extractkey',
'transforms.unwrap.type'= 'io.debezium.transforms.ExtractNewRecordState',
'transforms.extractkey.type'= 'org.apache.kafka.connect.transforms.ExtractField$Key',
'transforms.extractkey.field'= 'id',
'key.converter'= 'org.apache.kafka.connect.converters.IntegerConverter',
'value.converter'= 'io.confluent.connect.avro.AvroConverter',
'value.converter.schema.registry.url'= 'http://schema-registry:8081'
);
and then create your stream simply.
This video can help a lot:
https://www.youtube.com/watch?v=2fUOi9wJPhk&t=1550s