Connect Kafka to Neo4j
I have created a simple Python script that generates user_id, receipent_id and amount, and I have created a Kafka producer and consumer. The Python code returns the data as JSON. Now I am trying to connect my data to Neo4j through Kafka, but I am unable to do it.
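For context, a minimal sketch of what such a producer can look like (assuming the kafka-python client, a broker on localhost:9092, and a topic named transactions — all placeholders, not details from the original post):

import json
import random
import uuid

from kafka import KafkaProducer  # pip install kafka-python

# Serialize each record dict to JSON bytes before sending
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

event = {
    "user_id": str(uuid.uuid4()),
    "receipent_id": str(uuid.uuid4()),
    "amount": round(random.uniform(1.0, 1000.0), 2),
}
producer.send("transactions", event)
producer.flush()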
https://neo4j.com/docs/kafka/quickstart-connect/
I started to check the documentation, but I get the error below when I directly copy this docker-compose.yml:
---
version: '2'
services:
  neo4j:
    image: neo4j:4.0.3-enterprise
    hostname: neo4j
    container_name: neo4j
    ports:
      - "7474:7474"
      - "7687:7687"
    environment:
      NEO4J_kafka_bootstrap_servers: broker:9093
      NEO4J_AUTH: neo4j/connect
      NEO4J_dbms_memory_heap_max__size: 8G
      NEO4J_ACCEPT_LICENSE_AGREEMENT: "yes"
  zookeeper:
    image: confluentinc/cp-zookeeper
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  broker:
    image: confluentinc/cp-enterprise-kafka
    hostname: broker
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    expose:
      - "9093"
    environment:
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:9093,OUTSIDE://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,OUTSIDE:PLAINTEXT
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9093,OUTSIDE://0.0.0.0:9092
      CONFLUENT_METRICS_REPORTER_BOOTSTRAP_SERVERS: broker:9093
      # workaround if we change to a custom name the schema_registry fails to start
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_METRIC_REPORTERS: io.confluent.metrics.reporter.ConfluentMetricsReporter
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
      CONFLUENT_METRICS_REPORTER_ZOOKEEPER_CONNECT: zookeeper:2181
      CONFLUENT_METRICS_REPORTER_TOPIC_REPLICAS: 1
      CONFLUENT_METRICS_ENABLE: 'true'
      CONFLUENT_SUPPORT_CUSTOMER_ID: 'anonymous'
  schema_registry:
    image: confluentinc/cp-schema-registry
    hostname: schema_registry
    container_name: schema_registry
    depends_on:
      - zookeeper
      - broker
    ports:
      - "8081:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema_registry
      SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: 'zookeeper:2181'
  connect:
    image: confluentinc/cp-kafka-connect
    hostname: connect
    container_name: connect
    depends_on:
      - zookeeper
      - broker
      - schema_registry
    ports:
      - "8083:8083"
    volumes:
      - ./plugins:/tmp/connect-plugins
    environment:
      CONNECT_BOOTSTRAP_SERVERS: 'broker:9093'
      CONNECT_REST_ADVERTISED_HOST_NAME: connect
      CONNECT_REST_PORT: 8083
      CONNECT_GROUP_ID: compose-connect-group
      CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_FLUSH_INTERVAL_MS: 10000
      CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schema_registry:8081'
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schema_registry:8081'
      CONNECT_INTERNAL_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_INTERNAL_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      CONNECT_PLUGIN_PATH: /usr/share/java,/tmp/connect-plugins
      CONNECT_LOG4J_LOGGERS: org.apache.zookeeper=DEBUG,org.I0Itec.zkclient=DEBUG,org.reflections=ERROR
  control-center:
    image: confluentinc/cp-enterprise-control-center
    hostname: control-center
    container_name: control-center
    depends_on:
      - zookeeper
      - broker
      - schema_registry
      - connect
    ports:
      - "9021:9021"
    environment:
      CONTROL_CENTER_BOOTSTRAP_SERVERS: 'broker:9093'
      CONTROL_CENTER_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      CONTROL_CENTER_CONNECT_CLUSTER: 'connect:8083'
      CONTROL_CENTER_REPLICATION_FACTOR: 1
      CONTROL_CENTER_INTERNAL_TOPICS_PARTITIONS: 1
      CONTROL_CENTER_MONITORING_INTERCEPTOR_TOPIC_PARTITIONS: 1
      CONFLUENT_METRICS_TOPIC_REPLICATION: 1
      PORT: 9021
(Screenshot: the running Docker containers)
I get the following error from the schema_registry container:
===> User
uid=1000(appuser) gid=1000(appuser) groups=1000(appuser)
===> Configuring ...
===> Running preflight checks ...
===> Check if Zookeeper is healthy ...
[2022-12-14 12:18:38,319] INFO Client environment:zookeeper.version=3.6.3--6401e4ad2087061bc6b9f80dec2d69f2e3c8660a, built on 04/08/2021 16:35 GMT (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:host.name=schema_registry (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:java.version=11.0.16.1 (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:java.vendor=Azul Systems, Inc. (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:java.home=/usr/lib/jvm/zulu11-ca (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:java.class.path=/usr/share/java/cp-base-new/disk-usage-agent-7.3.0.jar:/usr/share/java/cp-base-new/reload4j-1.2.19.jar:/usr/share/java/cp-base-new/kafka-server-common-7.3.0-ccs.jar:/usr/share/java/cp-base-new/jopt-simple-5.0.4.jar:/usr/share/java/cp-base-new/scala-logging_2.13-3.9.4.jar:/usr/share/java/cp-base-new/scala-java8-compat_2.13-1.0.2.jar:/usr/share/java/cp-base-new/zookeeper-3.6.3.jar:/usr/share/java/cp-base-new/json-simple-1.1.1.jar:/usr/share/java/cp-base-new/metrics-core-2.2.0.jar:/usr/share/java/cp-base-new/audience-annotations-0.5.0.jar:/usr/share/java/cp-base-new/kafka-storage-api-7.3.0-ccs.jar:/usr/share/java/cp-base-new/kafka-clients-7.3.0-ccs.jar:/usr/share/java/cp-base-new/slf4j-reload4j-1.7.36.jar:/usr/share/java/cp-base-new/snappy-java-1.1.8.4.jar:/usr/share/java/cp-base-new/commons-cli-1.4.jar:/usr/share/java/cp-base-new/scala-collection-compat_2.13-2.6.0.jar:/usr/share/java/cp-base-new/jackson-core-2.13.2.jar:/usr/share/java/cp-base-new/jmx_prometheus_javaagent-0.14.0.jar:/usr/share/java/cp-base-new/kafka-raft-7.3.0-ccs.jar:/usr/share/java/cp-base-new/jackson-module-scala_2.13-2.13.2.jar:/usr/share/java/cp-base-new/re2j-1.6.jar:/usr/share/java/cp-base-new/jose4j-0.7.9.jar:/usr/share/java/cp-base-new/snakeyaml-1.30.jar:/usr/share/java/cp-base-new/logredactor-metrics-1.0.10.jar:/usr/share/java/cp-base-new/logredactor-1.0.10.jar:/usr/share/java/cp-base-new/jackson-dataformat-yaml-2.13.2.jar:/usr/share/java/cp-base-new/kafka_2.13-7.3.0-ccs.jar:/usr/share/java/cp-base-new/kafka-storage-7.3.0-ccs.jar:/usr/share/java/cp-base-new/utility-belt-7.3.0.jar:/usr/share/java/cp-base-new/jackson-annotations-2.13.2.jar:/usr/share/java/cp-base-new/minimal-json-0.9.5.jar:/usr/share/java/cp-base-new/lz4-java-1.8.0.jar:/usr/share/java/cp-base-new/zookeeper-jute-3.6.3.jar:/usr/share/java/cp-base-new/zstd-jni-1.5.2-1.jar:/usr/share/java/cp-base-new/jackson-dataformat-csv-2.13.2.jar:/usr/share/java/cp-base-new/slf4j-api-1.7.36.jar:/usr/share/java/cp-base-new/jackson-databind-2.13.2.2.jar:/usr/share/java/cp-base-new/jolokia-jvm-1.7.1.jar:/usr/share/java/cp-base-new/paranamer-2.8.jar:/usr/share/java/cp-base-new/gson-2.9.0.jar:/usr/share/java/cp-base-new/metrics-core-4.1.12.1.jar:/usr/share/java/cp-base-new/kafka-metadata-7.3.0-ccs.jar:/usr/share/java/cp-base-new/jackson-datatype-jdk8-2.13.2.jar:/usr/share/java/cp-base-new/common-utils-7.3.0.jar:/usr/share/java/cp-base-new/scala-reflect-2.13.5.jar:/usr/share/java/cp-base-new/scala-library-2.13.5.jar:/usr/share/java/cp-base-new/argparse4j-0.7.0.jar:/usr/share/java/cp-base-new/jolokia-core-1.7.1.jar (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:java.library.path=/usr/java/packages/lib:/usr/lib64:/lib64:/lib:/usr/lib (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:java.io.tmpdir=/tmp (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:java.compiler=<NA> (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:os.name=Linux (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:os.arch=amd64 (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:os.version=5.10.104-linuxkit (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:user.name=appuser (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:user.home=/home/appuser (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:user.dir=/home/appuser (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:os.memory.free=51MB (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:os.memory.max=952MB (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,320] INFO Client environment:os.memory.total=60MB (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,326] INFO Initiating client connection, connectString=zookeeper:2181 sessionTimeout=40000 watcher=io.confluent.admin.utils.ZookeeperConnectionWatcher@3c0a50da (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,332] INFO Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation (org.apache.zookeeper.common.X509Util)
[2022-12-14 12:18:38,341] INFO jute.maxbuffer value is 1048575 Bytes (org.apache.zookeeper.ClientCnxnSocket)
[2022-12-14 12:18:38,351] INFO zookeeper.request.timeout value is 0. feature enabled=false (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:38,372] INFO Opening socket connection to server zookeeper/172.18.0.2:2181. (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:38,375] INFO SASL config status: Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:38,388] INFO Socket connection established, initiating session, client: /172.18.0.5:47172, server: zookeeper/172.18.0.2:2181 (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:38,542] INFO Session establishment complete on server zookeeper/172.18.0.2:2181, session id = 0x10000250f890000, negotiated timeout = 40000 (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:38,587] WARN An exception was thrown while closing send thread for session 0x10000250f890000. (org.apache.zookeeper.ClientCnxn)
EndOfStreamException: Unable to read additional data from server sessionid 0x10000250f890000, likely server has closed socket
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1290)
[2022-12-14 12:18:38,699] INFO Session: 0x10000250f890000 closed (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:38,699] INFO EventThread shut down for session: 0x10000250f890000 (org.apache.zookeeper.ClientCnxn)
Using log4j config /etc/schema-registry/log4j.properties
===> Check if Kafka is healthy ...
[2022-12-14 12:18:39,567] INFO Client environment:zookeeper.version=3.6.3--6401e4ad2087061bc6b9f80dec2d69f2e3c8660a, built on 04/08/2021 16:35 GMT (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,567] INFO Client environment:host.name=schema_registry (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:java.version=11.0.16.1 (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:java.vendor=Azul Systems, Inc. (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:java.home=/usr/lib/jvm/zulu11-ca (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:java.class.path=/usr/share/java/cp-base-new/disk-usage-agent-7.3.0.jar:/usr/share/java/cp-base-new/reload4j-1.2.19.jar:/usr/share/java/cp-base-new/kafka-server-common-7.3.0-ccs.jar:/usr/share/java/cp-base-new/jopt-simple-5.0.4.jar:/usr/share/java/cp-base-new/scala-logging_2.13-3.9.4.jar:/usr/share/java/cp-base-new/scala-java8-compat_2.13-1.0.2.jar:/usr/share/java/cp-base-new/zookeeper-3.6.3.jar:/usr/share/java/cp-base-new/json-simple-1.1.1.jar:/usr/share/java/cp-base-new/metrics-core-2.2.0.jar:/usr/share/java/cp-base-new/audience-annotations-0.5.0.jar:/usr/share/java/cp-base-new/kafka-storage-api-7.3.0-ccs.jar:/usr/share/java/cp-base-new/kafka-clients-7.3.0-ccs.jar:/usr/share/java/cp-base-new/slf4j-reload4j-1.7.36.jar:/usr/share/java/cp-base-new/snappy-java-1.1.8.4.jar:/usr/share/java/cp-base-new/commons-cli-1.4.jar:/usr/share/java/cp-base-new/scala-collection-compat_2.13-2.6.0.jar:/usr/share/java/cp-base-new/jackson-core-2.13.2.jar:/usr/share/java/cp-base-new/jmx_prometheus_javaagent-0.14.0.jar:/usr/share/java/cp-base-new/kafka-raft-7.3.0-ccs.jar:/usr/share/java/cp-base-new/jackson-module-scala_2.13-2.13.2.jar:/usr/share/java/cp-base-new/re2j-1.6.jar:/usr/share/java/cp-base-new/jose4j-0.7.9.jar:/usr/share/java/cp-base-new/snakeyaml-1.30.jar:/usr/share/java/cp-base-new/logredactor-metrics-1.0.10.jar:/usr/share/java/cp-base-new/logredactor-1.0.10.jar:/usr/share/java/cp-base-new/jackson-dataformat-yaml-2.13.2.jar:/usr/share/java/cp-base-new/kafka_2.13-7.3.0-ccs.jar:/usr/share/java/cp-base-new/kafka-storage-7.3.0-ccs.jar:/usr/share/java/cp-base-new/utility-belt-7.3.0.jar:/usr/share/java/cp-base-new/jackson-annotations-2.13.2.jar:/usr/share/java/cp-base-new/minimal-json-0.9.5.jar:/usr/share/java/cp-base-new/lz4-java-1.8.0.jar:/usr/share/java/cp-base-new/zookeeper-jute-3.6.3.jar:/usr/share/java/cp-base-new/zstd-jni-1.5.2-1.jar:/usr/share/java/cp-base-new/jackson-dataformat-csv-2.13.2.jar:/usr/share/java/cp-base-new/slf4j-api-1.7.36.jar:/usr/share/java/cp-base-new/jackson-databind-2.13.2.2.jar:/usr/share/java/cp-base-new/jolokia-jvm-1.7.1.jar:/usr/share/java/cp-base-new/paranamer-2.8.jar:/usr/share/java/cp-base-new/gson-2.9.0.jar:/usr/share/java/cp-base-new/metrics-core-4.1.12.1.jar:/usr/share/java/cp-base-new/kafka-metadata-7.3.0-ccs.jar:/usr/share/java/cp-base-new/jackson-datatype-jdk8-2.13.2.jar:/usr/share/java/cp-base-new/common-utils-7.3.0.jar:/usr/share/java/cp-base-new/scala-reflect-2.13.5.jar:/usr/share/java/cp-base-new/scala-library-2.13.5.jar:/usr/share/java/cp-base-new/argparse4j-0.7.0.jar:/usr/share/java/cp-base-new/jolokia-core-1.7.1.jar (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:java.library.path=/usr/java/packages/lib:/usr/lib64:/lib64:/lib:/usr/lib (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:java.io.tmpdir=/tmp (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:java.compiler=<NA> (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:os.name=Linux (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:os.arch=amd64 (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:os.version=5.10.104-linuxkit (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:user.name=appuser (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:user.home=/home/appuser (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,568] INFO Client environment:user.dir=/home/appuser (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,569] INFO Client environment:os.memory.free=50MB (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,569] INFO Client environment:os.memory.max=952MB (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,569] INFO Client environment:os.memory.total=60MB (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,574] INFO Initiating client connection, connectString=zookeeper:2181 sessionTimeout=40000 watcher=io.confluent.admin.utils.ZookeeperConnectionWatcher@221af3c0 (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,578] INFO Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation (org.apache.zookeeper.common.X509Util)
[2022-12-14 12:18:39,587] INFO jute.maxbuffer value is 1048575 Bytes (org.apache.zookeeper.ClientCnxnSocket)
[2022-12-14 12:18:39,597] INFO zookeeper.request.timeout value is 0. feature enabled=false (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,621] INFO Opening socket connection to server zookeeper/172.18.0.2:2181. (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,623] INFO SASL config status: Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,639] INFO Socket connection established, initiating session, client: /172.18.0.5:47176, server: zookeeper/172.18.0.2:2181 (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,658] INFO Session establishment complete on server zookeeper/172.18.0.2:2181, session id = 0x10000250f890001, negotiated timeout = 40000 (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,678] WARN An exception was thrown while closing send thread for session 0x10000250f890001. (org.apache.zookeeper.ClientCnxn)
EndOfStreamException: Unable to read additional data from server sessionid 0x10000250f890001, likely server has closed socket
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1290)
[2022-12-14 12:18:39,785] INFO Session: 0x10000250f890001 closed (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,785] INFO EventThread shut down for session: 0x10000250f890001 (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,785] INFO Initiating client connection, connectString=zookeeper:2181 sessionTimeout=40000 watcher=io.confluent.admin.utils.ZookeeperConnectionWatcher@55a1c291 (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,786] INFO jute.maxbuffer value is 1048575 Bytes (org.apache.zookeeper.ClientCnxnSocket)
[2022-12-14 12:18:39,786] INFO zookeeper.request.timeout value is 0. feature enabled=false (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,787] INFO Opening socket connection to server zookeeper/172.18.0.2:2181. (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,787] INFO SASL config status: Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,788] INFO Socket connection established, initiating session, client: /172.18.0.5:47178, server: zookeeper/172.18.0.2:2181 (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,799] INFO Session establishment complete on server zookeeper/172.18.0.2:2181, session id = 0x10000250f890002, negotiated timeout = 40000 (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:39,979] INFO Session: 0x10000250f890002 closed (org.apache.zookeeper.ZooKeeper)
[2022-12-14 12:18:39,979] INFO EventThread shut down for session: 0x10000250f890002 (org.apache.zookeeper.ClientCnxn)
[2022-12-14 12:18:40,122] INFO AdminClientConfig values:
bootstrap.servers = [broker:9093]
client.dns.lookup = use_all_dns_ips
client.id =
connections.max.idle.ms = 300000
default.api.timeout.ms = 60000
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retries = 2147483647
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.connect.timeout.ms = null
sasl.login.read.timeout.ms = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.login.retry.backoff.max.ms = 10000
sasl.login.retry.backoff.ms = 100
sasl.mechanism = GSSAPI
sasl.oauthbearer.clock.skew.seconds = 30
sasl.oauthbearer.expected.audience = null
sasl.oauthbearer.expected.issuer = null
sasl.oauthbearer.jwks.endpoint.refresh.ms = 3600000
sasl.oauthbearer.jwks.endpoint.retry.backoff.max.ms = 10000
sasl.oauthbearer.jwks.endpoint.retry.backoff.ms = 100
sasl.oauthbearer.jwks.endpoint.url = null
sasl.oauthbearer.scope.claim.name = scope
sasl.oauthbearer.sub.claim.name = sub
sasl.oauthbearer.token.endpoint.url = null
security.protocol = PLAINTEXT
security.providers = null
send.buffer.bytes = 131072
socket.connection.setup.timeout.max.ms = 30000
socket.connection.setup.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
ssl.endpoint.identification.algorithm = https
ssl.engine.factory.class = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.certificate.chain = null
ssl.keystore.key = null
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLSv1.3
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.certificates = null
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
(org.apache.kafka.clients.admin.AdminClientConfig)
[2022-12-14 12:18:40,312] INFO Kafka version: 7.3.0-ccs (org.apache.kafka.common.utils.AppInfoParser)
[2022-12-14 12:18:40,312] INFO Kafka commitId: b8341813ae2b0444 (org.apache.kafka.common.utils.AppInfoParser)
[2022-12-14 12:18:40,312] INFO Kafka startTimeMs: 1671020320310 (org.apache.kafka.common.utils.AppInfoParser)
Using log4j config /etc/schema-registry/log4j.properties
===> Launching ...
===> Launching schema-registry ...
[2022-12-14 12:18:41,910] INFO SchemaRegistryConfig values:
access.control.allow.headers =
access.control.allow.methods =
access.control.allow.origin =
access.control.skip.options = true
authentication.method = NONE
authentication.realm =
authentication.roles = [*]
authentication.skip.paths = []
avro.compatibility.level =
compression.enable = true
connector.connection.limit = 0
csrf.prevention.enable = false
csrf.prevention.token.endpoint = /csrf
csrf.prevention.token.expiration.minutes = 30
csrf.prevention.token.max.entries = 10000
debug = false
dos.filter.delay.ms = 100
dos.filter.enabled = false
dos.filter.insert.headers = true
dos.filter.ip.whitelist = []
dos.filter.managed.attr = false
dos.filter.max.idle.tracker.ms = 30000
dos.filter.max.requests.ms = 30000
dos.filter.max.requests.per.connection.per.sec = 25
dos.filter.max.requests.per.sec = 25
dos.filter.max.wait.ms = 50
dos.filter.throttle.ms = 30000
dos.filter.throttled.requests = 5
host.name = schema_registry
http2.enabled = true
idle.timeout.ms = 30000
inter.instance.headers.whitelist = []
inter.instance.protocol = http
kafkastore.bootstrap.servers = []
kafkastore.checkpoint.dir = /tmp
kafkastore.checkpoint.version = 0
kafkastore.connection.url = zookeeper:2181
kafkastore.group.id =
kafkastore.init.timeout.ms = 60000
kafkastore.sasl.kerberos.kinit.cmd = /usr/bin/kinit
kafkastore.sasl.kerberos.min.time.before.relogin = 60000
kafkastore.sasl.kerberos.service.name =
kafkastore.sasl.kerberos.ticket.renew.jitter = 0.05
kafkastore.sasl.kerberos.ticket.renew.window.factor = 0.8
kafkastore.sasl.mechanism = GSSAPI
kafkastore.security.protocol = PLAINTEXT
kafkastore.ssl.cipher.suites =
kafkastore.ssl.enabled.protocols = TLSv1.2,TLSv1.1,TLSv1
kafkastore.ssl.endpoint.identification.algorithm =
kafkastore.ssl.key.password = [hidden]
kafkastore.ssl.keymanager.algorithm = SunX509
kafkastore.ssl.keystore.location =
kafkastore.ssl.keystore.password = [hidden]
kafkastore.ssl.keystore.type = JKS
kafkastore.ssl.protocol = TLS
kafkastore.ssl.provider =
kafkastore.ssl.trustmanager.algorithm = PKIX
kafkastore.ssl.truststore.location =
kafkastore.ssl.truststore.password = [hidden]
kafkastore.ssl.truststore.type = JKS
kafkastore.timeout.ms = 500
kafkastore.topic = _schemas
kafkastore.topic.replication.factor = 3
kafkastore.topic.skip.validation = false
kafkastore.update.handlers = []
kafkastore.write.max.retries = 5
leader.eligibility = true
listener.protocol.map = []
listeners = []
master.eligibility = null
metric.reporters = []
metrics.jmx.prefix = kafka.schema.registry
metrics.num.samples = 2
metrics.sample.window.ms = 30000
metrics.tag.map = []
mode.mutability = true
nosniff.prevention.enable = false
port = 8081
proxy.protocol.enabled = false
reject.options.request = false
request.logger.name = io.confluent.rest-utils.requests
request.queue.capacity = 2147483647
request.queue.capacity.growby = 64
request.queue.capacity.init = 128
resource.extension.class = []
resource.extension.classes = []
resource.static.locations = []
response.http.headers.config =
response.mediatype.default = application/vnd.schemaregistry.v1+json
response.mediatype.preferred = [application/vnd.schemaregistry.v1+json, application/vnd.schemaregistry+json, application/json]
rest.servlet.initializor.classes = []
schema.cache.expiry.secs = 300
schema.cache.size = 1000
schema.canonicalize.on.consume = []
schema.compatibility.level = backward
schema.providers = []
schema.registry.group.id = schema-registry
schema.registry.inter.instance.protocol =
schema.registry.resource.extension.class = []
server.connection.limit = 0
shutdown.graceful.ms = 1000
ssl.cipher.suites = []
ssl.client.auth = false
ssl.client.authentication = NONE
ssl.enabled.protocols = []
ssl.endpoint.identification.algorithm = null
ssl.key.password = [hidden]
ssl.keymanager.algorithm =
ssl.keystore.location =
ssl.keystore.password = [hidden]
ssl.keystore.reload = false
ssl.keystore.type = JKS
ssl.keystore.watch.location =
ssl.protocol = TLS
ssl.provider =
ssl.trustmanager.algorithm =
ssl.truststore.location =
ssl.truststore.password = [hidden]
ssl.truststore.type = JKS
suppress.stack.trace.response = true
thread.pool.max = 200
thread.pool.min = 8
websocket.path.prefix = /ws
websocket.servlet.initializor.classes = []
(io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig)
[2022-12-14 12:18:42,007] INFO Logging initialized #879ms to org.eclipse.jetty.util.log.Slf4jLog (org.eclipse.jetty.util.log)
[2022-12-14 12:18:42,066] INFO Initial capacity 128, increased by 64, maximum capacity 2147483647. (io.confluent.rest.ApplicationServer)
[2022-12-14 12:18:42,172] WARN DEPRECATION warning: `listeners` configuration is not configured. Falling back to the deprecated `port` configuration. (io.confluent.rest.ApplicationServer)
[2022-12-14 12:18:42,175] INFO Adding listener with HTTP/2: http://0.0.0.0:8081 (io.confluent.rest.ApplicationServer)
[2022-12-14 12:18:42,589] WARN DEPRECATION warning: `listeners` configuration is not configured. Falling back to the deprecated `port` configuration. (io.confluent.rest.ApplicationServer)
[2022-12-14 12:18:42,744] ERROR Server died unexpectedly: (io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain)
org.apache.kafka.common.config.ConfigException: No supported Kafka endpoints are configured. kafkastore.bootstrap.servers must have at least one endpoint matching kafkastore.security.protocol.
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig.endpointsToBootstrapServers(SchemaRegistryConfig.java:666)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig.bootstrapBrokers(SchemaRegistryConfig.java:615)
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.kafkaClusterId(KafkaSchemaRegistry.java:1566)
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.<init>(KafkaSchemaRegistry.java:171)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.initSchemaRegistry(SchemaRegistryRestApplication.java:71)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.configureBaseApplication(SchemaRegistryRestApplication.java:90)
at io.confluent.rest.Application.configureHandler(Application.java:285)
at io.confluent.rest.ApplicationServer.doStart(ApplicationServer.java:270)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:44)
The server dies unexpectedly because no supported Kafka endpoints are configured. I found similar questions that were asked about six years ago, so those did not help. I have searched the Confluent docs and tried different versions.
The error is informing you that you need to remove the (deprecated) property SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL and use SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS instead, set to broker:9093.
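For example, only the environment block of the schema_registry service needs to change (a minimal sketch against the compose file above; everything else stays the same):

  schema_registry:
    image: confluentinc/cp-schema-registry
    hostname: schema_registry
    container_name: schema_registry
    depends_on:
      - broker
    ports:
      - "8081:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema_registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: 'broker:9093'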
You could start with a working compose file then add Neo4j to that.
Note: The Schema Registry is not a requirement to use Kafka Connect with or without Neo4j, or any Python library.
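Once the stack is healthy, the Neo4j piece is a sink connector registered against the Connect REST API. A hedged sketch (the connector class and neo4j.* property names follow the Neo4j Streams Kafka Connect plugin docs for this era; the topic name transactions and the Cypher template over the user_id/receipent_id/amount fields are assumptions, not details from the original post):

curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  --data '{
    "name": "Neo4jSinkConnector",
    "config": {
      "connector.class": "streams.kafka.connect.sink.Neo4jSinkConnector",
      "topics": "transactions",
      "neo4j.server.uri": "bolt://neo4j:7687",
      "neo4j.authentication.basic.username": "neo4j",
      "neo4j.authentication.basic.password": "connect",
      "neo4j.topic.cypher.transactions": "MERGE (s:User {id: event.user_id}) MERGE (r:User {id: event.receipent_id}) MERGE (s)-[:SENT {amount: event.amount}]->(r)"
    }
  }'

Note that with the AvroConverter configured on the Connect worker above, the producer would have to write Avro; for plain JSON from Python, you could instead override key.converter and value.converter to JsonConverter in the connector config.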
Related
Can't read data from Postgres to Kafka
I connected Kafka to my Postgres database by Debezium in Docker. But when I run kafkacat in Docker to read Postgres, I get an error:

ERROR: Failed to format message in postgres.public.users [0] at offset 0: Avro/Schema-registry message deserialization: REST request failed (code -1): HTTP request failed: Couldn't resolve host name : terminating

I run kafkacat with the command:

docker run --tty --network pythonproject5_default confluentinc/cp-kafkacat kafkacat -b kafka:9092 -C -s key=s -s value=avro -r http://schema-regisrty:8081 -t postgres.public.users

The Debezium connector file looks like this:

{
  "name": "db-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "plugin.name": "pgoutput",
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "George",
    "database.password": "tech1337",
    "database.dbname": "tech_db",
    "database.server.name": "postgres",
    "table.include.list": "public.users"
  }
}

The schemas in the application look like this:

name: str
time_created: int
gender: str
age: int
last_name: str
ip: str
city: str
premium: bool = None
birth_day: str
balance: int
user_id: int

and the model like this:

class User(Base):
    __tablename__ = 'users'
    name = Column(String)
    time_created = Column(Integer)
    gender = Column(String)
    age = Column(Integer)
    last_name = Column(String)
    ip = Column(String)
    city = Column(String)
    premium = Column(Boolean)
    birth_day = Column(String)
    user_id = Column(Integer, primary_key=True, index=True)
    my_vet = relationship("VET", back_populates="owner")

docker-compose file:

version: "3.7"
services:
  postgres:
    image: debezium/postgres:13
    ports:
      - 5432:5432
    environment:
      - POSTGRES_USER=goerge
      - POSTGRES_PASSWORD=tech1337
      - POSTGRES_DB=5_pm_db
  zookeeper:
    image: confluentinc/cp-zookeeper:5.5.3
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  broker:
    image: confluentinc/cp-kafka:7.3.0
    container_name: broker
    ports:
      - "5056:5056"
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_INTERNAL:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
  kafka:
    image: confluentinc/cp-enterprise-kafka:5.5.3
    depends_on: [zookeeper]
    environment:
      KAFKA_ZOOKEEPER_CONNECT: "zookeeper:2181"
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_BROKER_ID: 1
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_JMX_PORT: 9991
    ports:
      - 9092:9092
  debezium:
    image: debezium/connect:1.4
    environment:
      BOOTSTRAP_SERVERS: kafka:9092
      GROUP_ID: 1
      CONFIG_STORAGE_TOPIC: connect_configs
      OFFSET_STORAGE_TOPIC: connect_offsets
      KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: http://schema- registry:8081
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
    depends_on: [kafka]
    ports:
      - 8083:8083
  schema-registry:
    image: confluentinc/cp-schema-registry:5.5.3
    environment:
      - SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=zookeeper:2181
      - SCHEMA_REGISTRY_HOST_NAME=schema-registry
      - SCHEMA_REGISTRY_LISTENERS=http://schema- registry:8081,http://localhost:8081
    ports:
      - 8081:8081
    depends_on: [zookeeper, kafka]

All operations with Postgres in the code are done via SQLAlchemy. If anyone has hit this error, please write how you handled it and how I can fix it.
Avro/Schema-registry message deserialization: REST request failed

This needs to be http://0.0.0.0:8081, which is the default value:

SCHEMA_REGISTRY_LISTENERS=http://schema- registry:8081,http://localhost:8081

Also, you need to remove the spaces in "schema- registry" in all the other places it is used.

Alternatively, don't use AvroConverters; use something else that doesn't need the Registry, such as JsonConverter:

KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter

You can also remove the broker service, since it is never used in your Compose file.
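Put together, the relevant parts would look something like this (a sketch of the answer's two suggestions, not a full tested file):

  schema-registry:
    image: confluentinc/cp-schema-registry:5.5.3
    environment:
      - SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=zookeeper:2181
      - SCHEMA_REGISTRY_HOST_NAME=schema-registry
      - SCHEMA_REGISTRY_LISTENERS=http://0.0.0.0:8081

or, Registry-free, in the debezium service:

      KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter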
Schema Registry, Kafka Connect and MongoDB
Trying to send data to MongoDB through Kafka using the Schema Registry. My docker-compose looks like:

  schema-registry:
    image: confluentinc/cp-schema-registry:latest
    hostname: schema-registry
    container_name: schema-registry
    depends_on:
      - zookeeper
      - broker
    ports:
      - "8081:8081"
    networks:
      - localnet
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: "PLAINTEXT://broker:29092"
      SCHEMA_REGISTRY_LISTENERS: http://0.0.0.0:8081
  connect:
    image: quickstart-connect-1.7.0:1.0
    build:
      context: .
      dockerfile: Dockerfile-MongoConnect
    hostname: connect
    container_name: connect
    depends_on:
      - zookeeper
      - broker
      - schema-registry
    networks:
      - localnet
    environment:
      CONNECT_BOOTSTRAP_SERVERS: "broker:29092"
      CONNECT_REST_ADVERTISED_HOST_NAME: connect
      CONNECT_REST_PORT: 8083
      CONNECT_GROUP_ID: connect-cluster-group
      CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_FLUSH_INTERVAL_MS: 10000
      CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
      CONNECT_KEY_CONVERTER: "io.confluent.connect.avro.AvroConverter"
      CONNECT_VALUE_CONVERTER: "io.confluent.connect.avro.AvroConverter"
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: "http://localhost:8081"
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: "http://localhost:8081"
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_ZOOKEEPER_CONNECT: "zookeeper:2181"
      CONNECT_PLUGIN_PATH: "/usr/share/java,/usr/share/confluent-hub-components"
      CONNECT_AUTO_CREATE_TOPICS_ENABLE: "true"
      CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"

and my sink connector is:

curl -X POST \
  -H "Content-Type: application/json" \
  --data '{
    "name": "mongo-sink-dinos",
    "config": {
      "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
      "connection.uri": "mongodb://mongo1:27017/?replicaSet=rs0",
      "database": "quickstart",
      "collection": "abcd",
      "topics": "abcdf",
      "key.converter": "io.confluent.connect.avro.AvroConverter",
      "value.converter.schema.registry.url": "http://localhost:8081",
      "value.converter": "io.confluent.connect.avro.AvroConverter",
      "value.converter.schema.registry.url": "http://localhost:8081"
    }
  }' \
  http://connect:8083/connectors -w "\n"

I use the command:

kafka-avro-console-producer --topic dinosaurs --broker-list broker:29092 \
  --property schema.registry.url="http://localhost:8081" \
  --property value.schema="$(< abc.avsc)"

(I have a file abc.avsc in the Schema Registry.) But the data is not pushed to MongoDB; it is actually received by the consumer. When I check the Connect logs, they show:

org.apache.kafka.common.config.ConfigException: Missing required configuration "schema.registry.url" which has no default value.

What might be the reason that the data is not pushed to MongoDB?
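No answer is recorded here, but the logged error points at the key converter: the sink config sets value.converter.schema.registry.url (twice) and never key.converter.schema.registry.url, and localhost inside the Connect container does not reach the registry container. A hedged sketch of the likely fix for the connector config (my inference, not from the original thread):

"key.converter": "io.confluent.connect.avro.AvroConverter",
"key.converter.schema.registry.url": "http://schema-registry:8081",
"value.converter": "io.confluent.connect.avro.AvroConverter",
"value.converter.schema.registry.url": "http://schema-registry:8081"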
Broken DAG: [/usr/local/airflow/dags/my_dag.py] No module named 'airflow.operators.subdag'
I'm running Airflow inside a Docker container, using the puckel/docker-airflow:latest image from Docker Hub. I can access the Airflow UI through localhost:8080, but the DAG does not execute and I get the error mentioned in the subject above. I'm even running a pip command to install apache-airflow in my Dockerfile. Here is how my Dockerfile, docker-compose.yml and dag.py look:

Dockerfile:

FROM puckel/docker-airflow:latest
RUN pip install requests
RUN pip install pandas
RUN pip install 'apache-airflow'

docker-compose.yml:

version: '3.7'
services:
  redis:
    image: redis:5.0.5
    environment:
      REDIS_HOST: redis
      REDIS_PORT: 6379
    ports:
      - 6379:6379
  postgres:
    image: postgres:9.6
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
      - PGDATA=/var/lib/postgresql/data/pgdata
    volumes:
      - ./pgdata:/var/lib/postgresql/data/pgdata
    logging:
      options:
        max-size: 10m
        max-file: "3"
  webserver:
    build: ./dockerfiles
    restart: always
    depends_on:
      - postgres
      - redis
    environment:
      - LOAD_EX=n
      - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
      - EXECUTOR=Celery
    logging:
      options:
        max-size: 10m
        max-file: "3"
    volumes:
      - ./dags:/usr/local/airflow/dags
      - ./config/airflow.cfg:/usr/local/airflow/airflow.cfg
    ports:
      - "8080:8080"
    command: webserver
    healthcheck:
      test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
      interval: 30s
      timeout: 30s
      retries: 3
  flower:
    build: ./dockerfiles
    restart: always
    depends_on:
      - redis
    environment:
      - EXECUTOR=Celery
    ports:
      - "5555:5555"
    command: flower
  scheduler:
    build: ./dockerfiles
    restart: always
    depends_on:
      - webserver
    volumes:
      - ./dags:/usr/local/airflow/dags
    environment:
      - LOAD_EX=n
      - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
      - EXECUTOR=Celery
    command: scheduler
  worker:
    build: ./dockerfiles
    restart: always
    depends_on:
      - scheduler
    volumes:
      - ./dags:/usr/local/airflow/dags
    environment:
      - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
      - EXECUTOR=Celery
    command: worker

dag.py:

from airflow import DAG
from airflow.operators.subdag import SubDagOperator
from airflow.operators.python import PythonOperator, BranchPythonOperator
from airflow.operators.bash import BashOperator
from datetime import datetime
from random import randint

def _choosing_best_model(ti):
    accuracies = ti.xcom_pull(task_ids=[
        'training_model_A',
        'training_model_B',
        'training_model_C'
    ])
    if max(accuracies) > 8:
        return 'accurate'
    return 'inaccurate'

def _training_model(model):
    return randint(1, 10)

with DAG("test",
         start_date=datetime(2021, 1, 1),
         schedule_interval='@daily',
         catchup=False) as dag:

    training_model_tasks = [
        PythonOperator(
            task_id=f"training_model_{model_id}",
            python_callable=_training_model,
            op_kwargs={"model": model_id}
        ) for model_id in ['A', 'B', 'C']
    ]

    choosing_best_model = BranchPythonOperator(
        task_id="choosing_best_model",
        python_callable=_choosing_best_model
    )

    accurate = BashOperator(
        task_id="accurate",
        bash_command="echo 'accurate'"
    )

    inaccurate = BashOperator(
        task_id="inaccurate",
        bash_command="echo 'inaccurate'"
    )

    training_model_tasks >> choosing_best_model >> [accurate, inaccurate]

Am I missing anything here? Please let me know if you can. Thanks :)
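No answer is recorded here, but the import paths are the likely culprit: puckel/docker-airflow ships Airflow 1.10, where these operators live under the old *_operator module names; airflow.operators.subdag, airflow.operators.python and airflow.operators.bash only exist in Airflow 2.x. A sketch of the 1.10-compatible imports (assuming the image really does pin 1.10):

# Airflow 1.10.x module paths, as used by puckel/docker-airflow
from airflow.operators.subdag_operator import SubDagOperator
from airflow.operators.python_operator import PythonOperator, BranchPythonOperator
from airflow.operators.bash_operator import BashOperator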
DynamoDB connector in degraded state
I'm trying to write Kafka topic data to a local DynamoDB. However, the connector state is always degraded. Below are my connector config properties:

{
  "key.converter.schemas.enable": "false",
  "value.converter.schemas.enable": "false",
  "name": "dynamo-sink-connector",
  "connector.class": "io.confluent.connect.aws.dynamodb.DynamoDbSinkConnector",
  "tasks.max": "1",
  "key.converter": "org.apache.kafka.connect.json.JsonConverter",
  "value.converter": "org.apache.kafka.connect.json.JsonConverter",
  "topics": ["KAFKA_STOCK"],
  "aws.dynamodb.pk.hash": "value.companySymbol",
  "aws.dynamodb.pk.sort": "value.txTime",
  "aws.dynamodb.endpoint": "http://localhost:8000",
  "confluent.topic.bootstrap.servers": ["broker:29092"]
}

I was referring to https://github.com/RWaltersMA/mongo-source-sink and replaced the Mongo sink with the DynamoDB sink. Could someone provide a simple working example, please?
Below is the full example for using the AWS DynamoDB sink connector. Thanks to @OneCricketeer for his suggestions.

Dockerfile-DynamoDBConnect, which is referred to in the docker-compose.yml below:

FROM confluentinc/cp-kafka-connect:latest
ENV CONNECT_PLUGIN_PATH="/usr/share/java,/usr/share/confluent-hub-components"
RUN confluent-hub install --no-prompt confluentinc/kafka-connect-aws-dynamodb:latest

docker-compose.yml:

version: '3.6'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    networks:
      - localnet
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  broker:
    image: confluentinc/cp-kafka:latest
    hostname: broker
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - "19092:19092"
      - "9092:9092"
    networks:
      - localnet
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:19092,PLAINTEXT_HOST://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
      CONFLUENT_SUPPORT_CUSTOMER_ID: 'anonymous'
  schema-registry:
    image: confluentinc/cp-schema-registry:latest
    hostname: schema-registry
    container_name: schema-registry
    depends_on:
      - zookeeper
      - broker
    ports:
      - "8081:8081"
    networks:
      - localnet
    environment:
      SCHEMA_REGISTRY_HOST_NAME: localhost
      SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: 'zookeeper:2181'
  dynamodb-local:
    command: "-jar DynamoDBLocal.jar -sharedDb -dbPath ./data"
    image: "amazon/dynamodb-local:latest"
    container_name: dynamodb-local
    ports:
      - "8000:8000"
    networks:
      - localnet
    volumes:
      - "./docker/dynamodb:/home/dynamodblocal/data"
    working_dir: /home/dynamodblocal
  connect:
    image: confluentinc/cp-kafka-connect-base:latest
    build:
      context: .
      dockerfile: Dockerfile-DynamoDBConnect
    hostname: connect
    container_name: connect
    depends_on:
      - zookeeper
      - broker
      - schema-registry
    ports:
      - "8083:8083"
    networks:
      - localnet
    environment:
      CONNECT_BOOTSTRAP_SERVERS: 'broker:19092'
      CONNECT_REST_ADVERTISED_HOST_NAME: connect
      CONNECT_REST_PORT: 8083
      CONNECT_GROUP_ID: compose-connect-group
      CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_FLUSH_INTERVAL_MS: 10000
      CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.storage.StringConverter"
      CONNECT_VALUE_CONVERTER: "io.confluent.connect.avro.AvroConverter"
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schema-registry:8081'
      CONNECT_INTERNAL_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_INTERNAL_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_LOG4J_ROOT_LOGLEVEL: "INFO"
      CONNECT_LOG4J_LOGGERS: "org.apache.kafka.connect.runtime.rest=WARN,org.reflections=ERROR"
      CONNECT_PLUGIN_PATH: "/usr/share/java,/usr/share/confluent-hub-components"
      CONNECT_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      # Assumes image is based on confluentinc/kafka-connect-datagen:latest which is pulling 5.3.0 Connect image
      CLASSPATH: "/usr/share/java/monitoring-interceptors/monitoring-interceptors-5.3.0.jar"
      CONNECT_PRODUCER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor"
      CONNECT_CONSUMER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor"
    command: "bash -c 'if [ ! -d /usr/share/confluent-hub-components/confluentinc-kafka-connect-datagen ]; then echo \"WARNING: Did not find directory for kafka-connect-datagen (did you remember to run: docker-compose up -d --build ?)\"; fi ; /etc/confluent/docker/run'"
    volumes:
      - ../build/confluent/kafka-connect-aws-dynamodb:/usr/share/confluent-hub-components/confluentinc-kafka-connect-aws-dynamodb
      - $HOME/.aws/credentialstest:/home/appuser/.aws/credentials
      - $HOME/.aws/configtest:/home/appuser/.aws/config
  rest-proxy:
    image: confluentinc/cp-kafka-rest:5.3.0
    depends_on:
      - zookeeper
      - broker
      - schema-registry
    ports:
      - "8082:8082"
    hostname: rest-proxy
    container_name: rest-proxy
    networks:
      - localnet
    environment:
      KAFKA_REST_HOST_NAME: rest-proxy
      KAFKA_REST_BOOTSTRAP_SERVERS: 'broker:19092'
      KAFKA_REST_LISTENERS: "http://0.0.0.0:8082"
      KAFKA_REST_SCHEMA_REGISTRY_URL: 'http://schema-registry:8081'
  control-center:
    image: confluentinc/cp-enterprise-control-center:6.0.0
    hostname: control-center
    container_name: control-center
    networks:
      - localnet
    depends_on:
      - broker
      - schema-registry
      - connect
      - dynamodb-local
    ports:
      - "9021:9021"
    environment:
      CONTROL_CENTER_BOOTSTRAP_SERVERS: PLAINTEXT://broker:19092
      CONTROL_CENTER_KAFKA_CodeCamp_BOOTSTRAP_SERVERS: PLAINTEXT://broker:19092
      CONTROL_CENTER_REPLICATION_FACTOR: 1
      CONTROL_CENTER_INTERNAL_TOPICS_PARTITIONS: 1
      CONTROL_CENTER_MONITORING_INTERCEPTOR_TOPIC_REPLICATION: 1
      CONTROL_CENTER_MONITORING_INTERCEPTOR_TOPIC_PARTITIONS: 1
      CONTROL_CENTER_METRICS_TOPIC_REPLICATION: 1
      CONTROL_CENTER_METRICS_TOPIC_PARTITIONS: 1
      # Amount of heap to use for internal caches. Increase for better throughput
      CONTROL_CENTER_STREAMS_CACHE_MAX_BYTES_BUFFERING: 100000000
      CONTROL_CENTER_STREAMS_CONSUMER_REQUEST_TIMEOUT_MS: "960032"
      CONTROL_CENTER_STREAMS_NUM_STREAM_THREADS: 1
      # HTTP and HTTPS to Control Center UI
      CONTROL_CENTER_REST_LISTENERS: http://0.0.0.0:9021
      PORT: 9021
      # Connect
      CONTROL_CENTER_CONNECT_CONNECT1_CLUSTER: http://connect:8083
      # Schema Registry
      CONTROL_CENTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
networks:
  localnet:
    attachable: true

Java example to publish messages in Avro format:

public class AvroProducer {
    public static void main(String[] args) {
        // Variables (bootstrap server, topic name, logger)
        final Logger logger = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
        final String bootstrapServers = "127.0.0.1:9092";
        final String topicName = "stockdata";

        // Properties declaration (bootstrap server, key serializer, value serializer)
        // Note use of the ProducerConfig object
        Properties properties = new Properties();
        properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class.getName());
        properties.setProperty(AbstractKafkaSchemaSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, "http://localhost:8081");

        // Create Producer object; note the generics
        KafkaProducer<String, Stock> producer = new KafkaProducer<>(properties);

        Stock stock = Stock.newBuilder()
                .setStockCode("APPL")
                .setStockName("Apple")
                .setStockPrice(150.0)
                .build();

        // Create ProducerRecord object
        ProducerRecord<String, Stock> rec = new ProducerRecord<>(topicName, stock);

        // Send data to the producer (optional callback)
        producer.send(rec);

        // Call producer flush() and/or close()
        producer.flush();
        producer.close();
    }
}
How to create a subject for ksqlDB from a Kafka topic
I use a MySQL database. Suppose I have a table for orders, and using the Debezium MySQL connector for Kafka, the orders topic has been created. But I have trouble creating a stream in ksqlDB:

CREATE STREAM orders WITH (
    kafka_topic = 'myserver.mydatabase.orders',
    value_format = 'avro'
);

My docker-compose file looks like this:

  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    privileged: true
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka
    depends_on:
      - zookeeper
    ports:
      - '9092:9092'
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
  schema-registry:
    image: confluentinc/cp-schema-registry:latest
    container_name: schema-registry
    depends_on:
      - kafka
      - zookeeper
    ports:
      - "8081:8081"
    environment:
      SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: "zookeeper:2181"
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
  kafka-connect:
    hostname: kafka-connect
    image: confluentinc/cp-kafka-connect:latest
    container_name: kafka-connect
    ports:
      - 8083:8083
    depends_on:
      - schema-registry
    environment:
      CONNECT_BOOTSTRAP_SERVERS: kafka:9092
      CONNECT_REST_PORT: 8083
      CONNECT_GROUP_ID: "quickstart-avro"
      CONNECT_CONFIG_STORAGE_TOPIC: "quickstart-avro-config"
      CONNECT_OFFSET_STORAGE_TOPIC: "quickstart-avro-offsets"
      CONNECT_STATUS_STORAGE_TOPIC: "quickstart-avro-status"
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_INTERNAL_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_INTERNAL_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_REST_ADVERTISED_HOST_NAME: "kafka-connect"
      CONNECT_LOG4J_ROOT_LOGLEVEL: DEBUG
      CONNECT_PLUGIN_PATH: "/usr/share/java,/etc/kafka-connect/jars"
    volumes:
      - $PWD/kafka/jars:/etc/kafka-connect/jars
  ksqldb-server:
    image: confluentinc/ksqldb-server:latest
    hostname: ksqldb-server
    container_name: ksqldb-server
    depends_on:
      - kafka
    ports:
      - "8088:8088"
    environment:
      KSQL_LISTENERS: http://0.0.0.0:8088
      KSQL_BOOTSTRAP_SERVERS: "kafka:9092"
      KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true"
      KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true"
      KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
      KSQL_CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
  ksqldb-cli:
    image: confluentinc/ksqldb-cli:latest
    container_name: ksqldb-cli
    depends_on:
      - kafka
      - ksqldb-server
      - schema-registry
    entrypoint: /bin/sh
    tty: true

A subject must be created for this table first. What is the difference between Avro and JSON here?
Using the Debezium MySQL connector for Kafka, you can set it to use the AvroConverter; then the subject will be created automatically.

Otherwise, you can have ksqlDB use VALUE_FORMAT=JSON, and you then need to manually specify all the field names.

It is unclear what difference you're asking about (they are different serialization formats), but from a ksqlDB perspective, JSON alone is seen as plain text (similar to DELIMITED) and needs to be parsed, as compared to other formats like Avro, where the schema and fields are already known.
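For instance, with VALUE_FORMAT=JSON the stream definition has to spell out the schema itself (a sketch; the column names and types are placeholders, since the orders table's columns aren't shown in the question):

CREATE STREAM orders (
    id INT,
    customer_id INT,
    amount DOUBLE
) WITH (
    kafka_topic = 'myserver.mydatabase.orders',
    value_format = 'JSON'
);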
I solved the issue. Using this configuration, you can send a MySQL table to the topic without the before/after state envelope:

CREATE SOURCE CONNECTOR final_connector WITH (
    'connector.class' = 'io.debezium.connector.mysql.MySqlConnector',
    'database.hostname' = 'mysql',
    'database.port' = '3306',
    'database.user' = 'root',
    'database.password' = 'mypassword',
    'database.allowPublicKeyRetrieval' = 'true',
    'database.server.id' = '184055',
    'database.server.name' = 'db',
    'database.whitelist' = 'mydb',
    'database.history.kafka.bootstrap.servers' = 'kafka:9092',
    'database.history.kafka.topic' = 'mydb',
    'table.whitelist' = 'mydb.user',
    'include.schema.changes' = 'false',
    'transforms' = 'unwrap,extractkey',
    'transforms.unwrap.type' = 'io.debezium.transforms.ExtractNewRecordState',
    'transforms.extractkey.type' = 'org.apache.kafka.connect.transforms.ExtractField$Key',
    'transforms.extractkey.field' = 'id',
    'key.converter' = 'org.apache.kafka.connect.converters.IntegerConverter',
    'value.converter' = 'io.confluent.connect.avro.AvroConverter',
    'value.converter.schema.registry.url' = 'http://schema-registry:8081'
);

Then create your stream simply. This video can help you a lot: https://www.youtube.com/watch?v=2fUOi9wJPhk&t=1550s
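With the Avro value converter in place, the stream definition can then pull its columns from the registered schema (a sketch; per Debezium's topic naming, the server name db and table mydb.user above would give the topic db.mydb.user):

CREATE STREAM user_stream WITH (
    kafka_topic = 'db.mydb.user',
    value_format = 'AVRO'
);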