Kafka docker image that works without zookeeper - apache-kafka

I read that Kafka no longer requires zookeeper, so I don't want to have zookeeper in docker-compose. But I don't know which kafka image can work w/o zookeeper. can anyone give a hint?

Here's a Kafka Docker image which doesn't required Zookeeper (as described above):
https://hub.docker.com/r/bashj79/kafka-kraft
Disclaimer: I'm the author.

Confluent published a working docker-compose.yaml without zookeeper in their repository cp-all-in-one.
There is a script used as a workaround
#!/bin/sh
# Docker workaround: Remove check for KAFKA_ZOOKEEPER_CONNECT parameter
sed -i '/KAFKA_ZOOKEEPER_CONNECT/d' /etc/confluent/docker/configure
# Docker workaround: Ignore cub zk-ready
sed -i 's/cub zk-ready/echo ignore zk-ready/' /etc/confluent/docker/ensure
# KRaft required step: Format the storage directory with a new cluster ID
echo "kafka-storage format --ignore-formatted -t $(kafka-storage random-uuid) -c /etc/kafka/kafka.properties" >> /etc/confluent/docker/ensure
which is called in the command of the docker-compose-setup
broker:
image: confluentinc/cp-kafka:7.2.x-latest
hostname: broker
container_name: broker
ports:
- "9092:9092"
- "9101:9101"
environment:
KAFKA_BROKER_ID: 1
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT'
KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092'
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
KAFKA_JMX_PORT: 9101
KAFKA_JMX_HOSTNAME: localhost
KAFKA_PROCESS_ROLES: 'broker,controller'
KAFKA_NODE_ID: 1
KAFKA_CONTROLLER_QUORUM_VOTERS: '1#broker:29093'
KAFKA_LISTENERS: 'PLAINTEXT://broker:29092,CONTROLLER://broker:29093,PLAINTEXT_HOST://0.0.0.0:9092'
KAFKA_INTER_BROKER_LISTENER_NAME: 'PLAINTEXT'
KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
KAFKA_LOG_DIRS: '/tmp/kraft-combined-logs'
volumes:
- ./update_run.sh:/tmp/update_run.sh
command: "bash -c 'if [ ! -f /tmp/update_run.sh ]; then echo \"ERROR: Did you forget the update_run.sh file that came with this docker-compose.yml file?\" && exit 1 ; else /tmp/update_run.sh && /etc/confluent/docker/run ; fi'"

I read that Kafka no longer requires zookeeper
You may well have read that in the future Apache Kafka will not need Zookeeper - this is detailed in KIP-500
However, this is not yet implemented, so for the time being (January 2021) you will still need a Zookeeper in your Docker Compose ensemble.

You can use this image for no Zookeeper.
https://hub.docker.com/r/bitnami/kafka
Here is a example yaml.
version: "3"
services:
kafka:
image: 'bitnami/kafka:3.2.3'
restart: "no"
privileged: true
ports:
- 2181:2181
- 19092:19092
environment:
- KAFKA_ENABLE_KRAFT=yes
- KAFKA_CFG_PROCESS_ROLES=broker,controller
- KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
- KAFKA_CFG_LISTENERS=PLAINTEXT://:19092,CONTROLLER://:2181
- KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
- KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://127.0.0.1:19092
- KAFKA_BROKER_ID=1
- KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=1#127.0.0.1:2181
- ALLOW_PLAINTEXT_LISTENER=yes

According to "What’s New in Apache Kafka 3.3" document and "KIP-833: Mark KRaft as Production Ready" Kafka can work without Zookeeper (but there are some features yet works only by Apache ZooKeeper (ZK) mode).
Example (docker-compose.yml):
version: "2.5"
volumes:
volume1:
services:
kafka1:
image: 'bitnami/kafka:3.3.1'
container_name: kafka
environment:
- KAFKA_ENABLE_KRAFT=yes
- KAFKA_CFG_PROCESS_ROLES=broker,controller
- KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
- KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093
- KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
- KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka1:9092
- KAFKA_CFG_BROKER_ID=1
- KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=1#kafka1:9093
- ALLOW_PLAINTEXT_LISTENER=yes
- KAFKA_KRAFT_CLUSTER_ID=r4zt_wrqTRuT7W2NJsB_GA
volumes:
- volume1:/bitnami/kafka
kafka-ui:
container_name: kafka-ui
image: 'provectuslabs/kafka-ui:latest'
ports:
- "8080:8080"
environment:
- KAFKA_CLUSTERS_0_BOOTSTRAP_SERVERS=kafka1:9092
- KAFKA_CLUSTERS_0_NAME=r4zt_wrqTRuT7W2NJsB_GA
You could try localhost:8080 and you will see that it works perfectly.

Related

Unable to connect to Kafka(in docker) from Internet

I am attempting to make it so I can connect to Kafka from over the internet. I have opened Ports: 9092 and 2181 on my router.
I have had no luck at all! I am using OffsetExplorer and I am able to ping kafka from another network. The IP of the system that is is running on is: 10.0.1.104
I AM able to connect to kafka on the local network from another computer though.
Here is my Kafka Docker-Compose:
version: '3.7'
services:
zookeeper:
image: wurstmeister/zookeeper:3.4.6
environment:
JVMFLAGS: "-Djava.security.auth.login.config=/etc/zookeeper/zookeeper_jaas.conf"
volumes:
- ./zookeeper_jaas.conf:/etc/zookeeper/zookeeper_jaas.conf
ports:
- 2181:2181
kafka:
image: wurstmeister/kafka:2.13-2.8.1
depends_on:
- zookeeper
ports:
- 9092:9092
- 29092:29092
environment:
KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
KAFKA_LISTENERS: INTERNAL://:9093,EXTERNAL://:9092
KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:9093,EXTERNAL://10.0.1.104:9092
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:SASL_PLAINTEXT,EXTERNAL:SASL_PLAINTEXT
ALLOW_PLAINTEXT_LISTENER: 'yes'
KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'true'
KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
KAFKA_SASL_ENABLED_MECHANISMS: PLAIN
KAFKA_SASL_MECHANISM_INTER_BROKER_PROTOCOL: PLAIN
KAFKA_OPTS: "-Djava.security.auth.login.config=/etc/kafka/kafka_jaas.conf"
volumes:
- ./kafka_server_jaas.conf:/etc/kafka/kafka_jaas.conf
Connection attempts result in this on Kafka's output
Thank you very much!

How to setup local Kafka to validate schema?

I read that schema validation is available in Kafka Confluent Server only. But maybe since then someone has found a solution? Does anyone know how can I test message validity running Kafka locally?
how to enforce broker to validate producers' input
Kafka doesn't do this. Confluent Server can, yes, but it is enterprise licensed. There is no alternative to server-side validation without forking Kafka like Confluent did. Otherwise, you would need to write your own Serializer class, but that won't scale to every producer client you may use.
You can still use Docker as the other answer shows, and you can use confluentinc/cp-server image, then you simply add an environment variable to enable schema valdiation.
https://docs.confluent.io/platform/current/schema-registry/schema-validation.html
Otherwise, simply using JsonSchemaSerializer, for example, will validate each record does adhere to a schema (client-side validation), but it won't stop anyone else from sending garbage data into the topic (server-side validation).
If you know a little bite docker-compose, you can deploy a Kafka ecosystem for testing purposes.
---
version: '2'
services:
zookeeper:
image: confluentinc/cp-zookeeper:7.1.0
hostname: zookeeper
container_name: zookeeper
ports:
- "2181:2181"
environment:
ZOOKEEPER_CLIENT_PORT: 2181
ZOOKEEPER_TICK_TIME: 2000
broker:
image: confluentinc/cp-kafka:7.1.0
hostname: broker
container_name: broker
depends_on:
- zookeeper
ports:
- "29092:29092"
environment:
KAFKA_BROKER_ID: 1
KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:9092,PLAINTEXT_HOST://localhost:29092
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
schema-registry:
image: confluentinc/cp-schema-registry:7.1.0
hostname: schema-registry
container_name: schema-registry
depends_on:
- broker
ports:
- "8081:8081"
environment:
SCHEMA_REGISTRY_HOST_NAME: schema-registry
SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: 'broker:9092'
I got the docker-compose file from https://github.com/confluentinc/kafka-tutorials/blob/master/_includes/tutorials/aggregating-sum/ksql/code/docker-compose.yml, but for your testing, it is not needed the KSQL components and you can delete those.

How to setup kafka-kinesis-connector if I use a Kafka container?

I'm a little bit confused. I've been following this to get started and the installation method.
https://aws.amazon.com/premiumsupport/knowledge-center/kinesis-kafka-connector-msk/
Setup AWS cli and configure it
Install maven
Compile connector
Set classpath with the jar generated
Set up the properties file of the connector
FYI: I have a docker-compose that creates all my containers (kafka, mqtt, etc.)
(all of the above is setup on-premise)
And then, I executed all this on my machine itself and not on the Kafka container, so for the last step how would that work when I try to run it standalone?
version: '3'
services:
nodered:
container_name: nodered
image: nodered/node-red
ports:
- "1880:1880"
volumes:
- ./nodered:/data
depends_on:
- mosquitto
environment:
- TZ=America/Toronto
- NODE_RED_ENABLE_PROJECTS=true
restart: always
mosquitto:
image: eclipse-mosquitto
container_name: mqtt
restart: always
ports:
- "1883:1883"
volumes:
- "./mosquitto/config:/mosquitto/config"
- "./mosquitto/data:/mosquitto/data"
- "./mosquitto/log:/mosquitto/log"
environment:
- TZ=America/Toronto
user: "${PUID}:${PGID}"
portainer:
ports:
- "9000:9000"
container_name: portainer
restart: always
volumes:
- "/var/run/docker.sock:/var/run/docker.sock"
- "./portainer/portainer_data:/data"
image: portainer/portainer-ce
zookeeper:
image: zookeeper:3.4
container_name: zookeeper
ports:
- "2181:2181"
volumes:
- "zookeeper_data:/data"
kafka:
image: wurstmeister/kafka:1.0.0
container_name: kafka
ports:
- "9092:9092"
- "9093:9093"
volumes:
- "kafka_data:/data"
environment:
- KAFKA_ZOOKEEPER_CONNECT=10.0.0.129:2181
- KAFKA_ADVERTISED_HOST_NAME=10.0.0.129
- JMX_PORT=9093
- KAFKA_ADVERTISED_PORT=9092
- KAFKA_LOG_RETENTION_HOURS=1
- KAFKA_MESSAGE_MAX_BYTES=10000000
- KAFKA_REPLICA_FETCH_MAX_BYTES=10000000
- KAFKA_GROUP_MAX_SESSION_TIMEOUT_MS=60000
- KAFKA_NUM_PARTITIONS=2
- KAFKA_DELETE_RETENTION_MS=1000
depends_on:
- zookeeper
restart: on-failure
cmak:
image: hlebalbau/kafka-manager:1.3.3.16
container_name: kafka-manager
restart: always
depends_on:
- kafka
- zookeeper
ports:
- "9080:9080"
environment:
- ZK_HOSTS=10.0.0.129
- APPLICATION_SECRET=letmein
command: -Dconfig.file=/kafka-manager/conf/application.conf -Dapplication.home=/kafkamanager -Dhttp.port=9080
volumes:
zookeeper_data:
driver: local
kafka_data:
driver: local
I guess I have to go into my Kafka container and run the below code, but how can I reference by machine path... I'm stuck here or perhaps I'm missing something:
./bin/connect-standalone.sh {{path_from_machine_where_jar_is}}/kinesis-kafka-connector/config/worker.properties {{path_from_machine_where_jar_is}}/kinesis-kafka-connector/config/kinesis-streams-kafka-
connector.properties
Or I have to run all the previous steps in my Kafka container directly...
I was thinking of just doing this, copy my jar file and just moving it in my kafka container.
docker cp /hostfile (container_id):/(to_the_place_you_want_the_file_to_be)
Thank you!
guess I have to go into my Kafka container and run the below code
No. Containers should only run one process, which is kafka server.
So, either you download Kafka locally and run the Connect scripts from your host.
Or you simply add a new container for Kafka Connect, which will run Connect distributed mode, rather than standalone.
In either case, yes, you need to copy (or mount) the jar into Connect's plugin path
Alternatively, run MSK rather than Kinesis and produce your Kafka data there rather than locally.

Failed to collect cluster Default info java.lang.IllegalStateException: Error while creating AdminClient for Cluster Default

I would like to use network_mode: bridge for kafka for being able to reach kafka through localhost:9092 from another service
I'm trying to use the provectus/kafka-ui but when I open the consumers menu I get the following error
my docker-compose.yml file :
kafka-ui:
container_name: kafka-ui
image: provectuslabs/kafka-ui:latest
ports:
- 8080:8080
depends_on:
- kafka
environment:
KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: kafka:9092
KAFKA_CLUSTERS_0_JMXPORT: 9997
kafka:
image: johnnypark/kafka-zookeeper
ports:
- "2181:2181"
- "9092:9092"
network_mode: bridge
environment:
ADVERTISED_HOST: 127.0.0.1
NUM_PARTITIONS: 1
volumes:
- /var/run/docker.sock:/var/run/docker.sock
log error:
2022-01-13 09:16:50,014 ERROR [parallel-5] c.p.k.u.s.MetricsService: Failed to collect cluster Default info
java.lang.IllegalStateException: Error while creating AdminClient for Cluster Default
provectus/kafka-ui
I was using the johnnypark/kafka-zookeeper library for both kafka and zookeeper. I was able to solve this problem by using two separate libraries as in the example below
zookeeper1:
image: confluentinc/cp-zookeeper:5.2.4
environment:
ZOOKEEPER_CLIENT_PORT: 2181
ZOOKEEPER_TICK_TIME: 2000
kafka1:
image: confluentinc/cp-kafka:5.3.1
depends_on:
- zookeeper1
ports:
- 9093:9093
- 9998:9998
environment:
KAFKA_BROKER_ID: 1
KAFKA_ZOOKEEPER_CONNECT: zookeeper1:2181
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka1:29092,PLAINTEXT_HOST://localhost:9093
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
JMX_PORT: 9998
KAFKA_JMX_OPTS: -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=kafka1 -Dcom.sun.management.jmxremote.rmi.port=9998
being able to reach kafka through localhost:9092 from another service
You can't use localhost to reach Kafka since that would be the Kafka UI container itself.
Changing ADVERTISED_HOST to kafka and using kafka:9092 from other containers is correct for a bridge network. However, this have the side effect of preventing any access to Kafka outside the Docker network, such as clients directly on the host machine.
Internal and External clients can be configured separately. bitnami/bitnami-docker-kafka
Here's an example using Bitnami's Kafka Image - this allows host clients to connect on port 9093 while allowing kafka-ui to connect with the default port.
version: "3"
services:
zookeeper:
image: 'bitnami/zookeeper:latest'
ports:
- '2181:2181'
environment:
- ALLOW_ANONYMOUS_LOGIN=yes
kafka:
image: 'bitnami/kafka:latest'
ports:
- '9092:9092'
- '9093:9093'
environment:
- KAFKA_BROKER_ID=1
- KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
- KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CLIENT:PLAINTEXT,EXTERNAL:PLAINTEXT
- KAFKA_CFG_LISTENERS=CLIENT://:9092,EXTERNAL://:9093
- KAFKA_CFG_ADVERTISED_LISTENERS=CLIENT://kafka:9092,EXTERNAL://localhost:9093
- KAFKA_CFG_INTER_BROKER_LISTENER_NAME=CLIENT
- ALLOW_PLAINTEXT_LISTENER=yes
depends_on:
- zookeeper
kafka-ui:
image: provectuslabs/kafka-ui
container_name: kafka-ui
ports:
- "8081:8081"
restart: always
environment:
- KAFKA_CLUSTERS_0_NAME=local
- KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS=kafka:9092
- SERVER_PORT=8081

Confluent Schema Registry unable to connect to Kafka container

I'm trying to run a Kafka service with Zookeper, Kafdrop and Schema Registry, I made it work by installing Kafka, Zookeper and Kadrop in a container and then installing Confluent Schema Registry in another (with its own Kafka, Zookeeper, Ksql Server and Rest Proxy), however trying to make kafdrop read the schema registry is not working, so now I want to install exclusively just one container with Kafka, Zookeeper, Kafdrop and Schema Registry, and even though everything is installed successfully, the Schema Registry is restarting every 10 seconds or so, and I cannot reach out the service (localhost:8085) to add my schema, so I'm wondering if it's even possible to run the Confluent Schema Registry outside the Confluent suite of services, here it is my YAML file:
version: '2'
services:
kafka:
image: wurstmeister/kafka
container_name: kafka
ports:
- "9092:9092"
environment:
- KAFKA_ADVERTISED_HOST_NAME=127.0.0.1
- KAFKA_ADVERTISED_PORT=9092
- KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
- KAFKA_LISTENERS=INTERNAL://:29092,EXTERNAL://:9092
- KAFKA_ADVERTISED_LISTENERS=INTERNAL://kafka:29092,EXTERNAL://localhost:9092
- KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
- KAFKA_INTER_BROKER_LISTENER_NAME=INTERNAL
- KAFKA_SCHEMA_REGISTRY_URL=schemaregistry:8085
depends_on:
- zookeeper
zookeeper:
image: wurstmeister/zookeeper
container_name: zookeeper
ports:
- "2181:2181"
environment:
- KAFKA_ADVERTISED_HOST_NAME=zookeeper
schemaregistry:
image: confluentinc/cp-schema-registry:6.2.0
restart: always
depends_on:
- zookeeper
environment:
SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: "zookeeper:2181"
SCHEMA_REGISTRY_HOST_NAME: schemaregistry
SCHEMA_REGISTRY_LISTENERS: "http://0.0.0.0:8085"
ports:
- 8085:8085
kafdrop:
image: obsidiandynamics/kafdrop
container_name: kafdrop
restart: "no"
ports:
- "9000:9000"
environment:
KAFKA_BROKERCONNECT: "kafka:29092"
JVM_OPTS: "-Xms16M -Xmx48M -Xss180K -XX:-TieredCompilation -XX:+UseStringDeduplication -noverify"
SCHEMAREGISTRY_CONNECT: schemaregistry:8085
depends_on:
- "kafka"
So it turned out that the Schema Registry couldn't connect because Kafka was not using 'PLAINTEXT' as the internal broker listener name, here is the YAML version working with the Kafdrop also working when deserializing the message to AVRO:
version: '2'
services:
kafka:
image: wurstmeister/kafka
container_name: kafka
ports:
- "9092:9092"
environment:
- KAFKA_ADVERTISED_HOST_NAME=127.0.0.1
- KAFKA_ADVERTISED_PORT=9092
- KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
- KAFKA_LISTENERS=PLAINTEXT://:29092,EXTERNAL://:9092
- KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka:29092,EXTERNAL://localhost:9092
- KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,EXTERNAL:PLAINTEXT
- KAFKA_INTER_BROKER_LISTENER_NAME=PLAINTEXT
- KAFKA_SCHEMA_REGISTRY_URL=schemaregistry:8085
depends_on:
- zookeeper
zookeeper:
image: wurstmeister/zookeeper
container_name: zookeeper
ports:
- "2181:2181"
environment:
- KAFKA_ADVERTISED_HOST_NAME=zookeeper
schemaregistry:
image: confluentinc/cp-schema-registry:6.2.0
restart: always
depends_on:
- zookeeper
environment:
SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: "zookeeper:2181"
SCHEMA_REGISTRY_HOST_NAME: schemaregistry
SCHEMA_REGISTRY_LISTENERS: "http://0.0.0.0:8085"
ports:
- 8085:8085
kafdrop:
image: obsidiandynamics/kafdrop
container_name: kafdrop
restart: "no"
ports:
- "9000:9000"
environment:
KAFKA_BROKERCONNECT: "kafka:29092"
JVM_OPTS: "-Xms16M -Xmx48M -Xss180K -XX:-TieredCompilation -XX:+UseStringDeduplication -noverify"
SCHEMAREGISTRY_CONNECT: http://schemaregistry:8085
depends_on:
- "kafka"
Your both YAML are not correct
You have in zookeeper:
KAFKA_ADVERTISED_HOST_NAME
And in schema registry you give zookeeper port 2181 in kafkastore....