Unable to send messages to Kafka - apache-kafka

I created a Kafka Docker container from wurstmeister's Docker image and then followed steps 3, 4 and 5 of the Apache documentation to produce and consume messages. Unfortunately, this does not work. Three seconds after I send a message, I get the following error:
I can't find a solution for this error. What do I have to do to solve this issue?
docker-compose.yml
version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    build: .
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: localhost
      KAFKA_CREATE_TOPICS: "test:1:1"
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
EDIT
The zookeeper is reachable.
bin/kafka-topics.sh --list --zookeeper localhost:2181
returns:
baeldung
filtered
greeting
partitioned
test
So only the broker is not available.
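A quick way to confirm which of the two published ports actually accepts connections is a plain TCP check. The following is a minimal sketch using only the Python standard library; the function names are made up for this example, and the ports are the ones published in the compose file above. An open port only shows that something is listening, not that the broker is healthy or advertising a reachable address.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_endpoints(endpoints: dict) -> dict:
    """Map each endpoint name to whether its (host, port) is reachable."""
    return {name: port_is_open(host, port) for name, (host, port) in endpoints.items()}


if __name__ == "__main__":
    # Ports as published in the docker-compose.yml of the question.
    status = check_endpoints({"zookeeper": ("localhost", 2181), "kafka": ("localhost", 9092)})
    for name, ok in status.items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```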

Strangely enough, it seems to have been a simple version conflict. The wurstmeister Kafka image uses Kafka version kafka_2.12-0.10.2.1.tgz, so of course I downloaded the same version for the consumer and producer clients. That led to the result I posted above.
I then tried version kafka_2.11-0.11.0.0.tgz and everything works fine. I thought the versions had to match, but obviously that is not the case.

Related

How to setup local Kafka to validate schema?

I read that schema validation is available in Confluent Server only. But maybe someone has found a solution since then? Does anyone know how I can test message validity when running Kafka locally?
how to enforce broker to validate producers' input
Kafka doesn't do this. Confluent Server can, yes, but it is enterprise licensed. There is no alternative to server-side validation without forking Kafka like Confluent did. Otherwise, you would need to write your own Serializer class, but that won't scale to every producer client you may use.
You can still use Docker as the other answer shows, with the confluentinc/cp-server image; then you simply add an environment variable to enable schema validation.
https://docs.confluent.io/platform/current/schema-registry/schema-validation.html
Otherwise, simply using JsonSchemaSerializer, for example, will validate that each record adheres to a schema (client-side validation), but it won't stop anyone else from sending garbage data into the topic (server-side validation).
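To make the client-side idea concrete, here is a minimal sketch of validating a record before it is serialized. It is hand-rolled with the standard library and is not the JsonSchemaSerializer mentioned above; the schema, field names and function names are hypothetical. As the answer says, a check like this only protects producers that actually use it.

```python
import json

# Hypothetical schema for illustration: field name -> required Python type.
ORDER_SCHEMA = {"id": int, "customer": str, "amount": float}


def validate(record: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    for field in record:
        if field not in schema:
            problems.append(f"unexpected field: {field}")
    return problems


def serialize(record: dict, schema: dict) -> bytes:
    """Validate first, then encode as UTF-8 JSON ready to hand to a producer."""
    problems = validate(record, schema)
    if problems:
        raise ValueError("; ".join(problems))
    return json.dumps(record).encode("utf-8")
```

The bytes returned by serialize would be passed as the message value to whatever producer client is in use; a record that fails validation never reaches the topic from this client.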
If you know a little bit of docker-compose, you can deploy a Kafka ecosystem for testing purposes.
---
version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.1.0
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  broker:
    image: confluentinc/cp-kafka:7.1.0
    hostname: broker
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - "29092:29092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
  schema-registry:
    image: confluentinc/cp-schema-registry:7.1.0
    hostname: schema-registry
    container_name: schema-registry
    depends_on:
      - broker
    ports:
      - "8081:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: 'broker:9092'
I got the docker-compose file from https://github.com/confluentinc/kafka-tutorials/blob/master/_includes/tutorials/aggregating-sum/ksql/code/docker-compose.yml, but the KSQL components are not needed for your testing and you can delete them.

Kafka Consumer is not receiving Messages on docker

I'm a beginner with Kafka as well as Docker. I have been taking a course and working with a Kafka producer and consumer, but for some reason it is not working.
When I use the producer, the messages are saved in the topic (I have already checked), but when I try to read the messages with the consumer, nothing arrives and I have no idea why.
It worked previously, but not anymore.
The only difference in this case is that I'm using the confluentinc image instead of the bitnami image.
So, if anyone has any idea or solution, I would really appreciate it.
I'm sharing my compose file and a screenshot so you can see it.
version: "3.2"
services:
  ###############################################################
  zookeeper:
    image: 'confluentinc/cp-zookeeper:latest'
    container_name: zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - 2181:2181
  ###############################################################
  broker:
    image: 'confluentinc/cp-kafka:latest'
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - 9092:9092
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      # Exposes 9092 for external connections to the broker
      # Use kafka:29092 for connections internal on the docker network
      # See https://rmoff.net/2018/08/02/kafka-listeners-explained/ for details
      KAFKA_LISTENERS: "PLAINTEXT://:29092,PLAINTEXT_HOST://:9092"
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_DELETE_TOPIC_ENABLE: "true"
Producer and Consumer
Everything is running on my local machine.
Take a look at docker-compose logs broker...
You should see a lot of Error processing create topic request CreatableTopic(name='__consumer_offsets', numPartitions=50, replicationFactor=3
Without a valid __consumer_offsets topic, no consumer will be able to run and commit offsets. Similarly, transactions won't work either (which are enabled by default in latest Kafka)
Add these variables and re-create the containers
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
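The reasoning above can be sketched as a small check over the compose `environment` section: any of these settings whose effective value is larger than the number of brokers will prevent the internal topic from being created. This is an illustrative helper, not part of any Kafka tooling; the function name is made up, and the fallback values are the broker defaults for these three settings as I understand them (3, 3 and 2).

```python
# Broker defaults for the internal-topic settings discussed above (assumed).
DEFAULTS = {
    "KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR": 3,
    "KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR": 3,
    "KAFKA_TRANSACTION_STATE_LOG_MIN_ISR": 2,
}


def oversized_settings(environment: dict, broker_count: int) -> dict:
    """Return the settings whose effective value exceeds the broker count."""
    too_large = {}
    for key, default in DEFAULTS.items():
        # A variable missing from the compose file falls back to the default.
        value = int(environment.get(key, default))
        if value > broker_count:
            too_large[key] = value
    return too_large
```

For the single-broker compose file in the question, passing its environment mapping with `broker_count=1` reports all three settings, because none of them is overridden there.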

Docker compose multi-container with zookeeper, kafka and python script on Azure container instances not able to connect to kafka

I am trying to get a zookeeper/kafka non-clustered setup to be able to talk to containers with python scripts. I want to be able to run a zookeeper/kafka container and 2 or more containers with python scripts communicating to the zookeeper/kafka, all running in containers or container groups on Azure.
To test this, I have created the below docker container group, with zookeeper and kafka as 2 services and a 3rd service that starts a simple python script to produce a steady pace of messages to a kafka topic. The docker-compose.yml that I am using is as follows:
version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - 22181:2181
    networks:
      - my-network
  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka
    depends_on:
      - zookeeper
    ports:
      - 29092:29092
    networks:
      - my-network
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  kafka_producer:
    build: ../kafka_producer
    image: annabotkafka.azurecr.io/kafka_producer:v1
    container_name: kafka_producer
    depends_on:
      - kafka
    volumes:
      - .:/usr/src/kafka_producer
    networks:
      - my-network
    environment:
      KAFKA_SERVERS: kafka:9092
networks:
  my-network:
    driver: bridge
The kafka_producer.py script is as follows:
import os
from time import sleep
import json
from confluent_kafka import Producer
def acked(err, msg):
    if err is not None:
        print("Failed to deliver message: {0}: {1}"
              .format(msg.value(), err.str()))
    else:
        print("Message produced: {0}".format(msg.value()))
# Function to send a status message out on the status topic
def send_status(producer, counter):
    msg = {'counter': counter}
    json_dump = json.dumps(msg)
    producer.produce("counter", json_dump.encode('utf-8'), callback=acked)
    producer.poll()
# Define kafkaProducer to push messages to the status topic
producer = Producer({'bootstrap.servers': 'kafka:9092'})
for j in range(9999):
    print("Iteration", j)
    send_status(producer, j)
    sleep(2)
When I 'docker-compose up' this on my Ubuntu 20.04 dev machine, I get the expected behaviour: a steady stream of messages sent by the Kafka producer.
After I 'docker-compose push' this to Azure Container Instances and create a container in Azure with the image, the kafka_producer script appears to no longer be able to connect to the Kafka broker at kafka:9092.
These are the logs from the container group after startup:
Iteration 0
%3|1629363616.468|FAIL|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Failed to resolve 'kafka:9092': Name or service not known (after 25ms in state CONNECT)
%3|1629363618.465|FAIL|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Failed to resolve 'kafka:9092': Name or service not known (after 22ms in state CONNECT, 1 identical error(s) suppressed)
Iteration 1
Iteration 2
I had understood that the container group is on the same network subnet and on a single host, so I would expect this to operate the same as on my dev machine locally.
My next step will be to have separate containers with different Python scripts that communicate with Kafka in this container group. Having the producer script within the same container group is not my long-term expectation, but I believed this simpler setup should work.
Any suggestions for where I am going wrong?
From Azure documentation
Within a container group, container instances can reach each other via localhost on any port, even if those ports aren't exposed externally on the group's IP address or from the container.
This makes it sound like the containers are using a host network, not a Docker bridge like the one you've set up in Compose (where your code works fine).
Therefore, you ought to connect with localhost:29092
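One way to avoid rebuilding the image for each environment is to read the broker address from the KAFKA_SERVERS variable that the compose file already defines, instead of hard-coding kafka:9092 in the script. The sketch below shows only that lookup; the helper names and the localhost:29092 fallback are assumptions made for this example, and the resulting dictionary would be passed to Producer(...) as in the question's script.

```python
import os


def bootstrap_servers(environ=os.environ, default="localhost:29092"):
    """Read the broker address from KAFKA_SERVERS, falling back to `default`."""
    value = environ.get("KAFKA_SERVERS", "").strip()
    return value or default


def producer_config(environ=os.environ):
    """Build the configuration dictionary for the Kafka producer."""
    return {"bootstrap.servers": bootstrap_servers(environ)}
```

Under Compose the variable resolves to kafka:9092; in the Azure container group it could be set to localhost:29092 or simply left unset.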
If you don't actually need message persistence, then I'd suggest using sockets via HTTP, gRPC or Zeromq between your scripts rather than a Kafka container

Kafka docker image that works without zookeeper

I read that Kafka no longer requires Zookeeper, so I don't want to have Zookeeper in docker-compose. But I don't know which Kafka image can work without Zookeeper. Can anyone give a hint?
Here's a Kafka Docker image which doesn't require Zookeeper (as described above):
https://hub.docker.com/r/bashj79/kafka-kraft
Disclaimer: I'm the author.
Confluent published a working docker-compose.yaml without zookeeper in their repository cp-all-in-one.
There is a script used as a workaround
#!/bin/sh
# Docker workaround: Remove check for KAFKA_ZOOKEEPER_CONNECT parameter
sed -i '/KAFKA_ZOOKEEPER_CONNECT/d' /etc/confluent/docker/configure
# Docker workaround: Ignore cub zk-ready
sed -i 's/cub zk-ready/echo ignore zk-ready/' /etc/confluent/docker/ensure
# KRaft required step: Format the storage directory with a new cluster ID
echo "kafka-storage format --ignore-formatted -t $(kafka-storage random-uuid) -c /etc/kafka/kafka.properties" >> /etc/confluent/docker/ensure
which is called in the command of the docker-compose-setup
  broker:
    image: confluentinc/cp-kafka:7.2.x-latest
    hostname: broker
    container_name: broker
    ports:
      - "9092:9092"
      - "9101:9101"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT'
      KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092'
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_JMX_PORT: 9101
      KAFKA_JMX_HOSTNAME: localhost
      KAFKA_PROCESS_ROLES: 'broker,controller'
      KAFKA_NODE_ID: 1
      KAFKA_CONTROLLER_QUORUM_VOTERS: '1@broker:29093'
      KAFKA_LISTENERS: 'PLAINTEXT://broker:29092,CONTROLLER://broker:29093,PLAINTEXT_HOST://0.0.0.0:9092'
      KAFKA_INTER_BROKER_LISTENER_NAME: 'PLAINTEXT'
      KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
      KAFKA_LOG_DIRS: '/tmp/kraft-combined-logs'
    volumes:
      - ./update_run.sh:/tmp/update_run.sh
    command: "bash -c 'if [ ! -f /tmp/update_run.sh ]; then echo \"ERROR: Did you forget the update_run.sh file that came with this docker-compose.yml file?\" && exit 1 ; else /tmp/update_run.sh && /etc/confluent/docker/run ; fi'"
I read that Kafka no longer requires zookeeper
You may well have read that in the future Apache Kafka will not need Zookeeper - this is detailed in KIP-500
However, this is not yet implemented, so for the time being (January 2021) you will still need a Zookeeper in your Docker Compose ensemble.
You can use this image to run Kafka without Zookeeper.
https://hub.docker.com/r/bitnami/kafka
Here is an example YAML file.
version: "3"
services:
  kafka:
    image: 'bitnami/kafka:3.2.3'
    restart: "no"
    privileged: true
    ports:
      - 2181:2181
      - 19092:19092
    environment:
      - KAFKA_ENABLE_KRAFT=yes
      - KAFKA_CFG_PROCESS_ROLES=broker,controller
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:19092,CONTROLLER://:2181
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://127.0.0.1:19092
      - KAFKA_BROKER_ID=1
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=1@127.0.0.1:2181
      - ALLOW_PLAINTEXT_LISTENER=yes
According to the "What’s New in Apache Kafka 3.3" document and "KIP-833: Mark KRaft as Production Ready", Kafka can work without Zookeeper (although some features still work only in Apache ZooKeeper (ZK) mode).
Example (docker-compose.yml):
version: "2.5"
volumes:
  volume1:
services:
  kafka1:
    image: 'bitnami/kafka:3.3.1'
    container_name: kafka
    environment:
      - KAFKA_ENABLE_KRAFT=yes
      - KAFKA_CFG_PROCESS_ROLES=broker,controller
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka1:9092
      - KAFKA_CFG_BROKER_ID=1
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=1@kafka1:9093
      - ALLOW_PLAINTEXT_LISTENER=yes
      - KAFKA_KRAFT_CLUSTER_ID=r4zt_wrqTRuT7W2NJsB_GA
    volumes:
      - volume1:/bitnami/kafka
  kafka-ui:
    container_name: kafka-ui
    image: 'provectuslabs/kafka-ui:latest'
    ports:
      - "8080:8080"
    environment:
      - KAFKA_CLUSTERS_0_BOOTSTRAP_SERVERS=kafka1:9092
      - KAFKA_CLUSTERS_0_NAME=r4zt_wrqTRuT7W2NJsB_GA
You can then open localhost:8080 (the kafka-ui service) and you will see that it works.
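The KAFKA_KRAFT_CLUSTER_ID value in the example is a 22-character, URL-safe base64 string, which matches a 16-byte UUID encoded without padding. The sketch below generates and checks an ID of that shape; it assumes this is the only format requirement, and the function names are invented for the example. The `kafka-storage random-uuid` command that ships with Kafka remains the authoritative way to create one.

```python
import base64
import uuid


def random_cluster_id() -> str:
    """Return a random UUID encoded as 22 URL-safe base64 characters."""
    raw = uuid.uuid4().bytes  # 16 bytes
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def looks_like_cluster_id(value: str) -> bool:
    """Check that value is 22 characters that decode to exactly 16 bytes."""
    if len(value) != 22:
        return False
    try:
        # Restore the two padding characters that were stripped.
        return len(base64.urlsafe_b64decode(value + "==")) == 16
    except ValueError:
        return False
```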

Kafka Listener is not working! It is isolated in intranet

My Kafka node is hosted in Google Cloud Dataproc. However, we realized that Kafka installed through the default initialization script is set up in such a way that it only allows intranet access. It is completely isolated from the outside world. A producer outside the Google Cloud network can't publish messages to Kafka, and Kafka messages can't reach external subscribers.
Remark
I have whitelisted the producer IP
After reading through other StackOverflow questions, blog posts and documentation, I think it could be due to the advertised.listeners part of the Socket Server Settings in /usr/lib/kafka/server.properties.
First solution
I added advertised.listeners=PLAINTEXT://[External_IP]:19092
then sudo /etc/init.d/kafka-server restart
OUTCOME
However, when I try kafkacat or telnet, it always fails. I also tested advertised.listeners with various ports.
Second solution from https://rmoff.net/2018/08/02/kafka-listeners-explained/
############################# Server Basics #############################
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0
############################# Socket Server Settings #############################
# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
#
# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured. Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
->>>>>>> I added below listener config according to https://rmoff.net/2018/08/02/kafka-listeners-explained/
listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:19092
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
advertised.listeners=EXTERNAL://[External_IP]:19092,INTERNAL://[Internal_IP]:9092
inter.broker.listener.name=INTERNAL
OUTCOME
It's the same result as above: not working.
Firewall Rules [Updated]
This is my current firewall rules config. Am I making a mistake?
Can anyone help me resolve this?
Here is what worked for my cluster:
I've set the following properties from the second solution:
listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:19092
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
advertised.listeners=EXTERNAL://[External_IP]:19092,INTERNAL://[Internal_IP]:9092
inter.broker.listener.name=INTERNAL
I've created a firewall rule opening port 19092 to my personal development machine IP and applied it to the network. From my machine, I've tried to telnet the kafka server and I got:
$ telnet [EXTERNAL-IP] 19092
Trying [EXTERNAL-IP]...
Connected to [EXTERNAL-IP].
Escape character is '^]'.
I then tried to use kafkacat and got an error. Running in debug mode, I saw that the error occurred because I had not created any topics:
%7|1578351264.551|METADATA|rdkafka#producer-1| [thrd:main]: [EXTERNAL-IP]:19092/bootstrap: ===== Received metadata: application requested =====
%7|1578351264.551|METADATA|rdkafka#producer-1| [thrd:main]: [EXTERNAL-IP]:19092/bootstrap: ClusterId: jYxfi6zzR0euAovYyKCFZg, ControllerId: -1
%7|1578351264.551|METADATA|rdkafka#producer-1| [thrd:main]: [EXTERNAL-IP]:19092/bootstrap: 0 brokers, 0 topics
%7|1578351264.551|METADATA|rdkafka#producer-1| [thrd:main]: [EXTERNAL-IP]:19092/bootstrap: No brokers or topics in metadata: should retry
%7|1578351264.551|REQERR|rdkafka#producer-1| [thrd:main]: [EXTERNAL-IP]:19092/bootstrap: MetadataRequest failed: Local: Partial response: explicit actions Retry
%7|1578351264.551|RETRY|rdkafka#producer-1| [thrd:[EXTERNAL-IP]:19092/bootstrap]: [EXTERNAL-IP]:19092/bootstrap: Retrying MetadataRequest (v2, 25 bytes, retry 1/2, prev CorrId 3) in 100ms
Please note that I connected to the Kafka server from outside the cluster. In the question, telnet and kafkacat are running on the same machine as the Kafka server (kafka-tng-w-0).
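Since the four listener properties have to agree with each other, a small consistency check can catch typos before the broker is restarted. This is a rough sketch using plain string handling, not a Kafka API; the function names are made up, and it only checks that listener names line up, not that the addresses are reachable or the firewall is open.

```python
def parse_listeners(value: str) -> dict:
    """Turn 'NAME://host:port,...' into {NAME: (host, port)}."""
    result = {}
    for entry in value.split(","):
        name, _, address = entry.strip().partition("://")
        host, _, port = address.rpartition(":")
        result[name] = (host, int(port))
    return result


def check_listener_config(props: dict) -> list:
    """Return a list of inconsistencies between the listener properties."""
    listeners = parse_listeners(props["listeners"])
    advertised = parse_listeners(props["advertised.listeners"])
    protocols = dict(
        pair.strip().split(":", 1)
        for pair in props["listener.security.protocol.map"].split(",")
    )
    problems = []
    for name in advertised:
        if name not in listeners:
            problems.append(f"{name} is advertised but has no entry in listeners")
    for name in listeners:
        if name not in protocols:
            problems.append(f"{name} has no security protocol mapping")
    if props["inter.broker.listener.name"] not in listeners:
        problems.append("inter.broker.listener.name does not match any listener")
    return problems
```

Called with the four properties from the configuration above (with real addresses in place of the bracketed placeholders), it returns an empty list.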
Here is a sample docker-compose.yaml file.
version: '2'
services:
  zookeeper:
    image: strimzi/kafka:0.20.0-kafka-2.6.0
    command: [
      "sh", "-c",
      "bin/zookeeper-server-start.sh config/zookeeper.properties"
    ]
    ports:
      - "2181:2181"
    environment:
      LOG_DIR: /tmp/logs
  kafka:
    image: strimzi/kafka:0.20.0-kafka-2.6.0
    command: [
      "sh", "-c",
      "bin/kafka-server-start.sh config/server.properties --override
      listeners=$${KAFKA_LISTENERS} --override
      advertised.listeners=$${KAFKA_ADVERTISED_LISTENERS} --override
      zookeeper.connect=$${KAFKA_ZOOKEEPER_CONNECT}"
    ]
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      LOG_DIR: "/tmp/logs"
      # Dev GQ - Laptop
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://172.23.240.1:9092
      # AWS Pre-Prod
      #KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://11.122.200.229:9092
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
And here is a sample Quarkus application.properties file with kafka bootstrap server configured as advertised listeners in docker-compose.yaml.
# Configure the SmallRye Kafka connector
# Dev GQ - Laptop
mp.messaging.connector.smallrye-kafka.bootstrap.servers=172.23.240.1:9092
# AWS Pre-Prod
#mp.messaging.connector.smallrye-kafka.bootstrap.servers=11.122.200.229:9092
quarkus.kafka.health.enabled=true
# Configure the Kafka sink (we write to it)
mp.messaging.outgoing.generated-price.connector=smallrye-kafka
mp.messaging.outgoing.generated-price.topic=prices
mp.messaging.outgoing.generated-price.value.serializer=org.apache.kafka.common.serialization.IntegerSerializer
# Configure the Kafka source (we read from it)
mp.messaging.incoming.prices.connector=smallrye-kafka
mp.messaging.incoming.prices.topic=prices
# ..... more codes
version: "3"
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka
    hostname: kafka
    ports:
      - "9093:9093"
      - "9092:9092"
    environment:
      TZ: CST-8
      KAFKA_BROKER_ID: 3
      KAFKA_ADVERTISED_HOST_NAME: kafka
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: INSIDE://:9092,OUTSIDE://:9093
      KAFKA_ADVERTISED_LISTENERS: INSIDE://kafka:9092,OUTSIDE://${Your_External_IP}:9093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    links:
      - zookeeper