Docker compose multi-container with zookeeper, kafka and python script on Azure container instances not able to connect to kafka - apache-kafka

I am trying to get a zookeeper/kafka non-clustered setup to be able to talk to containers with python scripts. I want to be able to run a zookeeper/kafka container and 2 or more containers with python scripts communicating to the zookeeper/kafka, all running in containers or container groups on Azure.
To test this, I have created the below docker container group, with zookeeper and kafka as 2 services and a 3rd service that starts a simple python script to produce a steady pace of messages to a kafka topic. The docker-compose.yml that I am using is as follows:
version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - 22181:2181
    networks:
      - my-network
  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka
    depends_on:
      - zookeeper
    ports:
      - 29092:29092
    networks:
      - my-network
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  kafka_producer:
    build: ../kafka_producer
    image: annabotkafka.azurecr.io/kafka_producer:v1
    container_name: kafka_producer
    depends_on:
      - kafka
    volumes:
      - .:/usr/src/kafka_producer
    networks:
      - my-network
    environment:
      KAFKA_SERVERS: kafka:9092
networks:
  my-network:
    driver: bridge
The kafka_producer.py script is as follows:
import os
from time import sleep
import json
from confluent_kafka import Producer
def acked(err, msg):
    if err is not None:
        print("Failed to deliver message: {0}: {1}"
              .format(msg.value(), err.str()))
    else:
        print("Message produced: {0}".format(msg.value()))
# Function to send a status message out on the status topic
def send_status(producer, counter):
    msg = {'counter': counter}
    json_dump = json.dumps(msg)
    producer.produce("counter", json_dump.encode('utf-8'), callback=acked)
    producer.poll()
# Define kafkaProducer to push messages to the status topic
producer = Producer({'bootstrap.servers': 'kafka:9092'})
for j in range(9999):
    print("Iteration", j)
    send_status(producer, j)
    sleep(2)
When I run 'docker-compose up' on my Ubuntu 20.04 dev machine, I get the expected behaviour: a steady stream of messages sent by the kafka producer.
After I 'docker-compose push' this to Azure Container Instances and create a container in Azure with the image, the kafka_producer script appears to no longer be able to connect to the kafka broker at kafka:9092.
These are the logs from the container group after startup:
Iteration 0
%3|1629363616.468|FAIL|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Failed to resolve 'kafka:9092': Name or service not known (after 25ms in state CONNECT)
%3|1629363618.465|FAIL|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Failed to resolve 'kafka:9092': Name or service not known (after 22ms in state CONNECT, 1 identical error(s) suppressed)
Iteration 1
Iteration 2
I had understood that the container group is on the same network subnet and on a single host, so I would expect this to operate the same as on my dev machine locally.
My next step will be to have separate containers with different python scripts that I will want to communicate with kafka in this container group. Having the producer script within the same container group is not my long-term expectation, but I believed this simpler setup should work.
Any suggestions for where I am going wrong?

From the Azure documentation:
Within a container group, container instances can reach each other via localhost on any port, even if those ports aren't exposed externally on the group's IP address or from the container.
This makes it sound like the containers are using a host network, not a Docker bridge like you've set up in Compose (where your code works fine).
Therefore, you ought to connect with localhost:29092.
If you don't actually need message persistence, then I'd suggest using sockets via HTTP, gRPC or ZeroMQ between your scripts rather than a Kafka container.
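The compose file in the question already sets a KAFKA_SERVERS variable, but the script hardcodes kafka:9092. One way to let the same image run both under Compose and in an Azure container group is to read the address from that variable. This is only a minimal sketch: the helper names resolve_bootstrap_servers and producer_config are made up for illustration, and the fallback value is an assumption taken from the question.

```python
import os


def resolve_bootstrap_servers(environ=None, default="kafka:9092"):
    """Return the broker list from KAFKA_SERVERS, or the default if unset or empty."""
    env = os.environ if environ is None else environ
    value = env.get("KAFKA_SERVERS", "").strip()
    return value or default


def producer_config(environ=None):
    """Build the config dict that would be passed to confluent_kafka.Producer."""
    return {"bootstrap.servers": resolve_bootstrap_servers(environ)}


# Under Compose the bridge-network name applies; in a container group, localhost.
print(producer_config({"KAFKA_SERVERS": "kafka:9092"}))
print(producer_config({"KAFKA_SERVERS": "localhost:29092"}))
```

In the script, Producer(producer_config()) would then replace the hardcoded dictionary, with KAFKA_SERVERS set to localhost:29092 for the Azure deployment.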

Related

How to setup local Kafka to validate schema?

I read that schema validation is available in Confluent Server only. But maybe since then someone has found a solution? Does anyone know how I can test message validity when running Kafka locally?
how to enforce broker to validate producers' input
Kafka doesn't do this. Confluent Server can, yes, but it is enterprise licensed. There is no alternative to server-side validation without forking Kafka like Confluent did. Otherwise, you would need to write your own Serializer class, but that won't scale to every producer client you may use.
You can still use Docker as the other answer shows, and you can use the confluentinc/cp-server image; then you simply add an environment variable to enable schema validation.
https://docs.confluent.io/platform/current/schema-registry/schema-validation.html
Otherwise, simply using JsonSchemaSerializer, for example, will validate that each record adheres to a schema (client-side validation), but it won't stop anyone else from sending garbage data into the topic (server-side validation).
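To illustrate what client-side validation means here, the following is a rough sketch using only the Python standard library. It is a hand-rolled check standing in for a schema-aware serializer, not the Confluent API; the schema, field name and function names are hypothetical.

```python
import json

# Hypothetical schema for illustration: field name -> required Python type.
EXAMPLE_SCHEMA = {"counter": int}


def validate(record, schema):
    """Raise ValueError if a field is missing or has the wrong type."""
    for field, expected in schema.items():
        if field not in record:
            raise ValueError("missing field: {0}".format(field))
        if type(record[field]) is not expected:
            raise ValueError("wrong type for field: {0}".format(field))


def serialize(record, schema=EXAMPLE_SCHEMA):
    """Validate first, then encode; invalid records never reach produce()."""
    validate(record, schema)
    return json.dumps(record).encode("utf-8")


payload = serialize({"counter": 1})  # bytes ready to hand to a producer
```

A producer that skips serialize() can still write arbitrary bytes to the topic, which is exactly the gap that broker-side validation closes.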
If you know a little docker-compose, you can deploy a Kafka ecosystem for testing purposes.
---
version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.1.0
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  broker:
    image: confluentinc/cp-kafka:7.1.0
    hostname: broker
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - "29092:29092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
  schema-registry:
    image: confluentinc/cp-schema-registry:7.1.0
    hostname: schema-registry
    container_name: schema-registry
    depends_on:
      - broker
    ports:
      - "8081:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: 'broker:9092'
I got the docker-compose file from https://github.com/confluentinc/kafka-tutorials/blob/master/_includes/tutorials/aggregating-sum/ksql/code/docker-compose.yml, but the KSQL components are not needed for your testing, so you can delete them.

Kafka Consumer is not receiving Messages on docker

I'm a beginner with Kafka as well as Docker. I have been taking a course and working with a Kafka producer and consumer, but for some reason it is not working.
When I use the producer, the messages are saved in the topic (I have already checked this), but when I try to read the messages with the consumer, it does not work and I have no idea why.
It had worked previously but not anymore.
The only difference in this case is that I'm using the confluentinc image instead of the bitnami image.
So, if anyone has any idea or solution I would really appreciate it.
I'm sharing my compose file and a screenshot so you can see it.
version: "3.2"
services:
  ###############################################################
  zookeeper:
    image: 'confluentinc/cp-zookeeper:latest'
    container_name: zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - 2181:2181
  ###############################################################
  broker:
    image: 'confluentinc/cp-kafka:latest'
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - 9092:9092
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      # Exposes 9092 for external connections to the broker
      # Use kafka:29092 for connections internal on the docker network
      # See https://rmoff.net/2018/08/02/kafka-listeners-explained/ for details
      KAFKA_LISTENERS: "PLAINTEXT://:29092,PLAINTEXT_HOST://:9092"
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_DELETE_TOPIC_ENABLE: "true"
Producer and Consumer
Everything is running on my local machine.
Take a look at docker-compose logs broker...
You should see a lot of Error processing create topic request CreatableTopic(name='__consumer_offsets', numPartitions=50, replicationFactor=3
Without a valid __consumer_offsets topic, no consumer will be able to run and commit offsets. Similarly, transactions won't work either (which are enabled by default in latest Kafka)
Add these variables and re-create the containers
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
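As a quick way to confirm the diagnosis before changing the compose file, the broker log can be filtered for the error quoted above. This is just a sketch: the function name is made up, and it assumes the log text contains the message exactly as quoted in this answer.

```python
MARKER = "Error processing create topic request"


def offsets_topic_errors(log_lines):
    """Return the log lines that report a failed __consumer_offsets creation."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if MARKER in line and "__consumer_offsets" in line
    ]


# Any iterable of lines works: an open file, sys.stdin, or a list.
example = [
    "Error processing create topic request "
    "CreatableTopic(name='__consumer_offsets', numPartitions=50, replicationFactor=3\n"
]
print(len(offsets_topic_errors(example)))
```

Saving the output of docker-compose logs broker to a file and passing the open file to offsets_topic_errors should return a non-empty list while the replication factor problem is present.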

Kafka Listener is not working! It is isolated in intranet

My Kafka node is hosted in Google Cloud Dataproc. However, we realized that the Kafka installed through the default initialization script is set up in such a way that it only allows intranet access. It is completely isolated from the outside world. A producer outside the Google Cloud network can't publish messages to Kafka, and Kafka messages can't reach subscribers outside the network.
Remark
I have whitelisted the producer IP
After reading through other Stack Overflow posts, blog posts and the documentation, I think it could be due to the advertised.listeners part of the Socket Server Settings in /usr/lib/kafka/server.properties.
First solution
I added advertised.listeners=PLAINTEXT://[External_IP]:19092
then sudo /etc/init.d/kafka-server restart
OUTCOME
However, when I try kafkacat or telnet, it always fails. I also tested advertised.listeners with various ports.
Second solution from https://rmoff.net/2018/08/02/kafka-listeners-explained/
############################# Server Basics #############################
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0
############################# Socket Server Settings #############################
# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
#
# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured. Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
# I added the listener config below according to https://rmoff.net/2018/08/02/kafka-listeners-explained/
listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:19092
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
advertised.listeners=EXTERNAL://[External_IP]:19092,INTERNAL://[Internal_IP]:9092
inter.broker.listener.name=INTERNAL
OUTCOME
It's the same result as above: not working.
Firewall Rules [Updated]
This is my current firewall rules config. Am I making a mistake?
Can anyone help me to resolve this?
Here is what worked for my cluster:
I've set the following properties from the second solution:
listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:19092
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
advertised.listeners=EXTERNAL://[External_IP]:19092,INTERNAL://[Internal_IP]:9092
inter.broker.listener.name=INTERNAL
I created a firewall rule opening port 19092 to my personal development machine's IP and applied it to the network. From my machine, I tried to telnet to the kafka server and got:
$ telnet [EXTERNAL-IP] 19092
Trying [EXTERNAL-IP]...
Connected to [EXTERNAL-IP].
Escape character is '^]'.
I then tried kafkacat and got an error. Running it in debug mode, I saw that the error occurred because I had not set up any topics:
%7|1578351264.551|METADATA|rdkafka#producer-1| [thrd:main]: [EXTERNAL-IP]:19092/bootstrap: ===== Received metadata: application requested =====
%7|1578351264.551|METADATA|rdkafka#producer-1| [thrd:main]: [EXTERNAL-IP]:19092/bootstrap: ClusterId: jYxfi6zzR0euAovYyKCFZg, ControllerId: -1
%7|1578351264.551|METADATA|rdkafka#producer-1| [thrd:main]: [EXTERNAL-IP]:19092/bootstrap: 0 brokers, 0 topics
%7|1578351264.551|METADATA|rdkafka#producer-1| [thrd:main]: [EXTERNAL-IP]:19092/bootstrap: No brokers or topics in metadata: should retry
%7|1578351264.551|REQERR|rdkafka#producer-1| [thrd:main]: [EXTERNAL-IP]:19092/bootstrap: MetadataRequest failed: Local: Partial response: explicit actions Retry
%7|1578351264.551|RETRY|rdkafka#producer-1| [thrd:[EXTERNAL-IP]:19092/bootstrap]: [EXTERNAL-IP]:19092/bootstrap: Retrying MetadataRequest (v2, 25 bytes, retry 1/2, prev CorrId 3) in 100ms
Please note that I connected to the kafka server from outside the cluster. In the question, telnet and kafkacat are running on the same machine as the kafka server (kafka-tng-w-0).
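The telnet test above can also be scripted. The sketch below uses only the Python standard library; the function name can_connect is made up for illustration, and the host and port would be whatever external address and port you opened in the firewall.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

A True result only shows that the firewall rule and the listener socket are working. A Kafka client must additionally be able to reach the address the broker returns in advertised.listeners, so a passing port check does not rule out a misconfigured advertised address.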
Here is a sample docker-compose.yaml file.
version: '2'
services:
  zookeeper:
    image: strimzi/kafka:0.20.0-kafka-2.6.0
    command: [
      "sh", "-c",
      "bin/zookeeper-server-start.sh config/zookeeper.properties"
    ]
    ports:
      - "2181:2181"
    environment:
      LOG_DIR: /tmp/logs
  kafka:
    image: strimzi/kafka:0.20.0-kafka-2.6.0
    command: [
      "sh", "-c",
      "bin/kafka-server-start.sh config/server.properties --override
      listeners=$${KAFKA_LISTENERS} --override
      advertised.listeners=$${KAFKA_ADVERTISED_LISTENERS} --override
      zookeeper.connect=$${KAFKA_ZOOKEEPER_CONNECT}"
    ]
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      LOG_DIR: "/tmp/logs"
      # Dev GQ - Laptop
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://172.23.240.1:9092
      # AWS Pre-Prod
      #KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://11.122.200.229:9092
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
And here is a sample Quarkus application.properties file, with the kafka bootstrap server set to the same address as the advertised listener in docker-compose.yaml.
# Configure the SmallRye Kafka connector
# Dev GQ - Laptop
mp.messaging.connector.smallrye-kafka.bootstrap.servers=172.23.240.1:9092
# AWS Pre-Prod
#mp.messaging.connector.smallrye-kafka.bootstrap.servers=11.122.200.229:9092
quarkus.kafka.health.enabled=true
# Configure the Kafka sink (we write to it)
mp.messaging.outgoing.generated-price.connector=smallrye-kafka
mp.messaging.outgoing.generated-price.topic=prices
mp.messaging.outgoing.generated-price.value.serializer=org.apache.kafka.common.serialization.IntegerSerializer
# Configure the Kafka source (we read from it)
mp.messaging.incoming.prices.connector=smallrye-kafka
mp.messaging.incoming.prices.topic=prices
# ..... more codes
version: "3"
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka
    hostname: kafka
    ports:
      - "9093:9093"
      - "9092:9092"
    environment:
      TZ: CST-8
      KAFKA_BROKER_ID: 3
      KAFKA_ADVERTISED_HOST_NAME: kafka
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: INSIDE://:9092,OUTSIDE://:9093
      KAFKA_ADVERTISED_LISTENERS: INSIDE://kafka:9092,OUTSIDE://${Your_External_IP}:9093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    links:
      - zookeeper
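To make the INSIDE/OUTSIDE split concrete: a client bootstraps against one of the listener ports, and the broker answers with the advertised address registered under the same listener name. The sketch below models that lookup with plain string parsing. It is an illustration, not Kafka code; the function names are made up and 203.0.113.10 is a placeholder standing in for ${Your_External_IP}.

```python
def parse_listeners(value):
    """Map listener name -> (host, port) for a listeners-style string."""
    result = {}
    for entry in value.split(","):
        name, _, address = entry.strip().partition("://")
        host, _, port = address.rpartition(":")
        result[name] = (host, int(port))
    return result


def advertised_endpoint(listeners, advertised, port):
    """Return the advertised (host, port) for the listener bound to `port`."""
    bound = parse_listeners(listeners)
    announced = parse_listeners(advertised)
    for name, (_, bound_port) in bound.items():
        if bound_port == port:
            return announced[name]
    raise ValueError("no listener bound to port {0}".format(port))


LISTENERS = "INSIDE://:9092,OUTSIDE://:9093"
ADVERTISED = "INSIDE://kafka:9092,OUTSIDE://203.0.113.10:9093"  # placeholder IP

print(advertised_endpoint(LISTENERS, ADVERTISED, 9092))
print(advertised_endpoint(LISTENERS, ADVERTISED, 9093))
```

A client outside the Docker network should therefore bootstrap on 9093, since connecting on 9092 would hand it the hostname kafka, which only resolves inside the network.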

Timeout on Kinesis Port when creating Kinesis Stream on Localstack

I'm trying to create a Kinesis stream using Localstack running on Docker.
My docker-compose.yml looks like this:
version: '3.2'
services:
  localstack:
    image: localstack/localstack:latest
    container_name: localstack_test_serialize
    ports:
      - '4563-4599:4563-4599'
      - '8055:8080'
    environment:
      - SERVICES=s3,kinesis:4569
      - DEBUG=1
      - DATA_DIR=/tmp/localstack/data
    volumes:
      - './.localstack:/tmp/localstack'
      - '/var/run/docker.sock:/var/run/docker.sock'
Running docker-compose up -d starts everything just fine, and I'm able to create an S3 bucket on the normal S3 port.
However, when I try to run
aws --endpoint-url=http://localhost:4569 kinesis create-stream --stream-name sample-application-stream --shard-count 1
to create a Kinesis stream, I end up getting a timeout message for port 4569.
Any idea what I'm doing wrong or why Localstack isn't letting me create this stream?
You could use port 4568.
The LocalStack documentation lists this port for Kinesis.

Unable to send messages to Kafka

I created a Kafka container from wurstmeister's Docker image and then followed steps 3, 4 and 5 from the Apache documentation to produce and consume messages. Unfortunately, this does not work. Three seconds after I send a message, I get the following error:
I can't find solutions for the given error. So what do I have to do to solve this issue?
docker-compose.yml
version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    build: .
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: localhost
      KAFKA_CREATE_TOPICS: "test:1:1"
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
EDIT
The zookeeper is reachable.
bin/kafka-topics.sh --list --zookeeper localhost:2181
returns:
baeldung
filtered
greeting
partitioned
test
So only the broker is not available.
In my opinion it is very strange, but it seems that it was a simple conflict due to the version. The wurstmeister/kafka image is using Kafka version kafka_2.12-0.10.2.1.tgz, so of course I downloaded the same version for the consumer and producer clients. That led to the result I posted above.
Now I tried version kafka_2.11-0.11.0.0.tgz and everything is working fine. I thought the versions had to match for this to work, but obviously that is not the case.