apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
  labels:
    app: kafka
  namespace: kafka
spec:
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  serviceName: kafka
  template:
    spec:
      containers:
        - name: kafka
          image: debezium/kafka
          ports:
            - name: kafka-int-port
              containerPort: 9092
            - name: kafka-ext-port
              containerPort: 9093
              hostPort: 0
          command:
            - sh
            - -c
          args:
            - BROKER_ID=${POD_NAME##*-} KAFKA_ADVERTISED_LISTENERS=EXTERNAL://localhost:${HOST_PORT},INTERNAL://${POD_NAME}:9092 /docker-entrypoint.sh start
            # - /bin/start.sh
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: HOST_PORT
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
                resourceFieldRef:
                  containerName: kafka
                  resource: ports
                  name: "kafka-ext-port"
                  fieldPath: [?(#.name=="kafka-ext-port")].hostPort
            - name: "ZOOKEEPER_CONNECT"
              value: zookeeper-service.kafka.svc.cluster.local
            - name: "KAFKA_LISTENERS"
              value: "EXTERNAL://:9093,INTERNAL://:9092"
            - name: KAFKA_INTER_BROKER_LISTENER_NAME
              value: "INTERNAL"
            - name: KAFKA_LISTENER_SECURITY_PROTOCOL_MAP
              value: "INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT"
    metadata:
      name: kafka
      labels:
        app: kafka
I am creating a StatefulSet that runs three replicas of the Debezium Kafka image, each exposing two ports: kafka-int-port and kafka-ext-port. The kafka-ext-port port is exposed on the host through hostPort, and I want its value to be generated dynamically.
Each replica also receives BROKER_ID, KAFKA_ADVERTISED_LISTENERS, and other environment variables. BROKER_ID is derived from the replica's ordinal index, and KAFKA_ADVERTISED_LISTENERS is set to EXTERNAL://localhost:${HOST_PORT},INTERNAL://${POD_NAME}:9092.
For the HOST_PORT environment variable, I am trying to combine the pod's status.hostIP with the dynamically generated hostPort of kafka-ext-port, extracted from the ports field of the kafka container.
This does not work. Can someone please help me extract the hostPort value?
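For context: the Downward API fieldRef exposes fields such as metadata.name, status.hostIP and status.podIP, and resourceFieldRef covers resource requests and limits. As far as I know neither has a selector for a container's hostPort, and hostPort: 0 is treated as unset rather than as "pick a free port". The part that can be derived inside the container is the broker ID and the listener string. The sketch below mirrors the shell expansion ${POD_NAME##*-} in Python; the external host and port are assumed to be supplied some other way (for example a fixed hostPort), and the function names are hypothetical.

```python
def broker_id_from_pod_name(pod_name: str) -> int:
    """Mirror the shell expansion ${POD_NAME##*-}: keep what follows the last '-'."""
    return int(pod_name.rsplit("-", 1)[-1])


def advertised_listeners(pod_name: str, external_host: str, external_port: int) -> str:
    """Build the KAFKA_ADVERTISED_LISTENERS value used in the manifest above.

    external_host and external_port are assumptions: they must come from
    somewhere other than the Downward API (fixed hostPort, API lookup, ...).
    """
    return f"EXTERNAL://{external_host}:{external_port},INTERNAL://{pod_name}:9092"


# Example values only; 30093 is a made-up port.
print(broker_id_from_pod_name("kafka-2"))
print(advertised_listeners("kafka-2", "localhost", 30093))
```

With a StatefulSet the pod name always ends in the ordinal, so the broker ID stays stable across restarts.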
Related
I'm developing a microservices-based application deployed with Kubernetes for a university project. I'm new to Kubernetes and Kafka, and I'm trying to run Kafka and ZooKeeper in the same minikube cluster. I created one pod for Kafka and one for ZooKeeper, but after I deploy them they restart repeatedly and end up in "CrashLoopBackOff". In the logs I noticed that Kafka throws "ConnectException: Connection refused", so it seems that Kafka cannot establish a connection with ZooKeeper. I created the pods manually with the following config files:
zookeeper-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: zookeeper
spec:
  selector:
    matchLabels:
      app: zookeeper
  template:
    metadata:
      labels:
        app: zookeeper
    spec:
      containers:
        - name: zookeeper
          image: bitnami/zookeeper
          ports:
            - name: http
              containerPort: 2181
          env:
            - name: ALLOW_ANONYMOUS_LOGIN
              value: "yes"
---
apiVersion: v1
kind: Service
metadata:
  name: zookeeper
spec:
  selector:
    app: zookeeper
  ports:
    - protocol: TCP
      port: 2181
kafka-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka
spec:
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
        - name: kafka
          image: bitnami/kafka
          ports:
            - name: http
              containerPort: 9092
          env:
            - name: KAFKA_POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: KAFKA_BROKER_ID
              value: "1"
            - name: ALLOW_PLAINTEXT_LISTENER
              value: "yes"
            - name: KAFKA_CFG_LISTENERS
              value: PLAINTEXT://:9092
            - name: KAFKA_CFG_ADVERTISED_LISTENERS
              value: PLAINTEXT://$(KAFKA_POD_IP):9092
            - name: KAFKA_LISTENER_SECURITY_PROTOCOL_MAP
              value: PLAINTEXT:PLAINTEXT
            - name: KAFKA_CFG_ZOOKEEPER_CONNECT
              value: zookeeper:2181
---
apiVersion: v1
kind: Service
metadata:
  name: kafka
spec:
  selector:
    app: kafka
  ports:
    - protocol: TCP
      port: 9092
  type: LoadBalancer
The Kafka and ZooKeeper configurations are more or less the same as the ones I used with Docker Compose without errors, so there is probably something wrong in my Kubernetes configuration. Could anyone help me, please? I don't understand the issue. Thanks.
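One way to narrow down a "Connection refused" like this is to check, from a pod in the same namespace, whether the ZooKeeper Service answers on its client port at all. The following is a minimal sketch of such a probe, assuming Python is available in the pod you run it from; the helper name can_connect is hypothetical.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    DNS failures, refused connections and timeouts are all OSError
    subclasses, so each of them results in False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling can_connect("zookeeper", 2181) from inside the cluster would show whether the address in KAFKA_CFG_ZOOKEEPER_CONNECT is reachable. A False result points at the Service, its selector, or the ZooKeeper pod rather than at Kafka itself.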
Good day everyone!
The main problem: I want to connect from my local machine to Kafka, which runs in a container on a Kubernetes cluster (call the node's DNS name node03.st) and is deployed from my own manifest.
The ZooKeeper Deployment manifest is here (image: confluentinc/cp-zookeeper:6.2.4):
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: aptmess
  name: zookeeper-aptmess-deployment
  labels:
    name: zookeeper-service-filter
spec:
  selector:
    matchLabels:
      app: zookeeper-label
  template:
    metadata:
      labels:
        app: zookeeper-label
    spec:
      containers:
        - name: zookeeper
          image: confluentinc/cp-zookeeper:6.2.4
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 2181 # ZK client
              name: client
            - containerPort: 2888 # Follower
              name: follower
            - containerPort: 3888 # Election
              name: election
            - containerPort: 8080 # AdminServer
              name: admin-server
          env:
            - name: ZOOKEEPER_ID
              value: "1"
            - name: ZOOKEEPER_SERVER_1
              value: zookeeper
            - name: ZOOKEEPER_CLIENT_PORT
              value: "2181"
            - name: ZOOKEEPER_TICK_TIME
              value: "2000"
---
apiVersion: v1
kind: Service
metadata:
  namespace: aptmess
  name: zookeeper-service-aptmess
  labels:
    name: zookeeper-service-filter
spec:
  type: NodePort
  ports:
    - port: 2181
      protocol: TCP
      name: client
    - name: follower
      port: 2888
      protocol: TCP
    - name: election
      port: 3888
      protocol: TCP
    - port: 8080
      protocol: TCP
      name: admin-server
  selector:
    app: zookeeper-label
My kafka StatefulSet manifest (image: confluentinc/cp-kafka:6.2.4):
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: aptmess
  name: kafka-stateful-set-aptmess
  labels:
    name: kafka-service-filter
spec:
  serviceName: kafka-broker
  replicas: 1
  podManagementPolicy: Parallel
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: kafka-label
  template:
    metadata:
      labels:
        app: kafka-label
    spec:
      volumes:
        - name: config
          emptyDir: {}
        - name: extensions
          emptyDir: {}
        - name: kafka-storage
          persistentVolumeClaim:
            claimName: kafka-data-claim
      terminationGracePeriodSeconds: 300
      containers:
        - name: kafka
          image: confluentinc/cp-kafka:6.2.4
          imagePullPolicy: Always
          ports:
            - containerPort: 9092
          resources:
            requests:
              memory: "2Gi"
              cpu: "1"
          command:
            - bash
            - -c
            - unset KAFKA_PORT; /etc/confluent/docker/run
          env:
            - name: KAFKA_ADVERTISED_HOST_NAME
              value: kafka-broker
            - name: KAFKA_ZOOKEEPER_CONNECT
              value: zookeeper-service-aptmess:2181
            - name: KAFKA_BROKER_ID
              value: "1"
            - name: KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR
              value: "1"
            - name: KAFKA_LISTENER_SECURITY_PROTOCOL_MAP
              value: "PLAINTEXT:PLAINTEXT,CONNECTIONS_FROM_HOST:PLAINTEXT"
            - name: KAFKA_INTER_BROKER_LISTENER_NAME
              value: "PLAINTEXT"
            - name: KAFKA_LISTENERS
              value: "PLAINTEXT://0.0.0.0:9092"
            - name: KAFKA_ADVERTISED_LISTENERS
              value: "PLAINTEXT://kafka-broker.aptmess.svc.cluster.local:9092"
          volumeMounts:
            - name: config
              mountPath: /etc/kafka
            - name: extensions
              mountPath: /opt/kafka/libs/extensions
            - name: kafka-storage
              mountPath: /var/lib/kafka/
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
---
apiVersion: v1
kind: Service
metadata:
  namespace: aptmess
  name: kafka-broker
  labels:
    name: kafka-service-filter
spec:
  type: NodePort
  ports:
    - port: 9092
      name: kafka-port
      protocol: TCP
  selector:
    app: kafka-label
NodePort for port 9092 is 30000.
When I try to connect from localhost, I get an error:
from kafka import KafkaProducer
producer = KafkaProducer(
    bootstrap_servers=['node03.st:30000']
)
>> Error connecting to node kafka-broker.aptmess.svc.cluster.local:9092 (id: 1 rack: null)
I spent a long time changing the internal and external listeners, but it did not help. What should I do to send messages from my local machine to the remote Kafka broker?
Thanks in advance!
P.S. I have searched these links for answers:
Use SCRAM-SHA-512 authentication with SSL on LoadBalancer in Strimzi Kafka
https://github.com/strimzi/strimzi-kafka-operator/issues/1156
https://github.com/strimzi/strimzi-kafka-operator/issues/1463
https://githubhelp.com/Yolean/kubernetes-kafka/issues/328?ysclid=l4grqi7hc6364785597
Connecting Kafka running on EC2 machine from my local machine
Access kafka broker in a remote machine ERROR
How to Connect to kafka on localhost (host machine) from app inside kubernetes (minikube)
kafka broker not available at starting
https://github.com/SOHU-Co/kafka-node/issues/666
https://docs.confluent.io/operator/current/co-nodeports.html
https://developers.redhat.com/blog/2019/06/07/accessing-apache-kafka-in-strimzi-part-2-node-ports
https://www.confluent.io/blog/kafka-client-cannot-connect-to-broker-on-aws-on-docker-etc/
Kafka in Kubernetes Cluster- How to publish/consume messages from outside of Kubernetes Cluster
Kafka docker compose external connection
confluentinc image
"NodePort for port 9092 is 30000"
Then you need to advertise that node's hostname and NodePort in KAFKA_ADVERTISED_LISTENERS, as many of the linked posts mention. You have defined only one listener, and it is internal to Kubernetes, so a client outside the cluster is told to reconnect to kafka-broker.aptmess.svc.cluster.local:9092, which it cannot resolve. Keep in mind, however, that this is a poor solution unless you force the broker pod to run only on that one host and use that one port.
Alternatively, replace your setup with the Strimzi operator and read how you can use Ingress resources (ideally) to access the Kafka cluster; NodePort is also supported - https://strimzi.io/blog/2019/04/17/accessing-kafka-part-1/ (cross-reference with the latest documentation, since that is an old post).
Ingress would be ideal because the Ingress controller can route requests dynamically to the broker pods while keeping a fixed external address. Otherwise, you will constantly need to use the Kubernetes API to describe the broker pods and get their current port information.
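To make the NodePort variant concrete, here is a rough sketch of the listener settings for one broker with a separate internal and external listener. The listener names INTERNAL and EXTERNAL, the second container port 9093, and the helper name two_listener_env are assumptions for illustration; the host node03.st and NodePort 30000 come from the question. The NodePort Service would also need to target the external listener's container port.

```python
from typing import Dict


def two_listener_env(internal_host: str, node_host: str, node_port: int,
                     internal_port: int = 9092, external_port: int = 9093) -> Dict[str, str]:
    """Return Confluent-image style env vars for an internal and an external listener.

    The broker binds both listeners inside the pod, but advertises the node's
    hostname and NodePort to clients that arrive through the external one.
    """
    return {
        "KAFKA_LISTENERS": (
            f"INTERNAL://0.0.0.0:{internal_port},EXTERNAL://0.0.0.0:{external_port}"
        ),
        "KAFKA_ADVERTISED_LISTENERS": (
            f"INTERNAL://{internal_host}:{internal_port},EXTERNAL://{node_host}:{node_port}"
        ),
        "KAFKA_LISTENER_SECURITY_PROTOCOL_MAP": "INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT",
        "KAFKA_INTER_BROKER_LISTENER_NAME": "INTERNAL",
    }


env = two_listener_env("kafka-broker.aptmess.svc.cluster.local", "node03.st", 30000)
for key, value in env.items():
    print(f"{key}={value}")
```

The client bootstraps against node03.st:30000, and the metadata it receives then also points at node03.st:30000 instead of the cluster-internal name.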
I am trying to deploy ZooKeeper and Kafka on Kubernetes using the confluentinc Docker images. I based my solution on this question and this post. ZooKeeper runs without errors in its log. I want to deploy 3 Kafka brokers using a StatefulSet. The problem with my YAML files is that I don't know how to configure the KAFKA_ADVERTISED_LISTENERS property for Kafka when using 3 brokers.
Here are the YAML files for ZooKeeper:
apiVersion: v1
kind: Service
metadata:
  name: zookeeper
  labels:
    app: zookeeper
spec:
  clusterIP: None
  ports:
    - name: client
      port: 2181
      protocol: TCP
    - name: follower
      port: 2888
      protocol: TCP
    - name: leader
      port: 3888
      protocol: TCP
  selector:
    app: zookeeper
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zookeeper
spec:
  replicas: 1
  serviceName: zookeeper
  selector:
    matchLabels:
      app: zookeeper # has to match .spec.template.metadata.labels
  template:
    metadata:
      labels:
        app: zookeeper # has to match .spec.selector.matchLabels
    spec:
      hostname: zookeeper
      containers:
        - name: zookeeper
          image: confluentinc/cp-zookeeper:5.5.0
          ports:
            - containerPort: 2181
          env:
            - name: ZOOKEEPER_CLIENT_PORT
              value: "2181"
            - name: ZOOKEEPER_ID
              value: "1"
            - name: ZOOKEEPER_SERVER_1
              value: zookeeper
and for the kafka broker:
apiVersion: v1
kind: Service
metadata:
  name: kafka-service
  labels:
    app: kafka
spec:
  type: LoadBalancer
  ports:
    - port: 9092
      name: kafka-port
      protocol: TCP
  selector:
    app: kafka
    id: "0"
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  replicas: 3
  serviceName: kafka
  podManagementPolicy: OrderedReady
  selector:
    matchLabels:
      app: kafka # has to match .spec.template.metadata.labels
  template:
    metadata:
      labels:
        app: kafka # has to match .spec.selector.matchLabels
    spec:
      containers:
        - name: kafka
          image: confluentinc/cp-enterprise-kafka:5.5.0
          ports:
            - containerPort: 9092
          env:
            - name: KAFKA_ZOOKEEPER_CONNECT
              value: zookeeper:2181 # zookeeper-2.zookeeper.default.svc.cluster.local
            - name: KAFKA_ADVERTISED_LISTENERS
              value: "LISTENER_0://kafka-0:9092,LISTENER_1://kafka-1:9093,LISTENER_2://kafka-2:9094"
            - name: KAFKA_LISTENER_SECURITY_PROTOCOL_MAP
              value: "LISTENER_0:PLAINTEXT,LISTENER_1:PLAINTEXT,LISTENER_2:PLAINTEXT"
            - name: KAFKA_INTER_BROKER_LISTENER_NAME
              value: LISTENER_0
All three Kafka pods start; kafka-0 connects, but kafka-1 and kafka-2 do not.
$ kubectl get pods -o wide
NAME          READY   STATUS             RESTARTS   AGE     IP           NODE       NOMINATED NODE   READINESS GATES
kafka-0       1/1     Running            0          4m12s   172.17.0.4   minikube   <none>           <none>
kafka-1       1/1     Running            5          4m9s    172.17.0.5   minikube   <none>           <none>
kafka-2       0/1     CrashLoopBackOff   4          4m7s    172.17.0.6   minikube   <none>           <none>
zookeeper-0   1/1     Running            0          21m     172.17.0.3   minikube   <none>           <none>
The error says that the endpoints kafka-0:9092, kafka-1:9093 and kafka-2:9094 were already registered by the first pod, kafka-0. So I suppose the value has to be set per pod. How do I configure that?
[2020-09-30 14:56:40,519] ERROR [KafkaServer id=1017] Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
java.lang.IllegalArgumentException: requirement failed: Configured end points kafka-0:9092,kafka-1:9093,kafka-2:9094 in advertised listeners are already registered by broker 1012
at kafka.server.KafkaServer.$anonfun$createBrokerInfo$3(KafkaServer.scala:436)
at kafka.server.KafkaServer.$anonfun$createBrokerInfo$3$adapted(KafkaServer.scala:434)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at kafka.server.KafkaServer.createBrokerInfo(KafkaServer.scala:434)
at kafka.server.KafkaServer.startup(KafkaServer.scala:293)
at io.confluent.support.metrics.SupportedServerStartable.startup(SupportedServerStartable.java:114)
at io.confluent.support.metrics.SupportedKafka.main(SupportedKafka.java:66)
I have been reading this blog post "Kafka Listeners - Explained" and I was able to configure 3 Kafka brokers with the following configuration.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  replicas: 3
  serviceName: kafka
  podManagementPolicy: OrderedReady
  selector:
    matchLabels:
      app: kafka # has to match .spec.template.metadata.labels
  template:
    metadata:
      labels:
        app: kafka # has to match .spec.selector.matchLabels
    spec:
      restartPolicy: Always
      containers:
        - name: kafka
          image: confluentinc/cp-enterprise-kafka:5.5.0
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits: # limit of 0.5 cpu and 512MiB of memory
              memory: "512Mi"
              cpu: "500m"
          # imagePullPolicy: Always
          ports:
            - containerPort: 9092
              name: kafka-0
            - containerPort: 9093
              name: kafka-1
            - containerPort: 9094
              name: kafka-2
          env:
            - name: MY_METADATA_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: MY_POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: STAS_DELAY
              value: "120"
            - name: KAFKA_ZOOKEEPER_CONNECT
              value: zookeeper:2181 # zookeeper-2.zookeeper.default.svc.cluster.local
            - name: KAFKA_ADVERTISED_LISTENERS
              value: "INSIDE://$(MY_POD_IP):9092"
            - name: KAFKA_LISTENERS
              value: "INSIDE://$(MY_POD_IP):9092"
            - name: KAFKA_LISTENER_SECURITY_PROTOCOL_MAP
              value: "INSIDE:PLAINTEXT"
            - name: KAFKA_INTER_BROKER_LISTENER_NAME
              value: "INSIDE"
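The difference between the failing and the working configuration can be illustrated with a small check that roughly mirrors the startup error above: every broker must advertise endpoints that no other broker has registered. This is only a sketch of the idea, not Kafka's actual implementation. The helper names are hypothetical, and the per-pod variant assumes a headless Service named kafka so that names like kafka-0.kafka resolve; the configuration above uses the pod IP instead.

```python
from collections import defaultdict
from typing import Dict, List


def endpoints(advertised: str) -> List[str]:
    """Return the host:port part of each LISTENER://host:port entry."""
    return [entry.split("://", 1)[1] for entry in advertised.split(",") if entry]


def duplicate_endpoints(advertised_by_broker: Dict[str, str]) -> Dict[str, List[str]]:
    """Map every endpoint advertised by more than one broker to those brokers."""
    owners = defaultdict(list)
    for broker, advertised in advertised_by_broker.items():
        for endpoint in endpoints(advertised):
            owners[endpoint].append(broker)
    return {ep: brokers for ep, brokers in owners.items() if len(brokers) > 1}


PODS = ["kafka-0", "kafka-1", "kafka-2"]

# The original manifest: the same static string in every replica.
STATIC = "LISTENER_0://kafka-0:9092,LISTENER_1://kafka-1:9093,LISTENER_2://kafka-2:9094"
static_cfg = {pod: STATIC for pod in PODS}

# Per-pod value derived from the pod's own identity (assumed headless Service "kafka").
per_pod_cfg = {pod: f"INSIDE://{pod}.kafka:9092" for pod in PODS}

print(duplicate_endpoints(static_cfg))   # every endpoint is claimed by all three pods
print(duplicate_endpoints(per_pod_cfg))  # no conflicts
```

Because a StatefulSet template is identical for every replica, the per-pod part has to come from the Downward API (pod name or pod IP), as in the configuration above.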
I'm migrating an application to Docker/Kubernetes. This application has 20+ well-known ports it needs to be accessed on. It needs to be accessed from outside the kubernetes cluster. For this, the application writes its public accessible IP to a database so the outside service knows how to access it. The IP is taken from the downward API (status.hostIP).
One solution is defining the well-known ports as (static) nodePorts in the service, but I don't want this, because it will limit the usability of the node: if another service has started and incidentally taken one of the known ports the application will not be able to start. Also, because Kubernetes opens the ports on all nodes in the cluster, I can only run 1 instance of the application per cluster.
Now I want to make the application aware of the port mappings done by the NodePort Service. How can this be done? I don't see a hard link between the Service and the StatefulSet objects in Kubernetes.
Here is my (simplified) Kubernetes config:
apiVersion: v1
kind: Service
metadata:
  name: my-app-svc
  labels:
    app: my-app
spec:
  ports:
    - port: 6000
      targetPort: 6000
      protocol: TCP
      name: debug-port
    - port: 6789
      targetPort: 6789
      protocol: TCP
      name: traffic-port-1
  selector:
    app: my-app
  type: NodePort
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-app-sf
spec:
  serviceName: my-app-svc
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-repo/myapp/my-app:latest
          imagePullPolicy: Always
          env:
            - name: K8S_ServiceAccountName
              valueFrom:
                fieldRef:
                  fieldPath: spec.serviceAccountName
            - name: K8S_ServerIP
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
            - name: serverName
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          ports:
            - name: debug
              containerPort: 6000
            - name: traffic1
              containerPort: 6789
This can be done with an initContainer.
You can define an initContainer that looks up the nodePort and saves it into a directory shared with the main container, which can then read the nodePort from that directory. A simple demo looks like this:
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: my-app
      image: busybox
      command: ["sh", "-c", "cat /data/port; while true; do sleep 3600; done"]
      volumeMounts:
        - name: config-data
          mountPath: /data
  initContainers:
    - name: config-data
      image: tutum/curl
      command: ["sh", "-c", "TOKEN=`cat /var/run/secrets/kubernetes.io/serviceaccount/token`; curl -kD - -H \"Authorization: Bearer $TOKEN\" https://kubernetes.default:443/api/v1/namespaces/test/services/app 2>/dev/null | grep nodePort | awk '{print $2}' > /data/port"]
      volumeMounts:
        - name: config-data
          mountPath: /data
  volumes:
    - name: config-data
      emptyDir: {}
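When the Service has several ports, as in the question, grep returns one nodePort line per port, so it is safer to select the port by name from the JSON that the API returns. Below is a minimal sketch of that parsing step only. The helper name node_port_for and the nodePort values in the sample are made up, and fetching the Service object is assumed to happen as in the demo above.

```python
import json
from typing import Any, Dict


def node_port_for(service: Dict[str, Any], port_name: str) -> int:
    """Return the nodePort of the Service port with the given name."""
    for port in service.get("spec", {}).get("ports", []):
        if port.get("name") == port_name:
            return int(port["nodePort"])
    raise KeyError(f"no port named {port_name!r} in the Service")


# Sample shaped like the Service in the question; nodePort values are invented.
SAMPLE_SERVICE = json.loads("""
{
  "kind": "Service",
  "metadata": {"name": "my-app-svc"},
  "spec": {
    "type": "NodePort",
    "ports": [
      {"name": "debug-port", "port": 6000, "targetPort": 6000, "nodePort": 31000},
      {"name": "traffic-port-1", "port": 6789, "targetPort": 6789, "nodePort": 31789}
    ]
  }
}
""")

print(node_port_for(SAMPLE_SERVICE, "traffic-port-1"))
```

The initContainer could write one file per port name into the shared volume, which the application then reads at startup.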
I am trying to set up a multi-broker Kafka on a Kubernetes cluster hosted in Azure. I have a single-broker setup working. For the multi-broker setup, I currently have an ensemble of three ZooKeeper nodes that manage the Kafka service. I am deploying the Kafka cluster as a replication controller with three replicas, that is, three brokers. How can I register the three brokers with ZooKeeper so that they register different IP addresses?
I bring up my replication controller after the service is deployed and use the Cluster IP in my replication-controller yaml file to specify two advertised.listeners, one for SSL and another for PLAINTEXT. However, in this scenario all brokers register with the same IP and write to replicas fail. I don't want to deploy each broker as a separate replication controller/pod and service as scaling becomes an issue. I would really appreciate any thoughts/ideas on this.
Edit 1:
I am additionally trying to expose the cluster to another VPC in cloud. I have to expose SSL and PLAINTEXT ports for clients which I am doing using advertised.listeners. If I use a statefulset with replication factor of 3 and let kubernetes expose the canonical host names of the pods as host names, these cannot be resolved from an external client. The only way I got this working is to use/expose an external service corresponding to each broker. However, this does not scale.
Kubernetes has the concept of StatefulSets to solve these issues. Each instance of a StatefulSet has its own DNS name, so you can reference each instance by that name.
This concept is described here in more detail. You can also take a look at this complete example:
apiVersion: v1
kind: Service
metadata:
  name: zk-headless
  labels:
    app: zk-headless
spec:
  ports:
    - port: 2888
      name: server
    - port: 3888
      name: leader-election
  clusterIP: None
  selector:
    app: zk
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: zk-config
data:
  ensemble: "zk-0;zk-1;zk-2"
  jvm.heap: "2G"
  tick: "2000"
  init: "10"
  sync: "5"
  client.cnxns: "60"
  snap.retain: "3"
  purge.interval: "1"
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-budget
spec:
  selector:
    matchLabels:
      app: zk
  minAvailable: 2
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: zk
spec:
  serviceName: zk-headless
  replicas: 3
  template:
    metadata:
      labels:
        app: zk
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                      - zk-headless
              topologyKey: "kubernetes.io/hostname"
      containers:
        - name: k8szk
          imagePullPolicy: Always
          image: gcr.io/google_samples/k8szk:v1
          resources:
            requests:
              memory: "4Gi"
              cpu: "1"
          ports:
            - containerPort: 2181
              name: client
            - containerPort: 2888
              name: server
            - containerPort: 3888
              name: leader-election
          env:
            - name: ZK_ENSEMBLE
              valueFrom:
                configMapKeyRef:
                  name: zk-config
                  key: ensemble
            - name: ZK_HEAP_SIZE
              valueFrom:
                configMapKeyRef:
                  name: zk-config
                  key: jvm.heap
            - name: ZK_TICK_TIME
              valueFrom:
                configMapKeyRef:
                  name: zk-config
                  key: tick
            - name: ZK_INIT_LIMIT
              valueFrom:
                configMapKeyRef:
                  name: zk-config
                  key: init
            - name: ZK_SYNC_LIMIT
              valueFrom:
                configMapKeyRef:
                  name: zk-config
                  key: sync
            - name: ZK_MAX_CLIENT_CNXNS
              valueFrom:
                configMapKeyRef:
                  name: zk-config
                  key: client.cnxns
            - name: ZK_SNAP_RETAIN_COUNT
              valueFrom:
                configMapKeyRef:
                  name: zk-config
                  key: snap.retain
            - name: ZK_PURGE_INTERVAL
              valueFrom:
                configMapKeyRef:
                  name: zk-config
                  key: purge.interval
            - name: ZK_CLIENT_PORT
              value: "2181"
            - name: ZK_SERVER_PORT
              value: "2888"
            - name: ZK_ELECTION_PORT
              value: "3888"
          command:
            - sh
            - -c
            - zkGenConfig.sh && zkServer.sh start-foreground
          readinessProbe:
            exec:
              command:
                - "zkOk.sh"
            initialDelaySeconds: 15
            timeoutSeconds: 5
          livenessProbe:
            exec:
              command:
                - "zkOk.sh"
            initialDelaySeconds: 15
            timeoutSeconds: 5
          volumeMounts:
            - name: datadir
              mountPath: /var/lib/zookeeper
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
  volumeClaimTemplates:
    - metadata:
        name: datadir
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 20Gi
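The per-instance DNS names that a StatefulSet provides follow a fixed pattern, which is what makes it possible to give each broker or ZooKeeper node its own address. The sketch below generates those names for the zk example above. It assumes the default namespace and the default cluster domain cluster.local, and the helper names are hypothetical.

```python
from typing import List


def stateful_pod_dns(name: str, service: str, replicas: int,
                     namespace: str = "default",
                     cluster_domain: str = "cluster.local") -> List[str]:
    """Return the stable DNS name of every pod in a StatefulSet.

    Pattern: <statefulset>-<ordinal>.<headless-service>.<namespace>.svc.<cluster-domain>
    """
    return [
        f"{name}-{i}.{service}.{namespace}.svc.{cluster_domain}"
        for i in range(replicas)
    ]


def ensemble(name: str, replicas: int) -> str:
    """Build the semicolon-separated pod list used for the ensemble key in zk-config."""
    return ";".join(f"{name}-{i}" for i in range(replicas))


for host in stateful_pod_dns("zk", "zk-headless", 3):
    print(host)
print(ensemble("zk", 3))
```

The same pattern applies to a Kafka StatefulSet: each broker can advertise its own name, so the brokers no longer register the same address.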