Strimzi Kubernetes Kafka Cluster ID not matching

I am currently facing strange behaviour which I cannot really pin down.
I created a Kafka cluster and everything worked fine. Then I wanted to rebuild it in a different namespace, so I deleted everything (Kafka CRD, Strimzi YAML, and devices) and recreated it all in the new namespace.
Now I am getting constant error messages (Kafka does not start up):
The Cluster ID oAF2NGklQ0mF2_f3U0oJuw doesn't match stored clusterId Some(Gjb2frSRSS-v6Bpjx-dyqg) in meta.properties. The broker is trying to join the wrong cluster. Configured zookeeper.connect may be wrong.
The thing is: I am not aware of keeping any state between the two namespaces (I dropped everything and formatted the filesystem). I even restarted the Kubernetes cluster node to make sure that even the emptyDir (strimzi-tmp) was reset, but no success.
The IDs stay the same (after reboot, rebuild of the filesystem, and drop and recreate of the application).
My Cluster Config:
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: kafka-cluster
spec:
  kafka:
    version: 3.1.0
    replicas: 1
    listeners:
      - name: tls
        port: 9093
        type: internal
        tls: true
        authentication:
          type: tls
    authorization:
      type: simple
    config:
      offsets.topic.replication.factor: 1
      transaction.state.log.replication.factor: 1
      transaction.state.log.min.isr: 1
      default.replication.factor: 1
      min.insync.replicas: 1
      inter.broker.protocol.version: "3.1"
    storage:
      type: persistent-claim
      size: 20Gi
      deleteClaim: false
    jmxOptions: {}
  zookeeper:
    replicas: 1
    storage:
      type: persistent-claim
      size: 20Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}
    userOperator: {}

The problem was that the wrong persistent volume was mounted. The pod did not mount the requested iSCSI volume, but some local storage instead.

pod has unbound immediate PersistentVolumeClaims

I deployed the milvus server using

etcd:
  enabled: true
  name: etcd
  replicaCount: 3
  pdb:
    create: false
  image:
    repository: "milvusdb/etcd"
    tag: "3.5.0-r7"
    pullPolicy: IfNotPresent
  service:
    type: ClusterIP
    port: 2379
    peerPort: 2380
  auth:
    rbac:
      enabled: false
  persistence:
    enabled: true
    storageClass:
    accessMode: ReadWriteOnce
    size: 10Gi
  # Enable auto compaction
  # compaction by every 1000 revision
  autoCompactionMode: revision
  autoCompactionRetention: "1000"
  # Increase default quota to 4G
  extraEnvVars:
    - name: ETCD_QUOTA_BACKEND_BYTES
      value: "4294967296"
    - name: ETCD_HEARTBEAT_INTERVAL
      value: "500"
    - name: ETCD_ELECTION_TIMEOUT
      value: "2500"

# Configuration values for the pulsar dependency
# ref: https://github.com/apache/pulsar-helm-chart
I am trying to run the milvus cluster using Kubernetes on an Ubuntu server.
I used the Helm manifest https://milvus-io.github.io/milvus-helm/
Values.yaml:
https://raw.githubusercontent.com/milvus-io/milvus-helm/master/charts/milvus/values.yaml
I checked the PersistentVolumeClaim and there was an error:
no persistent volumes available for this claim and no storage class is set
This error occurs because you don't have a PersistentVolume.
A PVC needs a PV with at least the same capacity as the PVC.
This can be done manually or with a volume provisioner.
The easiest way, some would say, is to use the local storageClass, which uses the disk space of the node where the pod is instantiated and adds a pod affinity so that the pod always starts on the same node and can use the volume on that disk. In your case you are using 3 replicas. Although it is possible to start all 3 instances on the same node, this is most likely not what you want to achieve with Kubernetes: if that node breaks, you won't have any instance running on another node.
You first need to think about the infrastructure of your cluster. Where should the data of the volumes be stored?
A Network File System (NFS) might be a good solution.
In this case you have an NFS somewhere in your infrastructure and all the nodes can reach it.
You can then create a PV which is accessible from all your nodes.
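A manually created NFS-backed PV might be sketched like this (the server address and share path are placeholders you would fill in; the capacity must cover what the PVC requests):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  capacity:
    storage: 10Gi               # must be at least the capacity requested by the PVC
  accessModes:
    - ReadWriteMany             # NFS volumes can be mounted from multiple nodes
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: <YOUR_NFS_SERVER_IP>
    path: <YOUR_NFS_SERVER_SHARE>
```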
To avoid allocating a PV manually every time, you can install a volume provisioner inside your cluster.
In some clusters I use this one:
https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner
As I said, you must already have an NFS and configure the provisioner YAML with its path.
It looks like this:
# patch_nfs_details.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nfs-client-provisioner
  name: nfs-client-provisioner
spec:
  template:
    spec:
      containers:
        - name: nfs-client-provisioner
          env:
            - name: NFS_SERVER
              value: <YOUR_NFS_SERVER_IP>
            - name: NFS_PATH
              value: <YOUR_NFS_SERVER_SHARE>
      volumes:
        - name: nfs-client-root
          nfs:
            server: <YOUR_NFS_SERVER_IP>
            path: <YOUR_NFS_SERVER_SHARE>
If you use an NFS without a provisioner, you need to define a storageClass which is linked to your NFS.
There are a lot of solutions for holding persistent volumes.
Here you can find a list of storage classes:
https://kubernetes.io/docs/concepts/storage/storage-classes/
In the end it also depends on where your cluster is provisioned, if you are not managing it yourself.
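As a sketch, a StorageClass tied to the nfs-subdir-external-provisioner mentioned above could look like the following (the provisioner name is that project's default; the class name and parameters are assumptions to adapt):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner  # default name used by that project
parameters:
  archiveOnDelete: "false"      # whether released volumes are archived instead of deleted
reclaimPolicy: Delete
```

A PVC that sets storageClassName: nfs-client would then get its volume carved out of the NFS share automatically.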

Strimzi Kafka Using local Node Storage

I am running Kafka on Kubernetes (deployed on Azure) using Strimzi for a development environment and would prefer to use internal Kubernetes node storage. If I use persistent-claim or jbod, it creates standard disks on Azure storage. However, I prefer to use internal node storage, as I have 16 GB available there. I do not want to use ephemeral storage, as I want the data to be persisted at least on the Kubernetes nodes.
The following is my deployment.yml:
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: kafka-cluster
spec:
  kafka:
    version: 3.1.0
    replicas: 2
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
      - name: external
        type: loadbalancer
        tls: false
        port: 9094
    config:
      offsets.topic.replication.factor: 2
      transaction.state.log.replication.factor: 2
      transaction.state.log.min.isr: 2
      default.replication.factor: 2
      min.insync.replicas: 2
      inter.broker.protocol.version: "3.1"
    storage:
      type: persistent-claim
      size: 2Gi
      deleteClaim: false
  zookeeper:
    replicas: 2
    storage:
      type: persistent-claim
      size: 2Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}
    userOperator: {}
The persistent-claim storage, as you use it, will provision the storage using the default storage class, which in your case I guess creates standard storage.
You have two options for using the local disk space of the worker node:
You can use the ephemeral type storage. But keep in mind that this is like a temporary directory: it will be lost in every rolling update. Also, if you for example delete all the pods at the same time, you will lose all data. As such it is recommended only for short-lived clusters in CI, maybe some short development work, etc. But certainly not for anything where you need reliability.
You can use Local Persistent Volumes, which are persistent volumes bound to a particular node. These are persistent, so the pods will re-use the volume between restarts and rolling updates. However, this binds the pod to the particular worker node the storage is on, so you cannot easily reschedule it to another worker node. But apart from these limitations, it is something that can (unlike the ephemeral storage) be used with reliability and availability when done right. Local persistent volumes are normally provisioned through a StorageClass as well, so in the Kafka custom resource in Strimzi you would still use the persistent-claim type storage, just with a different storage class.
You should really think about what exactly you want to use and why. From my experience, local persistent volumes are a great option when:
You run on bare-metal / on-premise clusters where good shared block storage is often not available
You require maximum performance (local storage does not depend on the network, so it can often be faster)
But in public clouds with good support for high-quality networked block storage, such as Amazon EBS volumes and their Azure or Google counterparts, local storage often brings more problems than advantages because of how it binds your Kafka brokers to a particular worker node.
Some more details about local persistent volumes can be found here: https://kubernetes.io/docs/concepts/storage/volumes/#local ... there are also different provisioners which can help you use them. I'm not sure if Azure supports anything out of the box.
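As a sketch, a local persistent volume with its no-provisioner StorageClass might look like this (the names and path are illustrative; the nodeAffinity is what pins the volume to one worker node):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner   # local PVs have no dynamic provisioner
volumeBindingMode: WaitForFirstConsumer     # delay binding until the pod is scheduled
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: kafka-local-pv
spec:
  capacity:
    storage: 16Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/kafka                  # a directory or mount on the worker node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - <YOUR_NODE_NAME>
```

The Kafka resource would then reference storage class local-storage in its persistent-claim storage.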
Sidenote: 2Gi of space is very small for Kafka. I'm not sure how much you will be able to do before running out of disk space. Even 16Gi would be quite small. If you know what you are doing, then fine. But if not, you should be careful.

how to migrate VM data from a disk to kubernetes cluster

How do I migrate VM data from a disk to a Kubernetes cluster?
I have a VM with three disks attached and mounted, each holding data that needs to be migrated to a Kubernetes cluster and attached to database services in a StatefulSet.
This link https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/preexisting-pd does show me the way, but I don't know how to use or implement it with StatefulSets so that one particular database resource (like Postgres) is able to use the same PV (created from one of those GCE persistent disks) and create multiple PVCs for new replicas.
Is the scenario I'm describing achievable?
If yes, how do I implement it?
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: 6.8.12
  http:
    tls:
      selfSignedCertificate:
        disabled: true
  nodeSets:
    - name: default
      count: 3
      config:
        node.store.allow_mmap: false
        xpack.security.enabled: false
        xpack.security.authc:
          anonymous:
            username: anonymous
            roles: superuser
            authz_exception: false
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 50Gi
            storageClassName: managed-premium
A StatefulSet is preferable to manually creating a PersistentVolumeClaim. For database workloads that can't be replicated, you can't set replicas: greater than 1 in either case, but the PVC management is valuable. You usually can't have multiple databases pointing at the same physical storage, containers or otherwise, and most types of volumes can't be shared across pods.
Postgres doesn't support multiple instances sharing the underlying volume without massive corruption, so if you did set things up that way, it's definitely a mistake. More common would be to use the volumeClaimTemplate system so each pod gets its own distinct storage. Then you set up Postgres streaming replication yourself.
Refer to the document on Run a replicated stateful application and this Stack Overflow post for more information.
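For the pre-existing GCE disk from the linked guide, a PV/PVC pair could be sketched like this (a sketch, not a tested manifest: the driver and volumeHandle format follow the GKE pre-existing-disk documentation, and the project, zone, and disk name are placeholders; the disk must already exist in the cluster's zone):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv
spec:
  capacity:
    storage: 100Gi              # should match the size of the existing disk
  accessModes:
    - ReadWriteOnce
  storageClassName: ""          # empty class so only an explicit claim binds it
  csi:
    driver: pd.csi.storage.gke.io
    volumeHandle: projects/<PROJECT>/zones/<ZONE>/disks/<DISK_NAME>
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pv-claim
spec:
  storageClassName: ""
  volumeName: postgres-pv       # bind this claim to the PV above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
```

Note that this gives you exactly one claim for one disk; replicas added via a volumeClaimTemplate would each get their own fresh volume and would need Postgres-level replication to receive the data.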

strimzi 0.14: Kafka broker failed authentication

I tried to build a Kafka cluster using Strimzi (0.14) in a Kubernetes cluster.
I used the examples that come with Strimzi, i.e. examples/kafka/kafka-persistent.yaml.
This YAML file looks like this:
apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    version: 2.3.0
    replicas: 3
    listeners:
      plain: {}
      tls: {}
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
      log.message.format.version: "2.3"
    storage:
      type: jbod
      volumes:
        - id: 0
          type: persistent-claim
          size: 12Gi
          deleteClaim: false
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 9Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}
    userOperator: {}
kubectl apply -f examples/kafka/kafka-persistent.yaml
Both the ZooKeeper nodes and the Kafka brokers were brought up.
However, I saw errors in the Kafka broker logs:
[SocketServer brokerId=1] Failed authentication with /10.244.5.94 (SSL handshake failed) (org.apache.kafka.common.network.Selector) [data-plane-kafka-network-thread-1-ListenerName(REPLICATION)-SSL-0]
Does anyone know how to fix the problem?
One of the things which can cause this is if your cluster is using a different DNS suffix for service domains (the default is .cluster.local). You need to find out the right DNS suffix and use the environment variable KUBERNETES_SERVICE_DNS_DOMAIN in the Strimzi Cluster Operator deployment to override the default value.
If you exec into one of the Kafka or ZooKeeper pods and run hostname -f, it should show you the full hostname, from which you can identify the suffix.
(I pasted this from the comments as a full answer since it helped solve the question.)
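As a sketch, the override sits in the Cluster Operator Deployment like this (only the relevant env entry is shown; the suffix value is an example to replace with your own):

```yaml
# fragment of the strimzi-cluster-operator Deployment
spec:
  template:
    spec:
      containers:
        - name: strimzi-cluster-operator
          env:
            - name: KUBERNETES_SERVICE_DNS_DOMAIN
              value: my.cluster.suffix   # replace with your cluster's DNS suffix
```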

Installing kafka and zookeeper cluster using kubernetes

Can anyone share a YAML file for creating a Kafka cluster with two Kafka brokers and a ZooKeeper cluster with 3 servers? I'm new to Kubernetes.
Take a look at https://github.com/Yolean/kubernetes-kafka, and make sure the broker memory limit is 2 GB or above.
Maintaining a reliable Kafka cluster in Kubernetes is still a challenge; good luck.
I recommend you try the Strimzi Kafka Operator. Using it you can define a Kafka cluster just like any other Kubernetes object, by writing a YAML file. Moreover, users, topics, and Kafka Connect clusters are also just k8s objects. Some (but not all!) features of the Strimzi Kafka Operator:
Secure communication between brokers, and between brokers and ZooKeeper, with TLS
Ability to expose the cluster outside the k8s cluster
Deployable as a Helm chart (it simplifies things a lot)
Rolling updates when changing the cluster configuration
Smooth scaling out
Ready to monitor the cluster using Prometheus and Grafana
It's also worth mentioning the great documentation.
Creating a Kafka cluster is as simple as applying a Kubernetes manifest like this:
apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    version: 2.2.0
    replicas: 3
    listeners:
      plain: {}
      tls: {}
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
      log.message.format.version: "2.2"
    storage:
      type: jbod
      volumes:
        - id: 0
          type: persistent-claim
          size: 100Gi
          deleteClaim: false
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 100Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}
    userOperator: {}
I think that you could take a look at the Strimzi project here: https://strimzi.io/.
It's based on the Kubernetes operator pattern and provides a simple way to deploy and manage a Kafka cluster on Kubernetes using custom resources.
The Kafka cluster is described through a new "Kafka" resource YAML file for setting up everything you need.
The operator takes care of that and deploys the ZooKeeper ensemble plus the Kafka cluster for you.
It also deploys two more operators for handling topics and users (but they are optional).
Another simple configuration of Kafka/ZooKeeper on Kubernetes in DigitalOcean with external access:
https://github.com/StanislavKo/k8s_digitalocean_kafka
You can connect to Kafka from outside of AWS/DO/GCE using the regular binary protocol. The connection is PLAINTEXT or SASL_PLAINTEXT (username/password).
The Kafka cluster is a StatefulSet, so you can scale the cluster easily.