Kafka on Kubernetes

We have 12 APIs deployed on a cluster, and we are using Kafka, which is deployed on 3 EC2 instances. Should I move the Kafka servers into K8s too, or should I keep them as they are? Or should I start using AWS MSK?
We are still experimenting, so any suggestions or good documentation would be helpful.

This is opinion-based, so it's probably going to be closed, but check out https://strimzi.io/. It's been working great for us.

Related

Is it possible to deploy a NestJS microservice backend on a Kubernetes cluster?

Hello intelligent Stack Overflow people,
I am trying to deploy my microservice backend, developed with NestJS, on Kubernetes.
But I don't know how to do it, and I can't even find a tutorial that shows me how.
I found an article covering a similar case that uses Kafka as the event-streaming service.
https://limascloud.com/2022/03/22/nestjs-on-kubernetes-kubernetes-for-developers/
Instead of Kafka, I used the native event-based communication provided by the framework, as described in the docs. It is a basic topic-based publish-subscribe mechanism.
Does that prohibit the use of Kubernetes? Do I need to use some kind of external communication software?
I am really confused at the moment and don't know whether we made a mistake from the start.
I am the author of the post you mentioned. You should be able to use the event-streaming-service, but it's a different scenario from the one I present in the post.
In the post, the pods connect to a Kafka service that is running outside of the Kubernetes network, but in your scenario, the pods need to be able to connect to one another inside the Kubernetes network.
If you are planning to use two separate services, I would recommend using an external broker. If you plan to use the default mechanism, make sure to set the host and port configuration for one of the pods. Let's say the api is only going to produce, so set its configuration to the pod name and port of the worker. Let me know if it works. I would start by making it work in your local environment before moving to Kubernetes.
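The host and port advice above could be sketched roughly as follows. This is a minimal sketch, assuming the default mechanism is NestJS's TCP transport and that the producing pod reads the worker's address from environment variables. The names `WORKER_HOST` and `WORKER_PORT`, the default host `worker`, and the default port `3001` are hypothetical and not taken from the post. To stay dependency-free, the example only builds the `{ host, port }` object; in NestJS it would typically be passed as the `options` of a TCP client.

```typescript
// Hypothetical shape of the address the "api" pod uses to reach the "worker" pod.
interface TcpTarget {
  host: string;
  port: number;
}

// Resolve the worker address from an environment-like map.
// WORKER_HOST and WORKER_PORT are assumed names, not NestJS or Kubernetes conventions.
function resolveWorkerTarget(env: Record<string, string | undefined>): TcpTarget {
  const host = env.WORKER_HOST ?? "worker"; // assumed default name of the worker
  const rawPort = env.WORKER_PORT ?? "3001"; // assumed default port
  const port = Number(rawPort);
  // Reject anything that is not a valid TCP port number.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid WORKER_PORT: ${rawPort}`);
  }
  return { host, port };
}
```

Locally, the same function can be called with `WORKER_HOST` set to `localhost`, which matches the suggestion to get things working on a local environment first. Inside Kubernetes, pointing the host at a Service name is usually more stable than a pod name, because pod names created by a Deployment change on restart.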

Can you run a ZooKeeper cluster without using StatefulSets in OpenShift?

I have a single instance of ZooKeeper running without issues; however, when I add two more nodes, it crashes with leader election errors or with "we got a connection request from the server with own id".
I would appreciate any help here.
In short, you should use a StatefulSet.
If you would like the community to help you, please provide the logs and errors from the crashes.
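One reason a StatefulSet helps here is that each pod gets a stable, ordinal hostname (for example `zk-0`, `zk-1`, `zk-2`), from which every ZooKeeper server can derive a unique id. A message about a connection request from a server with its own id can indicate that two servers ended up with the same id, although that cannot be confirmed without the logs. The sketch below illustrates the idea only; the pod name prefix `zk`, the headless service name `zk-hs`, and the ports 2888 and 3888 are assumptions for illustration, not details from the question.

```typescript
// Derive a ZooKeeper server id from a StatefulSet pod hostname such as "zk-2".
// ZooKeeper ids start at 1, while StatefulSet ordinals start at 0, hence the +1.
function serverIdFromHostname(hostname: string): number {
  const match = /-(\d+)$/.exec(hostname);
  if (match === null) {
    throw new Error(`Hostname has no ordinal suffix: ${hostname}`);
  }
  return Number(match[1]) + 1;
}

// Build the "server.N=host:peerPort:electionPort" lines for zoo.cfg.
// The prefix and headless service name are hypothetical values supplied by the caller.
function serverLines(prefix: string, headlessService: string, replicas: number): string[] {
  const lines: string[] = [];
  for (let ordinal = 0; ordinal < replicas; ordinal++) {
    lines.push(`server.${ordinal + 1}=${prefix}-${ordinal}.${headlessService}:2888:3888`);
  }
  return lines;
}
```

With a plain Deployment, pod names carry random suffixes, so there is no stable ordinal to derive the id from, and each restart may change the addresses the other servers expect.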

How to deploy Storm, Zookeeper, and Supervisor nodes to GCP?

We're trying to set up a Storm cluster on Google Compute Engine but are having a hard time finding resources. Most tutorials only cover deploying single applications to GCE. We've dockerized the project but don't know how to deploy it to GCP. Any suggestions?
You could configure an instance template and create instances with the Container-Optimized OS (COS) image, which already has Docker installed.
Here you can find more information about this.
Another option is Kubernetes Engine (GKE), which has more features that give you more control over your workloads; it also supports autoscaling, auto-upgrades, and node repairs.

Hazelcast split-brain

I'm using Hazelcast (3.7.4) with OpenShift.
Each application starts a HazelcastInstance.
Network discovery is done via hazelcast-kubernetes (1.1.0).
Sometimes when I deploy the whole application, the cluster gets stuck in a split-brain state forever. It never recovers and never reconnects into a single cluster.
I have to restart pods to get a single cluster to form again.
Can someone help me prevent the split-brain, or at least make the cluster recover afterwards?
Thanks
Use a StatefulSet instead of a Deployment (or ReplicationController). The pods then start one by one, which prevents the split-brain issue. You can have a look at the official OpenShift Code Sample for Hazelcast, or specifically at the OpenShift template for Hazelcast.
What's more, try to use the latest Hazelcast version; I think it should re-form the cluster even if you use a Deployment and the cluster starts in a split-brain state.

Kafka cluster deploy in the cloud

I want to create and deploy a Kafka cluster for a data pipeline.
What is the preferred way to deploy it in the cloud: VMs or Kubernetes?
Kafka can run either way. However, if this is the question you are asking, you might want to consider whether you really want to manage your own Kafka cluster in the cloud. Why not use an existing Kafka-as-a-service offering?