Is it possible to have a Kafka cluster with different OSes? - apache-kafka

If I have one Kafka broker running on a Windows server, another running on a Linux server, and a third running on a Unix server, can I use those three as one cluster?
Server A - Windows
Server B - Linux
Server C - Unix
Can I use them for replication? As in --replication-factor 3?

Brokers communicate via well defined APIs that don't depend on the operating system, so in theory that should work. However, I doubt anybody has tried running such a cluster!
It's worth noting that Windows is not an officially supported platform (http://kafka.apache.org/documentation/#os), so you may run into a few issues.
I'm not sure why you'd even consider doing this, as it sounds like an operational nightmare! For science!
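For what it's worth, topic creation doesn't care about the brokers' operating systems either. As a minimal sketch, assuming the three brokers are reachable at the hypothetical addresses serverA:9092, serverB:9092 and serverC:9092 and you're on Kafka 2.2+, a triply-replicated topic would be created exactly as on a homogeneous cluster:
bin/kafka-topics.sh --create --topic test-topic --partitions 3 --replication-factor 3 --bootstrap-server serverA:9092,serverB:9092,serverC:9092
Each partition would then have one replica on every broker, regardless of OS.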

Related

How to run MirrorMaker 2.0 in production?

From the documentation, I see that MirrorMaker 2.0 is run like this on the command line:
./bin/connect-mirror-maker.sh mm2.properties
In my case I can go to an EC2 instance and enter this command.
But what is the correct practice if this needs to run for many days? It is possible that the EC2 instance gets terminated, etc.
Therefore, I am trying to find out what the best practices are if you need to run MirrorMaker 2.0 for a long period and want to make sure that it stays up and running without any manual intervention.
You have many options; these include:
Add it as a service to systemd. Then you can specify that it should be started automatically and restarted on failure (a minimal unit file sketch follows this list). systemd is very common now, but if you're not running systemd, there are many other process managers: https://superuser.com/a/687188/80826.
Running it in a Docker container, where you can specify a restart policy.
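As a rough sketch of the systemd option (the unit name, install paths and user below are assumptions for illustration), you could drop a unit file at /etc/systemd/system/mirrormaker2.service:

[Unit]
Description=Kafka MirrorMaker 2.0
After=network.target

[Service]
User=kafka
ExecStart=/opt/kafka/bin/connect-mirror-maker.sh /opt/kafka/config/mm2.properties
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target

Then enable and start it with systemctl enable --now mirrormaker2. For the Docker option, the equivalent knob is the restart policy, e.g. docker run --restart=unless-stopped ... (the image and arguments depend on how you package MirrorMaker).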

Error while starting zookeeper with Kafka

While running ZooKeeper with Kafka on Windows 10, I am getting the error below:
kafka_2.12-2.4.1>bin\windows\zookeeper-server-start.bat config\zookeeper.properties
The input line is too long.
The syntax of the command is incorrect.
Please advise on how this can be solved.
P.S.: I am using JDK 1.8.0_181.
You need to move your Kafka distribution somewhere else, so that you don't have a very long path to that directory + bin\windows\...
IMHO, on Windows it's better to run Kafka in Docker images than to try to run it natively - Windows has a lot of restrictions compared to Linux/Unix, which are the primary platforms for running Kafka and other big data applications.
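As a concrete illustration of the first point (the C:\kafka location is just an assumption), unpack the distribution near the drive root so that the classpath the .bat scripts build stays under the Windows command-length limit, then start ZooKeeper from there:
cd C:\kafka
bin\windows\zookeeper-server-start.bat config\zookeeper.properties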

Running kafka connect in Distributed mode?

I have a total of 3 VMs (CloudVPS). Each of them has Java and the Confluent open source platform installed. On VM1 I am running 3 processes of the Splunk sink connector, which read from different topics and run on different ports, and using REST calls I posted the JSON configuration to each of them.
Since I am running in distributed mode, I want to take advantage of the other 2 VMs as well. Can anyone please tell me what to do to add the other 2 VMs to those 3 processes and achieve parallel processing?
You just need to run Kafka Connect in distributed mode on the three VMs, follow the instructions here, and make sure you give them all the same group.id, which identifies them as members of the same cluster (and thus eligible for sharing the workload of tasks across them). More config details for distributed mode here.
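As a rough sketch, each of the three VMs would run a worker started from the same style of connect-distributed.properties (the broker addresses and topic names below are placeholders; the group.id and the three storage topics must be identical on every worker):

bootstrap.servers=broker1:9092,broker2:9092
group.id=splunk-connect-cluster
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-status
config.storage.replication.factor=3
offset.storage.replication.factor=3
status.storage.replication.factor=3

Start each worker with bin/connect-distributed.sh config/connect-distributed.properties. Connectors you POST to any worker's REST API are then split into tasks and balanced across all workers in the group.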
See also:
https://rmoff.net/2019/11/22/common-mistakes-made-when-configuring-multiple-kafka-connect-workers/
http://rmoff.dev/ksldn19-kafka-connect

Is Spring Cloud Dataflow local server able to be distributed?

I have been using SCDF for a while, and I realise the main difference between XD and SCDF is that XD was born to be distributed, whereas SCDF seems to work like a platform for Spring Cloud Stream apps. At least the local server works like that.
So my question is: can the SCDF local server be distributed? I see no sign of the local server becoming distributed.
Any ideas on this? Thanks.
As you may already know, the local server is not intended for production deployments and is for development purposes only.
The local SCDF server is not intended for distributed use cases, as there is no coordination service across multiple running local servers. However, all the apps deployed by the local server do run as separate processes.

MongoDB Replication with Ubuntu and CentOS not working

I am trying out replication with MongoDB. The process is to be carried out on two different servers running different Unix-like operating systems, Ubuntu and CentOS respectively. I am facing various issues while doing this. Is this caused by the different OSes? The replication works smoothly with two servers both running Ubuntu.
There should be no issue with this. Posting mongod logs from both servers for the failing case would be a good first step for us to help you debug your issue.
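For reference, a mixed Ubuntu/CentOS pair should come up with the usual steps: give both mongod instances the same replica set name in mongod.conf, make sure they can reach each other on port 27017, and initiate the set from one of them (the hostnames below are placeholders):

# mongod.conf on both servers
replication:
  replSetName: "rs0"

// in the mongo shell on one server
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "ubuntu-host:27017" },
    { _id: 1, host: "centos-host:27017" }
  ]
})

Note that a two-member replica set cannot elect a primary if either member is down, so a third member or an arbiter is usually added.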