Incremental Cooperative Rebalancing leads to unevenly balanced connectors - apache-kafka

We have been encountering a lot of unevenly balanced connectors on our setup since the upgrade to Kafka 2.3 (with Kafka Connect 2.3 as well), which includes the new Incremental Cooperative Rebalancing in Kafka Connect explained here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-415%3A+Incremental+Cooperative+Rebalancing+in+Kafka+Connect
Let me explain our setup a bit: we deploy multiple Kafka Connect clusters to dump Kafka topics onto HDFS. A separate Connect cluster is spawned for each hdfs-connector, meaning that at any time exactly one connector is running on a given Connect cluster. These clusters are deployed on top of Kubernetes with randomly selected IPs from a private pool.
Let's take an example. For this hdfs-connector we spawned a Connect cluster with 20 workers, and 40 tasks should run on it, so we would expect 2 tasks per worker. But as the command below shows, when querying the Connect API after a while the connector appears badly unbalanced: some workers are not running any task at all, while one of them has taken ownership of 28 tasks.
bash-4.2$ curl localhost:8083/connectors/connector-name/status|jq '.tasks[] | .worker_id' | sort |uniq -c
...
1 "192.168.32.53:8083"
1 "192.168.33.209:8083"
1 "192.168.34.228:8083"
1 "192.168.34.46:8083"
1 "192.168.36.118:8083"
1 "192.168.42.89:8083"
1 "192.168.44.190:8083"
28 "192.168.44.223:8083"
1 "192.168.51.19:8083"
1 "192.168.57.151:8083"
1 "192.168.58.29:8083"
1 "192.168.58.74:8083"
1 "192.168.63.102:8083"
Here we would expect the whole pool of workers to be used and the connector to be evenly balanced after a while, with something like:
bash-4.2$ curl localhost:8083/connectors/connector-name/status|jq '.tasks[] | .worker_id' | sort |uniq -c
...
2 "192.168.32.185:8083"
2 "192.168.32.53:8083"
2 "192.168.32.83:8083"
2 "192.168.33.209:8083"
2 "192.168.34.228:8083"
2 "192.168.34.46:8083"
2 "192.168.36.118:8083"
2 "192.168.38.0:8083"
2 "192.168.42.252:8083"
2 "192.168.42.89:8083"
2 "192.168.43.23:8083"
2 "192.168.44.190:8083"
2 "192.168.49.219:8083"
2 "192.168.51.19:8083"
2 "192.168.55.15:8083"
2 "192.168.57.151:8083"
2 "192.168.58.29:8083"
2 "192.168.58.74:8083"
2 "192.168.59.249:8083"
2 "192.168.63.102:8083"
The second result was actually achieved by manually killing some workers, plus a bit of luck (we haven't found a proper way to force an even balance across the Connect cluster so far; it's more a process of trial and error until the connector ends up evenly balanced). The manual loop looks roughly like the sketch below.
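For reference, a rough sketch of that manual loop (the pod name is a placeholder; we rely on Kubernetes recreating the deleted worker pod, and on the default scheduled.rebalance.max.delay.ms of 5 minutes before lost tasks are reassigned):
# delete a pod hosting the overloaded worker; Kubernetes recreates it, forcing that worker to leave and rejoin the Connect group
kubectl delete pod connect-worker-abc123
# once the rebalance delay has elapsed, check the task distribution again
curl localhost:8083/connectors/connector-name/status | jq '.tasks[] | .worker_id' | sort | uniq -c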
Has anyone already come across this issue and managed to solve it properly?

Related

How to get Kafka Connect to balance tasks/connectors evenly?

I am trying to run a Kafka Connect distributed cluster with 5 workers and 2 sink connectors. One connector is configured with tasks.max (and actual tasks) set to 5; the other connector is configured for 6 tasks.
If I understand things correctly, this should mean 13 "work units" (1 connector + 5 tasks + 1 connector + 6 tasks) that need to be distributed across the workers, which I figured would mean around 2-3 tasks/connectors per worker.
However, I'm finding that the eventual assignments are as follows:
Worker 0: (3 work units)
- Connector 0
- Connector 1 Task 0
- Connector 1 Task 1
Worker 1: (4 work units)
- Connector 1
- Connector 0 Task 0
- Connector 0 Task 2
- Connector 1 Task 3
Worker 2: (2 work units)
- Connector 0 Task 3
- Connector 0 Task 4
Worker 3: (4 work units)
- Connector 0 Task 1
- Connector 1 Task 2
- Connector 1 Task 4
- Connector 1 Task 5
Worker 4: (0 work units)
I'm using Kafka Connect workers running 2.6.0 and a connector built against 2.6.0 libraries. Workers are deployed as containers in an orchestration system (Nomad) and have gone through several rounds of rolling restarts/reallocations but the balance always seems off.
Based on this I'm finding myself at a loss for the following:
I get that tasks may end up a bit unbalanced while containers are being restarted or moved, but I'd expect that once the last one came up and joined the cluster, a final rebalance would have sorted things out. Can someone point me to why this might not be happening?
Is there a recommended way to trigger a rebalance for the whole cluster? From the docs it seems that changing a connector's config or having a worker fail/join might cause a rebalance (roughly what the sketch after these questions shows), but that does not seem like an ideal process.
Can the task balancing process be configured or controlled in any way?
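For completeness, this is the sort of thing I have been doing to nudge a rebalance (a sketch only; the connector name and config file are placeholders):
# re-submit the connector's configuration, which the docs list as one of the events that can trigger a rebalance
curl -X PUT -H "Content-Type: application/json" --data @connector-0.json localhost:8083/connectors/connector-0/config
# or restart the connector instance via the REST API
curl -X POST localhost:8083/connectors/connector-0/restart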

Run two Kafka servers separately on the same machine

I have two Kafka clusters running on the same Ubuntu machine. The development cluster consists of 1 ZooKeeper node, 1 Kafka broker, and 2 workers; the production cluster consists of 3 ZooKeeper nodes, 3 Kafka brokers, and a planned 3-4 workers.
Both are running, but the production ZooKeeper is affected by the development one: in my controller logs I see tasks that run in dev, and it looks as if my prod Kafka is running in the same cluster as the dev Kafka. Also, after several minutes the production cluster goes down and only one broker stays up. How can I isolate and separate them so that neither affects the other?
Suggestion 1: Use Docker Compose to completely isolate both stacks
Suggestion 2: Don't run more than one instance of any service on a single physical server. Otherwise, should that one machine crash, you lose everything.
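Whichever option is chosen, the key point is that the two stacks must not share ZooKeeper connection strings, ports, or data directories. A minimal sketch of separated broker properties (all values are illustrative):
# dev broker (server-dev.properties)
broker.id=0
listeners=PLAINTEXT://:9092
log.dirs=/var/lib/kafka-dev
zookeeper.connect=localhost:2181/kafka-dev
# prod broker 1 (server-prod-1.properties)
broker.id=1
listeners=PLAINTEXT://:9192
log.dirs=/var/lib/kafka-prod-1
zookeeper.connect=localhost:2281,localhost:2282,localhost:2283/kafka-prod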

ZooKeeper cannot work correctly if one node is down in the cluster

I have a problem with my ZooKeeper cluster: I have 2 servers, and each server runs one ZooKeeper instance, for example:
server.1=x.x.x.x:2888:3888
server.2=x.x.x.x:2888:3888
If one ZooKeeper goes down, the other one cannot run.
I have no idea how to solve this problem. What is the best number of ZooKeeper nodes to build a ZooKeeper cluster with?
In a highly available ZooKeeper ensemble you must configure 2n + 1 ZooKeeper instances, where n is any number greater than zero. This means that for a ZooKeeper quorum (i.e. a healthy ensemble), you should configure an odd number of instances (3, 5, 7, ...). This is because ZooKeeper relies on majority votes for leader election, and an even number of instances tolerates no more failures than the next smaller odd number.
Assuming that you have correctly configured a Zookeeper ensemble with 2n + 1 instances (quorum), there can be up to n failed Zookeeper instances without taking the cluster down.
If the quorum breaks, then the Zookeeper cluster will go down. This is what had happened in your case. Try to use a higher number of Zookeeper instances in your ensemble (3 or 5) and it should do the trick. Alternatively, if you don't need high availability you can still use just 1 Zookeeper instance.
ZooKeeper will work as long as a majority of its nodes is up: if N is the number of nodes in the cluster, it will work as long as working_nodes_number > N/2.
Now if you have 2 nodes and one is down (working_nodes_number = 1), ZooKeeper won't work: N/2 = 1, and obviously 1 > 1 is false.
In other words, a cluster of two servers cannot stay alive if even one node goes down.
Try running 3 nodes, so that if one node goes down there are still 2 nodes (a majority) and ZooKeeper keeps working.
In general it's a good strategy to use a cluster with an odd number of nodes.
You might be interested to read this article that is relevant to the discussion
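To make the 3-node suggestion concrete, here is a minimal zoo.cfg sketch in the same notation as the question (addresses, ports, and dataDir are illustrative; each node also needs a myid file in dataDir matching its server.N id):
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=x.x.x.1:2888:3888
server.2=x.x.x.2:2888:3888
server.3=x.x.x.3:2888:3888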

FlinkKafkaConsumer010 cannot consume data with full parallelism

I have a Kafka (0.10.2.0) cluster on one machine, with 10 partitions (across 10 individual Kafka server ports), holding a single topic named "test".
I also have a Flink cluster with 294 task slots on 7 machines. A Flink app with parallelism 250 runs on this cluster, using FlinkKafkaConsumer010 to consume data from the Kafka cluster with a single group id, "TestGroup".
But I found that only 2 Flink IPs have established TCP connections with the Kafka cluster (171 connections in total), and worse, only 10 of those connections transfer data; only these 10 connections carried data from beginning to end.
I have checked this question, Reading from multiple broker kafka with flink, but it does not work in my case.
Any information is appreciated, thank you.

How is it possible to have 2 replicas in a 2-node Kafka cluster?

My understanding is that Kafka replication requires a quorum server setup, i.e. 2 * [replication factor] + 1 servers.
However, I managed to create a topic with replication factor 2 on 2 servers, and it does seem to work.
Why is this possible at all? Are there any side effects of using only 2 servers?
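For reference, the topic was created with a command along these lines (a sketch; topic name, partition count, and address are illustrative, and older CLI versions take --zookeeper instead of --bootstrap-server):
kafka-topics.sh --create --bootstrap-server localhost:9092 --topic my-topic --partitions 3 --replication-factor 2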