How to regain quorum in a ZooKeeper cluster after permanently losing a majority of servers? - apache-zookeeper

Is there any way to regain quorum in a ZooKeeper cluster after permanently losing a majority of its servers (for example, 3 out of 5)? Can we somehow reach quorum again? I read about dynamic reconfiguration, but as I understand it, a quorum is needed to perform the reconfiguration: "a quorum of the old configuration is required to be available and connected for ZooKeeper to be able to make progress". How can we deal with such a situation without losing data?

Related

Kafka cluster continues to run without ZooKeeper

I have a five-node Kafka cluster (Confluent 5.5 Community Edition) with 3 ZooKeeper nodes, each on a different AWS instance.
While doing failover testing, I noticed that the Kafka cluster works fine even if all ZooKeeper nodes are down.
I was able to produce, consume, and also create new consumers.
Why does the Kafka cluster not stop if it cannot connect to any ZooKeeper node?
What issues could arise if we are unaware of such a failure in production and the Kafka cluster continues to run without ZooKeeper connectivity?
How do we handle such a scenario?
Broker leader election, topic creation, and simple ACLs (if you use them) still depend on ZooKeeper. Other basic functions that rely only on the Kafka bootstrap protocol might still work, sure. There should definitely be broker logs indicating that the connection was lost.
Ideally you'd have basic process health checking and incident management software, so that you don't miss critical services going down in prod.
How to handle it? Restart ZooKeeper...
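As a minimal sketch of the kind of process health check mentioned above, the following Python function sends ZooKeeper's `ruok` four-letter command over a plain TCP socket and checks for the `imok` reply. The function name `zk_ruok` and the host name in the usage line are hypothetical. It also assumes the `ruok` command is enabled on the server; in newer ZooKeeper releases, four-letter commands may need to be allowed through the `4lw.commands.whitelist` setting.

```python
import socket


def zk_ruok(host, port=2181, timeout=2.0):
    """Return True if the server answers 'imok' to the 'ruok' command.

    Hypothetical helper for illustration; any connection error or
    timeout is treated as "not healthy".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            chunks = []
            # The server replies and then closes the connection,
            # so read until end of stream.
            while True:
                data = sock.recv(64)
                if not data:
                    break
                chunks.append(data)
        return b"".join(chunks) == b"imok"
    except OSError:
        return False
```

A monitoring job could call `zk_ruok("zk1.example.internal")` for each ensemble member and raise an alert when too few of them respond. Note that `imok` only shows the server process is running, not that it is part of a quorum.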

Zookeeper Quorum and the Non-Quorum

ZooKeeper experts,
The question I am asking might be basic to you, but I am new to ZK and haven't mastered the tool yet, so forgive me. With that in mind, here is my question.
Suppose I have a ZK cluster of 5 servers and a quorum of 3. This guarantees that the servers won't end up in a split-brain scenario if they are located in two physically separate DCs or machines, right?
However, what I want to know is this: if the quorum is set to three, the leader needs to wait until at least 2 other servers have replicated the written data, for a total of 3 copies. But what if a client connects to one of the 2 servers that are not part of the quorum? Doesn't that mean it gets the old data?
First, you cannot "set" the quorum. It is automatically calculated from the configuration as floor(N/2) + 1 (the majority), where N is the number of ZooKeeper servers.
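A ZooKeeper server that loses contact with the quorum becomes unavailable and cannot serve data to clients, so there is no risk of reading old data from a server that is cut off. A follower that is still connected may briefly lag behind the latest write; a client that needs the most recent value can call sync() before reading.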
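To make the majority rule concrete, here is a small Python sketch of the calculation. The function names `quorum_size` and `has_quorum` are hypothetical and are not part of any ZooKeeper API; they only illustrate the arithmetic.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a majority of the ensemble."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size: int, live_servers: int) -> bool:
    """True if the live servers are enough to form a majority."""
    return live_servers >= quorum_size(ensemble_size)


print(quorum_size(5))    # 3
print(has_quorum(5, 3))  # True
print(has_quorum(5, 2))  # False: 2 survivors out of 5 cannot form a majority
```

The last line is the situation from the main question: after losing 3 of 5 servers, the 2 survivors are below the majority of 3.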

Running zookeeper on a cluster of 2 nodes

I am currently trying to use ZooKeeper in a two-node cluster. I have my own cluster formation algorithm running on the nodes, based on configuration. We only need ZooKeeper's distributed DB functionality.
Is it possible to use ZooKeeper in a two-node cluster? Do you know of any solutions where this has been done?
Can we still retain ZooKeeper's DB functionality without forming a quorum?
Note: Fault tolerance is not the main concern in this project. If one of the nodes goes down, we have enough code logic to run without the ZooKeeper service. We use ZooKeeper to share data when both nodes are alive.
I would greatly appreciate any help.
ZooKeeper is a coordination system, used to coordinate among nodes. When writes occur in such a distributed system, all of them go through the master (aka leader) so that the nodes can coordinate and agree on the values being stored. Reads can be served by any node. ZooKeeper requires a leader to be elected by a quorum in order to serve write requests consistently. ZooKeeper uses the ZAB protocol as its consensus algorithm.
To elect a leader, an ensemble should ideally have an odd number of nodes, because an even-sized ensemble tolerates no more failures than the next smaller odd-sized one. In your case, with two nodes, the majority is 2, so both nodes must be up and able to reach each other before a leader can be elected. Even once they elect a leader, your ensemble will not work properly in network partitioning situations: if either node fails or the link between them breaks, quorum is lost and the service becomes unavailable.
As I said, ZooKeeper is not a distributed storage system. If you need to use it in a distributed manner (more than one node), it needs to form a quorum.
As I see it, what you need is a distributed database, not a distributed coordination system.
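The point about odd ensemble sizes can be checked with a short Python sketch. It assumes the usual majority rule (more than half of the configured servers must be up); the helper name `tolerated_failures` is hypothetical.

```python
def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that can fail while a majority still survives."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return (ensemble_size - 1) // 2


# Print how many failures each ensemble size can survive.
for n in range(1, 8):
    print(f"{n} servers: tolerates {tolerated_failures(n)} failure(s)")
```

Under this rule, a two-node ensemble tolerates zero failures, the same as a single node, and a four-node ensemble tolerates one failure, the same as three nodes.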

Flink with zookeeper: Service temporarily unavailable due to an ongoing leader election. Please refresh

I want to run a Flink cluster in high-availability mode, so I configured the Flink configuration files as described in JobManager High Availability. When I start the ZooKeeper quorum using start-zookeeper-quorum.sh, I am able to start two ZooKeeper servers (peers) on two machines. But when I start the Flink cluster with 2 JobManagers, I get the message "Service temporarily unavailable due to an ongoing leader election. Please refresh." in the Flink web UI.
What does this message mean? Is there a way to specify the leader in the configuration file?
The problem is with your ZooKeeper installation. Your ZK nodes cannot choose a leader. Also, two nodes is not the best choice. You should have at least 3 instances, or a larger odd number.
You should check the admin docs of Zookeeper for instance here

Assign a server as leader in zookeeper ensemble

We have a quorum of 4 servers, all of which have ZooKeeper 3.4.6 installed. Leader election is currently managed automatically. However, we would like to assign a particular server as the leader, as this box is more robust and has higher capabilities.
I am looking for a setting that always assigns a given server as leader. Is that possible? I even tried ZooKeeper 3.5.1-alpha, but even that doesn't seem to have any such setting. I understand there are algorithms for implementing the election, but a setting would be more advantageous for us.
Any thoughts?
Thanks,
Ram
There is no such setting. Leader election is automatic unless you decide to implement an algorithm yourself, but it seems to me that's not the solution you are looking for.