Zookeeper clarification on CAP - apache-zookeeper

I would like to clarify my understanding of the CAP theorem.
For example, ZooKeeper is classified as CP (Consistent and Partition tolerant).
What does this mean? In the event of a partition, does the system still return consistent data?
Or does it mean that the moment there is a connectivity issue between the nodes in a ZK cluster, ZK is no longer available?
If the latter, does that mean that when the nodes in the cluster are not able to talk to each other, the entire ZK ensemble goes down?

ZooKeeper serves requests as long as there is a quorum, meaning a majority of nodes are available. Since it needs a majority rather than all of the nodes, it is tolerant to network partitions.
It replicates data to all nodes (at least the quorum) to stay consistent.
If a leader cannot be elected (no quorum), ZooKeeper will fail requests, and this is why it is not highly available.
Typically 3 or 5 servers are used for ZooKeeper, and the quorum will be 2 or 3 nodes respectively.
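As a concrete illustration (not from the original answer), a minimal 3-server ensemble configuration might look like the following; the hostnames and paths are placeholders:
# zoo.cfg, identical on all three servers (zk1/zk2/zk3 are placeholder hostnames)
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
# each server also needs a myid file in dataDir containing its own id (1, 2 or 3);
# with 3 servers the quorum is 2, so the ensemble survives the loss of any single node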
Refer to this blog post for more details.
https://www.ibm.com/developerworks/library/bd-zookeeper/index.html

Related

Kafka setup strategy for replication?

I have two VM servers (say S1 and S2) and need to install Kafka in cluster mode, with a topic that has only one partition and two replicas (one the leader, the other a follower) for reliability.
I got the high-level idea from this cluster setup guide and want to confirm whether the strategy below is correct.
1. First set up ZooKeeper as a cluster on both nodes for high availability (HA). If I set up ZK on a single node only and that node goes down, the complete cluster will be down. Right? Is it mandatory to use ZK in the latest Kafka version as well? It looks like it is a must for older versions: Is Zookeeper a must for Kafka?
2. Start the Kafka broker on both nodes. It can be on the same port since the brokers are hosted on different nodes.
3. Create a topic on any node with one partition and a replication factor of two.
4. ZooKeeper will select the broker on one node as leader and the other as follower.
5. The producer will connect to any broker and start publishing messages.
6. If the leader goes down, ZooKeeper will select another node as leader automatically. I'm not sure how a replication factor of 2 will be maintained then, as there is only one node live.
Is the above strategy correct?
Useful resources
ISR
ISR vs replication factor
First set up ZooKeeper as a cluster on both nodes for high availability (HA). If I set up ZK on a single node only and that node goes down, the complete cluster will be down. Right? Is it mandatory to use ZK in the latest Kafka version as well? It looks like it is a must for older versions: Is Zookeeper a must for Kafka?
Answer: Yes. ZooKeeper is still a must until KIP-500 is released. ZooKeeper is responsible for electing the controller, storing metadata about the Kafka cluster and managing broker membership (link). Ideally the number of ZooKeeper nodes should be at least 3; that way you can tolerate one node failure (2 healthy ZooKeeper nodes, a majority of the cluster, are still capable of electing a controller). You should also consider setting up the ZooKeeper cluster on machines other than the ones Kafka is installed on, so that the failure of a server won't take out both a ZooKeeper node and a Kafka node.
Start the Kafka broker on both nodes. It can be on the same port since the brokers are hosted on different nodes.
Answer: You should start the ZooKeeper cluster first, then the Kafka cluster. Using the same port on different nodes is fine.
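As an illustration of the start order and of pointing both brokers at the same ZooKeeper ensemble (the hostnames s1/s2 and file paths below are placeholders, not taken from the question):
# 1. start ZooKeeper on its nodes first
bin/zookeeper-server-start.sh config/zookeeper.properties
# 2. config/server.properties on S1 (S2 identical except broker.id=2 and its own hostname):
#      broker.id=1
#      listeners=PLAINTEXT://s1:9092
#      zookeeper.connect=s1:2181,s2:2181
# 3. then start the broker on each node
bin/kafka-server-start.sh config/server.properties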
Create a topic on any node with one partition and a replication factor of two.
Answer: Partitions are used for horizontal scalability; if you don't need that, one partition is okay. With a replication factor of 2, one of the nodes will be the leader and the other the follower of the partition at any given time. But this is not enough to avoid data loss completely while also providing HA. In the ideal configuration for avoiding data loss without compromising HA, you should have at least 3 Kafka brokers, a replication factor of 3 on topics, min.insync.replicas=2 as a broker config and acks=all as a producer config. (You can check this for more information.)
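For reference, creating the topic described in the question might look like this (the broker address and topic name are placeholders); the commented lines show the safer 3-broker variant recommended above:
bin/kafka-topics.sh --create --bootstrap-server s1:9092 --replication-factor 2 --partitions 1 --topic my-replicated-topic
# with 3 brokers, the recommendation above would instead be
#   --replication-factor 3 --config min.insync.replicas=2
# combined with acks=all on the producer side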
ZooKeeper will select the broker on one node as leader and the other as follower.
Answer: The controller broker is responsible for maintaining the leader/follower relationship for all partitions. One broker will be the partition leader and the other one will be the follower. You can check partition leaders and followers with this command:
bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic my-replicated-topic
The producer will connect to any broker and start publishing messages.
Answer: Yes. Setting only one broker in bootstrap.servers is enough to connect to the Kafka cluster, but for redundancy you should provide more than one broker in bootstrap.servers:
bootstrap.servers: A list of host/port pairs to use for establishing the initial connection to the Kafka cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping—this list only impacts the initial hosts used to discover the full set of servers. This list should be in the form host1:port1,host2:port2,.... Since these servers are just used for the initial connection to discover the full cluster membership (which may change dynamically), this list need not contain the full set of servers (you may want more than one, though, in case a server is down).
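A minimal producer configuration reflecting both points (several bootstrap servers for redundancy plus the acks setting recommended earlier) might look like this; the hostnames are placeholders:
# producer configuration properties (sketch)
bootstrap.servers=s1:9092,s2:9092
acks=all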
If the leader goes down, ZooKeeper will select another node as leader automatically. I'm not sure how a replication factor of 2 will be maintained then, as there is only one node live.
Answer: If the controller broker goes down, ZooKeeper will select another broker as the new controller. If the broker that is the leader of your partition goes down, one of the in-sync replicas will become the new leader (the controller broker is responsible for this). But of course, if you have just two brokers, replication won't be possible while one of them is down. That's why you should have at least 3 brokers in your Kafka cluster.
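If you want to check which broker is currently acting as controller, one way (using the zookeeper-shell script that ships with Kafka, assuming ZooKeeper is reachable at localhost:2181) is to read the /controller znode:
# the JSON stored in /controller contains the brokerid of the current controller
bin/zookeeper-shell.sh localhost:2181 get /controller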
Yes - ZooKeeper is still needed in Kafka 2.4, but you can read about KIP-500, which plans to remove the dependency on ZooKeeper in the near future and use the Raft algorithm to form the quorum.
As you already understood, if you install ZK on a single node it will run in standalone mode and you won't have any resiliency. The classic ZK ensemble consists of 3 nodes, which allows you to lose 1 ZK node.
After pointing your Kafka brokers to the right ZK cluster you can start your brokers, and the cluster will be up and running.
In your example, I would suggest adding another node in order to gain better resiliency and meet the replication factor you want, while still being able to lose one node without losing data.
Bear in mind that using a single partition means you are bound to a single consumer per consumer group; the rest of the consumers will be idle.
I suggest reading this blog post about Kafka best practices and how to choose the number of topics/partitions in a Kafka cluster.
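If you later want to see how the single partition is assigned within a consumer group (and which consumers sit idle), the consumer-groups tool that ships with Kafka can show it; the group name below is a placeholder:
# lists each partition the group consumes and the member it is currently assigned to
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group my-consumer-group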

Apache zookeeper Leader Election: can it work with only two nodes?

I have a two-node Red Hat system with an identical set of services on each. I am looking for a way to determine which service is "in charge" and which is a "running backup". So, for example, service-A exists and is running on both nodes, but only one should be processing data while the other sleeps until the first crashes. The same goes for the other services in the set.
ZooKeeper's leader election capability looked like it would suffice; the whole ephemeral-and-sequential-znode approach looked good on paper. I imagined that I would also need a ZooKeeper service running on each node for redundancy in the face of node failure, for example.
But the documentation points out an issue with running multiple ZooKeepers: at least 3 instances are required to guarantee a quorum for electing the lead ZooKeeper among all the others. As I only have two nodes, this looks like a deal-breaker.
So before I drop the ZooKeeper approach, I thought I'd ask whether there is some configuration option that allows a two-node setup to work. Otherwise I'm off to find the next best fit for my problem.
You can run ZooKeeper with just two instances. However, it gives you no fault-tolerance benefit, because the quorum is still 2 in that case: if either instance fails, the ZooKeeper ensemble rejects client requests. That's why the default configuration for an ensemble is 3 ZooKeeper instances; having 2 instances is no better than having 1, so why go through the trouble of creating 2? It actually creates more points of failure, because when either instance dies your ZooKeeper ensemble halts, and the probability that one of two instances fails is higher than the probability that a single instance fails.
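For reference, a two-instance ensemble is declared like any other (the hostnames below are placeholders), but the majority of 2 servers is still 2, so losing either server halts the ensemble:
# zoo.cfg server entries on both nodes (remaining settings as in any ensemble)
server.1=node1:2888:3888
server.2=node2:2888:3888
# quorum = floor(2/2) + 1 = 2, i.e. both servers must be up to serve requests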

Kafka rack awareness and locations of ISR

I want to build an HA Kafka cluster, where the cluster needs to span 2 availability zones.
I want to be able to continue to read from and write to a topic even if all the brokers in an AZ go down.
If I have at least 2 brokers in each AZ, a replication factor of 3, min ISR of 2 and acks=all, then I think a producer write will be acked once one broker other than the leader also acks the write. Does the rack-aware algorithm enforce that the ISR must be located in the other AZ? The docs only mention replicas, not the ISR.
Will this enable me to continue reading and writing in the event of the loss of an AZ? If not, what is needed to achieve this?
If you want a true HA Kafka cluster, you need to start with an HA ZooKeeper ensemble, which typically means 3 Availability Zones, because (unlike Kafka brokers) ZooKeeper nodes need a quorum (a majority of the original nodes) to operate, and you can't have a majority when half of your nodes are down.
The reason Zookeeper is important is that a proper HA Kafka cluster should not just allow reads and writes after a failure, but also allow new topic creation and new leader elections, both of which require Zookeeper to be operational.
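As a side note on the rack-aware part of the question: replica placement across AZs is driven by the broker.rack property; when it is set on every broker, Kafka spreads a partition's replicas across as many racks as possible at assignment time (it constrains replica placement, not ISR membership directly). A minimal sketch with placeholder AZ names:
# server.properties on brokers in the first AZ
broker.rack=az-1
# server.properties on brokers in the second AZ
broker.rack=az-2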

Running zookeeper on a cluster of 2 nodes

I am currently working on using ZooKeeper in a two-node cluster. I have my own cluster-formation algorithm running on the nodes based on configuration. We only need ZooKeeper's distributed DB functionality.
Is it possible to use ZooKeeper in a two-node cluster? Do you know of any solutions where this has been done?
Can we still retain ZooKeeper's DB functionality without forming a quorum?
Note: Fault tolerance is not the main concern in this project. If one of the nodes goes down, we have enough code logic to run without the ZooKeeper service. We use ZooKeeper to share data when both nodes are alive.
Would greatly appreciate any help.
ZooKeeper is a coordination system, used primarily to coordinate among nodes. When writes occur in such a distributed system, all writes go through the master (aka leader) so that the nodes can coordinate and agree on the values being stored. Reads can be served by any node. ZooKeeper requires a master/leader to be elected per quorum in order to serve write requests consistently, and it uses the ZAB protocol as its consensus algorithm.
In order to elect a leader, a quorum should ideally have an odd number of nodes (otherwise a node may not be able to win a majority and become the leader). In your case, with two nodes, ZooKeeper may not be able to elect a leader for a long time, since both nodes will be candidates waiting for the other node's vote. Even if they do elect a leader, your ensemble will not work properly in network-partitioning situations.
As I said, ZooKeeper is not distributed storage. If you need to use it in a distributed manner (more than one node), it needs to form a quorum.
As I see it, what you need is a distributed database, not a distributed coordination system.
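For what it's worth, if you do run a proper ensemble and only need to share small pieces of data between the two application nodes, that boils down to reading and writing znodes; a quick sketch with the ZooKeeper CLI (the path, value and hostnames are placeholders):
# connect to the ensemble, then write and read a shared value
bin/zkCli.sh -server node1:2181,node2:2181
create /shared-config "v1"
get /shared-config
set /shared-config "v2"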

zookeeper failover for kafka cluster

I am wondering whether there is any way to make ZooKeeper failover work for a Kafka cluster.
For example: I want to set up 2 ZooKeeper instances for my Kafka cluster. If one ZooKeeper fails, the Kafka servers should still be able to read topic metadata from the second ZooKeeper.
Any advice is highly appreciated.
Zookeeper works as a so-called quorum – a cluster of nodes that forms a consensus based on simple majority votes.
For production, you should use 3 or 5 Zookeeper instances in a quorum.
If you're using 3, your cluster can survive losing one server (because the remaining two form a simple majority). With 5, you can lose two servers because 3 is a majority of 5.
2 is a bad idea because your cluster won't work if 1 node goes down.
Please check this question
$KAFKA_HOME/config/server.properties
Here you can set multiple ZooKeeper servers:
zookeeper.connect=<server1>:2181,<server2>:2181,<server3>:2181
Maintain the 2n+1 (quorum) rule for the ZooKeeper ensemble.
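To verify which role each server took once the ensemble is up (assuming you run the standalone ZooKeeper distribution rather than the one bundled with Kafka), you can check on each node:
# prints Mode: leader, follower or standalone
bin/zkServer.sh status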