Going through Google's Chubby paper, I came across this passage:
Like a lock service, a consensus service would allow clients to make progress safely even with only one active client process; a similar technique has been used to reduce the number of state machines needed for Byzantine fault tolerance [24]. However, assuming a consensus service is not used exclusively to provide locks (which reduces it to a lock service), this approach solves none of the other problems described above.
They mention how Chubby is a lock service rather than a consensus service, and also how a consensus service could be used to achieve consensus among a group of peer nodes.
In my understanding, services like Chubby and ZooKeeper are used to offload distributed-application problems (leader election, cluster management, access to shared resources) to a separate application (Chubby/ZooKeeper), and these are lock-based services: holding locks on files/znodes is how consensus is achieved.
What are consensus services, and how are they different from lock services?
When would one use each of them?
ZooKeeper is a coordination service modeled after Google's Chubby.
The major features it provides are:
Linearizable atomic operations
Total ordering of operations
Failure detection
Change notifications
Of these, linearizable atomic operations require ZooKeeper to implement a consensus algorithm (Zab), and that linearizability, in turn, can be used to achieve consensus among peers in a distributed system, for example via ZooKeeper locks.
Quoting from the book Designing Data-Intensive Applications:
Coordination services like Apache ZooKeeper [15] and etcd [16] are often used to implement distributed locks and leader election. They use consensus algorithms to implement linearizable operations in a fault-tolerant way.
Based on my understanding, consensus services and coordination services both run on top of some consensus algorithm; lock services simply expose that consensus through a distributed lock.
This is similar to what the Chubby paper itself notes:
However, assuming a consensus service is not used exclusively to provide locks (which reduces it to a lock service)
I found chapter 9, "Consistency and Consensus", of Designing Data-Intensive Applications very helpful on this topic; if you want to dig further, I would definitely recommend reading it.
You can take a lock to propose your value, publish your value, and that's the consensus.
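For concreteness, here is a rough sketch of that "lock, then publish" pattern using Apache Curator on top of ZooKeeper. The ensemble address, znode paths, and payload are made up for illustration; this is just one way the pattern could be wired up.

```java
import java.nio.charset.StandardCharsets;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class LockThenPublish {
    public static void main(String[] args) throws Exception {
        // Connect to the ZooKeeper ensemble (address is hypothetical).
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zk1:2181,zk2:2181,zk3:2181",
                new ExponentialBackoffRetry(1000, 3));
        client.start();

        // The distributed lock: only one peer may propose at a time.
        InterProcessMutex lock = new InterProcessMutex(client, "/locks/config-decision");
        lock.acquire();
        try {
            // The lock holder publishes the agreed value to a well-known znode;
            // every other peer simply reads this znode to learn the decision.
            byte[] decided = "primary=node-1".getBytes(StandardCharsets.UTF_8);
            if (client.checkExists().forPath("/decisions/current") == null) {
                client.create().creatingParentsIfNeeded().forPath("/decisions/current", decided);
            } else {
                client.setData().forPath("/decisions/current", decided);
            }
        } finally {
            lock.release();
        }
        client.close();
    }
}
```

The lock is only safe because ZooKeeper runs Zab underneath, which is exactly the point above: the lock is how the underlying consensus is surfaced to the application.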
I was reading the Chubby paper from OSDI 2006 and had a question regarding coarse- and fine-grained locking. The paper describes at length why they opt to support only coarse-grained locking; however, at a certain point they mention that it is straightforward for clients to implement fine-grained locking.
Here is the excerpt:
Chubby is intended to provide only coarse-grained locking. Fortunately, it is straightforward for clients to implement their own fine-grained locks tailored to their application. An application might partition its locks into groups and use Chubby's coarse-grained locks to allocate these lock groups to application-specific lock servers. Little state is needed to maintain these fine-grain locks; the servers need only keep a non-volatile, monotonically-increasing acquisition counter that is rarely updated. Clients can learn of lost locks at unlock time, and if a simple fixed-length lease is used, the protocol can be simple and efficient.
Is Chubby's role in client-implemented fine-grained locking simply to provide consensus on which application server is responsible for the lock group associated with a fine-grained lock? And then, inside the lock-group server, an acquisition counter keeps track of state?
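To make my interpretation concrete, here is a purely hypothetical sketch of what I imagine such an application-specific lock-group server might look like. None of the names, the lease handling, or the epoch check come from the paper; they are just my reading of the excerpt.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical application-specific lock-group server: Chubby's coarse-grained
 * lock only decides WHICH server owns this lock group; the fine-grained locks
 * themselves live here, protected by fixed-length leases and a monotonically
 * increasing acquisition counter.
 */
public class LockGroupServer {
    private static final long LEASE_MILLIS = 10_000;   // fixed-length lease (made-up value)

    // The only state that needs to be non-volatile: bumped whenever this server
    // (re)acquires the group via Chubby, i.e. whenever earlier grants may be lost.
    private final AtomicLong acquisitionCounter = new AtomicLong();

    // Volatile lease table: fine-grained lock name -> expiry time. Safe to lose,
    // because the counter lets clients detect lost locks at unlock time.
    private final ConcurrentMap<String, Long> leases = new ConcurrentHashMap<>();

    /** Called when Chubby hands this server the coarse-grained lock for the group. */
    public synchronized void onGroupAcquired() {
        acquisitionCounter.incrementAndGet();           // rarely updated
        leases.clear();
    }

    /** Try to grant a fine-grained lock; returns the counter value ("epoch") it was
     *  granted under, or -1 if the lock is currently held. */
    public synchronized long acquire(String lockName) {
        long now = System.currentTimeMillis();
        Long expiry = leases.get(lockName);
        if (expiry != null && expiry > now) {
            return -1;                                  // still leased to another client
        }
        leases.put(lockName, now + LEASE_MILLIS);
        return acquisitionCounter.get();
    }

    /** Returns false if the lock was lost (group re-acquired or lease expired) since acquire(). */
    public synchronized boolean release(String lockName, long grantedEpoch) {
        Long expiry = leases.remove(lockName);
        return grantedEpoch == acquisitionCounter.get()
                && expiry != null
                && expiry > System.currentTimeMillis();
    }
}
```

In this picture, Chubby's only job is to arbitrate which server instance gets to run this for a given lock group, and the rarely-updated counter lets a client discover at unlock time that its fine-grained lock was lost.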
Thanks in advance!
I think it means that although Chubby is designed for coarse-grained locks, we can still use it in fine-grained scenarios if we separate the locks that require coarse-grained handling from those that require fine-grained handling. For example, suppose there are a userService and a productService: userService updates user info, while productService updates product info. Maybe product-info updates require a fine-grained lock, so we could deploy a dedicated Chubby cluster to handle them.
There's also a paper titled 'Hierarchical-Chubby' that aims to make Chubby more scalable.
I am designing an IoT system using single-board computers such as the Raspberry Pi.
In particular, I am designing an application messaging platform that enables pub-sub, ESB patterns, and so on.
To keep things easy and simple, I am considering using RabbitMQ.
Furthermore, I want to build a RabbitMQ cluster on those nodes to avoid a single point of failure (SPoF).
However, those devices will sometimes be turned off.
I think this means a node temporarily leaves the cluster.
I expect a RabbitMQ cluster tolerates this situation to a certain degree, but I cannot tell how much it can accept or what problems occur.
To RabbitMQ cluster experts: could you tell me about any concerns and cases I should watch out for?
Do you think this would work in production?
Please share any cases similar to my scenario; even small details would be helpful.
TL;DR: RabbitMQ doesn't work well in this scenario; better to use something else.
RabbitMQ is intended to work with stable nodes; it uses the Raft algorithm for distributed consensus to elect a leader (see http://thesecretlivesofdata.com/raft). As we can observe, with this approach the process of electing a leader involves several steps. If the network is partitioned or the leader fails, another leader must be elected. If this happens frequently, the entire cluster becomes unstable.
You may want to have a look at other technologies such as https://deepstream.io.
I'm looking at the options in ActiveMQ Artemis for data recovery if we lose an entire data centre. We have two data centres, one on the east coast and one on the west coast.
From the documentation and forums I've found four options:
Disk-based methods:
1. Block-based replication of the data directory between the sites (using Ceph or DRBD with protocol A), running Artemis on one site only. In the event of a disaster (or a failover test), stop Artemis on the dead site and start it on the live site.
2. The same thing, but with both Artemis servers active, using an ha-policy to designate the master and the slave over a shared store.
Network replication:
3. Like number 2, but with data replication enabled in Artemis, so Artemis handles the replication.
4. Mirror broker connections.
Our IT team uses / is familiar with MySQL replication, NFS, and rsync for our other services. We are currently handling JMS with a JBoss 4 server replicated over MySQL.
My reaction from reading the documentation is that high-availability data replication is the way to go, but are there trade-offs I'm not seeing? The only option whose documentation mentions DR and cross-site setups is the mirror broker connection, but on the surface it looks like a harder-to-manage version of the same thing?
Our constraints are that we need high performance on the live cluster (on the order of tens of thousands of messages per second, all small).
We can afford to lose messages (as few as possible, preferably) in an emergency failover. We should not lose messages in a controlled failover.
We do not want clients in site A connecting to Artemis in site B; we will enable clients on site B in the event of a failover.
The first thing to note is that the high-availability functionality (both shared-store and replication, options #2 & #3) configured via ha-policy is designed for use in a local data center with high-speed, low-latency network connections. It is not designed for disaster recovery.
The problem specifically with network-based data replication for you is that replication is synchronous which means there's a very good chance it will negatively impact performance since every durable message will have to be sent across the country from one data center to the other. Also, if the replicating broker fails then clients will automatically fail-over to the backup in the other data center.
Using a solution based on block storage (e.g. Ceph or DRBD) is viable, but it's really an independent thing outside the control of ActiveMQ Artemis.
The mirror broker connection was designed with the disaster recovery use-case in mind. It is asynchronous so it won't have nearly the performance impact of replication, and if the mirroring broker fails clients will not automatically fail-over to the mirror.
We have a set of microservices that all communicate via REST APIs. Each service will be implemented as a stateful actor in Service Fabric, and each will have access to the reliable collections we have in Service Fabric. It is imperative that these services act in a transactional manner. We are architecting this solution right now, and there is a debate about Service Fabric's ability to do distributed transaction coordination. If distributed transactions are not supported (as some are claiming), then the solution will be architected using NuGet packages to add that functionality. I think this comes with its own set of problems, almost like the old COM components.
Does Service Fabric have a distributed transaction coordinator for stateful services communicating over Web API?
No; Service Fabric transactions work at the level of a single service replica. Maybe the quorum is what got people confused: even though replication feels like a distributed transaction, it's not something you as a developer can use across services.
Strong consistency is achieved by ensuring transaction commits finish only after the entire transaction has been logged on a majority quorum of replicas, including the primary.
Note: distributed transactions cause more issues than they solve; I'd recommend you read about event-driven architectures instead.
We are choosing the best option for implementing leader election to achieve high availability. Our goal is to have only a single instance active at any given time. We are using Spring Boot to develop the application, which is deployed by default on Tomcat. It would be great to hear your opinion on the following:
Does ZooKeeper provide better CP (consistency under partition) guarantees than Consul?
What is your view on maintenance/complexity?
ZooKeeper is based on ZAB and Consul is based on Raft. At a high level, both are very similar atomic broadcast algorithms, so as far as the "C" of CAP is concerned (which is actually linearizability, a very strong form of consistency), both provide similar guarantees. Both offer linearizable writes through a quorum (majority). The other nodes (those not in the quorum) may lag behind on updates by default, resulting in stale reads. This is done because full linearizability makes things slow, and many applications are fine with slightly stale reads. However, if that is not acceptable in a particular use case, it is always possible to issue a sync call before a read in ZooKeeper, or to use consistent mode in Consul, to achieve fully linearizable reads.
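To make the ZooKeeper side of that concrete, here is a minimal sketch with the plain ZooKeeper Java client (the znode path is up to you and error handling is omitted); the Consul analogue is adding ?consistent=true to the read request.

```java
import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.ZooKeeper;

public class LinearizableRead {
    // By default a read served by a follower may be stale; sync() forces the
    // follower we are connected to to catch up with the leader first, so the
    // getData() that follows observes all writes committed before the sync.
    public static byte[] readLatest(ZooKeeper zk, String path) throws Exception {
        CountDownLatch synced = new CountDownLatch(1);
        zk.sync(path, (rc, p, ctx) -> synced.countDown(), null);
        synced.await();
        return zk.getData(path, false, null);
    }
}
```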
For service discovery, however, Consul seems to provide higher-level constructs that are not available out of the box in ZooKeeper.
In terms of leader election use case, both can be used.
But given that ZooKeeper is used by many top-level Apache projects, and that it is older than Raft and therefore Consul, I expect it to have better community support and documentation. The Apache documentation describing various recipes is also great.
Finally, if you go with ZooKeeper, you may also want to use Apache Curator, which provides higher-level APIs on top of ZooKeeper.
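For example, a minimal leader-election sketch with Curator's LeaderLatch recipe could look like this (the ensemble address and latch path are hypothetical, and real code would also handle connection loss):

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.leader.LeaderLatch;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class SingleActiveInstance {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zk1:2181,zk2:2181,zk3:2181",   // hypothetical ensemble
                new ExponentialBackoffRetry(1000, 3));
        client.start();

        // Every application instance races for the same latch path;
        // ZooKeeper guarantees exactly one of them holds it at a time.
        LeaderLatch latch = new LeaderLatch(client, "/myapp/leader");
        latch.start();
        latch.await();   // blocks until this instance becomes the leader

        // ... perform the work that only the single active instance may do ...

        latch.close();   // relinquish leadership
        client.close();
    }
}
```

In a Spring Boot service, one option is to start the latch at application startup and guard the active-only work with latch.hasLeadership() instead of blocking in main(), but the idea is the same.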