Worker dial-in pattern with multiple masters? - scala

There's a worker dial-in pattern described for Akka, particularly here: http://letitcrash.com/post/29044669086/balancing-workload-across-nodes-with-akka-2. It describes a way to spread load fairly across multiple remote workers. It assumes there's only one master, and workers discover and register with it. Is there a way to extend the worker dial-in pattern to multiple masters, with fair and deterministic sharing of workers between them?
I imagine the following situation. Let's say there's a cluster with 2 different node roles: front-end and worker. There are multiple front-end nodes which run HTTP servers. Those front-ends delegate the business logic to actors running on worker nodes. The front-ends are behind a simple HTTP round-robin load balancer (Nginx).
I'd like to have a shared pool of worker nodes that can be used by any of the front-ends. If one front-end has more load than another, it should consume more of the workers' capacity. If the load is too heavy, I should be able to add more worker nodes (probably automatically via auto-scaling), and they should, again, serve all of the front-ends fairly, on an as-needed basis.
There are a couple of naive implementations, each with its own deficiencies. If workers somehow decide which single front-end to support, worker capacity might not be spread fairly, because front-end load is highly dynamic. Alternatively, if workers register with all of the front-ends, there is a race condition when multiple front-ends request work from the same worker at the same time. All in all, I don't see a good way of supporting this. Does anyone have a better idea?

By using the cluster's current state we can register with more than one master:
// When the worker receives the current cluster state, register with every
// member that is already up (i.e. each master/front-end node).
.match(CurrentClusterState.class, state -> {
  for (Member member : state.getMembers()) {
    if (member.status().equals(MemberStatus.up())) {
      register(member);
    }
  }
})
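For the Scala API, a roughly equivalent sketch looks like the following. It assumes a hypothetical "frontend" node role and a hypothetical WorkerRegistration message; subscribing with InitialStateAsEvents delivers the already-up members as MemberUp events, so the worker registers both with existing front-ends and with any that join later.

import akka.actor.{Actor, ActorLogging, ActorRef, RootActorPath}
import akka.cluster.{Cluster, Member}
import akka.cluster.ClusterEvent.{InitialStateAsEvents, MemberUp}

// Hypothetical registration message the front-ends are assumed to understand.
final case class WorkerRegistration(worker: ActorRef)

class Worker extends Actor with ActorLogging {
  private val cluster = Cluster(context.system)

  // Subscribe to membership events; already-up members also arrive as MemberUp.
  override def preStart(): Unit =
    cluster.subscribe(self, initialStateMode = InitialStateAsEvents, classOf[MemberUp])
  override def postStop(): Unit = cluster.unsubscribe(self)

  def receive: Receive = {
    case MemberUp(member) if member.hasRole("frontend") =>
      register(member)
    // ... handle the actual work messages from the front-ends here
  }

  // Announce this worker to the front-end actor on the given node.
  private def register(member: Member): Unit =
    context.actorSelection(RootActorPath(member.address) / "user" / "frontend") !
      WorkerRegistration(self)
}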

Akka cluster: seed node configuration

I have a simple question.
Does it make sense to configure all the nodes of an Akka cluster as seed nodes?
Example:
cluster {
  seed-nodes = [
    "akka://application@127.0.0.1:2551",
    "akka://application@127.0.0.1:2552",
    "akka://application@127.0.0.1:2553",
    "akka://application@127.0.0.1:2554",
    "akka://application@127.0.0.1:2555",
    "akka://application@127.0.0.1:2556",
    "akka://application@127.0.0.1:2557",
    "akka://application@127.0.0.1:2558",
    "akka://application@127.0.0.1:2559",
    "akka://application@127.0.0.1:2560",
    "akka://application@127.0.0.1:2561",
    "akka://application@127.0.0.1:2562"]
  downing-provider-class = "akka.cluster.sbr.SplitBrainResolverProvider"
  split-brain-resolver {
    active-strategy = static-quorum
    static-quorum {
      quorum-size = 7
    }
  }
}
Are there disadvantages to this configuration?
I guess the answer has to be "it depends".
Seed nodes are one mechanism that enables new nodes to join an Akka cluster.
For your example to work you have to run all the nodes on the same host. I am guessing you're passing some JVM argument like -Dakka.remote.artery.canonical.port=2*** to bind each node to a different port. That's fine, it will work. A new node starting up will try to join the cluster by contacting the seed nodes, starting from the first, until one of them responds.
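For illustration, a minimal sketch of how several nodes of the same actor system could be started on one host, each bound to a different Artery port (the port list and object name here are hypothetical; each node then joins via the configured seed nodes):

import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

object MultiNodeApp {
  def main(args: Array[String]): Unit =
    // Start three nodes of the "application" system on this host, each with
    // its own canonical port overriding the value from application.conf.
    Seq(2551, 2552, 2553).foreach { port =>
      val config = ConfigFactory
        .parseString(s"akka.remote.artery.canonical.port = $port")
        .withFallback(ConfigFactory.load())
      ActorSystem("application", config)
    }
}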
In practice you probably want the cluster nodes running on different machines, and that's when a static configuration like the one in your example can become a bit of a pain. This is because you'd need to know all the IP addresses beforehand and would need to guarantee that they will not change over time. This is perhaps possible in a network with statically assigned IPs, but it's nearly impossible with dynamically assigned IPs or in environments like Kubernetes. This is why other methods of cluster joining are implemented (https://doc.akka.io/docs/akka/current/discovery/index.html).
So the disadvantage I see here is that this configuration is of limited use in any real-life scenario. As long as you're doing this to learn/experiment with Akka Cluster, it's all fine. Though you could also argue that, in that case, a list of 12 seed nodes does not give you much advantage over, say, 2 seed nodes, as long as you can keep them up and running for the duration of your experiment so that all the nodes can join the cluster.

Kubernetes: Disadvantages of an all-master cluster

Hi!
I was wondering if it would be possible to replicate a VMware-like architecture in Kubernetes.
What I mean by that:
Instead of having the control plane always separated from the worker nodes, I would like to put them all together, so in the end we would obtain a cluster of master nodes on which we can schedule applications. For now I'm using Kata Containers with containerd, so all applications are deployed in 'mini' VMs and there isn't the 'escape from the container' problem. The management of the cluster would be done through a dedicated interface (eth0, 1Gb). The users would be able to communicate with the apps deployed within the cluster through another interface (eth1, 10Gb). I would use Keepalived and HAProxy to elect my 'main master' and load balance the traffic.
The question might be 'why would you do that?'. Well, to ensure high availability at all times and reduce the management overhead: instead of having 2 sets of "entities" to manage (the control plane and the worker nodes), simply reduce it to one. That way there wouldn't be problems such as 'I don't have more than 50% of my masters online, so there won't be a leader elected', where I would have to remove master nodes from my cluster until the percentage of online master nodes is above 50%; that requires technical intervention, as fast as possible, which might lead to human errors, etc.
Another positive point would be scaling: instead of having 2 parts of the cluster that I would need to scale (masters and workers), there would be only one; I would just need to add another master/worker to the cluster and that's it. All the management traffic would be directed to the main master, which uses a Virtual IP (VIP), and in case of an overload the requests would be redirected to another node.
In the end I would have something resembling this:
Photo - Architecture VMWare-like
I'm trying to find the disadvantages of this kind of architecture. I know that there would be etcd traffic on each node, but how impactful is it? I know that there will be resources wasted on the control-plane pods on each node, but knowing that these pods (except etcd) won't do much besides waiting, how impactful would it be? With each node capable of taking the master role, there wouldn't be any downtime. Right now if my control plane (3 masters) goes down, I have to reboot it or find a solution as fast as possible before there's a problem with one of the apps that run on the worker nodes.
The topology I'm using right now resembles the following:
Architecture basic Kubernetes
I'm new to Kubernetes, so the question might seem stupid, but I would really like to know the advantages/disadvantages of the two approaches and understand why it wouldn't be a good idea.
Thanks a lot for any help!
There are two reasons for keeping control planes on their own. The big one is that you only want a small number of etcd nodes, usually 3 or 5, and that's usually the bounding factor on the size of the control plane; you usually want the ability to scale worker nodes independently from that. The second issue is that etcd is very sensitive to IOPS brownouts and can suffer bad cascading failures if the machine runs low on IOPS.
And given that you are doing things on top of VMware anyway, the overhead of managing 3 vs 6 VMs is not generally a difference in kind. This seems like false savings in the long run.

Lagom: is it possible to split service instances across multiple clusters?

Let's say I have Hello-Service. In Lagom, this service can run across multiple nodes of a single cluster.
So within Cluster 1, we can have multiple "copies" of Hello-Service:
Cluster1: Hello-Service-1, Hello-Service-2, Hello-Service-3
But is it possible to run service Hello-Service across multiple clusters?
Like this:
Cluster1: Hello-Service-1, Hello-Service-2, Hello-Service-3,
Cluster2: Hello-Service-4, Hello-Service-5, Hello-Service-6
What I want to achieve is better scalability of the read-side processors and event consumers:
In Lagom, we need to set up front the number of shards for a given event tag within the cluster.
So I wonder if I can just add another cluster to distribute the load across them.
And, of course, I'd like to shard persistent entities by some key.
(Let's say I'm building a multi-tenant application and shard entities by organization id, so all entities of one set of organizations go into Cluster 1 and entities of another set of organizations go into Cluster 2. Each cluster would then have its own sharded read-side processors, handling only the subset of events/entities within that cluster, for better scalability.)
With a single cluster approach, as a system grows, a sharded processor within a single cluster may become slower and slower because it needs to handle more and more events.
So as the system grows, I would just add a new cluster (Let's say, Cluster 2, then Cluster 3, which would handle their own subset of events/entities)
If you are using sharded read sides, Lagom will distribute the processing of the shards across all the nodes in the cluster. So, if you have 10 shards and 6 nodes in 1 cluster, each node will process 1-2 shards. If you instead deploy two clusters of 3 nodes each, each node will end up processing 3-4 shards, but every event will be processed twice, once in each cluster. That's not helping scalability, that's doing twice as much work as needs to be done. So I don't see why you would want two clusters; just have one cluster, and Lagom will distribute the shards evenly across it.
If you are not using sharded read sides, then it doesn't matter how many nodes you have in your cluster, all events will be processed by one node. If you deploy a second cluster, it won't share the load, it will also process the same events, so you'll get double processing of each event, once per cluster, which is not what you want.
So, just use sharded read sides and let Lagom distribute the work across your single cluster for you; that's what it's designed to do.
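For reference, a minimal sketch of what a sharded event tag could look like in the Lagom Scala API (the event type and shard count here are placeholders):

import com.lightbend.lagom.scaladsl.persistence.{AggregateEvent, AggregateEventShards, AggregateEventTag}

object HelloEvent {
  // Fix the number of shards up front; Lagom then spreads these shards
  // (and their read-side processing) across the nodes of the single cluster.
  val Tag: AggregateEventShards[HelloEvent] = AggregateEventTag.sharded[HelloEvent](10)
}

sealed trait HelloEvent extends AggregateEvent[HelloEvent] {
  override def aggregateTag: AggregateEventShards[HelloEvent] = HelloEvent.Tag
}

// A read-side processor then declares all the shard tags, so Lagom can
// distribute them across the cluster:
//   override def aggregateTags: Set[AggregateEventTag[HelloEvent]] = HelloEvent.Tag.allTags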

What to do when my AKKA Actor dies

Just for discussion here:
I was thinking Akka is great for writing distributed systems; however, if your Supervisor and Actors are all on one machine, your distributed system will not be highly available. If the machine goes down, the whole distributed system goes down with it.
So how about I put the Supervisor on one machine and all the Actors on separate machines? Then if one Actor dies, there are still others to handle the work. If I bring up a replacement machine, how can the Supervisor know that there is this new machine that can house a new Actor?
Ultimately the Supervisor tree leads to a Root Supervisor. What if the machine that houses the Root Supervisor dies? Does this make it the weakest link in the whole distributed system? How about having an additional Root Supervisor node that one can fail over to? How about having several, with a load balancer in front of all the Root Supervisors to distribute the load?
Some problems are recurring and are solved by various cluster tools in Akka that are all built on the core cluster APIs.
Cluster Singleton allows you to have a single instance of an actor in an entire cluster; if its node is downed, the singleton actor is started on a new node. See docs here: http://doc.akka.io/docs/akka/current/scala/cluster-singleton.html#cluster-singleton
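For illustration, a minimal classic-API sketch of a cluster singleton (the singleton actor and the names used are placeholders):

import akka.actor.{Actor, ActorSystem, PoisonPill, Props}
import akka.cluster.singleton.{ClusterSingletonManager, ClusterSingletonManagerSettings, ClusterSingletonProxy, ClusterSingletonProxySettings}

// Placeholder singleton actor, for illustration only.
class MySingletonActor extends Actor {
  def receive: Receive = { case _ => () /* handle work here */ }
}

object SingletonExample extends App {
  val system = ActorSystem("ClusterSystem")

  // Start the singleton manager on every node; the actual singleton actor runs
  // on the oldest node and is restarted elsewhere if that node is downed.
  system.actorOf(
    ClusterSingletonManager.props(
      singletonProps = Props[MySingletonActor](),
      terminationMessage = PoisonPill,
      settings = ClusterSingletonManagerSettings(system)),
    name = "mySingleton")

  // Send messages through a proxy that tracks where the singleton currently lives.
  val proxy = system.actorOf(
    ClusterSingletonProxy.props(
      singletonManagerPath = "/user/mySingleton",
      settings = ClusterSingletonProxySettings(system)),
    name = "mySingletonProxy")
}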
Sharding, as mentioned by László, is for when you have a high number of actors, each of which you want exactly one instance of somewhere in the cluster. Docs here:
http://doc.akka.io/docs/akka/current/scala/cluster-sharding.html
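And a correspondingly minimal classic sharding sketch (the entity actor, message envelope and shard count are placeholders):

import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import akka.cluster.sharding.{ClusterSharding, ClusterShardingSettings, ShardRegion}

// Placeholder entity actor and message envelope, for illustration only.
class WorkerEntity extends Actor {
  def receive: Receive = { case _ => () /* handle work for this entity here */ }
}
final case class Envelope(entityId: String, payload: Any)

object ShardingExample extends App {
  val system = ActorSystem("ClusterSystem")

  // Route each message to an entity by id, and each entity to one of 100 shards;
  // the shards themselves are spread across the nodes of the cluster.
  val extractEntityId: ShardRegion.ExtractEntityId = {
    case Envelope(id, payload) => (id, payload)
  }
  val extractShardId: ShardRegion.ExtractShardId = {
    case Envelope(id, _) => (math.abs(id.hashCode) % 100).toString
  }

  val region: ActorRef = ClusterSharding(system).start(
    typeName = "WorkerEntity",
    entityProps = Props[WorkerEntity](),
    settings = ClusterShardingSettings(system),
    extractEntityId = extractEntityId,
    extractShardId = extractShardId)

  // Messages go through the local shard region, which routes them to the right node.
  region ! Envelope("worker-42", "do some work")
}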
Both of these fit together nicely with Akka Persistence, so that a new actor instance can arrive at the same state as a previous instance on another node.
If you want your actors to be available at all times, you cannot achieve that at the same time as "having only one" (essentially the consistency vs availability problem), but you can use Akka Distributed Data for always available but eventually consistent state. http://doc.akka.io/docs/akka/current/scala/distributed-data.html
You could of course also build your own tools using actors and the cluster APIs if none of the existing ones fit your use case (I wouldn't say it is trivial, though ;) ).
First off, there is quite some documentation on the general topic of distributing Actor systems over multiple machines, communicating over the network:
http://doc.akka.io/docs/akka/snapshot/scala/index-network.html
That said, while it is possible to remotely supervise an Actor, you correctly surmised that it does not yield much additional resilience. Rather, use Cluster Sharding to ensure that copies of workers exist on multiple machines, and use a simple cluster-aware ActorRef to send messages to them from another machine (without a supervision relationship).

Maximum servers in a ZooKeeper ensemble cluster?

Use case: 100 servers in a pool; I want to start a ZooKeeper service on each server, and the server applications (ZooKeeper clients) will use the ZooKeeper cluster (read/write). Then there is no single point of failure.
Is this solution possible for this use case? What about the performance?
What if there are 1000 Servers in the pool?
If you are simply trying to avoid a single point of failure, then you only need 3 servers. In a 3-node ensemble, a single failure can be tolerated, with the remaining 2 nodes forming the quorum (in general, an ensemble of 2N+1 servers tolerates N failures). The more servers you have, the worse write performance will be, because every write must be acknowledged by a quorum. And 100 servers is the extreme of this, if ZooKeeper can even handle it.
However, having that many clients is no problem at all. ZooKeeper has active deployments with many more than 1000 clients. If you find that you need more servers to handle your read load, you can always add Observers. I highly recommend you join the mailing list. It is an excellent way to quickly have your questions answered, and likely in much more detail than anyone will give you on SO.
Maybe ZooKeeper is not the right tool?
Hazelcast does what you want, I think. You can have hundreds of peers, and if the master is lost a new one is elected from the remaining peers.
You don't need to use all of Hazelcast. You can use just the maps, or just the worker pools, or just the synchronisation primitives, or just the messaging, etc.