Distributed DDD entities with Akka - scala

Suppose we'd have a large number of persistent Person actors, each constructed with an identity and a name argument. What would be the best way to distribute these actors in a cluster, in such a manner that:
new actors are appointed a node by strategy X (round robin, consistent hash, etc.)
a "coordinator" actor contains a mapping from identity to ActorRef
one or more nodes can fail and the affected actors are recovered on other nodes
there is no SPF
I've considered the following, which doesn't seem to solve the problem:
Cluster sharding; all actors are initialised equally and created by coordinator
Cluster aware routing; groups or pools are fixed size and can't be modified dynamically

Sounds like you pretty much exactly are describing Akka cluster sharding and there isn't enough information to see why it would not fit.
The common solution to deal with such a design problem is to have an uninitialized state of the sharded entity where it only accepts an initialize command containing the needed values (so something like CreateUser(id, name)) and when it gets that it toggles to its "normal" behavior.
Another option could be to introduce an intermediate actor that doesn't start the actual actor until it has extracted the name value if you have no means to change the design of your Person actor.
Ofc. you could also drop down to the Akka cluster APIs directly and build something that exactly matches your use case, but handling redistribution on cluster topology change (add, remove nodes etc) is far from trivial to get right.
I think you would also come to the realisation that achieving such a tool that is entirely non-invasive for your entities without the sharding solution being tightly coupled with you business logic is very hard.

Related

Is there a way of assigning an int number to different instances of stateless services?

I'm building a solution where we'll have a (service-fabric) stateless service deployed to K instances. This service is tasked with some workload (like querying) and I want to split the workload between them as evenly as I can - and I want to make this a dynamic solution, which means if I decide to go from K instances to N instances tomorrow, I want the workload splitting to happen in a way that it will automatically distribute the load across N instances now. I don't have any partitions specified for this service.
As an example -
Let's say I'd like to query a database to retrieve a particular chunk of the records. I have 5 nodes. I want these 5 nodes to retrieve different 1/5th of the set of records. This can be achieved through some query logic like (row_id % N == K) where N is the total number of instances and K is the unique instance_number.
I was hoping to leverage FabricRuntime.GetNodeContext().NodeId - but this returns a guid which is not overly useful.
I'm looking for a way where I can deterministically say it's instance number M out of N (I need to be able to name the instances through 1..N) - so I can set my querying logic according to this. One of the requirements is if that instance goes down / crashes etc... when SF automatically restarts it, it should still identify as the same instance id - so that 2 or more nodes doesn't query the same set of results.
What is the best of solving this problem? Is there a solution which involves pure configuration through ApplicationManifest.xml or ServiceManifest.xml?
There is no out of the box solution for your problem, but it can be easily done in many different ways.
The simplest way is using the Queue-Based Load Leveling pattern in conjunction with Competing Consumers pattern.
It consists of creating a queue, add the work to the queue, and each instance get one message to process this work, if one instance goes down and the message is not processed, it goes back to the queue and another instance pick it up.
This way you don't have to worry about the number of instances running, failures and so on.
Regarding the work being put in the queue, it will depend if you want to to do batch processing or process item by item.
Item by item, you put one message in the queue for each item being processed, this is a simple way to handle the work and each instance process one message at time, or multiple messages in parallel.
In batch, you can put a message that represents a list of items to be processed and each instance process that batch until completed, this is a bit trickier because you might have to handle the progress of the work being done, in case of failure, the next time you can continue from where it stopped.
The queue approach is a reactive design, in this case the work need to be put in the queue to trigger the processing, If you want a proactive approach and need to keep track of which work goes to who, you probably might be better of using some other approach, like a Leasing mechanism, where each instance acquire a lease that belongs to the instance until it releases the lease, this would more suitable when you work with partitioned data or other mechanism where you can easily split the load.
Regarding the issue with the ID, an option would be the InstanceId of the replica you are on, you can reach by StatelessService.Context.InstanceId, it is not a sequential ID, but it is a random number. It is better than using the node id, because you might have multiple partitions on same node and the id would conflict with each other.
If you decide to use named partitions, you could use order in the partition name instead, so each partition would have a sequential name.
Worth mention that service fabric has a limitation that doesn't allow services to have multiple replicas on same node, because of this limitation you might have to design your services with this in mind, otherwise you won't be able to scale out once the limit is reached. Also, the same thread has some discussion about approaches to process multiple distributed items that might give you some ideas.

Akka cluster-sharding: moving actor shards based on communication patterns

I am building an open-source distributed economic simulation platform using Akka (especially the remote and cluster packages). A key bottleneck in such simulations is the fact that communication patterns between actors evolve over the course of the simulation and often actors will end up sending loads of messages over the wire between nodes in the cluster.
I am looking for a mechanism to detect actors on some node that are communicating heavily with actors on some other node and move them to that other node. Is this possible using existing Akka cluster sharding functionality? Perhaps this is what Roland Kuhn meant by "automatic actor tree partitioning" is his answer to this SO question.
To move shards around according to your own logic is doable by implementing a custom ShardAllocationStrategy.
You just have to extend ShardAllocationStrategy and implement those 2 methods:
def allocateShard(requester: ActorRef, shardId: ShardId,
currentShardAllocations: Map[ActorRef, immutable.IndexedSeq[ShardId]])
: Future[ActorRef]
def rebalance(currentShardAllocations: Map[ActorRef,
immutable.IndexedSeq[ShardId]], rebalanceInProgress: Set[ShardId])
: Future[Set[ShardId]]
The first one determines which region will be chosen when allocating a new shard, and provides you the shards already allocated. The second one is called regularly and lets you control which shards to rebalance to another region (for example, if they became too unbalanced).
Both functions return a Future, which means that you can even query another actor to get the information you need (for example, an actor that has the affinity information between your actors).
For the affinity itself, I think you have to implement something yourself. For example, actors could collect statistics about their sender nodes and post that regularly to a cluster singleton that would determine which actors should be moved to the same node.

Changing number of partitions for a reliable actor service

When I create a new Service Fabric actor the underlying (auto generated) actor service is configured to use 10 partitions.
I'm wondering how much I need to care about this value?
In particular, I wonder whether the Actor Runtime has support for changing the number of partitions of an actor service on a running cluster.
The Partition Service Fabric reliable services topic says:
In rare cases, you may end up needing more partitions than you have initially chosen. As you cannot change the partition count after the fact, you would need to apply some advanced partition approaches, such as creating a new service instance of the same service type. You would also need to implement some client-side logic that routes the requests to the correct service instance, based on client-side knowledge that your client code must maintain.
However, due to the nature of Actors and that they are managed by the Actor Runtime I'm tempted to believe that it would indeed be possible to do this. -- That the Actor Runtime would be able to take care of all the heavylifting required to re-partition actor instances.
Is that at all possible?
The number of partitions in a running service cannot be changed. This is true of Actors as well as Reliable Services. Typically, you would want to pick a large number of partitions (more than the number of nodes) up front and then scale out the number of nodes in the cluster instead of trying to repartition your data on the fly. Take a look at Abhishek and Matthew's comments in the discussion here for some ideas on how to estimate how many partitions you might need.

Is it mandatory to have a master actor in Akka?

I am trying to learn a bit about akka actors (in scala) and I came up with this question that I couldn't find an answer to.
Do you necessarily need to create a master actor and from there create the workers with a workerRouter?
Or can you just skip this step and go directly to create workers with a workerRouter from your Main object?
Let me know if you need any code, but I'm basically following the HelloWorld for akka.
Strictly speaking: yes. In terms of bussiness logic: no.
Akka's actors, are by design, hierarchical . That means, any actors you create will always have a "parent"/"master", if not one defined yourself, then the /user guardian actor .
However, note that this hierarchical relationship, from Akka's system point of view, concerns the actor lifecycle and child supervision. It does not care about how you wire the actors up with your messages and/or any custom lifecycle handling.
So, from the point of view of your application, you can have your worker actors run as peers with some some sort of consensus scheme. They will of course have system parent (/user if you won't define one yourself), but as long as you don't care about supervision - and if you're just starting to learn Akka you might not - everything will work fine.
Finally, note that there can be many schemes for working in a "worker pool" setup. For example, this article on work pulling might give you some inspiration on the possible solutions to such problems.

Akka and state among actors in cluster

I am working on my bc thesis project which should be a Minecraft server written in scala and Akka. The server should be easily deployable in the cloud or onto a cluster (not sure whether i use proper terminology...it should run on multiple nodes). I am, however, newbie in akka and i have been wondering how to implement such a thing. The problem i'm trying to figure out right now, is how to share state among actors on different nodes. My first idea was to have an Camel actor that would read tcp stream from minecraft clients and then send it to load balancer which would select a node that would process the request and then send some response to the client via tcp. Lets say i have an AuthenticationService implementing actor that checks whether the credentials provided by user are valid. Every node would have such actor(or perhaps more of them) and all the actors should have exactly same database (or state) of users all the time. My question is, what is the best approach to keep this state? I have came up with some solutions i could think of, but i haven't done anything like this so please point out the faults:
Solution #1: Keep state in a database. This would probably work very well for this authentication example where state is only represented by something like list of username and passwords but it probably wouldn't work in cases where state contains objects that can't be easily broken into integers and strings.
Solution #2: Every time there would be a request to a certain actor that would change it's state, the actor will, after processing the request, broadcast information about the change to all other actors of the same type whom would change their state according to the info send by the original actor. This seems very inefficient and rather clumsy.
Solution #3: Having a certain node serve as sort of a state node, in which there would be actors that represent the state of the entire server. Any other actor, except the actors in such node would have no state and would ask actors in the "state node" everytime they would need some data. This seems also inefficient and kinda fault-nonproof.
So there you have it. Only solution i actually like is the first one, but like i said, it probably works in only very limited subset of problems (when state can be broken into redis structures). Any response from more experienced gurus would be very appriciated.
Regards, Tomas Herman
Solution #1 could possibly be slow. Also, it is a bottleneck and a single point of failure (meaning the application stops working if the node with the database fails). Solution #3 has similar problems.
Solution #2 is less trivial than it seems. First, it is a single point of failure. Second, there are no atomicity or other ordering guarantees (such as regularity) for reads or writes, unless you do a total order broadcast (which is more expensive than a regular broadcast). In fact, most distributed register algorithms will do broadcasts under-the-hood, so, while inefficient, it may be necessary.
From what you've described, you need atomicity for your distributed register. What do I mean by atomicity? Atomicity means that any read or write in a sequence of concurrent reads and writes appears as if it occurs in single point in time.
Informally, in the Solution #2 with a single actor holding a register, this guarantees that if 2 subsequent writes W1 and then W2 to the register occur (meaning 2 broadcasts), then no other actor reading the values from the register will read them in the order different than first W1 and then W2 (it's actually more involved than that). If you go through a couple of examples of subsequent broadcasts where messages arrive to destination at different points in time, you will see that such an ordering property isn't guaranteed at all.
If ordering guarantees or atomicity aren't an issue, some sort of a gossip-based algorithm might do the trick to slowly propagate changes to all the nodes. This probably wouldn't be very helpful in your example.
If you want fully fault-tolerant and atomic, I recommend you to read this book on reliable distributed programming by Rachid Guerraoui and Luís Rodrigues, or the parts related to distributed register abstractions. These algorithms are built on top of a message passing communication layer and maintain a distributed register supporting read and write operations. You can use such an algorithm to store distributed state information. However, they aren't applicable to thousands of nodes or large clusters because they do not scale, typically having complexity polynomial in the number of nodes.
On the other hand, you may not need to have the state of the distributed register replicated across all of the nodes - replicating it across a subset of your nodes (instead of just one node) and accessing those to read or write from it, providing a certain level of fault-tolerance (only if the entire subset of nodes fails, will the register information be lost). You can possibly adapt the algorithms in the book to serve this purpose.