Scala actors with concurrent access to a shared cache of objects, scala.concurrent.Lock, react vs receive - scala

I'm writing a piece of software in which various actors concurrently create portions of the same graph.
Nodes of the graph are modeled through a class hierarchy; each concrete class in the hierarchy has a companion object.
abstract class Node
class Node1(k1: Node, k2: Node) extends Node
object Node1 {
  def apply(k1: Node, k2: Node) = ...
}
class Node2(k1: Node, k2: Node) extends Node
object Node2 {
  def apply(k1: Node, k2: Node) = ...
}
...
So far so good.
We perform structural hashing on the nodes at creation time.
That is, each companion object has a hash table which stores node instances keyed by their constructor arguments; this is used to detect that an instance of a given node class with the same subnodes already exists and to return that one instead of creating a new instance. This avoids memory blow-up and gives us a node equality test that takes constant time (reference comparison instead of graph comparison). Access to this map is protected by a scala.concurrent.Lock.
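For concreteness, each companion currently looks more or less like this (a simplified sketch; the real key and node types are trimmed down):

import scala.collection.mutable
import scala.concurrent.Lock

object Node1 {
  private val cache = mutable.HashMap.empty[(Node, Node), Node1]
  private val lock = new Lock

  def apply(k1: Node, k2: Node): Node1 = {
    lock.acquire()
    try cache.getOrElseUpdate((k1, k2), new Node1(k1, k2))
    finally lock.release()
  }
}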
Yet the problem is that the Lock operates at the JVM-thread level, and that depending on how the actors are coded, they can either be allocated to their own JVM threads or interleaved with several other actors on the same JVM thread, in which case the structural hashing ceases to work (i.e., several structurally identical nodes can be created, only one of them will be stored in the cache, and structural equality ceases to hold).
First, I know that this structural hashing architecture goes against the actors' share-nothing philosophy, but we really need the hashing to work for performance reasons (constant-time equality brings us an order-of-magnitude improvement). Is there a way to implement mutual exclusion on shared resources with actors that works at the actor level rather than at the JVM-thread level?
I thought of encapsulating the node companions in an actor to completely sequentialize access to the factory, but this would imply a complete rewrite of all the existing code. Any other ideas?
Thanks,

If you have shared mutable state, have a single actor which mutates this state. You can have other actors read, but have one actor that does the writes.
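For example, with scala.actors the cache could be owned by a single factory actor per node class, so all writes are serialized through it (the GetOrCreate message and the names below are just a sketch):

import scala.actors.Actor
import scala.actors.Actor._
import scala.collection.mutable

case class GetOrCreate(k1: Node, k2: Node)

object Node1Factory extends Actor {
  private val cache = mutable.HashMap.empty[(Node, Node), Node1]

  def act() = loop {
    react {
      // the only place that ever touches the cache
      case GetOrCreate(k1, k2) =>
        reply(cache.getOrElseUpdate((k1, k2), new Node1(k1, k2)))
    }
  }
}

// callers block for the shared instance:
//   Node1Factory.start()
//   val n = (Node1Factory !? GetOrCreate(a, b)).asInstanceOf[Node1]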

Related

How can I represent relationships between instances of the same class in a concurrent system

I made a concurrent system with a critical section that involves read and write access to a TXT file.
First, an Auctioneer class creates a TXT file and writes the number 50 to it. The Auctioneer then allows the next node, one of three instances of the Bidder class, to open the file and change the current bid. That bidder then allows the next node, another bidder, to bid, then another bidder, and then that bidder allows the Auctioneer to look at the file.
I made the nodes take turns using server sockets. Each node waits for access using the ServerSocket.accept() method, and it allows the next node to enter its critical section by creating a Socket object connected to the port the next node is listening on.
Each of these nodes runs independently in a separate Java environment, and they communicate only through server sockets.
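In code terms, each node's loop is roughly the following (a simplified sketch; updateBidFile stands in for the real file handling):

import java.net.{ServerSocket, Socket}

object RingNode {
  def run(myPort: Int, nextHost: String, nextPort: Int): Unit = {
    val server = new ServerSocket(myPort)
    while (true) {
      server.accept().close()                 // block here until the previous node passes us the token
      updateBidFile()                         // critical section: read/write the shared TXT file
      new Socket(nextHost, nextPort).close()  // pass the token on to the next node in the ring
    }
  }

  // placeholder for the real read/update of the bid file
  private def updateBidFile(): Unit = ()
}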
Each node of the ring relies on the previous node, because in order for a node to access the resource, the previous node needs to pass it the token. I'm unsure how I would represent that kind of relationship in a UML-compliant way.
It is my understanding that class diagrams should not include several instances of the same class, such as the example below with 3 bidders.
Is this the correct way to represent the relationship which I have described? If not, which way would be better/UML compliant?
Class diagrams, as the name suggests, represent classes of objects and not individual objects, i.e. instances of those classes. Moreover, a class diagram is structural: it does not tell how objects interact or wait for one another, but how classes relate.
In your case the class diagram would therefore represent one Bidder class. To represent a concrete example with instances and how they relate, you could consider an object diagram. There you could very well represent different instances of the same class.
However, if you’re interested in the interactions between classes (e.g. the tokens they exchange), you’d better consider an interaction diagram such as the sequence diagram.

Housing the DAO layer inside a Scala Akka Actor

I am getting warmed up to Scala but am still very new to Akka. This seems like a fairly straightforward question, but I was not able to find any information on this specific approach, which tells me that something might be wrong with my thinking or that there is already a very standard way to do this.
All of the solutions I found revolve around having an Akka actor make calls to a pre-built service layer, which would handle the database logic.
My question is whether it is feasible to make the DAO itself a persistent actor. Something along the lines of this:
import akka.pattern.{ask, pipe}

class UserDAO extends Actor {
  // hypothetical path to the actor that talks to the database;
  // assumes an implicit Timeout and ExecutionContext are in scope for ask/pipeTo
  val db = context.actorSelection("/repository/dao")

  def receive = {
    case GetUserById(id) =>
      // forward the db actor's reply to the original sender instead of sending the Future itself
      (db ? RunStoredProc(SpGetUserById(id))).pipeTo(sender())
    ...
  }
}
The above is purely hypothetical pseudo-code, and all the methods (e.g. RunStoredProc) are intended as examples only. I am more curious about the sanity of the design decision behind such a system. The UserDAO and the DAO (db) actors would be persistent and stateless (although the DAO would hold a handle to the database connection). By persistent I mean that they would not be instantiated by the actors that actually call them.
Am I re-inventing the wheel here?
Your approach seems perfectly feasible. This would allow you to handle all connection-based logic in one place.
The one caveat to consider is that if your DAO is synchronous, using a single actor for all DAO calls would mean only one DB call can execute at a time. This may or may not be desirable.
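If that is a concern, one common option is to put a small pool of identical DAO actors behind a router so several blocking DB calls can be in flight at once; a rough sketch (message and method names assumed):

import akka.actor.{Actor, ActorSystem, Props}
import akka.routing.RoundRobinPool

case class GetUserById(id: Long)

class UserDAO extends Actor {
  def receive = {
    case GetUserById(id) =>
      // hypothetical blocking stored-proc call; each routee handles one call at a time
      sender() ! runSpGetUserById(id)
  }
  private def runSpGetUserById(id: Long): Any = ??? // placeholder for the real DB access
}

object Main extends App {
  val system = ActorSystem("app")
  // five routees => up to five concurrent DB calls
  val userDao = system.actorOf(RoundRobinPool(5).props(Props[UserDAO]), "userDao")
}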

Scala Actors - any suggestions when converting an OOP-based approach?

I'm learning Scala and its actor-based approach to handling concurrency (via the Akka library). I have some questions about converting typical OOP (think Java-style OOP) scenarios to actor-based ones.
Let's consider the overused e-commerce example: a Webstore where Customers make Orders that contain Items. Simulated in OOP style, you end up with appropriately named domain model classes that interact by calling methods on each other.
If we want to simulate concurrency, e.g. many customers buying items at once, we throw in some sort of threading (e.g. via an ExecutorService). Basically, each Customer then implements the Runnable interface and its run() method calls e.g. shop.buy(this, item, amount). Since we want to avoid data corruption caused by many threads modifying shared data at once, we have to use synchronization, and the most typical thing to do is to synchronize the shop.buy() method.
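For instance, something roughly like this (types invented just for the illustration):

case class Item(name: String, price: BigDecimal)

class Shop {
  private val stock = scala.collection.mutable.Map.empty[Item, Int]

  // synchronized so that concurrent customers cannot corrupt the stock map
  def buy(customer: Customer, item: Item, amount: Int): Unit = synchronized {
    val available = stock.getOrElse(item, 0)
    if (available >= amount) stock(item) = available - amount
  }
}

class Customer(shop: Shop, item: Item, amount: Int) extends Runnable {
  def run(): Unit = shop.buy(this, item, amount)
}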
Now let's move on to the actor-based approach. What I understand is that the Shop and each Customer now become actors which, instead of calling the buy() method on the shop directly, send it a message. But here come the difficulties.
Should all the other domain models (Order, Item) become actors too, and should all communication between domain models be message driven? In other words, is it OK to leave some OOP-style interaction between domain models via method invocation? For example, in the OOP-based approach an Order would typically hold a reference to a List which you populate by calling add(item) inside the buy() method. Do these (and similar) interactions have to be remodeled with messaging to make the most of the actor-based approach? Put yet another way: when do we work with an actor's internal state directly, and when do we extract that state into another actor?
In the OOP-based solution you pass instances of classes to methods. I read in the documentation that in the actor model one is supposed to pass immutable messages. So if I understand correctly, instead of messaging the objects themselves, you message only the data needed to identify which entities have to be processed, e.g. their IDs and the type of action you want to perform.
Answering your questions:
2) Your domain model (including shops, orders, buyers, sellers, items) should be described with immutable case classes. Actors should exchange (immutable) commands which may use these classes, like AddItem(count: Int, i: Item): the AddItem case class represents a command and encapsulates the business entity called Item.
1) Your protocol, i.e. the interaction between shops, orders, sellers, buyers, etc., should be encapsulated inside an actor (one actor class per protocol, one instance per state). Simply put, an actor should manage any (mutable) state that changes between requests, like the current basket/order. For instance, you may have an actor for every basket, which holds the chosen items and receives commands like AddItem, RemoveItem, ExecuteOrder. So you don't need an actor for every business entity; you need an actor for every business process.
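A rough sketch of such a basket actor (entity and message names are only illustrative):

import akka.actor.Actor

case class Item(id: Long, name: String, price: BigDecimal)
case class AddItem(count: Int, i: Item)
case class RemoveItem(i: Item)
case object ExecuteOrder

class BasketActor extends Actor {
  // the only mutable state, owned and changed exclusively by this actor
  private var items = Map.empty[Item, Int]

  def receive = {
    case AddItem(count, i) => items = items.updated(i, items.getOrElse(i, 0) + count)
    case RemoveItem(i)     => items = items - i
    case ExecuteOrder      => sender() ! items // hand back an immutable snapshot of the order
  }
}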
In addition, there are some best practices and recommendations about managing concurrency with routers.
P.S. The nearest Java EE-based approach is EJB, with its entities (as case classes) and message-driven beans (as actors).

Sharing Mutable Map across akka actor children

I have a large mutable Map which occupies a lot of memory, and all the children of the parent actor need to access or modify values in that same Map.
I am considering just passing the mutable Map to all the children as a constructor parameter at creation time, so that they can access or modify the Map accordingly.
I just want to confirm that Scala actually passes the object reference around, so that the mutable Map will not be copied all over again and all the children will be modifying the same Map instance?
This is a bad idea. The Akka team does not recommend shared mutable state of any kind.
The Akka way to solve your problem would be to make your map immutable and pass it to your children in immutable messages. If you are convinced that the map has to be mutable, then have one actor manage the map and have the other actors send messages to it to retrieve/update values. There is nothing wrong with mutable state within one actor.
I have a large mutable Map which occupies a lot of memory, and all the children of the parent actor need to access or modify values in that same Map
This is exactly where Akka shines compared to other concurrency mechanisms. Encapsulating mutable state inside an actor and sending messages to that actor to mutate the state is the correct and recommended approach.
So you need to pass an ActorRef to any actor that needs to modify the map. The cool thing is that the other actor can be on a different JVM and it would still work.
This also takes care of your concern about the large memory footprint of your Map.
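A minimal sketch of that map-owning actor (key and value types are placeholders):

import akka.actor.Actor
import scala.collection.mutable

case class Get(key: String)
case class Put(key: String, value: String)

class MapOwner extends Actor {
  // the big map lives in exactly one place; no other actor ever sees it
  private val map = mutable.Map.empty[String, String]

  def receive = {
    case Get(key)        => sender() ! map.get(key)   // replies with an Option[String]
    case Put(key, value) => map.update(key, value)
  }
}

The children then receive the MapOwner's ActorRef (e.g. as a constructor argument) and send it Get/Put messages instead of touching the map directly.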

Functional Programming + Domain-Driven Design

Functional programming promotes immutable classes and referential transparency.
Domain-driven design is composed of Value Objects (immutable) and Entities (mutable).
Should we create immutable Entities instead of mutable ones?
Let's assume the project uses Scala as its main language: how could we write Entities as case classes (and therefore immutable) without risking stale state when dealing with concurrency?
What is good practice? Keeping Entities mutable (var fields, etc.) and giving up the great syntax of case classes?
You can effectively use immutable Entities in Scala and avoid the horror of mutable fields and all the bugs that derive from mutable state. Using immutable entities helps with concurrency; it doesn't make things worse. Your previously mutable state becomes a series of transformations, each of which creates a new reference.
At a certain level of your application, however, you will need to have mutable state, or your application would be useless. The idea is to push it as far up as you can in your program logic. Let's take the example of a bank account, which can change because of interest and ATM withdrawals or deposits.
You have two valid approaches:
You expose methods that can modify an internal property and you manage concurrency on those methods (very few, in fact)
You make all the class immutable and you surround it with a "manager" that can change the account.
Since the first is pretty straightforward, I will detail the second.
case class BankAccount(balance: Double, code: Int)

class BankAccountRef(private var bankAccount: BankAccount) {
  def withdraw(withdrawal: Double): Double = {
    bankAccount = bankAccount.copy(balance = bankAccount.balance - withdrawal)
    bankAccount.balance
  }
}
This is nice, but gosh, you are still stuck with managing concurrency. If you share your BankAccountRef with your background job, you will have to synchronize the calls, and you end up doing concurrency in a suboptimal way.
The optimal way of doing concurrency: message passing
What if, instead, the different jobs could not invoke methods directly on the BankAccount or a BankAccountRef, but just notified them that some operations need to be performed? Well, then you have an Actor, the favourite way of doing concurrency in Scala.
class BankAccountActor(private var bankAccount: BankAccount) extends Actor {
  def receive = {
    case BalanceRequest =>
      sender ! Balance(bankAccount.balance)
    case Withdraw(amount) =>
      this.bankAccount = bankAccount.copy(balance = bankAccount.balance - amount)
    case Deposit(amount) =>
      this.bankAccount = bankAccount.copy(balance = bankAccount.balance + amount)
  }
}
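The messages used above are not defined in the snippet; one possible shape for them is:

case object BalanceRequest
case class Balance(amount: Double)
case class Withdraw(amount: Double)
case class Deposit(amount: Double)

With that, clients just send accountActor ! Withdraw(100.0) and never touch the BankAccount directly.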
This solution is described extensively in the Akka documentation: http://doc.akka.io/docs/akka/2.1.0/scala/actors.html. The idea is that you communicate with an Actor by sending messages to its mailbox, and those messages are processed in the order they are received. As such, you will never have concurrency flaws when using this model.
This is sort of an opinion question that is less Scala-specific than you think.
If you really want to embrace FP, I would go the immutable route for all your domain objects and never put any behavior on them.
That is, some people call the above the service pattern, where there is always a separation between behavior and state. This is eschewed in OOP but natural in FP.
It also depends on what your domain is. OOP is sometimes easier for stateful things like UIs and video games. For hard-core backend services like web sites or REST APIs, I think the service pattern is better.
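As a tiny illustration of that separation (names invented for the example): the entity is a plain immutable case class, and the "service" is a set of pure functions that return new instances.

// immutable entity: data only, no behavior
case class Order(id: Long, lines: List[String], total: BigDecimal)

// "service": the behavior lives here as pure functions returning new entities
object OrderService {
  def addLine(order: Order, line: String, price: BigDecimal): Order =
    order.copy(lines = line :: order.lines, total = order.total + price)

  def clear(order: Order): Order =
    order.copy(lines = Nil, total = BigDecimal(0))
}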
Two really nice things that I like about immutable objects, besides the often-mentioned concurrency benefits, are that they are much more reliable to cache and that they are great for distributed message passing (e.g. protobuf over AMQP), since the intent is very clear.
Also, in FP people bridge the gap between the mutable and immutable worlds by creating a "language" or "dialogue", a.k.a. a DSL (Builders, Monads, Pipes, Arrows, STM, etc.), which lets you mutate and then transform back into the immutable domain; the services mentioned above use such a DSL to make changes. This is more natural than you think (e.g. SQL is an example of such a "dialogue"). OOP, on the other hand, prefers having a mutable domain and leveraging the existing procedural part of the language.