Scala Actors - any suggestions when converting OOP based approach? - scala

I'm learning Scala and its actor-based approach to handling concurrency (via the Akka library). I have some questions about converting typical OOP scenarios (think Java-style OOP) to actor-based ones.
Let's consider the overused e-commerce example: a Webstore where Customers place Orders that contain Items. Simulated in OOP style, you end up with appropriately named domain model classes that interact by calling methods on each other.
If we want to simulate concurrency, e.g. many customers buying items at once, we throw in some sort of threading (e.g. via an ExecutorService). Basically, each Customer then implements the Runnable interface, and its run() method calls e.g. shop.buy(this, item, amount). Since we want to avoid data corruption caused by many threads modifying shared data at once, we have to use synchronization. So the most typical thing to do is synchronize the shop.buy() method.
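For concreteness, here is a minimal sketch of that threaded, synchronized setup (the class and method names are illustrative, not actual code from my project):

import java.util.concurrent.Executors

case class Item(name: String)

class Shop {
  private var stock = Map("book" -> 10)

  // The whole method is synchronized so concurrent buyers
  // cannot corrupt the shared stock map.
  def buy(customer: Customer, item: Item, amount: Int): Unit = synchronized {
    val left = stock.getOrElse(item.name, 0)
    if (left >= amount) stock += item.name -> (left - amount)
  }
}

class Customer(shop: Shop) extends Runnable {
  def run(): Unit = shop.buy(this, Item("book"), 1)
}

object Main extends App {
  val shop = new Shop
  val pool = Executors.newFixedThreadPool(4)
  (1 to 10).foreach(_ => pool.submit(new Customer(shop)))
  pool.shutdown()
}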
Now let's move on to the actor-based approach. What I understand is that the Shop and each Customer now become actors which, instead of calling the buy() method on the shop directly, send it a message. But here come the difficulties.
Should all the other domain models (Order, Item) become actors too, with all communication between domain models message-driven? In other words, is it OK to leave some OOP-style interaction between domain models via method invocation? For example, in the OOP-based approach an Order would typically hold a reference to a List that you populate, when a user buys something, by calling add(item) in the buy() method. Do these (and similar) interactions have to be remodeled with messaging to make the most of the actor-based approach? Put differently, when do we work with an actor's internal state directly, and when do we extract that internal state into another actor?
In an OOP-based solution you pass instances of classes to methods. I read in the documentation that in the actor model one is supposed to pass immutable messages. So if I understand correctly, instead of messaging the objects themselves, you message only the data needed to identify which entities have to be processed, e.g. their IDs and the type of action you want to perform.

Answering your questions:
2) Your domain model (including shops, orders, buyers, sellers, items) should be described with immutable case classes. Actors should exchange (immutable) commands, which may use these classes, like AddItem(count: Int, i: Item) - the AddItem case class represents a command and encapsulates the business entity Item.
1) Your protocol, i.e. the interaction between shops, orders, sellers, buyers, etc., should be encapsulated inside an actor (one actor class per protocol, one instance per state). Simply put, an actor should manage any (mutable) state that changes between requests, like the current basket/order. For instance, you may have an actor for every basket, which holds information about the chosen items and receives commands like AddItem, RemoveItem, ExecuteOrder. So you don't need an actor for every business entity; you need an actor for every business process.
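For illustration, here is a minimal Akka sketch of such a basket actor; the command and entity definitions are assumptions for the example, not a prescribed API:

import akka.actor.{Actor, ActorSystem, Props}

// Immutable domain entity and commands (names are illustrative)
case class Item(id: Long, name: String, price: BigDecimal)
case class AddItem(count: Int, item: Item)
case class RemoveItem(item: Item)
case object ExecuteOrder

// One actor instance per basket: the actor owns the mutable state
// (the chosen items) and changes it only in response to messages.
class BasketActor extends Actor {
  private var items = Map.empty[Item, Int]

  def receive = {
    case AddItem(count, item) =>
      items += item -> (items.getOrElse(item, 0) + count)
    case RemoveItem(item) =>
      items -= item
    case ExecuteOrder =>
      sender ! items // hand an immutable snapshot over to the order process
  }
}

object Demo extends App {
  val system = ActorSystem("shop")
  val basket = system.actorOf(Props[BasketActor], "basket-1")
  basket ! AddItem(2, Item(1L, "book", BigDecimal(9.99)))
}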
In addition, there are some best practices, as well as recommendations about managing concurrency with routers.
P.S. The nearest Java EE-based approach is EJB, with its entities (as the case classes) and message-driven beans (as the actors).

Related

How to create or update a many to many relationship with a RESTful API

I'm trying to understand exactly how a many to many relationship works.
Let's say I have a Movie and an Actor model. Actors belong to many Movies, and Movies have many Actors. I understand that I can create a MovieActor table that has foreign keys to both a Movie and an Actor. The part I'm not quite clear about: if I want to create a new Actor and relate it to a Movie (POST), or update an existing Actor to relate it to a Movie (PUT), do I use my /api/actor endpoint, or do I create a separate endpoint like /api/movieactor?
You can use either. Oftentimes, a REST API (and even the backing DB) will denormalize that data so that it exists in both entries. But it's up to you how you want to notify the REST server of the data. You could POST a new Actor and, in the Actor, include the Movies - and the server (in addition to adding the new record for the Actor) could update the data stored in the Movie record. Or vice versa. Or both. There's no rule that a modification to one REST object can't have side effects on other objects.
And I think people would generally recommend against having a third endpoint just to manage the relationship data between the two primary objects. It just complicates the client API, introduces more latency, and exposes too much of the DB internals to the client (making it harder to change in the future).
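For concreteness, here is a hypothetical request for the "POST a new Actor and include the Movies" option (the endpoint and field names are invented):

POST /api/actors HTTP/1.1
Content-Type: application/json

{ "name": "New Actor", "movies": [ { "id": 42 } ] }

The server would create the actor, insert the join-table row linking it to movie 42, and could additionally denormalize the actor into the movie record.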

Recursive model design

The software I am working on uses an API which roughly has this organization (you might have to read it twice in order to resolve the symbols :) ):
A scenario is a process which contains a set of intervals (duration) and events (time point).
An interval is defined by its beginning and end events, which specify the times at which it starts and ends (hence its duration). An interval can hold an arbitrary number of processes (like a scenario).
An event is just a point in time.
Events can be placed on a graphical view in order to create a scenario.
As you can see, this model is recursive, since you can put a scenario in an interval, and another interval inside this scenario ad infinitum.
My question is: in a "view model" - "presenter" - "view" pattern, what should the ownership relations of the API objects and the view model objects be? Should I let the API manage the ownership of its own model objects, like Event & Interval, or should I instantiate them when I instantiate the corresponding view model objects? Is there a best practice?
You should probably let the API manage its own domain objects, and in your project map those objects to custom Model or ViewModel objects as necessary.
Whenever working with ViewModels, try to remember that MVVM or MVP is a pattern for your UI, not a pattern for business logic. Presenters should call other classes (which should be regarded as outside the MVVM/MVP/MVPVM pattern) to do their business logic. It sounds like the API you mention provides a lot of your business functionality; ideally, your models would be specific to your app and then you would map the API's objects to your model objects appropriately.
It is common, and sometimes a mistake, to use the domain objects (e.g. those provided by the API) as your Model, so be careful: the instant you need a property or attribute on your Model that isn't provided by the API object, you are stuck. Be very willing to map the API's objects to custom Model objects that exist only for your app or site.
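For example, a minimal sketch (with invented type names) of keeping the API's object separate from your app's model:

object EventMapping {
  // The API owns its domain type...
  case class ApiEvent(timestamp: Long)

  // ...your app owns a separate model type that can carry
  // app-specific fields the API does not provide.
  case class EventViewModel(timestamp: Long, selected: Boolean)

  def toViewModel(e: ApiEvent): EventViewModel =
    EventViewModel(e.timestamp, selected = false)
}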
When in doubt, get back to SOLID, especially the single-responsibility principle.
Hope I understood you correctly.

Functional programming equivalents for the following [closed]

I am trying to make the leap from "hello world"-equivalent functional programs to more real-world applications.
As I come from the Java world and have been exposed to all its design patterns, my modeling process is still very Java-oriented (e.g. I think in terms of *Managers, *Factories, *ClientFactories, *Handlers, etc.).
Changing my thought process in one shot will be hard, so I was hoping to get some pointers on how the following scenarios (described in an OO way) would be modeled in a functional language.
Examples in a functional language like Clojure/Haskell (or perhaps a hybrid like Scala) would be helpful.
Stateless Request handlers
E.g. a Servlet: it is essentially a request handler with methods like doGet and doPost. How would one model such a class in a functional language?
Orchestrator classes
Such classes don't do anything by themselves, but just orchestrate the whole process or workflow. They offer multiple entry point APIs.
E.g. an OrderOrchestrator orchestrates a multi-step workflow, starting with payment instrument validation, then shopping cart management, payment, shipment initiation, etc.
They might maintain some internal state of their own that is used by the different steps like payment, shipment etc.
ClientFactory pattern
Let's say you have written a client for a LogService that your customers use to log traffic data about their services. The client logs the data in S3, under buckets and accounts managed by you, and you provide additional services like reporting and analytics on this data.
You don't want your customers to worry about providing configuration information like AWS account info, so you provide a ClientFactory that instantiates the appropriate client object, based on whether it is for testing or production purposes, without requiring the customer to provide any configuration. E.g. LogServiceClientFactory.getProdInstance() or LogServiceClientFactory.getTestInstance().
How is such a client modeled in a functional language?
Builder Pattern and other Fluent API designs
Client libraries often provide Builders to create objects with complex configuration. Sometimes APIs are also made fluent to ease object creation. An example of a fluent API is the Mockito API: Mockito.when(A.get()).thenReturn(a). IIRC this is internally implemented by returning progressively more restrictive Builders that allow the developer to write this code.
Is there a parallel to this in the functional programming world?
Datastore instances
Let's say that your codebase uses data stored in an ActiveUserRegistry from multiple places. You want only one instance of this registry to exist, and the entire codebase to access this registry. So you provide ActiveUserRegistry.getInstance(), which guarantees that all of the codebase accesses that one instance (assume the instance is thread-safe, etc.).
How is this managed in a functional setting? Do we have to make sure the same instance is passed around the entire codebase?
Below is something to get started:
Stateless Request handlers
Clojure: Protocols
Haskell: Type classes
Orchestrator classes
State monad
ClientFactory pattern
LogServiceClientFactory is a module, with getProdInstance and getTestInstance being functions in the module (see the sketch at the end of this answer).
Builder Pattern and other Fluent API designs
Function composition
Datastore instances
Clojure: Function that uses an atom (to store and use the single instance)
Haskell: TVar, MVar
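To make the ClientFactory-as-module point concrete, here is a minimal Scala sketch (the client type and configuration values are invented):

// Hypothetical client type; in FP style it simply wraps its configuration.
case class LogServiceClient(awsAccount: String, bucket: String) {
  def log(traffic: String): Unit = println(s"[$bucket] $traffic")
}

// The "factory" is just a module (a Scala object) exposing functions
// that close over configuration the customer should not care about.
object LogServiceClientFactory {
  def getProdInstance(): LogServiceClient =
    LogServiceClient(awsAccount = "prod-account", bucket = "prod-traffic")
  def getTestInstance(): LogServiceClient =
    LogServiceClient(awsAccount = "test-account", bucket = "test-traffic")
}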
I'm not very familiar with many of these Java-style structures, but I'll take a stab at answering:
Stateless Request handlers
These exist in the functional world as well. Functions can fill this role easily, even something as simple as a function from requests to responses. The Play Framework uses something more powerful, specifically a function from the request to an Iteratee (type (RequestHeader) ⇒ Iteratee[Array[Byte], SimpleResult]). The Iteratee is an entity that can progressively consume input (Array[Byte]) as it is received and eventually produce the response (SimpleResult) to give back to the client. The request handler function is stateless and can be reused. The Iteratee is also stateless - the result of feeding it each chunk is actually a new Iteratee, which is then fed the next chunk. (I'm oversimplifying, really; it uses Futures, is entirely non-blocking, and has effective error handling - worth looking at to get a feel for the power and simplicity that functional-style code can bring to this problem.)
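As a minimal illustration of the simple case (with invented Request/Response types, far simpler than Play's):

object Handlers {
  case class Request(path: String, body: String)
  case class Response(status: Int, body: String)

  // The "servlet" is just a function value: stateless and reusable.
  val handle: Request => Response = {
    case Request("/ping", _) => Response(200, "pong")
    case _                   => Response(404, "not found")
  }
}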
Orchestrator classes
I'm not familiar with this pattern, so forgive me if this makes no sense. Having one giant mutable object that gets passed around is an anti-pattern. In functional code, there would be separate datatypes to represent the data that needs to be passed between the stages of the process. These datatypes would be immutable.
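A minimal sketch of that idea (stage types invented for the example), where the "orchestrator" is just function composition over immutable values:

object OrderFlow {
  case class Order(items: List[String], total: BigDecimal)
  case class PaidOrder(order: Order, transactionId: String)
  case class Shipment(paid: PaidOrder, trackingCode: String)

  def validate(order: Order): Either[String, Order] =
    if (order.items.nonEmpty) Right(order) else Left("empty order")

  def pay(order: Order): Either[String, PaidOrder] =
    Right(PaidOrder(order, transactionId = "tx-1")) // stub payment step

  def ship(paid: PaidOrder): Either[String, Shipment] =
    Right(Shipment(paid, trackingCode = "track-1")) // stub shipment step

  // Each step hands an immutable value to the next; no shared mutable state.
  def process(order: Order): Either[String, Shipment] =
    for {
      valid <- validate(order)
      paid  <- pay(valid)
      done  <- ship(paid)
    } yield done
}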
As for things that organize other things, look at Akka and how one actor can monitor other actors underneath it, handling errors or restarting them as needed.
Builder Pattern and other Fluent API designs
Functional programming has these and takes them to their logical conclusion: functional code allows for very powerful DSLs. For an example, check out a parser combinator library, either the one in the Scala standard library or one of the libraries for Haskell.
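As a small illustration of the fluent style (not a parser combinator, just the basic idea), an immutable case class whose chainable methods return updated copies:

object FluentDemo {
  case class HttpConfig(host: String = "localhost",
                        port: Int = 80,
                        useTls: Boolean = false) {
    def withHost(h: String): HttpConfig = copy(host = h)
    def withPort(p: Int): HttpConfig    = copy(port = p)
    def secure: HttpConfig              = copy(useTls = true)
  }

  // Reads like a fluent builder, but every step returns a new immutable value.
  val cfg = HttpConfig().withHost("example.com").withPort(443).secure
}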
ClientFactory pattern and Datastore instances
I don't think this is any different in functional code. Either you have a singleton, or you do proper dependency injection. The Factory pattern is used in functional code as well, though first-class functions make many design patterns too trivial to be worth naming (from the GoF: Factory, Factory Method, Command, and at least some instances of Strategy and Template Method can usually just be functions).
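For instance, a small sketch (names invented) where Strategy and Factory collapse into plain functions:

object Patterns {
  // "Strategy" is just a function parameter...
  def totalPrice(prices: List[Double], discount: Double => Double): Double =
    prices.map(discount).sum

  // ...and a "factory" is just a function returning a value (here, another function).
  def mkDiscount(rate: Double): Double => Double =
    price => price * (1 - rate)

  val sale = totalPrice(List(10.0, 20.0), mkDiscount(0.25))
}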
Have a look at Functional Programming Patterns in Scala and Clojure: http://pragprog.com/book/mbfpp/functional-programming-patterns-in-scala-and-clojure .
It should have exactly what you need.

Functional Programming + Domain-Driven Design

Functional programming promotes immutable classes and referential transparency.
Domain-driven design is composed of Value Objects (immutable) and Entities (mutable).
Should we create immutable Entities instead of mutable ones?
Let's assume a project uses Scala as its main language. How could we write Entities as case classes (immutable, that is) without risking stale state when dealing with concurrency?
What is good practice? Keeping Entities mutable (var fields, etc.) and giving up the great syntax of case classes?
You can effectively use immutable Entities in Scala and avoid the horror of mutable fields and all the bugs that derive from mutable state. Using immutable entities helps you with concurrency; it doesn't make things worse. Your previous mutable state becomes a series of transformations, each of which creates a new reference.
At a certain level of your application, however, you will need to have mutable state, or your application would be useless. The idea is to push it as far up in your program logic as you can. Let's take the example of a bank account, which can change because of interest, ATM withdrawals, or deposits.
You have two valid approaches:
You expose methods that can modify an internal property, and you manage concurrency on those methods (very few of them, in fact)
You make the whole class immutable and surround it with a "manager" that can change the account.
Since the first is pretty straightforward, I will detail the second.
case class BankAccount(balance: Double, code: Int)

class BankAccountRef(private var bankAccount: BankAccount) {
  def withdraw(withdrawal: Double): Double = {
    bankAccount = bankAccount.copy(balance = bankAccount.balance - withdrawal)
    bankAccount.balance
  }
}
This is nice, but gosh, you are still stuck with managing concurrency. Well, Scala offers you a solution for that. The problem here is that if you share your reference to the BankAccountRef with a background job, you will have to synchronize the calls; you would be doing concurrency in a suboptimal way.
The optimal way of doing concurrency: message passing
What if, on the other hand, the different jobs could not invoke methods directly on the BankAccount or a BankAccountRef, but just notified them that some operations need to be performed? Well, then you would have an Actor, the favourite way of doing concurrency in Scala.
import akka.actor.Actor

// Messages are immutable commands; their definitions are implied by the text.
case object BalanceRequest
case class Balance(amount: Double)
case class Withdraw(amount: Double)
case class Deposit(amount: Double)

class BankAccountActor(private var bankAccount: BankAccount) extends Actor {
  def receive = {
    case BalanceRequest =>
      sender ! Balance(bankAccount.balance)
    case Withdraw(amount) =>
      this.bankAccount = bankAccount.copy(balance = bankAccount.balance - amount)
    case Deposit(amount) =>
      this.bankAccount = bankAccount.copy(balance = bankAccount.balance + amount)
  }
}
This solution is extensively described in the Akka documentation: http://doc.akka.io/docs/akka/2.1.0/scala/actors.html . The idea is that you communicate with an actor by sending messages to its mailbox, and those messages are processed in order of receipt. As such, you will never have concurrency flaws when using this model.
This is sort of an opinion question that is less Scala-specific than you think.
If you really want to embrace FP, I would go the immutable route for all your domain objects and never put any behavior in them.
Some people call the above the service pattern, where there is always a separation between behavior and state. This is eschewed in OOP but natural in FP.
It also depends what your domain is. OOP is sometimes easier with stateful things like UIs and video games. For hardcore backend services like websites or REST, I think the service pattern is better.
Two really nice things that I like about immutable objects, besides the often-mentioned concurrency benefits, are that they are much more reliable to cache and that they are great for distributed message passing (e.g. protobuf over AMQP), as the intent is very clear.
Also, in FP people bridge the mutable-to-immutable divide by creating a "language" or "dialogue", aka a DSL (Builders, Monads, Pipes, Arrows, STM, etc.), that enables you to mutate and then transform back into the immutable domain. The services mentioned above use the DSL to make changes. This is more natural than you think (e.g. SQL is an example "dialogue"). OOP, on the other hand, prefers having a mutable domain and leveraging the existing procedural part of the language.

On observing an execution tree of interdependent models in MVC

I've developed on the Yii Framework for a while now (4 months), and so far I have encountered some issues with MVC that I want to share with experienced developers out there. I'll present these issues by listing their levels of complexity.
[Level 1] CR (create, update) form. First off, we have a lot of forms. Each form itself is a model, so each has some validation rules, some attributes, and some operations to perform on the attributes. In a lot of cases, each of these forms does both updating and creating records in the db using a single active record object.
-> So at this level of complexity, a form has to:
when opened,
be able to display the db-friendly data from the db in a human-friendly way
be able to display all the form fields with the attributes of the active record object; adding, removing, or altering columns in the db table has to affect the display of the form
when saving, be able to format the human-friendly data into db-friendly data before persisting it
when validating, be able to perform the basic validations enforced by the active record object, as well as further validations to fulfill business rules
when validation fails, be able to roll back changes made to the attributes as well as changes made to the db, and present the user with their originally entered data
[Level 2] Extended CR form. A form that can create/update records in different tables at once. Not just that: whether the form creates/updates one of its records can sometimes depend on other conditions (more business rules), so a form may sometimes update records in tables A and B but not D, and sometimes in A and D but not B.
-> So at this level of complexity, we see a form has to:
be able to satisfy [Level 1]
be able to conditionally create/update certain records, and conditionally create/update certain columns of certain records
[Level 3] The Tree of Models. The role of a form in an application is, in many ways, a port that lets users interact with your application. To satisfy requests, this port will interact with many other objects which, in turn, interact with many more objects. Some of these objects can be seen as models. An Active Record is a model, but a Mailer can also be a model, and so is a RobotArm. These models use one another to satisfy a user's request. Each model can perform its own operation, and the whole tree has to be able to roll back any changes made in case of error/failure.
Has anyone out there come across or been able to solve these problems?
I've come up with many ideas, like encapsulating model attributes in ModelAttribute objects to track them throughout the client, server, and db tiers.
I've also thought we should give the tree of models an Observer to observe the models and notify them to roll back changes when errors occur. But what if multiple observers can exist? What if a node uses its parent's observer but gives its children other observers?
Engineers, developers, Rails, Yii, Zend, ASP, JavaEE, any MVC guys, please join this discussion for the sake of science.
--Update to teresko's response:---
@teresko I actually intended to incorporate the services into the execution inside a unit of work and have the unit of work not worry about new/updated/deleted objects. Each object inside the unit of work will be responsible for its state and be required to implement its own commit() and rollback(). Once an error occurs, the unit of work will roll back all changes, from the newest registered object to the oldest, since we're not only dealing with a database: we can have mailers, publishers, etc. If, on the other hand, the tree executes successfully, we call commit() from the oldest registered object to the newest. This way the mailer can save the mail and send it on commit.
Using a data mapper is a great idea, but we still have to make sure the columns in the db match the data mapper and the domain object. Moreover, an extended CR form, or a model whose attributes depend on other models, has to match those attributes in terms of validation and datatype. So maybe an attribute could be an object, shipped from model to model? An attribute could also tell whether it has been modified, what validation should be performed on it, and how it can be made human-friendly, application-friendly, and db-friendly. Any update to the db schema would affect this attribute, thereby throwing exceptions that require developers to change the system to accommodate it.
The cause
The root of your problem is misuse of the active record pattern. AR is meant for simple domain entities with only basic CRUD operations. When you start adding large amounts of validation logic and relations between multiple tables, the pattern starts to break apart.
Active record, at its best, is a minor SRP violation for the sake of simplicity. When you start piling on responsibilities, you start to incur severe penalties.
Solution(s)
Level 1:
The best option is to separate the business and storage logic. Most often this is done using domain objects and data mappers:
Domain objects (in other materials also known as business objects or domain model objects) deal with validation and specific business rules, and are completely unaware of how (or even whether) the data in them is stored and retrieved. They also let you have objects that are not directly bound to storage structures (like DB tables).
For example: you might have a LiveReport domain object, which represents current sales data. But it might have no specific table in the DB. Instead, it can be serviced by several mappers that pool data from Memcache, an SQL database, and some external SOAP service. And the LiveReport instance's logic is completely unrelated to storage.
Data mappers know where to put the information from domain objects, but they do not perform any validation or data integrity checks. Though they may be able to handle exceptions that come from low-level storage abstractions, like violation of a UNIQUE constraint.
Data mappers can also perform transactions, but, if a single transaction needs to span multiple domain objects, you should be looking to add a Unit of Work (more about it below).
In more advanced/complicated cases, data mappers can interact with and utilize DAOs and query builders. But that is more for situations where you aim to create ORM-like functionality.
Each domain object can have multiple mappers, but each mapper should work only with a specific class of domain objects (or a subclass of one, if your code adheres to the LSP). You should also recognize that a domain object and a collection of domain objects are two separate things and should have separate mappers.
Also, each domain object can contain other domain objects, just as each data mapper can contain other mappers. But in the case of mappers it is much more a matter of preference (I dislike it vehemently).
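To illustrate the separation (the question is PHP/Yii, but the pattern is language-agnostic; this sketch uses Scala with invented names):

// Domain object: business rules only, no storage knowledge.
case class Account(id: Long, email: String) {
  def isValid: Boolean = email.contains("@")
}

// Data mapper: storage only, no business rules.
trait AccountMapper {
  def fetch(id: Long): Option[Account]
  def store(account: Account): Unit
}

// A trivial in-memory mapper; a real one would talk to SQL, Memcache, etc.
class InMemoryAccountMapper extends AccountMapper {
  private var rows = Map.empty[Long, Account]
  def fetch(id: Long): Option[Account] = rows.get(id)
  def store(account: Account): Unit = rows += account.id -> account
}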
Another improvement that could alleviate your current mess would be to prevent application logic from leaking into the presentation layer (most often, the controller). Instead, you would largely benefit from using services that contain the interaction between mappers and domain objects, thus creating a public-ish API for your model layer.
Basically, with services you encapsulate complete segments of your model that can (in the real world, with minor effort and adjustments) be reused in different applications. For example: Recognition, Mailer, or DocumentLibrary would all be services.
Also, I should note that not all services have to contain domain objects and mappers. A quite good example would be the previously mentioned Mailer, which could be used either directly by a controller or (more likely) by another service.
Level 2:
If you stop using the active record pattern, this becomes a quite simple problem: you need to make sure that you save only the data from those domain objects which have actually changed since the last save.
As I see it, there are two ways to approach this:
Quick'n'Dirty
If something changed, just update it all ...
The way that I prefer is to introduce a checksum variable in the domain object, which holds a hash of all the domain object's variables (of course, with the exception of the checksum itself).
Each time the mapper is asked to save a domain object, it calls a method isDirty() on that domain object, which checks whether the data has changed. Then the mapper can act accordingly. This, with some adjustments, can also be used for object graphs (if they are not too extensive, in which case you might need to refactor them anyway).
Also, if your domain object actually gets mapped to several tables (or even different forms of storage), it might be reasonable to have several checksums, one for each set of variables. Since mappers are already written for specific classes of domain objects, this would not strengthen the existing coupling.
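A minimal sketch of the checksum idea (again in Scala, with invented names):

case class Profile(name: String, email: String)

// Wrapper that remembers the checksum of the last saved state.
class TrackedProfile(initial: Profile) {
  private var current: Profile = initial
  private var savedChecksum: Int = checksum(initial)

  private def checksum(p: Profile): Int = (p.name, p.email).hashCode

  def update(p: Profile): Unit = current = p
  def isDirty: Boolean = checksum(current) != savedChecksum

  // The mapper calls this after a successful save.
  def markClean(): Unit = savedChecksum = checksum(current)
}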
For PHP you will find some code examples in this answer.
Note: if your implementation uses DAOs to isolate domain objects from data mappers, then the logic of checksum-based verification would be moved to the DAO.
Unit of Work
This is the "industry standard" for your problem, and there is a whole chapter (the 11th) dealing with it in the PoEAA book.
The basic idea is this: you create an instance that acts like a controller (in the classical, not the MVC, sense of the word) between your domain objects and data mappers.
Each time you alter or remove a domain object, you inform the Unit of Work about it. Each time you load data into a domain object, you ask the Unit of Work to perform that task.
There are two ways to tell the Unit of Work about the changes:
caller registration: the object that performs the change also informs the Unit of Work
object registration: the changed object (usually from a setter) informs the Unit of Work that it was altered
When all the interaction with the domain objects has been completed, you call the commit() method on the Unit of Work. It then finds the necessary mappers and stores all the altered domain objects.
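A minimal caller-registration sketch (reusing the invented Account/AccountMapper types from the sketch above):

class UnitOfWork(mapper: AccountMapper) {
  private var newObjects = List.empty[Account]
  private var dirtyObjects = List.empty[Account]

  def registerNew(a: Account): Unit = newObjects ::= a
  def registerDirty(a: Account): Unit = dirtyObjects ::= a

  // Find the right mapper(s) and persist everything in one go;
  // a real implementation would wrap this in a transaction with rollback.
  def commit(): Unit = {
    (newObjects.reverse ++ dirtyObjects.reverse).foreach(mapper.store)
    newObjects = Nil
    dirtyObjects = Nil
  }
}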
Level 3:
At this stage of complexity, the only viable implementation is to use a Unit of Work. It would also be responsible for initiating and committing the SQL transactions (if you are using an SQL database), with the appropriate rollback clauses.
P.S.
Read the "Patterns of Enterprise Application Architecture" book. It's what you desperately need. It will also correct the misconceptions about MVC and MVC-inspired design patterns that you have acquired by using Rails-like frameworks.