DDD abstraction over infrastructure code - scala

I am creating a repository in my project that will be responsible for all storage operations for a User entity. I will use MongoDB as the database and ReactiveMongo as the client. My problem is about types.
trait UserRepository {
  def save(user: User): ?
}
trait MongoUserRepository extends UserRepository {
  def save(user: User): Future[WriteResult] = {
    collection.insert(user)
  }
}
How should I model, in my domain, the WriteResult that comes from ReactiveMongo? I do not want it to leak into my domain. Is there an existing pattern or good practice?

The usual practice is that the domain would define the UserRepository trait as a service provider interface (spi) that the persistence infrastructure would need to support. Fundamentally, it's a way of expressing the usage requirements that the model imposes on persistence.
Using the language of Command Query Separation, save is a command: it's an operation that changes the state of the repository. So the implementation of the trait should conform to your local coding standard for implementing a command.
Greg Young on (generic) repositories:
What exactly was the intent of the repository pattern in the first place? Looking back to [DDD, Evans] one will see that it is to represent a series of objects as if they were a collection in memory so that the domain can be freed of persistence concerns. In other words the goal is to put collection semantics on the objects in persistence.
So you could also look to your collections library for inspiration.
But generally speaking, the most common choice would look like
trait UserRepository {
  def save(user: User): Unit
}
So that is the contract your specific implementations would be expected to satisfy.
In MongoUserRepository, you adapt the implementation of your persistence solution to satisfy the contract. In this case that would mean unpacking the Future, inspecting the WriteResult for errors, and throwing an exception if the write was unsuccessful.
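The adaptation described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the driver's API: `FakeWriteResult`, `RepositoryException`, `BlockingUserRepository` and the `insert` function are hypothetical stand-ins for whatever the persistence client provides, and the five second timeout is arbitrary.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Domain side: no driver types appear here.
final case class User(id: String, name: String)

trait UserRepository {
  def save(user: User): Unit
}

// Hypothetical stand-in for the driver's write result (not a real driver type).
final case class FakeWriteResult(ok: Boolean, message: Option[String])

// Hypothetical exception type raised when a write does not succeed.
final class RepositoryException(msg: String) extends RuntimeException(msg)

// Infrastructure side: adapts an asynchronous insert to the synchronous contract.
final class BlockingUserRepository(
    insert: User => Future[FakeWriteResult],
    timeout: FiniteDuration = 5.seconds
) extends UserRepository {
  def save(user: User): Unit = {
    // Unpack the Future by waiting for the write to complete.
    val result = Await.result(insert(user), timeout)
    // Inspect the result and signal failure with an exception.
    if (!result.ok)
      throw new RepositoryException(result.message.getOrElse("write failed"))
  }
}
```

The domain only ever sees `UserRepository`; the blocking and the error translation stay inside the adapter.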
In response to the comment "With save(user: User): Unit you implicitly put a requirement on your clients to watch for repository failure (for example, in case of db failure)":
It's the other way around: the repository is a service provider interface, so this design doesn't constrain the clients, but the providers. In the lingo of hexagonal architecture, I'm defining a secondary port and constraining the secondary adapter to conform to the contract of the port.
The motivation is exactly the one you describe: the repository consumer is supposed to be isolated from the protocol required to interact with the selected persistence solution. The domain model sits in the middle of the business universe, and the adapter insulates the business universe from reality.
Evans Chapter 6 raises the management challenge of "preventing the model from getting swamped by the complexity of managing the (domain object) life cycle". Repositories provide "the means of finding and retrieving persistent objects while encapsulating the immense infrastructure involved."
The repository is a firewall.
What we are addressing here is separation of concerns; the domain model describes your business logic. We want to be able to see the business case clearly, and that's not going to be possible if we have to explicitly manage what happens in the event that a mutable, in memory, collection fails catastrophically on modification. The pragmatic answer is to nope out and let the application handle it.
That said... I deliberately hedged above when I wrote "conform to your local coding standard". If your local guidelines use railway oriented programming, or message driven, then by all means line it up. There's absolutely no reason why the domain model should care whether storage is synchronous or asynchronous, local or remote, etc.
But if your domain model starts to fill up with match expressions that describe implementation concerns, I'd say you've lost the plot somewhere.
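If the local standard is railway oriented, the same port could return a value instead of throwing. The sketch below is one possible shape, assuming a domain-owned error type; `SaveError`, `DuplicateUser`, `StorageUnavailable`, `FakeWriteResult` and `TranslatingUserRepository` are hypothetical names, and the driver result is a stand-in rather than a real driver type.

```scala
import scala.concurrent.{ExecutionContext, Future}

final case class User(id: String, name: String)

// Domain-owned error vocabulary (hypothetical names).
sealed trait SaveError
case object DuplicateUser extends SaveError
final case class StorageUnavailable(reason: String) extends SaveError

trait UserRepository {
  def save(user: User): Future[Either[SaveError, Unit]]
}

// Hypothetical stand-in for the driver's write result.
final case class FakeWriteResult(ok: Boolean, duplicateKey: Boolean)

// The match on driver details lives in the adapter, not in the domain model.
final class TranslatingUserRepository(insert: User => Future[FakeWriteResult])(
    implicit ec: ExecutionContext
) extends UserRepository {
  def save(user: User): Future[Either[SaveError, Unit]] =
    insert(user)
      .map[Either[SaveError, Unit]] {
        case FakeWriteResult(true, _)     => Right(())
        case FakeWriteResult(false, true) => Left(DuplicateUser)
        case FakeWriteResult(false, _)    => Left(StorageUnavailable("write rejected"))
      }
      // A failed Future (for example, a lost connection) also becomes a domain error.
      .recover { case e: Exception => Left(StorageUnavailable(String.valueOf(e.getMessage))) }
}
```

Callers then compose on `Either` without ever naming a driver type.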

When I was implementing the same repository, I ended up extracting the WriteResult value I was most interested in. In my case I ended up with the following signature:
trait UserRepository {
  def save(user: User): Future[Option[String]]
}
which returns either some error message or nothing. As a result, the implementation looks like this:
trait MongoUserRepository extends UserRepository {
  def save(user: User): Future[Option[String]] = {
    collection.insert(user).map(_.errmsg)
  }
}
I ended up with this implementation because it does not lose the error message when the write fails.
An alternative is to map the insertion result to a Boolean:
trait UserRepository {
  def save(user: User): Future[Boolean]
}
trait MongoUserRepository extends UserRepository {
  def save(user: User): Future[Boolean] = {
    collection.insert(user).map(_.ok)
  }
}
But in this case you lose the error message. Sometimes that is fine; it depends on your exact case.
UPDATE: The answer above is valid for version 0.11. In 0.12 the errmsg method was removed from WriteResult. Instead you can use writeErrors and, if the Seq is not empty, extract the errmsg from each WriteError.
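The extraction mentioned in the update could look roughly like this. The `writeErrors` and `errmsg` names are taken from the update above; `FakeWriteError`, `FakeWriteResult` and `WriteErrors.errorMessage` are hypothetical stand-ins so the sketch runs without the driver, and joining the messages with "; " is an arbitrary choice.

```scala
// Hypothetical stand-ins that mirror the shape described in the update.
final case class FakeWriteError(index: Int, code: Int, errmsg: String)
final case class FakeWriteResult(ok: Boolean, writeErrors: Seq[FakeWriteError])

object WriteErrors {
  // None when nothing went wrong, otherwise all error messages joined together.
  def errorMessage(result: FakeWriteResult): Option[String] =
    if (result.writeErrors.isEmpty) None
    else Some(result.writeErrors.map(_.errmsg).mkString("; "))
}
```

With the real driver type in place of the stand-in, the repository could keep its `Future[Option[String]]` signature and map the insert result through a function like this.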
Hope it helps, sir!

Related

Better dao design with Anorm

I'm working on a code base that has lots of singleton DAOs, each of which gets a new connection from the pool in every method with DB.withConnection and then executes a block.
DAO methods use Anorm parsers to parse the result set. In some cases a DAO method runs other DAO methods inside its Anorm parsers to fetch nested related items for the business model.
Let's assume you have a data structure like this:
User -> Posts -> Comments for each posts
The DAOs work like this:
UserDao.getUser
PostDao.getUserPosts
CommentsDao.getPostComments
Because each DAO method calls DB.withConnection, multiple connections are used for simple operations.
I want to use the same connection. That can be done by passing an implicit connection through each DAO method.
But then I need to manage connection allocation in an upper layer. Right now the DAOs are accessed directly from REST actions; there is no service layer sitting between the API and the DAOs. It doesn't feel right to have the connection floating around in the API layer.
So it would probably be better to have services like UserService that call the DAOs and handle connections and transactions.
Another requirement makes me uncomfortable: most of the DAO methods also need to be callable individually.
For example, we have an API that requests comments only, through
CommentsDao.getPostComments
This means I basically need to implement services for all DAOs, wrapping each DAO method in DB.withConnection, which seems like a lot of overhead (~30 DAOs, ~10 methods).
Another limitation is that the inner DAO calls are made inside Anorm parsers, which I think is a misuse of the library.
When we change each DAO method definition to pass an implicit connection around, the parsers fail to compile.
Because an Anorm parser is a ResultSetParser[T], there is no way to pass a connection into it by default. An example parser looks like this:
val apiProviderParser = get[Int]("id") ~
  get[String]("product_name") ~
  get[String]("description") ~
  get[String]("icon_url") map {
    case id ~ productName ~ desc ~ iconUrl => {
      // Inner dao call
      // Connection needed now
      val params = getApiProviderParams(id) //(implicit connection)
      new ApiProviderTemplate(id, productName, desc, iconUrl, params)
    }
  }
Maybe a custom ResultSetParser[T] with a connection in scope could work, but I'm not sure it's the correct way to solve this problem.
I looked at the cake pattern and at DAO design discussions, but couldn't decide how to proceed or what a pragmatic, good solution to this problem would be.
Any help appreciated.
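The implicit-connection idea from the question can be sketched roughly as below. It is only one possible arrangement under stated assumptions: `Conn` and `Db.withConnection` are hypothetical stand-ins for the real connection type and for DB.withConnection, the DAO bodies return canned data, and nested items are loaded after the flat rows have been parsed rather than inside a parser.

```scala
// Hypothetical stand-in for a JDBC connection; it only carries an id.
final class Conn(val id: Int)

object Db {
  private var opened = 0
  def openedCount: Int = opened
  // Mimics the shape of DB.withConnection: open, run the block, hand back the result.
  def withConnection[A](block: Conn => A): A = {
    opened += 1
    block(new Conn(opened))
  }
}

final case class Comment(text: String)
final case class Post(id: Int, comments: Seq[Comment])
final case class User(id: Int, posts: Seq[Post])

object CommentsDao {
  def getPostComments(postId: Int)(implicit c: Conn): Seq[Comment] =
    Seq(Comment(s"comment on post $postId via connection ${c.id}"))
}

object PostDao {
  // Parse the flat rows first, then load nested items with the same connection.
  def getUserPosts(userId: Int)(implicit c: Conn): Seq[Post] = {
    val postIds = Seq(1, 2) // pretend these came from a row parser
    postIds.map(id => Post(id, CommentsDao.getPostComments(id)))
  }
}

object UserDao {
  def getUser(userId: Int)(implicit c: Conn): User =
    User(userId, PostDao.getUserPosts(userId))
}

object Api {
  // The edge of the application decides the connection scope, once per request.
  def userWithPosts(userId: Int): User =
    Db.withConnection { implicit c => UserDao.getUser(userId) }

  // DAO methods stay individually callable with the same one-line wrapper.
  def commentsOnly(postId: Int): Seq[Comment] =
    Db.withConnection { implicit c => CommentsDao.getPostComments(postId) }
}
```

In this arrangement the whole User -> Posts -> Comments graph is loaded on a single connection, and a comments-only request still needs just one wrapper call.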

scala api design: can i avoid either an unsafe cast or generics

I am writing a library that provides a distributed algorithm. The idea being that existing applications can add the library to use the algorithm. The algorithm is in a library module and it abstracts the actual transmission of data over the network behind a trait. An application that uses the algorithm has to provide the actual network transport code. In code it looks something like the following:
// library is really a separate project not a single object
object Library {
  // handle to a remote server
  trait RemoteProcess
  // concrete server needs to know how to actually send to a real remote process
  trait Server {
    def send(client: RemoteProcess, msg: String): Unit
  }
}
// the application uses the library and provides the concrete transport
object AkkaDemoApplication {
  // concrete ref is a wrapper around an actor ref in the demo app
  case class ConcreteRemoteProcess(ref: akka.actor.ActorRef) extends Library.RemoteProcess
  class AkkaServer extends Library.Server {
    // **WARNING** this won't compile; it's here to illustrate my question
    override def send(client: ConcreteRemoteProcess, msg: String): Unit = client.ref ! msg
  }
}
A couple of options I have considered:
1. Have the signature of the AkkaServer method override the library trait method, then perform an unsafe cast to ConcreteRemoteProcess. Yuk!
2. Have the signature of the AkkaServer method override the library trait method, then pattern match on the RemoteProcess argument to get a ConcreteRemoteProcess. This is no better than an unsafe cast in terms of blowing up at runtime if the wrong thing is passed.
3. Make the library server generic in terms of the RemoteProcess.
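For comparison, the pattern-match option might be sketched like this. To keep the sketch runnable without Akka, `ConcreteRemoteProcess` wraps a plain `String => Unit` function as a hypothetical stand-in for an actor ref, and `DemoApplication` and `DemoServer` are illustrative names.

```scala
object Library {
  // handle to a remote server
  trait RemoteProcess
  trait Server {
    def send(client: RemoteProcess, msg: String): Unit
  }
}

object DemoApplication {
  // Hypothetical stand-in for the wrapper around an actor ref.
  final case class ConcreteRemoteProcess(deliver: String => Unit) extends Library.RemoteProcess

  class DemoServer extends Library.Server {
    // Keeps the trait's signature and narrows the argument at runtime.
    override def send(client: Library.RemoteProcess, msg: String): Unit = client match {
      case ConcreteRemoteProcess(deliver) => deliver(msg)
      case other =>
        // Fails fast if some other RemoteProcess implementation is passed in.
        throw new IllegalArgumentException(s"Unsupported RemoteProcess: ${other.getClass.getName}")
    }
  }
}
```

The library stays free of type parameters, at the cost of moving the check from compile time to the first send.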
An example of option 3 looks like:
import akka.actor.ActorRef

object Library {
  trait Server[RemoteProcess] {
    def send(client: RemoteProcess, msg: String): Unit
  }
}
object Application {
  class AkkaServer extends Library.Server[ActorRef] {
    override def send(client: ActorRef, msg: String): Unit = client ! msg
  }
}
I tried option 3 and it worked, but the generic type ended up being stamped on just about every type throughout the entire library module. There were then a lot of covariance and contravariance hassles to get the algorithmic code to compile. Simply to get compile-time certainty at the one integration point, the cognitive overhead was very large. Visually the library code is dominated by the generic signatures, as though understanding them is critical to understanding the library, when in fact they are a total distraction from the library logic.
So using the generic works and gives me compile-time certainty, but now I wish I had gone with option 2 (the pattern match) with the excuse "it would fail fast at startup if someone got it wrong, let's keep it simple".
Am I missing some Scala feature or idiom here to get compile time certainty without the cognitive overhead of a "high touch" generic that all the library code touches but ignores?
Edit: I have considered that perhaps my code library is badly factored, such that a refactor could move the generic to the boundaries. Yet the library has already been refactored for testability, and that breakup into testable responsibilities is part of the reason the generic gets smeared around the codebase. Hence my question is: in general, is there another technique I don't know about that avoids a generic when providing a concrete implementation of an abstract API?
I think you are coupling your algorithm and Akka too closely. Furthermore, I assume the server sends data to the remote client, which performs some operation and sends back the result.
Answer
Why not
import scala.concurrent.Future

object Library {
  // handle to a remote server
  trait RemoteProcessInput
  trait RemoteProcessResult
  // concrete server needs to know how to actually send to a real remote process and how to deal with the result
  trait Server {
    def handle(clientData: RemoteProcessInput): Future[RemoteProcessResult]
  }
}
A concrete implementation provides the implementation with Akka
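import akka.actor.{ActorRef, ActorSystem}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Future
import scala.concurrent.duration._

object Application {
  class AkkaServerImpl(system: ActorSystem) extends Library.Server {
    // the ask pattern needs a timeout; 5 seconds is an arbitrary choice
    private implicit val timeout: Timeout = Timeout(5.seconds)

    override def handle(clientData: Library.RemoteProcessInput): Future[Library.RemoteProcessResult] = {
      // send data to client and expect result
      // you can distinguish target and msg from the concrete input
      val ref: ActorRef = ??? // (resolve client actor)
      val msg: Any = ???      // (create your message based on concrete impl)
      // using ask pattern here; ask yields Future[Any], so narrow it to the result type
      // alternatively have an actor living on server side that sends msgs and receives the clients results, triggered by handle method
      (ref ? msg).mapTo[Library.RemoteProcessResult]
    }
  }
}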

NOT using repository pattern, use the ORM as is (EF)

I have always used the Repository pattern, but for my latest project I wanted to see if I could perfect my use of it and my implementation of “Unit Of Work”. The more I dug, the more I started asking myself the question: "Do I really need it?"
Now this all starts with a couple of comments on Stackoverflow that trace back to Ayende Rahien's blog, and two posts in particular:
repository-is-the-new-singleton
ask-ayende-life-without-repositories-are-they-worth-living
This could probably be talked about forever and ever, and it depends on the application. What I'd like to know:
would this approach be suited for an Entity Framework project?
using this approach, does the business logic still go in a service layer, or in extension methods (as explained below; I know, the extension method is using an NHib session)?
That's easily done using extension methods. Clean, simple and reusable.
public static IEnumerable<T> GetAll<T>(
    this ISession instance, Expression<Func<T, bool>> where) where T : class
{
    return instance.QueryOver<T>().Where(where).List();
}
Using this approach and Ninject as DI, do I need to make the Context an interface and inject that in my controllers?
I've gone down many paths and created many implementations of repositories on different projects and... I've thrown the towel in and given up on it, here's why.
Coding for the exception
Do you code for the 1% chance your database is going to change from one technology to another? If you're thinking about your business's future state and say yes that's a possibility then a) they must have a lot of money to afford to do a migration to another DB technology or b) you're choosing a DB technology for fun or c) something has gone horribly wrong with the first technology you decided to use.
Why throw away the rich LINQ syntax?
LINQ and EF were developed so you could do neat stuff with them to read and traverse object graphs. Creating and maintaining a repository that can give you the same flexibility is a monstrous task. In my experience, any time I've created a repository I've ALWAYS had business logic leak into the repository layer to make queries more performant and/or reduce the number of hits to the database.
I don't want to create a method for every single permutation of a query that I have to write. I might as well write stored procedures. I don't want GetOrder, GetOrderWithOrderItem, GetOrderWithOrderItemWithOrderActivity, GetOrderByUserId, and so on... I just want to get the main entity and traverse and include the object graph as I so please.
Most examples of repositories are bullshit
Unless you are developing something REALLY bare-bones like a blog or something your queries are never going to be as simple as 90% of the examples you find on the internet surrounding the repository pattern. I cannot stress this enough! This is something that one has to crawl through the mud to figure out. There will always be that one query that breaks your perfectly thought out repository/solution that you've created, and it's not until that point where you second guess yourself and the technical debt/erosion begins.
Don't unit test me bro
But what about unit testing if I don't have a repository? How will I mock? Simple, you don't. Let's look at it from both angles:
No repository - You can mock the DbContext using an IDbContext or some other tricks but then you're really unit testing LINQ to Objects and not LINQ to Entities because the query is determined at runtime... OK so that's not good! So now it's up to the integration test to cover this.
With repository - You can now mock your repositories and unit test the layer(s) in between. Great, right? Well, not really... In the cases above where you have to leak logic into the repository layer to make queries more performant and/or reduce hits to the database, how can your unit tests cover that? It's now in the repo layer and you don't want to test IQueryable<T>, right? Also, let's be honest: your unit tests aren't going to cover the queries that have a 20-line .Where() clause, .Include() a bunch of relationships and hit the database again to do all this other stuff, blah, blah, blah, because the query is generated at runtime anyway. And since you created a repository to keep the upper layers persistence ignorant, if you now want to change your database technology, sorry, your unit tests are definitely not going to guarantee the same results at runtime; back to integration tests. So the whole point of the repository seems weird.
2 cents
We already lose a lot of functionality and syntax when using EF over plain stored procedures (bulk inserts, bulk deletes, CTEs, etc.) but I also code in C# so I don't have to type binary. We use EF so we can have the possibility of using different providers and to work with object graphs in a nice related way amongst many things. Certain abstractions are useful and some are not.
The repository pattern is an abstraction. Its purpose is to reduce complexity and make the rest of the code persistence ignorant. As a bonus it allows you to write unit tests instead of integration tests.
The problem is that many developers fail to understand the pattern's purpose and create repositories which leak persistence-specific information up to the caller (typically by exposing IQueryable<T>). By doing so they get no benefit over using the OR/M directly.
Update to address another answer
Coding for the exception
Using repositories is not about being able to switch persistence technology (i.e. changing database or using a webservice etc instead). It's about separating business logic from persistence to reduce complexity and coupling.
Unit tests vs integration tests
You do not write unit tests for repositories. Period.
But by introducing repositories (or any other abstraction layer between persistence and business logic) you are able to write unit tests for the business logic, i.e. you do not have to worry about your tests failing due to an incorrectly configured database.
As for the queries: if you use LINQ you also have to make sure that your queries work, just as you have to with repositories, and that is done using integration tests.
The difference is that if you have not mixed your business logic with LINQ statements, you can be 100% sure that it's your persistence code that is failing and not something else.
If you analyze your tests you will also see that they are much cleaner if you have not mixed concerns (i.e. LINQ + business logic).
Repository examples
Most examples are bullshit. That is very true. However, if you google any design pattern you will find a lot of crappy examples. That is no reason to avoid using a pattern.
Building a correct repository implementation is very easy. In fact, you only have to follow a single rule:
Do not add anything into the repository class until the very moment that you need it
A lot of coders are lazy and try to make a generic repository, using a base class with a lot of methods that they might need. YAGNI. You write the repository class once and keep it as long as the application lives (which can be years). Why fuck it up by being lazy? Keep it clean, without any base class inheritance. It will make it much easier to read and maintain.
(The above statement is a guideline and not a law. A base class can very well be motivated. Just think before you add it, so that you add it for the right reasons)
Old stuff
Conclusion:
If you don't mind having LINQ statements in your business code nor care about unit tests I see no reason to not use Entity Framework directly.
Update
I've blogged both about the repository pattern and what "abstraction" really means: http://blog.gauffin.org/2013/01/repository-pattern-done-right/
Update 2
For a single entity type with 20+ fields, how will you design a query method to support any combination? You don't want to limit search to name only; what about searching with navigation properties, such as listing all orders with an item with a specific price code, a search across 3 levels of navigation properties? The whole reason IQueryable was invented was to be able to compose any combination of search against the database. Everything looks great in theory, but users' needs win over theory.
Again: An entity with 20+ fields is incorrectly modeled. It's a GOD entity. Break it down.
I'm not arguing that IQueryable wasn't made for querying. I'm saying that it's not right for an abstraction layer like the Repository pattern, since it's leaky. There is no 100% complete LINQ To Sql provider (like EF).
They all have implementation-specific details, like how to use eager/lazy loading or how to do SQL "IN" statements. Exposing IQueryable in the repository forces the user to know all those things. Thus the whole attempt to abstract away the data source is a complete failure. You just add complexity without getting any benefit over using the OR/M directly.
Either implement Repository pattern correctly or just don't use it at all.
(If you really want to handle big entities you can combine the Repository pattern with the Specification pattern. That gives you a complete abstraction which also is testable.)
IMO both the Repository abstraction and the UnitOfWork abstraction have a very valuable place in any meaningful development. People will argue about implementation details, but just as there are many ways to skin a cat, there are many ways to implement an abstraction.
Your question is specifically to use or not to use and why.
As you have no doubt realised you already have both these patterns built into Entity Framework, DbContext is the UnitOfWork and DbSet is the Repository. You don’t generally need to unit test the UnitOfWork or Repository themselves as they are simply facilitating between your classes and the underlying data access implementations. What you will find yourself needing to do, again and again, is mock these two abstractions when unit testing the logic of your services.
You can mock, fake or whatever with external libraries adding layers of code dependencies (that you don’t control) between the logic doing the testing and the logic being tested.
So a minor point is that having your own abstraction for UnitOfWork and Repository gives you maximum control and flexibility when mocking your unit tests.
All very well, but for me, the real power of these abstractions is they provide a simple way to apply Aspect Oriented Programming techniques and adhere to the SOLID principles.
So you have your IRepository:
public interface IRepository<T>
    where T : class
{
    T Add(T entity);
    void Delete(T entity);
    IQueryable<T> AsQueryable();
}
And its implementation:
public class Repository<T> : IRepository<T>
    where T : class
{
    private readonly IDbSet<T> _dbSet;

    public Repository(PPContext context)
    {
        _dbSet = context.Set<T>();
    }

    public T Add(T entity)
    {
        return _dbSet.Add(entity);
    }

    public void Delete(T entity)
    {
        _dbSet.Remove(entity);
    }

    public IQueryable<T> AsQueryable()
    {
        return _dbSet.AsQueryable();
    }
}
Nothing out of the ordinary so far but now we want to add some logging - easy with a logging Decorator.
public class RepositoryLoggerDecorator<T> : IRepository<T>
    where T : class
{
    Logger logger = LogManager.GetCurrentClassLogger();
    private readonly IRepository<T> _decorated;

    public RepositoryLoggerDecorator(IRepository<T> decorated)
    {
        _decorated = decorated;
    }

    public T Add(T entity)
    {
        logger.Log(LogLevel.Debug, () => DateTime.Now.ToLongTimeString());
        T added = _decorated.Add(entity);
        logger.Log(LogLevel.Debug, () => DateTime.Now.ToLongTimeString());
        return added;
    }

    public void Delete(T entity)
    {
        logger.Log(LogLevel.Debug, () => DateTime.Now.ToLongTimeString());
        _decorated.Delete(entity);
        logger.Log(LogLevel.Debug, () => DateTime.Now.ToLongTimeString());
    }

    public IQueryable<T> AsQueryable()
    {
        return _decorated.AsQueryable();
    }
}
All done and with no change to our existing code. There are numerous other cross cutting concerns we can add, such as exception handling, data caching, data validation or whatever and throughout our design and build process the most valuable thing we have that enables us to add simple features without changing any of our existing code is our IRepository abstraction.
Now, many times I have seen this question on StackOverflow: “how do you make Entity Framework work in a multi tenant environment?”.
https://stackoverflow.com/search?q=%5Bentity-framework%5D+multi+tenant
If you have a Repository abstraction then the answer is “it’s easy, add a decorator”:
public class RepositoryTennantFilterDecorator<T> : IRepository<T>
    where T : class
{
    //public for Unit Test example
    public readonly IRepository<T> _decorated;

    public RepositoryTennantFilterDecorator(IRepository<T> decorated)
    {
        _decorated = decorated;
    }

    public T Add(T entity)
    {
        return _decorated.Add(entity);
    }

    public void Delete(T entity)
    {
        _decorated.Delete(entity);
    }

    public IQueryable<T> AsQueryable()
    {
        return _decorated.AsQueryable().Where(o => true);
    }
}
IMO you should always place a simple abstraction over any 3rd party component that will be referenced in more than a handful of places. From this perspective an ORM is the perfect candidate as it is referenced in so much of our code.
The answer that normally comes to mind when someone says “why should I have an abstraction (e.g. Repository) over this or that 3rd party library” is “why wouldn’t you?”
P.S. Decorators are extremely simple to apply using an IoC Container, such as SimpleInjector.
[TestFixture]
public class IRepositoryTesting
{
    [Test]
    public void IRepository_ContainerRegisteredWithTwoDecorators_ReturnsDecoratedRepository()
    {
        Container container = new Container();
        container.RegisterLifetimeScope<PPContext>();
        container.RegisterOpenGeneric(
            typeof(IRepository<>),
            typeof(Repository<>));
        container.RegisterDecorator(
            typeof(IRepository<>),
            typeof(RepositoryLoggerDecorator<>));
        container.RegisterDecorator(
            typeof(IRepository<>),
            typeof(RepositoryTennantFilterDecorator<>));
        container.Verify();

        using (container.BeginLifetimeScope())
        {
            var result = container.GetInstance<IRepository<Image>>();
            Assert.That(
                result,
                Is.InstanceOf(typeof(RepositoryTennantFilterDecorator<Image>)));
            Assert.That(
                (result as RepositoryTennantFilterDecorator<Image>)._decorated,
                Is.InstanceOf(typeof(RepositoryLoggerDecorator<Image>)));
        }
    }
}
First of all, as some answers suggest, EF itself is a repository pattern; there is no need to create a further abstraction just to call it a repository.
Mockable Repository for Unit Tests, do we really need it?
We let EF communicate with a test DB in unit tests so that our business logic is tested straight against a SQL test DB. I don't see any benefit in mocking a repository at all. What is really wrong with running unit tests against a test database? As it is, bulk operations are not possible and we end up writing raw SQL. SQLite in memory is a perfect candidate for running unit tests against a real database.
Unnecessary Abstraction
Do you want to create a repository just so that in the future you can easily replace EF with NHibernate or anything else? Sounds like a great plan, but is it really cost effective?
Linq kills unit tests?
I would like to see an example of how it can kill them.
Dependency Injection, IoC
Wow, these are great words, and sure, they look great in theory, but sometimes you have to choose a trade-off between great design and a great solution. We did use all of that, and we ended up throwing it all in the trash and choosing a different approach. Size vs. speed (size of code and speed of development) matters hugely in real life. Users need flexibility; they don't care whether your code is great in terms of DI or IoC.
Unless you are building Visual Studio
All these great designs are needed if you are building a complex program like Visual Studio or Eclipse, which is developed by many people and needs to be highly customizable. These development patterns came into the picture after the years of development those IDEs have gone through, and they have evolved to a place where such design patterns matter a great deal. But if you are building a simple web-based payroll or a simple business app, it is better to evolve your development over time than to spend time building for a million users when it will only be deployed for hundreds.
Repository as Filtered View - ISecureRepository
On the other side, a repository should be a filtered view of EF that guards access to data by applying the necessary filter based on the current user/role.
But doing so complicates the repository even more, as it ends up as a huge code base to maintain. People end up creating different repositories for different user types or combinations of entity types. Not only that, we also end up with lots of DTOs.
The following is an example implementation of a filtered repository without creating a whole set of classes and methods. It may not answer the question directly, but it can be useful in deriving an answer.
Disclaimer: I am the author of Entity REST SDK.
http://entityrestsdk.codeplex.com
Keeping the above in mind, we developed an SDK that creates a repository of filtered views based on a SecurityContext, which holds filters for CRUD operations. Only two kinds of rules are needed to simplify any complex operation: the first is access to an entity, and the other is a read/write rule for a property.
The advantage is that you do not rewrite business logic or repositories for different user types; you simply block or grant them access.
public class DefaultSecurityContext : BaseSecurityContext {
    public static DefaultSecurityContext Instance = new DefaultSecurityContext();

    // UserID for currently logged in User
    public static long UserID {
        get {
            return long.Parse(HttpContext.Current.User.Identity.Name);
        }
    }

    public DefaultSecurityContext() {
    }

    protected override void OnCreate() {
        // User can access his own Account only
        var acc = CreateRules<Account>();
        acc.SetRead(y => x => x.AccountID == UserID);
        acc.SetWrite(y => x => x.AccountID == UserID);
        // User can only modify AccountName and EmailAddress fields
        acc.SetProperties(SecurityRules.ReadWrite,
            x => x.AccountName,
            x => x.EmailAddress);
        // User can read AccountType field
        acc.SetProperties<Account>(SecurityRules.Read,
            x => x.AccountType);
        // User can access his own Orders only
        var order = CreateRules<Order>();
        order.SetRead(y => x => x.CustomerID == UserID);
        // User can modify Order only if OrderStatus is not complete
        order.SetWrite(y => x => x.CustomerID == UserID
            && x.OrderStatus != "Complete");
        // User can only modify OrderNotes and OrderStatus
        order.SetProperties(SecurityRules.ReadWrite,
            x => x.OrderNotes,
            x => x.OrderStatus);
        // User can not delete orders
        order.SetDelete(order.NotSupportedRule);
    }
}
These LINQ rules are evaluated against the database in the SaveChanges method for every operation, and they act as a firewall in front of the database.
There is a lot of debate over which method is correct, so I treat both as acceptable and use whichever one I like the most (which is no repository and no UoW).
In EF UoW is implemented via DbContext and the DbSets are repositories.
As for how to work with the data layer I just directly work on the DbContext object, for complex queries I will make extension methods for the query that can be reused.
I believe Ayende also has some posts about how abstracting out CUD operations is bad.
I always make an interface and have my context inherit from it so I can use an IoC container for DI.
What most apply over EF is not a Repository Pattern. It is a Facade pattern (abstracting the calls to EF methods into simpler, easier to use versions).
EF is the one applying the Repository Pattern (and the Unit of Work pattern as well). That is, EF is the one abstracting the data access layer so that the user has no idea they are dealing with SQLServer.
And at that, most "repositories" over EF are not even good Facades as they merely map, quite straightforwardly, to single methods in EF, even to the point of having the same signatures.
The two reasons, then, for applying this so-called "Repository" pattern over EF is to allow easier testing and to establish a subset of "canned" calls to it. Not bad in themselves, but clearly not a Repository.
LINQ is the 'Repository' nowadays.
ISession+Linq already is the repository, and you need neither GetXByY methods nor a QueryData(Query q) generalization. Being a little paranoid about DAL usage, I still prefer a repository interface. (From a maintainability point of view we also still need some facade over specific data access interfaces.)
Here is the repository we use: it decouples us from direct usage of NHibernate but provides a LINQ interface (plus ISession access in exceptional cases, which are subject to eventual refactoring).
class Repo
{
ISession _session; //via ioc
// The method itself must declare T, otherwise T is undefined here.
public IQueryable<T> Query<T>()
{
return _session.Query<T>();
}
}
The Repository (or however one chooses to call it) at this time for me is mostly about abstracting away the persistence layer.
I use it coupled with query objects so I do not have a coupling to any particular technology in my applications. And also it eases testing a lot.
So, I tend to have
public interface IRepository : IDisposable
{
void Save<TEntity>(TEntity entity);
void SaveList<TEntity>(IEnumerable<TEntity> entities);
void Delete<TEntity>(TEntity entity);
void DeleteList<TEntity>(IEnumerable<TEntity> entities);
IList<TEntity> GetAll<TEntity>() where TEntity : class;
int GetCount<TEntity>() where TEntity : class;
void StartConversation();
void EndConversation();
//if query objects can be self sustaining (i.e. not need additional configuration - think session), there is no need to include this method in the repository.
TResult ExecuteQuery<TResult>(IQueryObject<TResult> query);
}
Possibly add async methods with callbacks as delegates.
The repo is easy to implement generically, so I am able not to touch a line of the implementation from app to app. Well, this is true at least when using NH, I did it also with EF, but made me hate EF. 4. The conversation is the start of a transaction. Very cool if a few classes share the repository instance. Also, for NH, one repo in my implementation equals one session which is opened at the first request.
Then the Query Objects
public interface IQueryObject<TResult>
{
/// <summary>Provides configuration options.</summary>
/// <remarks>
/// If the query object is used through a repository this method might or might not be called depending on the particular implementation of a repository.
/// If not used through a repository, it can be useful as a configuration option.
/// </remarks>
void Configure(object parameter);
/// <summary>Implementation of the query.</summary>
TResult GetResult();
}
In NH I use Configure only to pass in the ISession. In EF it makes more or less no sense.
An example query would be.. (NH)
public class GetAll<TEntity> : AbstractQueryObject<IList<TEntity>>
where TEntity : class
{
public override IList<TEntity> GetResult()
{
return this.Session.CreateCriteria<TEntity>().List<TEntity>();
}
}
To do an EF query you would have to have the context in the abstract base, not the session. But of course the interface would be the same.
In this way the queries are themselves encapsulated, and easily testable. Best of all, my code relies only on interfaces. Everything is very clean. Domain (business) objects are just that, e.g. there is no mixing of responsibilities like when using the active record pattern which is hardly testable and mixes data access (query) code in the domain object and in doing so is mixing concerns (object which fetches itself??). Everybody is still free to create POCOs for data transfer.
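The same separation can be sketched in Scala, the language of the questions further down. This is only an illustration of the shape, not a port of the code above: `Store` is a hypothetical in-memory stand-in for a session or context, and `QueryObject`, `ActiveCustomers` and `Customer` are made-up names.

```scala
// Hypothetical stand-in for an ISession/DbContext: an in-memory bag of entities.
final class Store(private var items: Vector[Any] = Vector.empty) {
  def add(entity: Any): Unit = items = items :+ entity
  def all: Vector[Any] = items
}

// A query object encapsulates one query and can be tested on its own.
trait QueryObject[R] {
  def result(store: Store): R
}

final case class Customer(name: String, active: Boolean)

// Example query (assumed domain): all active customers.
object ActiveCustomers extends QueryObject[Vector[Customer]] {
  def result(store: Store): Vector[Customer] =
    store.all.collect { case c: Customer if c.active => c }
}

// The repository only knows how to save and how to run a query object.
final class Repository(store: Store) {
  def save(entity: Any): Unit = store.add(entity)
  def executeQuery[R](query: QueryObject[R]): R = query.result(store)
}
```

Calling code depends only on `Repository` and `QueryObject`, so a test can hand the repository a `Store` filled with fixtures instead of a real database.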
All in all, this approach provides a lot of code reuse and simplicity, at no loss that I can imagine. Any ideas?
And thanks a lot to Ayende for his great posts and continued dedication. These are his ideas (query object), not mine.
For me, it's a simple decision, with relatively few factors. The factors are:
1. Repositories are for domain classes.
2. In some of my apps, domain classes are the same as my persistence (DAL) classes, in others they are not.
3. When they are the same, EF is providing me with Repositories already.
4. EF provides lazy loading and IQueryable. I like these.
5. Abstracting/'facading'/re-implementing repository over EF usually means loss of lazy and IQueryable
So, if my app can't justify #2, separate domain and data models, then I usually won't bother with #5.

How do I abstract the domain layer from the persistence layer in Scala

UPDATE:
I've edited the title and added this text to better explain what I'm trying to achieve: I'm trying to create a new application from the ground up, but don't want the business layer to know about the persistence layer, in the same way one would not want the business layer to know about a REST API layer. Below is an example of a persistence layer that I would like to use. I'm looking for good advice on integrating with this i.e. I need help with the design/architecture to cleanly split the responsibilities between business logic and persistence logic. Maybe a concept along the line of marshalling and unmarshalling of persistence objects to domain objects.
From a SLICK (a.k.a. ScalaQuery) test example, this is how you create a many-to-many database relationship. This will create 3 tables: a, b and a_to_b, where a_to_b keeps links of rows in table a and b.
object A extends Table[(Int, String)]("a") {
def id = column[Int]("id", O.PrimaryKey)
def s = column[String]("s")
def * = id ~ s
def bs = AToB.filter(_.aId === id).flatMap(_.bFK)
}
object B extends Table[(Int, String)]("b") {
def id = column[Int]("id", O.PrimaryKey)
def s = column[String]("s")
def * = id ~ s
def as = AToB.filter(_.bId === id).flatMap(_.aFK)
}
object AToB extends Table[(Int, Int)]("a_to_b") {
def aId = column[Int]("a")
def bId = column[Int]("b")
def * = aId ~ bId
def aFK = foreignKey("a_fk", aId, A)(a => a.id)
def bFK = foreignKey("b_fk", bId, B)(b => b.id)
}
(A.ddl ++ B.ddl ++ AToB.ddl).create
A.insertAll(1 -> "a", 2 -> "b", 3 -> "c")
B.insertAll(1 -> "x", 2 -> "y", 3 -> "z")
AToB.insertAll(1 -> 1, 1 -> 2, 2 -> 2, 2 -> 3)
val q1 = for {
a <- A if a.id >= 2
b <- a.bs
} yield (a.s, b.s)
q1.foreach(x => println(" "+x))
assertEquals(Set(("b","y"), ("b","z")), q1.list.toSet)
As my next step, I would like to take this up one level (I still want to use SLICK but wrap it nicely), to working with objects. So in pseudo code it would be great to do something like:
objectOfTypeA.save()
objectOfTypeB.save()
linkAtoB.save(objectOfTypeA, objectOfTypeB)
Or, something like that. I have my ideas on how I might approach this in Java, but I'm starting to realize that some of my object-oriented ideas from pure OO languages are starting to fail me. Can anyone please give me some pointers as to how to approach this problem in Scala?
For example: Do I create simple objects that just wrap or extend the table objects, and then include these (composition) into another class that manages them?
Any ideas, guidance, example (please), that will help me better approach this problem as a designer and coder will be greatly appreciated.
The best idea would be to implement something like data mapper pattern. Which, in contrast to active record, will not violate SRP.
Since I am not a Scala developer, I will not show any code.
The idea is as follows:
create domain object instance
set conditions on the element (for example setId(42), if you are looking for element by ID)
create data mapper instance
execute fetch() method on the mapper by passing in domain object as parameter
The mapper would look up current parameters of provided domain object and, based on those parameters, retrieve information from storage (which might be SQL database, or JSON file or maybe a remote REST API). If information is retrieved, it assigns the values to the domain object.
Also, I must note, that data mappers are created for work with specific domain object's interface, but the information, which they pass from domain object to storage and back, can be mapped to multiple SQL tables or multiple REST resources.
This way you can easily replace the mapper when you switch to a different storage medium, or even unit-test the logic in domain objects without touching the real storage. Also, if you decide to add caching at some point, that would be just another mapper, which tries to fetch information from the cache and, if that fails, lets the mapper for persistent storage kick in.
Domain object (or, in some cases, a collection of domain objects) would be completely unaware of whether it is stored or retrieved. That would be the responsibility of the data mappers.
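For readers who want to see the shape of this in Scala, here is a minimal sketch. It is not from the answer above and every name in it (`User`, `UserMapper`, `InMemoryUserMapper`, `CachingUserMapper`) is hypothetical. It also swaps one detail: instead of filling in a mutable domain object passed to fetch(), the mapper returns an immutable `Option[User]`, which is the more common idiom in Scala.

```scala
import scala.collection.mutable

// Domain object: knows nothing about how or where it is stored.
final case class User(id: Int, name: String)

// Mapper contract, written against the domain type only.
trait UserMapper {
  def fetch(id: Int): Option[User]
  def store(user: User): Unit
}

// In-memory stand-in for a SQL-, file- or REST-backed mapper.
final class InMemoryUserMapper extends UserMapper {
  private val rows = mutable.Map.empty[Int, String]
  def fetch(id: Int): Option[User] = rows.get(id).map(name => User(id, name))
  def store(user: User): Unit = rows.update(user.id, user.name)
}

// Caching as "just another mapper": try the cache, then fall back to storage.
final class CachingUserMapper(underlying: UserMapper) extends UserMapper {
  private val cache = mutable.Map.empty[Int, User]
  def fetch(id: Int): Option[User] =
    cache.get(id).orElse {
      val found = underlying.fetch(id)
      found.foreach(u => cache.update(id, u))
      found
    }
  def store(user: User): Unit = {
    underlying.store(user)
    cache.update(user.id, user)
  }
}
```

Business code would depend only on `UserMapper`, so switching storage, or testing without a database, means passing in a different implementation.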
If this is all in an MVC context, then, to fully implement this, you would need another group of structures in the model layer. I call them "services" (please share if you come up with a better name). They are responsible for containing the interaction between data mappers and domain objects. This way you can prevent the business logic from leaking into the presentation layer (controllers, to be exact), and these services create a natural interface for interaction between the business (also known as model) layer and the presentation layer.
P.S. Once again, sorry that I cannot provide any code examples, because I am a PHP developer and have no idea how to write code in Scala.
P.P.S. If you are using data mapper pattern, the best option is to write mappers manually and not use any 3rd party ORM, which claims to implement it. It would give you more control over codebase and avoid pointless technical debt [1] [2].
A good solution for simple persistence requirements is the ActiveRecord pattern: http://en.wikipedia.org/wiki/Active_record_pattern . This is implemented in Ruby and in Play! framework 1.2, and you can easily implement it in Scala in a stand-alone application.
The only requirement is to have a singleton DB or a singleton service to get a reference to the DB you require. I personally would go for an implementation based on the following:
A generic trait ActiveRecord
A generic typeclass ActiveRecordHandler
Exploiting the power of implicits, you could obtain an amazing syntax:
trait ActiveRecordHandler[T]{
def save(t:T):T
def delete[A<:Serializable](primaryKey:A):Option[T]
def find(query:String):Traversable[T]
}
object ActiveRecordHandler {
// Note that an implicit val inside an object with the same name as the trait
// is one of the ways to have the implicit in scope.
implicit val myClassHandler = new ActiveRecordHandler[MyClass] {
def save(myClass:MyClass) = myClass
def delete[A <: Serializable](primaryKey: A) = None
def find(query: String) = List(MyClass("hello"),MyClass("goodbye"))
}
}
trait ActiveRecord[RecordType] {
self:RecordType=>
def save(implicit activeRecordHandler:ActiveRecordHandler[RecordType]):RecordType = activeRecordHandler.save(this)
def delete[A<:Serializable](primaryKey:A)(implicit activeRecordHandler:ActiveRecordHandler[RecordType]):Option[RecordType] = activeRecordHandler.delete(primaryKey)
}
case class MyClass(name:String) extends ActiveRecord[MyClass]
object MyClass {
def main(args:Array[String]) = {
MyClass("10").save
}
}
With such a solution, you only need your class to extend ActiveRecord[T] and have an implicit ActiveRecordHandler[T] to handle it.
There is actually also an implementation: https://github.com/aselab/scala-activerecord which is based on a similar idea, but instead of giving ActiveRecord an abstract type, it declares a generic companion object.
A general but very important comment on the ActiveRecord pattern is that it helps meet simple persistence requirements, but cannot deal with more complex ones: for example, when you want to persist multiple objects under the same transaction.
If your application requires more complex persistence logic, the best approach is to introduce a persistence service which exposes only a limited set of functions to the client classes, for example
def persist(objectsofTypeA:Traversable[A],objectsOfTypeB:Traversable[B])
Please also note that according to your application complexity, you might want to expose this logic in different fashions:
as a singleton object in the case your application is simple, and you do not want your persistence logic to be pluggable
through a singleton object which acts as a sort of "application context", so that at startup you can decide which persistence logic you want to use.
with some sort of lookup service pattern, if your application is distributed.
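A rough sketch of such a persistence service, with an "application context" object choosing the implementation, might look like the following. Everything here is an assumption made for illustration: the `A` and `B` case classes, the `Either`-based result, and the in-memory implementation that stands in for a real transaction by checking everything before writing anything. `Iterable` is used in place of `Traversable`, which is deprecated in newer Scala versions.

```scala
// Hypothetical domain types standing in for "objects of type A and B".
final case class A(id: Int, s: String)
final case class B(id: Int, s: String)

trait PersistenceService {
  // Persists everything or nothing; Left carries a reason for the failure.
  def persist(as: Iterable[A], bs: Iterable[B]): Either[String, Unit]
}

// In-memory stand-in: validates first, then writes, to mimic one transaction.
final class InMemoryPersistence extends PersistenceService {
  private var aRows = Map.empty[Int, A]
  private var bRows = Map.empty[Int, B]

  def persist(as: Iterable[A], bs: Iterable[B]): Either[String, Unit] = {
    val dupA = as.find(a => aRows.contains(a.id))
    val dupB = bs.find(b => bRows.contains(b.id))
    (dupA, dupB) match {
      case (Some(a), _) => Left(s"duplicate A id ${a.id}")
      case (_, Some(b)) => Left(s"duplicate B id ${b.id}")
      case _ =>
        aRows = aRows ++ as.map(a => a.id -> a)
        bRows = bRows ++ bs.map(b => b.id -> b)
        Right(())
    }
  }

  def countA: Int = aRows.size
  def countB: Int = bRows.size
}

// "Application context": the one place that decides which implementation is used.
object AppContext {
  @volatile var persistence: PersistenceService = new InMemoryPersistence
}
```

Client classes call `AppContext.persistence.persist(...)` and never see the storage technology; a Slick-backed implementation could be assigned at startup instead.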

Serialize Function1 to database

I know it's not directly possible to serialize a function/anonymous class to the database but what are the alternatives? Do you know any useful approach to this?
To present my situation: I want to award a user "badges" based on his scores. So I have different types of badges that can be easily defined by extending this class:
class BadgeType(id:Long, name:String, detector:Function1[List[UserScore],Boolean])
The detector member is a function that walks the list of scores and returns true if the User qualifies for a badge of this type.
The problem is that each time I want to add/edit/modify a badge type I need to edit the source code, recompile the whole thing and re-deploy the server. It would be much more useful if I could persist all BadgeType instances to a database. But how to do that?
The only thing that comes to mind is to have the body of the function as a script (ex: Groovy) that is evaluated at runtime.
Another approach (that does not involve a database) might be to package each badge type in a jar that I can somehow hot-deploy at runtime, which I guess is how a plugin system might work.
What do you think?
My very brief advice is that if you want this to be truly data-driven, you need to implement a rules DSL and an interpreter. The rules are what get saved to the database, and the interpreter takes a rule instance and evaluates it against some context.
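To make the idea concrete, here is a very small sketch of a rules DSL with an interpreter. The rule vocabulary (`TotalAtLeast`, `CountAtLeast`, `And`, `Or`) and the `points` field on `UserScore` are assumptions for illustration only; the question does not say what a score contains.

```scala
// Assumed shape of a score; the real UserScore may look different.
final case class UserScore(points: Int)

// Rules are plain data, so they can be stored as rows, JSON, and so on.
sealed trait Rule
final case class TotalAtLeast(points: Int) extends Rule
final case class CountAtLeast(n: Int) extends Rule
final case class And(left: Rule, right: Rule) extends Rule
final case class Or(left: Rule, right: Rule) extends Rule

// The interpreter evaluates a rule instance against a context (the scores).
object RuleInterpreter {
  def eval(rule: Rule, scores: List[UserScore]): Boolean = rule match {
    case TotalAtLeast(p) => scores.map(_.points).sum >= p
    case CountAtLeast(n) => scores.size >= n
    case And(l, r)       => eval(l, scores) && eval(r, scores)
    case Or(l, r)        => eval(l, scores) || eval(r, scores)
  }
}
```

A badge would then be a name plus a `Rule` value such as `And(TotalAtLeast(100), CountAtLeast(2))`, and adding a badge means inserting a row rather than redeploying. New kinds of conditions still require a code change to the rule vocabulary.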
But that's overkill most of the time. You're better off having a little snippet of actual Scala code that implements the rule for each badge, give them unique IDs, then store the IDs in the database.
e.g.:
trait BadgeEval extends Function1[User,Boolean] {
def badgeId: Int
}
object Badge1234 extends BadgeEval {
def badgeId = 1234
def apply(user: User) = {
user.isSufficientlyAwesome // && ...
}
}
You can either have a big whitelist of BadgeEval instances:
val weDontNeedNoStinkingBadges: Map[Int, BadgeEval] = Map(
1234 -> Badge1234,
5678 -> Badge5678
// ...
)
def evaluator(id: Int): Option[BadgeEval] = weDontNeedNoStinkingBadges.get(id)
def doesUserGetBadge(user: User, id: Int) = evaluator(id).map(_(user)).getOrElse(false)
... or if you want to keep them decoupled, use reflection:
def badgeEvalClass(id: Int) = Class.forName("com.example.badge.Badge" + id + "$").asInstanceOf[Class[BadgeEval]]
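Note that the line above yields the class of the singleton, not the singleton itself. On the JVM, a top-level Scala 2 `object` is compiled to a class whose name ends in `$`, with the instance held in a static field called `MODULE$`. A hedged sketch of a loader built on that (the `ObjectLoader` name is made up, and the cast is unchecked, so a wrong type only fails later at the point of use):

```scala
import scala.util.Try

object ObjectLoader {
  // Looks up a top-level Scala object by its fully qualified name.
  // Returns None if the class or its MODULE$ field cannot be found.
  def load[T](fullyQualifiedName: String): Option[T] =
    Try {
      Class.forName(fullyQualifiedName + "$")
        .getField("MODULE$")
        .get(null) // static field, so no receiver instance is needed
        .asInstanceOf[T]
    }.toOption
}
```

With the naming scheme above, `ObjectLoader.load[BadgeEval]("com.example.badge.Badge" + id)` would return the evaluator for a known ID and `None` for an unknown one.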
... and if you're interested in runtime pluggability, try the service provider pattern.
You can try and use Scala Continuations - they can give you the ability to serialize the computation and run it at later time or even on another machine.
Some links:
Continuations
What are Scala continuations and why use them?
Swarm - Concurrency with Scala Continuations
Serialization relates to data rather than methods. You cannot serialize functionality this way: the behaviour lives in the class file, whereas object serialization only serializes the fields of an object.
So like Alex says, you need a rule engine.
Try this one if you want something fairly simple, which is string based, so you can serialize the rules as strings in a database or file:
http://blog.maxant.co.uk/pebble/2011/11/12/1321129560000.html
Using a DSL has the same problems unless you interpret or compile the code at runtime.