EF 4.2 Code First and DDD Design Concerns - entity-framework

I have several concerns when trying to do DDD development with EF 4.2 (or EF 4.1) code first. I've done some extensive research but haven't come up with concrete answers for my specific concerns. Here are my concerns:
The domain cannot know about the persistence layer, or in other words the domain is completely separate from EF. However, to persist data to the database each entity must be attached to or added to the EF context. I know you are supposed to use factories to create instances of the aggregate roots so the factory could potentially register the created entity with the EF context. This appears to violate DDD rules since the factory is part of the domain and not part of the persistence layer. How should I go about creating and registering entities so that they correctly persist to the database when needed to?
Should an aggregate entity be the one to create it's child entities? What I mean is, if I have an Organization and that Organization has a collection of Employee entities, should Organization have a method such as CreateEmployee or AddEmployee? If not where does creating an Employee entity come in keeping in mind that the Organization aggregate root 'owns' every Employee entity.
When working with EF code first, the IDs (in the form of identity columns in the database) of each entity are automatically handled and should generally never be changed by user code. Since DDD states that the domain is separate from persistence ignorance it seems like exposing the IDs is an odd thing to do in the domain because this implies that the domain should handle assigning unique IDs to newly created entities. Should I be concerned about exposing the ID properties of entities?
I realize these are kind of open ended design questions, but I am trying to do my best to stick to DDD design patterns while using EF as my persistence layer.
Thanks in advance!

On 1: I'm not all that familiar with EF but using the code-first/convention based mapping approach, I'd assume it's not too hard to map POCOs with getters and setters (even keeping that "DbContext with DbSet properties" class in another project shouldn't be that hard). I would not consider the POCOs to be the Aggregate Root. Rather they represent "the state inside an aggregate you want to persist". An example below:
// This is what gets persisted
public class TrainStationState {
public Guid Id { get; set; }
public string FullName { get; set; }
public double Latitude { get; set; }
public double Longitude { get; set; }
// ... more state here
}
// This is what you work with
public class TrainStation : IExpose<TrainStationState> {
TrainStationState _state;
public TrainStation(TrainStationState state) {
_state = state;
//You can also copy into member variables
//the state that's required to make this
//object work (think memento pattern).
//Alternatively you could have a parameter-less
//constructor and an explicit method
//to restore/install state.
}
TrainStationState IExpose.GetState() {
return _state;
//Again, nothing stopping you from
//assembling this "state object"
//manually.
}
public void IncludeInRoute(TrainRoute route) {
route.AddStation(_state.Id, _state.Latitude, _state.Longitude);
}
}
Now, with regard to aggregate life-cycle, there are two main scenario's:
Creating a new aggregate: You could use a factory, factory method, builder, constructor, ... whatever fits your needs. When you need to persist the aggregate, query for its state and persist it (typically this code doesn't reside inside your domain and is pretty generic).
Retrieving an existing aggregate: You could use a repository, a dao, ... whatever fits your needs. It's important to understand that what you are retrieving from persistent storage is a state POCO, which you need to inject into a pristine aggregate (or use it to populate it's private members). This all happens behind the repository/DAO facade. Don't muddle your call-sites with this generic behavior.
On 2: Several things come to mind. Here's a list:
Aggregate Roots are consistency boundaries. What consistency requirements do you see between an Organization and an Employee?
Organization COULD act as a factory of Employee, without mutating the state of Organization.
"Ownership" is not what aggregates are about.
Aggregate Roots generally have methods that create entities within the aggregate. This makes sense because the roots are responsible for enforcing consistency within the aggregate.
On 3: Assign identifiers from the outside, get over it, move on. That does not imply exposing them, though (only in the state POCO).

The main problem with EF-DDD compatibility seems to be how to persist private properties. The solution proposed by Yves seems to be a workaround for the lack of EF power in some cases. For example, you can't really do DDD with Fluent API which requires the state properties to be public.
I've found only mapping with .edmx files allows you to leave Domain Entities pure. It doesn't enforce you to make things publc or add any EF-dependent attributes.
Entities should always be created by some aggregate root. See a great post of Udi Dahan: http://www.udidahan.com/2009/06/29/dont-create-aggregate-roots/
Always loading some aggregate and creating entities from there also solves a problem of attaching an entity to EF context. You don't need to attach anything manually in that case. It will get attached automatically because aggregate loaded from the repository is already attached and has a reference to a new entity. While repository interface belongs to the domain, repository implementation belongs to the infrastructure and is aware of EF, contexts, attaching etc.
I tend to treat autogenerated IDs as an implementation detail of the persistent store, that has to be considered by the domain entity but shouldn't be exposed. So I have a private ID property that is mapped to autogenerated column and some another, public ID which is meaningful for the Domain, like Identity Card ID or Passport Number for a Person class. If there is no such meaningful data then I use Guid type which has a great feature of creating (almost) unique identifiers without a need for database calls.
So in this pattern I use those Guid/MeaningfulID to load aggregates from a repository while autogenerated IDs are used internally by database to make a bit faster joins (Guid is not good for that).

Related

Entity Framework and DDD - Load required related data before passing entity to business layer

Let's say you have a domain object:
class ArgumentEntity
{
public int Id { get; set; }
public List<AnotherEntity> AnotherEntities { get; set; }
}
And you have ASP.NET Web API controller to deal with it:
[HttpPost("{id}")]
public IActionResult DoSomethingWithArgumentEntity(int id)
{
ArgumentEntity entity = this.Repository.GetById(id);
this.DomainService.DoDomething(entity);
...
}
It receives entity identifier, load entity by id and execute some business logic on it with domain service.
The problem:
The problem here is with related data. ArgumentEntity has AnotherEntities collection that will be loaded by EF only if you explicitly ask to do so via Include/Load methods.
DomainService is a part of business layer and should know nothing about persistence, related data and other EF concepts.
DoDomething service method expects to receive ArgumentEntity instance with loaded AnotherEntities collection.
You would say - it's easy, just Include required data in Repository.GetById and load whole object with related collection.
Now lets come back from simplified example to reality of the large application:
ArgumentEntity is much more complex. It contains multiple related collections and that related entities have their related data too.
You have multiple methods of DomainService. Each method requires different combinations of related data to be loaded.
I could imagine possible solutions, but all of them are far from ideal:
Always load the whole entity -> but it is inefficient and often impossible.
Add several repository methods: GetByIdOnlyHeader, GetByIdWithAnotherEntities, GetByIdFullData to load specific data subsets in controller -> but controller become aware of which data to load and pass to each service method.
Add several repository methods: GetByIdOnlyHeader, GetByIdWithAnotherEntities, GetByIdFullData to load specific data subsets in each service method -> it is inefficient, sql query for each service method call. What if you call 10 service methods for one controller action?
Each domain method call repository method to load additional required data ( e.g: EnsureAnotherEntitiesLoaded) -> it is ugly because my business logic become aware of EF concept of related data.
The question:
How would you solve the problem of loading required related data for the entity before passing it to business layer?
In your example I can see method DoSomethingWithArgumentEntity which obviously belongs to Application Layer. This method has call to Repository which belongs to Data Access Layer. I think this situation does not conform to classic Layered Architecture - you should not call DAL directly from Application Layer.
So your code can be rewritten in another manner:
[HttpPost("{id}")]
public IActionResult DoSomethingWithArgumentEntity(int id)
{
this.DomainService.DoDomething(id);
...
}
In DomainService implementation you can read from repo whatever it needs for this specific operation. This avoids your troubles in Application Layer. In Business Layer you will have more freedom to implement reading: with serveral repository methods reads half-full entity, or with EnsureXXX methods, or something else. Knowledge about what you need to read for operation will be placed into operation's code and you don't need this knowledge in app-layer any more.
Every time situation like this emerged it is a strong signal about your entity is not preperly designed. As krzys said the entity has not cohesive parts. In other words if you often need parts of an entity separately you should split this entity.
Nice question :)
I would argue that "related data" in itself is not a strict EF concept. Related data is a valid concept with NHibernate, with Dapper, or even if you use files for storage.
I agree with the other points mostly, though. So here's what I usually do: I have one repository method, in your case GetById, which has two parameters: the id and a params Expression<Func<T,object>>[]. And then, inside the repository I do the includes. This way you don't have any dependency on EF in your business logic (the expressions can be parsed manually for another type of data storage framework if necessary), and each BLL method can decide for themselves what related data they actually need.
public async Task<ArgumentEntity> GetByIdAsync(int id, params Expression<Func<ArgumentEntity,object>>[] includes)
{
var baseQuery = ctx.ArgumentEntities; // ctx is a reference to your context
foreach (var inlcude in inlcudes)
{
baseQuery = baseQuery.Include(include);
}
return await baseQuery.SingleAsync(a=>a.Id==id);
}
Speaking in context of DDD, It seems that you had missed some modeling aspects in your project that led you to this issue. The Entity you wrote about looked not to be highly cohesive. If different related data is needed for different processes (service methods) it seems like you didn't find proper Aggregates yet. Consider splitting your Entity into several Aggregates with high cohesion. Then all processes correlated with particular Aggregate will need all or most of all data that this Aggregate contains.
So I don't know the answer for your question, but if you can afford to make few steps back and refactor your model, I believe you will not encounter such problems.

Foreign key between aggregate roots

I understand the concept of aggregate root and I know that one aggregate root must reference another by identity ( http://dddcommunity.org/wp-content/uploads/files/pdf_articles/Vernon_2011_2.pdf ) so what I don't get is how can I force Entity Framework to add a foreign key constraint between two aggregates?
Lets suppose I have a simplified domain:
public class AggregateOne{
[Key]
public Guid AggregateOneID{ get; private set;}
public Guid AggregateTwoFK{get; private set;}
/*Other Properties and methods*/
}
public class AggregateTwo{
[Key]
public Guid AggregateTwoID{get; private set;}
/*Other Properties and methods*/
}
With this domain design, Entity Framework doesn't know that there is a relationship between AggregateOne and AggregateTwo and consequently there is no foreign key at the generated database.
In DDD, EF doesn't exist. Domain relationships are not the same as database relationships. Don't try to mix EF with domain modeling, they don't work together. So in a nutshell, what you have there is not DDD, just plain old relational db masquerading as DDD. EF would be used by the Repositories and would care about persisting one Aggregate Root (AR).
Two ARs can work together, however you need to model the process according to the domain. EF is there to act as a db for the app, it's concerned with persistence issues and shouldn't care about the Domain. Persistence is all about storage and not about reflecting domain relationships (the EF entity is not the domain entity although they can have the same name and can look similar. The important detail is that both belong to different layers and handle different issues). The Domain repositories care only to persist the AR in a way that can be easily restored when it will change. If more AR need to be persisted together, embrace eventual consistency and learn how to use a service bus and sagas. It will greatly simplify your life (consider it a kind of implementation for the unit of work pattern).
For querying, the most clean and elegant way is to generate/update a read model suitable for the querying use cases and this is usually done after a domain event tells the 'world' that something changed in the Domain.
Doing DDD right is not straightforward and it's very easy to fall into the trap, believing that you apply DDD when in fact you're just CRUD ing away, using DDD terminology. Also IMO CQRS is a must with DDD if you like an easy life.
Understand the domain without rushing it and being superficial, identify the bounded contexts, model the domain concepts and their use cases (very important!!!), define repository interfaces as you need them, and implement the repositories only when there's nothing else left to do (the real repos, in the mean time you can use fake ones like in memory repos - they're very fast to implement and your app being decoupled means it shouldn't care about how persistence is implemented, right?). I know it sounds weird, but this how you know you have a maintainable DDD app.
The point of implementing the repositories last is to really decouple the app from the persistence details and also to have defined the expectations(repository methods) the app has from persistence. Once defined, you can write tests :D then implement the repositories. The bonus is that you get to focus only on repo implementation is isolation and when the all tests pass, you know everything works as it should.
Why should you have two complete different objects? Why not only expose your entities as domain objects through a domain interface?
In this case there's no issue with having your entities also act as domain objects with their implementation details neatly hidden behind the interface.
Another point a neat way to represent aggregate roots with EF is to make sure the foreign key column also makes up the primary key of the dependant entity. In your case that would mean AggregateOneId and AggregateTwoFk together would form the composite primary key of AggregateOne. This will ensure that EF doesn't need a repository for removing instances off AggregateOne as long as it's removed from AggregateTwo's collection it will be properly marked for deletion from the databases (if you don't have key like this you need to remove it from AggregateOne set because EF would throw an exception not understanding the intent of the developer that AggregateOne should be deleted.

DbContext with dynamic DbSet

Is it possible to have a DbContext that has only one property of the generic type IDbSet and not a collection of concrete IDbSet e.g. DbSet.
More specifically, i want to create only one generic DbSet where the actual type will be determined dynamically e.g.
public new IDbSet<T> Set<T>() where T : class
{
return context.Set<T>();
}
I don't want to create multiple DbSets e.g.
DbSet<product> Products { get; set; }
...
Actually i tried to use that generic DbSet but there seems to be one problem. The DbContext doesn't create the corresponding tables in the database. So although i can work with the in-memory entity graph, when the time comes to store the entites into the DB an exception is thrown (Invalid object name 'dbo.Product'.)
Is there any way to force the EF to create tables that correspond to dynamicaly creates DbSets?
Yes you can do this.
modelBuilder.Configurations.Add
The DBSet entries will be derived.
If you plan to use POCOs and just build the model this way ok.
So you save Manual DBSet<> declaration...
But if you plan on being more Dynamic without POCOs...
Before you go down the this route, there are a number of things to consider.
Have you selected the right ORM ?
Do you plan on having a POCOs ?
Why is DbSet Products { get; set; } so bad ?
You get a lot of action for that 1 line of code.
What Data access approach you plan to use without types DBSets
Do you plan to use Linq to Entity statements?
do you plan on creating Expression trees for the Dynamic Data access necessary. Since the types arent known at compile time.
Do you plan to use the DB Model cache,?
How will the cache be managed, especially in Web. ASP environments.
There are most likely other issues i did think of off the top of my head.
Constructing the model yourself is a big task. The Linq access is compromised when compile time types/POCOs are NOT used and the model cache and performance become critical management tasks.
The practical side of this task is not to under estimate
Start here bContext.OnModelCreating
Typically, this method is called only once when the first instance of
a derived context is created. The model for that context is then
cached and is for all further instances of the context in the app
domain. This caching can be disabled by setting the ModelCaching
property on the given ModelBuidler, but this can seriously degrade
performance. More control over caching is provided through use of the
DbModelBuilder and DbContext classes directly.
The modelbuilder class
Good Luck

Managing an Entity and its Snapshot with an ORM

I would like to use one of the ideas that Jimmy Nilsson mentioned in his book Applying DDD with Patterns, and that is if i have an entity like a Product for example, i would like to take a snapshot of that entity for historic information, something like ProductSnapshot but i wonder how i might be able to implement this with an ORM (i am currently using Entity Framework). The main problem i am facing is that if i have another Entity like OrderLine that receives the Product via its constructor then entity framework would need you to make a public property of the type you wish to persist so this will force me to have something like this:
class OrderLine {
public Product Original Product {get; set;}
public ProductSnapshot Snapshot {get; set;}
}
and that seems awkward and not intuitive and i don't know how to deal with it properly when it comes to data binding (to which property i should bind), and finally i think that Product is an Entity while ProductSnapshot is a Value Object plus the snapshot is only taken when the OrderLine is accepted and after that the Product is not needed.
When doing DDD, forget that the database exists. This means the ORM doesn't exist either. Now, because you don't have to care about persistence and ORM limits, you can model the ProductSnapshot according to the domain needs.
Create a ProductSnapshot class with all the required members.This class would be a result probably of a SnapshotService.GetSnapshot(Product p) . Once you have the ProductSnapshot just send it to a repository SnapshotsRepository.Save(snapshot) . Being a snapshot, this means it will probably be more of a data structure, a 'dumb' object. It also should be invariable, 'frozen' .
The Repository will use EF to actually save the data. You decide what the EF entities and relations are. ProductSnapshot is a considered to be a business object by the persistence(it doesn't matter if in reality it's just a simple Dto) and the EF entities may look very different (for example, I store business objects in serialized form in a key-value table) according to your querying needs.
Once you define the EF entites you need to map the ProductSnapshot to them. It's very probable that ProductSnapshot itself can be used as an EF Entity so you don't need to do any mapping.
The point is, that taking a snapshot seems to be domain behavior. You deal with the EF only after you have the snapshot and you do exactly as you'd do with any other busines object.
Why does OrderLine have to have ProductSnapshot property? I suppose, you can either have a link to ProductSnapshot from Product class if you need to get that historical informatil, or, in case you just want to save Product state under some conditions, just implement a SaveSnapshot method in Product partial class, or have an extension method for it.

DTOs: best practices

I am considering to use DTOs instead of passing around my domain objects. I have read several posts here as well as elsewhere, and i understand there are several approaches to getting this done.
If i only have about 10 domain classes in all, and considering that i want to use DTOs rather than domain objects for consumption in my Views (WPF front ends), what is the recommended approach.
I think using tools like automapper etc maybe an overkill for my situation. So i am thinking of writing my custom mapper class that will have methods for converting a domain type to a DTO type.
What is the best way to do this, are there any sample to get me started to do this?
Second question: When writing those methods that will create DTOs, how do i deal with setting up all the data, especially when the domain type has references to other domain objects? Do i write equivalent properties in the DTO for mapping to those refernece types in the domain class?
Please ask if i have not put my second question in proper words. But i think you understand what i am trying to ask.
Thrid question: When writing DTOs, should i write multiple DTOs, each containing partial data for a given domain model, so that each of it can be used to cater to a specific View's requirement, or should the DTO have all the data that are there in the corresponding model class.
I've been reading a few posts here regarding DTO's and it seems to me that a lot of people equate them to what I would consider a ViewModel. A DTO is just that, Data Transfer Object - it's what gets passed down the wire. So I've got a website and services, only the services will have access to real domain/entity objects, and return DTO's. These may map 1:1, but consider that the DTO's may be populated from another service call, a database query, reading a config - whatever.
After that, the website then can take those DTO and either add them to a ViewModel, or convert into one. That ViewModel may contain many different types of DTO's. A simple example would be a task manager - the ViewModel contains both the task object you are editing, as well as a group of Dto.User objects that the task can be assigned to.
Keep in mind that the services returning DTO's maybe used by both a website, and maybe a tablet or phone application. These applications would have different views to take advantage of their displays and so the ViewModels would differ, but the DTO's would remain the same.
At any rate, I love these types of discussions, so anyone please let me know what you think.
Matt
I'm kind of using DTOs in a project. I tend to make the DTOs only to show the data I need for an specified view. I fetch all the data shown in the view in my data access class. For example, I may have an Order object which references a Client object:
public class Client{
public int Id{get;set;}
public string Name{get;set;}
}
public class Order{
public int OrderID{get;set;}
public Client client{get;set;}
public double Total{get;set;}
public IEnumerable<OrderLine> lines {get;set;}
}
Then in my OrderListDTO I may have something like:
public class OrderListDTO{
public int OrderId{get;set;}
public string ClientName{get;set;}
...
}
Which are the fields I want to show in my view. I fetch all these fields in my Database access code so I don't have to bother with entity asociations in my view or controller code.
Best Way to develop DTOs
The way to start developing DTOs is to understand that their sole purpose is to transfer subset of data of your business entities to different clients(could be UI, or an external service). Given this understanding you could create seperate packages for each client...and write your DTO classes. For mapping you could write your own mapper defining interfaces to be passed to a factory creating DTO objects based on which data from the entity for which the DTO is being created would be extracted. You could also define annotations to be placed on your entity fields but personally given the number of annotations used I would prefer the interface way. The main thing to note about DTOs is that they are also classes and data among the DTOs should be reused, in other words while it may seem tempting to create DTOs for each use case try to reuse existing DTOs to minimize this.
Getting started
Regarding getting started as stated above the sole purpose of the DTO is to give the client the data it needs....so you keeping in mind you could just set data into the dto using setters...or define a factory which creates a DTO from an Entity based on an interface.....
Regarding your third question, do as is required by your client :)
I come to project with spring-jdbc and there are used DAO layer. Some times existing entities doesn't cover all possible data from DB. So I start using DTO.
By applying '70 structure programming rule I put all DTOs into separate package:
package com.evil.dao; // DAO interfaces for IOC.
package com.evil.dao.impl; // DAO implementation classes.
package com.evil.dao.dto; // DTOs
Now I rethink and decide to put all DTO as inner classes on DAO interfaces for result-sets which have no reuse. So DAO interface look like:
interface StatisticDao {
class StatisticDto {
int count;
double amount;
String type;
public static void extract(ResultSet rs, StatisticDto dto) { ... }
}
List<StatisticDto> getStatistic(Criteria criteria);
}
class StatisticDaoImpl implements StatisticDao {
List<StatisticDto> getStatistic(Criteria criteria) {
...
RowCallbackHandler callback = new RowCallbackHandler() {
#Override
public void processRow(ResultSet rs) throws SQLException {
StatisticDao.StatisticDto.extract(rs, dto);
// make action on dto
}
}
namedTemplate.query(query, queryParams, callback);
}
}
I think that holding related data together (custom DTO with DAO interface) make code better for PageUp/PageDown.
Question 1: If the DTO's you need to transfer are just a simple subset of your domain object, you can use a modelmapper to avoid filling your codebase with logic-less mapping. But if you need to apply some logic/conversion to your mapping then do it yourself.
Question 2: You can and probably should create a DTO for each domain object you have on your main DTO. A DTO can have multiple DTO's inside of it, one for each domain object you need to map. And to map those you could do it yourself or even use some modelmapper.
Question 3: Don't expose all your domain if your view does not require it to. Also you don't need to create a DTO for each view, try to create DTO's that expose what need to be exposed and may be reused to avoid having multiples DTO's that share a lot of information. But it mainly depend's on your application needs.
If you need clarification, just ask.
I'm going to assume that your domain model objects have a primary key ID that may correspond to the ID's from the database or store they came from.
If the above is true, then your DTO will overcome type referecning to other DTO's like your domain objects do, in the form of a foreign key ID. So an OrderLine.OrderHeader relationship on the domain object, will be OrderLine.OrderHeaderId cin the DTO.
Hope that helps.
Can I ask why you have chosen to use DTO's instead of your rich domain objects in the view?
We all know what Dtos are (probably).
But the important thing is to overuse DTOs or not.
Transfering data using Dtos between "local" services is a good practice but have a huge overhead on your developer team.
There is some facts:
Clients should not see or interact with Entities (Daos). So you
always need Dtos for transferig data to/from remote (out of the process).
Using Dtos to pass data between services is optional. If you don't plan to split up your project to microservices there is no need to do that. It will be just an overhead for you.
And this is my comment: If you plan to distribute your project to
microservices in long future. or don't plan to do that, then
DON'T OVERUSE DTOs
You need to read this article https://martinfowler.com/bliki/LocalDTO.html