Persistence ignorance and DDD reality - persistence

I'm trying to implement fully valid persistence ignorance with little effort. I have many questions though:
The simplest option
It's really straightforward - is it okay to have Entities annotated with Spring Data annotations just like in SOA (but make them really do the logic)? What are the consequences other than having to use persistance annotation in the Entities, which doesn't really follow PI principle? I mean is it really the case with Spring Data - it provides nice repositories which do what repositories in DDD should do. The problem is with Entities themself then...
The harder option
In order to make an Entity unaware of where the data it operates on came from it is natural to inject that data as an interface through constructor. Another advantage is that we always could perform lazy loading - which we have by default in Neo4j graph database for instance. The drawback is that Aggregates (which compose of Entities) will be totally aware of all data even if they don't use them - possibly it could led to debugging difficulties as data is totally exposed (DAO's would be hierarchical just like Aggregates). This would also force us to use some adapters for the repositories as they doesn't store real Entities anymore... And any translation is ugly... Another thing is that we cannot instantiate an Entity without such DAO - though there could be in-memory implementations in domain... again, more layers. Some say that injecting DAOs does break PI too.
The hardest option
The Entity could be wrapped around a lazy-loader which decides where data should come from. It could be both in-memory and in-database, and it could handle any operations which need transactions and so on. Complex layer though, but might be generic to some extent perhaps...? Have a read about it here
Do you know any other solution? Or maybe I'm missing something in mentioned ones. Please share your thoughts!

I achieve persistence ignorance (almost) for free, as a side effect of proper domain modeling.
In particular:
if you correctly define each context's boundary, you will obtain small entities without any need for lazy loading (that, actually becomes an antipattern/code smell in a DDD project)
if you can't simply use SQL into your repository, map a set of DTO to your db schema, and use them into factories to initialize entity classes.
In DDD projects, persistence ignorance is relevant for the domain model itself, not for repositories, factories and other applicative code. Indeed you are very unlikely to change the ORM and/or the DB in the future.
The only (but very strong) rational behind persistence ignorance of the domain model is separation of concerns: in the domain model you should express business invariants only! Persistence is an infrastructural concern!
For example without persistence ignorance (and with lazy loading) the domain model should handle possible exceptions from the db, it's complexity grows and business rules are buried under technological details.

Personally I find it near impossible to achieve a clean domain model when trying to use the same entities as the ORM.
My solution is to model my domain entities as I see fit and ensure that any ORM entities don't leak outside of the repositories. This means that my repositories accept and return domain entities.
This means you lose "most of your ORM goodness" and end up "using your ORM for simple CRUD operations".
Both of these trade-offs are fine for me, I would rather have a clean domain model that I can use, rather than one polluted with artefacts from my DB or ORM. It also cuts down the amount of time I spend "wrestling with my ORM" to zero.
As a side-note, I find document databases a much better fit for DDD.

Once you will provide persistence mapping in you domain model:
your code depends on framework. If you decided to change this framework, you want to change persistence layer and model layer source code - more work, more changes, more merging of code etc.
your domain model jar file depends on spring/nhibernate jars etc.
your classes become larger and larger how business code and persistence related code grows
I've to admit that I dont understand harder and hardest option.
We used separated interfaces and implementations for domain entities. Provide separated mapping files using Hibernate along with repositories.
Entities are created using factory (or repository later), identifier is generated within persistence layer, entity does not need it until it's being persisted.
Lazy loading is provided by special implementation of List once:
mapping of an entity contains it
entity/aggregate is fetched from persistence layer
The only issue is related to transaction as when you use lazy-loaded collection out of transaction scope, it fails.

I would follow the simplest option unless I ran into a stone wall. There are also pitfalls such as this when you adopt pi principle.
Somtimes some compromises are acceptable.
public class Order {
private String status;//my orm does not support enum
public Status status() {
return Status.of(this.status);
}
public is(Status status) {
return status() == status;//use status() instead of getStatus() in domain model
}
}

Related

Should JPA entities and DDD entities be the same classes?

There are classes that are entities according to DDD, and there are classes that have #javax.persistence.Entity annotation. Should they be the same classes? Or should JPA entities act just as a mechanism for a mapper (https://martinfowler.com/eaaCatalog/dataMapper.html) to load DDD entities from a database (and store them) and be kept outside the domain model?
Would it make a difference if database metadata were separated and stored externally (for example, in XML)? If such classes are entities, where is the boundary? I think classes generated from XSD (for example, with JAXB) or even from database with MyBatis Generator are not entities as understood in DDD.
That's an implementation detail really. They could be or they could not depending on the flexibility of your ORM. For instance, if your ORM allows to map your domain objects without polluting them with persistence concerns then that's the approach that requires the less overhead and which I'd go for.
On the other hand, if your ORM is not flexible enough then you could go for a pragmatic hybrid approach where your AR and it's state are two different classes and where the state class is simple enough to easily be mapped. Note that the AR would still be responsible to protect it's state here and the state object wouldn't be accessed directly from outside the AR. The approach is described by Vaughn Vernon here.
Your JPA entities should be the domain entities. Why?
Your domain entities should express some strong constraints, e.g. by
Having parameterized constructors
Not exposing all setters
Do validation on write operations
If possible, a domain entity should always remain a valid business entity.
By introducing some kind of mapper, you introduce a possibility to automagically write arbitrary stuff into your domain entities, which basically renders your constraints useless.
The other option would be enforcing the same constraints on JPA and domain entities which introduces redundancy.
Your best bet is keeping your JPA entities as ORM-agnostic as possible. Using Hibernate, this can be done using a configurating class or XML file. But I am no Java EE/JPA guy, so it's hard for me to give a good implementation advice.
After some more experience with JPA and microservices, I would say that I would most likely not separate them when using JPA, unless there's a reason that makes me do otherwise. On the other hand, entities in a single bounded context do not necessarily have to be only JPA entities. It's possible to have both entities mapped by JPA implementation and entities mapped from DTOs using other technologies (like JSON mappers) or manually.
I agree that both ways are possible. After programming some applications with DDD in mind, I find that this heuristic works well:
If you start from having an entity and not having JPA, it will probably be too hard to refactor an entity so that it can be used by ORM framework, so keep them separate
If you start from scratch, it is worth not distinguishing DDD entities from JPA entities

Exposed domain model in Java microservice architecture

I'm aware that copying entity classes and properties into DTOs is considered anti-pattern, so by Exposed domain model pattern the same #Entity can be used as both database entity class, and DTO for service and MVC layer. (see here https://codereview.stackexchange.com/questions/93511/data-transfer-objects-vs-entities-in-java-rest-server-application)
But suppose we have microservice architecture where the same set of properties is used as entity in one project with persistence, and as DTO in another project which uses the first one as a service. What's the proposed pattern in such a situation?
Because the second project doesn't need #Entity related functionality, and if we put that class in shared library, it will be tied unnecessary to JPA specific APIs and libraries. And the alternative is to again use separate DTO classes anti-pattern.
When your requirements for a DTO model exactly match your entity model you are either in a very early stage of the project or very lucky that you just have a simple model. If your model is very simple, then DTOs won't give you many immediate benefits.
At some point, the requirements for the DTO model and the entity model will diverge though. Imagine you add some audit aspects, statistics or denormalization to your entity/persistence model. That kind of data is usually never exposed via DTOs directly, so you will need to split the models. It is also often the case that the main driver for DTOs is the fact that you don't need all the data all the time. If you display objects in e.g. a dropdown you only need a label and the object id, so why would you load the whole entity state for such a use case?
The fact that you have annotations on your DTO models shouldn't bother you that much, what is the alternative? An XML-like mapping? Manual object wiring?
If your model is used by third parties directly, you could use a subclassing i.e. keep the main model free of annotations and have annotated subclasses in your project that extend the main model.
Since implementing a DTO approach correctly, I created Blaze-Persistence Entity Views which will not only simplify the way you define DTOs, but it will also improve the performance of your queries.
If you are interested, I even have an example for an external model that uses entity view subclasses to keep the main model clean.
Thank you for the answers, but emphasize in the question is on microservice (MS) architecture and reusing defined entity POJOs from one MS in another as POJOs. From what I've read on microservices it's closely related to another question - should MSs share any common functionality and classes at all, or be completely independent? It seems there is no definite agreement on it, and also no definite answer, or widely accepted pattern, to this.
From my recent experience here is what I adopted, and it works well so far.
Have common functionality across MSs - yes, in form of a commons project added as dependency to all MSs, with its dependencies set as optional. Share entity classes (expose them in commons) - no.
The main reason is that entity classes are closely related to data store for particular MS. And as the established rule is that MSs shouldn't share data stores, then it makes sense not to share entity classes for those data stores. It helps MSs to be more independent, and freedom to manage their data store in their own way. It means some more typing to add additional DTO classes and conversion between them, but it's a trade-off worth taking to retain MS independence. Reasons Christian Beikov and Maksim Gumerov mentioned apply as well.
What we do share (put in commons) are some common functionality and helper classes (for cloud, discovery, error handling, rest and json configuration...), and pure DTOs, where T is transfer between MSs (rest entities or message payloads).

Should Entities in Domain Driven Design and Entity Framework be the same?

I have started using Entity Framework Code First for the first time and am impressed by the way in which our greenfield application is being built around the domain rather than around the relational database tables (which is how I have worked for years).
So, we are building entities in C# that are being reflected in the database every time we do a new migration.
My question is this: should these same entities (i.e. designed with Entity Framework in mind) play the same role as entities in Domain Driven Design (i.e. representing the core of the domain)?
Object-Relational Mapping and Domain-Driven Design are two orthogonal concerns.
ORM
An ORM is just here to bridge the gap between the relational data model residing in your database and an object model, any object model.
An Entity as defined by EF concretely means any object that you wish to map some subpart of your relational model to (and from). It turns out that the EF creators wanted to give a business connotation to those by naming them Entities, but in the end nothing forces you that way. You could map to View Models for all it cares.
DDD
From a DDD perspective, there's no such thing as "an Entity designed with EF in mind". A DDD Entity should be persistence ignorant and bear no trace of any ORM. The domain layer has no interest in how, where, whether or when its objects are stored.
Where the two meet
The only point where the two orthogonal concepts intersect is when the object model targeted by your ORM mapping is precisely your domain model. This is possible with what EF calls "Code first" (but should really be named regular ORM), by pointing to your DDD Entities in separate EF mapping files living in a non-domain layer, and refraining from using EF artifacts such as data annotations directly in your Entity classes. This is not possible when using Database First, because the DDD "purity" part of the deal wouldn't be met.
In short, the terms collide, but they should really be conceptually considered as two different things. One is the domain object itself and the other is a pointer that can indicate the same bunch of code, but it could point to pretty much anything else.
They shouldn't be the same as they're designed for different purposes. An ORM entity is a facade for 1 or more tables, its purpose is to simulate OOP on top of relational tables. A Domain Entity is about defining a Domain concept. If your Domain Entity turns out to be just a data structure, then you can reuse it as an EF entity, but that's just one case.
A DDD app never knows about EF or ORM. It only knows about a Repository. Hence, your Domain Objects (DO) don't know either about EF. You can choose to consider them EF entities, as an implementation detail, BUT... you should do that ONLY after your DOs are defined and their use cases implemented. You should defer as much as possible the implementation of persistence (use in-memory repos (lists) for devel).
When you reach that point you'll know if you can reuse your DO for ORM purposes or if you'll need other ways (such as a memento).
Note that a design of a DO while driven by the Domain, it should take into consideration the persistence issue, but it shouldn't be influenced by it i.e don't design your DO according to the db schema. The persistence strategy can be different for each DO and it might involve or not an ORM.
If you're using Event Sourcing for a DO, ORM doesn't exist. Same for serialized objects. It matters a lot how an object will be used by the app (updating and querying), that's why I've said you should defer the persistence implementation. For a lot of DOs you won't need a rdbms (even if you're using it) so an ORM entity will look more like a KeyValuePair (Id => serialized data).
In conclusion, they are different things for different purposes, that might look identical for some cases (CRUD scenarios).
I would say, they can be the same.
Sometimes there is no need to support two models. When you follow code first approach, your entities model your domain, your infrastructure (ORM) separates domain and persistence layers.
It might be reasonable to maintain two models if you have legacy database and have to maintain it.
There are two other SO questions that can be helpful:
Repository pattern and mapping between domain models and Entity Framework
Advice on mapping of entities to domain objects
Well.That's The Approach i use.And I've seen a lot of others doing the same.Now am using The Onion Architecture/Pattern to Create my application and making Everything rely on the domain entities made my life easier.because whenever i want to change for example the Layer that deal with my database ,i can do that without changing the UI layer(ASP.NET MVC app,WPF app...etc)...I suggest doing the same.
let's wait for other posts
I agree with what MikeSW said (3rd Answer).When you design your domain entities,you should do that without caring about who will consume those entities (ORMs or any other technology serving whatever purpose).design them with one idea in mind : they will be reusable and they will not need to be changed in the future (hopefully)

ORM Entities vs. Domain Entities under Entity Framework 6.0

I stumbled upon the following two articles First and Second in which the author states in summary that ORM Entities and Domain Entities shouldn't be mixed up.
I face exactly this problem at the moment as I code with EF 6.0 using the Code First approach. I use the POCO classes as entities in the EF as well as my domain/business objects. But I find myself frequently in the situation where I define a property as public or a navigation property as virtual only because the EF Framework forces me to do so.
I don't know what to take as the bottom line of the two articles? Should I really create for example a CustomerEF class for the entity framework and a CustomerD for my domain. Then create a repository which consumes CustomerD maps it to CustomerEF do some queries and than maps back the received CustomerEF to CustomerD. I thought EF is all about mapping my domain entities to the data.
So please give me some advice. Do I overlook an important thing the EF is able to provide me with? Or is this a problem which can not completely solved by the EF? In the latter case what is a good way to manage this problem?
I agree with the general idea of these posts. An ORM class model is part of a data access layer first and foremost (even if it consists of so-called POCOs). If any conflict of interests arises between persistence and business logic (or any other concern), decisions should always be made in favor of persistence.
However, as software developers we always have to balance between purism and pragmatism. Whether or not to use the persistence model as a domain model depends on a number of factors:
The size/coherence of the development team. When the whole team knows that properties can be public just because of ORM requirements, but should not be set all over the place, it may not be a big deal. If everybody knows (and obeys) that an ID property is not to be used in business logic, having IDs may not be a big deal. A scattered, unexperienced or undisciplined team may need more stringent segregation of code.
The overlap between business logic concerns and persistence concerns. Object oriented design thrives when a class model sticks to SOLID principles. But these principles are not necessarily at odds with persistence concerns. I mean that although the concerns are different, in the end their resultant requirements may be quite similar. For instance, both concerns may require valid object state and correct associations.
There can be use cases, however, in which objects temporarily need to be in a state that absolutely shouldn't be stored. This may be a reason to work with dedicated domain classes. Another reason may be that the entity model just can't fulfill the best segmentation of responsibilities. For instance, a business process "blacklisting customer" may require data that is scattered over so many entity objects that new domain classes must be designed that can encapsulate the data and the methods working on them. In other words: doing this by entities would violate the Tell Don't Ask principle.
The need for layering. For instance, if the data access layer targets different database vendors it may have to consist of interchangeable parts that are vendor-specific (e.g. to account for subtle differences in data types between Oracle and Sql Server or to exploit vendor-specific features). Using the persistence model as domain model would probably bleed vendor-specific implementations into the business logic. That would be really bad. There the data access layer should be precisely that, a layer.
(Very trivial) The amount of data. Creating objects takes time and resources. When "many" objects are involved in a business case it may just be too expensive to build both entity objects and domain objects.
And more, undoubtedly.
So I would always try to be a pragmatist. If entity classes do a decent job, go for it. If the mismatch is too large, create a business domain for appropriate parts of the business logic. I would not slavishly follow a (any) design pattern just because it is a good pattern. Contrary to what is said in the post, it requires a lot of maintenance to map an entity model onto a business model. When you find yourself creating myriads of business classes that are almost identical to entity classes it's time to rethink what you're doing.

Entity Framework as Repository and UnitOfWork?

I'm starting a new project and have decided to try to incorporate DDD patterns and also include Linq to Entities. When I look at the EF's ObjectContext it seems to be performing the functions of both Repository and Unit of Work patterns:
Repository in the sense that the underlying data level interface is abstracted from the entity representation and I can request and save data through the ObjectContext.
Unit Of Work in the sense that I can write all my inserts/updates to the objectContext and execute them all in one shot when I do a SaveChanges().
It seems redundant to put another layer of these patterns on top of the EF ObjectContext? It also seems that the Model classes can be incorporated directly on top of the EF generated entities using 'partial class'.
I'm new at DDD so please let me know if I'm missing something here.
I don't think that the Entity Framework is a good implementation of Repository, because:
The object context is insufficiently abstract to do good unit testing of things which reference it, since it is bound to the DB access. Having an IRepository reference instead works much better for creating unit tests.
When a client has access to the ObjectContext, the client can do pretty much anything it cares to. The only real control you have over this at all is to make certain types or properties private. It is hard to implement good data security this way.
On a non-trivial model, the ObjectContext is insufficiently abstract. You may, for example, have both tables and stored procedures mapped to the same entity type. You don't really want the client to have to distinguish between the two mappings.
On a related note, it is difficult to write comprehensive and well-enforce business rules and entity code. Indeed, whether or not it this is even a good idea is debatable.
On the other hand, once you have an ObjectContext, implementing the Repository pattern is trivial. Indeed, for cases that are not particularly complex, the Repository is something of a wrapper around the ObjectContext and the Entity types.
I would say that you should look at the ObjectContext as your UnitOfWork, and not as a repository.
An ObjectContext cannot be a repository -imho- since it is 'to generic'.
You should create your own Repositories, which have specialized methods (like GetCustomersWithGoldStatus for instance) next to the regular CRUD methods.
So, what I would do, is create repositories (one for each aggregate-root), and let those repositories use the ObjectContext.
I like to have a repository layer for the following reasons:
EF gotcha's
When you look at some of the current tutorials on EF (Code First version), it is apparent there's a number of gotcha's to be handled, particularly around object graphs (entities containing entities) and disconnected scenarios. I think a repository layer is great for wrapping these up in one place.
A clear picture of data access mechanisms
A repository gives a specific picture as to how the BL is accessing and updating the data store. It exposes methods that have a clear single purpose, and can be tested independently of the BL. Standard example from the textbooks, Find() to find a single entity. A more application specific example, Clear() to clear down a db table.
A place for optimizations
Inevitably you come up against performance hits when using vanilla EF. I use the repository to hide the optimization mechanisms from the BL.
Examples,
GetKeys() to project cached keys from the tables (for Insert/Update decisions). The reading of key only is faster and uses less memory than reading the full entity.
Bulk load via SqlBulkCopy. EF will insert by individual SQL statements. If you want a single statement to insert multiple rows, SqlBulkCopy is a good mechanism. The repository encapsulates this and provides metadata for SqlBulkCopy. As well as the Insert method, you need a StartBatch() and EndBatch() method, which is also an argument for a UnitOfWork layer.