Design entity model by Inheritance versus Association for Entity Framework

I'm using EF 6 Code First, and I'm facing a design decision:
The entity model
I have an abstract entity which is realized by several concrete ones, each of which adds several attributes to the entity. I have polymorphic associations (the base class has several related entities), so when mapping this in EF, it makes a clear case for TPT inheritance. (More info here for readers: http://weblogs.asp.net/manavi/archive/2010/12/28/inheritance-mapping-strategies-with-entity-framework-code-first-ctp5-part-2-table-per-type-tpt.aspx)
The issue with TPT design
What concerns me is that I have several cases where I only need to process data on the base class. To be specific, I have several scheduled batch processes that need to load the entities and perform calculations and updates.
With TPT inheritance, loading the base type makes EF JOIN all the sub-type tables, which is unnecessary in my case.
Second design option
My second option is to create separate concrete entities for parent and child types, and forget about the inheritance altogether.
From the SQL point of view, these designs will be equivalent. From the object-oriented point of view, this option is less correct. Plus, reaching child entities from the parent is not clean enough.
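To make the SQL side of the comparison concrete, here is a minimal sketch using Python's built-in `sqlite3`. It is not what EF generates; the table and column names (`item`, `book`, `video`) are hypothetical, and it only illustrates the shared schema (one base table plus one table per concrete type) and the difference between a base-only batch update and loading a full sub-type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Base table: holds the attributes common to every entity.
    CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT NOT NULL, amount REAL NOT NULL);
    -- One table per concrete type; its primary key is also a foreign key to item.
    CREATE TABLE book  (id INTEGER PRIMARY KEY REFERENCES item(id), pages INTEGER NOT NULL);
    CREATE TABLE video (id INTEGER PRIMARY KEY REFERENCES item(id), minutes INTEGER NOT NULL);
    INSERT INTO item VALUES (1, 'Novel', 10.0), (2, 'Film', 20.0);
    INSERT INTO book VALUES (1, 300);
    INSERT INTO video VALUES (2, 95);
""")

# Batch process: touches the base table only, so no JOIN is needed.
conn.execute("UPDATE item SET amount = amount * 2")
base_rows = conn.execute("SELECT id, name, amount FROM item ORDER BY id").fetchall()

# Loading one complete sub-type needs a JOIN to the base table in either design.
book_rows = conn.execute(
    "SELECT i.id, i.name, i.amount, b.pages FROM item i JOIN book b ON b.id = i.id"
).fetchall()
```

With separate associated entities, the batch job can be written against the base table alone; the cost is that the link from a base row to its sub-type row has to be followed explicitly.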
When to use which design?
Personally, I'm leaning toward the second design (associations) because most of my cases would be easier and simpler. (Less logic handled by the ORM tool means more control and simplicity).
My question is when to use which design.

Related

Should JPA entities and DDD entities be the same classes?

There are classes that are entities according to DDD, and there are classes that have the @javax.persistence.Entity annotation. Should they be the same classes? Or should JPA entities act just as a mechanism for a mapper (https://martinfowler.com/eaaCatalog/dataMapper.html) to load DDD entities from a database (and store them) and be kept outside the domain model?
Would it make a difference if database metadata were separated and stored externally (for example, in XML)? If such classes are entities, where is the boundary? I think classes generated from XSD (for example, with JAXB) or even from database with MyBatis Generator are not entities as understood in DDD.

That's an implementation detail, really. They could be the same or not, depending on the flexibility of your ORM. For instance, if your ORM lets you map your domain objects without polluting them with persistence concerns, then that's the approach that requires the least overhead and the one I'd go for.
On the other hand, if your ORM is not flexible enough, you could go for a pragmatic hybrid approach where your AR (aggregate root) and its state are two different classes and where the state class is simple enough to be mapped easily. Note that the AR would still be responsible for protecting its state here, and the state object wouldn't be accessed directly from outside the AR. The approach is described by Vaughn Vernon here.
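A minimal sketch of that hybrid split, assuming a hypothetical `Order` aggregate root and `OrderState` state class (neither name comes from the answer or from Vaughn Vernon's article): the state class is a plain data holder that a mapper could fill, while the aggregate root is the only place where the state is changed.

```python
from dataclasses import dataclass


@dataclass
class OrderState:
    """Plain data holder; simple enough for a mapper to fill."""
    order_id: int
    total: float = 0.0
    closed: bool = False


class Order:
    """Aggregate root: owns its state and guards every change to it."""

    def __init__(self, state: OrderState):
        self._state = state  # private by convention; callers go through the methods

    def add_line(self, price: float) -> None:
        if self._state.closed:
            raise ValueError("cannot add lines to a closed order")
        if price <= 0:
            raise ValueError("price must be positive")
        self._state.total += price

    def close(self) -> None:
        self._state.closed = True

    @property
    def total(self) -> float:
        return self._state.total


order = Order(OrderState(order_id=1))
order.add_line(25.0)
order.close()
```

Only the repository would create or read the `OrderState` object; the rest of the application works with `Order`.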

Your JPA entities should be the domain entities. Why?
Your domain entities should express some strong constraints, e.g. by
Having parameterized constructors
Not exposing all setters
Validating on write operations
If possible, a domain entity should always remain a valid business entity.
By introducing some kind of mapper, you introduce a possibility to automagically write arbitrary stuff into your domain entities, which basically renders your constraints useless.
The other option would be enforcing the same constraints on JPA and domain entities which introduces redundancy.
Your best bet is keeping your JPA entities as ORM-agnostic as possible. Using Hibernate, this can be done with a configuration class or an XML file. But I am no Java EE/JPA guy, so it's hard for me to give good implementation advice.
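The three constraints listed above (parameterized constructor, no public setters for everything, validation on writes) can be sketched in a language-neutral way. The `Customer` class and its email rule below are hypothetical and deliberately simplistic; the sketch only shows an entity that cannot be put into an invalid state from outside.

```python
class Customer:
    """Hypothetical domain entity that stays valid after every operation."""

    def __init__(self, customer_id: int, email: str):
        # Parameterized constructor: no way to create an "empty" customer.
        self._id = customer_id
        self._email = self._validated(email)

    @staticmethod
    def _validated(email: str) -> str:
        if "@" not in email:
            raise ValueError(f"invalid email: {email!r}")
        return email

    @property
    def id(self) -> int:  # read-only: no setter is exposed
        return self._id

    @property
    def email(self) -> str:
        return self._email

    def change_email(self, new_email: str) -> None:
        # Every write goes through validation.
        self._email = self._validated(new_email)


customer = Customer(1, "a@example.com")
customer.change_email("b@example.com")
```

A mapper that writes fields directly would bypass `_validated`, which is the risk the answer points out.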

After some more experience with JPA and microservices, I would say that I would most likely not separate them when using JPA, unless there's a reason that makes me do otherwise. On the other hand, entities in a single bounded context do not necessarily have to be only JPA entities. It's possible to have both entities mapped by JPA implementation and entities mapped from DTOs using other technologies (like JSON mappers) or manually.

I agree that both ways are possible. After programming some applications with DDD in mind, I find that this heuristic works well:
If you start with an existing entity and no JPA, it will probably be too hard to refactor the entity so that it can be used by an ORM framework, so keep them separate.
If you start from scratch, it is worth not distinguishing DDD entities from JPA entities.

Should Entities in Domain Driven Design and Entity Framework be the same?

I have started using Entity Framework Code First for the first time and am impressed by the way in which our greenfield application is being built around the domain rather than around the relational database tables (which is how I have worked for years).
So, we are building entities in C# that are being reflected in the database every time we do a new migration.
My question is this: should these same entities (i.e. designed with Entity Framework in mind) play the same role as entities in Domain Driven Design (i.e. representing the core of the domain)?

Object-Relational Mapping and Domain-Driven Design are two orthogonal concerns.
ORM
An ORM is just here to bridge the gap between the relational data model residing in your database and an object model, any object model.
An Entity as defined by EF concretely means any object that you wish to map some subpart of your relational model to (and from). It turns out that the EF creators wanted to give a business connotation to those by naming them Entities, but in the end nothing forces you that way. You could map to View Models for all it cares.
DDD
From a DDD perspective, there's no such thing as "an Entity designed with EF in mind". A DDD Entity should be persistence ignorant and bear no trace of any ORM. The domain layer has no interest in how, where, whether or when its objects are stored.
Where the two meet
The only point where the two orthogonal concepts intersect is when the object model targeted by your ORM mapping is precisely your domain model. This is possible with what EF calls "Code first" (but should really be named regular ORM), by pointing to your DDD Entities in separate EF mapping files living in a non-domain layer, and refraining from using EF artifacts such as data annotations directly in your Entity classes. This is not possible when using Database First, because the DDD "purity" part of the deal wouldn't be met.
In short, the terms collide, but they should really be conceptually considered as two different things. One is the domain object itself and the other is a pointer that can indicate the same bunch of code, but it could point to pretty much anything else.

They shouldn't be the same as they're designed for different purposes. An ORM entity is a facade for 1 or more tables, its purpose is to simulate OOP on top of relational tables. A Domain Entity is about defining a Domain concept. If your Domain Entity turns out to be just a data structure, then you can reuse it as an EF entity, but that's just one case.
A DDD app never knows about EF or ORM. It only knows about a Repository. Hence, your Domain Objects (DO) don't know about EF either. You can choose to consider them EF entities, as an implementation detail, BUT... you should do that ONLY after your DOs are defined and their use cases implemented. You should defer the implementation of persistence as much as possible (use in-memory repos (lists) for development).
When you reach that point you'll know if you can reuse your DO for ORM purposes or if you'll need other ways (such as a memento).
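As a rough sketch of deferring persistence with an in-memory repository: the `Account` domain object, the `AccountRepository` interface, and the `deposit` use case below are all hypothetical names. The use case depends only on the repository interface, so an ORM-backed implementation could replace the dictionary later without touching the domain code.

```python
from typing import Dict, Optional, Protocol


class Account:
    """Hypothetical domain object with one business rule."""

    def __init__(self, account_id: int, balance: float = 0.0):
        self.account_id = account_id
        self.balance = balance

    def deposit(self, amount: float) -> None:
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount


class AccountRepository(Protocol):
    """What the application knows about persistence: nothing more than this."""

    def get(self, account_id: int) -> Optional[Account]: ...
    def save(self, account: Account) -> None: ...


class InMemoryAccountRepository:
    """Development-time implementation backed by a dictionary."""

    def __init__(self) -> None:
        self._items: Dict[int, Account] = {}

    def get(self, account_id: int) -> Optional[Account]:
        return self._items.get(account_id)

    def save(self, account: Account) -> None:
        self._items[account.account_id] = account


def deposit(repo: AccountRepository, account_id: int, amount: float) -> None:
    """Use case written against the repository interface only."""
    account = repo.get(account_id)
    if account is None:
        raise KeyError(account_id)
    account.deposit(amount)
    repo.save(account)


repo = InMemoryAccountRepository()
repo.save(Account(1))
deposit(repo, 1, 50.0)
```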
Note that the design of a DO, while driven by the Domain, should take the persistence issue into consideration but shouldn't be influenced by it, i.e. don't design your DO according to the db schema. The persistence strategy can be different for each DO and it may or may not involve an ORM.
If you're using Event Sourcing for a DO, ORM doesn't exist. Same for serialized objects. It matters a lot how an object will be used by the app (updating and querying); that's why I've said you should defer the persistence implementation. For a lot of DOs you won't need an RDBMS (even if you're using one), so an ORM entity will look more like a KeyValuePair (Id => serialized data).
In conclusion, they are different things for different purposes, that might look identical for some cases (CRUD scenarios).

I would say, they can be the same.
Sometimes there is no need to support two models. When you follow the code-first approach, your entities model your domain, and your infrastructure (ORM) separates the domain and persistence layers.
It might be reasonable to maintain two models if you have a legacy database and have to maintain it.
There are two other SO questions that can be helpful:
Repository pattern and mapping between domain models and Entity Framework
Advice on mapping of entities to domain objects

Well, that's the approach I use, and I've seen a lot of others doing the same. I am now using the Onion Architecture pattern to build my application, and making everything rely on the domain entities has made my life easier: whenever I want to change, for example, the layer that deals with my database, I can do that without changing the UI layer (ASP.NET MVC app, WPF app, etc.). I suggest doing the same.
Let's wait for other posts.
I agree with what MikeSW said (3rd answer). When you design your domain entities, you should do so without caring about who will consume them (ORMs or any other technology serving whatever purpose). Design them with one idea in mind: they will be reusable and will not need to be changed in the future (hopefully).

ORM Entities vs. Domain Entities under Entity Framework 6.0

I stumbled upon the following two articles First and Second in which the author states in summary that ORM Entities and Domain Entities shouldn't be mixed up.
I face exactly this problem at the moment as I code with EF 6.0 using the Code First approach. I use the POCO classes as entities in the EF as well as my domain/business objects. But I find myself frequently in the situation where I define a property as public or a navigation property as virtual only because the EF Framework forces me to do so.
I don't know what to take as the bottom line of the two articles. Should I really create, for example, a CustomerEF class for the Entity Framework and a CustomerD class for my domain, and then create a repository which consumes CustomerD, maps it to CustomerEF, does some queries and then maps the received CustomerEF back to CustomerD? I thought EF was all about mapping my domain entities to the data.
So please give me some advice. Am I overlooking something important that EF is able to provide? Or is this a problem which cannot be completely solved by EF? In the latter case, what is a good way to manage this problem?

I agree with the general idea of these posts. An ORM class model is part of a data access layer first and foremost (even if it consists of so-called POCOs). If any conflict of interests arises between persistence and business logic (or any other concern), decisions should always be made in favor of persistence.
However, as software developers we always have to balance between purism and pragmatism. Whether or not to use the persistence model as a domain model depends on a number of factors:
The size/coherence of the development team. When the whole team knows that properties can be public just because of ORM requirements, but should not be set all over the place, it may not be a big deal. If everybody knows (and obeys) that an ID property is not to be used in business logic, having IDs may not be a big deal. A scattered, inexperienced or undisciplined team may need more stringent segregation of code.
The overlap between business logic concerns and persistence concerns. Object oriented design thrives when a class model sticks to SOLID principles. But these principles are not necessarily at odds with persistence concerns. I mean that although the concerns are different, in the end their resultant requirements may be quite similar. For instance, both concerns may require valid object state and correct associations.
There can be use cases, however, in which objects temporarily need to be in a state that absolutely shouldn't be stored. This may be a reason to work with dedicated domain classes. Another reason may be that the entity model just can't fulfill the best segmentation of responsibilities. For instance, a business process "blacklisting customer" may require data that is scattered over so many entity objects that new domain classes must be designed that can encapsulate the data and the methods working on them. In other words: doing this by entities would violate the Tell Don't Ask principle.
The need for layering. For instance, if the data access layer targets different database vendors it may have to consist of interchangeable parts that are vendor-specific (e.g. to account for subtle differences in data types between Oracle and SQL Server or to exploit vendor-specific features). Using the persistence model as domain model would probably bleed vendor-specific implementations into the business logic. That would be really bad. There, the data access layer should be precisely that: a layer.
(Very trivial) The amount of data. Creating objects takes time and resources. When "many" objects are involved in a business case it may just be too expensive to build both entity objects and domain objects.
And more, undoubtedly.
So I would always try to be a pragmatist. If entity classes do a decent job, go for it. If the mismatch is too large, create a business domain for appropriate parts of the business logic. I would not slavishly follow a (any) design pattern just because it is a good pattern. Contrary to what is said in the post, it requires a lot of maintenance to map an entity model onto a business model. When you find yourself creating myriads of business classes that are almost identical to entity classes it's time to rethink what you're doing.
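To give a feel for the mapping cost, here is a minimal sketch of the two-model setup described in the question. The class names `CustomerEF` and `CustomerD` come from the question; the fields, the `blacklist` method and the two mapping functions are hypothetical. Every property added to one class has to be added to the other and to both mapping functions.

```python
from dataclasses import dataclass


@dataclass
class CustomerEF:
    """Persistence-side record: public fields, shaped like the table."""
    id: int
    name: str
    blacklisted: int  # stored as 0/1


class CustomerD:
    """Domain-side class: behaviour plus guarded state."""

    def __init__(self, customer_id: int, name: str, blacklisted: bool = False):
        self.customer_id = customer_id
        self.name = name
        self._blacklisted = blacklisted

    @property
    def blacklisted(self) -> bool:
        return self._blacklisted

    def blacklist(self) -> None:
        self._blacklisted = True


def to_domain(row: CustomerEF) -> CustomerD:
    """Map a persistence record to a domain object."""
    return CustomerD(row.id, row.name, bool(row.blacklisted))


def to_record(customer: CustomerD) -> CustomerEF:
    """Map a domain object back to a persistence record."""
    return CustomerEF(customer.customer_id, customer.name, int(customer.blacklisted))


stored = CustomerEF(id=7, name="Ada", blacklisted=0)
customer = to_domain(stored)
customer.blacklist()
updated = to_record(customer)
```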

Any decent resources on how to map complex POCO objects in EF 4.1?

So I heard L2S is going the way of the dodo bird. I am also finding out that if I use L2S, I will have to write multiple versions of the same code to target different schemas even if they vary slightly. I originally chose L2S because it was reliable and easy to learn, while EF 3 wasn't ready for public consumption at the time.
After reading lots of praise for EF 4.1, I thought I would do a feasibility test. I discovered that EF 4.1 is a beast to get your head around. It is mind-numbingly complex, with hundreds of ways of doing the same thing. It seems to work fine if you're planning on using simple table-to-object mapped entities, but complex POCO object mapping has been a real PITA. There are no good tutorials and the few that exist are very rudimentary.
There are tons of blogs about learning the fundamentals about EF 4.1, but I have a feeling that they deliberately avoid advanced topics. Are there any good tutorials on more complex mapping scenarios? For instance, taking an existing POCO object and mapping it across several tables, or persisting a POCO object that is composed of other POCO objects? I keep hearing this is possible, but haven't found any examples.

Disclaimer: IMO EF 4.1 is best known for its Code-First approach. Most of the following links point to articles about doing stuff in code-first style. I'm not very familiar with DB-First or Model-First approaches.
I have learned many things from Mr. Manavi's blog. In particular, the Inheritance with code-first series was full of new material for me. This MSDN link has some valuable links and information about different mapping scenarios too. I have also learned many things by following or answering questions with entity-framework tags here on SO.
Whenever I want to try some new complex object mapping, I do my best (based on my knowledge of EF) to create the correct mappings. Sometimes, however, you hit a dead end. That's why god created StackOverflow. :)

What do you mean by EFv4.1? Do you mean the overhyped code-first / fluent API? In that case, live with the fact that it is mostly for simple mapping scenarios. It offers more than L2S but still very little in terms of advanced mappings.
The basic mapping available in EF follows a basic rule: one table = one entity. An entity can be a single class or a composition of the main class representing the entity itself and helper classes for sets of mapped fields (complex types).
The most advanced features you will get with EF fluent-API or designer are:
TPH inheritance - multiple types in an inheritance hierarchy are mapped to the same table. Types are distinguished by a special column called the discriminator. Shared fields must be in the parent class.
TPT inheritance - each type is mapped to a separate table: the base type has one table and each derived type has one table as well. Shared fields must be defined in the base type and thus in the base table. The relation between the base and derived tables is one-to-one. Derived entities span multiple tables.
TPC inheritance - each concrete class has a separate table: shared fields must be defined in the base type, but each derived type has them in its own table.
Entity splitting - an entity is split into two or more tables which are related one-to-one. All parts of the entity must exist.
Table splitting - a table is split into two or more entities related one-to-one.
Designer also offers
Conditional mapping - this is not real mapping. It is only hardcoded filter on mapping level where you select one or more fields to restrict records which are allowed for loading.
Whether you use the basic or the more advanced features, a table can participate in only one mapping.
All these mapping techniques follow very strict rules. Your classes and tables must follow these rules to make them work. That means you cannot take arbitrary POCO and map it to multiple tables without satisfying those rules.
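As an illustration of the TPH rule described above, here is a minimal sketch of the table layout using Python's built-in `sqlite3`. The `vehicle` table, its columns and the discriminator values are hypothetical and are not what EF would necessarily generate; the point is only that all types share one table, a discriminator column selects the type, and columns belonging to other types stay NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicle (
        id            INTEGER PRIMARY KEY,
        discriminator TEXT NOT NULL,   -- tells which type a row belongs to
        name          TEXT NOT NULL,   -- shared field, defined on the base type
        doors         INTEGER,         -- only meaningful for 'Car' rows
        has_sidecar   INTEGER          -- only meaningful for 'Motorcycle' rows
    );
    INSERT INTO vehicle VALUES (1, 'Car', 'Sedan', 4, NULL);
    INSERT INTO vehicle VALUES (2, 'Motorcycle', 'Cruiser', NULL, 0);
""")

# Loading one derived type is a filter on the discriminator, with no JOIN.
cars = conn.execute(
    "SELECT id, name, doors FROM vehicle WHERE discriminator = 'Car'"
).fetchall()

# Loading the base type reads the same table without any filter.
all_names = [row[0] for row in conn.execute("SELECT name FROM vehicle ORDER BY id")]
```

Under TPT, `doors` and `has_sidecar` would instead live in their own tables keyed by the same `id`, and loading a derived type would need a JOIN to the base table.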
These rules can be avoided only by using EDMX with an advanced approach and advanced skills: no fluent API and no designer, but manual modification of the XML that defines the EDMX. Once you go this way you can use
Defining query - a custom SQL query used to specify loading of a new "entity". This is also the approach natively used by EDMX and the designer when mapping a database view.
Query view - a custom ESQL query used to specify a new "entity" from already mapped entities. It is more suitable for predefined projections because, in contrast to a defining query, it has some limitations (for example, aggregations are not allowed).
Both of these features allow you to define classes combined from multiple tables. The disadvantage of both mapping techniques is that the mapped result is read-only. You must use stored procedures for persisting changes when using these techniques.

Core Data entity inheritance --> limitations?

I thought I'd post this to the community. I am using Core Data, and have two entities. Both entities have a hierarchical relationship. I am noticing quite a lot of duplicated functionality now, and am wondering if I should restructure to have a base entity which is abstract (HierarchicalObject), and make my entities inherit from it.
So the question is are there some limitations of this inheritance that I should take into account? Reading some of the posts out there, I see a few trade-offs, let me know if my assumptions are correct.
(Good) clean up structure, keep the HierarchicalObject functionality in one spot.
(Ok) With inheritance, both objects now end up in the same SQLite table (I am using SQLite as the backend). So if the number of objects grows, searching/sorting could take longer? Not sure if this is a huge deal, as the number of objects in my case should stay pretty static.
(not so good) With inheritance, the relationship could get more complicated? (http://www.cocoadev.com/index.pl?CoreDataInheritanceIssues)
Are there other things to take into account?
Thanks for your comments.

I think it's a mistake to draw too close a parallel between entities and classes. While very similar, they do have some important differences.
The most important difference is that entities don't have code like a class would, so when you have entities with duplicate attributes, you're not adding a lot of extra coding and potential for introducing bugs.
A lot of people believe that class inheritance must parallel entity inheritance. It does not. As long as a class descends from NSManagedObject and responds to the right key-value messages for the entity it represents, the class can have many merry adventures in its inheritance that are not reflected in the entity's inheritance. E.g. it's fairly common to create a custom base class right below NSManagedObject and then have all the subsequent managed object subclasses inherit from that regardless of their entities.
I think the only time that entity inheritance is absolutely required is when you need different entities to show up in the same relationship. E.g.:
Owner{
vehical<-->Vehical.owner
}
Vehical(abstract){
owner<-->Owner.vehical
}
Motocycle:Vehical{
}
Car:Vehical{
}
Now Owner.vehical can hold either a Motocycle object or a Car object. Note that the managed object class inheritance for Motocycle and Car doesn't have to be the same. You could have something like Motocycle:TwoWheeled:NSManagedObject and Car:FourWheeled:NSManagedObject and everything would work fine.
In the end, entities are just instructions to the context that tell it how the object graph fits together. As long as your entity arrangement makes that happen, you have a lot of flexibility in the design details, quite a bit more than you would have in an analogous situation with classes.

I thought it would be useful to mention that the Notes app on iOS 10 uses inheritance in its Core Data model. It uses a base entity, SyncingObject, that has 7 sub-entities including Note and Folder. And as you mentioned, all of these are stored in the same SQLite table, which has a whopping 106 columns, and since the columns are shared among all entities, most are NULL. They also implemented the folder-notes one-to-many relation as a many-to-many, which creates a pivot table and might be a workaround for an inheritance problem.
There are a couple of advantages to using entity inheritance that likely outweigh these storage limitations. For example, a unique constraint can be unique across entities. And a fetch request for a parent entity can return multiple child entities making UI that uses fetched results controller simpler, e.g. grouping by accounts or folders in a sidebar. Notes uses this to show an "All Notes" row above the Folder rows which is actually backed by an Account.

I have had issues in the past with data migration of models that had inheritance - you may want to experiment with that and see if you can get it to work.
As you noted also, all objects go in one table.
However, as Core Data is managing an object graph, it is really nice to keep the structure the way you would naturally have it just modeling objects - which includes inheritance. There's a lot to be said for keeping the model sane so that you have to do less work in maintaining code.
I have personally used a fairly complex CD model with inheritance in one of my own apps, and it has worked out OK (apart from as I said having issues with data migration, but that has been so flakey for me in general I do not rely on that working any longer).
