EF Table-per-hierarchy mapping - entity-framework

In trying to normalize a database schema and mapping it in Entity Framework, I've found that there might end up being a bunch of lookup tables. They would end up only containing key and value pairs. I'd like to consolidate them into one table that basically has two columns "Key" and "Value". For example, I'd like to be able to get Addresses.AddressType and Person.Gender to both point to the same table, but ensure that the navigation properties only return the rows applicable to the appropriate entity.
EDIT: Oops. I just realized that I left this paragraph out:
It seems like a TPH type of problem, but all of the reading I've done indicates that you start with fields in the parent entity and migrate fields over to the inherited children. I don't have any fields to move here because there would generally only be two.
There are a lot of domain-specific key-value pairs that need to be represented. Some of them will change from time to time, others will not. Rather than pick and choose, I want to just make everything editable. Due to the number of these kinds of properties that are going to be used, I'd rather not have to maintain a list of enums that require a recompile, or end up with lots of lookup tables. So, I thought that this might be a solution.
Is there a way to represent this kind of structure in EF4? Or, am I barking up the wrong tree?
EDIT: I guess another option would be to build the table structure I want at the database level and then write views on top of that and surface those as EF entities. It just means any maintenance needs to be done at multiple levels. Does that sound more or less desirable than a pure EF solution?

Table per hierarchy requires one parent entity that serves as the base class for the child entities. All entities are mapped to the same table, and a special discriminator column distinguishes the type of entity stored in each database record. You can generally use it even if your child entities do not define any new properties. You will also have to define a primary key for your table; otherwise EF will handle it as a read-only entity. So your table can look like:
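CREATE TABLE KeyValuePairs
(
    Id INT NOT NULL IDENTITY(1,1) PRIMARY KEY,  -- without a key EF treats the table as read-only
    [Key] VARCHAR(50) NOT NULL,                 -- KEY is a reserved word in T-SQL, so it is bracketed
    Value NVARCHAR(255) NOT NULL,
    Discriminator VARCHAR(10) NOT NULL,         -- tells EF which entity type the row holds
    [Timestamp] TIMESTAMP NOT NULL              -- row version used for concurrency checks
)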
You will define your top-level KeyValuePair entity with the properties Id, Key, Value and Timestamp (with concurrency mode set to Fixed). The Discriminator column will be used for inheritance mapping.
Be aware that EF mapping is static. If you define AddressType and Gender entities you will be able to use them, but you will not be able to dynamically define a new type such as PhoneType. That will always require modifying your EF model, recompiling and redeploying your application.
From an OOP perspective it would be nicer not to model this as an object hierarchy and instead use conditional mapping of multiple unrelated entities to the same table. Unfortunately, even though EF supports conditional mapping, I have never been able to map two entities to the same table.
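As a rough illustration of the table-per-hierarchy mapping described above, here is a minimal sketch. It assumes the Code First API (EF 4.1 or later, EntityFramework package) rather than the EDMX designer, and the names LookupItem, LookupContext and the discriminator values "Address" and "Gender" are made up for the example.

```csharp
using System.ComponentModel.DataAnnotations;
using System.Data.Entity;

// Hypothetical base type: holds every column except the discriminator.
public abstract class LookupItem
{
    public int Id { get; set; }
    public string Key { get; set; }
    public string Value { get; set; }

    [Timestamp] // maps to the rowversion column and is used as a concurrency token
    public byte[] Timestamp { get; set; }
}

// Derived types add no properties; they only give EF something to filter on.
public class AddressType : LookupItem { }
public class Gender : LookupItem { }

public class LookupContext : DbContext
{
    public DbSet<LookupItem> LookupItems { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // All types share the KeyValuePairs table; the Discriminator column
        // decides which CLR type a row materializes as.
        modelBuilder.Entity<LookupItem>()
            .ToTable("KeyValuePairs")
            .Map<AddressType>(m => m.Requires("Discriminator").HasValue("Address"))
            .Map<Gender>(m => m.Requires("Discriminator").HasValue("Gender"));
    }
}
```

With a mapping like this, a query such as context.LookupItems.OfType<Gender>() should return only the rows whose discriminator is "Gender". Adding a PhoneType would still mean adding a class and a Map call, which is the static-mapping limitation mentioned above.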

Related

mapping generalization constraints to sql (STI approach)

I'm trying to model the following relationships between entities, mainly consisting of a partial, disjoint generalization.
original EERD
'mapped' to relational
Since I didn't need the subclasses to have any particular attributes I decided to use the "single table inheritance" approach, added the "type" field and moved the relationships towards the parent.
After that I had two choices to make:
1- the type for the "business type" attribute
2- a way to constrain participation to at most one of the 4 relationships based on the type attribute
For the sake of portability and extensibility I decided to implement no.1 as a lookup table (rather than enum or a hardcoded check).
About no.2 I figured the best way to enforce participation and exclusivity constraints on the four relationships would be a trigger.
The problem is that now I'm not really sure how to write a trigger function; for instance it would have to reference values inserted into business type, so I'd have to make sure those can't be changed or deleted in the future.
I feel like I'm doing something wrong so I wanted to ask for feedback before going further; is this a suitable approach in your opinion?
I found an interesting article describing a possible solution: it feels a bit like a 'hack' but it should work
(it's intended for SQL Server, but it can be easily applied in postgres too).
EDIT:
It consists of adding a type field to the parent table, and then having every child table reference that field together with the parent's id through a foreign key constraint (a UNIQUE constraint on this pair of fields has to be added beforehand, since a foreign key must reference a unique key).
Now, in order to force the type field to match the table it belongs to, one adds a check constraint or an always-generated value ensuring that the type column always has the same value
(eg: CHECK(Business_type_id = 1) in the Husbandry table, where 1 represents 'husbandry' in the type lookup table).
The only issue is that it requires a whole column in every subclass table, each containing the same generated value repeated over and over (a waste of space?), and it may fall apart as soon as the IDs in the lookup table are modified.

How to create relationships between entities with existing database that does not contain foreign keys

Using Entity Framework Core 2.0
Stuck with company's production database which has primary keys defined for each table but no foreign keys defined for any relationships.
Dependent records in the database have id fields which are intended to relate to the primary key fields of the parent record like you would normally find with a foreign key relationship/constraint. But these fields were all created as INT NOT NULL and are using a SQL default of '0'.
As a result dependent records have been inserted over time without requiring that a related parent record be specified.
Initially I defined my models in EF with integers and used a fluent configuration to specify "IsRequired". This was done so I could run migrations to create a test database for comparison against the production database to verify that my code first was correctly coded.
This then led to a problem when using "Include" in my Linq queries: it performs an inner join, which drops the records that contain 0 in the id fields of the dependent record.
The only way that I have found to make this work is to model all of the id fields in the dependent entity as nullable integers and remove the "IsRequired" from the fluent configuration.
When using the "Include" it performs a left outer join keeping all of the dependent entities. This also means that any reference properties on the included entities are set to null instead of an empty string. This part can probably be fixed fairly easily.
The downside is if I wanted to use migrations to create a database now, all id fields in the dependent records would be created as NULL.
Is there anyone who has run up against this type of situation? Does anyone have any suggestions to try other than the approach I am using?
I haven't dealt with this scenario before but I wonder if you can solve it by defining the FK property as Nullable and then in the migrations, after the migration is created, edit it to add a HasDefaultValue property to ensure that it's 0? (doc for that migration method: https://learn.microsoft.com/en-us/ef/core/modeling/relational/default-values)
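The join behaviour described in the question can be reproduced without a database. The sketch below uses LINQ to Objects as a stand-in for the SQL that EF Core generates, so it only illustrates inner versus left outer join semantics; the Parent and Child types and their data are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Parent { public int Id { get; set; } public string Name { get; set; } }
public class Child { public int Id { get; set; } public int ParentId { get; set; } }

public static class JoinDemo
{
    public static readonly List<Parent> Parents = new List<Parent>
    {
        new Parent { Id = 1, Name = "Acme" }
    };

    public static readonly List<Child> Children = new List<Child>
    {
        new Child { Id = 10, ParentId = 1 },
        new Child { Id = 11, ParentId = 0 } // legacy row: default 0, no matching parent
    };

    // Inner join (what a required relationship produces): child 11 disappears.
    public static List<string> InnerJoin()
    {
        return (from c in Children
                join p in Parents on c.ParentId equals p.Id
                select c.Id + ":" + p.Name).ToList();
    }

    // Left outer join (what an optional relationship produces): child 11 is kept,
    // and the missing parent name is replaced by an empty string.
    public static List<string> LeftJoin()
    {
        return (from c in Children
                join p in Parents on c.ParentId equals p.Id into matches
                from p in matches.DefaultIfEmpty()
                select c.Id + ":" + (p == null ? "" : p.Name)).ToList();
    }
}
```

Here InnerJoin() yields only "10:Acme", while LeftJoin() yields "10:Acme" and "11:". The null check in the projection is one way to get an empty string instead of null for the missing parent.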

Entity Framework: Doing JOINs without having to create Entities

Just starting out with Entity Framework (Code First) and I have to say I am having a lot of problems with it when loading SQL data that is fairly complex. For example, let's say I have the following tables, which store which animals belong to which regions in the world, and the animals are also categorized.
Table: Region
Id: integer
Name: string
Table: AnimalCategory
Id: integer
Name: string
RegionId: integer -- Refers back to Region
Table: Animal
Id: integer
AnimalCategoryId: integer -- Refers back to AnimalCategory
Let's say I want to create a query with Entity Framework that would load all Animals for a specific region. The easiest thing to do is to create 3 Entities Region, AnimalCategory, and Animal and use LINQ to load the data.
But let's say I am not interested in loading any AnimalCategory information and do not want to define an Entity class just to represent AnimalCategory so that I can do the JOIN. How can I do this with Entity Framework? Even with many of its Mapping functions I still don't think this is possible.
In non Entity Framework solutions this is easy to accomplish by using INNER JOINs in SPs or inline SQL. So what are my options in Entity Framework? Shall I pollute my data model with these useless tables just so I can do a JOIN?
It's a matter of choice, I guess. EF chose to support many-to-many associations with transparent junction tables, i.e. where junction tables only have two foreign keys to the associated entities. They simply didn't choose to support this far less common "skipping one-to-many-to-many" scenario in a similar manner.
And I can imagine why.
To start with, in a many-to-many association, the junction table is nothing but that: a junction, an association. However, in a chain of one-to-many (or many-to-one) associations it would be exceptional for any of the involved tables to be just an association. In your example...
Animal → AnimalCategory → Region
...AnimalCategory would only have a primary key (Id) and a foreign key (RegionId). That would be useless though: Animal might just as well have a RegionId itself. There's no reason to support a data model that doesn't make sense.
What you're after though, is a model in which the table in the middle does carry information (AnimalCategory.Name), but where you'd like to map it as a transparent junction table, because a particular class model doesn't need this information.
Your focus seems to be on reading data. But EF has to support all CRUD actions. The problem here would be: how to deal with inserts? Suppose Name is a required field. There would be no way to supply its value.
Another problem would be that a statement like...
region.Animals.Add(animal);
...could mean two things:
Add an Animal and a new AnimalCategory, the latter referring to the Region.
Add an Animal referring to an existing AnimalCategory, without being able to choose which one.
EF wouldn't want to pick some default behavior here. You'd have to make the choice yourself, so you can't do without access to AnimalCategory.
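If AnimalCategory is mapped anyway, the query itself does not have to return any of its data. The following is a minimal sketch of such a query; the class shapes follow the tables in the question, while AnimalQueries and ForRegion are hypothetical names, and the method takes IQueryable sources so it can be tried against in-memory lists as well as DbSets.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Region { public int Id { get; set; } public string Name { get; set; } }
public class AnimalCategory { public int Id { get; set; } public string Name { get; set; } public int RegionId { get; set; } }
public class Animal { public int Id { get; set; } public int AnimalCategoryId { get; set; } }

public static class AnimalQueries
{
    // AnimalCategory takes part in the join and the filter,
    // but only Animal instances are selected.
    public static List<Animal> ForRegion(
        IQueryable<Animal> animals,
        IQueryable<AnimalCategory> categories,
        int regionId)
    {
        return (from a in animals
                join c in categories on a.AnimalCategoryId equals c.Id
                where c.RegionId == regionId
                select a).ToList();
    }
}
```

Against a context this would be called with the Animal and AnimalCategory sets; for a quick check, lists converted with AsQueryable() work too. The category entity still has to exist in the model, which is the point made above, but it never shows up in the result.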

Select Specific Columns from Database using EF Code First

We have a very large customer table with over 500 columns (I know, someone does that!)
Many of these columns are in fact foreign keys to other tables.
We also have the requirement to eager load some of the related tables.
Is there any way in Linq to SQL or Dynamic Linq to specify what columns to be retrieved from the database?
I am looking for a linq statement that actually HAS this effect on the generated SQL Statement:
SELECT Id, Name FROM Book
When we run the regular query generated by EF, SQL Server throws an error saying that we have reached the maximum number of columns that can be selected within a query!
Any help is much appreciated!
Yes, exactly this is the case: the table has 500 columns and is self-referencing. Our tool automatically eager loads the first-level relations, and this hits the SQL limit on the number of columns that can be queried.
I was hoping that I could load only a limited set of columns of the related entities, such as Id and Name (which are used in the UI to show the record to the user).
I guess the other option is to control which FK columns should be eager loaded. However, this still remains a problem for tables that have a binary or ntext column which you may not want to load every time.
Is there a way to hook multiple models (entities) to the same table in Code First? We tried doing this, and I think the effort failed miserably.
Yes, you can return only a subset of columns by using projection:
var result = from x in context.LargeTable
             select new { x.Id, x.Name };
The problem: projection and eager loading don't work together. Once you start using projections or custom joins you are changing the shape of the query and you cannot use Include (EF will ignore it). The only way in such a scenario is to manually include relations in the projected result set:
var result = from x in context.LargeTable
             select new {
                 Id = x.Id,
                 Name = x.Name,
                 // You can filter or project relations as well
                 RelatedEntities = x.SomeRelation.Where(...)
             };
You can also project to a specific type, BUT that type must not be mapped (so you cannot, for example, project to the LargeTable entity from my sample). Projection to a mapped entity can be done only on materialized data in Linq-to-objects.
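A minimal sketch of projecting to a specific unmapped type follows. LargeTableSummary, Projections and Summaries are hypothetical names, LargeTable here is a cut-down stand-in for the real 500-column entity, and the method takes an IQueryable so the same expression can be run against a DbSet or an in-memory list.

```csharp
using System.Collections.Generic;
using System.Linq;

// Cut-down stand-in for the wide mapped entity.
public class LargeTable
{
    public int Id { get; set; }
    public string Name { get; set; }
    public byte[] Blob { get; set; } // one of the columns we do not want to load
}

// Hypothetical unmapped type: it must not be part of the EF model.
public class LargeTableSummary
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class Projections
{
    // Only Id and Name appear in the projection, so a query provider
    // only needs to select those two columns.
    public static IQueryable<LargeTableSummary> Summaries(IQueryable<LargeTable> source)
    {
        return source.Select(x => new LargeTableSummary { Id = x.Id, Name = x.Name });
    }
}
```

The result is a read-only view: the summary objects are not tracked, so changes to them are not saved back.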
Edit:
There is probably some misunderstanding about how EF works. EF works on top of entities, and an entity is what you have mapped. If you map 500 columns to the entity, EF simply uses that entity as you defined it. It means that querying loads the entity and persisting saves the entity.
Why does it work this way? An entity is considered an atomic data structure, and its data can be loaded and tracked only once. That is a key feature for the ability to correctly persist changes back to the database. It doesn't mean that you should not load only a subset of columns if you need it, but you should understand that loading a subset of columns doesn't define your original entity; it is considered an arbitrary view of the data in your entity. This view is not tracked and cannot be persisted back to the database without some additional effort (simply because EF doesn't hold any information about the origin of the projection).
EF also places some additional constraints on the ability to map the entity:
Each table can normally be mapped only once. Why? Again, because mapping a table multiple times to different entities can break the ability to correctly persist those entities. For example, if any non-key column is mapped twice and you load an instance of both entities mapped to the same record, which of the mapped values will you use when saving changes?
There are two exceptions which allow you to map a table multiple times:
Table per hierarchy inheritance - this is a mapping where a table can contain records from multiple entity types defined in an inheritance hierarchy. Columns mapped to the base entity in the hierarchy must be shared by all entities. Every derived entity type can have its own columns mapped to its specific properties (these columns are always empty for other entity types). It is not possible to share a column for derived properties among multiple entities. There must also be one additional column, called the discriminator, telling EF which entity type is stored in the record - this column cannot be mapped as a property because it is already mapped as the type discriminator.
Table splitting - this is a direct solution for the single table mapping limitation. It allows you to split a table into multiple entities with some constraints:
There must be a one-to-one relation between the entities. You have one central entity used to load the core data, and all other entities are accessible through navigation properties from this entity. Eager loading, lazy loading and explicit loading work normally.
The relation is a real 1-1, so both parts of the relation must always exist.
Entities must not share any property except the key - this constraint solves the initial problem because each modifiable property is mapped only once
Every entity from the split table must have a mapped key property
Insertion requires whole object graph to be populated because other entities can contain mapped required columns
Linq-to-Sql also has the ability to mark a column as lazy loaded, but this feature is currently not available in EF - you can vote for that feature.
This leads to your options for optimization:
Use projections to get read-only "view" for entity
You can do that in Linq query as I showed in the previous part of this answer
You can create database view and map it as a new "entity"
In EDMX you can also use Defining query or Query view to encapsulate either SQL or ESQL projection in your mapping
Use table splitting
EDMX allows you to split a table into many entities without any problem
Code first allows you to split a table as well, but there are some problems when you split a table into more than two entities (I think it requires each entity type to have a navigation property to all other entity types from the split table, which makes it really hard to use).
Create stored procedures that query the number of columns needed and then call the stored procs from code.

How to Use inheritance in EF

I am using EF 4.0 and I have one problem.
Table structure in DB is:
Table: Setting--->
Name (PK)
GroupBy
DataType
Table: UserSetting-->
SettingName(PK)(FK)
UserName(PK)(FK)
Value
Table: WorkstationSetting-->
SettingName(PK)(FK)
WorkstationName(PK)(FK)
Value
Now I want to make use of inheritance, because WorkstationSetting and UserSetting inherit from Setting. Any suggestions on how to achieve this inheritance? I tried, but I got an error like
"Error 39 Error 3003: Problem in mapping fragments starting at line 1621: All the key properties (Settings.Name) of the EntitySet Settings must be mapped to all the key properties (WorkstationSetting.SettingName, WorkstationSetting.WorkstationName) of table WorkstationSetting."
I see that UserSetting and WorkstationSetting have a composite PK.
If UserSetting and WorkstationSetting are derived from Setting, they should have Name as their PK.
Another comment: in general, it's not recommended to use a name or something "meaningful" as a PK, since it is less scalable and might cause limitations (e.g. max index size). Use an int or uniqueidentifier instead.
I recommend introducing a new field, SettingId, which should be added to all three tables. In the EF designer, just add the inheritance.
Look into table per type inheritance. For example look here. It should help you get started. The idea is that you have a table for each concrete type (as you have) and you map it to an object hierarchy.
Maybe your problem is with the keys. How is your mapping defined? Are the associations between the tables defined in the DB?