How to deal with master tables - jpa

I have an ERD with a main table (A) which has one attribute(String) that is a FK to another table (B).
The issue that I have is that in B the only attribute is the PK; I just want to ensure that the user inputs only one of the allowed values in the main table attribute. I do not even want to update the B table from the application, as it will be a task so unusual that I'll do it directly in the DB.
I could treat B just as another Entity and deal with them with "regular" JPA, but I am a little troubled that maybe there are more efficient ways to do it*. All I want from B table is to get the full list of values and to ensure that the attribute value is correct.
So the question is: is there a specific pattern in JPA for dealing with such master tables?
Thanks In advance.
*: My concern is creating / retrieving Entity B objects, when all that is needed is a string, every time an Entity A object is created or retrieved.

I would simply use a native query to get all the strings from the B table, or map B as an entity to retrieve all the B Strings using a JPQL query, but not have any association from A to B.
The B string would be stored as a basic String column in entity A. If you try to create or update an A instance with a string that is not in the B table, you'll get an exception at flush or commit time because the foreign key constraint is violated.
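To make the idea concrete, here is a minimal sketch of the same behavior at the database level. It uses Python's built-in sqlite3 rather than JPA, and the table and column names (a, b, b_code, code) are made up for illustration. A keeps the value as a plain string column, B is only read to list the allowed values, and the foreign key rejects anything else. One difference to keep in mind: SQLite checks the constraint when the statement runs, whereas with JPA the error surfaces at flush or commit time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE b (code TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE a (id INTEGER PRIMARY KEY, "
    "b_code TEXT NOT NULL REFERENCES b(code))"
)
# B is maintained by hand in the database, never by the application.
conn.executemany("INSERT INTO b (code) VALUES (?)", [("RED",), ("GREEN",)])
conn.commit()


def allowed_values(conn):
    """Read the full list of master values as plain strings."""
    return [row[0] for row in conn.execute("SELECT code FROM b ORDER BY code")]


def try_insert_a(conn, b_code):
    """Insert a row into A; return False if the FK constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO a (b_code) VALUES (?)", (b_code,))
        return True
    except sqlite3.IntegrityError:
        return False
```

allowed_values can feed a drop-down or a validation check in the application, while the constraint stays the final safeguard in the database.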

Related

How to create relationships between entities with existing database that does not contain foreign keys

Using Entity Framework Core 2.0
Stuck with company's production database which has primary keys defined for each table but no foreign keys defined for any relationships.
Dependent records in the database have id fields which are intended to relate to the primary key fields of the parent record like you would normally find with a foreign key relationship/constraint. But these fields were all created as INT NOT NULL and are using a SQL default of '0'.
As a result dependent records have been inserted over time without requiring that a related parent record be specified.
Initially I defined my models in EF with integers and used a fluent configuration to specify "IsRequired". This was done so I could run migrations to create a test database for comparison against the production database to verify that my code first was correctly coded.
This then led to a problem when using "Include" in my Linq queries: it performs an inner join, which drops the dependent records that contain 0 in their id fields.
The only way that I have found to make this work is to model all of the id fields in the dependent entity as nullable integers and remove the "IsRequired" from the fluent configuration.
When using the "Include" it performs a left outer join keeping all of the dependent entities. This also means that any reference properties on the included entities are set to null instead of an empty string. This part can probably be fixed fairly easily.
The downside is if I wanted to use migrations to create a database now, all id fields in the dependent records would be created as NULL.
Is there anyone who has run up against this type of situation? Does anyone have any suggestions to try other than the approach I am using?
I haven't dealt with this scenario before but I wonder if you can solve it by defining the FK property as Nullable and then in the migrations, after the migration is created, edit it to add a HasDefaultValue property to ensure that it's 0? (doc for that migration method: https://learn.microsoft.com/en-us/ef/core/modeling/relational/default-values)
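The join behavior described in the question can be reproduced with plain SQL. The sketch below uses Python's built-in sqlite3 instead of EF Core, with hypothetical parent and child tables. It only illustrates why a required relationship (inner join) drops the rows whose id field is 0, while an optional one (left outer join) keeps them with a null parent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- No FK constraint; parent_id defaults to 0 as in the legacy schema.
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO parent (id, name) VALUES (1, 'P1');
    INSERT INTO child (id, parent_id) VALUES (10, 1);
    INSERT INTO child (id) VALUES (11);  -- orphan: parent_id falls back to 0
""")

# Required relationship: inner join, the orphan row disappears.
inner_rows = conn.execute(
    "SELECT c.id, p.name FROM child c "
    "JOIN parent p ON p.id = c.parent_id ORDER BY c.id"
).fetchall()

# Optional relationship: left outer join, the orphan row stays with a NULL parent.
left_rows = conn.execute(
    "SELECT c.id, p.name FROM child c "
    "LEFT JOIN parent p ON p.id = c.parent_id ORDER BY c.id"
).fetchall()
```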

Entity Framework : map duplicate tables to single entity at runtime?

I have a legacy database with a particular table -- I will call it ItemTable -- that can have billions of rows of data. To overcome database restrictions, we have decided to split the table into "silos" whenever the number of rows reaches 100,000,000. So, ItemTable will exist, then a procedure will run in the middle of the night to check the number of rows. If numberOfRows is > 100,000,000 then silo1_ItemTable will be created. Any Items added to the database from then on will be added to silo1_ItemTable (until it grows too big, then silo2_ItemTable will exist...)
ItemTable and silo1_ItemTable can be mapped to the same Item entity because the table structures are identical, but I am not sure how to set this mapping up at runtime, or how to specify the table name for my queries. All inserts should be added to the latest siloX_ItemTable, and all Reads should be from a specified siloX_ItemTable.
I have a separate siloTracker table that will give me the table name to insert/read the data from, but I am not sure how I can use this with entity framework...
Thoughts?
You could try to use entity inheritance to get this. You have a base class which has all the fields mapped to ItemTable, and then you have descendant classes that inherit from the Item entity and are mapped to the silo tables in the db. Every time you create a new silo, you create a new entity mapped to that silo table.
[Table("ItemTable")]
public class Item
{
    // All the fields in the table go here
}

[Table("silo1_ItemTable")]
public class Silo1Item : Item
{
}

[Table("silo2_ItemTable")]
public class Silo2Item : Item
{
}
You can find more information on this here
Another option is to create a view that is a union of all those tables and map your entity to that view.
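A rough sketch of the view idea, written against SQLite through Python's sqlite3 only so that it runs on its own. The view name AllItems, the id and name columns, and the extra source column are assumptions for illustration, not part of the original schema. The view would have to be recreated each time a new silo table appears, and it only helps with reads; inserts still have to target the latest silo table directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ItemTable (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE silo1_ItemTable (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO ItemTable VALUES (1, 'old item');
    INSERT INTO silo1_ItemTable VALUES (2, 'new item');

    -- One read-only view over every silo; 'source' records the origin table.
    CREATE VIEW AllItems AS
        SELECT id, name, 'ItemTable' AS source FROM ItemTable
        UNION ALL
        SELECT id, name, 'silo1_ItemTable' AS source FROM silo1_ItemTable;
""")

all_items = conn.execute(
    "SELECT id, name, source FROM AllItems ORDER BY id"
).fetchall()
```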
As mentioned in my comment, to solve this problem I am using the SQLQuery method that is exposed by DBSet. Since all my item tables have the exact same schema, I can use the SQLQuery to define my own query and I can pass in the name of the table to the query. Tested on my system and it is working well.
See this link for an explanation of running raw queries with entity framework:
EF raw query documentation
If anyone has a better way to solve my question, please leave a comment.
[UPDATE]
I agree that stored procedures are also a great option, but for some reason my management is very resistant to make any changes to our database. It is easier for me (and our customers) to put the sql in code and acknowledge the fact that there is raw sql. At least I can hide it from the other layers rather easily.
[/UPDATE]
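For what it's worth, the raw-query approach could be wrapped roughly like this. The sketch uses Python's sqlite3 instead of EF's SqlQuery so it is self-contained, and the siloTracker columns (seq, table_name) are guesses, since the question does not show them. The main point is that a table name cannot be passed as a SQL parameter, so it should be checked against the tracker table before it is spliced into the query text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ItemTable (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE silo1_ItemTable (id INTEGER PRIMARY KEY, name TEXT);
    -- Hypothetical tracker layout: one row per silo, in creation order.
    CREATE TABLE siloTracker (seq INTEGER PRIMARY KEY, table_name TEXT NOT NULL);
    INSERT INTO siloTracker (seq, table_name)
        VALUES (0, 'ItemTable'), (1, 'silo1_ItemTable');
""")


def known_tables(conn):
    """Table names the tracker knows about, oldest first."""
    rows = conn.execute("SELECT table_name FROM siloTracker ORDER BY seq")
    return [row[0] for row in rows]


def insert_item(conn, name):
    """Insert into the most recent silo and return the table that was used."""
    table = known_tables(conn)[-1]
    with conn:
        conn.execute(f'INSERT INTO "{table}" (name) VALUES (?)', (name,))
    return table


def read_items(conn, table):
    """Read from one silo; the name must be listed in siloTracker."""
    if table not in known_tables(conn):
        raise ValueError(f"unknown silo table: {table!r}")
    return conn.execute(
        f'SELECT id, name FROM "{table}" ORDER BY id'
    ).fetchall()
```

Keeping both functions in one data-access class is one way to hide the raw SQL from the other layers, as mentioned in the update.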
Possible solution for this problem may be using context initialization with DbCompiledModel param:
var builder = new DbModelBuilder(DbModelBuilderVersion.V6_0);
builder.Configurations.Add(new EntityTypeConfiguration<EntityName>());
builder.Entity<EntityName>().ToTable("TableNameDefinedInRuntime");
var dynamicContext = new MyDbContext(builder.Build(context.Database.Connection).Compile());
For some reason, in EF6 it fails on the second table request, even though the mapping inside the context looks correct at the moment of execution.

EF query contains incorrect elements

I have a query with EF which looks like this:
var x = _db.qMetaDataLookups.ToList();
If I execute SELECT * FROM qMetaDataLookup directly on the SQL server, 2155 distinct rows are returned. After executing the above, x ALSO contains 2155 elements.
The problem is that the data is wrong. I'm not getting the same data back from the EF as I do from the SQL Query.
In particular, there's an element that exists in the SQL output, call it "WXYZ", which makes no appearance at all in the EF version of the query (against the exact same database).
Instead, what I find are numerous repeats. If I call x.Distinct() the list filters down from 2155 elements, to a mere 143.
I'm flummoxed. I have never seen my EF and SQL results differ on a query this simple. There must be a very simple [face-palm] explanation, but I'm missing it.
Thanks.
EDIT: qMetaDataLookup (a view) contains information about our database. In essence, it's a listing of all tables and views, and each of their columns, with other information about the datatype, length, precision, scale, etc. The 'key' in this view ought to be the column that matches "tableName.columnName", but instead EF chose all the datatype properties as the key. This is why the query fails to perform as desired.
Make sure the entity key is set correctly for qMetaDataLookup in the Entity Data Model. Sometimes the entity keys are messed up...
The issue might have been that your model was using a key with duplicate values where the Entity Framework was expecting unique values. This would happen if, for example, your data model used a composite primary key composed of foreign keys from other tables. It seems EF doesn't like composite primary keys very much, and so returned results from queries will generate what appear to be duplicated rows.
The fix seems to be to add a surrogate primary key column to your table which is guaranteed to be unique. If you still need to reference the foreign columns that's fine, so long as they aren't being used as a composite primary key for the table.
I can't claim any credit for the solution, but here's the link that helped me solve my issue:
http://jepsonsblog.blogspot.ca/2011/11/enitity-framework-duplicate-rows-in.html
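The symptom (the right row count, but with repeated entries and some rows missing) is what you would expect from any ORM that tracks entities by key. The toy model below is not EF's actual implementation, just a simplified identity map in plain Python with made-up sample rows. When the chosen key is not unique, every later row with the same key is replaced by the first object that was tracked under that key.

```python
def materialize(rows, key_columns):
    """Return one object per row, reusing the first object seen for each key."""
    tracked = {}
    result = []
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        # setdefault keeps the first tracked object and ignores later rows.
        entity = tracked.setdefault(key, dict(row))
        result.append(entity)
    return result


rows = [
    {"name": "Orders.Id", "datatype": "int", "length": 4},
    {"name": "Items.Id", "datatype": "int", "length": 4},
    {"name": "Items.Title", "datatype": "nvarchar", "length": 50},
]

# Key inferred from the datatype columns: "Items.Id" is shadowed by "Orders.Id".
wrong = materialize(rows, ["datatype", "length"])

# Key on the unique "tableName.columnName" column: every row survives.
right = materialize(rows, ["name"])
```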

Select Specific Columns from Database using EF Code First

We have a very large customer table with over 500 columns (I know, someone does that!)
Many of these columns are in fact foreign keys to other tables.
We also have the requirement to eager load some of the related tables.
Is there any way in Linq to SQL or Dynamic Linq to specify what columns to be retrieved from the database?
I am looking for a linq statement that actually HAS this effect on the generated SQL Statement:
SELECT Id, Name FROM Book
When we run the regular query generated by EF, SQL Server throws an error saying that we have reached the maximum number of columns that can be selected within a query!
Any help is much appreciated!
Yes, this is exactly the case. The table has 500 columns and is self-referencing; our tool automatically eager loads the first-level relations, and this hits the SQL limit on the number of columns that can be queried.
I was hoping that I can set to only load limited columns of the related Entities such as Id and Name (which is used in the UI to view the record to user)
I guess the other option is to control which FK columns should be eager loaded. However, this still remains a problem for tables that have a binary or ntext column which you may not want to load all the time.
Is there a way to hook multiple models (Entities) to the same table in Code First? We tried doing this, and I think the effort failed miserably.
Yes you can return only subset of columns by using projection:
var result = from x in context.LargeTable
             select new { x.Id, x.Name };
The problem: projection and eager loading don't work together. Once you start using projections or custom joins, you are changing the shape of the query and you cannot use Include (EF will ignore it). The only way in such a scenario is to manually include relations in the projected result set:
var result = from x in context.LargeTable
             select new {
                 Id = x.Id,
                 Name = x.Name,
                 // You can filter or project relations as well
                 RelatedEntities = x.SomeRelation.Where(...)
             };
You can also project to specific type BUT that specific type must not be mapped (so you cannot for example project to LargeTable entity from my sample). Projection to the mapped entity can be done only on materialized data in Linq-to-objects.
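At the SQL level, a projection comes down to naming the columns you want and loading them into a type that is not the mapped entity. Here is a small sketch of that idea using Python's sqlite3 and a frozen dataclass. The Book table comes from the question, while the extra columns and the BookSummary type are invented for the example. It is not EF code, only an illustration of the read-only "view on the data" that a projection gives you.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class BookSummary:
    """Read-only view of a row; deliberately not the full mapped entity."""
    id: int
    name: str


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Book (Id INTEGER PRIMARY KEY, Name TEXT, Cover BLOB, Notes TEXT);
    INSERT INTO Book (Id, Name, Notes) VALUES (1, 'Dune', 'long text');
""")

# Only the two needed columns are requested from the database.
cursor = conn.execute("SELECT Id, Name FROM Book ORDER BY Id")
columns = [description[0] for description in cursor.description]
summaries = [BookSummary(*row) for row in cursor]
```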
Edit:
There is probably some misunderstanding about how EF works. EF works on top of entities - an entity is what you have mapped. If you map 500 columns to the entity, EF simply uses that entity as you defined it. It means that querying loads the entity and persisting saves the entity.
Why does it work this way? An entity is considered an atomic data structure and its data can be loaded and tracked only once - that is a key feature for the ability to correctly persist changes back to the database. It doesn't mean that you should not load only a subset of columns if you need it, but you should understand that loading a subset of columns doesn't define your original entity - it is considered an arbitrary view on the data in your entity. This view is not tracked and cannot be persisted back to the database without some additional effort (simply because EF doesn't hold any information about the origin of the projection).
EF also places some additional constraints on the ability to map the entity.
Each table can normally be mapped only once. Why? Again, because mapping a table multiple times to different entities can break the ability to correctly persist those entities - for example, if any non-key column is mapped twice and you load instances of both entities mapped to the same record, which of the mapped values will you use when saving changes?
There are two exceptions which allow you to map a table multiple times:
Table per hierarchy inheritance - this is a mapping where the table can contain records from multiple entity types defined in an inheritance hierarchy. Columns mapped to the base entity in the hierarchy must be shared by all entities. Every derived entity type can have its own columns mapped to its specific properties (other entity types always have these columns empty). It is not possible to share a column for derived properties among multiple entities. There must also be one additional column, called the discriminator, telling EF which entity type is stored in the record - this column cannot be mapped as a property because it is already mapped as the type discriminator.
Table splitting - this is a direct solution for the single table mapping limitation. It allows you to split a table into multiple entities with some constraints:
There must be a one-to-one relation between the entities. You have one central entity used to load the core data, and all other entities are accessible through navigation properties from this entity. Eager loading, lazy loading and explicit loading work normally.
The relation is a real 1-1, so both parts of the relation must always exist.
Entities must not share any property except the key - this constraint solves the initial problem because each modifiable property is mapped only once.
Every entity from the split table must have a mapped key property.
Insertion requires the whole object graph to be populated because other entities can contain mapped required columns.
Linq-to-Sql also contains ability to mark a column as lazy loaded but this feature is currently not available in EF - you can vote for that feature.
This leads to your options for optimization:
Use projections to get a read-only "view" of the entity:
You can do that in a Linq query, as I showed in the previous part of this answer.
You can create a database view and map it as a new "entity".
In EDMX you can also use a Defining query or a Query view to encapsulate either a SQL or an ESQL projection in your mapping.
Use table splitting:
EDMX allows you to split a table into many entities without any problem.
Code first allows you to split a table as well, but there are some problems when you split a table into more than two entities (I think it requires each entity type to have a navigation property to all other entity types from the split table - that makes it really hard to use).
Create stored procedures that query the number of columns needed and then call the stored procs from code.

How can I replicate core data model using a traditional relational database?

I have my app using core data with the data model below. However, I'm switching to a standard database with columns and rows. Can anyone help me with setting up this new database schema?
First of all you need to create tables for each of the Entities and their attributes (note I added "id" to each of the tables for relationships):
Routine (name, timestamp, id)
Exercise - this looks like a duplicate to me, so leaving one only here (muscleGroup, musclePicture, name, timeStamp, id)
Session (timeStamp, id)
Set (reps, timeStamp, unit, weight, id)
Now that you have tables that describe each of the entities, you need to create tables that will describe the relationships between these entities - as before table names are in capitals and their fields are in parenthesis:
RoutineExercises (routine_id, exercise_id)
SessionExercises (session_id, exercise_id)
ExerciseSets (exercise_id, set_id)
That's it! Now if you need to add an exercise to a routine, you simply:
Add an entry into Exercise table
Establish the relationship by adding a tuple into RoutineExercises table where routine_id is your routine ID and exercise_id is the ID of the newly created entry in the Exercise table
This will hold true for all the rest of the relationships.
NOTE: Your core data model has one-to-many and many-to-many relationships. If you want to specifically enforce that a relationship is one-to-many (e.g. an Exercise can belong to only one Routine), then you will need to make "exercise_id" the unique index for the RoutineExercises table. If you want a many-to-many relationship to be allowed (i.e. each exercise is allowed to belong to multiple routines), then set the tuple (routine_id, exercise_id) as the unique index.
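Here is a rough sketch of the RoutineExercises part, using SQLite through Python's sqlite3. The column types, the index name and the helper functions are my own choices; only the table and field names come from the lists above. The flag switches between the two variants from the note: a unique index on exercise_id alone for one-to-many, or just the composite key for many-to-many.

```python
import sqlite3


def make_db(one_routine_per_exercise):
    """Build the schema; the flag adds the one-to-many restriction."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE Routine (id INTEGER PRIMARY KEY, name TEXT, timestamp TEXT);
        CREATE TABLE Exercise (id INTEGER PRIMARY KEY, name TEXT, muscleGroup TEXT);
        CREATE TABLE RoutineExercises (
            routine_id INTEGER NOT NULL REFERENCES Routine(id),
            exercise_id INTEGER NOT NULL REFERENCES Exercise(id),
            PRIMARY KEY (routine_id, exercise_id)
        );
        INSERT INTO Routine (id, name) VALUES (1, 'Push day'), (2, 'Leg day');
    """)
    if one_routine_per_exercise:
        conn.execute(
            "CREATE UNIQUE INDEX ux_exercise ON RoutineExercises (exercise_id)"
        )
    return conn


def link(conn, routine_id, exercise_id):
    """Relate an exercise to a routine; False if a constraint forbids it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO RoutineExercises (routine_id, exercise_id) VALUES (?, ?)",
                (routine_id, exercise_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def add_exercise_to_routine(conn, routine_id, name):
    """The two steps from the answer: create the exercise, then relate it."""
    with conn:
        cursor = conn.execute("INSERT INTO Exercise (name) VALUES (?)", (name,))
    link(conn, routine_id, cursor.lastrowid)
    return cursor.lastrowid
```

The same pattern would apply to SessionExercises and ExerciseSets.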