Entity Framework Primary Key Design (Human Resources web app) - entity-framework

I'm going to start the development for a Human Resources (HRMS) application and I have been thinking about choosing the best datatype for database primary keys.
The application has a few must-haves:
Javascript Framework (EXTJS)
ASP.NET WebAPI Server Side
Multi-Tenant feature (Database design)
I have developed other enterprise applications before using Entity Framework and incremental INT as primary keys but sometimes you get into trouble when dealing with manual imports, etc. because the primary key is dynamic.
So I have been thinking about using GUIDs as primary keys, because they give you a lot of benefits in terms of data management, but I would like to know how that performs with Entity Framework. Is there any side effect of using GUIDs as primary keys in all my tables?
The only downside I can think of with server-side GUIDs is the payload to the client, because each JSON response sent from server to client will carry a GUID on each record (36 chars instead of a simple INT).
Appreciate any feedback.

No downside I know of, besides that it can be harder to debug when trying to compare GUIDs (integers are easier to read). The major benefit I see is that you don't have to wait on the database to hand out the next key when doing inserts: you simply call Guid.NewGuid() and the value is practically guaranteed to be unique.
I use GUIDs all the time as my primary key with EF, so I'm sure it works very well. Unless your table rows are very short, I don't think the length is of material concern.
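To make the point concrete, here is a minimal sketch of an entity whose key is generated in application code. The Employee class, its properties and the GuidKeyDemo class are hypothetical names, not taken from the question; the only library call involved is Guid.NewGuid().

```csharp
using System;

// Hypothetical entity for illustration only.
public class Employee
{
    public Employee()
    {
        // The key is assigned in application code, so it is known
        // before the row is ever sent to the database.
        Id = Guid.NewGuid();
    }

    public Guid Id { get; set; }        // a property named "Id" is picked up as the key by EF convention
    public Guid TenantId { get; set; }  // assumed tenant discriminator for the multi-tenant design
    public string Name { get; set; }
}

public static class GuidKeyDemo
{
    public static void Main()
    {
        var a = new Employee { Name = "Ann" };
        var b = new Employee { Name = "Bob" };

        // Two independently created entities get different keys.
        Console.WriteLine(a.Id != b.Id);

        // The default string form is 36 characters (32 hex digits plus 4 hyphens),
        // which is the per-record JSON cost mentioned in the question.
        Console.WriteLine(a.Id.ToString().Length);
    }
}
```

Because the key exists before the insert, imported or manually created rows can keep the same identifier across environments.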
My 2 cents.

Related

How to stop EF Core from indexing all foreign keys

As documented in questions like Entity Framework Indexing ALL foreign key columns, EF Core seems to automatically generate an index for every foreign key. This is a sound default for me (let's not get into an opinion war here...), but there are cases where it is just a waste of space and slows down inserts and updates. How do I prevent it on a case-by-case basis?
I don't want to wholly turn it off, as it does more good than harm; I don't want to have to manually configure it for all those indices I do want. I just want to prevent it on specific FKs.
Related side question: is the fact that these indexes are automatically created mentioned anywhere in the EF documentation? I can't find it anywhere, which is probably why I can't find how to disable it.
Someone is bound to question why I would want to do this... so in the interest of saving time, the OPer of the linked question gave a great example in a comment:
We have a People table and an Addresses table, for example. The
People.AddressID FK was Indexed by EF but I only ever start from a
People row and search for the Addresses record; I never find an
Addresses row and then search the People.AddressID column for a
matching record.
EF Core has a configuration option to replace one of its services.
I found that replacing IConventionSetBuilder with a custom one is a much cleaner approach.
https://giridharprakash.me/2020/02/12/entity-framework-core-override-conventions/
If it is really necessary to avoid some foreign key indexes, then as far as I know (currently) in .NET Core you have to remove the code that creates those indexes from the generated migration file.
Another approach would be to implement a custom migration generator, in combination with an attribute or maybe an extension method, that skips the index creation. You can find more information in this answer for EF6: EF6 preventing not to create Index on Foreign Key. But I'm not sure whether it works in .NET Core too. The approach seems to be a bit different; here is an MS doc article that should help.
But I strongly advise against doing this! I'm against it because you have to modify generated migration files, not because of leaving FKs without indexes. As you mentioned in the question's comments, some real-world scenarios do need such an approach.
For other people who are not really sure whether they need to avoid indexes on FKs, and would therefore have to modify migration files:
Before you go that way, I would suggest implementing the application with indexes on FKs and checking the performance and space usage. For that I would generate a lot of test data.
If that really shows performance or space usage issues at a test or QA stage, it's still possible to remove the indexes in the migration files.
Because we already chatted about EnsureCreated vs. migrations, here is some further information about both for completeness (even if you don't need it :-)):
MS doc about EnsureCreated() (It will not update your database if you have some model changes - migrations would do it)
Also interesting (even if it's for EF7): EF7 EnsureCreated vs. Migrate Methods
Entity Framework Core 2.0 (the latest version available when the question was asked) doesn't have such a mechanism, but EF Core 2.2 just might - in the form of Owned Entity Types.
Namely, since you said:
" I only ever start from a People row and search for the Addresses record; I never find an Addresses row"
Then you may want to make the Address an Owned Entity Type (and especially the variant with 'Storing owned types in separate tables', to match your choice of storing the address information in a separate Addresses table).
The docs for the feature seem to say something that matches:
"Owned entities are essentially a part of the owner and cannot exist without it"
By the way, now that the feature is in EF, this may explain why EF always creates the indexes for HasMany/HasOne. It's likely because the Has* relations are meant to be used towards other entities (as opposed to 'value objects'), and these, since they have their own identity, are meant to be queried independently and to allow access to the other entities they relate to through navigation properties. For such a use case, it would simply be dangerous to use those navigation properties without indexes (a few queries could slow the database down hugely).
There are a few caveats here though:
Turning an entity into an owned one doesn't instruct EF only about the index; rather, it instructs EF to map the model to the database in a way that is a bit different (more on this below), but the end effect is in fact free of that extra index on People.
But chances are, this actually might be the better solution for you: this way you also say that no one should query the Address (by not allowing to create a DbSet<T> of that type), minimizing the chance of someone using it to reach the other entities with these costly indexless queries.
As to what the difference is, you'll note that if you make the Address owned by Person, EF will create a PersonId column in the Address table, which is different to your AddressId in the People table (in a sense, lack of the foreign key is a bit of a cheat: an index for querying Person from Address is there, it's just that it's the primary key index of the People table, which was there anyways). But take note that this design is actually rather good - it not only needs one column less (no AddressId in People), but it also guarantees that there's no way to make orphaned Address record that your code will never be able to access.
If you would still like to keep the AddressId column in the Addresses table, then there's still one option:
Just choose a name of AddressId for the foreign key in the Addresses table and just "pretend" you don't know that it happens to have the same values as the PersonId :)
If that option isn't funny (e.g. because you can't change your database schema), then you're somewhat out of luck. But do take note that among the Current shortcomings of EF they still list "Instances of owned entity types cannot be shared by multiple owners", while some shortcomings of the previous versions are already listed as addressed. Might be worth watching that space as, it seems to me, resolving that one will probably involve introducing the ability to have your AddressId in the People, because in such a model, for the owned objects to be shared among many entities the foreign keys would need to be sitting with the owning entities to create an association to the same value for each.
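As a rough illustration of the owned-type mapping described in this answer, a sketch could look like the following. The Person and Address shapes, the PeopleContext name and the "Addresses" table name are assumptions based on the example in the question; OwnsOne and ToTable are the EF Core fluent API calls for owned types stored in a separate table. No database provider is configured here, so this only shows the model configuration.

```csharp
using Microsoft.EntityFrameworkCore;

// Assumed shapes, modelled on the People/Addresses example from the question.
public class Address
{
    public string Street { get; set; }
    public string City { get; set; }
}

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public Address Address { get; set; }  // reached only through its owner
}

public class PeopleContext : DbContext
{
    public DbSet<Person> People { get; set; }
    // Deliberately no DbSet<Address>: an owned type is not queried on its own.

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        base.OnModelCreating(modelBuilder);

        // Address becomes an owned type of Person, stored in its own table.
        // The key of that table points back at the owner, so People gets
        // no AddressId column and therefore no foreign key index for it.
        modelBuilder.Entity<Person>().OwnsOne(p => p.Address, a =>
        {
            a.ToTable("Addresses");
        });
    }
}
```

Whether this fits depends on being free to change the schema, as the answer notes.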
In the OnModelCreating override, AFTER the call to
base.OnModelCreating(modelBuilder);
add:
var indexForRemoval = modelBuilder.Entity<Your_Table_Entity>().HasIndex(x => x.Column_Index_Is_On).Metadata;
modelBuilder.Entity<Your_Table_Entity>().Metadata.RemoveIndex(indexForRemoval);

Updating foreign keys in the db or having a model that is mapped to a db that does not have foreign keys (for lazy loading)

We have an application that uses a database which is used by different clients (each client has its own DB). Over the years, some of our clients' databases have lost some of their foreign key definitions.
We would like to use Code First Entity Framework, but since not all the DBs have the relationships defined, we have a lot of problems (especially if we want to use lazy loading).
We were thinking of reverse engineering a DB that has the relations defined and updating only the foreign key definitions. Is that possible?
We only want to fix the foreign key definitions and nothing else, because there is critical data in the DB and we don't want to take risks by updating the DB from the model in the production environment.
Thank you in advance!
I'm having a hard time following your question, mainly because I think you have left a lot of information out. I would think that you could reverse engineer the database, then use automatic migrations to add any foreign keys. You may want to look into:
Automatic-Migrations AND Data Annotations or Fluent API
Without more information or an example of your DB Schema or any Code First (CODE) I don't think anyone will be able to do more than point you in the right direction like I have tried to do.

EF with KeyTable Sequence style PK

How to implement in EF5 a KeyTable style identity method
"Uses a table in the database to store the next Id, and advances this value every time a new block of Ids is required" from LightSpeed
I believe this is like Oracle sequences.
This feels like it should be easy (as it is in LightSpeed). This gives the ORM an easy way to do bulk inserts ie it can get 10 identities at a time, then do a bulk insert back to the db.
Am using EF5 / WCF RIA Services (latest) talking to Silverlight. The rest of the project uses bulk insert SSIS stuff.. and the SL project does some inserting. So I need to follow this convention.
I guess the question is fundamentally more about whether Entity Framework can support this style of key generation, and a secondary part of the question is whether RIA Services would integrate OK with that.
It reminds me of the range of key generation strategies that are available in NHibernate.
Here's an answer here which suggests that EF does not have such full support as NHibernate 'out of the box':
Unfortunately, EF doesn't have anything very close to the POID
generators like NHibernate does, although I hear rumors that similar
capabilities will be included in the next release of EF.
from HiLO for the Entity Framework
This answer suggests that it's not too tricky to intercept a Save (specifically an insert) in RIA Services and call a SPROC to get the new ID
you only need to call stored procedure to get a value before you are
going to save the record [can put this into an] overriden
SaveChanges() in your context
see https://stackoverflow.com/a/5924487/5351 in answer to What is the best way to manually generate Primary Keys in Entity Framework 4.1 Code First
and a similar answer here... https://stackoverflow.com/a/5277642/5351
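A minimal sketch of the "call a stored procedure from an overridden SaveChanges()" idea quoted above might look like this. Everything named here is hypothetical: the IHasManualId interface, the Invoice entity, the BillingContext class and the dbo.GetNextId procedure (assumed to return a single bigint). The sketch also assumes the Id property has been mapped as not database-generated, and it treats Id == 0 as "no key assigned yet".

```csharp
using System.Data.Entity;
using System.Linq;

// Hypothetical marker interface for entities whose key is assigned by the application.
public interface IHasManualId
{
    long Id { get; set; }
}

public class Invoice : IHasManualId
{
    public long Id { get; set; }      // assumed to be mapped as NOT database-generated
    public string Number { get; set; }
}

public class BillingContext : DbContext
{
    public DbSet<Invoice> Invoices { get; set; }

    public override int SaveChanges()
    {
        // Assign a key to every tracked entity that does not have one yet.
        // Treating Id == 0 as "unassigned" is an assumption of this sketch.
        foreach (var entry in ChangeTracker.Entries<IHasManualId>())
        {
            if (entry.Entity.Id == 0)
            {
                entry.Entity.Id = GetNextId();
            }
        }

        return base.SaveChanges();
    }

    // "dbo.GetNextId" is a hypothetical stored procedure that advances
    // the key table and returns the next value as a single bigint.
    private long GetNextId()
    {
        return Database.SqlQuery<long>("EXEC dbo.GetNextId").Single();
    }
}
```

This costs one round trip per new entity; handing out a block of ids at a time, as the KeyTable description suggests, would reduce that.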
Here are some findings on possibly implementing a HiLo generator (a more robust key gen pattern) with EF:
The Hi/Lo pattern describe a mechanism for generating safe-ids on the
client side rather than the database. Safe in this context means
without collisions. This pattern is interesting for three reasons:
It doesn’t break the Unit of Work pattern (check  this link and this other one)
It doesn’t need many round-trips as the Sequence generator in other DBMS.
It generates human readable identifier unlike to GUID techniques.
from http://joseoncode.com/2011/03/23/hilo-for-entityframework/
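The block-allocation idea behind Hi/Lo can be sketched independently of EF. The HiLoGenerator class below is a hypothetical illustration, not the implementation from the linked post: it asks a caller-supplied delegate for the next "hi" value (which in practice would come from the key table) and then hands out blockSize ids locally before asking again.

```csharp
using System;

// Hypothetical Hi/Lo generator: one call to nextHi reserves a whole block of ids.
public sealed class HiLoGenerator
{
    private readonly Func<long> _nextHi;  // fetches the next block number, e.g. from a key table
    private readonly int _blockSize;
    private readonly object _gate = new object();
    private long _current;                // next id to hand out
    private long _limit;                  // exclusive upper bound of the reserved block

    public HiLoGenerator(Func<long> nextHi, int blockSize)
    {
        if (nextHi == null) throw new ArgumentNullException("nextHi");
        if (blockSize <= 0) throw new ArgumentOutOfRangeException("blockSize");
        _nextHi = nextHi;
        _blockSize = blockSize;
    }

    public long NextId()
    {
        lock (_gate)
        {
            if (_current >= _limit)
            {
                // Block exhausted (or first call): reserve the next block.
                long hi = _nextHi();
                _current = hi * _blockSize;
                _limit = _current + _blockSize;
            }
            return _current++;
        }
    }
}

public static class HiLoDemo
{
    public static void Main()
    {
        long hi = 0;
        // In-memory stand-in for the key table; a real source would hit the database.
        var generator = new HiLoGenerator(() => ++hi, 10);

        // The first ten ids come from block 1 (10..19), then block 2 starts at 20.
        for (int i = 0; i < 12; i++)
        {
            Console.Write(generator.NextId() + " ");
        }
    }
}
```

With a block size of 10, twelve ids need only two calls to the key source, which is what makes bulk inserts cheap with this pattern.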

JPA/Hibernate and composite keys

I have come across some SO discussions and other posts (e.g. here, here and here) where using composite primary keys with JPA is described either as something to be avoided if possible, or as a necessity due to legacy databases, or as having "hairy" corner cases. Since we are designing a new database from scratch and don't have any legacy issues to consider, is it recommended or, let's say, safer to avoid composite primary keys with JPA (either Hibernate or EclipseLink)?
My own feeling is that since JPA engines are complex enough and certainly, like all software, not without bugs, it may be best to suffer non-normalized tables than to endure the horror of running against a bug related to composite primary keys (the rationale being that numeric single-column primary keys and foreign keys are the simplest use case for JPA engines to support and so it should be as bug-free as possible).
I've tried both methods, and personally I prefer avoiding composite primary keys for several reasons:
You can make a superclass containing the id field, so you don't have to bother with it in all your entities.
Entity creation becomes much easier
JPA plays nicer in general
Referencing an entity becomes easier. For example, storing a bunch of IDs in a set, or specifying a single id in the query string of a web page, is largely simplified by only having to use a single number.
You can use a single equals method, specified in the superclass, that works for all entities.
If you use JSF you can make a generic converter
Easier to specify objects when working with your DB client
But it brings some bad parts as well:
Small amount of denormalization
Working with unpersisted objects (if you use auto-generated IDs, which you should) can mean trouble in some cases, since equality methods and the like need an ID to work correctly

I don't need/want a key!

I have some views that I want to use EF 4.1 to query. These are specific optimized views that will not have keys to speak of; there will be no deletions or updates, just good ol' selects.
But EF wants a key set on the model. Is there a way to tell EF to move on, there's nothing to worry about?
More Details
The main purpose of this is to query against a set of views that have been optimized by size, query parameters and joins. The underlying tables have their PKs, FKs and so on. It's indexed, statiscized (that a word?) and optimized.
I'd like to have a class like (this is a much smaller and simpler version of what I have...):
public class MyObject // this is a view; property types are assumed
{
    public string Name { get; set; }
    public int Age { get; set; }
    public int TotalPimples { get; set; }
}
and a repository, built off of EF 4.1 CF where I can just
public List<MyObject> GetPimply(int numberOfPimples)
{
    return db.MyObjects.Where(d => d.TotalPimples > numberOfPimples).ToList();
}
I could expose a key, but what's the real purpose of displaying a 2 or 3 column natural key that will never be used?
Current Solution
Seeing as there will be no EF CF solution, I have added a complex key to the model and I am exposing it in the model. While this goes "with the grain" of what one expects a "well designed" db model to look like, in this case, IMHO, it added nothing but more logic to the model builder, more bytes over the wire, and extra properties on a class. These will never be used.
There is no way. EF demands unique identification of the record - entity key. That doesn't mean that you must expose any additional column. You can mark all your current properties (or any subset) as a key - that is exactly how EDMX does it when you add database view to the model - it goes through columns and marks all non-nullable and non-computed columns as primary key.
You must be aware of one problem - EF internally uses identity map and entity key is unique identification in this map (each entity key can be associated only with single entity instance). It means that if you are not able to choose unique identification of the record and you load multiple records with the same identification (your defined key) they will all be represented by a single entity instance. Not sure if this can cause you any issues if you don't plan to modify these records.
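For the Code First case, marking the existing properties as the key could be sketched like this. The property types, the ReportingContext name and the "MyObjectView" view name are assumptions; HasKey with an anonymous type is the Code First way to declare a composite key, and ToTable points the entity at the view.

```csharp
using System.Data.Entity;

public class MyObject // mapped to a view; property types are assumed
{
    public string Name { get; set; }
    public int Age { get; set; }
    public int TotalPimples { get; set; }
}

public class ReportingContext : DbContext
{
    public DbSet<MyObject> MyObjects { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        base.OnModelCreating(modelBuilder);

        // No extra column is exposed: the existing properties together act as the key.
        // "MyObjectView" is a hypothetical name for the database view.
        modelBuilder.Entity<MyObject>()
            .ToTable("MyObjectView")
            .HasKey(o => new { o.Name, o.Age, o.TotalPimples });
    }
}
```

As the answer warns, this is only safe if the chosen combination is actually unique per row; otherwise rows sharing the same values end up represented by a single entity instance.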
EF is looking for a unique way to identify records. I am not sure if you can force it to go counter to its nature of desiring something unique about objects.
But, this is an answer to the "show me how to solve my problem the way I want to solve it" question and not actually tackling your core business requirement.
If this is a "I don't want to show the user the key", then don't bind it when you bind the data to your form (web or windows). If this is a "I need to share these items, but don't want to give them the keys" issue, then map or surrogate the objects into an external domain model. Adds a bit of weight to the solution, but allows you to still do the heavy lifting with a drag and drop surface (EF).
The question is what is the business requirement that is pushing you to create a bunch of objects without a unique identifier (key).
One way to do this would be not to use views at all.
Just add the tables to your EF model and let EF create the SQL that you are currently writing by hand.