Entity framework code first minimum cardinality

Say I have the following entity classes:
public class Order
{
    public int OrderID { get; set; }
    public ICollection<OrderLine> OrderLines { get; set; }
}

public class OrderLine
{
    public int OrderLineID { get; set; }
    public Order Order { get; set; }
}
I want to enforce a minimum cardinality of 1 for this relationship; i.e. I want to ensure that an Order cannot be created without at least one OrderLine.
I'm using EF code first fluent-style configurations, and I am able to enforce that an OrderLine must have an Order reference (using the HasRequired() extension method), but I can't see how to prevent an Order from being created without at least one OrderLine.

In short: you can't. Your requirement cannot be mapped to a database constraint: orders and order lines are saved separately, so when you create an order and add an order line, either the order or the order line must be saved first. The order line -> order relation is backed by a foreign key, so the order must be saved first. When the order is saved, as far as the database knows, the order has no order lines; they are not added until later.
You can create custom validation functions and call them before saving. If you're using an ObjectContext, you will have to do this yourself. If you have a DbContext, you should be able to override DbContext.ValidateEntity. For obvious reasons, this only works if you make all database modifications through your context. If you modify the database tables directly, custom validation functions don't get used.

It can be done, just not with fluent configuration:
public class Order : IValidatableObject
{
    public IEnumerable<ValidationResult> Validate(ValidationContext validationContext)
    {
        // A null collection counts as "no lines" rather than throwing.
        if (OrderLines == null || !OrderLines.Any())
            yield return new ValidationResult("At least one line needed");
    }
}
This will be enforced when you SaveChanges(), just like a Required property, or any other model constraint.
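To see the rule fire without a database, the same check can be run through Validator.TryValidateObject from System.ComponentModel.DataAnnotations. This is only a minimal sketch: OrderCheck.IsValid is a hypothetical helper name, and it demonstrates the validation itself, not what SaveChanges() does internally.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Linq;

public class OrderLine
{
    public int OrderLineID { get; set; }
    public Order Order { get; set; }
}

public class Order : IValidatableObject
{
    public int OrderID { get; set; }
    public ICollection<OrderLine> OrderLines { get; set; } = new List<OrderLine>();

    public IEnumerable<ValidationResult> Validate(ValidationContext validationContext)
    {
        // Treat a missing collection the same as an empty one.
        if (OrderLines == null || !OrderLines.Any())
            yield return new ValidationResult("At least one line needed");
    }
}

public static class OrderCheck
{
    // Hypothetical helper: runs the data-annotation validator against an order
    // and collects any failures in 'errors'.
    public static bool IsValid(Order order, out List<ValidationResult> errors)
    {
        errors = new List<ValidationResult>();
        var context = new ValidationContext(order);
        return Validator.TryValidateObject(order, context, errors, validateAllProperties: true);
    }
}
```

Calling OrderCheck.IsValid on a new Order with no lines should return false with the "At least one line needed" message in errors; after adding an OrderLine to OrderLines it should return true.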


EntityFramework DatabaseGeneratedOption.Identity accepts and saves data instead of generating new one

Assuming this test model:
public class TestEntity
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}
When I generate a new instance of it, Id is 00000000-0000-0000-0000-000000000000.
Saving such an instance in the database as a new row, results in a Guid being generated (which is different from the empty one).
However, if I provide a valid Guid in TestEntity.Id, the new row is created with the provided Guid instead of a newly computed one.
I would like this behavior to exist only when editing a row, not when creating it. This is to ensure database-layer protection from attacks where a user normally shouldn't get to choose which data to input.
Of course this protection is present in other layers, but I want it in the database too. Is this possible? How can I tell EF to ignore model data when creating a new row?
The description of DatabaseGeneratedOption.Computed says:
"the database generates a value when a row is inserted or updated"
So clearly that's not an option. I don't want to change Id when updating a row. I only want to be sure no one can create a row and choose the Id.
I'd try to keep things simple. Make the setter protected; you then have two ways to generate Ids. You can generate the value yourself inside the constructor:
public class TestEntity
{
    // no need to decorate with `DatabaseGenerated`, since it won't be generated by the database...
    public Guid Id { get; protected set; }
    public string Name { get; set; }

    public TestEntity()
    {
        this.Id = Guid.NewGuid();
    }
}
...or you can let the database generate it for you. SQL Server, at least, can generate values for both int and Guid keys:
public class TestEntity
{
    [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
    public Guid Id { get; protected set; }
    public string Name { get; set; }
    // no need to generate a Guid yourself...
}
This prevents code outside the class from setting Id (so no one can choose a Guid for new rows or modify the Guid of existing ones).
Of course, your team could use reflection to bypass the class definition, but if that's the case, you need to have a talk with your team.
If you still want to make sure they can't cheat, you'd have to add a check before saving changes to the database, for example by overriding SaveChanges() in your DbContext.
As a side note, for both int and Guid, the values are not generated by Entity Framework. Decorating the property with [DatabaseGenerated(DatabaseGeneratedOption.Identity)] tells Entity Framework to create the column with a default value supplied by the database provider itself.

Why the frequent unexplained use of the partial modifier in EF Code First?

In the EF Code first docs and examples, you'll frequently see classes and methods defined using the partial modifier. For example, the following
public partial class Department
{
    public int DepartmentID { get; set; }
    public DepartmentNames Name { get; set; }
    public decimal Budget { get; set; }
}
I understand the general use of the partial keyword by the C# compiler. However, I often see these examples without applying that functionality (i.e., the class is never re-opened elsewhere).
In other examples, I have also seen partial modifiers on methods as well.
Do these modifiers carry some special meaning in an EF Code First context? Can anyone help me understand what's going on?
Given EF makes working with POCOs really easy, this makes it flexible in terms of separating components and pieces. For example, a section that defines your models:
public partial class PurchaseOrder
{
    public Int32 ID { get; set; }
    public String CustomerName { get; set; }
    public Double InvoiceAmount { get; set; }
    public virtual ICollection<PurchaseOrderItem> Items { get; set; }
}
Then apply business logic elsewhere:
public partial class PurchaseOrder : IValidatableObject
{
    public IEnumerable<ValidationResult> Validate(ValidationContext validationContext)
    {
        // ...
    }
}
And maybe extend its functionality in yet another place:
public partial class PurchaseOrder
{
    public void AddItem(PurchaseOrderItem item)
    {
        // ...
    }
}
Though, as @E.J. Brennan mentions, it's more likely they were generated from a T4 template. This means that anything you did in the generated file would be wiped with every generation; however, if you left the generated file alone, you could still extend the class (as I've shown with IValidatableObject or additional methods) without worrying that your changes would be lost.
It wouldn't be unusual to use T4 or another code generator to create the basic classes that map back to your database. If you did that, you would want those classes to be partial so that you could extend them in a separate file; if you extended them in the original file, your changes would be overwritten every time you regenerated the file.
If you hand-coded your classes, there would be no need to use partial on all of them.
You can extend Entity Framework generated types:
The classes only contain properties that are defined in the conceptual model and do not contain any methods. The generated classes are partial.
So, in your new partial classes (not the generated ones) you can define business logic, display attributes, validation logic etc. These classes won't be overwritten, like the generated ones.
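The split described above can be sketched in a single listing. Both parts are shown together here only for brevity; in practice the first would live in the generated file and the second in a hand-written one. AddItem mirrors the method named earlier, while the Price property and Total() are hypothetical additions for illustration.

```csharp
using System;
using System.Collections.Generic;

// --- Part 1: stands in for the generated file (overwritten on regeneration) ---
public partial class PurchaseOrder
{
    public int ID { get; set; }
    public string CustomerName { get; set; }
    public List<PurchaseOrderItem> Items { get; set; } = new List<PurchaseOrderItem>();
}

public class PurchaseOrderItem
{
    public decimal Price { get; set; }   // hypothetical property for this sketch
}

// --- Part 2: hand-written file, untouched by the generator ---
public partial class PurchaseOrder
{
    public void AddItem(PurchaseOrderItem item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        Items.Add(item);
    }

    // Hypothetical business-logic method: sums the item prices.
    public decimal Total()
    {
        decimal sum = 0m;
        foreach (var item in Items) sum += item.Price;
        return sum;
    }
}
```

The compiler merges both declarations into one PurchaseOrder type, so callers see the generated properties and the hand-written methods side by side.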

What is the most optimized way of implementing computed fields using Entity Framework?

I am using Entity Framework with the Fluent API and I am new to both. In my scenario I have the following three classes.
public class Review
{
    public int Id { get; private set; }
    public float AverageRating { get; private set; } // Computed field
    public int TotalLikes { get; private set; } // Computed field
    public List<Rating> Ratings { get; private set; }
    public List<Like> Likes { get; private set; }
}

public class Rating
{
    public int CustomerId { get; private set; }
    public int ReviewId { get; private set; }
    public int Value { get; set; } // the rating score
}

public class Like
{
    public int CustomerId { get; private set; }
    public int ReviewId { get; private set; }
}
I have Fluent mapping for all three classes and their relationships. In the review class I have two computed fields. I could populate computed fields from child collections (Ratings and Likes). In that case in the Linq query I have to include both child collections, which I believe is a performance intensive operation. Alternatively I could also use computed columns in the DB. But I don't like to put anything in the database side. So, what is the best way of populating the computed fields (mostly aggregate operations like Count, Average, etc) without loading the child collections or using a database solution?
If you don't want a database solution such as a stored procedure or computed columns, that leaves one option, I guess: have a method in your repository, such as GetRatings() or something similar, and use a LINQ query in that method to compute the ratings. LINQ to Entities will convert that query to native SQL, so the aggregation runs in the database rather than in your application.
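A rough sketch of such a repository method follows. It is written against IQueryable so it runs on in-memory lists via AsQueryable(); the assumption is that in a real context the two sources would be DbSet properties and the same query shape would be translated to SQL aggregates. ReviewSummary, ReviewQueries and GetSummary are hypothetical names, and Value stands in for the rating score.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Rating
{
    public int ReviewId { get; set; }
    public int Value { get; set; }   // the rating score
}

public class Like
{
    public int ReviewId { get; set; }
}

// Hypothetical result type holding only the computed values.
public class ReviewSummary
{
    public int ReviewId { get; set; }
    public double? AverageRating { get; set; }   // null when the review has no ratings
    public int TotalLikes { get; set; }
}

public static class ReviewQueries
{
    // Computes the aggregates with two scalar queries; no child collection is loaded.
    public static ReviewSummary GetSummary(IQueryable<Rating> ratings, IQueryable<Like> likes, int reviewId)
    {
        return new ReviewSummary
        {
            ReviewId = reviewId,
            // The cast to double? makes Average return null for an empty set instead of throwing.
            AverageRating = ratings.Where(r => r.ReviewId == reviewId)
                                   .Average(r => (double?)r.Value),
            TotalLikes = likes.Count(l => l.ReviewId == reviewId)
        };
    }
}
```

Only the two scalar results travel back to the application, which is the point made above: the child rows themselves are never materialized.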
Without using a database solution:
LINQ has Count, Max, Average, Sum, GroupBy, Distinct, and so on, so pulling back a value once the database has calculated it, such as a count or a sum, isn't normally an issue. There is no need to drag back all the objects.
You will need concurrency checking to keep integrity when posting back, i.e. a Timestamp property (SQL Server rowversion).
Basically, with EF you put a value in the context and say Save, so the key thing to remember is how to make sure the data is still OK at save time. That's the role of the rowversion (optimistic locking): if the record you are changing has changed since you read it, the save fails. You re-read, recalculate, and try again.
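The read, recalculate, retry cycle can be illustrated with an in-memory simulation. None of this is EF API: FakeStore, Row and LikeCounter are hypothetical stand-ins, and the integer Version plays the role a rowversion column plays in the database. With a DbContext, a version mismatch surfaces as a DbUpdateConcurrencyException from SaveChanges() rather than a false return value.

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-in for a table row with a version column.
public class Row
{
    public int TotalLikes { get; set; }
    public int Version { get; set; }
}

public class FakeStore
{
    private readonly Dictionary<int, Row> rows = new Dictionary<int, Row>();

    public void Insert(int id, int totalLikes) =>
        rows[id] = new Row { TotalLikes = totalLikes, Version = 1 };

    // Returns a copy, so the caller works on a snapshot of the row.
    public Row Read(int id) =>
        new Row { TotalLikes = rows[id].TotalLikes, Version = rows[id].Version };

    // Succeeds only if nobody changed the row since the snapshot was read.
    public bool TryUpdate(int id, Row changed)
    {
        var current = rows[id];
        if (current.Version != changed.Version) return false;
        current.TotalLikes = changed.TotalLikes;
        current.Version++;
        return true;
    }
}

public static class LikeCounter
{
    // Re-read, recalculate and retry until the save goes through or attempts run out.
    public static bool AddLike(FakeStore store, int id, int maxAttempts = 3)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            var row = store.Read(id);
            row.TotalLikes += 1;
            if (store.TryUpdate(id, row)) return true;
        }
        return false;
    }
}
```

A snapshot read before another writer's update is rejected by TryUpdate, so a stale total can never silently overwrite a newer one.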
However, if your application demands pessimistic locking, then this article from Ladislav is recommended reading. Essentially, EF doesn't offer pessimistic locking. However, you can call the database through methods exposed on EF, or just call the database directly.
Note that by default EF does not read dirty data (i.e. no uncommitted reads).
If, after all that, you still need or want pessimistic, semaphore-style locking for access to the data, see application locks on SQL Server.
Yes, you will need the DB, unless you have an ENQUEUE server handy ;-)

An alternative way of implementing navigation properties in Entity Framework

The official approach to defining navigation properties for complex entities is:
public class SuperEntity
{
    public int Id { get; set; }
    //Other properties
}

public class LowerEntity
{
    public int Id { get; set; }
    public int SuperEntityId { get; set; }
    public virtual SuperEntity SuperEntity { get; set; }
    //Other properties
}
The main thing here is that a class that references another entity (allowing navigation to the linked super entity) has both the public SuperEntity SuperEntity { get; set; } property and its Id in public int SuperEntityId { get; set; }.
I have gone a few days into my entity design omitting the public int SuperEntityId { get; set; } property in the "lower entities", so I am navigating only by the virtual SuperEntity property. And everything works fine! But I had people on SO telling me that it creates excess tables in the DB. I've checked, and that is not true. When I use my approach, the DB table has the SuperEntityId column and EF just populates it with the referenced entity's Id automatically. What's the point of the public int SuperEntityId { get; set; } property then?
Or, perhaps, what I am doing only became available in "fresh" versions of EF like 4.3?
The point of SuperEntityId is that it is sometimes easier to use a foreign key property in apps where your context isn't alive the entire time, e.g. a webapp.
In such a situation, it's a lot easier to just use a foreign key property, than to try to attach object B to object A.
As far as I know, with nav properties, EF uses an object to track the relation between 2 objects. So if you want to couple object B to object A, in a disconnected app, it's not enough to just set the property on object A, you also have to fiddle with the entry of object A in the changetracker to register the relation between B and A.
Setting a foreign key property is the equivalent of this fiddling.
When we were just beginning with EF and didn't know about all of this, every time we wanted to connect 2 objects, e.g. B to A, and B already existed in the DB, the context thought that B was a new object instead of an existing one, and duplicated the record in the DB.
It won't create excessive tables, but it will probably generate extra, or longer, queries on that database. But that depends on how you're using these entities.

Code First: Independent associations vs. Foreign key associations?

I have a mental debate with myself every time I start working on a new project and I am designing my POCOs. I have seen many tutorials/code samples that seem to favor foreign key associations:
Foreign key association
public class Order
{
    public int ID { get; set; }
    public int CustomerID { get; set; } // <-- Customer ID
}
As opposed to independent associations:
Independent association
public class Order
{
    public int ID { get; set; }
    public Customer Customer { get; set; } // <-- Customer object
}
I have worked with NHibernate in the past, and used independent associations, which not only feel more OO, but also (with lazy loading) have the advantage of giving me access to the whole Customer object, instead of just its ID. This allows me to, for example, retrieve an Order instance and then do Order.Customer.FirstName without having to do a join explicitly, which is extremely convenient.
So to recap, my questions are:
Are there any significant disadvantages in using independent associations?
If there aren't any, what would be the reason for using foreign key associations at all?
If you want to take full advantage of ORM you will definitely use Entity reference:
public class Order
{
    public int ID { get; set; }
    public Customer Customer { get; set; } // <-- Customer object
}
Once you generate an entity model from a database with FKs it will always generate entity references. If you don't want to use them you must manually modify the EDMX file and add properties representing FKs. At least this was the case in Entity Framework v1 where only Independent associations were allowed.
Entity framework v4 offers a new type of association called Foreign key association. The most obvious difference between the independent and the foreign key association is in Order class:
public class Order
{
    public int ID { get; set; }
    public int CustomerId { get; set; } // <-- Customer ID
    public Customer Customer { get; set; } // <-- Customer object
}
As you can see you have both FK property and entity reference. There are more differences between two types of associations:
Independent association
It is represented as separate object in ObjectStateManager. It has its own EntityState!
When building the association, you always need entities from both ends of the association.
This association is mapped in the same way as entity.
Foreign key association
It is not represented as separate object in ObjectStateManager. Due to that you must follow some special rules.
When building the association, you don't need both ends. It is enough to have the child entity and the PK of the parent entity, but the PK value must be unique. So when using a foreign key association, you must also assign temporary unique IDs to newly created entities used in relations.
This association is not mapped but instead it defines referential constraints.
If you want to use foreign key association you must tick Include foreign key columns in the model in Entity Data Model Wizard.
I found that the difference between these two types of associations is not very well known so I wrote a short article covering this with more details and my own opinion about this.
Use both. And make your entity references virtual to allow for lazy loading. Like this:
public class Order
{
    public int ID { get; set; }
    public int CustomerID { get; set; }
    public virtual Customer Customer { get; set; } // <-- Customer object
}
This saves on unnecessary DB lookups, allows lazy loading, and allows you to easily see/set the ID if you know what you want it to be. Note that having both does not change your table structure in any way.
Independent associations don't work well with AddOrUpdate, which is usually used in the Seed method: when the reference points to an existing item, that item is re-inserted.
// Existing customer.
var customer = new Customer { Id = 1, Name = "edit name" };
// New order.
var order = new Order { Id = 1, Customer = customer };
As a result, the existing customer is re-inserted, and the new (re-inserted) customer is associated with the new order.
This does not happen if we use the foreign key association and assign the ID:
// Existing customer.
var customer = new Customer { Id = 1, Name = "edit name" };
// New order.
var order = new Order { Id = 1, CustomerId = customer.Id };
With this, we get the expected behavior: the existing customer is associated with the new order.
I favour the object approach to avoid unnecessary lookups. The property objects can be just as easily populated when you call your factory method to build the whole entity (using simple callback code for nested entities). There are no disadvantages that I can see except for memory usage (but you would cache your objects right?). So, all you are doing is substituting the stack for the heap and making a performance gain from not performing lookups. I hope this makes sense.