How to prevent cyclic loading of related entities in Entity Framework Code First - entity-framework

I'm new to Entity Framework and am trying to learn how to use Code First to load entities from the database.
My model contains a user:
public class User
{
    public int UserID { get; set; }
    [Required]
    public string Name { get; set; }
    // Navigation Properties
    public virtual ICollection<AuditEntry> AuditEntries { get; set; }
}
Each user can have a set of audit entries each of which contains a simple message:
public class AuditEntry
{
    public int AuditEntryID { get; set; }
    [Required]
    public string Message { get; set; }
    // Navigation Properties
    public int UserID { get; set; }
    public virtual User User { get; set; }
}
I have a DBContext which just exposes the two tables:
public DbSet<User> Users { get; set; }
public DbSet<AuditEntry> AuditEntries { get; set; }
What I want to do is load a list of AuditEntry objects containing the message and the related User object containing the UserID and Name properties.
List<AuditEntry> auditEntries = db.AuditEntries.ToList();
Because I have my navigation properties marked as virtual and I haven't disabled lazy loading, I get an infinitely deep object graph (each AuditEntry has a User object, which contains a list of AuditEntries, each of which contains a User object, which contains a list of AuditEntries, and so on).
This is no good if I then want to serialize the object (for example to send as the result in a Web API).
I've tried turning off lazy loading (either by removing the virtual keywords from my navigation properties in the model, or by adding this.Configuration.LazyLoadingEnabled = false; to my DBContext). As expected this results in a flat list of AuditEntry objects with User set to null.
With lazy loading off, I've tried to eager load the User like so:
var auditentries = db.AuditEntries.Include(a => a.User);
but this results in the same deep / cyclic result as before.
How can I load one level deep (e.g. include the user's ID and name) without also loading back-references / following navigation properties back to the original object and creating a cycle?

After much hacking, I've come up with the following potential solution using a dynamic return type and projection in my Linq query:
public dynamic GetAuditEntries()
{
    var result = from a in db.AuditEntries
                 select new
                 {
                     a.AuditEntryID,
                     a.Message,
                     User = new
                     {
                         a.User.UserID,
                         a.User.Username
                     }
                 };
    return result;
}
This produces (internally) the following SQL which seems sensible:
SELECT
[Extent1].[AuditEntryID] AS [AuditEntryID],
[Extent1].[Message] AS [Message],
[Extent1].[UserID] AS [UserID],
[Extent2].[Username] AS [Username]
FROM [dbo].[AuditEntries] AS [Extent1]
INNER JOIN [dbo].[Users] AS [Extent2] ON [Extent1].[UserID] = [Extent2].[UserID]
This produces the results that I'm after, but it seems a bit long-winded (especially for real-life models that would be significantly more complex than my example), and I question the impact this will have on performance.
Advantages
This gives me a lot of flexibility over the exact contents of my returned object. Since I generally do most of my UI interaction / templating on the client side, I frequently find myself having to create multiple versions of my model objects. I generally need a certain granularity over which users can see which properties (e.g. I might not want to send every user's email address to a low-privilege user's browser in an AJAX request).
It allows entity framework to intelligently build the query and only select the fields that I have chosen to project. For example, inside each top level AuditEntry object, I want to see User.UserID and User.Username but not User.AuditEntries.
Disadvantages
The returned type from my Web API is no longer strongly typed so I couldn't create a strongly typed MVC view based on this API. As it happens this is not a problem for my particular case.
Projecting manually in this way from a large / complex model could result in a lot of code, seems like a lot of work and has the potential to introduce errors in the API. This would have to be carefully tested.
The API method becomes tightly coupled with the structure of the model and since this is no longer fully automated based on my POCO classes, any changes made to the model would have to be reflected in the code that loads them.
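One possible way to soften the first disadvantage is to project into small named classes instead of anonymous types, so the return type stays strongly typed. The sketch below is only an illustration: AuditEntryDto and UserSummaryDto are hypothetical names that do not appear in the original code, and the query runs against an in-memory list standing in for db.AuditEntries so that it can run without a database. It uses the Name property from the User model shown earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Entities from the question (validation attributes omitted for brevity).
public class User
{
    public int UserID { get; set; }
    public string Name { get; set; }
    public List<AuditEntry> AuditEntries { get; set; } = new List<AuditEntry>();
}

public class AuditEntry
{
    public int AuditEntryID { get; set; }
    public string Message { get; set; }
    public int UserID { get; set; }
    public User User { get; set; }
}

// Hypothetical DTOs: they carry no back-reference, so no cycle is possible.
public class UserSummaryDto
{
    public int UserID { get; set; }
    public string Name { get; set; }
}

public class AuditEntryDto
{
    public int AuditEntryID { get; set; }
    public string Message { get; set; }
    public UserSummaryDto User { get; set; }
}

public static class AuditQueries
{
    // Accepts IQueryable so the same projection could be given db.AuditEntries.
    public static List<AuditEntryDto> GetAuditEntries(IQueryable<AuditEntry> entries)
    {
        return entries
            .Select(a => new AuditEntryDto
            {
                AuditEntryID = a.AuditEntryID,
                Message = a.Message,
                User = new UserSummaryDto { UserID = a.User.UserID, Name = a.User.Name }
            })
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var alice = new User { UserID = 1, Name = "Alice" };
        var entry = new AuditEntry { AuditEntryID = 10, Message = "Logged in", UserID = 1, User = alice };
        alice.AuditEntries.Add(entry); // the cycle exists in the entity graph only

        var dtos = AuditQueries.GetAuditEntries(new[] { entry }.AsQueryable());
        Console.WriteLine($"{dtos[0].Message} by {dtos[0].User.Name}");
    }
}
```

The trade-off is the one already listed above: each DTO is extra code that has to be kept in step with the model, in exchange for a typed result that can be serialized without following back-references.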
Include method?
I'm still a little confused about the use of the .Include() method. I understand that this method will specify that related entities should be "eager loaded" along with the specified entity. However, since the guidance seems to be that navigation properties should be placed on both sides of a relationship and marked as virtual, the Include method seems to result in a cycle being created, which has a significant negative impact on its usefulness (especially when serializing).
In my case the "tree" would look a little like:
AuditEntry
    User
        AuditEntries * n
            User * n
                etc
I'd be very interested to hear any comments about this approach, the impact of using dynamic in this way or any other insights.
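As a side note on the serialization half of the problem: if the entity classes themselves are serialized, another option is to tell the serializer to skip the back-reference. The sketch below is an assumption-laden illustration rather than what the original setup used: it relies on System.Text.Json and its [JsonIgnore] attribute (Json.NET has an attribute of the same name), and it builds the object graph by hand instead of loading it through EF. It only changes what is written out, not what EF loads.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

public class User
{
    public int UserID { get; set; }
    public string Name { get; set; }

    // Skipped by the serializer, so User -> AuditEntries -> User never recurses.
    [JsonIgnore]
    public List<AuditEntry> AuditEntries { get; set; } = new List<AuditEntry>();
}

public class AuditEntry
{
    public int AuditEntryID { get; set; }
    public string Message { get; set; }
    public User User { get; set; }
}

public static class CycleDemo
{
    public static string SerializeEntry()
    {
        var user = new User { UserID = 1, Name = "Alice" };
        var entry = new AuditEntry { AuditEntryID = 10, Message = "Logged in", User = user };
        user.AuditEntries.Add(entry); // creates the cycle in memory

        // Serializes the entry and its user, but not the user's AuditEntries.
        return JsonSerializer.Serialize(entry);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(CycleDemo.SerializeEntry());
    }
}
```

Compared with projection, this keeps the code short but fixes one serialization shape for every caller, which gives up the per-endpoint flexibility described under Advantages.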

Related

EF Core update untracked entity with many to many collection

I'm facing a problem with EF Core and collections; I have persons who read books, the books can be read by multiple people and people can read multiple books (it's a many-to-many relationship). My EF generates the 3 tables Books, Persons and BookPersons.
When I insert new persons with a set of books they read, there is no problem. However, when I recreate one of the persons outside the db context (same id, but a mutated collection of read books) and try to save it, it fails on the many-to-many relation, because the relation between the existing entities already exists (a unique constraint violation).
I've tried:
to attach the book collection to the context (same error)
to attach the person (no error but no change either)
to change only the person details, not the collection (the untracked entity is saved but the books read are not)
I'm not very fond of managing the BookPersons table or doing queries first to get existing entities. My goal is to do an update of a person and its read books in one go. I do know how to write it in SQL but it seems EF is quite a challenge.
If you want to view my code, visit: https://github.com/CasperCBroeren/EfCollectionsProblem/blob/master/Program.cs
Thanks for explaining what I'm missing or not getting
I would create a PersonBook model to handle that:
Book model
public class Book
{
[DatabaseGenerated(DatabaseGeneratedOption.None)]
public int Id { get; set; }
public string Title { get; set; }
[ForeignKey("BookId")]
public virtual ICollection<PersonBook> PersonBooks { get; set; }
}
Person Model
public class Person
{
[DatabaseGenerated(DatabaseGeneratedOption.None)]
public int Id { get; set; }
public string Name { get; set; }
[ForeignKey("PersonId")]
public virtual ICollection<PersonBook> PersonBooks { get; set; }
}
PersonBook Model
public class PersonBook
{
public int Id { get; set; }
public int PersonId { get; set; }
public int BookId { get; set; }
public virtual Person Person { get; set; }
public virtual Book Book { get; set; }
}
Then you can get the IDs of all books read by a person by using
var personId = 15; // what ever you want
db.PersonBooks.Where(a=> a.PersonId == personId);
or get the IDs of all persons who read a book, by book ID:
var bookId= 11; // what ever you want
db.PersonBooks.Where(a=> a.BookId== bookId);
Note:
you can reach the Book entity by using for example
db.PersonBooks.Where(a=> a.PersonId == personId).FirstOrDefault().Book;
A key factor in EF is dealing with object references. Any reference that a DbContext isn't tracking will be treated as a new entity. The Update method on DbSets should actually be avoided as it can lead to inefficient and potentially dangerous data changes.
This option, "to attach the book collection to the context", works with singular references but doesn't work with collections. The trouble is that what you want to say is "add any book the person isn't already associated with"; however, the DbContext has no knowledge of which books that person is already associated with unless you fetch that information first.
... or doing queries first to get existing entities.
This is actually what you should do in most cases. In the case of a simple console application to test out ideas and learn how EF works it may look like overkill, but in real-world systems this is the recommended approach for a number of reasons.
Keeping payloads small. Take an API or web site where you allow a user to associate books with people. Sending entire representations of people, their books, etc. back and forth between server and client can get expensive in terms of data size. If I have an API that allows me to associate books with a person, and those books already reflect known data state (they already exist in the db), then all I need to pass are IDs. When passing data to views, the idea is to pass only what the view needs rather than entire entity graphs.
Keeping payloads safe. Passing entire entities around and using methods like Update can make your systems prone to tampering. Update will update all columns in an entity whether you expect, or allow them to change or not. By minimizing the data coming back you ensure only the expected details can change, and you by definition validate that the provided values are safe.
For example, say I have a service that updates the books associated with a person. In the UI I had loaded that John had "Jungle Book (ID: 1)", and I wanted to update the associations so John now had "Jungle Book" and "Tom Sawyer". While my UI might not allow it, it is certainly possible for someone at the client browser to intercept the call to my controller / web service and, seeing a Book { ID: 1, Name: "Jungle Book" }, tamper with that data to send Book { ID: 1, Name: "Hitchhiker's Guide to the Galaxy" }. Provided you did solve this issue in a way that resulted in attaching entities and doing an Update or such, the consequence of this tampering would be that an attacker could rename a book. That would have a flow-on effect on every Person that referenced Book ID #1.
Instead if I want to have something like an "UpdateBooks" method that can reassign books for a person, I would have a method something like this:
private void UpdateBooks(int personId, params int[] bookIds)
{
    using (var context = new AppDbContext())
    {
        // Load the person together with the books currently associated with them.
        var person = context.Persons
            .Include(x => x.Books)
            .Single(x => x.PersonId == personId);

        // Work out which associations to add and which to remove.
        var existingBookIds = person.Books.Select(x => x.BookId).ToList();
        var bookIdsToAdd = bookIds.Except(existingBookIds).ToList();
        var bookIdsToRemove = existingBookIds.Except(bookIds).ToList();

        foreach (var bookId in bookIdsToRemove)
        {
            var book = person.Books.Single(x => x.BookId == bookId);
            person.Books.Remove(book);
        }

        if (bookIdsToAdd.Any())
        {
            // Fetch tracked Book entities so existing rows are associated, not re-inserted.
            var booksToAdd = context.Books
                .Where(x => bookIdsToAdd.Contains(x.BookId))
                .ToList();
            if (booksToAdd.Count != bookIdsToAdd.Count)
            {
                // Handle scenario where one or more book IDs provided weren't found.
            }
            // AddRange assumes Books is declared as a List<Book>.
            person.Books.AddRange(booksToAdd);
        }

        context.SaveChanges();
    }
}
This assumes that EF is handling PersonBooks entirely behind the scenes, where PersonBook consists of just PersonId and BookId, so that Person can have a collection of Books rather than PersonBooks.
This example runs up to two SELECT queries: one to get the Person and its current books, and one to get any new books if any need to be added. There is no risk of tampering with books, and we can easily validate scenarios such as passing an unknown book ID. The temptation might be to avoid querying, seeing it as expensive, but in most cases EF can provide data quite quickly and efficiently. It is the exception rather than the norm that you might need to get creative to get around possible performance bottlenecks with data access.
A third consideration is to focus on keeping operations atomic, especially for things like web services / web applications. This doesn't apply when just getting familiar with the workings of EF, entities, and such, but it is a consideration for more real-world applications. Rather than having more complex methods like UpdateBooks(), using actions like "AddBook" and "RemoveBook" can keep operations faster and simpler. One argument for a larger method is that you might expect all of the operations to be committed (or not) as one operation, such as when UpdateBooks gets called as part of one big "SavePerson" method reflecting changes to the person and all of its associated details. In these cases having atomic actions is still recommended, except that instead of updating data state, they can update server (session) state, waiting for a "Save" call to come through to persist the changes as one operation, or to discard them. Add/Remove methods can still provide the validation checks, ultimately setting things up for entities to be loaded, modified, and persisted.
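The "atomic actions updating session state" idea can be sketched without any database at all. In the illustration below, PendingBookChanges is a hypothetical class (not part of EF or of the original code) that holds one person's working set of book IDs; how it would be stored in session state and how the known book IDs are obtained are left as assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical holder for one person's pending book changes (e.g. kept in session state).
public class PendingBookChanges
{
    private readonly HashSet<int> _knownBookIds;   // IDs that exist in the database
    private readonly HashSet<int> _currentBookIds; // the person's working set

    public PendingBookChanges(IEnumerable<int> knownBookIds, IEnumerable<int> existingBookIds)
    {
        _knownBookIds = new HashSet<int>(knownBookIds);
        _currentBookIds = new HashSet<int>(existingBookIds);
    }

    // Small, atomic action: validate one ID, then record the association.
    public bool AddBook(int bookId)
    {
        if (!_knownBookIds.Contains(bookId)) return false; // unknown book: reject
        return _currentBookIds.Add(bookId);                 // false if already present
    }

    // Returns false if the person did not have that book.
    public bool RemoveBook(int bookId) => _currentBookIds.Remove(bookId);

    // On "Save", this is what would be handed to something like UpdateBooks.
    public int[] ToBookIds() => _currentBookIds.OrderBy(id => id).ToArray();
}

public static class Program
{
    public static void Main()
    {
        var pending = new PendingBookChanges(knownBookIds: new[] { 1, 2, 3 }, existingBookIds: new[] { 1 });
        pending.AddBook(2);
        pending.RemoveBook(1);
        Console.WriteLine(string.Join(", ", pending.ToBookIds()));
    }
}
```

Only IDs travel between client and server here, which keeps with the earlier points about small and safe payloads; nothing is persisted until the final save.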

Entity Framework (C# ASP.NET) - Association Entity without Tracking in Non-Dependent Entity

I have two Model classes to be created using Entity Framework: Skill and Activity. The following are the definitions of each:
Skill.cs
public class Skill
{
public int Id { get; set; }
public String Name { get; set; }
}
Activity.cs
public class Activity
{
public int Id { get; set; }
public String Name { get; set; }
public virtual List<Skill> RequiredSkills { get; set; }
}
Ideally, in the database, I'd want the Activity to be linked via foreign key to a association entity (e.g. SkillActivityAssoc) and the Skill not to have to do anything with it. I don't need to track which activities need a certain skill. I just need to track what skills are needed for each activity thus explaining why I don't have a List in the Skill class. I hope that made sense.
My question is: Is this the right way to go about doing this? When I update the RequiredSkills property of Activity via:
activity.RequiredSkills = someInstanceOfRequiredSkillsList;
dbcontext.Entry(activity).State = EntityState.Modified;
dbcontext.SaveChanges();
.., it doesn't work. I'm already speculating that it's because I'm not able to update the association entity. Moreover, my current implementation has a virtual List<Activity> property in the Skill class which I want to get rid of. How do I go about changing my model design and how do I update RequiredSkills accordingly?
Thank you in advance!
virtual enables lazy loading and change tracking in EF. You can read more about it here: Understanding code first virtual properties. You should also read the MSDN documentation about loading entities in EF: https://msdn.microsoft.com/en-us/library/jj574232(v=vs.113).aspx
Since you want to have more than one Skill in each Activity, and each Skill can be in more than one Activity as well, you have a many-to-many relationship. Please read this example: How to create a many-to-many mapping in Entity Framework? and this http://www.entityframeworktutorial.net/code-first/configure-many-to-many-relationship-in-code-first.aspx
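For the specific wish to keep Skill free of any navigation property, a minimal sketch of an EF6 Fluent API mapping is shown below. It assumes the EntityFramework 6 package; AppDbContext and the key column names ActivityId and SkillId are made up for illustration, while the join table name SkillActivityAssoc is taken from the question. RequiredSkills is declared as ICollection<Skill> here rather than List<Skill>, which is a deliberate change for the sketch.

```csharp
using System.Collections.Generic;
using System.Data.Entity; // EntityFramework 6 NuGet package

public class Skill
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Activity
{
    public int Id { get; set; }
    public string Name { get; set; }
    public virtual ICollection<Skill> RequiredSkills { get; set; } = new List<Skill>();
}

// Hypothetical context name; only the mapping below matters.
public class AppDbContext : DbContext
{
    public DbSet<Skill> Skills { get; set; }
    public DbSet<Activity> Activities { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Activity>()
            .HasMany(a => a.RequiredSkills)
            .WithMany() // parameterless: Skill has no collection of Activities
            .Map(m =>
            {
                m.ToTable("SkillActivityAssoc");
                m.MapLeftKey("ActivityId");  // assumed column name
                m.MapRightKey("SkillId");    // assumed column name
            });
    }
}
```

With a mapping like this, changes to the association are picked up when Skill entities tracked by the same context are added to or removed from a loaded activity's RequiredSkills before SaveChanges; marking the Activity entry as Modified on its own only covers its scalar columns.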

add several references to the same record in navigation property

In my EF 6 Model First application, I have an entity with a many-to-many navigation property to another entity. In the first entity, I need to add several references to the same record in navigation property.
The first entity is a “saleslistItem” and the second entity is “warehouseItem”. Normally there will be a one-to-one relationship here, but exceptionally there will be some bundles where one “saleslistItem” contains several “warehouseItems”. “WarehouseItem” can also be included in several “salesListItems”. At the end of the project, my customer says, testing it, that “saleslistItem” must be able to consist of several “WarehouseItems” of the same kind (like two boxes of smoked ham).
This data is used in several places in my code (e.g. making a sale removes items from the warehouse). If I could just add the same reference several times, my code would work without any modifications. But the implementation of the navigation property uses a "hashtable" collection, and this collection requires unique entries. Is there a workaround here? Performance is irrelevant as the amount of data is small.
If there is no such workaround, is it possible to store values together with each instance of the navigation property? Maybe it could be implemented as a field in the join table?
Any other suggestions?
Need a solution so the customer pays the last part of the bill!
So you currently have a 1:1 from SalesListItem to WarehouseItem via a ForeignKey in SalesListItem? Sounds like you need:
public class SalesListItem
{
    public virtual ICollection<SalesListWarehouseItem> WareHouseItems { get; set; }
}
public class SalesListWarehouseItem
{
    public virtual SalesListItem Parent { get; set; }
    public virtual WarehouseItem WarehouseItem { get; set; }
    public int Quantity { get; set; } // maybe double?
}
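To show how the Quantity field would be used by code such as the sale logic mentioned in the question, here is a small in-memory sketch. The Stock property, the Sales.Sell helper and the sample data are all hypothetical; only the class and property names SalesListItem, SalesListWarehouseItem, WarehouseItem, WareHouseItems and Quantity come from the answer above.

```csharp
using System;
using System.Collections.Generic;

public class WarehouseItem
{
    public string Name { get; set; }
    public int Stock { get; set; } // hypothetical stock counter
}

public class SalesListWarehouseItem
{
    public WarehouseItem WarehouseItem { get; set; }
    public int Quantity { get; set; }
}

public class SalesListItem
{
    public string Name { get; set; }
    public List<SalesListWarehouseItem> WareHouseItems { get; set; } = new List<SalesListWarehouseItem>();
}

public static class Sales
{
    // Hypothetical helper: selling a sales-list item deducts Quantity units per line.
    public static void Sell(SalesListItem item, int count = 1)
    {
        foreach (var line in item.WareHouseItems)
        {
            line.WarehouseItem.Stock -= line.Quantity * count;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var ham = new WarehouseItem { Name = "Smoked ham", Stock = 10 };
        var bundle = new SalesListItem { Name = "Ham bundle" };
        // One line with Quantity = 2 replaces two references to the same record.
        bundle.WareHouseItems.Add(new SalesListWarehouseItem { WarehouseItem = ham, Quantity = 2 });

        Sales.Sell(bundle);
        Console.WriteLine($"{ham.Name}: {ham.Stock} left");
    }
}
```

The existing code that iterates the navigation property would need a small change to multiply by Quantity, so this is not quite the zero-modification fix the question hoped for.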

Adding Navigation property breaks breeze client-side mappings (but not Server Side EF6)

I have an application that I developed standalone and now am trying to integrate into a much larger model. Currently, on the server side, there are 11 tables and an average of three navigation properties per table. This is working well and stable.
The larger model has 55 entities and 180+ relationships and includes most of my model (less the relationships to tables in the larger model). Once integrated, a very strange thing happens: the server sends the same data, the same number of entities are returned, but the exportEntities function returns a string of about 150KB (rather than the 1.48 MB it was returning before) and all queries show a tenth of the data they were showing before.
I followed the troubleshooting information on the Breeze website. I looked through the Breeze metadata and the entities and relationships seem defined correctly. I looked at the data that was returned and 9 out of ten entities did not appear as an object, but as a function: function (){return e.refMap[t]} which, when I expand it, has an 'arguments' property: Exception: TypeError: 'caller', 'callee', and 'arguments' properties may not be accessed on strict mode functions or the arguments objects for calls to them.
For reference, here are the two entities involved in the breaking change.
The Repayments Entity
public class Repayment
{
[Key, Column(Order = 0)]
public int DistrictId { get; set; }
[Key, Column(Order = 1)]
public int RepaymentId { get; set; }
public int ClientId { get; set; }
public int SeasonId { get; set; }
...
#region Navigation Properties
[InverseProperty("Repayments")]
[ForeignKey("DistrictId")]
public virtual District District { get; set; }
// The three lines below are the lines I added to break the results
// If I remove them again, the results are correct again
[InverseProperty("Repayments")]
[ForeignKey("DistrictId,ClientId")]
public virtual Client Client { get; set; }
[InverseProperty("Repayments")]
[ForeignKey("DistrictId,SeasonId,ClientId")]
public virtual SeasonClient SeasonClient { get; set; }
The Client Entity
public class Client : IClient
{
[Key, Column(Order = 0)]
public int DistrictId { get; set; }
[Key, Column(Order = 1)]
public int ClientId { get; set; }
....
// These lines were in the original (working) model
[InverseProperty("Client")]
public virtual ICollection<Repayment> Repayments { get; set; }
....
}
The relationship that I restored was simply the inverse of a relationship that was already there, which is one of the really weird things about it. I'm sure I'm doing something terribly wrong, but I'm not even sure at this point what information might be helpful in debugging this.
For defining foreign keys and inverse properties, I assume I must use either data annotations or the FluentAPI even if the tables follow all the EF conventions. Is either one better than the other? Is it necessary to consistently choose one approach and stay with it? Does the error above provide any insight as to what I might be doing wrong? Is there any other information I could post that might be helpful?
Breeze is an excellent framework and has the potential to really increase our reach providing assistance to small farmers in rural East Africa, and I'd love to get this prototype working.
Thanks
Ok, some of what you are describing can be explained by breeze's default behavior of compressing the payload of any query results that return multiple instances of the same entity. If you are using something like the default 'json.net' assembly for serialization, then each entity is sent with an extra '$id' property and if the same entity is seen again it gets serialized via a simple '$ref' property with the value of the previously mentioned '$id'.
On the breeze client during deserialization these '$refs' get resolved back into full entities. However, because the order in which deserialization is performed may not be the same as the order that serialization might have been performed, breeze internally creates deferred closure functions ( with no arguments) that allow for the deferred resolution of the compressed results regardless of the order of serialization. This is the
function (){return e.refMap[t]}
that you are seeing.
If you are seeing this value as part of the actual top level query result, then we have a bug, but if you are seeing this value while debugging the results returned from your server, before they have been returned to the calling function, then this is completely expected ( especially if you are viewing the contents of the closure before it should be executed.)
So a couple of questions and suggestions
Are you are actually seeing an error processing the result of your query or are simply surprised that the results are so small? If it's just a size issue, check and see if you can identify data that should have been sent to the client and is missing. It is possible that the reference compression is simply very effective in your case.
Take a look at the 'raw' data returned from your web service. It should look something like this, with '$id' and '$ref' properties.
[{
    '$id': '1',
    'Name': 'James',
    'BirthDate': '1983-03-08T00:00Z'
},
{
    '$ref': '1'
}]
If so, then look at the data and make sure that an '$id' exists that corresponds to each of your '$refs'. If not, something is wrong with your server side serialization code. If the data does not look like this, then please post back with a small example of what the 'raw' data does look like.
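To see this kind of '$id' / '$ref' compression in isolation, the sketch below serializes the same object twice in one list. It uses System.Text.Json's ReferenceHandler.Preserve as a stand-in, which is an assumption: the setup discussed here uses Json.NET, and System.Text.Json additionally wraps arrays in an object with a '$values' property, so the exact shape differs from the sample above. PersonRecord is a made-up type.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

public class PersonRecord
{
    public string Name { get; set; }
}

public static class RefDemo
{
    private static readonly JsonSerializerOptions Options =
        new JsonSerializerOptions { ReferenceHandler = ReferenceHandler.Preserve };

    // The same instance appears twice; the second occurrence is written as a $ref.
    public static string SerializeTwice()
    {
        var james = new PersonRecord { Name = "James" };
        return JsonSerializer.Serialize(new List<PersonRecord> { james, james }, Options);
    }

    // Reading the payload back resolves each $ref to the object carrying the matching $id.
    public static List<PersonRecord> Deserialize(string json)
    {
        return JsonSerializer.Deserialize<List<PersonRecord>>(json, Options);
    }
}

public static class Program
{
    public static void Main()
    {
        var json = RefDemo.SerializeTwice();
        Console.WriteLine(json);

        var people = RefDemo.Deserialize(json);
        Console.WriteLine(ReferenceEquals(people[0], people[1]));
    }
}
```

A payload in which a '$ref' has no matching '$id' cannot be resolved this way, which is why checking the raw data for matching pairs is a useful first step.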
After looking at your Gist, I think I see the issue. Your metadata is out of sync with the actual results returned by your query. In particular, if you look for the '$id' value of "17" in your actual results you'll notice that it is first found in the 'Client' property of the 'Repayment' type, but your metadata doesn't have 'Client' navigation property defined for the 'Repayment' type ( there is a 'ClientId' ). My guess is that you are reusing an 'older' version of your metadata.
The reason that this results in incomplete results is that once breeze determines that it is deserializing an 'entity' ( i.e. a json object that has $type property that maps to an actual entityType), it only attempts to deserialize the 'known' properties of this type, i.e. those found in the metadata. In your case, the 'Client' navigation property on the 'Repayment' type was never being deserialized, and any refs to the '$id' defined there are therefore not available.

Want Entity Framework 6.1 eager loading to load only first level

I am not sure whether I am approaching this the wrong way or whether it is default behaviour, but it is not working the way I expect ...
Here are two sample classes ...
public class Person
{
public string FirstName { get; set; }
public string LastName { get; set; }
public Department Department { get; set; }
}
Second one is Department
public class Department
{
public string Name { get; set; }
public List<Person> People { get; set; }
}
Context Configuration
public MyDbContext() : base("DefaultConnection")
{
this.Configuration.ProxyCreationEnabled = false;
this.Configuration.LazyLoadingEnabled = false;
}
public DbSet<Person> People { get; set; }
public DbSet<Department> Departments { get; set; }
I am trying to load people whose last name is 'Smith':
var foundPeople = context
    .People
    .Where(p => p.LastName == "Smith");
The above query loads foundPeople with just FirstName and LastName, and no Department object. This is correct behaviour as lazy loading is off, and it was expected.
Now in another query with Eager loading Department,
var foundPeople = context
    .People
    .Where(p => p.LastName == "Smith")
    .Include(p => p.Department);
The above query loads foundPeople with FirstName, LastName, and Department, with Department->Name as well as Department->People (all people in that department), which I don't want. I just want to load the first level of the included property.
I don't know whether this is intended behaviour or whether I have made a mistake.
Is there any way to load just the first level of an included property rather than the complete graph or all levels of the included property?
Using Include() to achieve eager loading only works if lazy loading is enabled on your objects--that is, your navigation properties must be declared as virtual, so that the EF proxies can override them with the lazy-loading behavior. Otherwise, they will eagerly load automatically and the Include() will have no effect.
Once you declare Person.Department and Department.People as virtual properties, your code should work as expected.
Very sorry, my original answer was wholly incorrect in the main. I didn't read your question closely enough and was incorrect in fact on the eager behavior. Not sure what I was thinking (or who upvoted?). Real answer below the fold:
Using the example model you posted (with necessary modifications: keys for the entities and removed "this" from context constructor) I was unable to exactly reproduce your issue. But I don't think it's doing what you think it's doing.
When you eagerly load the Department (or explicitly load, using context.Entry(...).Reference(...).Load()) inspect your results more closely: there are elements in the Department.People collections, but not all the Persons, only the Persons that were loaded in the query itself. I think you'll find, on your last snippet, that !foundPeople.SelectMany(p => p.Department.People).Any(p => p.LastName != "Smith") == true. That is, none of them are not "Smith".
I don't think there's any way around this. Entity Framework isn't explicitly or eagerly loading People collections (you could Include(p => p.Department.People) for that). It's just linking the ones that were loaded to their related object, because of the circular relationship in the model. Further, if there are multiple queries on the same context that load other Persons, they will also be linked into the object graph.
(An aside: in this simplified case, the proxy-creation and lazy-loading configurations are superfluous--neither are enabled on the entities by virtue of the fact that neither have lazy or proxy-able (virtual) properties--the one thing I did get right the first time around.)
By design, DbContext performs what is called "relationship fix-up". Because your model describes the relationships between your entities, whenever an entity is attached or modified in the context, EF will try to "fix up" the relationships between entities.
For example, if you load into the context an entity with a FK indicating that it is a child of another entity already attached to the context, it will be added to the children collection of the existing entity. If you make any changes (change the FK, delete the entity, etc.), the relationships will be fixed up automatically. That's what the other answer explains: even if you load the related entities separately, with a different query, they'll be attached to the children collection they belong to.
This functionality cannot be disabled. See other questions related to this:
AsNoTracking and Relationship Fix-Up
Is it possible to enable relationship fixup when change tracking is disabled but proxies are generated
How to get rid of the related entities
I don't know what you need to do, but with the current version of EF you have to detach the entity from the context and manually remove the related entities.
Another option is to map using AutoMapper or ValueInjecter, to get rid of the relationship fix-up.
You could try using a LINQ query so you can select only the fields that you need. I hope that helps.
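A rough sketch of that last suggestion follows. It projects each person into a flat result that carries only the department name, so there is no People collection for fix-up to populate. PersonDto and PeopleQueries are hypothetical names, and the query runs over an in-memory list standing in for context.People so the example does not need a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Department
{
    public string Name { get; set; }
    public List<Person> People { get; set; } = new List<Person>();
}

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public Department Department { get; set; }
}

// Hypothetical flat result type: first level only, no back-reference.
public class PersonDto
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string DepartmentName { get; set; }
}

public static class PeopleQueries
{
    // Accepts IQueryable so the same query could be given context.People.
    public static List<PersonDto> FindByLastName(IQueryable<Person> people, string lastName)
    {
        return people
            .Where(p => p.LastName == lastName)
            .Select(p => new PersonDto
            {
                FirstName = p.FirstName,
                LastName = p.LastName,
                DepartmentName = p.Department.Name
            })
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var sales = new Department { Name = "Sales" };
        var people = new List<Person>
        {
            new Person { FirstName = "Ann", LastName = "Smith", Department = sales },
            new Person { FirstName = "Bob", LastName = "Jones", Department = sales }
        };
        sales.People.AddRange(people);

        foreach (var dto in PeopleQueries.FindByLastName(people.AsQueryable(), "Smith"))
        {
            Console.WriteLine($"{dto.FirstName} {dto.LastName} ({dto.DepartmentName})");
        }
    }
}
```

Because the results are plain objects rather than tracked entities, they are not linked into the context's object graph by later queries.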