EF Many-Many Relation Circular Loading

Let's say I have EntityA that has multiple EntityB, and EntityB can have multiple EntityA. For simplicity:
class Student
{
public string Name {get;set;}
public virtual ICollection<Teacher> Teachers{get;set;}
}
class Teacher
{
public string Name {get;set;}
public virtual ICollection<Student> Students{get;set;}
}
I do mapping like this:
HasMany(x => x.Teachers)
.WithMany(x => x.Students)
.Map(x =>
{
x.MapLeftKey("StudentId");
x.MapRightKey("TeacherId");
x.ToTable("StudentTeacher");
});
Lazy loading is turned off.
Then I want to load a Student including its Teachers (meaning only the Name of each Teacher), but not the other Students of those Teachers, and then their Teachers, circularly. I tried something like this:
var student = _context.Students.Where(x => x.Name == studentName)
.Include(x=>x.Teachers)
.SingleOrDefault();
But I actually get all of the objects. I only want to load the first level.
How can I do this?

This is normal behavior. Even though it looks in the debugger as if you are loading the entire object graph, EF is only populating the child entities' collections with the initial entity.
Let's say you have this data:
Tables:
Students
1 | Student1
2 | Student2
Teachers
1 | Teacher1
2 | Teacher2
StudentTeacher
1 | 1
1 | 2
2 | 1
2 | 2
EF query
var student = _context.Students.Where(x => x.Name == "Student1")
.Include(x=>x.Teachers)
.SingleOrDefault();
gives you:
student.Teachers = ("Teacher1", "Teacher2")
student.Teachers[1].Students = ("Student1")
student.Teachers[2].Students = ("Student1")
You can see how the last 2 lines only bring back "Student1" even though each teacher has {"Student1", "Student2"} attached to them.
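If the goal is a result that holds only the first level (for example, to hand to a serializer), one option is to project into a small DTO instead of returning the entities. The following is a minimal sketch, not the asker's code: StudentDto and StudentProjection are hypothetical names, and the query is written against IQueryable<Student> so that it runs over an in-memory list here. The same Select shape could be tried against _context.Students, though that is an assumption that has not been checked against a particular EF version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Student
{
    public string Name { get; set; }
    public ICollection<Teacher> Teachers { get; set; } = new List<Teacher>();
}

public class Teacher
{
    public string Name { get; set; }
    public ICollection<Student> Students { get; set; } = new List<Student>();
}

// Hypothetical DTO: it carries names only, so it cannot contain a cycle.
public class StudentDto
{
    public string Name { get; set; }
    public IEnumerable<string> TeacherNames { get; set; }
}

public static class StudentProjection
{
    // Returns the matching student with teacher names only, or null if none matches.
    public static StudentDto FirstLevelOnly(IQueryable<Student> students, string studentName)
    {
        return students
            .Where(s => s.Name == studentName)
            .Select(s => new StudentDto
            {
                Name = s.Name,
                TeacherNames = s.Teachers.Select(t => t.Name)
            })
            .SingleOrDefault();
    }
}
```

Because the DTO has no reference back to Student, there is nothing circular left for a serializer or a debugger to walk.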

Initially I saw a problem with serialization when I accessed my system through Web API. Then I configured the JSON serializer to ignore reference loops, and it worked, but with a partially complete object graph and without duplicates, since the JSON serializer handles this. Then I found out lazy loading was on (I was assuming it was not :-) ). I turned it off, started to debug, and in the debugger I saw the "complete graph" of objects loaded, but I did not realize that this was a feature of the debugger.
So, when I access my system again through Web API, I now get the expected behavior. To make a long story short, my problem was lazy loading.

Related

Do I need a foreign key in the EF core code first?

I'm new at code first in entity framework and reading up on relationships, I see everyone does it differently. It might be because of earlier versions, might be the same or might be because of performance.
Let's say I have two tables Company and User.
I would set the company-to-user relationship like this:
public List<User> Users { get; set; } = new List<User>();
Then if I from the user-to-company perspective needed to find the company, I would have this in the User:
public Company Company { get; set; }
And do this query:
return await _clientContext.Users.Where(x => x.Company.Id == companyId).ToListAsync();
Or I could have this:
public int CompanyId { get; set; }
[ForeignKey("CompanyId")]
public Company Company { get; set; }
And have this query:
return await _clientContext.Users.Where(x => x.CompanyId == companyId).ToListAsync();
Also some define Company with the keyword virtual like this:
public virtual Company Company { get; set; }
I'm not sure if every scenario is the same and doing x.CompanyId instead of x.Company.Id would actually be the same. What is used normally?
I generally recommend the first option using Shadow Properties for FKs over the second when using navigation properties. The main reason is that with the second approach there are two sources of truth. For instance with a Team referencing a Coach, some code may use team.CoachId while other code uses team.Coach.CoachId. These two values are not guaranteed to always be in sync. (depending on when you happen to check them when one or the other is updated.)
Updating references between entities via a FK property can have varied behaviour depending on whether the referenced entity is loaded or not.
Consider what you would expect to happen when you want to update a team's coach:
var teamA = context.Teams.Single(x => x.TeamId == teamId);
If Team has a Coach navigation property and a CoachId FK reference I could do...
teamA.CoachId = newCoachId;
If TeamA's old coach ID was 1, and the newCoachId = 2, what do you think happens if I have code that lazy loads the coach before SaveChanges?
var coachName = teamA.Coach.Name;
You might expect that since the Coach hadn't been loaded yet it would load in Coach #2's name, but it loads Coach #1 because the change hasn't been committed even though teamA.CoachId == 2. If you check the Coach reference after SaveChanges you get Coach #2.
Depending on whether lazy loading is enabled or not, you can get somewhat strange behaviour when setting a FK property, where navigation properties are nulled. Even when eager loading, changing a FK property will potentially trigger a new lazy load if that new entity isn't already tracked:
var teamA = context.Teams.Include(x => x.Coach).Single(x => x.TeamId == teamId);
teamA.CoachId = newCoachId;
var coachName = teamA.Coach.Name; // Still points to Coach #1's name as expected.
context.SaveChanges();
coachName = teamA.Coach.Name; // Triggers lazy load and return new coach's name.
Saving a FK against an entity that has eager loaded the reference does not automatically re-populate referenced entities. So for instance if you have lazy loading disabled, the same above code:
context.SaveChanges();
coachName = teamA.Coach.Name; // Potential NullReferenceException on teamA.Coach.
This will potentially trigger a null reference exception unless the new coach happens to be tracked by the DbContext prior to SaveChanges being called. If the DbContext is tracking the entity, the new reference will be swapped in on SaveChanges, otherwise it is nulled. (With lazy loading this is covered by the new lazy load call after it was nulled)
When working with navigation properties my default recommendation is to hide FK properties as Shadow Properties. (For EF6 this means using .Map(x => x.MapKey()).) For relationships where I only care about the ID, such as lookups or bounded contexts where I want raw speed, I will expose the FK with no navigation property. So, one or the other.
I will deviate sparingly from this for exposing FKs for relationships I may inspect by ID frequently, and treat it as read-only, but still have infrequent need of the navigation property. An example of this would be CreatedBy / CreatedByUserId. Many queries may inspect the CreatedByUserId for data filtering, while some projections may want the CreatedBy.Name etc. A record's CreatedBy doesn't change so I avoid potential pitfalls of the data getting out of sync.
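For the Company / User model from the question, a shadow FK in EF Core might be configured roughly as follows. This is a sketch under assumptions rather than the answerer's code: ClientContext and UsersOfCompany are hypothetical names, the relationship is assumed to be required, and the Microsoft.EntityFrameworkCore package must be referenced. The column name "CompanyId" is taken from the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public class Company
{
    public int Id { get; set; }
    public List<User> Users { get; set; } = new List<User>();
}

public class User
{
    public int Id { get; set; }
    // Navigation property only: the class exposes no CompanyId.
    public Company Company { get; set; }
}

// Hypothetical context name.
public class ClientContext : DbContext
{
    public ClientContext(DbContextOptions<ClientContext> options) : base(options) { }

    public DbSet<User> Users { get; set; }
    public DbSet<Company> Companies { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<User>()
            .HasOne(u => u.Company)
            .WithMany(c => c.Users)
            // Naming a property that does not exist on User makes it a shadow property.
            .HasForeignKey("CompanyId")
            .IsRequired();
    }

    // Filters on the shadow FK without exposing it on the entity.
    public IQueryable<User> UsersOfCompany(int companyId) =>
        Users.Where(u => EF.Property<int>(u, "CompanyId") == companyId);
}
```

With this shape the navigation property is the only way application code can change the relationship, so there is no second value to drift out of sync.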
Your second scenario is used normally.
i.e.
public int CompanyId { get; set; }
[ForeignKey("CompanyId")]
public Company Company { get; set; }
And have this query:
return await _clientContext.Users.Where(x => x.CompanyId == companyId).ToListAsync();

EF: how to eager load associations correctly

I'm using EF for Data Access in my App. Data Model (greatly simplified):
class Project
{
virtual ICollection<Foo> Foos {get;set;}
virtual ICollection<Bar> Bars {get;set;}
// actually I have many more kinds of Project's data
}
class Foo
{
virtual Project Project {get;set;}
virtual ICollection<Bar> Bars {get;set;}
//And they are greatly intertwined with each other
}
//etc etc.
The point is:
All Foo, Bar etc. have references to each other
These references never leave the Project, i.e. the following is always true:
foo.Bars.All(bar => bar.Project == foo.Project) == true
I load projects as follows:
var project = Db.Set<Project>()
.Include(p => p.Foos)
.Include(p => p.Bars);
So, after that, if I access some foo.Bars I trigger a lazy load. That's expected: after all, I have eagerly loaded all the required data from Project, Foo and Bar, but not from the Foo_Bar link table.
Let's modify:
var project = Db.Set<Project>()
.Include(p => p.Foos.Select(f => f.Bars))
.Include(p => p.Bars);
Now I have all required data in memory, but (!) generated SQL is clearly not optimal. Actually, EF is loading Project, Foo, Foo_Bar tables and scans Bar twice — once for p.Bars and once for foo.Bars. So, I can't eager load foo.Bar_Ids without eager loading foo.Bars, can I?
What should I do to improve?
AFAIK the only way to achieve this is to declare Foo_Bars as an explicit link entity (rather than letting EF6 generate the link table for you) and Include it.

breezeManager.createEntity() fails to populate foreign key property for a specific value

I have two relevant tables here:
public partial class List
{...
public int RegionId { get; set; }
[ForeignKey("RegionId")]
public virtual Region Region { get; set; }
...}
public partial class Region
{
public Region()
{
Lists = new HashSet<List>();
}
public int RegionId { get; set; }
[Required]
[StringLength(255)]
public string Name { get; set; }
public DateTime Added { get; set; }
public virtual ICollection<List> Lists { get; set; }
}
Here's the contents of the Regions table:
RegionId Name
1 Global
2 China
3 USA
4 UK
8 Canada
9 Spain
10 France
On the breeze side of things, I pull down the Regions:
var query = new breeze.EntityQuery().from("Regions").select("RegionId,Name").orderBy("RegionId");
return $rootScope.breezeManager.executeQuery(query).then(function (data) {
service.regions = data.results;
It seems to work:
service.regions[2]; // Name: "USA", RegionId: 3
However, when I try to create a new entity:
var newList = $rootScope.breezeManager.createEntity('List', listValues);
And in listValues, I specify { RegionId: 3, ...}:
newList.RegionId // 3
newList.Region // null
That would be strange already perhaps, but the really frustrating thing is, if I specify another value, like 1, it works. Same for 2, and 4:
newList.RegionId // 1
newList.Region.Name // "Global"
I've been poring through the code (and the internet) for hours trying to figure this out, but it's eluded me, and thus qualifies for my first ever SO question!
Update
I'm now even more confused. I planned to workaround this by manually setting the Region after the createEntity call, so I added this line of code right above createEntity:
var region = Enumerable.From(service.regions).Single("$.RegionId == " + listValues.RegionId);
However, after doing so, and with no other changes, newList now correctly gets a populated Region, even for 3 (USA)! I then commented out that line and tried again, and it still worked. Tried a few other times, various combinations of things, and now it's not working again, even with that line. Not sure if that helps. I'll keep experimenting.
I don't know what you're trying to do exactly but I do see that you have a fundamental misunderstanding.
I believe you are expecting your first query to return Region entities and you think that service.regions is populated with Region entities in the success callback.
Neither is true. Your query contains a select() clause which makes it what we call a projection. A projection returns data objects, not entities. There are no Region entities in cache either.
You may have seen elsewhere - perhaps a Pluralsight video - where a projection query did return entities. That happens when the query includes a toType clause which actually casts the projected data into instances of the targeted type. That's a trick you should only perform with great care, knowing what you are doing, why, and the limitations. But you're not casting in this query so this digression is beside the point.
It follows that the service.regions array holds some data objects that contain Region data but it does not hold Region entities.
This also explains why, when you created a new List entity with RegionId:3, the new entity's Region property returned null. Of course it is null. Based only on what you've told us, there are no Region entities in cache at all, let alone a Region with id=3.
I can't explain how you're able to get a Region with id=1 or id=2. I'm guessing there is something you haven't told us ... like you acquired these regions by way of some other query.
I don't understand why you're using a projection query in the first place. Why not just query for all regions and be done with it?
breeze.EntityQuery.from("Regions").orderBy("RegionId")
.using(manager).execute(function(data) {
service.regions = data.results;
});
I don't understand your update at all; that doesn't look like a valid EF query to me.
Tangential issues
First, why are you using a HashSet<List> to initialize the Lists property in your server-side Region model class? That's a specialized collection type that only confuses matters here. It doesn't do what the casual reader might think it does. While it prevents someone from adding the same object reference twice, it doesn't do the more important job of preventing someone from adding two different List objects with the same RegionId. The simple, obvious, and correct thing to do is ...
Lists = new System.Collections.Generic.List<List>();
// the full type name is only necessary because you burned the name "List"
// when defining your model class.
Second, on the client-side, please don't extend the Angular $rootScope with anything. That kind of global variable "pollution" is widely regarded as "bad practice". Keep your Breeze stuff inside a proper Angular service and inject that service when you need something.

EF classes containing collections: Lists induce to use memory when navigating

When one wants to use EF navigation (navigating through the properties of the classes), List<T>s are all processed in memory.
For example I have this EF model class:
class School
{
public virtual ICollection<Group> Groups { get; set; }
...
public School()
{
this.Groups = new List<Group>(); // List<T>!!
}
}
And if I do this:
someSchool.Groups.Count
I will be counting the groups in memory and not in SQL (ie: these won't be counted like "select count(*) from Groups join School Where SchoolId = ...")
So my question is.. what should I use instead of List?
IEnumerable is an interface so I can't have a new IEnumerable,... IQueryable too..
If no collection class is suitable for this, then I guess I should be using my DbContext instance. Like this:
(new MyDbContext()).Groups.Count(g => g.SchoolId == ...)
If that is the case, then: why is there EF navigation?!??
Edit:
Ok, maybe I should use real information:
1. I'm already using ICollection (I used IEnumerable in the post because I thought they were similar)
2. This is the slow query: domain.Persons.Count(p => p.IsStudent && p.GuardianId != null && p.Guardian.Mobile.Equals(""))
3. This is the fast query: db.Persons.Count(p => p.Domains.Any(d => d.DomainId == domain.DomainId) && p.IsStudent && p.GuardianId != null && p.Guardian.Mobile.Equals(""))
As you can see, 2 and 3 are very similar... one uses navigation and the other doesn't.
You should use ICollection<T> instead and define your properties as virtual, so that you can lazy load and get your Count().
// Example
public virtual ICollection<Apple> Apples{get;set;}
The virtual keyword lets EF override the property and lazy load the entities for you when you access the getter.

How to do recursive load with Entity framework?

I have a tree structure in the DB with a TreeNodes table. The table has nodeId, parentId and parameterId. In EF, the structure is like TreeNode.Children, where each child is a TreeNode...
I also have a Tree table which contains id, name and rootNodeId.
At the end of the day I would like to load the tree into a TreeView but I can't figure how to load it all at once.
I tried:
var trees = from t in context.TreeSet.Include("Root").Include("Root.Children").Include("Root.Children.Parameter")
.Include("Root.Children.Children")
where t.ID == id
select t;
This will get me the first 2 generations but not more.
How do I load the entire tree with all generations and the additional data?
I had this problem recently and stumbled across this question after I figured a simple way to achieve results. I provided an edit to Craig's answer providing a 4th method, but the powers-that-be decided it should be another answer. That's fine with me :)
My original question / answer can be found here.
This works so long as your items in the table all know which tree they belong to (which in your case it looks like they do: t.ID). That said, it's not clear what entities you really have in play, but even if you've got more than one, you must have a FK in the entity Children if that's not a TreeSet
Basically, just don't use Include():
var query = from t in context.TreeSet
where t.ID == id
select t;
// if TreeSet.Children is a different entity:
var query = from c in context.TreeSetChildren
// guessing the FK property TreeSetID
where c.TreeSetID == id
select c;
This will bring back ALL the items for the tree and put them all in the root of the collection. At this point, your result set will look like this:
-- Item1
-- Item2
-- Item3
-- Item4
-- Item5
-- Item2
-- Item3
-- Item5
Since you probably want your entities coming out of EF only hierarchically, this isn't what you want, right?
So the next step is to exclude the descendants that are present at the root level:
Fortunately, because you have navigation properties in your model, the child entity collections will still be populated as you can see by the illustration of the result set above. By manually iterating over the result set with a foreach() loop, and adding those root items to a new List<TreeSet>(), you will now have a list with root elements and all descendants properly nested.
If your trees get large and performance is a concern, you can sort your return set ASCENDING by ParentID (it's Nullable, right?) so that all the root items are first. Iterate and add as before, but break from the loop once you get to one that is not null.
var subset = query
// execute the query against the DB
.ToList()
// filter out non-root-items
.Where(x => !x.ParentId.HasValue);
And now subset will look like this:
-- Item1
-- Item2
-- Item3
-- Item4
-- Item5
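The nesting described above is done by EF for entities it tracks. If the rows are plain objects that EF is not tracking (for example after projecting to a DTO), the same nesting can be done by hand. The following is a minimal sketch under that assumption; Node and TreeBuilder are hypothetical names, and ParentId is assumed to be null for root items.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical node type; ParentId is null for root items.
public class Node
{
    public int Id { get; set; }
    public int? ParentId { get; set; }
    public string Name { get; set; }
    public List<Node> Children { get; } = new List<Node>();
}

public static class TreeBuilder
{
    // Takes every row of a tree (already materialized) and returns only
    // the roots, with all descendants nested beneath them.
    public static List<Node> Nest(IEnumerable<Node> flat)
    {
        var all = flat.ToList();

        // Group the non-root rows by the id of their parent.
        var byParent = all
            .Where(n => n.ParentId.HasValue)
            .ToLookup(n => n.ParentId.Value);

        foreach (var node in all)
        {
            node.Children.Clear();
            // A lookup returns an empty sequence for ids without children.
            node.Children.AddRange(byParent[node.Id]);
        }

        // Keep only the root items at the top level.
        return all.Where(n => !n.ParentId.HasValue).ToList();
    }
}
```

This takes one pass to group the rows and one pass to attach them, so it stays linear in the number of rows regardless of tree depth.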
About Craig's solutions:
You really don't want to use lazy loading for this!! A design built around the necessity for n+1 querying will be a major performance sucker. (Well, to be fair, if you're going to allow a user to selectively drill down the tree, then it could be appropriate. Just don't use lazy loading for getting them all up-front!!)
I've never tried the nested set stuff, and I wouldn't suggest hacking EF configuration to make this work either, given there is a far easier solution.
Another reasonable suggestion is creating a database view that provides the self-linking, then map that view to an intermediary join/link/m2m table. Personally, I found this solution to be more complicated than necessary, but it probably has its uses.
When you use Include(), you are asking the Entity Framework to translate your query into SQL. So think: How would you write an SQL statement which returns a tree of an arbitrary depth?
Answer: Unless you are using specific hierarchy features of your database server (which are not SQL standard, but supported by some servers, such as SQL Server 2008, though not by its Entity Framework provider), you wouldn't. The usual way to handle trees of arbitrary depth in SQL is to use the nested sets model rather than the parent ID model.
Therefore, there are three ways which you can use to solve this problem:
Use the nested sets model. This requires changing your metadata.
Use SQL Server's hierarchy features, and hack the Entity Framework into understanding them (tricky, but this technique might work). Again, you'll need to change your metadata.
Use explicit loading or EF 4's lazy loading instead of eager loading. This will result in many database queries instead of one.
I wanted to post up my answer since the others didn't help me.
My database is a little different, basically my table has an ID and a ParentID. The table is recursive. The following code gets all children and nests them into a final list.
public IEnumerable<Models.MCMessageCenterThread> GetAllMessageCenterThreads(int msgCtrId)
{
var z = Db.MCMessageThreads.Where(t => t.ID == msgCtrId)
.Select(t => new MCMessageCenterThread
{
Id = t.ID,
ParentId = t.ParentID ?? 0,
Title = t.Title,
Body = t.Body
}).ToList();
foreach (var t in z)
{
t.Children = GetChildrenByParentId(t.Id);
}
return z;
}
private IEnumerable<MCMessageCenterThread> GetChildrenByParentId(int parentId)
{
var children = new List<MCMessageCenterThread>();
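// Materialize the rows first: the recursive call below runs another query
// while this loop is still iterating.
var threads = Db.MCMessageThreads.Where(x => x.ParentID == parentId).ToList();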
foreach (var t in threads)
{
var thread = new MCMessageCenterThread
{
Id = t.ID,
ParentId = t.ParentID ?? 0,
Title = t.Title,
Body = t.Body,
Children = GetChildrenByParentId(t.ID)
};
children.Add(thread);
}
return children;
}
For completeness, here's my model:
public class MCMessageCenterThread
{
public int Id { get; set; }
public int ParentId { get; set; }
public string Title { get; set; }
public string Body { get; set; }
public IEnumerable<MCMessageCenterThread> Children { get; set; }
}
I wrote something recently that does N+1 selects to load the whole tree, where N is the number of levels of your deepest path in the source object.
This is what I did, given the following self-referencing class
public class SomeEntity
{
public int Id { get; set; }
public int? ParentId { get; set; }
public string Name { get; set; }
}
I wrote the following DbSet helper
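using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Threading.Tasks;
namespace Microsoft.EntityFrameworkCore
{
    public static class DbSetExtensions
    {
        public static async Task<TEntity[]> FindRecursiveAsync<TEntity, TKey>(
            this DbSet<TEntity> source,
            Expression<Func<TEntity, bool>> rootSelector,
            Func<TEntity, TKey> getEntityKey,
            Expression<Func<TEntity, TKey>> getChildKeyToParent)
            where TEntity : class
        {
            // Keeps track of keys already processed, so as not to loop
            // forever if the data contains a cycle
            var alreadyProcessed = new HashSet<TKey>();
            // Collects every level, not only the roots
            var result = new List<TEntity>();
            TEntity[] currentLevel = await source.Where(rootSelector).ToArrayAsync();
            while (currentLevel.Length > 0)
            {
                // HashSet.Add returns false for a key seen before, which skips repeats
                TEntity[] fresh = currentLevel
                    .Where(x => alreadyProcessed.Add(getEntityKey(x)))
                    .ToArray();
                if (fresh.Length == 0) break;
                result.AddRange(fresh);
                TKey[] currentParentKeys = fresh.Select(getEntityKey).ToArray();
                // Build x => currentParentKeys.Contains(<parent key of x>) as an
                // expression tree, so that the provider can translate it to SQL
                var containsCall = Expression.Call(
                    typeof(Enumerable), nameof(Enumerable.Contains), new[] { typeof(TKey) },
                    Expression.Constant(currentParentKeys), getChildKeyToParent.Body);
                var childPredicate = Expression.Lambda<Func<TEntity, bool>>(
                    containsCall, getChildKeyToParent.Parameters);
                currentLevel = await source.Where(childPredicate).ToArrayAsync();
            }
            return result.ToArray();
        }
    }
}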
Whenever you need to load a whole tree you simply call this method, passing in three things
The selection criteria for your root objects
How to get the property for the primary key of the object (SomeEntity.Id)
How to get the child's property that refers to its parent (SomeEntity.ParentId)
For example
SomeEntity[] myEntities = await DataContext.SomeEntity.FindRecursiveAsync(
    rootSelector: x => x.Id == 42,
    getEntityKey: x => x.Id,
    getChildKeyToParent: x => x.ParentId);
Alternatively, if you can add a RootId column to the table then for each non-root entry you can set this column to the ID of the root of the tree. Then you can fetch everything with a single select
DataContext.SomeEntity.Where(x => x.Id == rootId || x.RootId == rootId)
For an example of loading in child objects, I'll give the example of a Comment object that holds a comment. Each comment has a possible child comment.
private static void LoadComments(<yourObject> q, Context yourContext)
{
if(null == q || null == yourContext)
{
return;
}
yourContext.Entry(q).Reference(x=> x.Comment).Load();
Comment curComment = q.Comment;
while(null != curComment)
{
curComment = LoadChildComment(curComment, yourContext);
}
}
private static Comment LoadChildComment(Comment c, Context yourContext)
{
if(null == c || null == yourContext)
{
return null;
}
yourContext.Entry(c).Reference(x=>x.ChildComment).Load();
return c.ChildComment;
}
Now if you were having something that has collections of itself you would need to use Collection instead of Reference and do the same sort of diving down. At least that's the approach I took in this scenario as we were dealing with Entity and SQLite.
This is an old question, but the other answers either had n+1 database hits or their models were conducive to bottom-up (trunk to leaves) approaches. In this scenario, a tag list is loaded as a tree, and a tag can have multiple parents. The approach I use only has two database hits: the first to get the tags for the selected articles, then another that eager loads a join table. Thus, this uses a top-down (leaves to trunk) approach; if your join table is large or if the result cannot really be cached for reuse, then eager loading the whole thing starts to show the tradeoffs with this approach.
To begin, I initialize two HashSets: one to hold the root nodes (the resultset), and another to keep a reference to each node that has been "hit."
var roots = new HashSet<AncestralTagDto>(); //no parents
var allTags = new HashSet<AncestralTagDto>();
Next, I grab all of the leaves that the client requested, placing them into an object that holds a collection of children (but that collection will remain empty after this step).
var startingTags = await _dataContext.ArticlesTags
.Include(p => p.Tag.Parents)
.Where(t => t.Article.CategoryId == categoryId)
.GroupBy(t => t.Tag)
.ToListAsync()
.ContinueWith(resultTask =>
resultTask.Result.Select(
grouping => new AncestralTagDto(
grouping.Key.Id,
grouping.Key.Name)));
Now, let's grab the tag self-join table, and load it all into memory:
var tagRelations = await _dataContext.TagsTags.Include(p => p.ParentTag).ToListAsync();
Now, for each tag in startingTags, add that tag to the allTags collection, then travel down the tree to get the ancestors recursively:
foreach (var tag in startingTags)
{
allTags.Add(tag);
GetParents(tag);
}
return roots;
Lastly, here's the nested recursive method that builds the tree:
void GetParents(AncestralTagDto tag)
{
var parents = tagRelations.Where(c => c.ChildTagId == tag.Id).Select(p => p.ParentTag);
if (parents.Any()) //then it's not a root tag; keep climbing down
{
foreach (var parent in parents)
{
//have we already seen this parent tag before? If not, instantiate the dto.
var parentDto = allTags.SingleOrDefault(i => i.Id == parent.Id);
if (parentDto is null)
{
parentDto = new AncestralTagDto(parent.Id, parent.Name);
allTags.Add(parentDto);
}
parentDto.Children.Add(tag);
GetParents(parentDto);
}
}
else //the tag is a root tag, and should be in the root collection. If it's not in there, add it.
{
//this block could be simplified to just roots.Add(tag), but it's left this way for other logic.
var existingRoot = roots.SingleOrDefault(i => i.Equals(tag));
if (existingRoot is null)
roots.Add(tag);
}
}
Under the covers, I am relying on the properties of a HashSet to prevent duplicates. To that end, it's important that the intermediate object that you use (I used AncestralTagDto here, and its Children collection is also a HashSet), override the Equals and GetHashCode methods as appropriate for your use-case.
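As an illustration of that last point, here is one way the DTO could define equality. This is a sketch, not the class actually used above: it assumes the id is an int and that two DTOs denote the same tag whenever their ids match. The constructor shape and the Children property follow the usage shown earlier.

```csharp
using System;
using System.Collections.Generic;

public sealed class AncestralTagDto : IEquatable<AncestralTagDto>
{
    public AncestralTagDto(int id, string name)
    {
        Id = id;
        Name = name;
    }

    public int Id { get; }
    public string Name { get; }
    public HashSet<AncestralTagDto> Children { get; } = new HashSet<AncestralTagDto>();

    // Assumption: identity is decided by Id alone.
    public bool Equals(AncestralTagDto other) => !(other is null) && other.Id == Id;

    public override bool Equals(object obj) => Equals(obj as AncestralTagDto);

    // Must agree with Equals, and must not depend on mutable state such as
    // Children, or the object could get lost inside a HashSet after it changes.
    public override int GetHashCode() => Id.GetHashCode();
}
```

With this in place, adding a second instance with the same id to a HashSet is a no-op, which is what keeps duplicates out of both roots and Children.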