Entity Framework: Querying Child Entities [duplicate] - entity-framework

It seems that I can't get a parent and a subset of its children from the db.
For example...
db.Parents
.Include(p => p.Children)
.Where(p => p.Children.Any(c => c.Age >= 5))
This will return all Parents that have a child aged 5+, but if I iterate through the Parents.Children collection, all children will be present (not just those over 5 years old).
Now the query does make sense to me (I've asked to include children and I've got them!), but I can imagine that I would like to have the where clause applied to the child collection in some scenarios.
How could I get an IEnumerable<Parent> in which each parent has a filtered collection of Children (Age >= 5)?

The only way to get a collection of parents with a filtered children collection in a single database roundtrip is to use a projection. Eager loading (Include) cannot be used because it does not support filtering; Include always loads the whole collection. The explicit loading approach shown by @Daz requires one roundtrip per parent entity.
Example:
var result = db.Parents
.Select(p => new
{
Parent = p,
Children = p.Children.Where(c => c.Age >= 5)
})
.ToList();
You can directly work with this collection of anonymous type objects. (You can also project into your own named type instead of an anonymous projection (but not into an entity like Parent).)
EF's context will also populate the Children collection of the Parent automatically if you don't disable change tracking (using AsNoTracking() for example). In this case you can then project the parent out of the anonymous result type (happens in memory, no DB query):
var parents = result.Select(a => a.Parent).ToList();
parents[i].Children will contain your filtered children for each Parent.
Edit to your last Edit in the question:
I am after a) A list of parents who have a child older than 5 (and
include only those children).
The code above would return all parents and include only the children with Age >= 5, so potentially also parents with an empty children collection if there are only children with Age < 5. You can filter these out using an additional Where clause for the parents to get only the parents which have at least one (Any) child with Age >= 5:
var result = db.Parents
.Where(p => p.Children.Any(c => c.Age >= 5))
.Select(p => new
{
Parent = p,
Children = p.Children.Where(c => c.Age >= 5)
})
.ToList();
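As a rough illustration of the shape of this projection, here is a minimal in-memory sketch using LINQ to Objects. The Parent and Child classes and the FilteredChildrenDemo helper are hypothetical stand-ins, not the asker's real model, and because no EF context is involved, the relationship fix-up described above does not happen: parent.Children stays unfiltered and only the projected Children list is filtered.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity classes in the question.
public class Child
{
    public string Name { get; set; } = "";
    public int Age { get; set; }
}

public class Parent
{
    public string Name { get; set; } = "";
    public List<Child> Children { get; set; } = new List<Child>();
}

public static class FilteredChildrenDemo
{
    // Keeps parents that have at least one child aged minAge or more and
    // pairs each of them with only those children. The parent's own
    // Children list is left untouched (no EF change tracker is involved).
    public static List<(Parent Parent, List<Child> Children)> Query(
        IEnumerable<Parent> parents, int minAge)
    {
        return parents
            .Where(p => p.Children.Any(c => c.Age >= minAge))
            .Select(p => (Parent: p,
                          Children: p.Children.Where(c => c.Age >= minAge).ToList()))
            .ToList();
    }
}
```

Calling FilteredChildrenDemo.Query(parents, 5) returns one tuple per qualifying parent; against a real DbContext the same Where/Select pair would be written on db.Parents so that the filtering runs in the database.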

In EF Core 5.0, the Include method now supports filtering of the entities included.
https://learn.microsoft.com/en-us/ef/core/what-is-new/ef-core-5.0/whatsnew#filtered-include
var data = db.Parents
.Include(p => p.Children.Where(c => c.Age >= 5))
.ToList();

Taking your example the following should do what you need. Take a look here for more info.
// 'parent' is a single Parent entity that has already been loaded
db.Entry(parent)
.Collection("Children")
.Query().Cast<Child>()
.Where(c => c.Age >= 5)
.Load();

I think parents and child are not really well suited as separate entities. A child can always also be a parent and usually a child has two parents (a father and a mother), so it's not the simplest context. But I assume you just have a simple 1:n relationship as in the following master-slave model that I used.
What you need to do is make a left outer join (that answer has led me on the right path). Such a join is a bit tricky to do, but here's the code
var query = from m in ctx.Masters
join s in ctx.Slaves
on m.MasterId equals s.MasterId into masterSlaves
from ms in masterSlaves.Where(x => x.Age > 5).DefaultIfEmpty()
select new {
Master = m,
Slave = ms
};
foreach (var item in query) {
if (item.Slave == null) Console.WriteLine("{0} owns nobody.", item.Master.Name);
else Console.WriteLine("{0} owns {1} at age {2}.", item.Master.Name, item.Slave.Name, item.Slave.Age);
}
This will translate to the following SQL statement with EF 4.1
SELECT
[Extent1].[MasterId] AS [MasterId],
[Extent1].[Name] AS [Name],
[Extent2].[SlaveId] AS [SlaveId],
[Extent2].[MasterId] AS [MasterId1],
[Extent2].[Name] AS [Name1],
[Extent2].[Age] AS [Age]
FROM [dbo].[Master] AS [Extent1]
LEFT OUTER JOIN [dbo].[Slave] AS [Extent2]
ON ([Extent1].[MasterId] = [Extent2].[MasterId]) AND ([Extent2].[Age] > 5)
Note that it is important to perform the additional where clause on the age on the joined collection and not between the from and the select.
EDIT:
If you want a hierarchical result, you can convert the flat list by performing a grouping:
var hierarchical = from line in query
group line by line.Master into grouped
select new { Master = grouped.Key, Slaves = grouped.Select(x => x.Slave).Where(x => x != null) };
foreach (var elem in hierarchical) {
Master master = elem.Master;
Console.WriteLine("{0}:", master.Name);
foreach (var s in elem.Slaves) // note that it says elem.Slaves not master.Slaves here!
Console.WriteLine("{0} at {1}", s.Name, s.Age);
}
Note that I used an anonymous type to store the hierarchical result. You can of course also create a specific type like this
class FilteredResult {
public Master Master { get; set; }
public IEnumerable<Slave> Slaves { get; set; }
}
and then project the group into instances of this class. That makes it easier if you need to pass these results to other methods.
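A minimal sketch of that last step, run in memory with LINQ to Objects rather than against a database. The Master and Slave classes below are assumed shapes inferred from the generated SQL, and FilteredResultBuilder and its minAge parameter are hypothetical names introduced only for this illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed entity shapes, inferred from the columns in the SQL above.
public class Master
{
    public int MasterId { get; set; }
    public string Name { get; set; } = "";
}

public class Slave
{
    public int SlaveId { get; set; }
    public int MasterId { get; set; }
    public string Name { get; set; } = "";
    public int Age { get; set; }
}

public class FilteredResult
{
    public Master Master { get; set; }
    public IEnumerable<Slave> Slaves { get; set; }
}

public static class FilteredResultBuilder
{
    public static List<FilteredResult> Build(
        IEnumerable<Master> masters, IEnumerable<Slave> slaves, int minAge)
    {
        // Left outer join: the age filter sits on the joined collection,
        // so masters without a matching slave still produce one row.
        var flat = from m in masters
                   join s in slaves on m.MasterId equals s.MasterId into masterSlaves
                   from ms in masterSlaves.Where(x => x.Age > minAge).DefaultIfEmpty()
                   select new { Master = m, Slave = ms };

        // Group the flat rows back into one named result per master,
        // dropping the null placeholder produced by DefaultIfEmpty.
        return (from line in flat
                group line by line.Master into grouped
                select new FilteredResult
                {
                    Master = grouped.Key,
                    Slaves = grouped.Select(x => x.Slave).Where(x => x != null).ToList()
                }).ToList();
    }
}
```

The returned List<FilteredResult> can be passed to other methods, which is not practical with the anonymous type.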

Related

How do you build a recursive Expression tree in Entity Framework Core?

We are using EFCore.SqlServer.HierarchyId to represent a hierarchy in our data.
My goal is to return the descendants of an object with a particular path of indeterminate length, e.g. given a tree with the hierarchy one->two->three->four, the path one/two/three would return four
Knowing the length of the path, I can make a query like this:
var collections = await context.Collections.Where(c => c.CollectionHierarchyid.IsDescendantOf(
context.Collections.FirstOrDefault(c1 => c1.FriendlyId == "three" &&
context.Collections.Any(c2 => c2.CollectionHierarchyid == c1.CollectionHierarchyid.GetAncestor(1) && c2.FriendlyId == "two" &&
context.Collections.Any(c3 => c3.CollectionHierarchyid == c2.CollectionHierarchyid.GetAncestor(1) && c3.FriendlyId == "one")
)
).CollectionHierarchyid
)).ToListAsync();
But how would you go about this if the length of the path is unknown? I can't call a recursive function from the expression because it won't compile from Linq to Entity Sql.
I know the answer lies somewhere in using System.Linq.Expressions to build the expression, but I am not sure where to start.
The problem can be solved without generating expression trees dynamically (at least not directly), using only standard LINQ query operators.
Let's say you have a hierarchical entity like this
public class Entity
{
public HierarchyId Id { get; set; }
// other properties...
}
Given a subquery returning the full set
IQueryable<Entity> fullSet = context.Set<Entity>();
and subquery defining some filtered subset containing the desired ancestors
IQueryable<Entity> ancestors = ...;
Now getting all direct and indirect descendants can easily be achieved with
IQueryable<Entity> descendants = fullSet
.Where(d => ancestors.Any(a => d.Id.IsDescendantOf(a.Id)));
So the question is how to build ancestors subquery dynamically.
Filtering the full set and then retrieving the direct children of the matches, filtered by another criterion, can be done with a simple join operator
from p in fullSet.Where(condition1)
join c in fullSet.Where(condition2)
on p.Id equals c.Id.GetAncestor(1)
select c
Hence all you need is to apply that recursively, e.g. having
IEnumerable<TArg> args = ...;
representing the filtering criteria arguments ordered by level, then the query can be built as follows
var ancestors = args
.Select(arg => fullSet.Where(e => Predicate(e, arg)))
.Aggregate((prevSet, nextSet) =>
from p in prevSet join c in nextSet on p.Id equals c.Id.GetAncestor(1) select c);
With that being said, applying it to your example:
IEnumerable<string> friendlyIds = new [] { "one", "two", "three" };
var fullSet = context.Collections.AsQueryable();
var ancestors = friendlyIds
.Select(friendlyId => fullSet.Where(e => e.FriendlyId == friendlyId))
.Aggregate((prevSet, nextSet) =>
from p in prevSet join c in nextSet on p.CollectionHierarchyid equals c.CollectionHierarchyid.GetAncestor(1) select c);
var descendants = fullSet
.Where(d => ancestors.Any(a => d.CollectionHierarchyid.IsDescendantOf(a.CollectionHierarchyid)));
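To see how the Aggregate call chains one join per path segment, here is a minimal in-memory sketch. It is not the EF Core query itself: the hypothetical Node class replaces HierarchyId with a plain ParentId, so p.Id equals c.ParentId plays the role of GetAncestor(1), and PathResolver is a name made up for this example. It resolves only the node at the end of the path; the IsDescendantOf step is left to the database query above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a hierarchical entity.
public class Node
{
    public int Id { get; set; }
    public int? ParentId { get; set; }
    public string FriendlyId { get; set; } = "";
}

public static class PathResolver
{
    // Returns the nodes matching the last path segment whose chain of direct
    // parents matches the preceding segments, in order.
    // Aggregate throws InvalidOperationException if friendlyIds is empty.
    public static List<Node> Resolve(
        IReadOnlyCollection<Node> fullSet, IEnumerable<string> friendlyIds)
    {
        return friendlyIds
            // One filtered set per path level.
            .Select(id => fullSet.Where(n => n.FriendlyId == id))
            // Join each level to the next on the parent/child link and
            // carry the child side forward as the new "previous" set.
            .Aggregate((prevSet, nextSet) =>
                from p in prevSet
                join c in nextSet on (int?)p.Id equals c.ParentId
                select c)
            .ToList();
    }
}
```

Because every level is joined to the one before it, a node named "three" that does not sit under a "two" under a "one" is excluded, whatever the length of the path.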

Return parent entities based on (and including) child entities

I am trying to retrieve a list of entities based on properties of their child entities - and include those same child entities. I am using EntityFramework Core 3.1, although I am happy to upgrade to 5.x if there've been any changes that would solve this for me. I have also not explored EntityFramework much beyond some very basic CRUD boilerplate until now, so I am not sure if this is something more LINQ-oriented or specific to EF (Core). The below is a heavily simplified example of a method in a project that I will be using to ultimately return data to a consumer of my API.
A point of interest (POI) has a number of historical records (History). POI has a List<History> and a History has a PointID which is used by EF Core to populate the POI's List<History>.
Here is how I would get all the POIs and their histories, where a point was first registered since a certain date (using a nullable date parameter for this method)
var result = _context.POIs
.Where(point => (registeredSince == null || point.RegisteredAt >= registeredSince))
.Include(point => point.Histories)
.ToList();
However, my question is.. how would I then get only POIs based on an attribute within the History of that POI (and include those same History records?) Or, to use an example; I want to return only POIs that have History records with an areaId == 5 (and include those in the results)
One way, without hugely in-depth EF knowledge, would be:
First run a query to return History entities where history.areaId == 5 and only select history.PointId
Second query would be to get all POIs where id is in the returned PointId list above
..including History where history.areaId == 5 (a duplication)
However, I would be running part of this twice, which seems inefficient. Basically, could I efficiently use LINQ/EF to get all POIs where history.areaId == 5 (and then only include those History records with an areaId of 5)? Would I have to write something that unavoidably loads all POI and their History records, before I am able to narrow the results down, or is that something EF can happily do?
You can use Filtered include introduced in EF Core 5.x, to query like -
var result = _context.POIs
.Include(p => p.Histories.Where(h => h.areaId == 5))
.ToList();
This will return a list of POI where each will contain only histories for which areaId == 5.
EDIT:
If you want only the POIs which have any History with areaId == 5, you can simply filter them accordingly -
var result = _context.POIs
.Include(p => p.Histories.Where(h => h.areaId == 5))
.Where(p => p.Histories.Any(h => h.areaId == 5))
.ToList();
You should be able to use the following:
var result = _context.POIs
.Include(poi => poi.Histories)
// Enumerate linked Histories & get the areaId from each into a list
// ... then see if that list contains the areaID we're looking for.
.Where(poi => poi.Histories.Select(h => h.areaId).Contains(areaIdParam))
.ToList();

Entity Framework - How to get an entity including a filtered collection of child entities

I have categories, and categories have a collection of products (Category.Products). I need to retrieve a category from the db by its id, but instead of including all its products, I want it to only include products with a given condition (example, order=0)
How can I do this with linq?
I tried with:
var e = db.Categories
.Include(a => a.products)
.Where(a => a.products.Any(r => r.order == 0))
.FirstOrDefault(p => p.id == id_category);
I don't think you can do that. In any case, the call to .Include() should be after any where clause, or it won't work.
To filter the child collection, you can project into YourCustomModel or an anonymous type.
Note that it is not currently possible to filter which related entities are loaded: Include always brings in all related entities (MSDN reference).
var e = db.Categories
.Where(c => c.id == id_category)
.Select(p=> new
{
category = p,
products = p.Products.Where(k=>k.order==0)
}).FirstOrDefault();
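If an anonymous type is inconvenient, the same projection can target a named type. The sketch below runs in memory with LINQ to Objects; the Category and Product classes are assumed shapes, and CategoryWithProducts and CategoryQueries are hypothetical names that play the role of the custom model mentioned above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed entity shapes for illustration only.
public class Product
{
    public int Id { get; set; }
    public int Order { get; set; }
}

public class Category
{
    public int Id { get; set; }
    public List<Product> Products { get; set; } = new List<Product>();
}

// Hypothetical named result type (the "custom model").
public class CategoryWithProducts
{
    public Category Category { get; set; }
    public List<Product> Products { get; set; }
}

public static class CategoryQueries
{
    // Returns the category with the given id together with only the products
    // that have the requested order, or null if no such category exists.
    public static CategoryWithProducts FindWithProducts(
        IEnumerable<Category> categories, int categoryId, int order)
    {
        return categories
            .Where(c => c.Id == categoryId)
            .Select(c => new CategoryWithProducts
            {
                Category = c,
                Products = c.Products.Where(p => p.Order == order).ToList()
            })
            .FirstOrDefault();
    }
}
```

A category whose products all fail the condition is still returned, with an empty Products list; only an unknown id yields null.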
var e = db.Categories.Where(a => a.order == 0);

Nested Where on 1-to-many in LINQ2Entity

I'm using EF4. Having 2 entities:
Person { Name }
Hobbys { Person.Name, IsCoolHobby }
1 Person can have several hobbys.
I now have
IQueryable<Person> p;
p = container.PersonSet.Include("Hobbys").AsQueryable();
p = p.Where(x => x ?????);
List<Person> tmp = p.ToList();
How can I return only those Persons who have cool hobbys (IsCoolHobby == true)? I tried a join but I was not able to load them into the list (select can only return Person, Hobby or a new type - but how do I map them to entity objects again?)
Thanks
How can i return only those Persons who have cool hobbys (IsCoolHobby
== true)?
List<Person> tmp = container.PersonSet.Include("Hobbys")
.Where(p => p.Hobbys.Any(h => h.IsCoolHobby))
.ToList();
This will load the people who have at least one cool hobby but the Hobbys collection for those people will always contain all hobbys, also the uncool hobbys.
Edit
Unfortunately filtering and sorting children during eager loading (Include) is currently not supported. There is a request on the EF feature suggestion page for this feature. The request has status "Under review", so there is a little hope that it might get implemented in the future. (Probably far future: At least the first docs about EF 5 (beta) on MSDN say explicitly that eager loading with filtering/sorting is still not implemented.)
For now there are only two workarounds. The first is to use a projection:
var projectedData = container.PersonSet
.Where(p => p.Hobbys.Any(h => h.IsCoolHobby))
.Select(p => new
{
Person = p,
CoolHobbys = p.Hobbys.Where(h => h.IsCoolHobby)
})
.ToList();
The result is a collection of anonymous objects, each of which contains a person who has cool hobbys and a collection of those cool hobbys. If you don't disable change tracking (by using the NoTracking option for the query), the person's hobbys collection should be filled with the result automatically.
The second option is to use "explicit" loading with CreateSourceQuery:
List<Person> tmp = container.PersonSet
.Where(p => p.Hobbys.Any(h => h.IsCoolHobby))
.ToList();
foreach (var person in tmp)
{
person.Hobbys.Attach(person.Hobbys.CreateSourceQuery()
.Where(h => h.IsCoolHobby).ToList());
}
Two things to note here:
CreateSourceQuery is only available on EntityCollections, i.e. if you are using EntityObject derived entities. It's not available for POCO entities in EF 4.0. (EF >= 4.1/DbContext has the option for explicit loading also for POCOs -> Query() method.)
The above code represents 1+N roundtrips to the database: The first for the person collection without the hobbys and then one additional query per person to load the cool hobbys.

Entity Framework Include with condition

I need to filter a dealer based on id and the incomplete checkins
Initially, it returned the dealer based only on id:
// TODO: limit checkins to those that are not complete
return this.ObjectContext.Dealers
.Include("Groups")
.Include("Groups.Items")
.Include("Groups.Items.Observations")
.Include("Groups.Items.Recommendations")
.Include("Checkins")
.Include("Checkins.Inspections")
.Include("Checkins.Inspections.InspectionItems")
.Where(d => d.DealerId == id)
.FirstOrDefault();
As you can see the requirement is to limit the checkins.
Here's what I did:
var query = from d in this.ObjectContext.Dealers
.Include("Groups")
.Include("Groups.Items")
.Include("Groups.Items.Observations")
.Include("Groups.Items.Recommendations")
.Include("Checkins.Inspections")
.Include("Checkins.Inspections.InspectionItems")
.Where(d => d.DealerId == id)
select new
{
Dealer = d,
Groups = from g in d.Groups
select new
{
Items = from i in g.Items
select new
{
Group = i.Group,
Observations = i.Observations,
Recommendations = i.Recommendations
}
},
Checkins = from c in d.Checkins
where c.Complete == true
select new
{
Inspections = from i in c.Inspections
select new
{
InspectionItems = i.InspectionItems
}
}
};
var dealer = query.ToArray().Select(o => o.Dealer).First();
return dealer;
It works.
However, I am not convinced I am doing the right thing.
What is the best way to accomplish what I did? A stored procedure maybe?
I am not sure I even have to use Include clause anymore
Thank you.
If you want to load filtered relations with a single query, you do indeed have to execute such a projection, but you don't need those calls to Include. Once you are building projections, includes are not used - the returned data is under your control.
Stored procedure will help you only if you fall back to plain ADO.NET because stored procedures executed through Entity framework are not able to fill related entities (only flattened structures).
The automatic fixup mentioned by @Andreas requires multiple database queries and, as far as I know, it works only if lazy loading is disabled: a proxied object apparently has no information about the fixup and still flags each relation internally as not loaded, so the first access to a relation still executes an additional query.
Maybe you can make use of the relation fixup mechanism in the EF ObjectContexts. When you do multiple queries in the same context for entities, that are related by associations, these are resolved.
Assuming your association between Dealers and Checkins is 1:n with navigation properties on each side, you could do like:
var dealer = yourContext.Dealers
.Where(p => p.DealerId == id)
.FirstOrDefault();
if(dealer != null)
{
yourContext.Checkins
.Where(c => c.Complete && c.DealerId == dealer.DealerId)
.ToList();
}
I have not tested this yet, but since EF recognises that the Checkins it inserts into the context with the second query belong to the dealer from the first query, the corresponding references are created.
@Andreas H:
Awesome, thank you a lot.
I had to adjust your suggestion like this and it worked:
var dealer = this.ObjectContext.Dealers
.Include("Groups")
.Include("Groups.Items")
.Include("Groups.Items.Observations")
.Include("Groups.Items.Recommendations")
.Where(p => p.DealerId == id)
.FirstOrDefault();
if (dealer != null)
{
this.ObjectContext.Checkins
.Include("Inspections")
.Include("Inspections.InspectionItems")
.Where(c => !c.Complete && c.Dealer.DealerId == dealer.DealerId)
.ToList();
}
return dealer;
I still have to use the Include otherwise it won't return the referenced entities.
Note also that Dealer.Groups are unrelated to the Dealer.Checkins.
So if there's no checkins satisfying the condition, Groups still need to be returned.
It's interesting to note that at first, I put the two include for checkins to the dealer
var dealer = this.ObjectContext.Dealers
.Include("Groups")
.Include("Groups.Items")
.Include("Groups.Items.Observations")
.Include("Groups.Items.Recommendations")
.Include("Checkins.Inspections")
.Include("Checkins.Inspections.InspectionItems")
.Where(p => p.DealerId == id)
.FirstOrDefault();
if (dealer != null)
{
this.ObjectContext.Checkins
.Where(c => c.Complete && c.DealerId == id)
.ToList();
}
return dealer;
but it returned all the Checkins including those which are not complete.
I don't understand exactly why the latter doesn't work but the former does, or how the entities are resolved. My intuition is that the latter returns all the data because the Checkins are already loaded by the Include calls.
Your accepted solution will generate multiple database queries. As Ladislav Mrnka said, a projection is the only way to pull your result with one query. Maintaining such code is indeed hard. Maybe you could use an IQueryable extension that builds the projection dynamically and keeps your code clean:
var query = this.ObjectContext.Dealers.SelectIncluding(new List<Expression<Func<Dealer, object>>>() {
x => x.Groups,
x => x.Groups.Select(y => y.Items),
x => x.Groups.Select(y => y.Items.Select(z => z.Observations)),
x => x.Groups.Select(y => y.Items.Select(z => z.Recommendations)),
x => x.Checkins.Where(y => y.Complete==true),
x => x.Checkins.Select(y => y.Inspections),
x => x.Checkins.Select(y => y.Inspections.Select(z => z.InspectionItems))
});
var dealer = query.First();
return dealer;
You can find the extension at thiscode/DynamicSelectExtensions on GitHub.