Left outer join statements containing Reference navigation properties found in all split queries - entity-framework-core

I have an entity which contains several reference navigation properties under it.
The repository implementation for the entity looks something like this:
return await _dbContext.MyEntity
.Include(s => s.Address) //Reference Navigation
.Include(s => s.BuildingDetails) //Reference Navigation
.ThenInclude(s => s.ChildOfBuildingDetails)
.Include(s => s.ContactPersons)
.Include(s => s.Technicians)
.Include(s => s.DeactivationDetails) //Reference Navigation
.FirstOrDefaultAsync(s => s.Id == id, cancellationToken);
When I check the actual DB queries being executed, all the queries contain the reference navigation properties included in them as joins to the parent entity.
SELECT [m92].[Id], .......
FROM [MyDB].[ContactPersons] AS [m92]
INNER JOIN (
SELECT DISTINCT [m93].[Id], [t76].[Id] AS [Id0]
FROM [MyDB].[MyEntity] AS [m93]
LEFT JOIN (
SELECT [m94].*
FROM [MyDB].[DeactivationDetails] AS [m94]
WHERE [m94].[Deleted] = 0
) AS [t75] ON [m93].[Id] = [t75].[MyEntityId]
LEFT JOIN (
SELECT [m95].*
FROM [MyDB].[BuildingDetails] AS [m95]
WHERE [m95].[Deleted] = 0
) AS [t76] ON [m93].[Id] = [t76].[MyEntityId]
LEFT JOIN (
SELECT [m96].*
FROM [MyDB].[Address] AS [m96]
WHERE [m96].[Deleted] = 0
) AS [t77] ON [m93].[Id] = [t77].[MyEntityId]
WHERE [m93].[Deleted] = 0
) AS [t78] ON [m92].[MyEntityId] = [t78].[Id]
WHERE [m92].[Deleted] = 0
ORDER BY [t78].[Id], [t78].[Id0]
Basically, the whole portion inside the INNER JOIN is present in all the queries that are being executed. Ideally we only need to join the child entities with parent entity in the queries.
1) Why does EF Core translate to queries such that it includes the reference navigation properties in all the split queries?
2) Is there a way to avoid this behavior, specifically to replace the INNER JOIN block with just the parent entity?

1) Why does EF Core translate to queries such that it includes the reference navigation properties in all the split queries?
It's an implementation defect/missing optimization.
2) Is there a way to avoid this behavior, specifically to replace the INNER JOIN block with just the parent entity?
The only way I found is to remove the collection navigation property includes (which are what generate the additional queries), materialize the query, and then manually execute queries to load the related collections. This requires tracking queries and relies on navigation property fix-up.
For instance (assuming navigation properties not marked as reference are collections):
// Query with filters only
var query = _dbContext.MyEntity
.Where(s => s.Id == id);
// Execute and materialize query with only filters and reference includes
var result = await query
.Include(s => s.Address) //Reference Navigation
.Include(s => s.BuildingDetails) //Reference Navigation
//.ThenInclude(s => s.ChildOfBuildingDetails)
//.Include(s => s.ContactPersons)
//.Include(s => s.Technicians)
.Include(s => s.DeactivationDetails) //Reference Navigation
.FirstOrDefaultAsync(cancellationToken);
// Load the related collections
await query.SelectMany(s => s.BuildingDetails.ChildOfBuildingDetails)
.LoadAsync(cancellationToken);
await query.SelectMany(s => s.ContactPersons)
.LoadAsync(cancellationToken);
await query.SelectMany(s => s.Technicians)
.LoadAsync(cancellationToken);

Related

Entity Framework - How to get an entity including a filtered collection of child entities

I have categories, and categories have a collection of products (Category.Products). I need to retrieve a category from the db by its id, but instead of including all its products, I want it to only include products with a given condition (example, order=0)
How can I do this with linq?
I tried with:
var e = db.Categories
.Include(a => a.products)
.Where(a => a.products.Any(r => r.order == 0))
.FirstOrDefault(p => p.id == id_category);
I don't think you can do that. In any case, the call to .Include() should be after any where clause, or it won't work.
To filter the child collection, you can project into YourCustomModel or an anonymous type.
Note that it is not currently possible to filter which related entities are loaded; Include always brings in all related entities (see the MSDN reference).
var e = db.Categories
.Where(c => c.id == id_category)
.Select(p => new
{
category = p,
// Only the products matching the condition are projected
products = p.Products.Where(k => k.order == 0)
})
.FirstOrDefault();
var e = db.Categories.Where(a => a.order == 0);

EF Core: select distinct grandchildren with many-to-many relationship

I'm trying to learn EF Core and hit this wall since I'm also fairly new to LINQ
Consider the model:
I'm trying to get all the distinct users from a single company;
The SQL statement would be something like this:
SELECT DISTINCT gau.AppUserId, au.Name, au.Id FROM Companies c
INNER JOIN Groups g ON g.CompanyId = c.Id
INNER JOIN GroupAppUsers gau ON gau.GroupId = g.Id
INNER JOIN AppUsers au ON gau.AppUserId = au.Id
Where c.Id = 40
Result:
How would I build this query like this? (Without the includes)
return await context.Companies
.Include(g => g.Groups)
.ThenInclude(au => au.AppUsers)
.ThenInclude(u => u.AppUser)
.SingleOrDefaultAsync(x => x.Id == id);
*Also, I'm not sure about the DB Model, I'm trying to avoid circular references but I think I should put Users linked with Companies instead of Groups, what do you think??
I'm trying to get all the distinct users from a single company
Rather than starting from companies and navigating to users, which multiplies the users because of the many-to-many relationship and then requires the Distinct operator, you could simply start from users and apply an Any-based criterion. That eliminates the need for Distinct altogether.
Something like this (the DbSet / navigation property names could be different):
var companyUsers = await context.Users
.Where(u => u.UserGroups.Any(ug => ug.Group.Company.Id == id))
.ToListAsync();
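To see why the Any-based filter needs no Distinct, here is a minimal in-memory sketch using LINQ to Objects instead of a real DbContext. The class names (Company, Group, User, UserGroup), the sample data, and the helper CompanyUsersSketch are all assumptions made for illustration; they are not taken from the question's model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the entities (names are assumptions).
public class Company { public int Id { get; set; } }
public class Group { public int Id { get; set; } public Company Company { get; set; } }
public class UserGroup { public User User { get; set; } public Group Group { get; set; } }
public class User
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<UserGroup> UserGroups { get; } = new List<UserGroup>();
}

public static class CompanyUsersSketch
{
    // Start from users and filter with Any: every user is yielded at most once,
    // no matter how many of the company's groups the user belongs to.
    public static List<string> UsersOfCompany(IEnumerable<User> users, int companyId) =>
        users.Where(u => u.UserGroups.Any(ug => ug.Group.Company.Id == companyId))
             .Select(u => u.Name)
             .ToList();

    // Illustrative sample data only.
    public static List<User> SampleUsers()
    {
        var company40 = new Company { Id = 40 };
        var company41 = new Company { Id = 41 };
        var g1 = new Group { Id = 1, Company = company40 };
        var g2 = new Group { Id = 2, Company = company40 };
        var g3 = new Group { Id = 3, Company = company41 };
        var ann = new User { Id = 1, Name = "Ann" };
        var bob = new User { Id = 2, Name = "Bob" };
        // Ann is in two groups of company 40; Bob only belongs to company 41.
        ann.UserGroups.Add(new UserGroup { User = ann, Group = g1 });
        ann.UserGroups.Add(new UserGroup { User = ann, Group = g2 });
        bob.UserGroups.Add(new UserGroup { User = bob, Group = g3 });
        return new List<User> { ann, bob };
    }
}
```

Calling CompanyUsersSketch.UsersOfCompany(CompanyUsersSketch.SampleUsers(), 40) yields Ann once, even though she matches through two groups. Starting from the company and flattening with SelectMany would yield her twice before Distinct.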
Assuming your linking table (GroupAppUser) isn't modeled as an entity, something like:
var q = from c in db.Companies
from g in c.Groups
from u in g.AppUsers
select u;
or in Lambda form:
var q = db.Companies
.SelectMany(c => c.Groups)
.SelectMany(g => g.AppUsers);
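Once you have the single Companies object, you can use the navigation properties to get the AppUser objects:
// Load the company with its groups and their users (null if no row matches id)
var company = await context.Companies
.Include(g => g.Groups)
.ThenInclude(au => au.AppUsers)
.ThenInclude(u => u.AppUser)
.SingleOrDefaultAsync(x => x.Id == id);
// Flatten groups -> link rows -> users in memory, then remove duplicates
return company.Groups
.SelectMany(g => g.AppUsers)
.Select(gau => gau.AppUser)
.Distinct();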

Select with fields to include in detail table with EF Core

How can I select which fields should be included for the detail table in EF Core?
I tried with this query:
var result= this.context.MainTable
.Include(t => t.DetailTable)
.Select(t => new {
id = t.Id,
values = t.DetailTable.Select(t2 => t2.SomeField)
})
.ToArray();
I would expect this to result in a single query, but it does not. It automatically executes a separate query for every row in MainTable to select SomeField.

EF 7 - Navigation properties - incorrect SQL?

Does EF7 fully support navigation properties and custom projection? Or maybe I'm misunderstanding how to construct this query. The Study entity has a nullable ProjectId and corresponding virtual Project navigation property. The Project entity has a non-nullable CategoryId and Category navigation property. The entities were reverse scaffolded using the ef command.
If I run the following query:
return _context.Study
.Include(s => s.Project)
.ThenInclude(p => p.Category)
.Select(s => new Models.StudySearchResult
{
StudyId = s.StudyId,
MasterStudyId = s.MasterStudyId,
ShortTitle = s.ShortTitle,
Category = s.Project == null ? string.Empty : s.Project.Category.CategoryDesc,
SubmitterId = s.SubmitterId
}).ToList();
EF7 incorrectly generates the following SQL, which uses INNER JOIN instead of LEFT JOIN:
SELECT [s].[StudyId]
,[s].[MasterStudyId]
,[s].[ShortTitle]
,CASE WHEN [s].[ProjectId] IS NULL THEN @__Empty_0 ELSE [s.Project.Category].[CategoryDesc] END
,[s].[SubmitterId]
FROM [Study] AS [s]
INNER JOIN [Project] AS [s.Project]
ON [s].[ProjectId] = [s.Project].[ProjectId]
INNER JOIN [Category] AS [s.Project.Category]
ON [s.Project].[CategoryId] = [s.Project.Category].[CategoryId]
I have the same problem, and I found out that there are currently open issues in EF7 about generating SQL for optional navigations. They will be fixed in RC2.
https://github.com/aspnet/EntityFramework/issues/4205
https://github.com/aspnet/EntityFramework/issues/3186

Entity Framework: Querying Child Entities [duplicate]

It seems that I can't get a parent and a subset of its children from the db.
For example...
db.Parents
.Include(p => p.Children)
.Where(p => p.Children.Any(c => c.Age >= 5))
This will return all Parents that have a child aged 5+, but if I iterate through the Parents.Children collection, all children will be present (not just those over 5 years old).
Now the query does make sense to me (I've asked to include children and I've got them!), but can imagine that I would like to have the where clause applied to the child collection in some scenarios.
How could I get an IEnumerable, in which each of the parents has a filtered collection of Children (Age>=5)?
The only way to get a collection of parents with a filtered children collection in a single database roundtrip is to use a projection. Eager loading (Include) cannot be used because it doesn't support filtering; Include always loads the whole collection. The explicit loading approach shown by @Daz requires one roundtrip per parent entity.
Example:
var result = db.Parents
.Select(p => new
{
Parent = p,
Children = p.Children.Where(c => c.Age >= 5)
})
.ToList();
You can directly work with this collection of anonymous type objects. (You can also project into your own named type instead of an anonymous projection (but not into an entity like Parent).)
EF's context will also populate the Children collection of the Parent automatically if you don't disable change tracking (using AsNoTracking() for example). In this case you can then project the parent out of the anonymous result type (happens in memory, no DB query):
var parents = result.Select(a => a.Parent).ToList();
parents[i].Children will contain your filtered children for each Parent.
Edit to your last Edit in the question:
I am after a) A list of parents who have a child older than 5 (and
include only those children).
The code above would return all parents and include only the children with Age >= 5, so potentially also parents with an empty children collection if there are only children with Age < 5. You can filter these out using an additional Where clause for the parents to get only the parents which have at least one (Any) child with Age >= 5:
var result = db.Parents
.Where(p => p.Children.Any(c => c.Age >= 5))
.Select(p => new
{
Parent = p,
Children = p.Children.Where(c => c.Age >= 5)
})
.ToList();
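The same filter-then-project shape can be tried without a database. The following is a minimal sketch over in-memory lists (LINQ to Objects), so it only illustrates the query shape, not EF's SQL translation or change tracking. The types SketchParent, SketchChild, and ParentWithChildren and the helper FilteredChildrenSketch are hypothetical names introduced here; the named result type stands in for the anonymous projection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the Parent/Child entities.
public class SketchChild { public string Name { get; set; } public int Age { get; set; } }
public class SketchParent
{
    public string Name { get; set; }
    public List<SketchChild> Children { get; set; } = new List<SketchChild>();
}

// Named result type used instead of an anonymous projection.
public class ParentWithChildren
{
    public SketchParent Parent { get; set; }
    public List<SketchChild> Children { get; set; }
}

public static class FilteredChildrenSketch
{
    public static List<ParentWithChildren> Query(IEnumerable<SketchParent> parents, int minAge) =>
        parents
            // Keep only parents that have at least one matching child...
            .Where(p => p.Children.Any(c => c.Age >= minAge))
            // ...and project each parent together with only the matching children.
            .Select(p => new ParentWithChildren
            {
                Parent = p,
                Children = p.Children.Where(c => c.Age >= minAge).ToList()
            })
            .ToList();
}
```

Note that in this in-memory version Parent.Children still holds all children; only the projected Children list is filtered. The fix-up behaviour described above is specific to EF's change tracker.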
In EF Core 5.0, the Include method now supports filtering of the entities included.
https://learn.microsoft.com/en-us/ef/core/what-is-new/ef-core-5.0/whatsnew#filtered-include
var data = db.Parents
.Include(p => p.Children.Where(c => c.Age >= 5))
.ToList();
Taking your example the following should do what you need. Take a look here for more info.
db.Entry(Parents)
.Collection("Children")
.Query().Cast<Child>()
.Where(c => c.Age >= 5)
.Load();
I think parents and child are not really well suited as separate entities. A child can always also be a parent and usually a child has two parents (a father and a mother), so it's not the simplest context. But I assume you just have a simple 1:n relationship as in the following master-slave model that I used.
What you need to do is make a left outer join (that answer has led me on the right path). Such a join is a bit tricky to write, but here's the code:
var query = from m in ctx.Masters
join s in ctx.Slaves
on m.MasterId equals s.MasterId into masterSlaves
from ms in masterSlaves.Where(x => x.Age > 5).DefaultIfEmpty()
select new {
Master = m,
Slave = ms
};
foreach (var item in query) {
if (item.Slave == null) Console.WriteLine("{0} owns nobody.", item.Master.Name);
else Console.WriteLine("{0} owns {1} at age {2}.", item.Master.Name, item.Slave.Name, item.Slave.Age);
}
This will translate to the following SQL statement with EF 4.1:
SELECT
[Extent1].[MasterId] AS [MasterId],
[Extent1].[Name] AS [Name],
[Extent2].[SlaveId] AS [SlaveId],
[Extent2].[MasterId] AS [MasterId1],
[Extent2].[Name] AS [Name1],
[Extent2].[Age] AS [Age]
FROM [dbo].[Master] AS [Extent1]
LEFT OUTER JOIN [dbo].[Slave] AS [Extent2]
ON ([Extent1].[MasterId] = [Extent2].[MasterId]) AND ([Extent2].[Age] > 5)
Note that it is important to perform the additional where clause on the age on the joined collection and not between the from and the select.
EDIT:
If you want a hierarchical result, you can convert the flat list by performing a grouping:
var hierarchical = from line in query
group line by line.Master into grouped
select new { Master = grouped.Key, Slaves = grouped.Select(x => x.Slave).Where(x => x != null) };
foreach (var elem in hierarchical) {
Master master = elem.Master;
Console.WriteLine("{0}:", master.Name);
foreach (var s in elem.Slaves) // note that it says elem.Slaves not master.Slaves here!
Console.WriteLine("{0} at {1}", s.Name, s.Age);
}
Note that I used an anonymous type to store the hierarchical result. You can of course also create a specific type like this:
class FilteredResult {
public Master Master { get; set; }
public IEnumerable<Slave> Slaves { get; set; }
}
and then project the group into instances of this class. That makes it easier if you need to pass these results to other methods.
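As a rough sketch of that last step, the grouping can project straight into FilteredResult. The version below runs against in-memory lists (LINQ to Objects) rather than ctx.Masters and ctx.Slaves, so it shows the query shape only; the helper FilteredResultSketch, its Build method, and the exact entity property sets are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the Master/Slave entities.
public class Master { public int MasterId { get; set; } public string Name { get; set; } }
public class Slave
{
    public int SlaveId { get; set; }
    public int MasterId { get; set; }
    public string Name { get; set; }
    public int Age { get; set; }
}

public class FilteredResult
{
    public Master Master { get; set; }
    public IEnumerable<Slave> Slaves { get; set; }
}

public static class FilteredResultSketch
{
    public static List<FilteredResult> Build(IEnumerable<Master> masters, IEnumerable<Slave> slaves)
    {
        // Left outer join; the age filter sits on the joined collection,
        // so masters without a matching slave still produce one row (Slave == null).
        var flat = from m in masters
                   join s in slaves on m.MasterId equals s.MasterId into masterSlaves
                   from ms in masterSlaves.Where(x => x.Age > 5).DefaultIfEmpty()
                   select new { Master = m, Slave = ms };

        // Group the flat rows per master and project into the named type,
        // dropping the null placeholder rows produced by DefaultIfEmpty.
        return (from line in flat
                group line by line.Master into grouped
                select new FilteredResult
                {
                    Master = grouped.Key,
                    Slaves = grouped.Select(x => x.Slave).Where(x => x != null).ToList()
                }).ToList();
    }
}
```

A method that receives the List<FilteredResult> can then iterate result.Slaves directly instead of dealing with anonymous types.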