DbSet<T>.Include() causes SELECT N+1 when used in extension method

DbSet<T>.Include() causes SELECT N+1 when used in extension method - entity-framework

I have an extension on IQueryable that allows passing in delimited string of property names which, when used causes query not to construct JOINs and effectively causes SELECT N+1 issue.
What I noticed is that if I call native EF extension .Include("property") directly off of DbSet everything works fine. But if I use my extension (I even simplified it to just call .Include("property") SELECT N+1 occurs...
My questions is why? What am I doing wrong?
Here is calling method (from service)
public MyModel[] GetAll(int page, out int total, int pageSize, string sort, string filter)
{
return _myModelRepository
.Get(page, out total, pageSize, sort, filter, "PropertyOnMyModelToInclude")
.ToArray();
}
Here is the repository method that uses extension
public virtual IQueryable<T> Get(int page, out int total, int pageSize, string sort, string filter = null, string includes = null)
{
IQueryable<T> query = DatabaseSet;
if (!String.IsNullOrWhiteSpace(includes))
{
//query.IncludeMany(includes); // BAD: SELECT N+1
//query.Include(includes); // BAD: SELECT N+1
}
if (!String.IsNullOrWhiteSpace(filter))
{
query.Where(filter);
}
total = query.Count(); // needed for pagination
var order = String.IsNullOrWhiteSpace(sort) ? DefaultOrderBy : sort;
var perPage = pageSize < 1 ? DefaultPageSize : pageSize;
//return query.OrderBy(order).Paginate(page, total, perPage); // BAD: SELECT N+1 (in both variations above)
//return query.IncludeMany(includes).OrderBy(order).Paginate(page, total, perPage); // BAD: SELECT N+1
return query.Include(includes).OrderBy(order).Paginate(page, total, perPage); // WORKS!
}
Here is the extension (reduced just to call Include() to illustrate the issue)
public static IQueryable<T> IncludeMany<T>(this IQueryable<T> query, string includes, char delimiter = ',') where T : class
{
// OPTION 1
//var propertiesToInclude = String.IsNullOrWhiteSpace(includes)
// ? new string[0]
// : includes.Split(new[] {delimiter}, StringSplitOptions.RemoveEmptyEntries).Select(p => p.Trim()).ToArray();
//foreach (var includeProperty in propertiesToInclude)
//{
// query.Include(includeProperty);
//}
// OPTION 2
//if (!String.IsNullOrWhiteSpace(includes))
//{
// var propertiesToInclude = includes.Split(new[] { delimiter }, StringSplitOptions.RemoveEmptyEntries).AsEnumerable(); //.Select(p => p.Trim());
// propertiesToInclude.Aggregate(query, (current, include) => current.Include(include));
//}
// OPTION 3 - for testing
query.Include(includes);
return query;
}

I think the fundamental problem here is in the way you are using the Include method, and also incidentally, the Where method. These methods, as is typical with LINQ extension methods, do not modify the object that they are called on. Instead they return a new object which represents the query after the operator has been applied. So, for example, in this code:
var query = SomeQuery();
query.Include(q => q.Bing);
return query;
the Include method basically does nothing because the new query returned by Include is thrown away. On the other hand, this:
var query = SomeQuery();
query = query.Include(q => q.Bing);
return query;
applies the Include to the query and then updates the query variable with the new query object returned from Include.
It's not in the code you have posted, but I think you are still seeing N+1 with your code because the Include is being ignored and the related collections are therefore still being loaded using lazy loading.

Related

Get return value from ExecuteSqlRaw in EF Core

I have an extremely large table that I'm trying to get the number of rows for. Using COUNT(*) is too slow, so I want to run this query using EF Core:
int count = _dbContext.Database.ExecuteSqlRaw(
"SELECT Total_Rows = SUM(st.row_count) " +
"FROM sys.dm_db_partition_stats st " +
"WHERE object_name(object_id) = 'MyLargeTable' AND(index_id < 2)");
The only problem is that the return value isn't the result of the query, but the number of records returned, which is just 1
Is there a way to get the correct value here, or will I need to use a different method?

Since you only need a scalar value you can also use an output parameter to retrieve the data, eg
var sql = #"
SELECT #Total_Rows = SUM(st.row_count)
FROM sys.dm_db_partition_stats st
WHERE object_name(object_id) = 'MyLargeTable' AND(index_id < 2)
";
var pTotalRows = new SqlParameter("#Total_Rows", System.Data.SqlDbType.BigInt);
pTotalRows.Direction = System.Data.ParameterDirection.Output;
db.Database.ExecuteSqlRaw(sql, pTotalRows);
var totalRos = (long?)(pTotalRows.Value == DBNull.Value ? null:pTotalRows.Value) ;

If one let's me to recreate a correct answer based on this blog: https://erikej.github.io/efcore/2020/05/26/ef-core-fromsql-scalar.html
We need to create a virtual entity model for our database, that will contain our needed query result, at the same time we need a pseudo DbSet<this virtual model> to use ef core FromSqlRaw method that returns data instead of ExecuteSqlRaw that just returns numbers of rows affected by query.
The example is for returning an integer value, but you can easily adapt it:
Define a return value holding class
public class IntReturn
{
public int Value { get; set; }
}
Fake a virtual DbSet<IntReturn> it will not be really present in db:
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
...
modelBuilder.Entity<IntReturn>().HasNoKey();
base.OnModelCreating(modelBuilder);
}
Now we can call FromSqlRaw for this virtual set. In this example the calling method is inside MyContext:DbContext, you'd need to instantiate your own context to use it instead of this):
NOTE the usage of "as Value" - same name as IntReturn.Value property. In some wierd cases you'd have to do the opposite: name your virtual model property after the name of the value thre database funstion is returning.
public int ReserveNextCustomerId()
{
var sql = $"Select nextval(pg_get_serial_sequence('\"Customers\"', 'Id')) as Value;";
var i = this.Set<IntReturn>()
.FromSqlRaw(sql)
.AsEnumerable()
.First().Value;
return i;
}

EF custom select with entity property as parameters

For every table (and I have a lot of them) I provide Lookup REST API method in my ASP.Net Core application.
This is part of my query for every table:
context.Users.Select(t => new ViewLookupModel()
{
id = t.Id,
title = t.DisplayName
})
//......
context.Groups.Select(t => new ViewLookupModel()
{
id = t.Id,
title = t.Name
})
But I want to write extension to shrink code I need to write for every table. Something like this:
public static IQueryable<ViewLookupModel> SelectLookup<T>(this IQueryable<T> query, Func<int> idField, Func<string> titleField)
{
return query.Select(t => new ViewLookupModel()
{
id = idField(),
title = titleField()
});
}
And use case:
context.Users.SelectLookup(t => t.Id, t => t.DisplayName)
//......
context.Groups.SelectLookup(t => t.Id, t => t.Title)
But I get error:
Delegate 'Func' does not take 1 arguments.
This and this question seems similar, but I can not get it to work.
I am also interesting in any performance issues when querying database with custom SELECT extension method.

Change your extension method to this and try. Extension method takes T as input and returns the corresponding int or string etc.
public static IQueryable<ViewLookupModel> SelectLookup<T>(this IQueryable<T> query, Func<T,int> idField, Func<T,string> titleField)
{
return query.Select(t => new ViewLookupModel()
{
id = idField(t),
title = titleField(t)
});
}

Linq expression with Join, and conditional OrderBy / OrderByDescending

I am building a Linq expression, I don't have the ability to use Dynamic Linq in this project. Here is my initial statement.
var orderListQuery = context.Orders.Where(oExpr)
.Join(context.Members,
o => o.MemberId,
m => m.Id,
(o, m) => new {Order = o, Member = m})
.Where(oM => oM.Member.CustomerId == custId);
The next part is where I would like to conditionally add a descending expression:
orderListQuery = descending
? orderListQuery.OrderByDescending(sortProperty)
: orderListQuery.OrderBy(sortProperty);
the error which I already saw a post on is that I need to explicitly list the <TSource, TKey>, but due to the complexity of the syntax from the join, I have no idea how to explicitly list these. I tried <Order, string>, which does not work.
Anything helps, thanks.

Because you forget to write what goal you are trying to reach (no specification), it is a bit unclear what your problem is, and how this can be solved. I assume that your problem is not the join (although it can be simplified), but the problem is how users of your method can provide a sort property.
How to specify and use the Sort Property
The problem is, that the caller of your function decides about the sort property. Of course you only want to make useful and very reusable functions, so you don't want to limit the user of your function, he can specify any sort property he wants, as long as it is something the result of your join can be sorted on.
The result of your join contains data from a sequence of Orders and a Sequence of members. It is obvious that your user can only request to sort the result of your join on (a combination of) values from Orders and or Members.
The key selector in Queryable.OrderBy has the format Expression<Func<TSource, TKey>> keySelector. In this format the TSource is the result of your join.
As the user can only sort on values from Orders and/or Members, his key selector should be a function that selects a key from Orders and/or Members
public MyResult MyQuery<TKey>(Expression<Func<Order, Member, TKey> keySelector,
SortOrder sortOrder)
{
... // TODO: your query?
}
Examples of usage would be:
// order by Order.Id:
var result = MyQuery( (order, member) => order.Id, SortOrder.Ascending);
// order by Member.Name:
var result = MyQuery( (order, member) => member.Name, SortOrder.Descending);
// order by something complicated
var result = MyQuery( (order, member) => new{Id = order.Id, Name = member.Name},
SortOrder.Ascending);
Now that the interface is specified, let's define some helper classes and fill your function
class MyResult
{
public Order Order {get; set;}
public Member Member {get; set;}
}
class SortableMyResult<TKey>
{
public TKey SortKey {get; set;}
public MyResult MyResult {get; set;}
}
MyResult MyQuery<TKey>(
Expression<Func<Order, Member, TKey> keySelector, SortOrder sortOrder)
{
var query = context.Orders.Where(oExpr)
.Join(context.Members,
order => order.MemberId,
member => member.Id,
(order, member) => new SortableMyResult<TKey>()
{
SortKey = KeySelector(order, member),
MyResult = new MyResult()
{
Order = order,
Member = member,
}
))
.Where(....);
What I've done, is that your original result is put in a MyResult object. I have also calculated the value of SortKey using the Expression that the caller provided, and put the SortKey and the MyResult in a SortableMyResult.
Did you notice that the return value of your Expression is a TKey, which is also the type of property SortKey?
Because I've defined the helper classes, my compiler can check that I did not make any errors.
Do the sorting:
IQueryable<SortableMyResult<TKey>> sortedMyResult;
switch (sortOrder)
{
case sortOrder.Ascending:
sortedMyResult = query.OrderBy(item => item.SortKey);
break;
case sortOrder.Descending:
sortedMyResult = query.OrderByDescending(item => item.SortKey);
break;
default: // do not order
sortedMyResult = query;
break;
}
Finally, extract and return MyResult:
return sortedMyResult.Select(sortedMyResult => sortedMyResult.MyResult);
If you want to make your function even more generic, you can let the caller give the opportunity to provide an IComparer (and if he doesn't, use the default comparer for TKey):
MyResult MyQuery<TKey>(
Expression<Func<Order, Member, TKey> keySelector,
SortOrder sortOrder,
IComparer<TKey> comparer = EqualityComparer<TKey>.Default)
{
...
}

Thanks for that incredible detailed answer. It has been a few months now, and more about MVC and the existing application code is becoming clear. What had happened in my case was that we had an extension method defined that allowed OrderBy and OrderByDescending to accept strings passed in by our sorting module from the front end. e.g:
public static IOrderedQueryable<T> OrderByDescending<T>(this IQueryable<T> source, string propertyName)
{
return GetExpression(source, propertyName);
}
This would then funnel it into a GetExpression function that returned an IOrderedQueryable<T>
I had not included the reference to this, and due to my inexperience with both the framework and the application I was cranking my head for essentially no reason.

Merging Entity Framework Expression Trees

I'm looking for a way to merge multiple expression trees in order to build selectors for an Entity Framework query. The query knows which columns to select based on user-provided parameters. For example, a basic query returns ID/Name columns of an entity. If a parameter is explicitly set to also retrieve the Description column, then the query will return ID/Name/Description.
So, what I need it the code for the MergeExpressions method in the following code.
Expression<Func<T, TDto>> selector1 = x => new TDto
{
Id = x.Id,
Name = x.Name
}
Expression<Func<T, TDto>> selector2 = x => new TDto
{
Description = x.Description
}
var selector = selector1;
if (includeDescription)
selector = MergeExpressions(selector1, selector2);
var results = repo.All().Select(selector).ToList();
Thank you.

Not sure for general case, but merging MemberInitExpression bodied lambdas like in your sample is relatively easy. All you need is to create another MemberInitExpression with combined Bindings:
static Expression<Func<TInput, TOutput>> MergeExpressions<TInput, TOutput>(Expression<Func<TInput, TOutput>> first, Expression<Func<TInput, TOutput>> second)
{
Debug.Assert(first != null && first.Body.NodeType == ExpressionType.MemberInit);
Debug.Assert(second != null && second.Body.NodeType == ExpressionType.MemberInit);
var firstBody = (MemberInitExpression)first.Body;
var secondBody = (MemberInitExpression)second.Body.ReplaceParameter(second.Parameters[0], first.Parameters[0]);
var body = firstBody.Update(firstBody.NewExpression, firstBody.Bindings.Union(secondBody.Bindings));
return first.Update(body, first.Parameters);
}
Note that the lambda expressions must be bound to one and the same parameters, so the above code uses the following parameter replacer helper to rebind second lambda body to the first lambda parameter:
public static partial class ExpressionUtils
{
public static Expression ReplaceParameter(this Expression expression, ParameterExpression source, Expression target)
{
return new ParameterReplacer { Source = source, Target = target }.Visit(expression);
}
class ParameterReplacer : ExpressionVisitor
{
public ParameterExpression Source;
public Expression Target;
protected override Expression VisitParameter(ParameterExpression node)
{
return node == Source ? Target : base.VisitParameter(node);
}
}
}

Check out PredicateBuilder.
Example:
Expression<Func<Customer, bool>> expr1 = (Customer c) => c.CompanyName.StartsWith("A");
Expression<Func<Customer, bool>> expr2 = (Customer c) => c.CompanyName.Contains("B");
var expr3 = PredicateBuilder.And(expr1, expr2);
var query = context.Customers.Where(expr3);
or
var expr3 = expr1.And(expr2);
var query = context.Customers.Where(expr3);

I do this kind of thing with extension methods. Its syntactically a bit nicer than using expression trees everywhere. I call this composable repositories.
I also wrote a tool (LinqExpander) to combine the expression trees of different extension methods togeather, which is especially useful for doing projection (selects) from your database. This is only nessacary when you are doing things with sub-entities. (see my post here: Composable Repositories - Nesting extensions)
usage would be something along the lines of:
var dtos = context.Table
.ThingsIWant() //filter the set
.ToDtos() //project from database model to something else (your Selector)
.ToArray();//enumerate the set
ToDtos might look something like:
public static IQueryable<DtoType> ToDtos(this IQueryable<DatabaseType> things)
{
return things.Select(x=> new DtoType{ Thing = x.Thing ... });
}
You want to merge two selects togeather (im assuming to avoid an underfetch but this seems a bit wierd). I would do this by using a projection like this:
context.Table
.AsExpandable()
.Select(x=>new {
Dto1 = x.ToDto1(),
Dto2 = x.ToDto2()
})
.ToArray();
if you really wanted it to return a single entity like this you could probably do something like:
context.Table
.AsExpandable()
.Select(x=> ToDto1(x).ToDto2(x));
but I havent ever tried this.
As this uses a sub projection you will need the .AsExpandable extensions.

Combining Includes with Entity Framework

I generally use a generic repository to boilerplate my EF queries so I have to write limited code and also use caching. The source code for the repository can be found here.
The backbone query within the code is this one below. FromCache<T>() is an IEnumerable<T> extension method that utilizes the HttpContext.Cache to store the query using a stringified representation of the lambda expression as a key.
public IQueryable<T> Any<T>(Expression<Func<T, bool>> expression = null)
where T : class, new()
{
// Check for a filtering expression and pull all if not.
if (expression == null)
{
return this.context.Set<T>()
.AsNoTracking()
.FromCache<T>(null)
.AsQueryable();
}
return this.context.Set<T>()
.AsNoTracking<T>()
.Where<T>(expression)
.FromCache<T>(expression)
.AsQueryable<T>();
}
Whilst this all works it is subject to the N+1 problem for related tables since If I were to write a query like so:
var posts = this.ReadOnlySession.Any<Post>(p => p.IsDeleted == false)
.Include(p => p.Author);
The Include() will have no effect on my query since it has already been run in order to be cached.
Now I know that I can force Entity Framework to use eager loading within my model by removing the virtual prefix on my navigation properties but that to me feels like the wrong place to do it as you cannot predict the types of queries you will be making. To me it feels like something I would be doing in a controller class. What I am wondering is whether I can pass a list of includes into my Any<T>() method that I could then iterate though when I make the call?

ofDid you mean something like...
IQueryable<T> AnyWithInclude<T,I>(Expression<Func<T,bool>> predicate,
Expression<Func<T,I>> includeInfo)
{
return DbSet<T>.where(predicate).include(includeInfo);
}
the call
Context.YourDbSetReference.AnyWithInclude(t => t.Id==someId, i => i.someNavProp);
In response to extra question on as collection.
I realised late, there was an overload on Property. You can just pass a string
This might work but call is not easy. Well I find it hard.
IQueryable<T> GetListWithInclude<I>(Expression<Func<T, bool>> predicate,
params Expression<Func<T, I>>[] IncludeCollection);
so i tried
public virtual IQueryable<T> GetListWithInclude(Expression<Func<T, bool>> predicate,
List<string> includeCollection)
{ var result = EntityDbSet.Where(predicate);
foreach (var incl in includeCollection)
{
result = result.Include(incl);
}
return result;
}
and called with
var ic = new List<string>();
ic.Add("Membership");
var res = context.DbSte<T>.GetListWithInclude( t=>t.UserName =="fred", ic);
worked as before.

In the interest of clarity I'm adding the solution I came up with based upon #soadyp's answer.
public IQueryable<T> Any<T>(Expression<Func<T, bool>> expression = null,
params Expression<Func<T, object>>[] includeCollection)
where T : class, new()
{
IQueryable<T> query = this.context.Set<T>().AsNoTracking().AsQueryable<T>();
if (includeCollection.Any())
{
query = includeCollection.Aggregate(query,
(current, include) => current.Include(include));
}
// Check for a filtering expression and pull all if not.
if (expression != null)
{
query = query.Where<T>(expression);
}
return query.FromCache<T>(expression, includeCollection)
.AsQueryable<T>();
}
Usage:
// The second, third, fourth etc parameters are the strongly typed includes.
var posts = this.ReadOnlySession.Any<Post>(p => p.IsDeleted == false,
p => p.Author);