Retrieve the latest row in a group using GROUP BY [duplicate] - entity-framework-core

This question already has an answer here:
How to select top N rows for each group in a Entity Framework GroupBy with EF 3.1
I'm trying to retrieve the latest row in each group, based on the CreatedDate field.
How can this query be rewritten for EF Core 5 so that it does not throw the following exception?
'.OrderByDescending(s => s.CreatedDate)' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable'
Inserting AsEnumerable works, but it would be a terrible solution, because the grouping would then run on the client.
var entries = await _dbContext.MyTable
.GroupBy(s => s.Something)
.Select(g => g.OrderByDescending(s => s.CreatedDate).First())
.ToListAsync();

The faster solution uses the ROW_NUMBER window function, but you can work around the limitation in LINQ:
var mainQuery = _dbContext.MyTable;
var query =
from d in mainQuery.Select(s => new {s.Something}).Distinct()
from x in mainQuery.Where(m => m.Something == d.Something)
.OrderByDescending(m => m.CreatedDate)
.Take(1)
select x;
var result = await query.ToListAsync();
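The shape of that workaround (distinct keys, then the newest row per key via Take(1)) can be tried without a database. The following is only a sketch that runs against an in-memory list using LINQ to Objects; the Entry record and the LatestPerGroup helper are hypothetical names introduced for illustration, and it says nothing about the SQL that EF Core would generate for the real query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for MyTable.
public record Entry(int Id, string Something, DateTime CreatedDate);

public static class LatestPerGroupDemo
{
    // For each distinct key, pick the row with the greatest CreatedDate.
    public static List<Entry> LatestPerGroup(IEnumerable<Entry> rows)
    {
        var source = rows.ToList();
        var query =
            from key in source.Select(r => r.Something).Distinct()
            from latest in source
                .Where(r => r.Something == key)
                .OrderByDescending(r => r.CreatedDate)
                .Take(1)
            select latest;
        return query.ToList();
    }

    public static void Main()
    {
        var rows = new[]
        {
            new Entry(1, "A", new DateTime(2021, 1, 1)),
            new Entry(2, "A", new DateTime(2021, 3, 1)),
            new Entry(3, "B", new DateTime(2021, 2, 1)),
        };
        foreach (var e in LatestPerGroup(rows))
            Console.WriteLine($"{e.Something}: row {e.Id}");
    }
}
```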

Related

Entity framework 5.0 First or Group By Issue- After upgrading from 2.2 to 5.0

I have a table called Products and I need to find the products with a unique title for a particular category. Previously we did this with the following query in Entity Framework Core 2.2:
currentContext.Products
.GroupBy(x => x.Title)
.Select(x => x.FirstOrDefault())
.Select(x => new ProductViewModel
{
Id = x.Id,
Title = x.Title,
CategoryId= x.CategoryId
}).ToList();
But after upgrading to Entity Framework Core 5.0, we get a GroupByShaperExpression translation error:
The LINQ expression 'GroupByShaperExpression:KeySelector: t.title, ElementSelector:EntityShaperExpression: EntityType: Project ValueBufferExpression: ProjectionBindingExpression: EmptyProjectionMember IsNullable: False .FirstOrDefault()' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'.
I know there are multiple ways to do this with client-side projection, but I am looking for the most efficient way to run this query.
Most likely that LINQ query couldn't be translated in EF Core 2.2 either, because of some limitations that the GroupBy operator has.
From the docs:
Since no database structure can represent an IGrouping, GroupBy operators have no translation in most cases. When an aggregate operator is applied to each group, which returns a scalar, it can be translated to SQL GROUP BY in relational databases. The SQL GROUP BY is restrictive too. It requires you to group only by scalar values. The projection can only contain grouping key columns or any aggregate applied over a column.
What happened in EF Core 2.x is that whenever it couldn't translate an expression, it would automatically switch to client evaluation and give just a warning.
This is listed as the breaking change with highest impact when migrating to EF Core >= 3.x :
Old behavior
Before 3.0, when EF Core couldn't convert an expression that was part of a query to either SQL or a parameter, it automatically evaluated the expression on the client. By default, client evaluation of potentially expensive expressions only triggered a warning.
New behavior
Starting with 3.0, EF Core only allows expressions in the top-level projection (the last Select() call in the query) to be evaluated on the client. When expressions in any other part of the query can't be converted to either SQL or a parameter, an exception is thrown.
So if the performance of that expression was good enough when using EF Core 2.x, it will be as good as before if you decide to explicitly switch to client evaluation when using EF Core 5.x. That's because both are client evaluated, before and now, with the only difference being that you have to be explicit about it now. So the easy way out, if the performance was acceptable previously, would be to just client evaluate the last part of the query using .AsEnumerable() or .ToList().
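As a rough illustration of that easy way out, the sketch below filters through IQueryable and then calls AsEnumerable() so that the grouping runs in memory. It uses an in-memory list wrapped with AsQueryable() in place of a real DbSet, and the Product class and FirstPerTitle helper are hypothetical names, so it shows only where the client-side boundary sits, not real EF Core behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity mirroring the Products table from the question.
public class Product
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
    public int CategoryId { get; set; }
}

public static class ClientEvaluationDemo
{
    public static List<Product> FirstPerTitle(IQueryable<Product> products, int categoryId)
    {
        return products
            .Where(p => p.CategoryId == categoryId) // still part of the IQueryable
            .AsEnumerable()                         // everything below runs in memory
            .GroupBy(p => p.Title)
            .Select(g => g.First())
            .ToList();
    }

    public static void Main()
    {
        var products = new List<Product>
        {
            new Product { Id = 1, Title = "Pen", CategoryId = 7 },
            new Product { Id = 2, Title = "Pen", CategoryId = 7 },
            new Product { Id = 3, Title = "Ink", CategoryId = 7 },
            new Product { Id = 4, Title = "Cup", CategoryId = 8 },
        }.AsQueryable();

        foreach (var p in FirstPerTitle(products, 7))
            Console.WriteLine($"{p.Id} {p.Title}");
    }
}
```

Keeping the Where before AsEnumerable() matters with a real provider: only the rows of that category would be loaded before the in-memory grouping starts.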
If client evaluation performance is not acceptable (which will imply that it wasn't before the migration either) then you have to rewrite the query. There are a couple of answers by Ivan Stoev that might get you inspired.
I am a little confused by the mismatch between your description ("I need to find the products with unique title for a particular category") and the code you posted, since I believe the code does not do what you described. In any case, I will provide possible solutions for both interpretations.
This is my attempt at writing a query to find the products with a unique title for a particular category.
var uniqueProductTitlesForCategoryQueryable = currentContext.Products
.Where(x => x.CategoryId == categoryId)
.GroupBy(x => x.Title)
.Where(x => x.Count() == 1)
.Select(x => x.Key); // Key being the title
var productsWithUniqueTitleForCategory = currentContext.Products
.Where(x => x.CategoryId == categoryId)
.Where(x => uniqueProductTitlesForCategoryQueryable.Contains(x.Title))
.Select(x => new ProductViewModel
{
Id = x.Id,
Title = x.Title,
CategoryId= x.CategoryId
}).ToList();
And this is my attempt at rewriting the query you posted:
currentContext.Products
.Select(product => product.Title)
.Distinct()
.SelectMany(uniqueTitle => currentContext.Products.Where(product => product.Title == uniqueTitle).Take(1))
.Select(product => new ProductViewModel
{
Id = product.Id,
Title = product.Title,
CategoryId= product.CategoryId
})
.ToList();
I get the distinct titles in the Product table, and for each distinct title I take the first Product that matches it (AFAIK that should be equivalent to GroupBy(x => x.Title) + FirstOrDefault). You could add some sorting before the Take(1) if needed.
You can use Join for this query, as shown below:
currentContext.Products
.GroupBy(x => x.Title)
.Select(x => new ProductViewModel()
{
Title = x.Key,
Id = x.Min(b => b.Id)
})
.Join(currentContext.Products, a => a.Id, b => b.Id,
(a, b) => new ProductViewModel()
{
Id = a.Id,
Title = a.Title,
CategoryId = b.CategoryId
}).ToList();
If you inspect or log the translated SQL query, it looks like this:
SELECT [t].[Title], [t].[c] AS [Id], [p0].[CategoryId] AS [CategoryId]
FROM (
SELECT [p].[Title], MIN([p].[Id]) AS [c]
FROM [Product].[Products] AS [p]
GROUP BY [p].[Title]
) AS [t]
INNER JOIN [Product].[Products] AS [p0] ON [t].[c] = [p0].[Id]
As you can see, the entire query is translated into a single SQL statement. It is efficient because the GroupBy operation is performed in the database and no additional records are fetched by the client.
As mentioned by Ivan Stoev, EFC 2.x silently loads the full table to the client side and then applies the logic needed to extract the result. That is a resource-consuming approach, and it is good that the EFC team exposed such potentially harmful queries.
The most effective way is already known: raw SQL with window functions. SO is full of answers like this.
SELECT
s.Id,
s.Title,
s.CategoryId
FROM
(SELECT
ROW_NUMBER() OVER (PARTITION BY p.Title ORDER BY p.Id) AS RN,
p.*
FROM Products p) s
WHERE s.RN = 1
I am not sure the EFC team will invent a universal algorithm for generating such SQL in the near future, but it is doable for special edge cases, and maybe they plan to do that for EFC 6.0.
Anyway, if both performance and LINQ are priorities for this question, I suggest trying our adaptation of the linq2db ORM for EF Core projects: linq2db.EntityFrameworkCore
With it you can get the desired result without leaving LINQ:
currentContext.Products
.Select(x => new
{
Product = x,
RN = Sql.Ext.RowNumber().Over()
.PartitionBy(x.Title)
.OrderBy(x.Id)
.ToValue()
})
.Where(x => x.RN == 1)
.Select(x => x.Product)
.Select(x => new ProductViewModel
{
Id = x.Id,
Title = x.Title,
CategoryId = x.CategoryId
})
.ToLinqToDB()
.ToList();
The short answer is that you are dealing with breaking changes between EF Core versions.
You should review all the API and behavior changes involved in migrating from 2.2 to 5.0, listed below:
Breaking changes included in EF Core 3.x
Breaking changes in EF Core 5.0
You may face other problems writing valid expressions in the newer version. In my opinion, upgrading to a newer version is not important in itself; what matters is knowing how to work with a specific version.
You should use .GroupBy() AFTER materialization. Unfortunately, EF Core cannot translate a GroupBy that returns the grouped rows themselves; it translates GroupBy to SQL only when an aggregate is applied to each group. Since version 3, queries are translated strictly, which means you cannot execute IQueryables that can't be converted to SQL. Also, I'm not sure what you are trying to get with GroupBy() and how it will influence your final result. Anyway, I suggest you change your query like this:
currentContext.Products
.Select(x=> new {
x.Id,
x.Title,
x.CategoryId
})
.ToList()
.GroupBy(x=> x.Title)
.Select(x => new Wrapper
{
ProductsTitle = x.Key,
Products = x.Select(p=> new ProductViewModel{
Id = p.Id,
Title = p.Title,
CategoryId= p.CategoryId
}).ToList()
}).ToList();

Table value parameter to joinable IQueryable

For various reasons we need to send a large list of Ids to an EF6 query.
queryable.Where(x => list.Contains(x.Id));
is not ideal, since it generates a huge IN list in the WHERE clause.
So I was thinking: would it be possible somehow to pass a table-valued parameter with the ids and get an IQueryable back that I can join against?
Something like this (pseudocode):
var queryable = TableValueToIQueryable<MyTableValueType>(ids);
context.Set<MyEntity>().Join(queryable, x => x.Id, x => x.Value, (entity, id) => entity);
Is this possible somehow?
update: I have been able to use EntityFramework.CodeFirstStoreFunctions to execute a SQL function and map the data to IQueryable<MyEntity>. It uses CreateQuery and ObjectParameters; can I use table-valued parameters somehow with ObjectParameters?
update2: Set().SqlQuery(...) works with table-valued parameters, but the resulting DbSqlQuery cannot be joined in SQL with an IQueryable, so the result is two separate queries and the join is done in memory.
var idResult = Set<IdFilter>().SqlQuery("select * from GetIdFilter(@ids)", parameter);
var companies = idResult.Join(Set<tblCompany>(), x => x.Id, y => y.CompanyID, (filter, company) => company).ToList();
update3: ExecuteStoreQuery
((IObjectContextAdapter)ctx).ObjectContext.ExecuteStoreQuery<InvoicePoolingContext.IdFilter>("select * from dbo.GetIdFilter(@ids)", parameter)
.Join(ctx.Set<tblCompany>(), x => x.Id, y => y.CompanyID, (filter, company) => company).ToList();
Gives error:
There is already an open DataReader associated with this Command which
must be closed first.

How to group by day of week using linq

I'm trying to count some records that were updated this week and group them by the day of the week (depending on when they were last updated), e.g. Tues: 1, Thur: 4, Fri: 5, etc. I'm not sure how to group by day of week.
var data = repo
.Where(o => o.LastUpdated >= monday)
.GroupBy(o => o.LastUpdated)
.Select(g => new { DayOfWeek = g.Key, Count = g.Count() })
.ToList();
I've tried .GroupBy(o => o.LastUpdated.DayOfWeek) but that throws an error:
"The specified type member 'DayOfWeek' is not supported in LINQ to Entities"
If you are targeting only SQL Server, you can use the SQL Server specific SqlFunctions.DatePart function like this:
var data = repo
.Where(o => o.LastUpdated >= monday)
.GroupBy(o => SqlFunctions.DatePart("weekday", o.LastUpdated))
.Select(g => new { DayOfWeek = g.Key, Count = g.Count() })
.ToList();
Unfortunately there is no such general canonical function defined in DbFunctions, so if you are targeting another database type (or multiple database types), the only option is to switch to Linq To Objects context as described in another answer.
The message is explicit: Entity Framework doesn't know how to translate "DayOfWeek" to SQL. The simplest solution is to do the grouping outside of SQL, after retrieving the data:
var data = repo
.Where(o => o.LastUpdated >= monday)
.AsEnumerable() // After this everything uses LINQ to Objects and is executed locally, not on your SQL server
.GroupBy(o => o.LastUpdated.DayOfWeek)
.Select(g => new { DayOfWeek = g.Key, Count = g.Count() })
.ToList();
It should hardly have a performance impact either way: you're not filtering further down, so you're not retrieving more data than you need from the server. Anything before AsEnumerable just generates a SQL query; anything after it operates on materialized data. So past AsEnumerable (or anything else that materializes the query, like ToArray or ToList) you can use anything you'd normally use in C# without worrying about whether it can be translated to SQL.
This works only if the LastUpdated column has the datetime data type (here it is nullable, hence .Value).
var data = repo
.Where(o => o.LastUpdated >= monday)
.AsEnumerable()
.GroupBy(o => o.LastUpdated.Value.DayOfWeek)
.Select(g => new { DayOfWeek = g.Key, Count = g.Count() })
.ToList();
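The client-side grouping described in these answers can be checked in isolation. The sketch below is an assumption-laden stand-in: it works on a plain sequence of DateTime values instead of the repo, and CountByDayOfWeek is a hypothetical helper name, but the GroupBy on DateTime.DayOfWeek is the same call that LINQ to Objects runs after AsEnumerable().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DayOfWeekDemo
{
    // Counts timestamps on or after 'monday', keyed by day of week.
    public static Dictionary<DayOfWeek, int> CountByDayOfWeek(
        IEnumerable<DateTime> lastUpdated, DateTime monday)
    {
        return lastUpdated
            .Where(d => d >= monday)
            .GroupBy(d => d.DayOfWeek) // fine in memory, not translatable by EF6
            .ToDictionary(g => g.Key, g => g.Count());
    }

    public static void Main()
    {
        var monday = new DateTime(2023, 1, 2); // a Monday
        var updates = new[]
        {
            new DateTime(2022, 12, 30), // previous week, filtered out
            new DateTime(2023, 1, 2),
            new DateTime(2023, 1, 3),
            new DateTime(2023, 1, 3),
        };
        foreach (var pair in CountByDayOfWeek(updates, monday))
            Console.WriteLine($"{pair.Key}: {pair.Value}");
    }
}
```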

how to select distinct with paging in entity framework?

I tried the code below:
var ll = _ctx.Cwzz_AccVouchMain.Select(v => v.Ddate).Distinct();
var l = ll.Skip(start).Take(limit).ToList();
but I get this error:
The method 'OrderBy' must be called before the method 'Skip'.
So I tried this:
var ll = _ctx.Cwzz_AccVouchMain.Select(v => v.Ddate).Distinct();
var l = ll.OrderBy(v => v.Year).ThenBy(v => v.Month).ThenBy(v => v.Date).Skip(start).Take(limit).ToList();
which fails with:
System.NotSupportedException: The specified type member 'Date' is not supported in LINQ to Entities. Only initializers, entity members, and entity navigation properties are supported.
What should I do?
Try this instead :
var ll = _ctx.Cwzz_AccVouchMain.Select(v => v.Ddate).Distinct();
var l = ll.OrderBy(v => v).Skip(start).Take(limit).ToList();
When you try to order by year, month and date, your query has not been executed yet. When .ToList() triggers it, LINQ to Entities first tries to build the SQL query to send to your database server. It fails because it cannot translate the Ddate.Date member: on the database side the Ddate field is a plain date value, and the provider does not know every property of the C# DateTime type.
If you wanted to order by a member that cannot be translated, you would have to execute the query before that and sort in memory.
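The Distinct, OrderBy, Skip, Take order recommended above can be sketched as follows. This is only an in-memory illustration: the list wrapped with AsQueryable() stands in for the Ddate projection, and DistinctPage is a hypothetical helper name, so it demonstrates the paging logic rather than the SQL that Entity Framework would emit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctPagingDemo
{
    // Distinct values first, then a stable order, then the page.
    public static List<DateTime> DistinctPage(IQueryable<DateTime> dates, int start, int limit)
    {
        return dates
            .Distinct()
            .OrderBy(d => d) // order by the value itself, not by Year/Month/Date
            .Skip(start)
            .Take(limit)
            .ToList();
    }

    public static void Main()
    {
        var dates = new List<DateTime>
        {
            new DateTime(2020, 1, 3),
            new DateTime(2020, 1, 1),
            new DateTime(2020, 1, 3),
            new DateTime(2020, 1, 2),
            new DateTime(2020, 1, 1),
        }.AsQueryable();

        foreach (var d in DistinctPage(dates, 1, 2))
            Console.WriteLine(d.ToString("yyyy-MM-dd"));
    }
}
```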

groupby to multiple fields in LINQ Currently - i have linq something like this [duplicate]

Possible Duplicate:
linq + groupby - add fields in select query
I want to group by multiple fields, something like q = q.GroupBy(c => c.Id, c.name,c.age,c.dob)
Also, how do I put them in the select query, so that I get the newly added fields in the result as well?
q = q.GroupBy(c => c.Id)
.Select(g => new View
{
Id = g.Key,
ENAME= string.Join(",", g.Select(x => x.CaseApprover).ToList())
});
To group by multiple fields you do this:
var query = q.GroupBy(c => new { c.Id, c.name, c.age, c.dob });
You'll have to clarify the second part of your question (as per my comment above) if you want the answer to the rest.
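One possible reading of the second part is that the extra key fields should appear in the projection. Under that assumption, they can be read back from g.Key, as in the sketch below. The Case and View records are hypothetical stand-ins for the asker's types, and the code runs with LINQ to Objects; with LINQ to Entities the string.Join call would most likely have to run after the data is materialized.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical source and result types based on the names in the question.
public record Case(int Id, string Name, int Age, DateTime Dob, string CaseApprover);
public record View(int Id, string Name, int Age, DateTime Dob, string ENAME);

public static class MultiKeyGroupingDemo
{
    public static List<View> GroupCases(IEnumerable<Case> cases)
    {
        return cases
            // Anonymous types compare by value, so this is a composite key.
            .GroupBy(c => new { c.Id, c.Name, c.Age, c.Dob })
            .Select(g => new View(
                g.Key.Id,
                g.Key.Name,
                g.Key.Age,
                g.Key.Dob,
                string.Join(",", g.Select(x => x.CaseApprover))))
            .ToList();
    }

    public static void Main()
    {
        var dob = new DateTime(1990, 5, 1);
        var cases = new[]
        {
            new Case(1, "Ann", 30, dob, "X"),
            new Case(1, "Ann", 30, dob, "Y"),
            new Case(2, "Bob", 41, dob, "Z"),
        };
        foreach (var v in GroupCases(cases))
            Console.WriteLine($"{v.Id} {v.Name} {v.ENAME}");
    }
}
```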