How to write row_number() over clause in linq to Entity Framework

Using Entity Framework and LINQ, I need to get a row number together with an entity. For example, a loan has multiple invoices, and I want to select a specific invoice together with its sequence number.
Basically, I need to know how to write the equivalent of this:
select
    nr, i.*
from
    [Invoices] i
    inner join
        (select
             row_number() over (order by IssueDate) nr, id
         from
             [Invoices]
         where
             LoanId = 5) t on t.id = i.id
where
    i.id = 207

According to this post, ROW_NUMBER is not supported in L2E. If you don't mind the overhead of loading all invoices for a given LoanId into memory, it can be done easily in C# with the overload of the Select method on IEnumerable that produces an index for each item, e.g.:
// First select the invoices for the loan, ordered by issue date
var invoices = from i in dbContext.Invoices
               where i.LoanId == 5
               orderby i.IssueDate
               select i;
// ToList() runs the query; the indexed Select overload then executes in memory.
// The index is zero-based, so add 1 to match ROW_NUMBER().
var indexedInvoice = invoices.ToList()
    .Select((i, index) => new { Invoice = i, RowCount = index + 1 })
    .First(ii => ii.Invoice.Id == 207);
I can see how this can be less than optimal in some situations, so you might consider bypassing L2E here and executing your query as a plain old SQL string, depending on how performance-critical it is and on how many invoices there usually are for a single loan.
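If loading every invoice for the loan is too costly, the position can also be derived by counting the rows that sort before the target one. The following is only a sketch under stated assumptions: the Invoice class is a hypothetical stand-in with just the three properties used in the question, ties on IssueDate are broken by Id (the original SQL leaves them unordered), and whether a particular Entity Framework version translates these predicates to SQL would need to be checked. As written it runs against an in-memory list exposed through AsQueryable().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Invoice entity from the question.
public class Invoice
{
    public int Id { get; set; }
    public int LoanId { get; set; }
    public DateTime IssueDate { get; set; }
}

public static class InvoiceRowNumber
{
    // Returns the 1-based position of the invoice within its loan when ordered
    // by IssueDate (ties broken by Id), or null if no such invoice exists.
    public static int? Of(IQueryable<Invoice> invoices, int loanId, int invoiceId)
    {
        var target = invoices.FirstOrDefault(i => i.LoanId == loanId && i.Id == invoiceId);
        if (target == null) return null;

        DateTime issueDate = target.IssueDate;

        // Count the invoices of the same loan that sort strictly before the target.
        int earlier = invoices.Count(i =>
            i.LoanId == loanId &&
            (i.IssueDate < issueDate ||
             (i.IssueDate == issueDate && i.Id < invoiceId)));

        return earlier + 1;
    }
}
```

A call such as InvoiceRowNumber.Of(invoiceList.AsQueryable(), 5, 207) yields the sequence number from two small lookups instead of projecting an index onto every invoice of the loan.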

Related

Query produced for IN filter on 1-1 relation joins to parent table twice

I have this problem and reproduced it with AdventureWorks2008R2 to make it easier to demonstrate. Basically, I want to filter on a parent table for a list of IN values, and I thought it would generate this type of query, but it doesn't:
SELECT * FROM SalesOrderDetail d WHERE EXISTS (SELECT * FROM SalesOrderHeader h WHERE d.id = h.id AND h.rowguid IN ('asdf', 'fff', 'weee'))
Any ideas how to change the LINQ statement to query Header only once?
(ignore the fact I'm matching on Guids - it will actually be integers; I was just quickly looking for a 1-1 table in EF because that's when the problem occurs and I happened to find these)
var guidsToFind = new Guid[] { Guid.NewGuid(), Guid.NewGuid(), Guid.NewGuid()};
AdventureWorks2008R2Entities context = new AdventureWorks2008R2Entities();
var g = context.People.Where(p => guidsToFind.Contains(p.BusinessEntity.rowguid)).ToList();
That produces the following more expensive query:
SELECT [Extent1].[BusinessEntityID] AS [BusinessEntityID],
[Extent1].[PersonType] AS [PersonType],
[Extent1].[NameStyle] AS [NameStyle],
[Extent1].[Title] AS [Title],
[Extent1].[FirstName] AS [FirstName],
[Extent1].[MiddleName] AS [MiddleName],
[Extent1].[LastName] AS [LastName],
[Extent1].[Suffix] AS [Suffix],
[Extent1].[EmailPromotion] AS [EmailPromotion],
[Extent1].[AdditionalContactInfo] AS [AdditionalContactInfo],
[Extent1].[Demographics] AS [Demographics],
[Extent1].[rowguid] AS [rowguid],
[Extent1].[ModifiedDate] AS [ModifiedDate]
FROM [Person].[Person] AS [Extent1]
INNER JOIN [Person].[BusinessEntity] AS [Extent2] ON [Extent1].[BusinessEntityID] = [Extent2].[BusinessEntityID]
LEFT OUTER JOIN [Person].[BusinessEntity] AS [Extent3] ON [Extent1].[BusinessEntityID] = [Extent3].[BusinessEntityID]
WHERE [Extent2].[rowguid] = cast('b95b63f9-6304-4626-8e70-0bd2b73b6b0f' as uniqueidentifier) OR [Extent3].[rowguid] IN (cast('f917a037-b86b-4911-95f4-4afc17433086' as uniqueidentifier),cast('3188557d-5df9-40b3-90ae-f83deee2be05' as uniqueidentifier))

Really odd. Looks like a LINQ limitation.
I don't have a system to try this on right now, but if you first get a list of BusinessEntityId values based on the provided guids and then get the persons like this:
var g = context.People.Where(p => businessEntityIdList.Contains(p.BusinessEntityId)).ToList();
there should no longer be any reason for the additional unnecessary joins.
If that works, you can try to combine the two steps into one LINQ expression to see if the separation stays intact.
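The two-step idea could look roughly like the sketch below. It is an illustration only: BusinessEntity and Person here are hypothetical stand-in classes carrying just the properties the thread mentions, and the method takes plain IQueryable sources so that it runs in memory. Whether the generated SQL actually avoids the duplicate join has to be verified against the real context, as the answer itself says.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the two AdventureWorks entities involved.
public class BusinessEntity
{
    public int BusinessEntityID { get; set; }
    public Guid rowguid { get; set; }
}

public class Person
{
    public int BusinessEntityID { get; set; }
}

public static class TwoStepLookup
{
    public static List<Person> FindPeople(
        IQueryable<BusinessEntity> businessEntities,
        IQueryable<Person> people,
        IEnumerable<Guid> guidsToFind)
    {
        // Materialize the guids so Contains works on a concrete collection.
        List<Guid> guidList = guidsToFind.ToList();

        // Step 1: resolve the guids to key values on the parent table only.
        List<int> ids = businessEntities
            .Where(b => guidList.Contains(b.rowguid))
            .Select(b => b.BusinessEntityID)
            .ToList();

        // Step 2: filter the child table by key, with no navigation property involved.
        return people
            .Where(p => ids.Contains(p.BusinessEntityID))
            .ToList();
    }
}
```

The trade-off is an extra round trip for step 1 in exchange for a second query that touches only the Person table.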

JPA Query over a join table

I have 3 tables like:
A        AB         B
-----    -------    -----
a1       a1,b1      b1
AB is a transition table between A and B
As a result, my two classes hold no references to each other. But I want to know, with a JPQL query, whether any records exist in the AB table for my element from the A table. A count or a boolean value is all I need.
Because AB is a transition table, there is no model object for it, and I want to know if I can do this with a @Query in my Repository object.

The AB table must be modeled in an entity to be queried in JPQL. So you must model it either as
its own entity class or as an association in your A and/or your B entity.

I suggest using a native query instead of JPQL (JPA supports native queries too). Let us assume table A is Customer, table B is Product, and AB is Sale. Here is the query for getting the list of products ordered by a customer:
entityManager.createNativeQuery(
        "SELECT PRODUCT_ID FROM SALE WHERE CUSTOMER_ID = ?1")
    .setParameter(1, "C_123")
    .getResultList();

Actually, the answer to this situation is simpler than you might think. It's a matter of using the right tool for the job. JPA was not designed for implementing complicated SQL queries; that's what SQL is for! So you need a way to get JPA to run a production-level SQL query:
em.createNativeQuery
So in your case what you want to do is access the AB table looking only for the id field. Once you have retrieved your query, take your id field and look up the Java object using the id field. It's a second search true, but trivial by SQL standards.
Let's assume you are looking for an A object based on the number of times a B object references it. Say you are wanting a semi-complicated (but typical) SQL query to group type A objects based on the number of B objects and in descending order. This would be a typical popularity query that you might want to implement as per project requirements.
Your native SQL query would be as such:
select a_id as id from AB group by a_id order by count(*) desc;
Now you need to tell JPA to expect the id list to come back in a form that JPA can accept. That takes an extra JPA entity, one that will never be used in the normal JPA fashion but gives JPA a way to hand the queried ids back to you. You would put together an entity for this search query like this:
@Entity
public class IdSearch {
    @Id
    @Column
    Long id;

    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }
}
Now you implement a little bit of code to bring the two technologies together:
@SuppressWarnings("unchecked")
public List<IdSearch> findMostPopularA() {
    // A Java string literal cannot span lines, so the query is concatenated.
    return em.createNativeQuery(
            "select a_id as id from AB group by a_id "
            + "order by count(*) desc", IdSearch.class).getResultList();
}
There, that's all you have to do to get JPA to complete your query successfully. To get at your A objects, you simply look each id up using the traditional JPA approach, like this:
List<IdSearch> list = producer.findMostPopularA();
Iterator<IdSearch> it = list.iterator();
while (it.hasNext()) {
    IdSearch a = it.next();
    A object = em.find(A.class, a.getId());
    // you're in business!
}
Still, a little refinement can simplify things further, given the many capabilities of SQL. A slightly more complicated SQL query gives an even more direct JPA interface to your actual data:
@SuppressWarnings("unchecked")
public List<A> findMostPopularA() {
    // Select only A's columns and group by A's primary key so each row maps to an A entity.
    return em.createNativeQuery(
            "select A.* from A, AB "
            + "where A.id = AB.a_id "
            + "group by A.id "
            + "order by count(*) desc", A.class).getResultList();
}
This removes the need for an intermediate IdSearch entity!
List<A> list = producer.findMostPopularA();
Iterator<A> it = list.iterator();
while (it.hasNext()) {
    A a = it.next();
    // you're in business!
}
What may not be clear to the naked eye is how simply JPA lets you use complicated SQL structures through its interface. Imagine you have an SQL query as follows:
SELECT array_agg(players), player_teams
FROM (
SELECT DISTINCT t1.t1player AS players, t1.player_teams
FROM (
SELECT
p.playerid AS t1id,
concat(p.playerid,':', p.playername, ' ') AS t1player,
array_agg(pl.teamid ORDER BY pl.teamid) AS player_teams
FROM player p
LEFT JOIN plays pl ON p.playerid = pl.playerid
GROUP BY p.playerid, p.playername
) t1
INNER JOIN (
SELECT
p.playerid AS t2id,
array_agg(pl.teamid ORDER BY pl.teamid) AS player_teams
FROM player p
LEFT JOIN plays pl ON p.playerid = pl.playerid
GROUP BY p.playerid, p.playername
) t2 ON t1.player_teams=t2.player_teams AND t1.t1id <> t2.t2id
) innerQuery
GROUP BY player_teams
The point is that with the createNativeQuery interface, you can still retrieve precisely the data you are looking for and hand it straight to Java for easy access.
@SuppressWarnings("unchecked")
public List<Object[]> findPlayersWithSameTeams() {
    // This result does not map to an entity, so no result class is passed and
    // each row comes back as an Object[] (aggregated players, team ids).
    // Text blocks require Java 15 or later.
    return em.createNativeQuery("""
        SELECT array_agg(players), player_teams
        FROM (
          SELECT DISTINCT t1.t1player AS players, t1.player_teams
          FROM (
            SELECT
              p.playerid AS t1id,
              concat(p.playerid,':', p.playername, ' ') AS t1player,
              array_agg(pl.teamid ORDER BY pl.teamid) AS player_teams
            FROM player p
            LEFT JOIN plays pl ON p.playerid = pl.playerid
            GROUP BY p.playerid, p.playername
          ) t1
          INNER JOIN (
            SELECT
              p.playerid AS t2id,
              array_agg(pl.teamid ORDER BY pl.teamid) AS player_teams
            FROM player p
            LEFT JOIN plays pl ON p.playerid = pl.playerid
            GROUP BY p.playerid, p.playername
          ) t2 ON t1.player_teams=t2.player_teams AND t1.t1id <> t2.t2id
        ) innerQuery
        GROUP BY player_teams
        """).getResultList();
}

How to ORDER BY non-column field?

I am trying to create an Entity SQL that is a union of two sub-queries.
(SELECT VALUE DISTINCT ROW(e.ColumnA, e.ColumnB, 1 AS Rank) FROM Context.Entity AS E WHERE ...)
UNION ALL
(SELECT VALUE DISTINCT ROW(e.ColumnA, e.ColumnB, 2 AS Rank) FROM Context.Entity AS E WHERE ...)
ORDER BY *??* LIMIT 50
I have tried:
ORDER BY Rank
and
ORDER BY e.Rank
but I keep getting:
System.Data.EntitySqlException: The query syntax is not valid. Near keyword 'ORDER'
Edit:
This is Entity Framework. In C#, the query is executed using:
var esql = "...";
ObjectParameter parameter0 = new ObjectParameter("p0", value1);
ObjectParameter parameter1 = new ObjectParameter("p1", value2);
ObjectQuery<DbDataRecord> query = context.CreateQuery<DbDataRecord>(esql, parameter0, parameter1);
var queryResults = query.Execute(MergeOption.NoTracking);
There is only a small portion of my application where I have to resort to using Entity SQL. Generally, the main use case is when I need to do: "WHERE Column LIKE 'A % value % with % multiple % wildcards'".
I do not think it is a problem with the Rank column. I think the problem is how I am trying to apply an ORDER BY to two different eSQL statements joined by UNION ALL. Could someone suggest:
1. How to apply an ORDER BY to this kind of UNION/UNION ALL statement
2. How to order by the non-entity column expression
Thanks.

Convert SQL to LINQ, nested select, top, "distinct" using group by and multiple order bys

I have the following SQL query, which I'm struggling to convert to LINQ.
Purpose: Get the top 10 coupons from the table, ordered by the date they expire (i.e. list the ones that are about to expire first), and then randomly choose one of those for publication.
Notes: Because of the way the database is structured, there may be duplicate Codes in the Coupon table. Therefore, I am using a GROUP BY to enforce distinctness, because I can't use DISTINCT in the sub-select query (which I think is correct). The SQL query works.
SELECT TOP 1
c1.*
FROM
Coupon c1
WHERE
Code IN (
SELECT TOP 10
c2.Code
FROM
Coupon c2
WHERE
c2.Published = 0
GROUP BY
c2.Code,
c2.Expires
ORDER BY
c2.Expires
)
ORDER BY NEWID()
Update:
This is as close as I have got, but in two queries:
var result1 = (from c in Coupons
where c.Published == false
orderby c.Expires
group c by new { c.Code, c.Expires } into coupon
select coupon.FirstOrDefault()).Take(10);
var result2 = (from c in result1
orderby Guid.NewGuid()
select c).Take(1);

Here's one possible way:
from c in Coupons
from cs in
((from c in coupons
where c.published == false
select c).Distinct()
).Take(10)
where cs.ID == c.ID
select c
Keep in mind that LINQ creates a strongly-typed data set, so an IN statement has no general equivalent. I understand trying to keep the SQL tight, but LINQ may not be the best answer for this. If you are using MS SQL Server (not SQL Server Compact) you might want to consider doing this as a Stored Procedure.

Using MercurioJ's slightly buggy response in combination with another random-row solution suggested on SO, my solution was:
var result3 = (from c in _dataContext.Coupons
from cs in
((from c1 in _dataContext.Coupons
where
c1.IsPublished == false
select c1).Distinct()
).Take(10)
where cs.CouponId == c.CouponId
orderby _dataContext.NewId()
select c).Take(1);
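For comparison, the logic of the original SQL (ten soonest-expiring unpublished codes, then one random coupon with any of those codes) can be written out step by step over an in-memory collection. This is a sketch only: the Coupon class is a hypothetical stand-in with the columns named in the question, randomness comes from System.Random rather than NEWID(), and nothing here is claimed to translate to a single database query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Coupon table from the question.
public class Coupon
{
    public int CouponId { get; set; }
    public string Code { get; set; } = "";
    public DateTime Expires { get; set; }
    public bool Published { get; set; }
}

public static class CouponPicker
{
    // Returns one random coupon among those whose Code is in the ten
    // soonest-expiring unpublished (Code, Expires) groups, or null if none exist.
    public static Coupon Pick(IEnumerable<Coupon> coupons, Random random)
    {
        List<Coupon> all = coupons.ToList();

        // Inner query: GROUP BY Code, Expires ... ORDER BY Expires ... TOP 10.
        List<string> topCodes = all
            .Where(c => !c.Published)
            .GroupBy(c => new { c.Code, c.Expires })
            .OrderBy(g => g.Key.Expires)
            .Take(10)
            .Select(g => g.Key.Code)
            .ToList();

        // Outer query: WHERE Code IN (...). Like the SQL, it does not re-check Published.
        List<Coupon> candidates = all.Where(c => topCodes.Contains(c.Code)).ToList();

        // ORDER BY NEWID() with TOP 1 is replaced by a random index.
        return candidates.Count == 0 ? null : candidates[random.Next(candidates.Count)];
    }
}
```

Keeping the ordering by Expires inside the inner query is what makes the "about to expire first" requirement hold; a Take(10) without an ordering picks an arbitrary ten rows.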

Paging in Entity Framework

In Entity Framework, using LINQ to Entities, database paging is usually done in the following manner:
int totalRecords = EntityContext.Context.UserSet.Count();
var list = EntityContext.Context.UserSet
    .Skip(startingRecordNumber)
    .Take(pageSize)
    .ToList();
This results in TWO database calls.
Please tell me how to reduce it to ONE database call.
Thank You.

What's wrong with two calls? They are small and quick queries. Databases are designed to support lots of small queries.
Developing a complex solution to do the paging in one query isn't going to give you much payoff.
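If two calls are acceptable, they can at least be wrapped in one helper so callers get the page and the total together. The sketch below is one possible shape, not an Entity Framework API: PagedResult and Paging.GetPage are hypothetical names, and the source is any IQueryable, so the example runs in memory. An explicit ordering is passed in because LINQ to Entities rejects Skip on unsorted input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical container for one page of results plus the overall row count.
public sealed class PagedResult<T>
{
    public PagedResult(List<T> items, int totalRecords)
    {
        Items = items;
        TotalRecords = totalRecords;
    }

    public List<T> Items { get; }
    public int TotalRecords { get; }
}

public static class Paging
{
    // page is 1-based. Issues two queries against the source: a count and a page fetch.
    public static PagedResult<T> GetPage<T, TKey>(
        IQueryable<T> source,
        Expression<Func<T, TKey>> orderBy,
        int page,
        int pageSize)
    {
        if (page < 1) throw new ArgumentOutOfRangeException(nameof(page));
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));

        // Query 1: total number of rows, needed to compute the page count.
        int total = source.Count();

        // Query 2: the requested page, skipping the rows of all earlier pages.
        List<T> items = source
            .OrderBy(orderBy)
            .Skip((page - 1) * pageSize)
            .Take(pageSize)
            .ToList();

        return new PagedResult<T>(items, total);
    }
}
```

With ten rows, Paging.GetPage(rows, r => r.Id, 2, 4) (where Id stands for whatever key the rows are sorted by) returns rows five through eight together with a TotalRecords of 10.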

Using Esql and mapping a stored procedure to an entity can solve the problem.
The SP will return totalRows as an output parameter and the current page as a result set.
CREATE PROCEDURE getPagedList(
    @PageNumber int,
    @PageSize int,
    @totalRecordCount int OUTPUT
)
AS
-- Return paged records
Please advise.
Thank You.

Hmmm... the actual call that uses paging is the second one, and that's a single call.
The other call determines the total number of rows. That's quite a different operation, and I am not aware of any way you could combine those two distinct operations into a single database call with the Entity Framework.
Question is: do you really need the total number of rows? What for? Is that worth a second database call or not?
Another option you would have is to use the EntityDataSource (in ASP.NET), bind it to e.g. a GridView, enable AllowPaging and AllowSorting etc. on the GridView, and let the ASP.NET runtime handle all the nitty-gritty work of retrieving the appropriate data page and displaying it.
Marc

ALTER proc [dbo].[GetNames]
    @lastRow bigint,
    @pageSize bigint,
    @totalRowCount bigint output
as
begin
    select @totalRowCount = count(*) from _firstNames, _lastNames
    select
        FirstName,
        LastName,
        RowNumber
    from
    (
        select
            fn.[FirstName] as FirstName,
            ln.[Name] as LastName,
            row_number() over( order by FirstName ) as RowNumber
        from
            _firstNames fn, _lastNames ln
    ) as data
    where
        RowNumber between ( @lastRow + 1 ) and ( @lastRow + @pageSize )
end
There is no way to get this into one call, but this works fast enough.

These queries are too small to matter to the database manager, and I cannot understand why you want to do this. Anyway, to reduce it to ONE database call, use this:
var list = EntityContext.Context.UserSet
    .Skip(startingRecordNumber)
    .Take(pageSize)
    .ToList();
// Counts the rows already loaded for this page, not the rows in the table.
int totalRecords = list.Count;

Suppose you want to get the details of page 2 with a page size of 4:
int page = 2;
int pagesize = 4;
var pagedDetails = Categories.Skip(pagesize * (page - 1)).Take(pagesize)
    .Join(Categories.Select(item => new { item.CategoryID, Total = Categories.Count() }),
          x => x.CategoryID, y => y.CategoryID,
          (x, y) => new { Category = x, TotalRows = y.Total });
The output will have all the details of each Category plus TotalRows.
One DB call.
Generated SQL
-- Region Parameters
DECLARE @p0 Int = 2
DECLARE @p1 Int = 4
-- EndRegion
SELECT [t2].[CategoryID], [t2].[CategoryName], [t2].[Description], [t2].[Picture], [t5].[value] AS [TotalRows]
FROM (
SELECT [t1].[CategoryID], [t1].[CategoryName], [t1].[Description], [t1].[Picture], [t1].[ROW_NUMBER]
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY [t0].[CategoryID], [t0].[CategoryName]) AS [ROW_NUMBER], [t0].[CategoryID], [t0].[CategoryName], [t0].[Description], [t0].[Picture]
FROM [Categories] AS [t0]
) AS [t1]
WHERE [t1].[ROW_NUMBER] BETWEEN @p0 + 1 AND @p0 + @p1
) AS [t2]
INNER JOIN (
SELECT [t3].[CategoryID], (
SELECT COUNT(*)
FROM [Categories] AS [t4]
) AS [value]
FROM [Categories] AS [t3]
) AS [t5] ON [t2].[CategoryID] = [t5].[CategoryID]
ORDER BY [t2].[ROW_NUMBER]