Translate nested query SQL to EF - entity-framework

I have this table structure:
Classic many to many relationship. I want to get all the orders for products belonging to the category for a small number of products I provide. It may be easier to show the SQL that does exactly what I want:
select o.*
from [Order] o join Product p2 on o.FKCatalogNumber=p2.CatalogNumber
where p2.FKCategoryId IN
(select c.Id
from Category c join Product p1 on p1.FKCategoryId=c.Id
where p1.CatalogNumber in ('0001', '0002')
This example gives me all the orders belonging to the categories that catalog #'s 0001 and 0002 are in.
But I am unable to wrap my head around the equivalent EF syntax for this query. I'm embarrassed to say I spent half the day on this. I bet it's easy for someone out there.
I came up with this but it's not working (and probably not even close):
string[] catNumbers = {"0001", "0002"};
var orders = ctx.Categories
.SelectMany(c => c.Products, (c, p) => new {c, p})
.Where(#t => catNumbers.Contains(#t.p.CatalogNumber))
.Select(#t => #t.p.Orders)
.ToList();

You can use query syntax (which looks very similar to SQL) in LINQ, so if you're more comfortable with SQL then you may prefer to write your query like this:
string[] catNumbers = {"0001", "0002"};
var orders = from o in ctx.Orders
join p2 in ctx.Products on o.FKCatalogNumber equals p2.CatalogNumber
where
(
from c in ctx.Categories
join p1 in ctx.Products on c.ID equals p1.FKCategoryId
where catNumbers.Contains(p1.CatalogNumber)
select c.ID
).Contains(p2.FKCategoryId)
select o;
As you can see, it's actually just your SQL query rearranged slightly, but it compiles as C#.
Note that:
the [Order] o syntax for referencing tables is replaced by o in ctx.Orders
LINQ enforces which way round you do the join condition so I had to flip your on o.FKCatalogNumber=p2.CatalogNumber to be on o.FKCatalogNumber equals p2.CatalogNumber
instead of your where p2.FKCategoryId IN (...), the equivalent c# is (...).Contains(p2.FKCategoryId)
the select comes last, not first
but those are the only major changes. Otherwise, it's written just like SQL.
I'd also draw your attention to a distinction regarding this comment:
the equivalent EF syntax for this query
The syntax here isn't specific to EF, but is just LINQ - Language Integrated Querying. It has two flavours: query syntax (sometimes called declarative) and method syntax (sometimes called fluent). LINQ works on just about any collection that implements IEnumerable or IQueryable, including EF's DbSet.
For more info on the different ways of querying, this MSDN page is a decent place to start. There's also this handy reference table showing the equivalent query syntax for each method-syntax operator, where applicable.

You can still nest queries in EF. The following looks like it works for me:
string[] catNumbers = {"0001", "0002"};
var orders = ctx.Orders
.Where(o => ctx.Products
.Where(p => catNumbers.Contains(p.CatalogNumber))
.Select(p => p.CategoryId)
.Contains(o.Product.CategoryId)
);
This produces the following SQL:
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[CatalogNumber] AS [CatalogNumber]
FROM [dbo].[Orders] AS [Extent1]
WHERE EXISTS (SELECT
1 AS [C1]
FROM [dbo].[Products] AS [Extent2]
INNER JOIN [dbo].[Products] AS [Extent3] ON [Extent2].[CategoryId] = [Extent3].[CategoryId]
WHERE ([Extent1].[CatalogNumber] = [Extent3].[CatalogNumber]) AND ([Extent2].[CatalogNumber] IN (N'0001', N'0002'))
)

Related

EF Core: How can i Select (Distinct) after join?

Basically i want this as query:
SELECT DISTINCT c.description FROM students s
join courses c on s.courseId = c.Id
WHERE c.Id = 100
How do i do this in EF Core?
When i do :
db.Students
.Include(s => s.courseId)
.Select( -- how can i select for course description? --)
.Distinct()
Bear with me. I am new to Entity Framework.
First of all, your Include expression handles the joining for you, so you don't have to specify how your going to join the table. So your include will look like this:
_db.Students.Include(s => s.Course)
Then you'd accomplish your select through the use of a lamba expression like this:
_db.Students .Include(s => s.Course).Select(s => s.Course.Description).Distinct()

EFCore returning too many columns for a simple LEFT OUTER join

I am currently using EFCore 1.1 (preview release) with SQL Server.
I am doing what I thought was a simple OUTER JOIN between an Order and OrderItem table.
var orders = from order in ctx.Order
join orderItem in ctx.OrderItem
on order.OrderId equals orderItem.OrderId into tmp
from oi in tmp.DefaultIfEmpty()
select new
{
order.OrderDt,
Sku = (oi == null) ? null : oi.Sku,
Qty = (oi == null) ? (int?) null : oi.Qty
};
The actual data returned is correct (I know earlier versions had issues with OUTER JOINS not working at all). However the SQL is horrible and includes every column in Order and OrderItem which is problematic considering one of them is a large XML Blob.
SELECT [order].[OrderId], [order].[OrderStatusTypeId],
[order].[OrderSummary], [order].[OrderTotal], [order].[OrderTypeId],
[order].[ParentFSPId], [order].[ParentOrderId],
[order].[PayPalECToken], [order].[PaymentFailureTypeId] ....
...[orderItem].[OrderId], [orderItem].[OrderItemType], [orderItem].[Qty],
[orderItem].[SKU] FROM [Order] AS [order] LEFT JOIN [OrderItem] AS
[orderItem] ON [order].[OrderId] = [orderItem].[OrderId] ORDER BY
[order].[OrderId]
(There are many more columns not shown here.)
On the other hand - if I make it an INNER JOIN then the SQL is as expected with only the columns in my select clause:
SELECT [order].[OrderDt], [orderItem].[SKU], [orderItem].[Qty] FROM
[Order] AS [order] INNER JOIN [OrderItem] AS [orderItem] ON
[order].[OrderId] = [orderItem].[OrderId]
I tried reverting to EFCore 1.01, but got some horrible nuget package errors and gave up with that.
Not clear whether this is an actual regression issue or an incomplete feature in EFCore. But couldn't find any further information about this elsewhere.
Edit: EFCore 2.1 has addressed a lot of issues with grouping and also N+1 type issues where a separate query is made for every child entity. Very impressed with the performance in fact.
3/14/18 - 2.1 Preview 1 of EFCore isn't recommended because the GROUP BY SQL has some issues when using OrderBy() but it's fixed in nightly builds and Preview 2.
The following applies to EF Core 1.1.0 (release).
Although shouldn't be doing such things, tried several alternative syntax queries (using navigation property instead of manual join, joining subqueries containing anonymous type projection, using let / intermediate Select, using Concat / Union to emulate left join, alternative left join syntax etc.) The result - either the same as in the post, and/or executing more than one query, and/or invalid SQL queries, and/or strange runtime exceptions like IndexOutOfRange, InvalidArgument etc.
What I can say based on tests is that most likely the problem is related to bug(s) (regression, incomplete implementation - does it really matter) in GroupJoin translation. For instance, #7003: Wrong SQL generated for query with group join on a subquery that is not present in the final projection or #6647 - Left Join (GroupJoin) always materializes elements resulting in unnecessary data pulling etc.
Until it get fixed (when?), as a (far from perfect) workaround I could suggest using the alternative left outer join syntax (from a in A from b in B.Where(b = b.Key == a.Key).DefaultIfEmpty()):
var orders = from o in ctx.Order
from oi in ctx.OrderItem.Where(oi => oi.OrderId == o.OrderId).DefaultIfEmpty()
select new
{
OrderDt = o.OrderDt,
Sku = oi.Sku,
Qty = (int?)oi.Qty
};
which produces the following SQL:
SELECT [o].[OrderDt], [t1].[Sku], [t1].[Qty]
FROM [Order] AS [o]
CROSS APPLY (
SELECT [t0].*
FROM (
SELECT NULL AS [empty]
) AS [empty0]
LEFT JOIN (
SELECT [oi0].*
FROM [OrderItem] AS [oi0]
WHERE [oi0].[OrderId] = [o].[OrderId]
) AS [t0] ON 1 = 1
) AS [t1]
As you can see, the projection is ok, but instead of LEFT JOIN it uses strange CROSS APPLY which might introduce another performance issue.
Also note that you have to use casts for value types and nothing for strings when accessing the right joined table as shown above. If you use null checks as in the original query, you'll get ArgumentNullException at runtime (yet another bug).
Using "into" will create a temporary identifier to store the results.
Reference : MDSN: into (C# Reference)
So removing the "into tmp from oi in tmp.DefaultIfEmpty()" will result in the clean sql with the three columns.
var orders = from order in ctx.Order
join orderItem in ctx.OrderItem
on order.OrderId equals orderItem.OrderId
select new
{
order.OrderDt,
Sku = (oi == null) ? null : oi.Sku,
Qty = (oi == null) ? (int?) null : oi.Qty
};

Query produced for IN filter on 1-1 relation joins to parent table twice

I have this problem and reproduced it with AdventureWorks2008R2 to make it more easy. Basically, I want to filter a parent table for a list of IN values and I thought it would generate this type of query
but it doesn't.
SELECT * FROM SalesOrderDetail where EXISTS( select * from SalesOrderHeader where d.id=h.id and rowguid IN ('asdf', 'fff', 'weee' )
Any ideas how to change the LINQ statement to query Header only once?
(ignore the fact I'm matching on Guids - it will actually be integers; I was just quickly looking for a 1-1 table in EF because that's when the problem occurs and I happened to find these)
var guidsToFind = new Guid[] { Guid.NewGuid(), Guid.NewGuid(), Guid.NewGuid()};
AdventureWorks2008R2Entities context = new AdventureWorks2008R2Entities();
var g = context.People.Where(p => guidsToFind.Contains(p.BusinessEntity.rowguid)).ToList();
That produces the following more expensive query:
SELECT [Extent1].[BusinessEntityID] AS [BusinessEntityID],
[Extent1].[PersonType] AS [PersonType],
[Extent1].[NameStyle] AS [NameStyle],
[Extent1].[Title] AS [Title],
[Extent1].[FirstName] AS [FirstName],
[Extent1].[MiddleName] AS [MiddleName],
[Extent1].[LastName] AS [LastName],
[Extent1].[Suffix] AS [Suffix],
[Extent1].[EmailPromotion] AS [EmailPromotion],
[Extent1].[AdditionalContactInfo] AS [AdditionalContactInfo],
[Extent1].[Demographics] AS [Demographics],
[Extent1].[rowguid] AS [rowguid],
[Extent1].[ModifiedDate] AS [ModifiedDate]
FROM [Person].[Person] AS [Extent1]
INNER JOIN [Person].[BusinessEntity] AS [Extent2] ON [Extent1].[BusinessEntityID] = [Extent2].[BusinessEntityID]
LEFT OUTER JOIN [Person].[BusinessEntity] AS [Extent3] ON [Extent1].[BusinessEntityID] = [Extent3].[BusinessEntityID]
WHERE [Extent2].[rowguid] = cast('b95b63f9-6304-4626-8e70-0bd2b73b6b0f' as uniqueidentifier) OR [Extent3].[rowguid] IN (cast('f917a037-b86b-4911-95f4-4afc17433086' as uniqueidentifier),cast('3188557d-5df9-40b3-90ae-f83deee2be05' as uniqueidentifier))
Really odd. Looks like a LINQ limitation.
I don't have a system to try this on right now but if you first get a list of BusinessEntityId values based on the provided guids and then get the persons like this
var g = context.People.Where(p => businessEntityIdList.Contains(p.BusinessEntityId)).ToList();
there should not be a reason for additional unnecessary joins anymore.
If that works, you can try to combine the to steps into one LINQ expression to see if the separation stays intact.

Entity Framework, how to manually produce inner joins with LinQ to entitites

Let's imagine I have an table called Foo with a primary key FooID and an integer non-unique column Bar. For some reason in a SQL query I have to join table Foo with itself multiple times, like this:
SELECT * FROM Foo f1 INNER JOIN Foo f2 ON f2.Bar = f1.Bar INNER JOIN Foo f3 ON f3.Bar = f1.Bar...
I have to achieve this via LINQ to Entities.
Doing
ObjectContext.Foos.Join(ObjectContext.Foos, a => a.Bar, b => b.Bar, (a, b) => new {a, b})
gives me LEFT OUTER JOIN in the resulting query and I need inner joins, this is very critical.
Of course, I might succeed if in edmx I added as many associations of Foo with itself as necessary and then used them in my code, Entity Framework would substitute correct inner join for each of the associations. The problem is that at design time I don't know how many joins I will need. OK, one workaround is to add as many of them as reasonable...
But, if nothing else, from theoretical point of view, is it at all possible to create inner joins via EF without explicitly defining the associations?
In LINQ to SQL there was a (somewhat bizarre) way to do this via GroupJoin, like this:
ObjectContext.Foos.GroupJoin(ObjectContext.Foos, a => a.Bar, b => b.Bar, (a, b) => new {a, b}).SelectMany(o = > o.b.DefaultIfEmpty(), (o, b) => new {o.a, b)
I've just tried it in EF, the trick does not work there. It still generates outer joins for me.
Any ideas?
In Linq to Entities, below is one way to do an inner join on mutiple instances of the same table:
using (ObjectContext ctx = new ObjectContext())
{
var result = from f1 in ctx.Foo
join f2 in ctx.Foo on f1.bar equals f2.bar
join f3 in ctx.Foo on f1.bar equals f3.bar
select ....;
}

Dynamic Query with Entity Framework 4

Okay I'm new to EF and I'm having issues grasping on filtering results...
I'd like to emulate the ef code to do something like:
select *
from order o
inner join orderdetail d on (o.orderid = d.orderid)
where d.amount > 20.00
just not sure how this would be done in EF (linq to entities syntax)
Your SQL gives multiple results per order if there's multiple details > 20.00. That seems wrong to me. I think you want:
var q = from o in Context.Orders
where o.OrderDetails.Any(d => d.Amount > 20.00)
select o;
I would do it like that:
context.OrderDetails.Where(od => od.Amount > 20).Include("Order").ToList().Select(od => od.Order).Distinct();
We are taking details first, include orders, and take distinct orders.