I have an issue where we create a complex IQueryable that we need to make it more efficient.
There are 2 tables that should only be included if columns from them are being filtered.
My exact situation is complex to explain so I thought I could illustrate it with an example for cars.
If I have a CarFilter class like this:
public class CarFilter
{
public string BrandName { get;set; }
public decimal SalePrice {get; set; }
}
Let's say that we have a query for car sales:
var info = from car in cars
from carSale in carSales on carSale.BrandId == car.BrandId && car.ModelId == carSale.ModelId
from brand in carBrands on car.BrandId == brand.BrandId
select car
var cars = info.ToList();
Let's say that this is a huge query that returns 100'000 rows as we are looking at cars and sales and the associated brands.
The user only wants to see the details from car, the other 2 tables are for filtering purposes.
So if the user only wants to see Ford cars, our logic above is not efficient. We are joining in the huge car sale table for no reason as well as CarBrand as the user doesn't care about anything in there.
My question is how can I only include tables in my IQueryable if they are actually needed?
So if there is a BrandName in my filter I would include CarBrand table, if not, it's not included.
Using this example, the only time I would ever want both tables is if the user specified both a BrandName and SalePrice.
The semantics are not important here, i.e the number of records returned being impacted by the joins etc, I am looking for help on the approach
I am using EF Core
Paul
It is common for complex filtering. Just join when it is needed.
var query = cars;
if (filter.SalePrice > 0)
{
query =
from car in query
join carSale in carSales on new { car.BrandId, car.ModelId } equals new { carSale.BrandId, carSale.ModelId }
where carSale.Price >= filter.SalePrice
select car;
}
if (!filter.BrandName.IsNullOrEempty())
{
query =
from car in query
join brand in carBrands on car.BrandId equals brand.BrandId
where brand.Name == filter.BrandName
select car;
}
var result = query.ToList();
Related
I'm trying to optimize my EF queries. I have an entity called Employee. Each employee has a list of tools. Ultimately, I'm trying to get a list of employees with their tools that are NOT broken. When running my query, I can see that TWO calls are made to the server: one for the employee entities and one for the tool list. Again, I'm trying to optimize the query, so the server is hit for a query only once. How can I do this?
I've been exploring with LINQ's join and how to create a LEFT JOIN, but the query is still not optimized.
In my first code block here, the result is what I want, but -- again -- there are two hits to the server.
public class Employee
{
public int EmployeeId { get; set; }
public List<Tool> Tools { get; set; } = new List<Tool>();
...
}
public class Tool
{
public int ToolId { get; set; }
public bool IsBroken { get; set; } = false;
public Employee Employee { get; set; }
public int EmployeeId { get; set; }
...
}
var x = (from e in db.Employees.Include(e => e.Tools)
select new Employee()
{
EmployeeId = e.EmployeeId,
Tools = e.Tools.Where(t => !t.IsBroken).ToList()
}).ToList();
This second code block pseudoly mimics what I'm trying to accomplish. However, the GroupBy(...) is being evaluated locally on the client machine.
(from e in db.Employees
join t in db.Tools.GroupBy(tool => tool.EmployeeId) on e.EmployeeId equals t.Key into empTool
from et in empTool.DefaultIfEmpty()
select new Employee()
{
EmployeeId = e.EmployeeId,
Tools = et != null ? et.Where(t => !t.IsBroken).ToList() : null
}).ToList();
Is there anyway that I can make ONE call to the server as well as not having my GroupBy() evaluate locally and have it return a list of employees with a filtered tool list with tools that are not broken? Thank you.
Shortly, it's not possible (and I don't think it ever will be).
If you really want to control the exact server calls, EF Core is simply not for you. While EF Core still has issues with some LINQ query translation which leads to N+1 query or client evaluation, one thing is by design: unlike EF6 which uses single huge union SQL query for producing the result, EF Core uses one SQL query for the main result set plus one SQL query per each correlated result set.
This is sort of explained in the How Queries Work EF Core documentation section:
The LINQ query is processed by Entity Framework Core to build a representation that is ready to be processed by the database provider
The result is cached so that this processing does not need to be done every time the query is executed
The result is passed to the database provider
The database provider identifies which parts of the query can be evaluated in the database
These parts of the query are translated to database specific query language (for example, SQL for a relational database)
One or more queries are sent to the database and the result set returned (results are values from the database, not entity instances)
Note the word more in the last bullet.
In your case, you have 1 main result set (Employee) + 1 correlated result set (Tool), hence the expected server queries are TWO (except if the first query returns empty set).
You can use this:
var x = from e in _context.Employees
select new
{
e,
Tools = from tool in e.Tools where !tool.IsBroken select tool
};
var result = x.AsEnumerable().Select(y => y.e);
Which will be finally translated to a SQL query like below depending on your provider:
SELECT
`Project1`.`EmployeeId`,
`Project1`.`Name`,
`Project1`.`C1`,
`Project1`.`ToolId`,
`Project1`.`IsBroken`,
`Project1`.`EmployeeId1`
FROM (SELECT
`Extent1`.`EmployeeId`,
`Extent1`.`Name`,
`Extent2`.`ToolId`,
`Extent2`.`IsBroken`,
`Extent2`.`EmployeeId` AS `EmployeeId1`,
CASE WHEN (`Extent2`.`ToolId` IS NOT NULL) THEN (1) ELSE (NULL) END AS `C1`
FROM `Employees` AS `Extent1` LEFT OUTER JOIN `Tools` AS `Extent2` ON (`Extent1`.`EmployeeId` = `Extent2`.`EmployeeId`) AND (`Extent2`.`IsBroken` != 1)) AS `Project1`
ORDER BY
`Project1`.`EmployeeId` ASC,
`Project1`.`C1` ASC
I change my previous answer which was wrong, thanks to comments.
I would like to know what happens when I use IQueryable with and without AsQueryable(). Here is an example:
public partial class Book
{
.......
public Nullable<System.DateTime> CheckoutDate{get; set;}
}
I need to filter the data from SQL server before it is returned to an application server. I need to return books checked out more recently than entered date. Which one should I use?
A.
IQueryable<Book> books = db.Books;
books = books.Where(b => b.CheckoutDate >= date);
B.
IQueryable<Book> books = db.Books.ToList().AsQueryable();
books = books.Where(b => b.CheckoutDate >= date);
Basically I would like to know what is the difference between the above two options. Do they work on the similar grounds? Do they return same values?
With B option, you're basically retrieving every book from database and filtering data in memory.
A option is more performance, as it filters data at the database and return only the rows that match your query.
I'm developing a client app that uses breezejs and Entity Framework 6 on the back end. I've got a statement like this:
var country = 'Mexico';
var customers = EntityQuery.from('customers')
.where('country', '==', country)
.expand('order')
I want to use There may be hundreds of orders that each customer has made. For the purposes of performance, I only want to retrieve the latest order for each customer. This will be based on the created date for the order. In SQL, I could write something like this:
SELECT c.customerId, companyName, ContactName, City, Country, max(o.OrderDate) as LatestOrder FROM Customers c
inner join Orders o on c.CustomerID = o.CustomerID
group by c.customerId, companyName, ContactName, City, Country
If this was run against the northwind database, only the most recent order row is returned for each customer.
How can I write a similar query in breeze, so that it runs on the server side and therefore returns less data to the client. I know I could handle this all on the client but writing some javascript in a querysucceeded method that could be run by the client - but that's not the goal here.
thanks
For a case like this, you should create a special endpoint method that will perform your query.
Then you can use an Entity Framework query to do what you want, using the LINQ syntax.
Here are two Web API examples:
[HttpGet]
public IQueryable<Object> CustomersLatestOrderEntities()
{
// IQueryable<Object> containing Customer and Order entity
var entities = ContextProvider.Context.Customers.Select(c => new { Customer = c, LatestOrder = c.Orders.OrderByDescending(o => o.OrderDate).FirstOrDefault() });
return entities;
}
[HttpGet]
public IQueryable<Object> CustomersLatestOrderProjections()
{
// IQueryable<Object> containing Customer and Order entity
var entities = ContextProvider.Context.Customers.Select(c => new { Customer = c, LatestOrder = c.Orders.OrderByDescending(o => o.OrderDate).FirstOrDefault() });
// IQueryable<Object> containing just data fields, no entities
var projections = entities.Select(e => new { e.Customer.CustomerID, e.Customer.ContactName, e.LatestOrder.OrderDate });
return projections;
}
Note that you have a choice here. You can return actual entities, or you can return just some data fields. Which is right for you depends upon how you are going to use them on the client. If they are just for display in a
non-editable list, you can just return the plain data (CustomersLatestOrderProjections above). If they can potentially
be edited, then return the object containing the entities (CustomersLatestOrderEntities). Breeze will merge the entities
into its cache, even though they are contained inside this anonymous object.
Either way, because it returns IQueryable, you can use the Breeze filtering syntax from the client to further qualify the query.
var projectionQuery = breeze.EntityQuery.from("CustomersLatestOrderProjections")
.skip(20)
.take(10);
var entityQuery = breeze.EntityQuery.from("CustomersLatestOrderEntities")
.where('customer.countryName', 'startsWith', 'C');
.take(10);
Just starting out with Entity Framework and am trying to work out how you would do something like this....
Say I have the following entities, Customers that have Orders that have OrderLineItems which are linked to Products. I would like to return the name of every customer with a count of the number of times they have ordered a particular product.
I have seen examples of using .Count() but these have always been for the first navigation property i.e. number of orders per customer.
Would appreciate some guidance here.
Something like this should work, where context is your DbContext instance.
It will return an IEnumerable<dynamic>, although obviously you could make a class to hold the results.
// The product to count
var productId = 12345;
context.Customers.Include("Orders.OrderLineItems.Products")
.Select(customer =>
new {
CustomerName = customer.Name,
ProductCount = customer.Orders
.SelectMany(o => o.OrderLineItems)
.SelectMany(i => i.Products.Where(p => p.Id = productId).Count()
});
The Include() extension method is useful, it will make sure that the resulting SQL query joins the relevant tables together - otherwise multiple queries would be executed for each customer (one to get orders, another for line items and a final one for products).
This is my real world example.
I have 4 tables:
Person
Plan
Coverage
CoveredMembers
Each person can have many plans, each of those plans can have many coverages. Each of those coverages can have many CoveredMembers.
I need a query that will apply a filter on Plan.PlanType == 1 and CoveredMembers.TermDate == null.
This query should bring back any person who has a medical type plan that is not terminated.
This SQL statement would do just that:
SELECT Person.*, Plans.*, Coverages.*, CoveredMembers.*
FROM Person P
INNER JOIN Plan PL ON P.PersonID = PL.PersonID
INNER JOIN Coverage C on PL.PlanID = C.PlanID
INNER JOIN CoveredMember CM on C.CoverageID = CM.CoverageID
WHERE CM.TermDate = NULL AND PL.PlanType = 1
I have figured out how to do this using anonymous types, but I sometimes need to update the data and save back to the database - and anonymous types are read only.
I was given a solution that did work using JOIN but it only brought back the persons (albeit filtered the way I needed). I can then loop through each person:
foreach (var person in persons) {
foreach (var plan in person.Plans{
//do stuff
}
}
But wouldn't that make a db call for each iteration of the loop? I have 500 persons with 3 unterminated medical plans each, so it would call the db 1500 times?
This is why I want to bring the whole data tree from Persons to CoveredMembers back in one shot. Is this not possible?
I believe this is accomplished in two parts:
Your query to determine the people you wish to have returned based on your criteria as discussed in this question previously: Entity framework. Need help filtering results
Properly setting the navigation properties for entities you want brought together to be eagerly loaded: http://msdn.microsoft.com/en-us/data/jj574232.aspx
For example if your Person entity looks like:
public class Person {
public List<Plan> Plans {get; set;}
...
}
When returning data from the dbcontext you can also use explicit eager loading with the include option:
var people = context.People
.Include(p => p.Plans)
.ToList();
....
If these are nested - coverage is part of plan, etc (which it looks like, it goes something like):
var people = context.People
.Include(p => p.Plans.Select(pl=>pl.Coverage).Select(c=>c.CoveredMembers)))
.ToList();
....
I am making some assumptions about your data model here, and my code above probably needs a little tweaking.
EDIT:
I might need someone else to weigh in here, but I don't think you can add the where clause into an include like that (my example above leads you that way a bit by putting the include on the context object, instead return an IQueryable with your conditions set as solved in your first post (without a ToList() called on it) and then use the code you wrote above without the Where clauses:
From first post (you supplied different criteria in this one, but same concept)
var q = from q1 in dbContext.Parent
join q2 in dbContext.Children
on q1.key equals q2.fkey
join q3 in ........
where q4.col1 == 3000
select q1;
Then:
List<Person> people = q.Include(p => p.Plans
.Select(pl => pl.Coverages)
.Select(c => c.CoveredMembers).ToList();
Again, doing this without being able to troubleshoot - I am sure it would take me a few attempts to iron this one out too.