I'm trying to optimize my EF queries. I have an entity called Employee. Each employee has a list of tools. Ultimately, I'm trying to get a list of employees with their tools that are NOT broken. When running my query, I can see that TWO calls are made to the server: one for the employee entities and one for the tool list. Again, I'm trying to optimize the query, so the server is hit for a query only once. How can I do this?
I've been exploring with LINQ's join and how to create a LEFT JOIN, but the query is still not optimized.
In my first code block here, the result is what I want, but -- again -- there are two hits to the server.
public class Employee
{
public int EmployeeId { get; set; }
public List<Tool> Tools { get; set; } = new List<Tool>();
...
}
public class Tool
{
public int ToolId { get; set; }
public bool IsBroken { get; set; } = false;
public Employee Employee { get; set; }
public int EmployeeId { get; set; }
...
}
var x = (from e in db.Employees.Include(e => e.Tools)
select new Employee()
{
EmployeeId = e.EmployeeId,
Tools = e.Tools.Where(t => !t.IsBroken).ToList()
}).ToList();
This second code block pseudoly mimics what I'm trying to accomplish. However, the GroupBy(...) is being evaluated locally on the client machine.
(from e in db.Employees
join t in db.Tools.GroupBy(tool => tool.EmployeeId) on e.EmployeeId equals t.Key into empTool
from et in empTool.DefaultIfEmpty()
select new Employee()
{
EmployeeId = e.EmployeeId,
Tools = et != null ? et.Where(t => !t.IsBroken).ToList() : null
}).ToList();
Is there anyway that I can make ONE call to the server as well as not having my GroupBy() evaluate locally and have it return a list of employees with a filtered tool list with tools that are not broken? Thank you.
Shortly, it's not possible (and I don't think it ever will be).
If you really want to control the exact server calls, EF Core is simply not for you. While EF Core still has issues with some LINQ query translation which leads to N+1 query or client evaluation, one thing is by design: unlike EF6 which uses single huge union SQL query for producing the result, EF Core uses one SQL query for the main result set plus one SQL query per each correlated result set.
This is sort of explained in the How Queries Work EF Core documentation section:
The LINQ query is processed by Entity Framework Core to build a representation that is ready to be processed by the database provider
The result is cached so that this processing does not need to be done every time the query is executed
The result is passed to the database provider
The database provider identifies which parts of the query can be evaluated in the database
These parts of the query are translated to database specific query language (for example, SQL for a relational database)
One or more queries are sent to the database and the result set returned (results are values from the database, not entity instances)
Note the word more in the last bullet.
In your case, you have 1 main result set (Employee) + 1 correlated result set (Tool), hence the expected server queries are TWO (except if the first query returns empty set).
You can use this:
var x = from e in _context.Employees
select new
{
e,
Tools = from tool in e.Tools where !tool.IsBroken select tool
};
var result = x.AsEnumerable().Select(y => y.e);
Which will be finally translated to a SQL query like below depending on your provider:
SELECT
`Project1`.`EmployeeId`,
`Project1`.`Name`,
`Project1`.`C1`,
`Project1`.`ToolId`,
`Project1`.`IsBroken`,
`Project1`.`EmployeeId1`
FROM (SELECT
`Extent1`.`EmployeeId`,
`Extent1`.`Name`,
`Extent2`.`ToolId`,
`Extent2`.`IsBroken`,
`Extent2`.`EmployeeId` AS `EmployeeId1`,
CASE WHEN (`Extent2`.`ToolId` IS NOT NULL) THEN (1) ELSE (NULL) END AS `C1`
FROM `Employees` AS `Extent1` LEFT OUTER JOIN `Tools` AS `Extent2` ON (`Extent1`.`EmployeeId` = `Extent2`.`EmployeeId`) AND (`Extent2`.`IsBroken` != 1)) AS `Project1`
ORDER BY
`Project1`.`EmployeeId` ASC,
`Project1`.`C1` ASC
I change my previous answer which was wrong, thanks to comments.
Related
I have an issue where we create a complex IQueryable that we need to make it more efficient.
There are 2 tables that should only be included if columns from them are being filtered.
My exact situation is complex to explain so I thought I could illustrate it with an example for cars.
If I have a CarFilter class like this:
public class CarFilter
{
public string BrandName { get;set; }
public decimal SalePrice {get; set; }
}
Let's say that we have a query for car sales:
var info = from car in cars
from carSale in carSales on carSale.BrandId == car.BrandId && car.ModelId == carSale.ModelId
from brand in carBrands on car.BrandId == brand.BrandId
select car
var cars = info.ToList();
Let's say that this is a huge query that returns 100'000 rows as we are looking at cars and sales and the associated brands.
The user only wants to see the details from car, the other 2 tables are for filtering purposes.
So if the user only wants to see Ford cars, our logic above is not efficient. We are joining in the huge car sale table for no reason as well as CarBrand as the user doesn't care about anything in there.
My question is how can I only include tables in my IQueryable if they are actually needed?
So if there is a BrandName in my filter I would include CarBrand table, if not, it's not included.
Using this example, the only time I would ever want both tables is if the user specified both a BrandName and SalePrice.
The semantics are not important here, i.e the number of records returned being impacted by the joins etc, I am looking for help on the approach
I am using EF Core
Paul
It is common for complex filtering. Just join when it is needed.
var query = cars;
if (filter.SalePrice > 0)
{
query =
from car in query
join carSale in carSales on new { car.BrandId, car.ModelId } equals new { carSale.BrandId, carSale.ModelId }
where carSale.Price >= filter.SalePrice
select car;
}
if (!filter.BrandName.IsNullOrEempty())
{
query =
from car in query
join brand in carBrands on car.BrandId equals brand.BrandId
where brand.Name == filter.BrandName
select car;
}
var result = query.ToList();
I am trying to query my postgresql database using Ef core and Ardalis Specification.
For the query I build I want to sort the results by using OrderBy with an aggregate of a property that is on a nested object.
The sorting I want is to sort the list of Clinics by the Clinic that has the most Reviews with high Grades. The grades are on a scale of 1-5.
So if a clinic has two reviews with Grade=5 it should come on top of a clinic that has 5 reviews with Grade=2 or Grade=4. To do this I have to calculate the mean value and then order by the highest
public class Clinic
{
public int Id { get; set; }
public ICollection<Review> Reviews {get; set;}
}
public class Review
{
public int Id { get; set; }
public decimal Grade {get; set;}
}
My query so far, which doesnt work as intended as it only gets the highest value. Can I insert a mean-value calculation here somehow?
public ClinicFilterPaginatedSpecification()
{
Query.OrderByDescending(x => x.Reviews.Max(x => x.Grade ));
}
Running the query:
var filterSpec = new ClinicFilterSpecification();
var itemsOnPage= await _clinicRepo.ListAsync(filterSpec);
As Ivan Stoev notes in his comment, you should be able to use the .Average() command:
public ClinicFilterPaginatedSpecification()
{
Query.OrderByDescending(clinic => clinic.Reviews.Average(review => review.Grade ));
}
Have you tried this and if so is it working or producing an error?
I should mention that this is not directly related to the Specification package, in the sense that the expression is not altered in any way. Whatever works on EF, should work through specs as well. We're passing the expression as it is.
Now the question is how EF would behave in this case when you need to aggregate some data from the collections. I think this is optimized in EF Core 5, and Ardalis' suggestion should work. Prior to EF Core 3, this scenario would have involved an explicit Join operation (not quite sure).
I would like to know what happens when I use IQueryable with and without AsQueryable(). Here is an example:
public partial class Book
{
.......
public Nullable<System.DateTime> CheckoutDate{get; set;}
}
I need to filter the data from SQL server before it is returned to an application server. I need to return books checked out more recently than entered date. Which one should I use?
A.
IQueryable<Book> books = db.Books;
books = books.Where(b => b.CheckoutDate >= date);
B.
IQueryable<Book> books = db.Books.ToList().AsQueryable();
books = books.Where(b => b.CheckoutDate >= date);
Basically I would like to know what is the difference between the above two options. Do they work on the similar grounds? Do they return same values?
With B option, you're basically retrieving every book from database and filtering data in memory.
A option is more performance, as it filters data at the database and return only the rows that match your query.
I'm developing a client app that uses breezejs and Entity Framework 6 on the back end. I've got a statement like this:
var country = 'Mexico';
var customers = EntityQuery.from('customers')
.where('country', '==', country)
.expand('order')
I want to use There may be hundreds of orders that each customer has made. For the purposes of performance, I only want to retrieve the latest order for each customer. This will be based on the created date for the order. In SQL, I could write something like this:
SELECT c.customerId, companyName, ContactName, City, Country, max(o.OrderDate) as LatestOrder FROM Customers c
inner join Orders o on c.CustomerID = o.CustomerID
group by c.customerId, companyName, ContactName, City, Country
If this was run against the northwind database, only the most recent order row is returned for each customer.
How can I write a similar query in breeze, so that it runs on the server side and therefore returns less data to the client. I know I could handle this all on the client but writing some javascript in a querysucceeded method that could be run by the client - but that's not the goal here.
thanks
For a case like this, you should create a special endpoint method that will perform your query.
Then you can use an Entity Framework query to do what you want, using the LINQ syntax.
Here are two Web API examples:
[HttpGet]
public IQueryable<Object> CustomersLatestOrderEntities()
{
// IQueryable<Object> containing Customer and Order entity
var entities = ContextProvider.Context.Customers.Select(c => new { Customer = c, LatestOrder = c.Orders.OrderByDescending(o => o.OrderDate).FirstOrDefault() });
return entities;
}
[HttpGet]
public IQueryable<Object> CustomersLatestOrderProjections()
{
// IQueryable<Object> containing Customer and Order entity
var entities = ContextProvider.Context.Customers.Select(c => new { Customer = c, LatestOrder = c.Orders.OrderByDescending(o => o.OrderDate).FirstOrDefault() });
// IQueryable<Object> containing just data fields, no entities
var projections = entities.Select(e => new { e.Customer.CustomerID, e.Customer.ContactName, e.LatestOrder.OrderDate });
return projections;
}
Note that you have a choice here. You can return actual entities, or you can return just some data fields. Which is right for you depends upon how you are going to use them on the client. If they are just for display in a
non-editable list, you can just return the plain data (CustomersLatestOrderProjections above). If they can potentially
be edited, then return the object containing the entities (CustomersLatestOrderEntities). Breeze will merge the entities
into its cache, even though they are contained inside this anonymous object.
Either way, because it returns IQueryable, you can use the Breeze filtering syntax from the client to further qualify the query.
var projectionQuery = breeze.EntityQuery.from("CustomersLatestOrderProjections")
.skip(20)
.take(10);
var entityQuery = breeze.EntityQuery.from("CustomersLatestOrderEntities")
.where('customer.countryName', 'startsWith', 'C');
.take(10);
Im trying to call a stored procedure using Entity framework.
If I go direcly to the web api method it works fine, but when calling it from breeze it causes an exception on the metadata method.
The error is :
"Could not find the CLR type for...".
Anyone know how to fix this?
I had the very same issue, but thank God I figured out a solution. Instead of using a stored procedure, you should use a view, as Breeze recognizes views as DbSet<T>, just like tables. Say you have a SQL server table that contains two tables Customers and Orders.
Customers (**CustomerId**, FirstName, LastName)
Orders (OrderId, #CustomerId, OrderDate, OrderTotal)
Now, say you want a query that returns orders by CustomerId. Usually, you would do that in a stored procedure, but as I said, you need to use a view instead. So the query will look like this in the view.
Select o.OrderId, c.CustomerId, o.OrderDate, o.OrderTotal
from dbo.Orders o inner join dbo.Customers c on c.CustomerId = o.CustomerId
Notice there is no filtering (where ...). So:
i. Create a [general] view that includes the filtering key(s) and name it, say, OrdersByCustomers
ii. Add the OrdersByCustomers view to the entity model in your VS project
iii. Add the entity to the Breeze controller, as such:
public IQueryable<OrdersByCustomers> OrdersByCustomerId(int id)
{
return _contextProvider.Context.OrdersByCustomers
.Where(r => r.CustomerId == id);
}
Notice the .Where(r => r.CustomerId == id) filter. We could do it in the data service file, but because we want the user to see only his personal data, we need to filter from the server so it only returns his data.
iv. Now, that the entity is set in the controller, you may invoke it in the data service file, as such:
var getOrdersByCustomerId = function(orderObservable, id)
{
var query = breeze.EntityQuery.from('OrdersByCustomerId')
.WithParameters({ CustomerId: id });
return manager.executeQuery(query)
.then(function(data) {
if (orderObservable) orderObservable(data.results);
}
.fail(function(e) {
logError('Retrieve Data Failed');
}
}
v. You probably know what to do next from here.
Hope it helps.