I am working with Entity Framework and am pretty new to it.
I have a table named Order and a table named Products.
Each order has a lot of products.
When generating the entities I get an Order object with an ICollection of products.
The problem is I have a lot of products to each order (20K) and when I do
order.Products.Where(......)
EF runs a select statement with only orderId = 123 and applies the rest of the where in code.
Because I have a lot of results, the select takes a long time. How can I change the code so that the select in the DB includes the where conditions?
This statement:
var prods = order.Products.Where(...);
is equivalent to:
var temps = order.Products;
var prods = temps.Where(...);
Unlike Where(...), whose execution is deferred, accessing order.Products triggers lazy loading, which materializes an ICollection immediately. So it is the order.Products part that generates the select statement you see: it fetches all the products belonging to that order into memory. The Where(...) part then runs over that in-memory collection, hence the bad performance.
To avoid this, you should use order.Products only if you really want all the products on an order. If you want only a subset of them, do something like the following:
ctx.Products.Where(prod => prod.Order.Id == order.Id && ...)
Note that ctx is the database context, not the order object.
If you think that the prod.Order.Id == order.Id clause above looks a little dirty, here's a purer but longer alternative:
ctx.Entry(order).Collection(ord => ord.Products).Query().Where(...)
which produces exactly the same SQL query.
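To see why the two forms behave differently, here is a minimal sketch that needs no database. It assumes a hypothetical Product class and uses AsQueryable() over a List as a stand-in for a DbSet; a real EF provider would translate the expression tree to SQL rather than run it in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity used only for this illustration.
public class Product
{
    public int Id { get; set; }
    public decimal Price { get; set; }
}

public static class WhereBindingDemo
{
    public static List<Product> SampleProducts() => new List<Product>
    {
        new Product { Id = 1, Price = 5m },
        new Product { Id = 2, Price = 50m },
        new Product { Id = 3, Price = 500m },
    };

    // On ICollection<T>, Where binds to Enumerable.Where: the predicate is a
    // compiled delegate, so every element must already be in memory.
    public static IEnumerable<Product> FilterInMemory(ICollection<Product> products)
        => products.Where(p => p.Price > 10m);

    // On IQueryable<T>, Where binds to Queryable.Where: the predicate is kept
    // as an expression tree that a provider (such as EF) can translate.
    public static IQueryable<Product> FilterAsQuery(IQueryable<Product> products)
        => products.Where(p => p.Price > 10m);

    public static void Main()
    {
        var list = SampleProducts();
        var query = FilterAsQuery(list.AsQueryable());

        // The query object carries a description of the filter itself.
        Console.WriteLine(query.Expression);
        Console.WriteLine(FilterInMemory(list).Count());
    }
}
```

Both methods contain the same lambda text; only the static type of the source decides whether the filter can reach the database.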
Related
The property is defined as virtual. But before I access the order property, the order entity's data has been loaded, why?
Full source code:
A couple of things:
When you state "doesn't work": you are getting an Order back, but the $ figure is 0.0 when you expect a different value? You appear to have two order records, and based on what is coming back you are expecting a non-zero figure; are both records non-zero? In your debug view, expand "Orders" in the pop-up to see which order details EF has loaded.
Firstly, you should be careful with the "OrDefault" variants of the methods. Your code assumes that a value is returned; in such cases you are better off using Single() or First() as applicable.
Additionally, when using First you should specify an OrderBy clause to ensure a reliable ordering.
SaveChanges should only be called if you modify data.
Lastly, lazy loading is an enabler for loading infrequently used data on demand. You should largely work to avoid relying on lazy-load calls. If you need the entire entities and you know you are going to use orders, then eager-load them.
I.e.
using (var context = new EfContext())
{
var customer = context.Customers
.Include(c => c.Orders)
.Single(c => c.CustomerId == customerId);
// Do stuff...
}
If you want just one applicable order for a given customer, consider using Select to retrieve it:
I.e.
using (var context = new EfContext())
{
var data = context.Customers.Where(c => c.CustomerId == customerId)
.Select(c => new { Customer = c, FirstOrder = c.Orders.OrderBy(o => o.OrderDate).FirstOrDefault()})
.Single();
// Do Stuff...
}
This will give you an anonymous type containing the Customer (without eager loading the orders) and the matching first order.
Better is just to use Select to retrieve the specific fields you need from customer and order(s). This reduces the amount of data (rows and columns) pulled back from the database.
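The advice about the "OrDefault" variants and about ordering before First can be illustrated with plain LINQ to Objects. This is a minimal sketch; the class and method names are hypothetical, and the same operators behave analogously when EF translates them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleVersusOrDefaultDemo
{
    // Returns null when nothing matches; the caller must remember to check.
    public static string FindOrNull(IEnumerable<string> names, string wanted)
        => names.FirstOrDefault(n => n == wanted);

    // Throws InvalidOperationException when the value is missing or duplicated,
    // so a broken assumption surfaces at the query instead of later on.
    public static string FindExactlyOne(IEnumerable<string> names, string wanted)
        => names.Single(n => n == wanted);

    // First without an ordering returns whatever the source yields first;
    // adding OrderBy makes the choice deterministic.
    public static int EarliestYear(IEnumerable<int> years)
        => years.OrderBy(y => y).First();

    public static void Main()
    {
        var names = new[] { "alice", "bob" };
        Console.WriteLine(FindOrNull(names, "carol") == null);
        Console.WriteLine(EarliestYear(new[] { 2012, 2010, 2011 }));
        try
        {
            FindExactlyOne(names, "carol");
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine("Single threw: " + ex.Message);
        }
    }
}
```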
Sorry, it looks like this is caused by Visual Studio. When the call statement is commented out, the SQL query against the order table is no longer executed; when the statement is included, the query is still executed even though a breakpoint is set.
The Visual Studio debugger evaluates the expression automatically.
With the statement commented out:
SQL Profiler
With the statement included:
SQL Profiler
This one is driving me nuts!
For simplicity, I'm not gonna put up the entire code that is used for our DDD but simply expose what I've tried and explain what isn't working.
I have a simple database structure:
Products (holds product data)
Orders (holds entered orders)
OrderProducts (ref table between Orders and Product)
I have an Order aggregate root and I want to pull out the product count of one simple Order.
I fetch my Order by id which results in an EF lambda:
var order = _orderRepository.Get(orderId);
Then, I try to pull the count of products in that order using:
var count = order.OrderProducts.Count();
This line chokes when an order has a lot of records because it's fetching all of them. Fine.
So, I'm refining it a bit by adding some filters to the products I want to count from within my order.
A product has a couple of properties which include a type (so, there's a type ID).
So, now I'm trying this:
// This trims my results down to about a dozen products
var count = order.OrderProducts
.Where(op => op.Product.TypeId == 2)
.Count();
If I use Linqpad to see what kind of SQL is generated, to my surprise, it's still loading ALL the OrderProducts from this order!
How can I force it to apply the filter in the query directly?
It's loading all of them because once you touch the navigation property (i.e. order.OrderProducts), lazy loading kicks in and loads all of them (even the ones you don't want); the filter then runs in memory. To avoid that, query the database directly by order ID. Maybe something like:
_orderProductRepository.Where(p => p.OrderID == orderId && p.Product.TypeID == 2);
I have DbContext (called "MyContext") with about 100 DbSets within it.
Among the domain classes, I have a Document class with 10 direct subclasses (like PurchaseOrder, RequestForQuotation etc).
The hierarchy is mapped with a TPT strategy.
That is, in my database, there is a Document table, with other tables like PurchaseOrder, RequestForQuotation for the subclasses.
When I do a query like:
Document document = myContext.Documents.First();
the query took 5 seconds, no matter whether it's the first time I run it or subsequently.
A query like:
Document document = myContext.Documents.Where(o => o.ID == 2).First();
also took as long.
Is this an issue with EF4.1 (if so, will EF4.2 help) or is this an issue with the query code?
Did you try using SQL Profiler to see what is actually sent to the DB? It could be that you have too many joins on your Document that are not set to lazy load, so the query has to do all the joins in one go, bringing back too many columns. Try sending a simple query with just one return column.
As you can read here, there are some performance issues regarding TPT in EF.
The EF Team announced several fixes in the June 2011 CTP, including TPT query optimization, but they are not included in EF 4.2, as you can read in the comments to this answer.
In the worst case, these fixes will only be released with .NET 4.5. I'm hoping it will be sooner...
I'm not certain that the DbSet exposed by code-first is actually using ObjectQuery, but you can try invoking the .ToTraceString() method on it to see what SQL is generated, like so:
var query = myContext.Documents.Where(o => o.ID == 2);
Debug.WriteLine(query.ToTraceString());
Once you get the SQL you can determine whether it's the query or EF that is causing the delay. Depending on the complexity of your base class, the query might include a lot of additional columns, which can be avoided using projection. Using a projection, you can perform a query like this:
var query = from d in myContext.Documents
where d.ID == 2
select new
{
d.ID
};
This should basically perform a SELECT ID FROM Documents WHERE ID = 2 query and you can measure how long this takes to gain further information. Of course the projected query might not fit your needs but it might get you on the right track. If this still takes up to 5 seconds you should look into performance problems with the database itself rather than EF.
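To compare the full query with the projected one, each can be timed the same way. The helper below is a hypothetical sketch (the name Timing.Measure is not part of EF, and it assumes a compiler with tuple support, C# 7 or later). It forces execution inside the timed region, which matters because LINQ queries are deferred until enumerated.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class Timing
{
    // Runs the supplied function once and returns its result plus the elapsed time.
    public static (T Result, TimeSpan Elapsed) Measure<T>(Func<T> action)
    {
        var stopwatch = Stopwatch.StartNew();
        T result = action();
        stopwatch.Stop();
        return (result, stopwatch.Elapsed);
    }

    public static void Main()
    {
        // In-memory stand-in for a query. With EF you would pass something like
        // () => myContext.Documents.Where(d => d.ID == 2).ToList()
        var source = Enumerable.Range(1, 1000).AsQueryable();

        // ToList() inside the lambda makes sure the query really executes here.
        var (rows, elapsed) = Measure(() => source.Where(n => n == 2).ToList());
        Console.WriteLine($"{rows.Count} row(s) in {elapsed.TotalMilliseconds} ms");
    }
}
```

Timing the same query twice helps separate one-off costs such as view generation or connection setup from the steady-state cost of the query.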
Update
Apparently with code-first you can use .ToString() instead of .ToTraceString(), thanks Slauma for noticing.
I've just had a 5 sec delay in ExecuteFunction, on a stored procedure that runs instantaneously when called from SQL Management Studio. I fixed it by re-writing the procedure.
It appears that EF (and SSRS BTW) tries to do something like a "prepare" on the stored proc and for some (usually complex) procs that can take a very long time.
A quick and dirty solution is to duplicate and then replace your SP parameters with internal variables:
create proc ListOrders(@CountryID int = 3, @MaxOrderCount int = 20)
as
declare @CountryID1 int, @MaxOrderCount1 int
set @CountryID1 = @CountryID
set @MaxOrderCount1 = @MaxOrderCount
select top (@MaxOrderCount1) *
from Orders
where CountryID = @CountryID1
One of the few valid complaints I hear about EF4 vis-a-vis NHibernate is that EF4 is poor at handling lazily loaded collections. For example, on a lazily-loaded collection, if I say:
if (MyAccount.Orders.Count() > 0) ;
EF will pull the whole collection down (if it's not already), while NH will be smart enough to issue a select count(*)
NH also has some nice batch fetching to help with the select n + 1 problem. As I understand it, the closest EF4 can come to this is with the Include method.
Has the EF team let slip any indication that this will be fixed in their next iteration? I know they're hard at work on POCO, but this seems like it would be a popular fix.
What you describe is not the N+1 problem. An example of the N+1 problem is here. N+1 means that you execute N+1 selects instead of one (or two). In your example it would most probably mean:
// Lazy loads all N Orders in single select
foreach(var order in MyAccount.Orders)
{
// Lazy loads all Items for single order => executed N times
foreach(var orderItem in order.Items)
{
...
}
}
This is easily solved by:
// Eager load all Orders and their items in single query
foreach(var order in context.Accounts.Include("Orders.Items").Where(...))
{
...
}
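The difference in round trips can be sketched without a database. The FakeStore class below is purely hypothetical: it stands in for a data source and counts how often it is asked for data, which is the quantity the N+1 problem is about.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a database that counts round trips.
public class FakeStore
{
    private readonly Dictionary<int, List<string>> _itemsByOrder =
        new Dictionary<int, List<string>>
        {
            [1] = new List<string> { "pen", "ink" },
            [2] = new List<string> { "paper" },
            [3] = new List<string> { "stapler", "staples" },
        };

    public int RoundTrips { get; private set; }

    public List<int> LoadOrderIds()
    {
        RoundTrips++;
        return _itemsByOrder.Keys.ToList();
    }

    public List<string> LoadItems(int orderId)
    {
        RoundTrips++;
        return _itemsByOrder[orderId];
    }

    // Comparable to an eager load: everything arrives in one round trip.
    public Dictionary<int, List<string>> LoadOrdersWithItems()
    {
        RoundTrips++;
        return _itemsByOrder.ToDictionary(kv => kv.Key, kv => kv.Value);
    }
}

public static class NPlusOneDemo
{
    public static int CountItemsLazily(FakeStore store)
    {
        int total = 0;
        foreach (int orderId in store.LoadOrderIds())   // 1 round trip
        {
            total += store.LoadItems(orderId).Count;    // plus one per order
        }
        return total;
    }

    public static int CountItemsEagerly(FakeStore store)
        => store.LoadOrdersWithItems().Values.Sum(items => items.Count);

    public static void Main()
    {
        var lazy = new FakeStore();
        CountItemsLazily(lazy);
        var eager = new FakeStore();
        CountItemsEagerly(eager);
        // Three orders: the lazy path makes 1 + 3 = 4 round trips, the eager path 1.
        Console.WriteLine($"lazy: {lazy.RoundTrips}, eager: {eager.RoundTrips}");
    }
}
```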
Your example looks valid to me. You have a collection that exposes IEnumerable and you execute a Count operation on it. The collection is lazy loaded and the count is executed in memory. Translating a LINQ query to SQL is only possible on IQueryable, where expression trees represent the query. But an IQueryable represents a query, so each access means a new execution in the database; for example, checking Count in a loop would execute a DB query in each iteration.
So it is more about implementation of dynamic proxy.
Counting related entities without loading them is already possible in Code-first CTP5 (the final release will be called EF 4.1) when using DbContext instead of ObjectContext, but not by interacting with the collection directly. You will have to use something like:
int count = context.Entry(myAccount).Collection(a => a.Orders).Query().Count();
The Query method returns a prepared IQueryable, which is probably what EF runs when you use lazy loading, but you can modify the query further; here I used Count.
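The earlier point that a query is executed again on every access can be shown with a deferred in-memory sequence. This is only an analogy, assuming a hypothetical Source() iterator whose enumeration count stands in for the number of queries a database would receive; IQueryable over EF re-executes in the same way.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Incremented each time the source is enumerated from the start.
    public static int Executions;

    // Iterator method: the body only runs when someone enumerates the result.
    public static IEnumerable<int> Source()
    {
        Executions++;
        yield return 1;
        yield return 2;
        yield return 3;
    }

    // Counting twice on the deferred query enumerates the source twice.
    public static int CountTwiceDeferred()
    {
        Executions = 0;
        IEnumerable<int> query = Source().Where(n => n > 1);
        int first = query.Count();
        int second = query.Count();
        return Executions;
    }

    // Materializing once with ToList() and reading Count avoids re-execution.
    public static int CountTwiceMaterialized()
    {
        Executions = 0;
        List<int> results = Source().Where(n => n > 1).ToList();
        int first = results.Count;
        int second = results.Count;
        return Executions;
    }

    public static void Main()
    {
        Console.WriteLine($"deferred: {CountTwiceDeferred()} execution(s)");
        Console.WriteLine($"materialized: {CountTwiceMaterialized()} execution(s)");
    }
}
```

This is the trade-off behind the design: a lazily loaded collection pays once and then counts in memory, while a query object stays current but pays on every access.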
Greetings,
Considering the Northwind sample tables Customers, Orders, and OrderDetails, I would like to eager load the related entities corresponding to these tables, yet I need to order the child entities on the database before fetching them.
Basic case:
var someQueryable = from customer in northwindContext.Customers.Include("Orders.OrderDetails")
select customer;
but I also need to sort Orders and OrderDetails on the database side (before fetching those entities into memory) by some arbitrary column on those tables. Is that possible without a projection, as it is in T-SQL? It doesn't matter whether the solution uses e-SQL or LINQ to Entities. I searched the web but wasn't satisfied with the answers I found, since they mainly involve projecting data to an anonymous type and then re-querying that anonymous type to get the child entities in the desired order. Using CreateSourceQuery() doesn't seem to be an option for me either, since I need to get the data as it is on the database side, with eager loading but with the child entities ordered. That is, I want the "ORDER BY" applied in the database and then fetch the entities in that order. Thanks in advance for any guidance. As a personal note, please excuse the direct language, since I am kinda pissed at Microsoft for releasing EF in such an immature shape, even compared to LINQ to SQL (which they seem to be slowly moving away from). I hope this EF thingie will get much better and ship without significant bugs in the release version of .NET FX 4.0.
Actually I have a Tip that addresses exactly this issue.
Sorting of related entities is not 'supported', but using the projection approach Craig shows AND relying on something called 'Relationship Fixup' you can get something very similar working:
If you do this:
var projection = from c in ctx.Customers
select new {
Customer = c,
Orders = c.Orders.OrderByDescending(
o => o.OrderDate
)
};
foreach(var anon in projection )
{
anon.Orders //is sorted (because of the projection)
anon.Customer.Orders // is sorted too! because of relationship fixup
}
Which means if you do this:
var customers = projection.AsEnumerable().Select(x => x.Customer);
you will have customers that have sorted orders!
See the tip for more info.
Hope this helps
Alex
You are confusing two different problems. The first is how to materialize entities from the database; the second is how to retrieve an ordered list. The EntityCollection type is not an ordered list. In your example, customer.Orders is an EntityCollection.
On the other hand, if you want to get a list in a particular order, you can certainly do that; it just can't be in a property of type EntityCollection. For example:
from c in northwindContext.Customers
orderby c.SomeField
select new {
Name = c.Name,
Orders = from o in c.Orders
orderby o.SomeField
select new {
SomeField = o.SomeField
}
}
Note that there is no call to Include. Because I am projecting, it is unnecessary.
The Entity Framework may not work in the way you expect, coming from a LINQ to SQL background, but it does work. Be careful about condemning it before you understand it; deciding that it doesn't work will prevent you from learning how it does work.
Thank you both. I understand that I can use projection to achieve what I wanted, but I thought there might be an easier way, since in the T-SQL world it's perfectly possible with a few nested queries (or joins) and ORDER BYs. On the other hand, separation of concerns sounds reasonable and we are in the entity domain now, so I will go the way you both recommended, though I have to admit this is easier and cleaner to achieve in LINQ to SQL by using AssociateWith.
Kind regards.