EF eager loading from stored procedure - entity-framework

I've created a stored procedure to manage searching on my application. The stored procedure returns a collection of "cases".
When displaying my search results I reference a property of each "case" that is a linked table in the database. At the moment lazy loading makes the EF framework load the linked property for each "case".
Mini profiler shows this as duplicate trips to the database one for each object returned (possibly 200 objects) is there a way to say load all the linked properties in one go rather than leaving lazy loading to do it?

I don't think you can use .Include() directly but once you get the results from the stored procs you can run a single Linq query to get all the related results - something like:
var relatedReults = ctx.RelatedResults.Where(r => ids.Contains(r.Id)).ToArray();
where ids is an array of ids you build based on the results returned by your stored proc. EF should be able to fix up the relations so after running the query you should be able to access related entities without sending additional queries.

Related

JPA First level cache and when its filled

working with Spring data JPA and reading it Hibernate first level cache is missed, the answer says "Hibernate does not cache queries and query results by default. The only thing the first level cache is used is when you call EntityManger.find() you will not see a SQL query executing. And the cache is used to avoid object creation if the entity is already loading."
So, if If get an entity not by its Id but other criteria, if I update some property I should not see an update sql inside a transactional methods because it has not been stored int the first level cache, right?
According to the above answer, if I get some list of entities, they will not be stored in first level cache not matter the criteria I use to find them, right?
When a Transactional(propagation= Propagation.NEVER) method loads the same entity by its id two times, is not supposed it will hit the database two times because each loading will run in its own "transaction" and will have its own persistent context? What is the expected behaviour in this case?
Thanks

How does EntityFramework Core mange data internally?

I'm trying to understand how EntityFramework Core manages data internally because it influences how I call DbSets. Particularly, does it refer to in-memory data or re-query the database every time?
Example 1)
If I call _context.ToDo.Where(x => x.id == 123).First() and then in a different procedure call the same command again, will EF give me the in-memory value or re-query the DB?
Example 2)
If I call _context.ToDo.Where(x => x.id == 123).First() and then a few lines later call _context.ToDo.Find(123).Where(x => x.id == 123).Incude(x => x.Children).First(), will it use the in-memeory and then only query the DB for "Children" or does it recall the entire dataset?
I guess I'm wondering if it matters if I duplicate a call or not?
Is this affected by the AsNoTracking() switch?
What you really ask is how caching works in EF Core, not how DbContext manages data.
EF always offered 1st level caching - it kept the entities it loaded in memory, as long as the context remains alive. That's how it can track changes and save all of them when SaveChanges is called.
It doesn't cache the query itself, so it doesn't know that Where(....).First() is meant to return those specific entities. You'd have to use Find() instead. If tracking is disabled, no entities are kept around.
This is explained in Querying and Finding Entities, especially Finding entities using primary keys:
The Find method on DbSet uses the primary key value to attempt to find an entity tracked by the context. If the entity is not found in the context then a query will be sent to the database to find the entity there. Null is returned if the entity is not found in the context or in the database.
Find is different from using a query in two significant ways:
A round-trip to the database will only be made if the entity with the given key is not found in the context.
Find will return entities that are in the Added state. That is, Find will return entities that have been added to the context but have not yet been saved to the database.
In Example #2 the queries are different though. Include forces eager loading, so the results and entities returned are different. There's no need to call that a second time though, if the first entity and context are still around. You could just iterate over the Children property and EF would load the related entities one by one, using lazy loading.
EF will execute 1 query for each child item it loads. If you need to load all of them, this is slow. Slow enough to be have its own name, the N+1 selects problem. To avoid this you can load a related collection explicitly using explicit loading, eg. :
_context.Entry(todo).Collection(t=>t.Children).Load();
When you know you're going to use all children though, it's better to eagerly load all entities with Include().

How do I avoid large generated SQL queries in EF when using Include()

I'm using EF (dll version is 4.4) to query against a database. The database contains several tables with course information. When having a look what actually is sent to the db I see a massive, almost 1300 line SQL query (which I'm not going to paste here because of it's size). The query I'm running on the context looks like:
entities.Plans
.Include("program")
.Include("program.offers")
.Include("program.fees")
.Include("program.intakes")
.Include("program.requirements")
.Include("program.codes")
.Include("focuses")
.Include("codes")
.Include("exceptions")
.Include("requirements")
where plans.Code == planCode
select plans).SingleOrDefault();
I want to avoid having to go back to the server when collecting information from each of the related tables but with such a large query I'm wondering if there is there a better way of doing this?
Thanks.
A bit late but, as your data is only changing once a day, look at putting everything you need into an indexed view and place this view in your EF model.
You can usually add the .Include() after the where clause. This means that you're only pulling out information that matches what you want, see if that reduces your query any.
As you are performing an eager loading, so if you are choosing the required entities then its fine. Otherwise you can go with Lazy Loading, but as you specified that you don't want database round trip, so you can avoid it.
I would suggest, If this query is used multiple times then you can use compiled query. So that it will increase the performance.
Go through this link, if you want..
http://msdn.microsoft.com/en-us/library/bb896297.aspx
If you're using DbContext, you can use the .Local property on the context to look if your entity is already retrieved and therefore attached to the context.
If the query had run before and your root Plan entities are already attached based on Plan.Code == planId, presumably its sub-entities are also already attached since you did eager loading, so referring to them via the navigation properties won't hit the DB for them again during the lifetime of that context.
This article may be helpful in using .Local.
You may be able to get a slightly more concise SQL query by using projection rather than Include to pull back your referenced entities:
var planAggregate =
(from plan in entities.Plans
let program = plan.Program
let offers = program.Offers
let fees = program.Fees
//...
where plan.Code == planCode
select new {
plan
program,
offers,
fees,
//...
})
.SingleOrDefault();
If you disable lazy loading on your context this kind of query will result in the navigation properties of your entities being populated with the entities which were included in your query.
(I've only tested this on EF.dll v5.0, but it should behave the same on EF.dll v4.4, which is just EF5 on .NET 4.0. When I tested using this pattern rather than Include on a similarly shaped query it cut about 70 lines off of 500 lines of SQL. Your mileage may vary.)

Are navigation properties read each time they are accessed? (EF4.1)

I am using EF 4.1, with POCOs which are lazy loaded.
Some sample queries that I run:
var discountsCount = product.Discounts.Count();
var currentDiscountsCount = product.Discounts.Count(d=>d.IsCurrent);
var expiredDiscountsCount = product.Discounts.Count(d=>d.IsExpired);
What I'd like to know, is whether my queries make sense, or are poorly performant:
Am I hitting the database each time, or will the results come from cached data in the DbContext?
Is it okay to access the navigation properties "from scratch" each time, as above, or should I be caching them and then performing more queries on them, for example:
var discounts = product.Discounts;
var current = discounts.Count(d=>d.IsCurrent);
var expired = discounts.Count(d=>d.Expired);
What about a complicated case like below, does it pull the whole collection and then perform local operations on it, or does it construct a specialised SQL query which means that I cannot reuse the results to avoid hitting the database again:
var chained = discounts.OrderBy(d=>d.CreationDate).Where(d=>d.CreationDate < DateTime.Now).Count();
Thanks for the advice!
EDIT based on comments below
So once I call a navigation property (which is a collection), it will load the entire object graph. But what if I filtered that collection using .Count(d=>d...) or Select(d=>d...) or Min(d=>d...), etc. Does it load the entire graph as well, or only the final data?
product.Discounts (or any other navigation collection) isn't an IQueryable but only an IEnumerable. LINQ operations you perform on product.Discounts will never issue a query to the database - with the only exception that in case of lazy loading product.Discounts will be loaded once from the database into memory. It will be loaded completely - no matter which LINQ operation or filter you perform.
If you want to perform filters or any queries on navigation collections without loading the collection completely into memory you must not access the navigation collection but create a query through the context, for instance in your example:
var chained = context.Entry(product).Collection(p => p.Discounts).Query()
.Where(d => d.CreationDate < DateTime.Now).Count();
This would not load the Discounts collection of the product into memory but perform a query in the database and then return a single number as result. The next query of this kind would go to the database again.
In your examples above the Discounts collection should be populated by Ef the first time you access it. The subsequent linq queries on the Discount collection should then be performed in memory. This will even include the last complex expression.
You can also use the Include method to make sure you are getting back associated collection first time. example .Include("Discounts");
If your worried about performance I would recommend using SQL Profiler to have a look at what SQL is being executed.

Entity Framework - what's the difference between using Include/eager loading and lazy loading?

I've been trying to familiarize myself with the Entity Framework. Most of it seems straight forward, but I'm a bit confused on the difference between eager loading with the Include method and default lazy loading. Both seem like they load related entities, so on the surface it looks like they do the same thing. What am I missing?
Let's say you have two entities with a one-to-many relationship: Customer and Order, where each Customer can have multiple Orders.
When loading up a Customer entity, Entity Framework allows you to either eager load or lazy load the Customer's Orders collection. If you choose to eager load the Orders collection, when you retrieve a Customer out of the database Entity Framework will generate SQL that retrieves both the Customer's information and the Customer's Orders in one query. However, if you choose to lazy load the Orders collection, when you retrieve a Customer out of the database Entity Framework will generate SQL that only pulls the Customer's information (Entity Framework will then generate a separate SQL statement if you access the Customer's Orders collection later in your code).
Determining when to use eager loading and when to use lazy loading all comes down to what you expect to do with the entities you retrieve. If you know you only need a Customer's information, then you should lazy-load the Orders collection (so that the SQL query can be efficient by only retrieving the Customer's information). Conversely, if you know you'll need to traverse through a Customer's Orders, then you should eager-load the Orders (so you'll save yourself an extra database hit once you access the Customer's Orders in your code).
P.S. Be very careful when using lazy-loading as it can lead to the N+1 problem. For example, let's say you have a page that displays a list of Customers and their Orders. However, you decide to use lazy-loading when fetching the Orders. When you iterate over the Customers collection, then over each Customer's Orders, you'll perform a database hit for each Customer to lazy-load in their Orders collection. This means that for N customers, you'll have N+1 database hits (1 database hit to load up all the Customers, then N database hits to load up each of their Orders) instead of just 1 database hit had you used eager loading (which would have retrieved all Customers and their Orders in one query).
If you come from SQL world think about JOIN.
If you have to show in a grid 10 orders and the customer that put the order you have 2 choices:
1) LAZY LOAD ( = 11 queryes = SLOW PERFORMANCES)
EF will shot a query to retrieve the orders and a query for each order to retrieve the customer data.
Select * from order where order=1
+
10 x (Select * from customer where id = (order.customerId))
1) EAGER LOAD ( = 1 query = HIGH PERFORMANCES)
EF will shot a single query to retrieve the orders and customers with a JOIN.
Select * from orders INNER JOIN customers on orders.customerId=customer.Id where order=1
PS:
When you retrieve an object from the db, the object is stored in a cache while the context is active.
In the example that I made with LAZY LOAD, if all the 10 orders relate to the same customer you will see only 2 query because when you ask to EF to retrieve an object the EF will check if the object is in the cache and if it find it will not fire another SQL query to the DB.
Eager loading is intended to solve the N+1 Selects problem endemic to ORMs. The short version is this: If you are going to directly retrieve some number of entities and you know you will be accessing certain related entities via the retrieved entities, it is much more efficient to retrieve all the related entities up-front in one pass, as compared to retrieving them incrementally via lazy loading.
An important issue is serialization. Microsoft recommends NOT using the default lazy loading if you're dealing with serialized objects. Serialization causes ALL related properties to be called, which can start a chain reaction of related entities being queried. This really comes into play if you're returning JSON data from a controller. JSON data is obviously serialized. You'd either want to return data immediately via Eager or turn the lazyloading off in the context and employ Explicit Lazy loading.