Entity Framework generated queries for nested collections

Using Entity Framework 6.
Suppose I have an entity Parent with two nested collections ICollection<Child> and ICollection<Child2>. I want to fetch both eagerly:
dbContext.Parent.Include(p => p.Child).Include(p => p.Child2).ToList()
This generates a big query, which looks like this at a high level:
SELECT ... FROM (
SELECT (parent columns), (child columns), NULL as (child2 columns)
FROM Parent left join Child on ...
WHERE (filter on Parent)
UNION ALL
SELECT (parent columns), NULL as (child columns), (child2 columns)
FROM Parent left join Child2 on ...
WHERE (filter on Parent)
)
Is there a way to get Entity Framework to behave like batch fetch in NHibernate (or JPA, EclipseLink, Hibernate etc.) where you can specify that you want to query the parent table first, then each child table separately?
SELECT ... from Parent -- as usual
SELECT ... from Child where parent_id in (list of IDs)
SELECT ... from Child2 where parent_id in (list of IDs)
-- alternatively, you can specify EXISTS instead of IN LIST:
SELECT ... from Child where exists (select 1 from Parent where child.parent_id = parent.id and (where clause for parent))
I find this easier to understand and reason about, since it more closely resembles the SQL you would write if you were writing it by hand. Also, it prevents the redundant parent table rows in the result set. On the other hand, it's more round trips.

I do not believe this is possible with Entity Framework, at least using LINQ. At the end of the day, the ORM attempts to generate the query it considers most efficient. That being said, ORMs like Entity Framework don't always generate the nicest-looking or the most efficient SQL. My guess, and this is just a guess, is that Entity Framework is trying to reduce the number of round trips and the I/O, because I/O is expensive, relatively speaking.
If you are looking for fine-grained control over your SQL, I recommend you avoid ORMs, or do as I do: use Entity Framework for your basic CRUD and simple queries, and use stored procedures for your complex queries, such as complex reports. There is always ADO.NET too, but it seems like you are more intent on using an ORM.
You may find this useful as well. Basically, not much tuning is available. https://stackoverflow.com/a/22390400/2272004

Entity Framework misses out on many sophisticated features NHibernate offers. EF's unique selling point is its versatile LINQ support, but if you need declarative control over how an ORM fetches your data spanning multiple tables, EF is not the tool of choice. With EF, you can only try to find procedural tricks.
Let me illustrate that by showing what you'd need to do to achieve "batch fetching" with EF:
context.Configuration.LazyLoadingEnabled = false;
context.Children1.Where(c1 => parentIds.Contains(c1.ParentId)).Load();
context.Children2.Where(c2 => parentIds.Contains(c2.ParentId)).Load();
var parents = context.Parent.Where(p => parentIds.Contains(p.Id)).ToList();
This loads the required data into the context, and EF connects the parents and children by relationship fixup. The result is a parents list with their two child collections populated. But of course, it has several downsides:
You need to disable lazy loading, because even though the child collections are populated, they are not marked as loaded. Accessing them would still trigger lazy loading when it is enabled.
Repetitive code: you need to repeat the predicates three times. It's not really easy to avoid this.
Too specific. For each different scenario, even if they are almost identical, you have to write a new set of statements. Or make it configurable, which is still a procedural solution.
EF's current main production version (6) doesn't have a query batch facility. You need third-party tools like EntityFramework.Extended to run these queries in one database roundtrip.
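As a rough illustration of what the separate queries plus relationship fixup amount to, here is a minimal in-memory sketch. It uses plain LINQ to Objects rather than EF, and the Parent and Child classes, the FixUp helper, and the sample data are all hypothetical stand-ins, not part of EF's API:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entities in the question.
public class Parent
{
    public int Id { get; set; }
    public List<Child> Children { get; } = new List<Child>();
}

public class Child
{
    public int Id { get; set; }
    public int ParentId { get; set; }
}

public static class BatchFetchSketch
{
    // Attaches each child to its parent by foreign key, similar in spirit to
    // how relationship fixup connects separately loaded entities in a context.
    public static void FixUp(IEnumerable<Parent> parents, IEnumerable<Child> children)
    {
        var byParent = children.ToLookup(c => c.ParentId);
        foreach (var parent in parents)
        {
            // A lookup returns an empty sequence for a missing key.
            parent.Children.AddRange(byParent[parent.Id]);
        }
    }

    public static void Main()
    {
        // Stand-in for "query 1": the filtered parents.
        var parents = new List<Parent> { new Parent { Id = 1 }, new Parent { Id = 2 } };
        var parentIds = parents.Select(p => p.Id).ToList();

        // Stand-in for "query 2": children restricted to those parent IDs,
        // i.e. WHERE parent_id IN (list of IDs).
        var allChildren = new[]
        {
            new Child { Id = 10, ParentId = 1 },
            new Child { Id = 11, ParentId = 1 },
            new Child { Id = 12, ParentId = 3 }, // parent is outside the filter
        };
        var children = allChildren.Where(c => parentIds.Contains(c.ParentId));

        FixUp(parents, children);

        foreach (var parent in parents)
        {
            Console.WriteLine($"Parent {parent.Id}: {parent.Children.Count} children");
        }
    }
}
```

In EF itself the context does this stitching for you once the entities are loaded; the sketch only shows why the child queries need nothing more than the list of parent IDs.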

Related

Projection with Entity Splitting in Entity Framework 6

I am using entity splitting to split properties across multiple tables - http://msdn.microsoft.com/en-us/data/jj591617
This adds an inner join to the resulting SQL query.
I expected this join would only be included when the query projection includes the properties in the secondary table. This is not the case when I use an anonymous type to isolate (project) a subset of needed fields. The resulting SQL query only selects columns from the base table but still includes the join.
Is there any way to continue to use entity splitting and only include the join when necessary?

As far as I can tell, no. With Entity Framework, you can't lazy-load simple properties that are mapped to a column (which, in your case, would theoretically help you avoid the join); only navigation properties can be lazy-loaded. Perhaps you would need to employ table splitting instead to achieve the goal of eliminating the join.
For reference:
http://www.eidias.com/blog/2013/11/18/entity-framework-lazy-loading-properties

JPA/Eclipselink - Multiple entities in single table

I'm using Eclipselink to map my tables to entities.
I have one big database table (actually it's a view) with columns like groupId, groupName, categoryId, categoryName etc. I know it's redundant, but we're trying to minimize queries and it's a dynamically created view.
The question is: How to map such table to several entities like Group, Category etc?

You would probably be better off mapping to the real tables and using query optimization (such as join fetching and batch fetching) to reduce your queries.
See,
http://java-persistence-performance.blogspot.com/2010/08/batch-fetching-optimizing-object-graph.html
If you really want to have several classes map to the same table, you will need to have one Entity and make the rest Embeddables.
See,
http://en.wikibooks.org/wiki/Java_Persistence/Embeddables

ADO.NET Entity Framework - How to select data from one Table only (and ignore other tables)?

Background: the team I'm in has just started using the Entity Framework. First we designed the database and put all the table relationships, foreign keys, etc. in place; then, through Visual Studio, we added a new ADO.NET Entity Data Model, and auto-magically we got the generated edmx file representing the whole database!
Now I focus on two tables that provide data for all dropdowns and lookup lists:
TLookupDomain (domainID, domainName, domainDesc )
TLookup (lookupID, domainID, lookupCode, lookupDisplay, lookupDesc, sortOrder)
Relationship is 1-M going from left to right:
TLookupDomain -< TLookup -< TOther (+ another 30 or so other tables)
So lookupID is a foreign key in as many as 30 tables:
IQueryable<TLookup> qList = from l in ctx.TLookups
                            where l.domainID == 24
                            select l;
foreach (TLookup l in qList)
{
    // do something.
    System.Diagnostics.Debug.WriteLine("{0}\t{1}", l.lookupCode, l.lookupDisplay);
    foreach (TOther f in l.TOthers)
    {
        System.Diagnostics.Debug.WriteLine("{0}\t{1}", f.feeAmount, f.feeDesc);
    }
}
When I execute the above LINQ, I get all the fields for the TLookup table (which is fair), BUT data is also fetched for the 30 or so tables that are linked to it, even though I am NOT interested in the other tables' data at this point and am going to discard it as soon as LINQ fetches it.
I have two questions:
Q.1) Can I somehow modify the LINQ query above, or otherwise tell the Entity Framework not to bother fetching data from the 30 other linked tables?
Q.2) Is it "right" to have one edmx file that models the entire database? (Sounds dodgy to me.)

Configure Lazy Load to true for the model. Relations should be loaded only upon navigation. You can also split the models to avoid too many unnecessary relations.

Linq-to-Entities queries do not fetch anything automatically. Fetching of navigation properties is performed by either eager or lazy loading. You are not using eager loading, because that requires calling Include in the query (or ctx.LoadProperty separately). So if your data is fetched, it must be due to lazy loading, which is enabled by default. Lazy loading triggers once you access the navigation property in code.
You can also return only the data you need by using projections. Something like this should return readonly data:
var query = from l in ctx.TLookups
            where l.domainID == 24
            select new
            {
                l.lookupCode,
                l.lookupDisplay,
                l.TOthers
            };
Whether to have one EDMX or several is a common dilemma. Working with a single EDMX makes things simpler. If you want to know how to use multiple EDMXs and share conceptual definitions, check these two articles: Part 1, Part 2.

Select N+1 in next Entity Framework

One of the few valid complaints I hear about EF4 vis-a-vis NHibernate is that EF4 is poor at handling lazily loaded collections. For example, on a lazily-loaded collection, if I say:
if (MyAccount.Orders.Count() > 0) ;
EF will pull the whole collection down (if it's not already), while NH will be smart enough to issue a select count(*)
NH also has some nice batch fetching to help with the select n + 1 problem. As I understand it, the closest EF4 can come to this is with the Include method.
Has the EF team let slip any indication that this will be fixed in their next iteration? I know they're hard at work on POCO, but this seems like it would be a popular fix.

What you describe is not the N+1 problem. An example of the N+1 problem is here. N+1 means that you execute N+1 selects instead of one (or two). In your example it would most probably mean:
// Lazy loads all N Orders in single select
foreach(var order in MyAccount.Orders)
{
// Lazy loads all Items for single order => executed N times
foreach(var orderItem in order.Items)
{
...
}
}
This is easily solved by:
// Eager load all Orders and their items in single query
foreach(var order in context.Accounts.Include("Orders.Items").Where(...))
{
...
}
Your example looks valid to me. You have a collection that exposes IEnumerable, and you execute the Count operation on it. The collection is lazy loaded and the count is executed in memory. Translating a LINQ query to SQL is possible only on IQueryable, where expression trees represent the query. But IQueryable represents a query, so each access means a new execution in the database; for example, checking Count in a loop would execute a database query in each iteration.
So it is more about implementation of dynamic proxy.
Counting related entities without loading them is already possible in Code-first CTP5 (the final release will be called EF 4.1) when using DbContext instead of ObjectContext, but not by direct interaction with the collection. You will have to use something like:
int count = context.Entry(myAccount).Collection(a => a.Orders).Query().Count();
Query method returns prepared IQueryable which is probably what EF runs if you use lazy loading but you can further modify query - here I used Count.
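The difference between counting a loaded collection and re-running a query on each access can be sketched without a database. The following is only an analogy in plain LINQ to Objects: QueryOrders is a hypothetical iterator standing in for a database query, and the Executions counter stands in for round trips. It does not use EF or IQueryable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Counts how many times the stand-in "query" has actually run.
    public static int Executions;

    // Hypothetical stand-in for a database query: an iterator whose body
    // runs from the start every time the sequence is enumerated.
    public static IEnumerable<int> QueryOrders()
    {
        Executions++;
        yield return 1;
        yield return 2;
        yield return 3;
    }

    public static void Main()
    {
        // Creating the sequence runs nothing yet (deferred execution).
        IEnumerable<int> query = QueryOrders();
        for (int i = 0; i < 3; i++)
        {
            // Each Count() enumerates the sequence again, re-running the "query".
            Console.WriteLine(query.Count());
        }
        Console.WriteLine("Executions after query loop: " + Executions);

        // Materializing once is the analogue of a loaded collection.
        List<int> loaded = QueryOrders().ToList();
        for (int i = 0; i < 3; i++)
        {
            // List<T>.Count is an in-memory property; nothing is re-run.
            Console.WriteLine(loaded.Count);
        }
        Console.WriteLine("Executions after loaded loop: " + Executions);
    }
}
```

The first loop re-runs the stand-in query on every iteration, while the second loop only reads a property of the already materialized list, which is the trade-off described above.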

How to sort related entities with eager loading in ADO.NET Entity Framework

Greetings,
Considering the Northwind sample tables Customers, Orders, and OrderDetails, I would like to eager load the related entities corresponding to the tables mentioned above, and yet I need to order the child entities on the database before fetching the entities.
Basic case:
var someQueryable = from customer in northwindContext.Customers.Include("Orders.OrderDetails")
select customer;
but I also need to sort Orders and OrderDetails on the database side (before fetching those entities into memory) with respect to some random column on those tables. Is it possible without some projection, like it is in T-SQL? It doesn't matter whether the solution uses e-SQL or LINQ to Entities. I searched the web but I wasn't satisfied with the answers I found since they mainly involve projecting data to some anonymous type and then re-query that anonymous type to get the child entities in the order you like. Also using CreateSourceQuery() doesn't seem to be an option for me since I need to get the data as it is on the database side, with eager loading but just by ordering child entities. That is I want to do the "ORDER BY" before executing any query and then fetch the entities in the order I'd like. Thanks in advance for any guidance. As a personal note, please excuse the direct language since I am kinda pissed at Microsoft for releasing the EF in such an immature shape even compared to Linq to SQL (which they seem to be getting away slowly). I hope this EF thingie will get much better and without significant bugs in the release version of .NET FX 4.0.

Actually, I have a tip that addresses exactly this issue.
Sorting of related entities is not 'supported', but using the projection approach Craig shows AND relying on something called 'Relationship Fixup' you can get something very similar working:
If you do this:
var projection = from c in ctx.Customers
                 select new {
                     Customer = c,
                     Orders = c.Orders.OrderByDescending(
                         o => o.OrderDate
                     )
                 };
foreach (var anon in projection)
{
    var sorted = anon.Orders;              // is sorted (because of the projection)
    var alsoSorted = anon.Customer.Orders; // is sorted too, because of relationship fixup
}
Which means if you do this:
var customers = projection.AsEnumerable().Select(x => x.Customer);
you will have customers that have sorted orders!
See the tip for more info.
Hope this helps
Alex

You are confusing two different problems. The first is how to materialize entities in the database, the second is how to retrieve an ordered list. The EntityCollection type is not an ordered list. In your example, customer.Orders is an EntityCollection.
On the other hand, if you want to get a list in a particular order, you can certainly do that; it just can't be in a property of type EntityCollection. For example:
from c in northwindContext.Customers
orderby c.SomeField
select new {
    Name = c.Name,
    Orders = from o in c.Orders
             orderby o.SomeField
             select new {
                 SomeField = o.SomeField
             }
}
Note that there is no call to Include. Because I am projecting, it is unnecessary.
The Entity Framework may not work in the way you expect, coming from a LINQ to SQL background, but it does work. Be careful about condemning it before you understand it; deciding that it doesn't work will prevent you from learning how it does work.

Thank you both. I understand that I can use projection to achieve what I wanted, but I thought there might be an easy way to do it, since in the T-SQL world it's perfectly possible with a few nested queries (or joins) and ORDER BYs. On the other hand, separation of concerns sounds reasonable and we are in the entity domain now, so I will use the approach you both recommended, though I have to admit this is easier and cleaner to achieve in LINQ to SQL by using AssociateWith.
Kind regards.
