Materialize entity framework query - entity-framework

I am using entity framework 5 for a query like this:
var query =
from i in context.Instrument
from p in i.InstrumentPerformance // 1 : n
where p.PortfolioScenarioID == 6013
select i;
I want to store a queryable respresentation of this (filtered) query in memory. Ideally, I would be able to disconnect the context and still request a specific InstrumentPerformance collection like so:
var perf = query.First(i => i.InstrumentID == 407240).InstrumentPerformance;
But this - of course - does not produce the desired result, since the "perf" object will contain an InstrumentPerformance collection that contains every 1:n joined InstrumentPerformance entity (whether its PortfolioScenarioID is 6013 or not) and it will retrieve these entities via lazy loading, with context.ContextOptions.LazyLoadingEnabled = false (or the context runnning out of scope) the query will not yield anyting.
So this is far from where I want to get: an easy to query in-memory representation from the original query. I tried to materialize into dictionaries and similar approaches, but ended up coding custom data objects for the result which I would like to avoid.
So my question is: what is the recommended method to get such in-memory view?
EDIT:
I am currently using two dictionaries to cache the data, e.g:
var instruments = (
from i in context.Instrument
from p in i.InstrumentPerformance
where p.PortfolioScenarioID == 6013
select i)
.ToDictionary (i => p.InstrumentID, i => i);
var performances = (
from i in context.Instrument
from p in i.InstrumentPerformance
where p.PortfolioScenarioID == 6013
select p)
.ToDictionary (p => p.InstrumentID, p => p);
However, this requires two roundtrips to the database where one seems sufficient and more importantly the semantics for querying the performance data (which is now performances[InstrumentID]) is inconsistent with the EF way of querying (which should be instrument.InstrumentPerformance.First() and the like).

It is possible to retrieve the objects in one take first and then create the dictionaries by:
var query =
(from i in context.Instrument
select new {
i,
ps = i.InstrumentPerformance
.Where(p.PortfolioScenarioID == 6013)
}).AsEnumerable()
.Select(x => x.i);
This materializes and selects Instrument entities and, here's the trick, their partly loaded InstrumentPerformance collections. I.e. the instruments only contain InstrumentPerformance entities that meet the condition PortfolioScenarioID == 6013. This is because EF runs a process known as relationship fixup that ties child objects to the right parent object when they are fetched from the database.
So now you can dispose the context and any time after that do
var perf = query.First(i => i.InstrumentID == 407240).InstrumentPerformance;
or build your dictionaries using from i in query in stead of from i in context.Instrument.
IMPORTANT: lazy loading should be disabled, otherwise EF will still try to load the full collections when they are addressed.

Look at EntityCollection<T> CreateSourceQuery and Attach. I think you could do this (not tested):
var instrumentQuery =
from i in context.Instrument
from p in i.InstrumentPerformance // 1 : n
where p.PortfolioScenarioID == 6013
select i;
var instruments = instrumentQuery.ToList();
foreach (var instrument in instruments) {
var performanceQuery =
instrument.InstrumentPerformance.CreateSourceQuery()
.Where(p => p.PortfolioScenarioID == 6013);
instrument.InstrumentPerformance.Attach(performanceQuery);
}
This executes everything at once (no lazy loading) and has a bit of code duplication, but it would result in a list of Instrument where each i.InstrumentPerformance returns the filtered collection, meaning any subsequent code that operates on it can treat it like any other EF collection without needing to know the details of the query.

Related

Filtering related entities in entity framework 6

I want to fetch the candidate and the work exp where it is not deleted. I am using repository pattern in my c# app mvc.
Kind of having trouble filtering the record and its related child entities
I have list of candidates which have collection of workexp kind of throws error saying cannot build expression from the body.
I tried putting out anonymous object but error still persist, but if I use a VM or DTO for returning the data the query works.
It's like EF doesn't like newing up of the existing entity within its current context.
var candidate = dbcontext.candidate
.where(c=>c.candiate.ID == id).include(c=>c.WorkExperience)
.select(e=>new candidate
{
WorkExperience = e.WorkExperience.where(k=>k.isdeleted==false).tolist()
});
Is there any workaround for this?
You cannot call ToList in the expression that is traslated to SQL. Alternatively, you can start you query from selecting from WorkExperience table. I'm not aware of the structure of your database, but something like this might work:
var candidate = dbcontext.WorkExperience
.Include(exp => exp.Candidate)
.Where(exp => exp.isdeleted == false && exp.Candidate.ID == id)
.GroupBy(exp => exp.Candidate)
.ToArray() //query actually gets executed and return grouped data from the DB
.Select(groped => new {
Candidate = grouped.Key,
Experience = grouped.ToArray()
});
var candidate =
from(dbcontext.candidate.Include(c=>c.WorkExperience)
where(c=>c.candiate.ID == id)
select c).ToList().Select(cand => new candidate{WorkExperience = cand.WorkExperience.where(k=>k.isdeleted==false).tolist()});

Entity Framework: A better way to update lots of table records

I am new at Entity framework, and curious what the best way would be to update all tables with records of new data. I have a method which returns a list of objects with updated records. Most of the information stays the same; just two fields will be updated.
Currently I created two ways of doing that update.
The first one is to get data from the database table and iterate from both Lists to find a match and update that match:
var previousDatafromTable= db.Widgets.ToList();
var newDataReturnedFromMethod =.......
foreach (var d in previousDatafromTable)
{
foreach (var l in newDataReturnedFromMethod )
{
if (d.id == l.id)
{
d.PositionColumn = l.PositionColumn;
d.PositionRow = l.PositionRow;
}
}
The second one is:
foreach (var item in newDataReturnedFromMethod )
{
var model = db.Widgets.Find(item.id);
model.PositionColumn = item.PositionColumn;
model.PositionRow = item.PositionRow;
}
I am iterating through the updated data and updating my database table by ID.
So I am interested to know which method is the better way of doing this, and maybe there is an option in Entity Framework to measure the performance of these two tasks? Thanks for your time in answering.
Neither is really efficient.
The first option loops through newDataReturnedFromMethod for each iteration of previousDatafromTable. That's a lot of iterations.
The second options probably executes a database query for each iteration of newDataReturnedFromMethod.
It's far more efficient to join:
var query = from n in newDataReturnedFromMethod
join p in previousDatafromTable on n.id equals p.id
select new { n,p };
foreach (var pair in query)
{
pair.p.PositionColumn = pair.n.PositionColumn;
pair.p.PositionRow = pair.n.PositionRow;
}
EF doesn't have built-in performance measurements. You'd typically use a profiler for that, or the StopWatch class.

entity framework 4.0 multiple joins

This is my real world example.
I have 4 tables:
Person
Plan
Coverage
CoveredMembers
Each person can have many plans, each of those plans can have many coverages. Each of those coverages can have many CoveredMembers.
I need a query that will apply a filter on Plan.PlanType == 1 and CoveredMembers.TermDate == null.
This query should bring back any person who has a medical type plan that is not terminated.
This SQL statement would do just that:
SELECT Person.*, Plans.*, Coverages.*, CoveredMembers.*
FROM Person P
INNER JOIN Plan PL ON P.PersonID = PL.PersonID
INNER JOIN Coverage C on PL.PlanID = C.PlanID
INNER JOIN CoveredMember CM on C.CoverageID = CM.CoverageID
WHERE CM.TermDate = NULL AND PL.PlanType = 1
I have figured out how to do this using anonymous types, but I sometimes need to update the data and save back to the database - and anonymous types are read only.
I was given a solution that did work using JOIN but it only brought back the persons (albeit filtered the way I needed). I can then loop through each person:
foreach (var person in persons) {
foreach (var plan in person.Plans{
//do stuff
}
}
But wouldn't that make a db call for each iteration of the loop? I have 500 persons with 3 unterminated medical plans each, so it would call the db 1500 times?
This is why I want to bring the whole data tree from Persons to CoveredMembers back in one shot. Is this not possible?
I believe this is accomplished in two parts:
Your query to determine the people you wish to have returned based on your criteria as discussed in this question previously: Entity framework. Need help filtering results
Properly setting the navigation properties for entities you want brought together to be eagerly loaded: http://msdn.microsoft.com/en-us/data/jj574232.aspx
For example if your Person entity looks like:
public class Person {
public List<Plan> Plans {get; set;}
...
}
When returning data from the dbcontext you can also use explicit eager loading with the include option:
var people = context.People
.Include(p => p.Plans)
.ToList();
....
If these are nested - coverage is part of plan, etc (which it looks like, it goes something like):
var people = context.People
.Include(p => p.Plans.Select(pl=>pl.Coverage).Select(c=>c.CoveredMembers)))
.ToList();
....
I am making some assumptions about your data model here, and my code above probably needs a little tweaking.
EDIT:
I might need someone else to weigh in here, but I don't think you can add the where clause into an include like that (my example above leads you that way a bit by putting the include on the context object, instead return an IQueryable with your conditions set as solved in your first post (without a ToList() called on it) and then use the code you wrote above without the Where clauses:
From first post (you supplied different criteria in this one, but same concept)
var q = from q1 in dbContext.Parent
join q2 in dbContext.Children
on q1.key equals q2.fkey
join q3 in ........
where q4.col1 == 3000
select q1;
Then:
List<Person> people = q.Include(p => p.Plans
.Select(pl => pl.Coverages)
.Select(c => c.CoveredMembers).ToList();
Again, doing this without being able to troubleshoot - I am sure it would take me a few attempts to iron this one out too.

Entity Framework Timeout

I have been trying to figure out how to optimize the following query for the past few days and just not having much luck. Right now my test db is returning about 300 records with very little nested data, but it's taking 4-5 seconds to run and the SQL being generated by LINQ is awfully long (too long to include here). Any suggestions would be very much appreciated.
To sum up this query, I'm trying to return a somewhat flattened "snapshot" of a client list with current status. A Party contains one or more Clients who have Roles (ASPNET Role Provider), Journal is returning the last 1 journal entry of all the clients in a Party, same goes for Task, and LastLoginDate, hence the OrderBy and FirstOrDefault functions.
Guid userID = 'some user ID'
var parties = Parties.Where(p => p.BrokerID == userID).Select(p => new
{
ID = p.ID,
Title = p.Title,
Goal = p.Goal,
Groups = p.Groups,
IsBuyer = p.Clients.Any(c => c.RolesInUser.Any(r => r.Role.LoweredName == "buyer")),
IsSeller = p.Clients.Any(c => c.RolesInUser.Any(r => r.Role.LoweredName == "seller")),
Journal = p.Clients.SelectMany(c => c.Journals).OrderByDescending(j => j.OccuredOn).Select(j=> new
{
ID = j.ID,
Title = j.Title,
OccurredOn = j.OccuredOn,
SubCatTitle = j.JournalSubcategory.Title
}).FirstOrDefault(),
LastLoginDate = p.Clients.OrderByDescending(c=>c.LastLoginDate).Select(c=>c.LastLoginDate).FirstOrDefault(),
MarketingPlanCount = p.Clients.SelectMany(c => c.MarketingPlans).Count(),
Task = p.Tasks.Where(t=>t.DueDate != null && t.DueDate > DateTime.Now).OrderBy(t=>t.DueDate).Select(t=> new
{
ID = t.TaskID,
DueDate = t.DueDate,
Title = t.Title
}).FirstOrDefault(),
Clients = p.Clients.Select(c => new
{
ID = c.ID,
FirstName = c.FirstName,
MiddleName = c.MiddleName,
LastName = c.LastName,
Email = c.Email,
LastLogin = c.LastLoginDate
})
}).OrderBy(p => p.Title).ToList()
I think posting the SQL could give us some clues, as small things like the order of OrderBy coming before or after the projection could make a big difference.
But regardless, try extracting the Clients in a seperate query, this will simplify your query probably. And then include other tables like Journal and Tasks before projecting and see how this affects your query:
//am not sure what the exact query would be, and project it using ToList()
var clients = GetClientsForParty();
var parties = Parties.Include("Journal").Include("Tasks")
.Where(p=>p.BrokerID == userID).Select( p => {
....
//then use the in-memory clients
IsBuyer = clients.Any(c => c.RolesInUser.Any(r => r.Role.LoweredName == "buyer")),
...
}
)
In all cases, install EF profiler and have a look at how your query is affected. EF can be quiet surprising. Something like putting OrderBy before the projection, the same for all these FirstOrDefault or SingleOrDefault, they can all have a big effect.
And go back to the basics, if you are searching on LoweredRoleName, then make sure it is indexed so that the query is fast (even though that could be useless since EF could end up not making use of the covering index since it is querying so many other columns).
Also, since this is query is to view data (you will not alter data), don't forget to turn off Entity tracking, that will give you some performance boost as well.
And last, don't forget that you could always write your SQL query directly and project to your a ViewModel rather than anonymous type (which I see as a good practice anyhow) so create a class called PartyViewModel that includes the flatten view you are after, and use it with your hand-crafted SQL
//use your optimized SQL query that you write or even call a stored procedure
db.Database.SQLQuery("select * from .... join .... on");
I am writing a blog post about these issues around EF. The post is still not finished, but all in all, just be patient, use some of these tricks and observe their effect (and measure it) and you will reach what you want.

Can I get EntityFramework 4.1 to return queries that return results that consider data in both the DbSet.Local and database?

Say I have a table in a database called TestDB with a table called Table1 which contains two columns
ID
Name
In this table ,there is two row,
ID 1 and Name 'Row1'
ID 2 and Name 'Row2'
If this is my code
var currObj = Table1.Where(o => o.Name.Contains("Row1")).FirstOrDefault();
currObj.Name = "Row1a";
var a = Table1.Where(o => o.Name.Contains("Row1a")).FirstOrDefault();
var b = Table1.Local.Where(o => o.Name.Contains("Row1a")).FirstOrDefault();
a is going to return null, whereas b is going to return a value.
if I do this though
var c = Table1.Where(o => o.Name.Contains("Row2")).FirstOrDefault();
var d = Table1.Local.Where(o => o.Name.Contains("Row2")).FirstOrDefault();
then c will return something, but d will not.
For me, this seems unintuitive because there is two different places the data could be. For every query I do, I have to look in the database, and the Local object and merge them together. Like if it's changed in the local object, I have to take that into account. Does Entity Framework have any mechanism for this, which would consider BOTH the data in the database AND the local at the same time?
So if I went
var e = Table1.AMagicSolution.Where(o => o.Name.Contains("Row1a")).FirstOrDefault();
var f = Table1.AMagicSolution.Where(o => o.Name.Contains("Row2")).FirstOrDefault();
then e and f would both return something (f from the database and e from the Local object)
Your expectation is little bit strange. EF context works like a unit of work = it is used to process single logical transaction and because of that you should try to design your application in the way that it knows if the record is in the local storage or not.
If for any reason your application doesn't know if the record is in the local storage you should always separate queries to the local storage and to the database! Reasons are performance and unnecessary roundtrips to the database. Use helper like in following example to query the local storage first and only if record is not present in the local storage query the database:
public static Table1 AMagicSolution(this IQueryable<Table1> query, string name)
{
var item = Table1.Local.Where(t => t.Name.Contains(name)).FirstOrDefault();
if (item != null)
{
item = Table1.Where(t => t.Name.Contains(name)).FirstOrDefault();
}
return item;
}
Not much different from your other question.
Table1.ObjectStateManager.GetObjectStateEntries(EntityState.Added | EntityState.Unchanged)
.Where(o => o.GetType() == typeof(Table1Type)
&& o => o.Name.Contains("Row1a"))
.FirstOrDefault();
From the docs: "The EntityState is a bit field, so state entries for multiple states can be retrieved in one call by doing a bitwise OR of more than one EntityState values."