Entity Framework - Eager load two many-to-many relationships - entity-framework

Sorry this is so long, but I think I have included all the information needed to understand the problem.
I would like to load data from my database using eager loading.
The data is set up in five tables, forming two levels of m:n relations. Three of the tables contain data (ordered hierarchically from top to bottom):
CREATE TABLE [dbo].[relations](
[relation_id] [bigint] NOT NULL
)
CREATE TABLE [dbo].[ways](
[way_id] [bigint] NOT NULL
)
CREATE TABLE [dbo].[nodes](
[node_id] [bigint] NOT NULL,
[latitude] [int] NOT NULL,
[longitude] [int] NOT NULL
)
The first two really only consist of their own ID (to hook other data not relevant here into).
In between these three data tables are two m:n tables, with a sorting hint:
CREATE TABLE [dbo].[relations_ways](
[relation_id] [bigint] NOT NULL,
[way_id] [bigint] NOT NULL,
[sequence_id] [smallint] NOT NULL
)
CREATE TABLE [dbo].[ways_nodes](
[way_id] [bigint] NOT NULL,
[node_id] [bigint] NOT NULL,
[sequence_id] [smallint] NOT NULL
)
This is, essentially, a part of the OpenStreetMap data structure. I let Entity Framework build its objects from this database and it set up the classes exactly as the tables are.
The m:n tables really do exist as classes. (I understand that in EF you can model an m:n relation without an explicit in-between class. Should I try to change the object model in this way?)
What I want to do: my entry point is exactly one relation item.
I think it would be best to first eager load the middle m:n relation, and then iterate over that in a loop and eager load the lowest one. I tried to do that in the following way:
IQueryable<relation> query = context.relations;
query = query.Where( ... ); // filters down to exactly one
query = query.Include(r => r.relation_members);
relation rel = query.SingleOrDefault();
That loads the relation and all its 1:n info in just one trip to the database, which is good. But I noticed it only loads the 1:n table, not the middle data table "ways".
This does NOT change if I modify the line like so:
query = query.Include(r => r.relation_members.Select(rm => rm.way));
So it seems I cannot get the middle level loaded here?
What I cannot get working at all is loading the node level of data eagerly. I tried the following:
foreach (relation_member rm in rel.relation_members) {
IQueryable<way_node> query = rm.way.way_nodes.AsQueryable();
query = query.Include(wn => wn.node);
query.Load();
}
This does work and eagerly loads the middle level way and all 1:n info of way_node in one statement per iteration, but not the information from node (latitude/longitude). If I access one of these values, I trigger another trip to the database to load one single node object.
This last trip is deadly, since I want to load 1 relation -> 300 ways, with each way -> 2000 nodes. So in the end I am hitting the server 1 + 300 + 300*2000 times... room for improvement, I think.
But how? I cannot get this last statement written so that it is both valid syntax AND eager loading.
Out of interest: is there a way to load the whole object graph in one trip, starting with one relation?

Loading the whole graph in one roundtrip would be:
IQueryable<relation> query = context.relations;
query = query.Where( ... ); // filters down to exactly one
query = query.Include(r => r.relation_members
.Select(rm => rm.way.way_nodes
.Select(wn => wn.node)));
relation rel = query.SingleOrDefault();
However, since you say that the Include up to ...Select(rm => rm.way) didn't work, it is unlikely that this will work. (And even if it did work, the performance might be poor due to the complexity of the generated SQL and the amount of data and entities this query will return.)
The first thing you should investigate further is why .Include(r => r.relation_members.Select(rm => rm.way)) doesn't work, because it seems correct. Are your model and its mapping to the database correct?
The loop to get the nodes via explicit loading should look like this:
foreach (relation_member rm in rel.relation_members) {
context.Entry(rm).Reference(r => r.way).Query()
.Include(w => w.way_nodes.Select(wn => wn.node))
.Load();
}

Include() for some reason sometimes gets ignored when there is sorting/grouping/joining involved.
In most cases you can rewrite an Include() as a Select() into an anonymous intermediary object:
Before:
context.Invoices
.Include(invoice => invoice.Positions)
.ToList();
After:
context.Invoices
.Select(invoice => new {invoice, invoice.Positions})
.AsEnumerable()
.Select(x => x.invoice)
.ToList();
This way the query should never lose the Include() information.

// get the books associated with the given authors
var datatable = _dataContext.Authors
.Where(x => authorids.Contains(x.AuthorId))
.SelectMany(x => x.Books)
.Distinct();

Related

EF LINQ Get list of records by values not existing in another list

So, I have junction table called:
AppointmentsActivities
consisting of:
AppointmentID
ActivityID
I need to implement an Update operation. Since it's a junction table, the update should not only update existing records, but also insert new records and delete those that should no longer exist in the table (because I'm passing an entity with an AppointmentID and a list of ActivityIDs).
I'm struggling to delete the records that should no longer exist in the table.
I have to delete every record that has the same AppointmentID but whose ActivityID is not present in any of the objects in the new list of Activities.
The query I have written looks like this:
var remove = _context.AppointmentsActivities.
Where(i => i.AppointmentID == entity.ID && entity.Activities.Any(u => u.ActivityID != i.ActivityID)).
ToList();
Where:
i => i.AppointmentID == entity.ID
checks if the AppointmentID of the newly passed entity is the same as the one in the database table.
And:
entity.Activities.Any(u => u.ActivityID != i.ActivityID)
is supposed to check if any of the ActivityIDs in the list of Activities equals the ActivityID from the database table.
Obviously, I'm missing something, because EF cannot resolve this LINQ query. What am I missing? Any help will be appreciated. Thank you.
Try rewriting your LINQ query into a form EF can translate. Any over a local collection will not work, so replace it with Contains:
var activityIds = entity.Activities.Select(a => a.ActivityID).ToList();
var remove = _context.AppointmentsActivities
.Where(i => i.AppointmentID == entity.ID && !activityIds.Contains(i.ActivityID))
.ToList();
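Apart from the translation problem, the predicate in the question also tests the wrong thing: Any(u => u.ActivityID != i.ActivityID) is true as soon as the new list contains one ID that differs from the row's, which is almost always the case. A minimal in-memory sketch (plain LINQ to Objects, no database; the Link class, the method names and the sample IDs are made up for illustration) compares the two predicates and also derives the IDs that would still need to be inserted:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JunctionDiff
{
    // Hypothetical stand-in for one row of AppointmentsActivities.
    public sealed class Link
    {
        public int AppointmentID { get; set; }
        public int ActivityID { get; set; }
    }

    // Rows of the appointment whose ActivityID is absent from the new list.
    public static List<Link> RowsToRemove(
        IEnumerable<Link> existing, int appointmentId, IEnumerable<int> newActivityIds)
    {
        var keep = new HashSet<int>(newActivityIds);
        return existing
            .Where(l => l.AppointmentID == appointmentId && !keep.Contains(l.ActivityID))
            .ToList();
    }

    // The predicate from the question, evaluated in memory for comparison.
    public static List<Link> RowsMatchedByAny(
        IEnumerable<Link> existing, int appointmentId, IEnumerable<int> newActivityIds)
    {
        var ids = newActivityIds.ToList();
        return existing
            .Where(l => l.AppointmentID == appointmentId && ids.Any(id => id != l.ActivityID))
            .ToList();
    }

    // New ActivityIDs that have no row yet for this appointment.
    public static List<int> IdsToInsert(
        IEnumerable<Link> existing, int appointmentId, IEnumerable<int> newActivityIds)
    {
        var present = new HashSet<int>(existing
            .Where(l => l.AppointmentID == appointmentId)
            .Select(l => l.ActivityID));
        return newActivityIds.Where(id => !present.Contains(id)).Distinct().ToList();
    }
}
```

With existing rows (1,10), (1,11), (1,12) and a new list of 10 and 11, RowsToRemove yields only (1,12), while the Any version matches all three rows of appointment 1.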

How to use Where condition inside Include in entity framework LINQ?

My sample code lines are,
var question = context.EXTests
.Include(i => i.EXTestSections.Where(t => t.Status != (int)Status.InActive))
.Include(i => i.EXTestQuestions)
.FirstOrDefault(p => p.Id == testId);
Here Include does not support the Where clause. How can I modify the above code?
You have a sequence of ExTests. Every ExTest has zero or more ExTestSections. Every ExTest also has a property ExTestQuestions, which is probably also a sequence. Finally, every ExTest is identified by an Id.
You want a query that gets the first ExTest whose Id equals testId, including all its ExTestQuestions and some of its ExTestSections. You want only those ExTestSections that do not have an InActive status.
Use Select instead of Include
One of the slower parts of database queries is the transfer of the data from the DBMS to your process. Hence it is wise to limit it to only the data you actually plan to use.
It seems that you have designed a one-to-many relation between ExTests and its ExTestSections: every ExTest has zero or more ExTestSections and every ExTestSection belongs to exactly one ExTest. In databases this is done by giving the ExTestSection a foreign key to the ExTest that it belongs to. It might be that you've designed a many-to-many relation. The principle remains the same.
If you ask for an ExTest with its hundred ExTestSections, you get the Id of the ExTest and a hundred times the value of the foreign key of the ExTestSection, thus sending the same value 101 times. What a waste.
So if you query data from the database, only query for the data you actually plan to use.
Use Include if you plan to update the queried data, otherwise use Select
Back to your question
var result = myDbContext.EXTests
.Where(exTest => exTest.Id == testId)
.Select( exTest => new
{
// only select the properties you plan to use
Id = exTest.Id,
Name = exTest.Name,
Result = exTest.Result,
... // other properties
ExTestSections = exTest.EXTestSections
.Where(exTestSection => exTestSection.Status != (int)Status.InActive)
.Select(exTestSection => new
{
// again: select only those properties you actually plan to use
Id = exTestSection.Id,
// foreign key not needed, you know it equals ExTest primary key
// ExTestId = exTestSection.ExtTestId
... // other ExtestSection properties you plan to use
})
.ToList(),
ExTestQuestions = exTest.EXTestQuestions
.Select( ...) // only the properties you'll use
})
.FirstOrDefault();
I've moved the comparison with testId into a Where. This allows you to omit the Id of the requested item: you know it will equal testId, so there is no point in transferring it.

Trace Entity Framework 4.0 : Extra queries for foreign keys

In the following example, we insert an entity called taskinstance into our context. We have a foreign key FK_Contract that we set to 2.
entity.FK_Contract = 2;
context.TaskInstances.AddObject(entity);
The query generated by entity framework is a simple insert. (everything is fine)
However, the following query works differently.
int contractId = context.Contracts.Where((T) => T.Name == contractName).Single().Id;
entity.FK_Contract = contractId;
context.TaskInstances.AddObject(entity);
In the trace created by Entity Framework we see, without surprise, the query selecting the Id by contractName, but we also see an extra request that looks like this:
select id,... from [TaskInstances] WHERE [Extent1].[FK_Task] = @contractId
This extra query leads to many problems, especially when we work with a foreign table with millions of records. The network goes down!
Therefore we'd like to figure out the purpose of this extra query and how to make it disappear.
It looks like the extra query is populating a collection of tasks on the returned Contract object. Try projecting just the column you want:
int contractId = context.Contracts
.Where(T => T.Name == contractName)
.Select(T => T.Id)
.Single();

How to populate a property that is not directly stored in Database with CodeFirst

I have 2 entities, each of which is managed by EF Code First, and each happily sitting in its own table. Entity_A has a property called "SumTotal", which should be the sum of a specific column in Entity_B over the rows that match certain criteria.
SumTotal should not be persisted in the DB, but rather calculated each time an instance of Entity_A is retrieved.
I have looked at ComputedColumns, but it appears that a computed column can only be defined relative to columns in the same table.
I also have a feeling that I need to set SumTotal to NotMapped (or something similar with AutoGenerated), but I don't know how to get the actual value into SumTotal.
I hope this question makes sense. Thanks in advance.
You can project the results to an anonymous object and then copy the sum into your entity:
var projection = db.EntityAs.Where(/* */)
.Select(a => new { A = a, Sum = a.Bs.Sum(b => b.Total) })
.ToList(); // run the query once and keep the materialized results
foreach (var p in projection)
{
p.A.SumTotal = p.Sum;
}
var As = projection.Select(p => p.A).ToList();
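For context, here is a rough sketch of the entity shape this answer seems to assume. The class and property names (EntityA, EntityB, Bs, Total) follow the snippet above, but the exact model is an assumption. The [NotMapped] attribute shown lives in System.ComponentModel.DataAnnotations.Schema in current .NET; in EF 4.1 it was, as far as I recall, in System.ComponentModel.DataAnnotations. The sketch runs on in-memory lists with LINQ to Objects; against EF the source would be an IQueryable and the projection would be translated to SQL.

```csharp
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations.Schema;
using System.Linq;

public class EntityB
{
    public int Id { get; set; }
    public decimal Total { get; set; }
}

public class EntityA
{
    public int Id { get; set; }
    public List<EntityB> Bs { get; set; } = new List<EntityB>();

    // Excluded from the table mapping; filled in after the query runs.
    [NotMapped]
    public decimal SumTotal { get; set; }
}

public static class SumTotalLoader
{
    // Hypothetical helper: projects each A with its sum, then copies the sum onto the entity.
    public static List<EntityA> WithSumTotal(IEnumerable<EntityA> source)
    {
        var rows = source
            .Select(a => new { A = a, Sum = a.Bs.Sum(b => b.Total) })
            .ToList();

        foreach (var row in rows)
        {
            row.A.SumTotal = row.Sum;
        }

        return rows.Select(row => row.A).ToList();
    }
}
```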

Entity Framework 4.1 Code First - auto increment field on insert for non primary key

My model contains an Order (parent object) and Shipments (child objects). The database tables for these already have a surrogate key as an auto-increment primary key.
The business rule is that each shipment in an order needs an auto-generated "counter" field, e.g. Shipment 1, Shipment 2, Shipment 3, etc. The Shipment model has the properties "ShipmentId", "OrderId" and "ShipmentNumber". My attempted implementation is to make ShipmentNumber an int and, in code (as opposed to in the database), query the Shipment collection and do max() + 1.
Here's a code snippet of what I'm doing.
Shipment newShipmentObj = // blah;
int? currentMaxId = myOrderObj.Shipments
.Select(x => (int?) x.ShipmentNumber)
.Max();
if (currentMaxId.HasValue)
newShipmentObj.ShipmentNumber = currentMaxId.Value + 1;
else
newShipmentObj.ShipmentNumber = 1; // 1st one
myOrderObj.Shipments.Add(newShipmentObj);
// etc.. rest of EF4 code
Is there a better way?
I don't really like this because of potential transaction/concurrency issues, and I have the following problems.
My Order object also has an auto-increment "counter", e.g. Order 1, Order 2, Order 3, ... My Order model has the properties "OrderId", "CustomerId" and "OrderNumber".
My design is that I have an OrderRepository but not a ShipmentRepository. The ShipmentRepository could query off the Order.Shipment collection... but with Orders, I have to query directly off the dbcontext, e.g.
int? currentMaxId = _myDbContext.Orders
.Where(x => x.CustomerId == 123456)
.Select(x => (int?)x.OrderNumber)
.Max();
However, the above part doesn't work well if I attempt to add multiple objects to the DbContext without committing/saving changes to the database. (i.e. the .Where() returns null... and only works if I use DbContext ".Local", which is not what I want.)
Help! Not sure what the best solution would be. Thanks!
You seem to already have a ShipmentId that is incremental. You can use it for your shipment number, maybe combined with the current date, as described here: How to implement gapless, user-friendly IDs in NHibernate? What you are trying to do with Max() is evil. Stay away from it, as under high load it can assign the same shipment number to multiple shipments.
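The concurrency problem with Max() + 1 can be shown without a database. In the sketch below, two callers both read the current maximum before either has saved, so both receive the same number. The in-memory list and the NextNumber helper are hypothetical stand-ins for the Shipments table and the query in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShipmentNumbering
{
    // Same logic as in the question: highest existing number plus one, or 1 if there is none.
    public static int NextNumber(IEnumerable<int> savedNumbers)
    {
        int? currentMax = savedNumbers.Select(n => (int?)n).Max();
        return (currentMax ?? 0) + 1;
    }

    public static void Main()
    {
        var saved = new List<int> { 1, 2, 3 }; // numbers already committed

        // Two requests read the maximum before either one writes.
        int first = NextNumber(saved);
        int second = NextNumber(saved);

        saved.Add(first);
        saved.Add(second);

        // Both requests computed 4, so the list now holds a duplicate.
        Console.WriteLine(first == second); // prints "True"
    }
}
```

One possible safeguard, not something the answer above proposes, is a unique constraint on (OrderId, ShipmentNumber), so that a duplicate becomes a failed insert that can be retried instead of silently corrupted data.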