How to use Where condition inside Include in entity framework LINQ? - entity-framework

My sample code lines are,
var question = context.EXTests
.Include(i => i.EXTestSections.Where(t => t.Status != (int)Status.InActive))
.Include(i => i.EXTestQuestions)
.FirstOrDefault(p => p.Id == testId);
Here Include does not support the Where clause. How can I modify the code above?

You have a sequence of ExTests. Every ExTest has zero or more ExTestSections. Every ExTest also has a property ExTestQuestions, which is probably also a sequence. Finally, every ExTest is identified by an Id.
You want a query that gets the first ExTest whose Id equals testId, including all its ExTestQuestions and some of its ExTestSections: only those ExTestSections whose status is not InActive.
Use Select instead of Include
One of the slower parts of database queries is the transfer of the data from the DBMS to your process. Hence it is wise to limit it to only the data you actually plan to use.
It seems that you have designed a one-to-many relation between ExTests and its ExTestSections: every ExTest has zero or more ExTestSections and every ExTestSection belongs to exactly one ExTest. In databases this is done by giving the ExTestSection a foreign key to the ExTest that it belongs to. It might be that you've designed a many-to-many relation. The principle remains the same.
If you ask for an ExTest with its hundred ExTestSections, you get the Id of the ExTest and a hundred times the value of the foreign key of the ExTestSection, thus sending the same value 101 times. What a waste.
So if you query data from the database, only query for the data you actually plan to use.
Use Include if you plan to update the queried data, otherwise use Select
Back to your question
var result = myDbContext.EXTests
    .Where(exTest => exTest.Id == testId)
    .Select(exTest => new
    {
        // only select the properties you plan to use
        Id = exTest.Id,
        Name = exTest.Name,
        Result = exTest.Result,
        ... // other properties
        ExTestSections = exTest.EXTestSections
            .Where(exTestSection => exTestSection.Status != (int)Status.InActive)
            .Select(exTestSection => new
            {
                // again: select only those properties you actually plan to use
                Id = exTestSection.Id,
                // foreign key not needed, you know it equals ExTest primary key
                // ExTestId = exTestSection.ExTestId
                ... // other ExTestSection properties you plan to use
            })
            .ToList(),
        ExTestQuestions = exTest.EXTestQuestions
            .Select( ...) // only the properties you'll use
    })
    .FirstOrDefault();
I've moved the test for equal testId to a Where. This allows you to omit the Id of the requested item from the Select: you know it will equal testId, so there is no point in transferring it.
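To make the shape of that projection concrete, here is a minimal sketch that runs against an in-memory array instead of a DbContext. The data, the property names Name, Sections and Status, and the status values 0 and 1 are all assumptions made up for illustration; with Entity Framework the same Where inside Select would be translated to SQL rather than executed in memory.

```csharp
using System;
using System.Linq;

// In-memory stand-in for the EXTests table; all names and values are illustrative.
var tests = new[]
{
    new
    {
        Id = 1,
        Name = "Sample",
        Sections = new[]
        {
            new { Id = 10, Status = 0 },  // 0 = active (assumed)
            new { Id = 11, Status = 1 },  // 1 = inactive (assumed)
        },
    },
};

const int inactive = 1;
int testId = 1;

var result = tests
    .Where(t => t.Id == testId)
    .Select(t => new
    {
        t.Name,
        // The filter lives inside the projection, which is what Include could not do here.
        Sections = t.Sections
            .Where(s => s.Status != inactive)
            .Select(s => new { s.Id })
            .ToList(),
    })
    .FirstOrDefault();

Console.WriteLine($"{result?.Name}: {result?.Sections.Count} section(s)");
```

Only the section with Id 10 survives the filter, so the sketch prints one section for the sample test.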

Related

Validate data exists using EF core best practices

List<Guid> list = new List<Guid>();
I have a list of Guid. What is the best way to check whether all of the Guids exist in a table using EF Core?
I am currently using the code below, but the performance is very bad. Assume the user table has 1 million records.
For example:
public async Task<bool> IsIdListValid(IEnumerable<int> idList)
{
var validIds = await _context.User.Select(x => x.Id).ToListAsync();
return idList.All(x => validIds.Contains(x));
}
The performance is bad because you read every row of the table into memory and then iterate through it (ToList materializes the query). Try using the Any() method to let the database do the work, for example: bool exists = _context.User.Any(u => idList.Contains(u.Id));. This should translate to an SQL IN clause. Note that Any() only tells you whether at least one of the IDs exists, not whether all of them do.
Provided the number of IDs being sent in is kept reasonable, you could do the following:
var idCount = _context.User.Where(x => idList.Contains(x.Id)).Count();
return idCount == idList.Count();
This assumes that you are comparing on a unique constraint like the PK. We get a count of how many rows have a matching ID from the list, then compare that to the count of IDs sent.
If you're passing a large number of IDs, you would need to break the list up into reasonable sets, as there are limits to what you can do with an IN clause and potential performance costs as well.
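One way to break the list up is sketched below. It is only an illustration under stated assumptions: userIds is an in-memory stand-in for _context.User.Select(u => u.Id), the helper name AllIdsExist and the batch size of 200 are hypothetical, and Enumerable.Chunk requires .NET 6 or later. Duplicates are removed first because the count comparison would otherwise fail for a list that repeats a valid ID.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for _context.User.Select(u => u.Id); with EF Core the Contains
// below would be translated to an SQL IN clause instead of running in memory.
IQueryable<int> userIds = Enumerable.Range(1, 1000).AsQueryable();

// Hypothetical helper: checks the IDs in batches so that no single IN list gets too long.
bool AllIdsExist(IEnumerable<int> idList, int batchSize = 200)
{
    // Duplicates in the input would otherwise make the counts disagree.
    List<int> distinctIds = idList.Distinct().ToList();
    int found = 0;
    foreach (int[] batch in distinctIds.Chunk(batchSize))
    {
        List<int> batchList = batch.ToList();
        found += userIds.Count(id => batchList.Contains(id));
    }
    return found == distinctIds.Count;
}

Console.WriteLine(AllIdsExist(new[] { 1, 2, 2, 500 }));  // True
Console.WriteLine(AllIdsExist(new[] { 1, 2, 1001 }));    // False
```

Each batch costs one round trip, so the batch size is a trade-off between the number of queries and the length of each IN list; the right value depends on the database and is worth measuring.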

'Client side GroupBy is not supported.' [duplicate]

I am trying to run a GroupBy() command against the Northwind db. This is my code:
using(var ctx = new TempContext())
{
var customer = (from s in ctx.Customers
group s by s.LastName into custByLN
select custByLN);
foreach(var val in customer)
{
Console.WriteLine(val.Key);
{
foreach(var element in val)
{
Console.WriteLine(element.LastName);
}
}
}
}
It gives System.InvalidOperationException: 'Client side GroupBy is not supported'.
Apparently you are trying to make groups of Customers with the same value for LastName. Some database management systems don't support GroupBy, although this is very rare, as Grouping is a very common database action.
To see if your database management system supports grouping, try the GroupBy using method syntax. End with ToList, to execute the GroupBy:
var customerGroupsWithSameLastName = dbContext.Customers.GroupBy(
// Parameter KeySelector: make groups of Customers with same value for LastName:
customer => customer.LastName)
.ToList();
If this works, the DBMS that your DbContext communicates with accepts GroupBy.
The result is a List of groups. Every Group object implements IGrouping<string, Customer>, which means that every Group has a Key: the common LastName of all Customers in this group. The group IS (not HAS) a sequence of all Customers that have this LastName.
By the way: a more useful overload of GroupBy has an extra parameter: resultSelector. With the resultSelector you can influence the output: it is not a sequence of IGrouping objects, but a sequence of objects that you specify with a function.
This function has two input parameters: the common LastName, and all Customers with this LastName value. The return value of this function is one of the elements of your output sequence:
var result = dbContext.Customers.GroupBy(
customer => customer.LastName,
// parameter resultSelector: take the lastName and all Customers with this LastName
// to make one new:
(lastName, customersWithThisLastName) => new
{
LastName = lastName,
Count = customersWithThisLastName.Count(),
FirstNames = customersWithThisLastName.Select(customer => customer.FirstName)
.ToList(),
... // etc
})
.ToList();
Back to your question
If the above code showed you that the function is not supported by your DBMS, you can let your local process do the grouping:
var result = dbContext.Customers
// if possible: limit the number of customers that you fetch
.Where(customer => ...)
// if possible: limit the customer properties that you fetch
.Select(customer => new {...})
// Transfer the remaining data to your local process:
.AsEnumerable()
// Now your local process can do the GroupBy:
.GroupBy(customer => customer.LastName)
.ToList();
Since you selected the complete Customer, all Customer data would have been transferred anyway, so it is not a big loss if you let your local process do the GroupBy, apart maybe that the DBMS is probably more optimized to do grouping faster than your local process.
Warning: Database management systems are extremely optimized in selecting data. One of the slower parts of a database query is the transfer of the selected data from the DBMS to your local process. So if you have to use AsEnumerable(), you should realize that you will transfer all data that is selected until now. Make sure that you don't transfer anything that you won't use anyhow after the AsEnumerable(); so if you are only interested in the FirstName and LastName, don't transfer primary keys, foreign keys, addresses, etc. Let your DBMS do the Where and Select.
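The client-side fallback can be tried out without a database. The sketch below uses an in-memory array wrapped in AsQueryable() as a stand-in for dbContext.Customers; the customer names are made up. As far as I know, in EF Core 3.0 and later this exception usually appears when the query returns the groups themselves rather than an aggregate per group, because such a GroupBy cannot be translated to SQL and is no longer silently evaluated on the client, so AsEnumerable() before the GroupBy is the usual workaround.

```csharp
using System;
using System.Linq;

// In-memory stand-in for dbContext.Customers; the names are made up.
var customers = new[]
{
    new { FirstName = "Ann", LastName = "Smith" },
    new { FirstName = "Bob", LastName = "Jones" },
    new { FirstName = "Cid", LastName = "Smith" },
}.AsQueryable();

var groups = customers
    .Select(c => new { c.FirstName, c.LastName })  // fetch only what is needed
    .AsEnumerable()                                 // everything after this runs in the local process
    .GroupBy(c => c.LastName)
    .ToList();

foreach (var group in groups)
{
    Console.WriteLine($"{group.Key}: {string.Join(", ", group.Select(c => c.FirstName))}");
}
// Smith: Ann, Cid
// Jones: Bob
```

The groups come out in the order in which each key first appears, which is why Smith is listed before Jones.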

Entity Framework - Eager load two many-to-many relationships

Sorry for this being so long, but at least I think I got all info to be able to understand and maybe help?
I would like to load data from my database using eager loading.
The data is set up in five tables, setting up two Levels of m:n relations. So there are three tables containing data (ordered in a way of hierarchy top to bottom):
CREATE TABLE [dbo].[relations](
[relation_id] [bigint] NOT NULL
)
CREATE TABLE [dbo].[ways](
[way_id] [bigint] NOT NULL
)
CREATE TABLE [dbo].[nodes](
[node_id] [bigint] NOT NULL,
[latitude] [int] NOT NULL,
[longitude] [int] NOT NULL
)
The first two really only consist of their own ID (to hook other data not relevant here into).
In between these three data tables are two m:n tables, with a sorting hint:
CREATE TABLE [dbo].[relations_ways](
[relation_id] [bigint] NOT NULL,
[way_id] [bigint] NOT NULL,
[sequence_id] [smallint] NOT NULL
)
CREATE TABLE [dbo].[ways_nodes](
[way_id] [bigint] NOT NULL,
[node_id] [bigint] NOT NULL,
[sequence_id] [smallint] NOT NULL
)
This is, essentially, a part of the OpenStreetMap data structure. I let Entity Framework build its objects from this database, and it set up the classes exactly as the tables are.
The m:n tables really do exist as classes. (I understand that in EF you can build an m:n relation in your object model without an explicit in-between class. Should I try to change the object model in this way?)
What I want to do: My entry point is exactly one item of relation.
I think it would be best to first eager load the middle m:n relation, and then in a loop iterate over that and eager load the lowest one. I try to do that in the following way
IQueryable<relation> query = context.relations;
query = query.Where( ... ); // filters down to exactly one
query = query.Include(r => r.relation_members);
relation rel = query.SingleOrDefault();
That loads the relation and all its 1:n info in just one trip to the database - ok, good. But I noticed it only loads the 1:n table, not the middle data table "ways".
This does NOT change if I modify the line like so:
query = query.Include(r => r.relation_members.Select(rm => rm.way));
So I cannot get the middle level loaded here, it seems?
What I cannot get working at all is load the node level of data eagerly. I tried the following:
foreach (relation_member rm in rel.relation_members) {
IQueryable<way_node> query = rm.way.way_nodes.AsQueryable();
query = query.Include(wn => wn.node);
query.Load();
}
This does work and eagerly loads the middle level way and all 1:n info of way_node in one statement for each iteration, but not the information from node (latitude/longitude). If I access one of these values I trigger another trip to the database to load one single node object.
This last trip is deadly, since I want to load 1 relation -> 300 ways, with each way -> 2000 nodes. So in the end I am hitting the server 1 + 300 + 300*2000 times... room for improvement, I think.
But how? I cannot get this last statement written in valid syntax AND eager loading.
Out of interest; is there a way to load the whole object graph in one trip, starting with one relation?
Loading the whole graph in one roundtrip would be:
IQueryable<relation> query = context.relations;
query = query.Where( ... ); // filters down to exactly one
query = query.Include(r => r.relation_members
.Select(rm => rm.way.way_nodes
.Select(wn => wn.node)));
relation rel = query.SingleOrDefault();
However, since you say that the Include up to ...Select(rm => rm.way) didn't work, it is unlikely that this will work. (And if it did work, performance would probably be poor because of the complexity of the generated SQL and the amount of data and entities this query returns.)
The first thing you should investigate further is why .Include(r => r.relation_members.Select(rm => rm.way)) doesn't work, because it seems correct. Are your model and its mapping to the database correct?
The loop to get the nodes via explicit loading should look like this:
foreach (relation_member rm in rel.relation_members) {
context.Entry(rm).Reference(r => r.way).Query()
.Include(w => w.way_nodes.Select(wn => wn.node))
.Load();
}
Include() for some reason sometimes gets ignored when there is sorting/grouping/joining involved.
In most cases you can rewrite an Include() as a Select() into an anonymous intermediary object:
Before:
context.Invoices
.Include(invoice => invoice.Positions)
.ToList();
After:
context.Invoices
.Select(invoice => new {invoice, invoice.Positions})
.AsEnumerable()
.Select(x => x.invoice)
.ToList();
This way the query should never lose the Include() information.
//get an associate book to an author
var datatable = _dataContext.Authors
.Where(x => authorids.Contains(x.AuthorId))
.SelectMany(x => x.Books)
.Distinct();

Trace Entity Framework 4.0 : Extra queries for foreign keys

In the following example, we insert an entity called taskinstance into our context. We have a foreign key FK_Contract that we set to 2.
entity.FK_Contract = 2;
context.TaskInstances.AddObject(entity);
The query generated by entity framework is a simple insert. (everything is fine)
However, the following query works differently.
int contractId = context.Contracts.Where((T) => T.Name == contractName).Single().Id;
entity.FK_Contract = contractId;
context.TaskInstances.AddObject(entity);
In the trace created by Entity Framework we see, unsurprisingly, the query selecting the Id by contractName, but we also see an extra request that looks like:
select id,... from [TaskInstances] WHERE [Extent1].[FK_Task] = #contractId
This extra query leads to many problems, especially when we work with a foreign table with millions of records. The network goes down!
Therefore we'd like to figure out the purpose of this extra query and how to make it disappear.
It looks like the extra query is populating a collection of tasks on the returned Contract object. Try projecting just the column you want:
int contractId = context.Contracts
.Where(T => T.Name == contractName)
.Select(T => T.Id)
.Single();

Entity Framework Timeout

I have been trying to figure out how to optimize the following query for the past few days and just not having much luck. Right now my test db is returning about 300 records with very little nested data, but it's taking 4-5 seconds to run and the SQL being generated by LINQ is awfully long (too long to include here). Any suggestions would be very much appreciated.
To sum up this query, I'm trying to return a somewhat flattened "snapshot" of a client list with current status. A Party contains one or more Clients who have Roles (ASPNET Role Provider), Journal is returning the last 1 journal entry of all the clients in a Party, same goes for Task, and LastLoginDate, hence the OrderBy and FirstOrDefault functions.
Guid userID = 'some user ID'
var parties = Parties.Where(p => p.BrokerID == userID).Select(p => new
{
ID = p.ID,
Title = p.Title,
Goal = p.Goal,
Groups = p.Groups,
IsBuyer = p.Clients.Any(c => c.RolesInUser.Any(r => r.Role.LoweredName == "buyer")),
IsSeller = p.Clients.Any(c => c.RolesInUser.Any(r => r.Role.LoweredName == "seller")),
Journal = p.Clients.SelectMany(c => c.Journals).OrderByDescending(j => j.OccuredOn).Select(j=> new
{
ID = j.ID,
Title = j.Title,
OccurredOn = j.OccuredOn,
SubCatTitle = j.JournalSubcategory.Title
}).FirstOrDefault(),
LastLoginDate = p.Clients.OrderByDescending(c=>c.LastLoginDate).Select(c=>c.LastLoginDate).FirstOrDefault(),
MarketingPlanCount = p.Clients.SelectMany(c => c.MarketingPlans).Count(),
Task = p.Tasks.Where(t=>t.DueDate != null && t.DueDate > DateTime.Now).OrderBy(t=>t.DueDate).Select(t=> new
{
ID = t.TaskID,
DueDate = t.DueDate,
Title = t.Title
}).FirstOrDefault(),
Clients = p.Clients.Select(c => new
{
ID = c.ID,
FirstName = c.FirstName,
MiddleName = c.MiddleName,
LastName = c.LastName,
Email = c.Email,
LastLogin = c.LastLoginDate
})
}).OrderBy(p => p.Title).ToList();
I think posting the SQL could give us some clues, as small things like the order of OrderBy coming before or after the projection could make a big difference.
But regardless, try extracting the Clients into a separate query; this will probably simplify your query. Then include other tables like Journal and Tasks before projecting and see how this affects your query:
//am not sure what the exact query would be, and project it using ToList()
var clients = GetClientsForParty();
var parties = Parties.Include("Journal").Include("Tasks")
.Where(p=>p.BrokerID == userID).Select( p => {
....
//then use the in-memory clients
IsBuyer = clients.Any(c => c.RolesInUser.Any(r => r.Role.LoweredName == "buyer")),
...
}
)
In all cases, install EF profiler and have a look at how your query is affected. EF can be quite surprising. Things like putting OrderBy before the projection, and the same for all these FirstOrDefault or SingleOrDefault calls, can all have a big effect.
And go back to the basics: if you are searching on LoweredRoleName, then make sure it is indexed so that the query is fast (even though that could be useless, since EF could end up not making use of the covering index because it is querying so many other columns).
Also, since this query only views data (you will not alter data), don't forget to turn off entity tracking; that will give you some performance boost as well.
And last, don't forget that you could always write your SQL query directly and project to a ViewModel rather than an anonymous type (which I see as a good practice anyhow). So create a class called PartyViewModel that contains the flattened view you are after, and use it with your hand-crafted SQL:
//use your optimized SQL query that you write or even call a stored procedure
db.Database.SqlQuery<PartyViewModel>("select * from .... join .... on");
I am writing a blog post about these issues around EF. The post is still not finished, but all in all, just be patient, use some of these tricks and observe their effect (and measure it) and you will reach what you want.