Validate that data exists using EF Core best practices - entity-framework

List<Guid> list = new List<Guid>();
I have a list of Guids. What is the best way to check whether all of them exist in a table using EF Core?
I am currently using the code below, but the performance is very bad. Assume the User table has 1 million records.
For example:
public async Task<bool> IsIdListValid(IEnumerable<int> idList)
{
    var validIds = await _context.User.Select(x => x.Id).ToListAsync();
    return idList.All(x => validIds.Contains(x));
}

The performance is bad because you are reading every row of the table into memory and then iterating through it (ToList materializes the query). Try using the Any() method to take advantage of the strength of the database. Use something like the following: bool exists = _context.User.Any(u => idList.Contains(u.Id));. This should translate to an SQL IN clause. Note that Any() only tells you whether at least one of the IDs exists; it does not confirm that all of them do.

Provided you ensure that the number of IDs being sent in is kept reasonable, you could do the following:
var distinctIds = idList.Distinct().ToList();
var idCount = _context.User.Where(x => distinctIds.Contains(x.Id)).Count();
return idCount == distinctIds.Count;
This assumes that you are comparing on a unique constraint like the PK. We get a count of how many rows have a matching ID from the list, then compare that to the count of distinct IDs sent.
If you're passing a large number of IDs, you would need to break the list up into reasonable sets, as there are limits to what you can do with an IN clause and potential performance costs as well.
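Breaking the list into sets could look roughly like the sketch below. It is only an illustration under some assumptions: existingIds stands in for a query such as _context.User.Select(x => x.Id), the IDs in the table are unique, the helper name AllIdsExist and the default batch size of 1000 are made up for this example, and Enumerable.Chunk requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IdValidation
{
    // Returns true when every distinct ID in idList is present in existingIds.
    // existingIds is a stand-in for something like _context.User.Select(x => x.Id);
    // batchSize is an arbitrary illustrative value, not a tuned recommendation.
    public static bool AllIdsExist(
        IQueryable<Guid> existingIds,
        IEnumerable<Guid> idList,
        int batchSize = 1000)
    {
        // Remove duplicates so the final count comparison is meaningful.
        var distinctIds = idList.Distinct().ToList();
        var found = 0;

        foreach (var batch in distinctIds.Chunk(batchSize))
        {
            var batchList = batch.ToList();
            // Against a database provider, this Contains is the part that
            // would be sent as an IN-style filter, one batch at a time.
            found += existingIds.Count(id => batchList.Contains(id));
        }

        // Assumes the IDs in existingIds are unique (e.g. a primary key).
        return found == distinctIds.Count;
    }
}
```

Against an in-memory list the same method can be tried with new List<Guid> { ... }.AsQueryable() as the first argument.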

Related

How to use Where condition inside Include in entity framework LINQ?

My sample code lines are,
var question = context.EXTests
    .Include(i => i.EXTestSections.Where(t => t.Status != (int)Status.InActive))
    .Include(i => i.EXTestQuestions)
    .FirstOrDefault(p => p.Id == testId);
Here Include does not support a Where clause. How can I modify the code above?
You have a sequence of ExTests. Every ExTest has zero or more ExTestSections. Every ExTest also has a property ExTestQuestions, which is probably also a sequence. Finally, every ExTest is identified by an Id.
You want a query that gets the first ExTest whose Id equals testId, including all its ExTestQuestions and some of its ExTestSections. You want only those ExTestSections whose status is not InActive.
Use Select instead of Include
One of the slower parts of database queries is the transfer of the data from the DBMS to your process. Hence it is wise to limit it to only the data you actually plan to use.
It seems that you have designed a one-to-many relation between ExTests and its ExTestSections: every ExTest has zero or more ExTestSections and every ExTestSection belongs to exactly one ExTest. In databases this is done by giving the ExTestSection a foreign key to the ExTest that it belongs to. It might be that you've designed a many-to-many relation. The principle remains the same.
If you ask for an ExTest with its hundred ExTestSections, you get the Id of the ExTest and a hundred times the value of the foreign key of the ExTestSection, thus sending the same value 101 times. What a waste.
So if you query data from the database, only query for the data you actually plan to use.
Use Include if you plan to update the queried data, otherwise use Select
Back to your question
var result = myDbContext.EXTests
    .Where(exTest => exTest.Id == testId)
    .Select(exTest => new
    {
        // only select the properties you plan to use
        Id = exTest.Id,
        Name = exTest.Name,
        Result = exTest.Result,
        ... // other properties
        ExTestSections = exTest.EXTestSections
            .Where(exTestSection => exTestSection.Status != (int)Status.InActive)
            .Select(exTestSection => new
            {
                // again: select only those properties you actually plan to use
                Id = exTestSection.Id,
                // foreign key not needed, you know it equals ExTest primary key
                // ExTestId = exTestSection.ExTestId
                ... // other ExTestSection properties you plan to use
            })
            .ToList(),
        ExTestQuestions = exTest.EXTestQuestions
            .Select( ...) // only the properties you'll use
    })
    .FirstOrDefault();
I've moved the test on equality with testId into a Where. This allows you to omit the Id of the requested item from the projection: you know it will equal testId, so there is no point in transferring it.

How to insert an item at the top of a table

How do I insert an item at the top of a table in PostgreSQL? Is that possible? The table has only two fields, both text. The first is the primary key.
CREATE TABLE news_table (
title text not null primary key,
url text not null
);
I need a simple query for a Java program.
OK, this is my code:
get("/getnews", (request, response) -> {
    List<News> getNews = newsService.getNews();
    List<News> getAllNews = newsService.getAllNews();
    try (Connection connection = DB.sql2o.open()) {
        String sql = "INSERT INTO news_table(title, url) VALUES (:title, :url)";
        for (News news : getNews) {
            if (!getAllNews.contains(news)) {
                connection.createQuery(sql, true)
                    .addParameter("title", news.getTitle())
                    .addParameter("url", news.getUrl())
                    .executeUpdate()
                    .getKey();
            }
        }
    }
    return newsService.getNews();
}, json());
The problem is that when the getnews method is called a second time, the new news items are added at the end of the table, so the news is no longer in chronological order. How can I resolve this? I use Sql2o + sparkjava.
I think I may already know the answer: do I need to reverse the getNews list before checking which of its items are already contained in getAllNews?
There is no start or end in a table. If you want to sort your data, just use an ORDER BY in your SELECT statements. Without ORDER BY, there is no order.
Relational theory, the mathematical foundation of relational databases, lays down certain conditions that relations (represented in real databases as tables) must obey. One of them is that they have no ordering (i.e., the rows will neither be stored nor retrieved in any particular order, since they are treated as a mathematical set). It's therefore completely under the control of the RDBMS where a new row is entered into a table.
Hence there is no way to ensure a particular ordering of the data without using an ORDER BY clause when you retrieve the data.

What are the ways to optimize Entity Framework queries with Contains()?

We load a large object graph from the DB.
The query has many Includes, and Where() uses Contains() to filter the final result.
Contains is called on a collection containing about a thousand entries.
The profiler shows monstrous, human-unreadable SQL.
The query cannot be precompiled because of Contains().
Are there any ways to optimize such queries?
Update
public List<Vulner> GetVulnersBySecurityObjectIds(int[] softwareIds, int[] productIds)
{
    var sw = new Stopwatch();
    var query = from vulner in _businessModel.DataModel.VulnerSet
                join vt in _businessModel.DataModel.ObjectVulnerTieSet.Where(ovt => softwareIds.Contains(ovt.SecurityObjectId))
                on vulner.Id equals vt.VulnerId
                select vulner;
    var result = ((ObjectQuery<Vulner>)query.OrderBy(v => v.Id).Distinct())
        .Include("Descriptions")
        .Include("Data")
        .Include("VulnerStatuses")
        .Include("GlobalIdentifiers")
        .Include("ObjectVulnerTies")
        .Include("Object.ProductObjectTies.Product")
        .Include("VulnerComment");
    // If specific products were passed in, add filtering on them
    if (productIds.HasValues())
        result = (ObjectQuery<Vulner>)result.Where(v => v.Object.ProductObjectTies.Any(p => productIds.Contains(p.ProductId)));
    sw.Start();
    var str = result.ToTraceString();
    sw.Stop();
    Debug.WriteLine("Building the query took {0} seconds.", sw.Elapsed.TotalSeconds);
    sw.Restart();
    var list = result.ToList();
    sw.Stop();
    Debug.WriteLine("Fetching the vulnerabilities took {0} seconds.", sw.Elapsed.TotalSeconds);
    return list;
}
It's almost certain that splitting the query in pieces performs better, in spite of more db round trips. It is always advised to limit the number of includes, because they not only blow up the size and complexity of the query (as you noticed) but also blow up the result set both in length and in width. Moreover, they often get translated into outer joins.
Apart from that, using Contains the way you do is OK.
Sorry, it is hard to be more specific without knowing your data model and the size of the tables involved.

Entity Framework Timeout

I have been trying to figure out how to optimize the following query for the past few days and just not having much luck. Right now my test db is returning about 300 records with very little nested data, but it's taking 4-5 seconds to run and the SQL being generated by LINQ is awfully long (too long to include here). Any suggestions would be very much appreciated.
To sum up this query, I'm trying to return a somewhat flattened "snapshot" of a client list with current status. A Party contains one or more Clients who have Roles (ASPNET Role Provider), Journal is returning the last 1 journal entry of all the clients in a Party, same goes for Task, and LastLoginDate, hence the OrderBy and FirstOrDefault functions.
Guid userID = 'some user ID'
var parties = Parties.Where(p => p.BrokerID == userID).Select(p => new
{
    ID = p.ID,
    Title = p.Title,
    Goal = p.Goal,
    Groups = p.Groups,
    IsBuyer = p.Clients.Any(c => c.RolesInUser.Any(r => r.Role.LoweredName == "buyer")),
    IsSeller = p.Clients.Any(c => c.RolesInUser.Any(r => r.Role.LoweredName == "seller")),
    Journal = p.Clients.SelectMany(c => c.Journals).OrderByDescending(j => j.OccuredOn).Select(j => new
    {
        ID = j.ID,
        Title = j.Title,
        OccurredOn = j.OccuredOn,
        SubCatTitle = j.JournalSubcategory.Title
    }).FirstOrDefault(),
    LastLoginDate = p.Clients.OrderByDescending(c => c.LastLoginDate).Select(c => c.LastLoginDate).FirstOrDefault(),
    MarketingPlanCount = p.Clients.SelectMany(c => c.MarketingPlans).Count(),
    Task = p.Tasks.Where(t => t.DueDate != null && t.DueDate > DateTime.Now).OrderBy(t => t.DueDate).Select(t => new
    {
        ID = t.TaskID,
        DueDate = t.DueDate,
        Title = t.Title
    }).FirstOrDefault(),
    Clients = p.Clients.Select(c => new
    {
        ID = c.ID,
        FirstName = c.FirstName,
        MiddleName = c.MiddleName,
        LastName = c.LastName,
        Email = c.Email,
        LastLogin = c.LastLoginDate
    })
}).OrderBy(p => p.Title).ToList();
I think posting the SQL could give us some clues, as small things like whether OrderBy comes before or after the projection can make a big difference.
But regardless, try extracting the Clients into a separate query; this will probably simplify your query. Then include other tables like Journal and Tasks before projecting and see how this affects your query:
//am not sure what the exact query would be, and project it using ToList()
var clients = GetClientsForParty();
var parties = Parties.Include("Journal").Include("Tasks")
.Where(p=>p.BrokerID == userID).Select( p => {
....
//then use the in-memory clients
IsBuyer = clients.Any(c => c.RolesInUser.Any(r => r.Role.LoweredName == "buyer")),
...
}
)
In all cases, install EF profiler and have a look at how your query is affected. EF can be quite surprising. Things like putting OrderBy before the projection, and likewise for all these FirstOrDefault or SingleOrDefault calls, can have a big effect.
And go back to the basics: if you are searching on LoweredRoleName, make sure it is indexed so that the query is fast (though that could turn out to be useless, since EF may not end up using the covering index because it is querying so many other columns).
Also, since this query only views data (you will not alter it), don't forget to turn off entity tracking; that will give you some performance boost as well.
And last, don't forget that you can always write your SQL query directly and project to a ViewModel rather than an anonymous type (which I see as good practice anyhow). Create a class called PartyViewModel that holds the flattened view you are after, and use it with your hand-crafted SQL:
//use your optimized SQL query that you write or even call a stored procedure
db.Database.SqlQuery<PartyViewModel>("select * from .... join .... on");
I am writing a blog post about these issues around EF. The post is still not finished, but all in all, just be patient, use some of these tricks and observe their effect (and measure it) and you will reach what you want.
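As a rough idea of what such a PartyViewModel might look like, here is a minimal sketch. The property names mirror the anonymous projection in the question; the types (for example int for ID) are assumptions, since the real schema is not shown. When a class like this is used with Database.SqlQuery<T>, the column names returned by the SQL would need to match the property names.

```csharp
using System;

// Hypothetical flattened view model for the party "snapshot".
// Property names follow the projection in the question; the types are guesses.
public class PartyViewModel
{
    public int ID { get; set; }
    public string Title { get; set; } = "";
    public string Goal { get; set; } = "";
    public bool IsBuyer { get; set; }
    public bool IsSeller { get; set; }
    // Nullable because a party's clients may never have logged in.
    public DateTime? LastLoginDate { get; set; }
    public int MarketingPlanCount { get; set; }
}
```

Nested data such as the latest Journal entry or the Clients list does not flatten into a single row, so it would either need its own columns here or a separate query.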

Entity Framework, How to include related entities in this example

I have a table AccountSecurity, which is a many-to-many table that relates Account entities and Securities. When I write the query below, it returns all Securities that satisfy the where clause. However, each Security instance in the list no longer has the reference to the AccountSecurity it came from, so when I do list[0].AccountSecurity it is empty. Is there any way to include that information? I know I can rewrite the query to return AccountSecurities instead and use .Include("Security") on that, but I wonder if it can be done another way.
var list = (from acctSec in base.context.AccountSecurities
            where acctSec.AccountId == accountId
            select acctSec.Security).ToList();
UPDATE
Of course if I do two queries the graph gets populated properly, there has to be a way to do this in one shot.
var securities = (from acctSec in base.context.AccountSecurities
                  where acctSec.AccountId == accountId
                  select acctSec.Security).ToList();
//this query populates the AccountSecurities references within Security instances returned by query above
var xref = (from acctSec in base.context.AccountSecurities
            where acctSec.AccountId == accountId
            select acctSec).ToList();
var list = (from sec in base.context.Securities
                .Include("AccountSecurity")
            where sec.AccountSecurities.Any(acs => acs.AccountId == accountId)
            select sec).ToList();
Try this:
var list = (from acctSec in base.context.AccountSecurities.Include("Security")
            where acctSec.AccountId == accountId
            select acctSec).ToList();
Then simply use the Security property as needed, and since it's read at the same time AccountSecurities is (single SQL with join), it will be very efficient.