EntityFramework counting of query results vs counting list - entity-framework

Should efQuery.ToList().Count and efQuery.Count() produce the same value?
How is it possible that efQuery.ToList().Count and efQuery.Count() don't produce the same value?
//GetQuery() returns a default IDbSet which is used in EntityFramework
using (var ds = _provider.DataSource())
{
//return GetQuery(ds, filters).Count(); //returns 0???
return GetQuery(ds, filters).ToList().Count; //returns 605 which is correct based on filters
}

Just ran into this myself. In my case the issue is that the query has a .Select() clause that causes further relationships to be established, which end up filtering the query further because the relationships' inner joins constrain the result.
It appears that .Count() doesn't process the .Select() part of the query.
So I have:
// projection created
var ordersData = orders.Select( ord => new OrderData() {
OrderId = ord.OrderId,
// ... more simple 1 - 1 order maps
// Related values that cause relations in SQL
TotalItemsCost = ord.OrderLines.Sum(lin => lin.Qty*lin.Price),
CustomerName = ord.Customer.Name,
});
var count = ordersData.Count(); // 207
var listCount = ordersData.ToList().Count; // 192
When I compare the SQL statements I find that Count() does a very simple SUM on the Orders table which returns all orders, while the second query is a monster of 100+ lines of SQL that has 10 inner joins that are triggered by the .Select() clause (there are a few more related values/aggregations retrieved than shown here).
Basically this seems to indicate that .Count() doesn't take the .Select() clause into account when it does its count, so those same relationships that cause further constraining of the result set are not fired for .Count().
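The effect described above can be reproduced outside Entity Framework. The following is a minimal sketch in Python with sqlite3, not the poster's actual schema: the `customers` and `orders` tables and their rows are made up for illustration. It shows how a count over the base table can exceed the number of rows that survive an inner join to a related table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann');
    -- order 3 has no customer row to join to
    INSERT INTO orders VALUES (1, 1), (2, 1), (3, NULL);
""")

# Count over the base table only: every order is counted.
plain_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# Projection that reads a related column through an inner join:
# orders without a matching customer drop out of the result.
projected_rows = conn.execute("""
    SELECT o.id, c.name
    FROM orders AS o
    INNER JOIN customers AS c ON c.id = o.customer_id
""").fetchall()

print(plain_count, len(projected_rows))
```

If the two numbers differ in a real model, comparing the generated SQL for both calls, as done above, is a reasonable first step.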
I've been able to make this work by explicitly adding expressions to the .Count() method that pull in some of those aggregated result values which effectively force them into the .Count() query as well:
var count = ordersData.Count( o=> o.TotalItemsCost != -999 &&
o.CustomerName != "!##"); // 207
The key is to make sure that any of the fields that are calculated or pull in related data and cause a relationship to fire, are included in the expression which forces Count() to include the required relationships in its query.
I realize this is a total hack and I'm hoping there's a better way, but for the moment this has allowed us at least to get the right value without pulling massive data down with .ToList() first.

Assuming here that efQuery is IQueryable:
ToList() executes the query immediately and returns a snapshot of the results, while Count() executes a separate query of its own. If the data in the datastore changes between the calls to ToList() and Count(), the two values can differ. Calling ToList() again repopulates the list, and ToList().Count and Count() should then match until the data in the store changes the result set again.

Related

Conditional WHERE clause in KDB?

Full Query:
{[tier;company;ccy; startdate; enddate] select Deal_Time, Deal_Date from DEALONLINE_REMOVED where ?[company = `All; 1b; COMPANY = company], ?[tier = `All;; TIER = tier], Deal_Date within(startdate;enddate), Status = `Completed, ?[ccy = `All;1b;CCY_Pair = ccy]}
Particular Query:
where ?[company = `All; 1b; COMPANY = company], ?[tier = `All; 1b; TIER = tier],
What this query is trying to do is to get the viewstate of a dropdown.
If the dropdown selection is "All", the corresponding where clause (company or tier) is skipped, and all companies or tiers are shown.
I am unsure if the query above is correct as I am getting weird charts when displaying them on KDB dashboard.
What I would recommend is to restructure your function to make use of the where clause using functional qSQL.
In your case, you need to be able to filter based on certain input: if it's "All" then don't filter, otherwise filter on that input. Something like this could work.
/Define sample table
DEALONLINE_REMOVED:([]Deal_time:10#.z.p;Deal_Date:10?.z.d;Company:10?`MSFT`AAPL`GOOGL;TIER:10?`1`2`3)
/New function which joins to where clause
{[company;tier]
wc:();
if[not company=`All;wc:wc,enlist (=;`Company;enlist company)];
if[not tier=`All;wc:wc,enlist (=;`TIER;enlist tier)];
?[DEALONLINE_REMOVED;wc;0b;()]
}[`MSFT;`2]
If you replace the input with `All you will see that everything is returned.
The full functional select for your query would be as follows:
whcl:{[tier;company;ccy;startdate;enddate]
wc:(enlist (within;`Deal_Date;(enlist;startdate;enddate))),(enlist (=;`Status;enlist `Completed)),
$[tier=`All;();enlist (=;`TIER;enlist tier)],
$[company=`All;();enlist (=;`COMPANY;enlist company)],
$[ccy=`All;();enlist (=;`CCY_Pair;enlist ccy)];
?[`DEALONLINE_REMOVED;wc;0b;`Deal_Time`Deal_Date!`Deal_Time`Deal_Date]
}
The first part specifies your date range and status = `Completed in the where clause
wc:(enlist (within;`Deal_Date;(enlist;startdate;enddate))),(enlist (=;`Status;enlist `Completed)),
Next each of these conditionals checks for `All for the TIER, COMPANY and CCY_Pair column filtering. It then joins these on to the where clause when a specific TIER, COMPANY or CCY_Pair are specified. (otherwise an empty list is joined on):
$[tier=`All;();enlist (=;`TIER;enlist tier)],
$[company=`All;();enlist (=;`COMPANY;enlist company)],
$[ccy=`All;();enlist (=;`CCY_Pair;enlist ccy)];
Finally, the select statement is called in its functional form as follows, with wc as the where clause:
?[`DEALONLINE_REMOVED;wc;0b;`Deal_Time`Deal_Date!`Deal_Time`Deal_Date]
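The same idea of building the where clause as a list and appending a constraint only when the input is not "All" can be sketched in another language. The Python below is an illustration only, not q or kdb+ code; `build_filters`, `select` and the `deals` sample rows are hypothetical names introduced for this example.

```python
ALL = "All"

def build_filters(company, tier):
    """Return a list of (column, value) constraints, skipping inputs equal to 'All'."""
    filters = []
    if company != ALL:
        filters.append(("Company", company))
    if tier != ALL:
        filters.append(("TIER", tier))
    return filters

def select(rows, filters):
    """Keep the rows that satisfy every constraint; an empty list keeps all rows."""
    return [r for r in rows if all(r[col] == val for col, val in filters)]

# Made-up sample rows standing in for the table.
deals = [
    {"Company": "MSFT", "TIER": "2"},
    {"Company": "MSFT", "TIER": "1"},
    {"Company": "AAPL", "TIER": "2"},
]

print(select(deals, build_filters("MSFT", "2")))
print(len(select(deals, build_filters("All", "All"))))
```

An empty constraint list plays the role of the empty `wc` in the functional select: nothing is filtered.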

How to write an RQL query?

I am new to ATG, and I have this question. How can I write an RQL query that returns the same data as this SQL query?
select avg(rating) from rating WHERE album_id = ?;
I'm trying this way:
RqlStatement statement;
Object rqlparam[] = new Object[1];
rqlparam[0] = album_Id;
statement = RqlStatement.parseRqlStatement("album_id = ?0");
MutableRepository repository = (MutableRepository) getrMember();
RepositoryView albumView = repository.getView(ALBUM);
This query returns an item for a specific album_id. How can I change my RQL query so that it returns the average field value, like the SQL query above?
There is no RQL syntax that will allow for the calculation of an average value for items in the query. As such you have two options. You can either execute your current statement:
album_id = ?0
And then loop through the resulting RepositoryItem[] and calculate the average yourself (this could be time consuming on large datasets and means you'll have to load all the results into memory, so perhaps not the best solution) or you can implement a SqlPassthroughQuery that you execute.
Object params[] = new Object[1];
params[0] = albumId;
Builder builder = (Builder)view.getQueryBuilder();
String str = "select avg(rating) from rating WHERE album_id = ? group by album_id";
RepositoryItem[] items =
view.executeQuery (builder.createSqlPassthroughQuery(str, params));
This will execute the average calculation on the database (something it is quite good at doing) and save you CPU cycles and memory in the application.
That said, don't make a habit of using SqlPassthroughQuery, as it means you don't get to use the repository cache as much, which could be detrimental to your application.
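The trade-off between the two options can be sketched without ATG. The Python below uses sqlite3 as a stand-in database and is not ATG or RQL code; the `rating` table contents are made up. It contrasts fetching the rows and averaging in the application with letting the database compute the average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rating (album_id INTEGER, rating REAL)")
conn.executemany(
    "INSERT INTO rating VALUES (?, ?)",
    [(1, 4.0), (1, 2.0), (2, 5.0)],  # made-up sample data
)

album_id = 1

# Option 1: load every matching row, then average in application memory.
rows = conn.execute(
    "SELECT rating FROM rating WHERE album_id = ?", (album_id,)
).fetchall()
avg_in_app = sum(r[0] for r in rows) / len(rows)

# Option 2: let the database aggregate; only one value crosses the wire.
avg_in_db = conn.execute(
    "SELECT AVG(rating) FROM rating WHERE album_id = ?", (album_id,)
).fetchone()[0]

print(avg_in_app, avg_in_db)
```

Both options give the same number; the difference is how many rows have to be transferred and held in memory.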

Unable to figure out filter in slickdb

Using scala with slickdb. I have table called persons. And I am filtering out persons by name as below
table.Persons.filter({ row =>
println("inside filter")
req.personName.map(name => row.personName === name).getOrElse(true: Rep[Boolean])
})
The table contains 3 rows. But still println() is executed only once. How is this filter working?
First of all when you write something like
personTable.filter(p => { .... })
it evaluates to a Query, which can generate the SQL when it is needed for the actual DB call. The generated SQL will be something like:
SELECT ...
FROM persons
WHERE ...
Now this SQL query is submitted to the DB for execution.
So the code inside { ... } is evaluated to build the Query itself, and it has no relation to how many rows you have in your DB table.
The println in your example will therefore run just once, whether your DB table has 0 rows, 1 row or a million rows.
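To make the "runs once" behaviour concrete, here is a toy query builder in Python. It is not Slick and does not use Slick's API; `Column`, `Row`, `filter_query` and `by_name` are hypothetical names. The predicate receives a symbolic row and returns a description of the condition, so it is invoked once while the SQL is built, regardless of how many rows the table holds.

```python
calls = []  # records each invocation of the predicate

class Column:
    def __init__(self, name):
        self.name = name

    def equals(self, value):
        # Returns a description of the comparison, not a boolean.
        return (f"{self.name} = ?", value)

class Row:
    """Symbolic row: any attribute access yields a Column object."""
    def __getattr__(self, name):
        return Column(name)

def filter_query(table, predicate):
    # The predicate is invoked exactly once, with a symbolic row.
    condition, param = predicate(Row())
    return f"SELECT * FROM {table} WHERE {condition}", (param,)

def by_name(row):
    calls.append("inside filter")
    return row.person_name.equals("Ann")

sql, params = filter_query("persons", by_name)
print(sql, params, len(calls))
```

No database is touched while the predicate runs; the row count only matters when the generated SQL is executed.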

Searching a column in Entity Framework with multiple values

I am trying to run a search on one particular field of a table with a list of values. Not able to find a solution so far. Any help is really appreciated.
Here is the scenario
var records = new PagedList<Required>();
var result = db.Required.Where(x => filter == null || (x.Title.Contains(filter)) || x.CID.Contains(filter));
foreach (string str in SelectedNetwork)
{
string tempStr = str;
result = result.Where(x => x.Network == tempStr);
records.TotalRecords = result.Count();
}
records.Content = result
.Where(x => filter == null ||
(x.Title.Contains(filter))
|| x.CID.Contains(filter)
)
.OrderBy(sort + " " + sortdir)
.Skip((page - 1) * Convert.ToInt32(records.PageSize))
.Take(Convert.ToInt32(records.PageSize))
.ToList();
The code in the foreach loop does not behave as expected. Is there any way I can fix it?
Thanks
Tutumon
You must take into account that LINQ expressions are queries, until you materialize them. To materialize them you need to either enumerate them, or convert them to a list, array, or whatever, i.e. enumerate their members in a foreach, or call a method like ToList(), or ToArray().
In your code the original query stored in result is not materialized, so every time the body of the foreach loop executes, a new Where condition is added to the original query. To avoid this behavior you need to recreate the whole result query in each iteration, so that you get a fresh copy of the unfiltered expression.
There would be another solution which would be to materialize the result query and then run the foreach loop as is. The problem of this solution would be that you would get all the data from the database, keep it in memory and run the Where and the Count on the in-memory copy. Unless there is a very small number of rows in Required that would be a very bad idea.
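The accumulation of conditions can be illustrated with a small Python sketch. It uses plain in-memory lists rather than a deferred query, so it shows only how chained filters combine, and the `records` and `selected` data are made up. It also assumes the goal is to match any of the selected networks, which the question does not state explicitly.

```python
# Made-up sample data standing in for the Required table.
records = [
    {"Title": "a", "Network": "N1"},
    {"Title": "b", "Network": "N2"},
    {"Title": "c", "Network": "N3"},
]
selected = ["N1", "N2"]

# Chaining one filter per selected value: each pass narrows the previous
# result, so the conditions are combined with AND.
chained = records
for network in selected:
    chained = [r for r in chained if r["Network"] == network]

# A single membership test keeps rows matching any selected value.
any_of = [r for r in records if r["Network"] in selected]

print(len(chained), len(any_of))
```

No row can equal two different networks at once, so the chained version ends up empty as soon as more than one network is selected.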

Entity Framework .Any does not generate expected SQL WHERE clause

Entity Framework and Linq-To-Entities are really giving me some headaches. I have a fairly simple query:
var result = feed.FeedItems.Any(ei => ei.ServerId == "12345");
feed is a single EF entity I selected earlier in a separate query from the same context.
But the generated SQL just throws away the .Any condition and requests all FeedItems of the feed object, which can be several thousand records and is a waste of network bandwidth. It seems the actual .Any comparison is done in C#:
exec sp_executesql N'SELECT [t0].[Id], [t0].[FeedId], [t0].[ServerId], [t0].[Published], [t0].[Inserted], [t0].[Title], [t0].[Content], [t0].[Author], [t0].[WebUri], [t0].[CommentsUri]
FROM [dbo].[FeedItem] AS [t0]
WHERE [t0].[FeedId] = @p0',N'@p0 int',@p0=3
I also tried:
!feed.FeedItems.Where(ei => ei.ServerId == "12345").Any();
But it doesn't change anything. Even removing Any() and querying for the complete list of items does not change the query.
I don't get it ... why isn't this working as I would expect? There should be a
WHERE ServerId == 1234
clause in the SQL statement.
Thanks very much for any help/clarification :)
As Nicholas already noticed, it looks like the query is executed in the FeedItems property (possibly you are returning a List or IEnumerable), so the whole list of items is returned from the database. After that you are applying Any to the in-memory collection. That's why you don't see WHERE ServerId == 1234 in the SQL query.
When you apply Any to an IQueryable, the generated query will look like:
SELECT
(CASE
WHEN EXISTS(
SELECT NULL AS [EMPTY]
FROM [dbo].[FeedItem] AS [t0]
WHERE [t0].[ServerId] = @p0
) THEN 1
ELSE 0
END) AS [value]
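The difference between the two execution paths can be sketched with Python and sqlite3. This is not Entity Framework code; the `FeedItem` table here is a cut-down, made-up version of the one in the question. The first path loads all items of the feed and tests them in the application, the second asks the database for an existence check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FeedItem (Id INTEGER PRIMARY KEY, FeedId INTEGER, ServerId TEXT)"
)
conn.executemany(
    "INSERT INTO FeedItem (FeedId, ServerId) VALUES (?, ?)",
    [(3, "12345"), (3, "67890"), (4, "12345")],  # made-up sample data
)

# In-memory path: fetch every item of the feed, then test in the application.
items = conn.execute(
    "SELECT Id, ServerId FROM FeedItem WHERE FeedId = ?", (3,)
).fetchall()
found_in_memory = any(server_id == "12345" for _, server_id in items)

# Database path: the condition is part of the SQL, one value comes back.
found_in_db = bool(conn.execute(
    "SELECT EXISTS(SELECT 1 FROM FeedItem WHERE FeedId = ? AND ServerId = ?)",
    (3, "12345"),
).fetchone()[0])

print(len(items), found_in_memory, found_in_db)
```

Both paths give the same answer; only the second avoids transferring every row of the feed.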