Searching a column in Entity Framework with multiple values - entity-framework

I am trying to search one particular field of a table against a list of values. I have not been able to find a solution so far. Any help is really appreciated.
Here is the scenario
var records = new PagedList<Required>();
var result = db.Required.Where(x => filter == null || (x.Title.Contains(filter)) || x.CID.Contains(filter));
foreach (string str in SelectedNetwork)
{
string tempStr = str;
result = result.Where(x => x.Network == tempStr);
records.TotalRecords = result.Count();
}
records.Content = result
.Where(x => filter == null ||
(x.Title.Contains(filter))
|| x.CID.Contains(filter)
)
.OrderBy(sort + " " + sortdir)
.Skip((page - 1) * Convert.ToInt32(records.PageSize))
.Take(Convert.ToInt32(records.PageSize))
.ToList();
The code in the foreach loop does not behave as expected. Is there any way I can fix it?
Thanks
Tutumon

You must take into account that LINQ expressions are queries, until you materialize them. To materialize them you need to either enumerate them, or convert them to a list, array, or whatever, i.e. enumerate their members in a foreach, or call a method like ToList(), or ToArray().
In your code the original query stored in result is not materialized, so every time the foreach loop body executes, a new Where condition is added to the original query. The conditions accumulate, so after the loop the query only matches rows whose Network equals every selected value at once. To avoid this behavior you need to recreate the whole result query in each iteration, so that you get a fresh copy of the unfiltered expression.
There would be another solution which would be to materialize the result query and then run the foreach loop as is. The problem of this solution would be that you would get all the data from the database, keep it in memory and run the Where and the Count on the in-memory copy. Unless there is a very small number of rows in Required that would be a very bad idea.
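To make the effect of the accumulating Where calls concrete, here is a minimal sketch in Python rather than C#, using made-up sample rows. It mimics the loop from the question by stacking lazy filters, and contrasts it with a single membership test. The membership variant is a suggestion, not something stated in the answer above; in LINQ it would usually be written as a Contains call on the list of selected values.

```python
# Hypothetical sample data standing in for the Required table.
ROWS = [
    {"title": "a", "network": "N1"},
    {"title": "b", "network": "N2"},
    {"title": "c", "network": "N3"},
]


def chained_filters(rows, selected):
    """Mimic `result = result.Where(x => x.Network == tempStr)` in a loop.

    Each pass wraps the previous lazy filter in another one, so the
    conditions are combined with AND.
    """
    query = iter(rows)
    for net in selected:
        # `net=net` pins the loop value, like `tempStr` in the question.
        query = filter(lambda r, net=net: r["network"] == net, query)
    return list(query)  # materialize only at the end


def membership_filter(rows, selected):
    """Keep rows whose network is any one of the selected values (OR)."""
    wanted = set(selected)
    return [r for r in rows if r["network"] in wanted]


if __name__ == "__main__":
    print(chained_filters(ROWS, ["N1", "N2"]))
    print(membership_filter(ROWS, ["N1", "N2"]))
```

With two different selected values, the chained version can never match a row, while the membership version returns the rows for either value.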

Related

Is there a way to expand dynamically tables found in multiple columns using Power Query?

I have used List.Accumulate() to merge multiple tables. This is the output I've got in this simple example:
Now I need a solution that expands all of these with a formula, because in the real world I need to merge multiple tables that keep increasing in number (think Eurostat tables, for instance), and modifying the code manually wastes a lot of time in these situations.
I have been trying to solve it, but it seems to me that the complexity of the syntax easily becomes the major limitation here. For instance, if I add a new step where I nest Table.ExpandTableColumn() inside another List.Accumulate(), I need to pass in a column name of an inner table as text. Fine, but to actually drill down, I first need to pass the current column name in [] in each iteration (for instance, Column 1), and that triggers an error if I store the column names in a list, because they are then between "". I also experimented with Table.TransformColumns(), but that didn't work either.
Does anyone know how to solve this problem whatever the approach?
See https://blog.crossjoin.co.uk/2014/05/21/expanding-all-columns-in-a-table-in-power-query/
which boils down to this function
// Save this query under the name ExpandAll, as in the linked post: the recursive call near the end refers to the query by that name.
let Source = (TableToExpand as table, optional ColumnNumber as number) =>
    //https://blog.crossjoin.co.uk/2014/05/21/expanding-all-columns-in-a-table-in-power-query/
    let
        // Start at the first column unless a column index was passed in
        ActualColumnNumber = if (ColumnNumber=null) then 0 else ColumnNumber,
        ColumnName = Table.ColumnNames(TableToExpand){ActualColumnNumber},
        ColumnContents = Table.Column(TableToExpand, ColumnName),
        // Union of the column names of every nested table found in this column
        ColumnsToExpand = List.Distinct(List.Combine(List.Transform(ColumnContents, each if _ is table then Table.ColumnNames(_) else {}))),
        NewColumnNames = List.Transform(ColumnsToExpand, each ColumnName & "." & _),
        CanExpandCurrentColumn = List.Count(ColumnsToExpand)>0,
        ExpandedTable = if CanExpandCurrentColumn then Table.ExpandTableColumn(TableToExpand, ColumnName, ColumnsToExpand, NewColumnNames) else TableToExpand,
        // After an expansion, stay on the same index so the new columns are checked too
        NextColumnNumber = if CanExpandCurrentColumn then ActualColumnNumber else ActualColumnNumber+1,
        OutputTable = if NextColumnNumber>(Table.ColumnCount(ExpandedTable)-1) then ExpandedTable else ExpandAll(ExpandedTable, NextColumnNumber)
    in OutputTable
in Source
Alternatively, unpivot all the table columns into a single value column, then expand that value column:
ColumnsToExpand = List.Distinct(List.Combine(List.Transform(Table.Column(#"PriorStepNameHere", "ValueColumnNameHere"), each if _ is table then Table.ColumnNames(_) else {}))),
#"Expanded ColumnNameHere" = Table.ExpandTableColumn(#"PriorStepNameHere", "ValueColumnNameHere",ColumnsToExpand ,ColumnsToExpand ),
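For readers who find the M syntax hard to follow, the column-by-column expansion above can be sketched in Python. This is only an illustration of the algorithm, not Power Query code: a table is modeled as a list of dicts, a nested table as a list of dicts stored in a cell, and all function names are made up for the example. It assumes every row of a table has the same keys.

```python
def column_names(table):
    """Ordered, de-duplicated column names across all rows of a table."""
    names = []
    for row in table:
        for key in row:
            if key not in names:
                names.append(key)
    return names


def is_table(value):
    """A nested table is modeled here as a list of dict rows."""
    return isinstance(value, list) and all(isinstance(r, dict) for r in value)


def expand_column(table, column):
    """Expand one column; return (new_table, True) if anything was expanded."""
    inner = []
    for row in table:
        if is_table(row[column]):
            for name in column_names(row[column]):
                if name not in inner:
                    inner.append(name)
    if not inner:
        return table, False
    out = []
    for row in table:
        cell = row[column]
        nested_rows = cell if is_table(cell) and cell else [{}]
        for nested in nested_rows:
            new_row = {}
            for key, value in row.items():
                if key == column:
                    # Prefix inner names with the outer column name.
                    for name in inner:
                        new_row[f"{column}.{name}"] = nested.get(name)
                else:
                    new_row[key] = value
            out.append(new_row)
    return out, True


def expand_all(table):
    """Walk the columns left to right, re-checking an index after expanding."""
    index = 0
    while table and index < len(column_names(table)):
        table, expanded = expand_column(table, column_names(table)[index])
        if not expanded:
            index += 1
    return table
```

As in the M function, the index only advances when the current column has nothing left to expand, so tables nested inside tables are handled by the same loop.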

How do I get unique values of one column based on another column using the insert database query in Anylogic?

How do I get unique values of one column based on another column using the query?
I tried using
(double)selectFrom(tasks).where(tasks.tasks_type.eq()).uniqueResult(tasks.task_cycle_time_hr);
I want to automate this so that every value of tasks_type is read and a unique value is returned for each one.
For each value in the tasks_type column, I need a unique value from the task_cycle_time_hr column.
I don't really understand why you're trying to do this in one query.
If you want to get the cycle time (task_cycle_time_hr column) for each task type (tasks_type column), just do queries in a loop for each possible tasks_type value. If you don't know those a priori, do queries for each value returned by a query of the task type values, which would look something like
for (String taskType : selectFrom(tasks).list(tasks.tasks_type)) {
    // Read the cycle time from the first row with this task type
    double cycleTime = (double) selectFrom(tasks)
        .where(tasks.tasks_type.eq(taskType))
        .firstResult(tasks.task_cycle_time_hr);
    traceln("Task type " + taskType + ", cycle time " + cycleTime);
}
But this just amounts to querying all rows and reading the task type and cycle time values from each, so you wouldn't normally do it like this: you'd just have a single query looping through all the full rows instead...
List<Tuple> rows = selectFrom(tasks).list();
for (Tuple row : rows) {
traceln("Task type " +
row.get(tasks.tasks_type) + ", cycle time " +
row.get(tasks.task_cycle_time_hr));
}
NB: I assume you don't have any rows with duplicate task types because then the whole exercise doesn't make sense unless you want only the first row for each task type value, or want some kind of aggregate (e.g., sum) of the cycle time values for each given task type. You were trying to use uniqueResult, which may mean you want to get a value if there is exactly one row (for a given task type) and 'no result otherwise', but uniqueResult throws an exception (errors) if there isn't exactly one row (so you can't use that directly like that). In that case one way (there are others, some probably slightly better) would be to do a count first to check; e.g. something like
for (String taskType : selectFrom(tasks).list(tasks.tasks_type)) {
    // Count the rows for this task type first
    int rowCount = (int) selectFrom(tasks)
        .where(tasks.tasks_type.eq(taskType))
        .count();
    // Only read the cycle time when exactly one row matches
    if (rowCount == 1) {
        double cycleTime = (double) selectFrom(tasks)
            .where(tasks.tasks_type.eq(taskType))
            .firstResult(tasks.task_cycle_time_hr);
        traceln("Task type " + taskType + ", unique cycle time " + cycleTime);
    }
}
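The "only when exactly one row matches" rule can also be applied in a single pass over the rows instead of one count query per task type. The sketch below shows that idea in plain Python, not with the AnyLogic query API; the (task type, cycle time) tuples and the function name are made up for the illustration.

```python
from collections import defaultdict


def unique_cycle_times(rows):
    """Map each task type to its cycle time, keeping only unambiguous types.

    `rows` is an iterable of (task_type, cycle_time) pairs. Task types that
    occur on more than one row are left out of the result.
    """
    grouped = defaultdict(list)
    for task_type, cycle_time in rows:
        grouped[task_type].append(cycle_time)
    return {t: values[0] for t, values in grouped.items() if len(values) == 1}


if __name__ == "__main__":
    sample = [("T1", 2.0), ("T2", 3.5), ("T2", 4.0)]
    print(unique_cycle_times(sample))
```

Here T2 is dropped because it appears twice, which corresponds to the rowCount == 1 check in the loop above.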
Import your Excel sheet into the AnyLogic internal DB and then use the DB wizard, which takes you step by step through writing the code to retrieve the data you want:
(double) selectFrom(data)
.where(data.tasks.eq("T1"))
.firstResult(data.task_cycle_time_hr)

EF- two WHERE IN clauses in an INCLUDE table produce two EXISTS in SQL sent to server

I have an INCLUDE table that I want to check a couple of values, in the same row, using an IN clause. The below doesn't return the correct result set because it produces two EXISTS clauses with subqueries. This results in the 2 values being checked independently and not strictly in the same child row. (forgive any typos as I'm typing this in from printed code)
var db = new dbEntities();
IQueryable<dr> query = db.drs;
// filter the parent table
query = query.Where(p => DropDown1.KeyValue.ToString().Contains(p.system_id.ToString()));
// include the child table
query = query.Include(p => p.drs_versions);
// filter the child table using the other two dropdowns
query = query.Where(p => p.drs_versions.Any(c => DropDown2.KeyValue.ToString().Contains(c.version_id.ToString())) && c => DropDown3.KeyValue.ToString().Contains(c.status_id.ToString()));
// I tried removing the second c=> but received a "'c' is inaccessible due to its protection level" error and couldn't find a clear answer on how this relates to Entity Framework
// query = query.Where(p => p.drs_versions.Any(c => DropDown2.KeyValue.ToString().Contains(c.version_id.ToString())) && DropDown3.KeyValue.ToString().Contains(c.status_id.ToString()));
This is an example of the query the code above produces...
SELECT *
FROM drs d
LEFT OUTER JOIN drs_versions v ON d.dr_id = v.dr_id
WHERE d.system_id IN (9,8,3)
AND EXISTS (SELECT 1 AS C1
FROM drs_versions sub1
WHERE d.tr_id = sub1.tr_id
AND sub1.version_id IN (9, 4, 1))
AND EXISTS (SELECT 1 AS C1
FROM drs_versions sub2
WHERE d.tr_id = sub2.tr_id
AND sub2.status_id IN (12, 7))
This is the query I actually want:
SELECT *
FROM drs d
LEFT OUTER JOIN drs_versions v ON d.dr_id = v.dr_id
WHERE d.system_id IN (9, 8, 3)
AND v.version_id IN (9, 4, 1)
AND v.status_id IN (12, 7)
How do I get Entity Framework to create a query that will give me the desired result set?
Thank you for your help
I'd drop all of the .ToString() calls and prepare your values ahead of the query to make it a lot easier to follow. If EF is generating SQL anything like what you transcribed, you are casting to String just to have EF convert it back to the appropriate type.
Beyond that, it just looks like your parentheses are a bit out of place.
I'm also not sure how something like DropDown2.KeyValue.ToString() resolves back to what I'd expect to be a collection of numbers based on your SQL examples, so I've substituted it with a method called getSelectedIds():
IEnumerable<int> versions = getSelectedIds(DropDown2);
IEnumerable<int> statuses = getSelectedIds(DropDown3);
query = query
    .Where(p => p.drs_versions
        .Any(c => versions.Contains(c.version_id)
            && statuses.Contains(c.status_id)));
As a general bit of advice, I suggest simplifying the variables you want to use in a LINQ expression as much as possible ahead of time, to keep the text inside the expression as simple to read as possible (avoiding parentheses as much as possible). Make liberal use of line breaks and indentation to organize what falls under what, and use the code highlighting to double-check that each closing parenthesis closes the opening one you expect.
I don't think your first example actually was input correctly as it would result in a compile error as you cannot && c => ... within an Any() block. My guess would be that you have:
query = query.Where(p => p.drs_versions.Any(c => DropDown2.KeyValue.ToString().Contains(c.version_id.ToString())) && p.drs_versions.Any(c => DropDown3.KeyValue.ToString().Contains(c.status_id.ToString())));
Your issue is closing off the inner .Any()
query.Where(p => p.drs_versions.Any(c => DropDown2.KeyValue.Contains(c.version_id))
&& DropDown3.KeyValue.Contains(c.status_id)); //<-- "c" is still outside the single .Any() condition so invalid.
Even then I'm not sure this will fully explain the difference in queries or results. It sounds like you've tried typing across code rather than pasting the actual statements and captured EF queries. It may help to copy the exact statements from the code because it's pretty easy to mistype something when trying to simplify an example only to find out you've accidentally excluded the smoking gun for your issue.
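The difference between one Any() holding both conditions and two separate Any() calls can be shown outside EF. The following is a small Python sketch with invented sample rows: two independent any() checks behave like the two EXISTS subqueries in the question, while a single any() with both conditions requires one child row to satisfy both.

```python
# Hypothetical parent rows, each with its child "versions" rows.
DRS = [
    {"dr_id": 1, "versions": [
        {"version_id": 9, "status_id": 5},   # matches version only
        {"version_id": 2, "status_id": 12},  # matches status only
    ]},
    {"dr_id": 2, "versions": [
        {"version_id": 4, "status_id": 7},   # matches both in the same row
    ]},
]
VERSIONS = {9, 4, 1}
STATUSES = {12, 7}


def two_separate_any(rows):
    """Like two EXISTS subqueries: the matches may come from different rows."""
    return [
        p["dr_id"] for p in rows
        if any(c["version_id"] in VERSIONS for c in p["versions"])
        and any(c["status_id"] in STATUSES for c in p["versions"])
    ]


def single_any(rows):
    """Both conditions inside one any(): the same child row must match both."""
    return [
        p["dr_id"] for p in rows
        if any(
            c["version_id"] in VERSIONS and c["status_id"] in STATUSES
            for c in p["versions"]
        )
    ]


if __name__ == "__main__":
    print(two_separate_any(DRS))
    print(single_any(DRS))
```

Parent 1 passes the two separate checks through two different child rows, but is excluded once both conditions must hold for the same row.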

ANY operator has significant performance problem when using an array as a parameter

I started using the ANY() function in my query instead of IN because of a parameter binding error. Currently it looks something like this:
Select *
FROM geo_closure_leaf
WHERE geoId = ANY(:geoIds)
But it has a huge impact on performance. Using the query with IN is very much faster than with ANY.
Any suggestions on how an array of string parameters can be bound to an IN expression?
I have tried these temporary fixes:
Select *
FROM geo_closure_leaf
WHERE geoId IN (''('' || array_to_string(:geoIds::text[] ,''),('') || '')'')
Select *
FROM geo_closure_leaf
WHERE geoId IN (select unnest(:geoIds::text[]))
geoIds = array of strings
It's working this way.
public override T Query<T>(string query, IDictionary<string, object> parameters, Func<IDataReader, T> mapper)
{
    T Do(NpgsqlCommand command)
    {
        IDataReader reader = null;
        try
        {
            command.CommandText = query;
            reader = command.AddParameters(parameters).ExecuteReader();
            return mapper(reader);
        }
        finally
        {
            CloseDataReader(reader);
        }
    }
    return Execute(Do);
}
The parameter object is an array of strings.
What I expect: I should be able to do this without having to put extra logic in the SQL.
Select *
FROM geo_closure_leaf
WHERE geoId IN (:geoIds)
The performance difference cannot be IN versus = ANY, because PostgreSQL will translate IN into = ANY during query optimization.
The difference must be the subselect. If you are using unnest, PostgreSQL will always estimate that the subquery returns 100 rows, because that is how unnest is defined.
It must be that the estimate of 100 somehow produces a different execution plan that happens to work better.
We'd need the complete execution plans to say anything less uncertain.
https://dba.stackexchange.com/questions/125413/index-not-used-with-any-but-used-with-in
I found this post explaining how indexes are used with the different forms of ANY and IN.

EntityFramework counting of query results vs counting list

Should efQuery.ToList().Count and efQuery.Count() produce the same value?
How is it possible that efQuery.ToList().Count and efQuery.Count() don't produce the same value?
//GetQuery() returns a default IDbSet which is used in EntityFramework
using (var ds = _provider.DataSource())
{
//return GetQuery(ds, filters).Count(); //returns 0???
return GetQuery(ds, filters).ToList().Count; //returns 605 which is correct based on filters
}
Just ran into this myself. In my case the issue is that the query has a .Select() clause that causes further relationships to be established, which end up filtering the query further because the relationships' inner joins constrain the result.
It appears that .Count() doesn't process the .Select() part of the query.
So I have:
// projection created
var ordersData = orders.Select(ord => new OrderData()
{
    OrderId = ord.OrderId,
    // ... more simple 1 - 1 order maps
    // Related values that cause relations in SQL
    TotalItemsCost = ord.OrderLines.Sum(lin => lin.Qty * lin.Price),
    CustomerName = ord.Customer.Name,
});
var count = ordersData.Count(); // 207
var listCount = ordersData.ToList().Count; // 192
When I compare the SQL statements I find that Count() does a very simple SUM on the Orders table which returns all orders, while the second query is a monster of 100+ lines of SQL that has 10 inner joins that are triggered by the .Select() clause (there are a few more related values/aggregations retrieved than shown here).
Basically this seems to indicate that .Count() doesn't take the .Select() clause into account when it does its count, so those same relationships that cause further constraining of the result set are not fired for .Count().
I've been able to make this work by explicitly adding expressions to the .Count() method that pull in some of those aggregated result values which effectively force them into the .Count() query as well:
var count = ordersData.Count(o => o.TotalItemsCost != -999 &&
                                  o.CustomerName != "!##"); // 207
The key is to make sure that any of the fields that are calculated or pull in related data and cause a relationship to fire, are included in the expression which forces Count() to include the required relationships in its query.
I realize this is a total hack and I'm hoping there's a better way, but for the moment this has allowed us at least to get the right value without pulling massive data down with .ToList() first.
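As a rough illustration of how the two counts can drift apart, here is a small Python sketch with invented data. It assumes the behavior described above, namely that the count is computed on the base table alone while the materialized projection goes through an inner join that drops rows without a matching related record. It does not reproduce EF itself.

```python
# Hypothetical data: order 102 points at a customer that does not exist.
CUSTOMERS = {1: "Ada", 2: "Grace"}
ORDERS = [
    {"order_id": 100, "customer_id": 1},
    {"order_id": 101, "customer_id": 2},
    {"order_id": 102, "customer_id": 3},
]


def count_base_table(orders):
    """Count that ignores the projection: every order row is counted."""
    return len(orders)


def count_after_inner_join(orders, customers):
    """Count of the projected rows: an inner join drops unmatched orders."""
    projected = [
        {"order_id": o["order_id"], "customer_name": customers[o["customer_id"]]}
        for o in orders
        if o["customer_id"] in customers
    ]
    return len(projected)


if __name__ == "__main__":
    print(count_base_table(ORDERS))
    print(count_after_inner_join(ORDERS, CUSTOMERS))
```

The base count sees all three orders, while the joined projection only yields the two orders that have a customer row.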
Assuming here that efQuery is IQueryable:
ToList() actually executes a query. If changes to data in the datastore, between calls to ToList() and .Count(), result in a different resultset, calling ToList() will repopulate the list. ToList().Count and .Count() should then match until the data in the store changes the resultset again.