EF Core 5 Upgrade - Query Timeouts - entity-framework

We're belatedly upgrading a successful .NET Core 2.1 MVC application to .NET 5, and everything has gone well, apart from some confusing Microsoft.Data.SqlClient.SqlException 'Execution Timeout Expired' exceptions in a number of queries that used to work perfectly well in EF Core 2.1.
This is an example of one of the queries we're having problems with:
var products = await _context.ShopProducts
.Include(p => p.Category)
.Include(p => p.Brand)
.Include(p => p.CreatedBy)
.Include(p => p.LastUpdatedBy)
.Include(p => p.Variants).ThenInclude(pv => pv.ProductVariantAttributes)
.Include(p => p.Variants).ThenInclude(pv => pv.CountryOfOrigin)
.Include(p => p.Page)
.Include(p => p.CountryOfOrigin)
.OrderBy(p => p.Name)
.Where(p =>
(string.IsNullOrEmpty(searchText)
|| (
p.Name.Contains(searchText)
|| p.Description.Contains(searchText)
|| p.Variants.Any(v => v.SKU.Contains(searchText))
|| p.Variants.Any(v => v.GTIN.Contains(searchText))
|| p.Brand.BrandName.Contains(searchText)
|| p.CountryOfOriginCode == searchText
|| p.Category.Breadcrumb.Contains(searchText)
)
)
).ToPagedListAsync(page, pageSize);
And the exceptions we're getting.
Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
---> System.ComponentModel.Win32Exception (258): The wait operation timed out.
at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at Microsoft.Data.SqlClient.TdsParserStateObject.ThrowExceptionAndWarning(Boolean callerHasConnectionLock, Boolean asyncClose)
at Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)
at Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()
at Microsoft.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()
at Microsoft.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()
at Microsoft.Data.SqlClient.TdsParserStateObject.TryReadByteArray(Span`1 buff, Int32 len, Int32& totalRead)
at Microsoft.Data.SqlClient.TdsParserStateObject.TryReadChar(Char& value)
at Microsoft.Data.SqlClient.TdsParser.TryReadPlpUnicodeCharsChunk(Char[] buff, Int32 offst, Int32 len, TdsParserStateObject stateObj, Int32& charsRead)
at Microsoft.Data.SqlClient.TdsParser.TryReadPlpUnicodeChars(Char[]& buff, Int32 offst, Int32 len, TdsParserStateObject stateObj, Int32& totalCharsRead)
at Microsoft.Data.SqlClient.TdsParser.TryReadSqlStringValue(SqlBuffer value, Byte type, Int32 length, Encoding encoding, Boolean isPlp, TdsParserStateObject stateObj)
at Microsoft.Data.SqlClient.TdsParser.TryReadSqlValue(SqlBuffer value, SqlMetaDataPriv md, Int32 length, TdsParserStateObject stateObj, SqlCommandColumnEncryptionSetting columnEncryptionOverride, String columnName, SqlCommand command)
at Microsoft.Data.SqlClient.SqlDataReader.TryReadColumnInternal(Int32 i, Boolean readHeaderOnly)
at Microsoft.Data.SqlClient.SqlDataReader.ReadColumnHeader(Int32 i)
at Microsoft.Data.SqlClient.SqlDataReader.IsDBNull(Int32 i)
at lambda_method1671(Closure , QueryContext , DbDataReader )
at Microsoft.EntityFrameworkCore.Query.RelationalShapedQueryCompilingExpressionVisitor.ShaperProcessingExpressionVisitor.PopulateIncludeCollection[TIncludingEntity,TIncludedEntity](Int32 collectionId, QueryContext queryContext, DbDataReader dbDataReader, SingleQueryResultCoordinator resultCoordinator, Func`3 parentIdentifier, Func`3 outerIdentifier, Func`3 selfIdentifier, IReadOnlyList`1 parentIdentifierValueComparers, IReadOnlyList`1 outerIdentifierValueComparers, IReadOnlyList`1 selfIdentifierValueComparers, Func`5 innerShaper, INavigationBase inverseNavigation, Action`2 fixup, Boolean trackingQuery)
at lambda_method1679(Closure , QueryContext , DbDataReader , ResultContext , SingleQueryResultCoordinator )
at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.Enumerator.MoveNext()
This query works perfectly if the searchText parameter is not specified, so I thought it must be something index / data related. However, the SQL generated for the query, with and without the searchText parameter, executes instantaneously when run directly on the database, which seems to rule out the database as the problem.
Could EF Core 5 be struggling to assemble all the data into object instances? I realise we're returning a large object tree from this query, 152 columns in total, but only 10 rows due to the pageSize variable.
As it doesn't time out when no searchText is specified, and as EF Core 2.1 was able to put it all together with no problem, that doesn't seem to quite make sense either.
Any suggestions for ways to tune the query, or insights anyone has from their own EF Core 2.1 to 3.1 / 5 upgrades, or anything that leaps out from the exception stack trace would be very much appreciated.
UPDATE
Apologies for doing my debugging live on SO, but I've found that it's the p.Description.Contains(searchText) clause in the query that seems to cause the timeout. If I comment it out, the query runs successfully.
The Product.Description data is an HTML string of up to 1028 chars, with an average length of 350 chars, and again, querying this directly in SQL is no problem at all. Could it cause problems for EF in some other way though?
[DataType(DataType.Html)]
public string Description { get; set; }

Consider using split queries to improve performance on queries with lots of Includes.
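As a rough sketch of what that could look like: EF Core 5 added the AsSplitQuery() operator, which loads each included collection with its own SQL statement instead of one wide JOIN. The Product and Variant classes and the PagedWithVariants helper below are hypothetical stand-ins for the question's model, not its real types.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical minimal model standing in for the question's entities.
public class Variant
{
    public int Id { get; set; }
    public string SKU { get; set; }
}

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<Variant> Variants { get; set; } = new List<Variant>();
}

public static class SplitQueryExample
{
    // Builds one page of products with the Variants collection marked to be
    // loaded by a separate SQL statement rather than a single JOIN.
    public static IQueryable<Product> PagedWithVariants(
        IQueryable<Product> products, int page, int pageSize)
    {
        return products
            .Include(p => p.Variants)
            .AsSplitQuery()                 // EF Core 5+, relational providers
            .OrderBy(p => p.Name)
            .ThenBy(p => p.Id)              // tie-breaker so paging is deterministic
            .Skip((page - 1) * pageSize)
            .Take(pageSize);
    }
}
```

Split queries trade one large result set for several round trips, so it is worth measuring both forms. The behaviour can also be set for the whole context with UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery) in the provider options. The ThenBy on Id is there because paging a split query needs a fully unique ordering to give consistent results.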

The query works with an empty search text because the composed SQL short-circuits all of the other conditions when the text is empty. A better way of implementing that is to keep the condition logic in code rather than in the expression:
// AsQueryable() keeps the variable typed as IQueryable<T>, so the filtered query can be assigned back to it.
var query = _context.ShopProducts
    .Include(p => p.Category)
    .Include(p => p.Brand)
    .Include(p => p.CreatedBy)
    .Include(p => p.LastUpdatedBy)
    .Include(p => p.Variants)
        .ThenInclude(pv => pv.ProductVariantAttributes)
    .Include(p => p.Variants)
        .ThenInclude(pv => pv.CountryOfOrigin)
    .Include(p => p.Page)
    .Include(p => p.CountryOfOrigin)
    .AsQueryable();
// Only add the search predicate when there is something to search for.
if (!string.IsNullOrEmpty(searchText))
    query = query.Where(p => p.Name.Contains(searchText)
        || p.Description.Contains(searchText)
        || p.Variants.Any(v => v.SKU.Contains(searchText))
        || p.Variants.Any(v => v.GTIN.Contains(searchText))
        || p.Brand.BrandName.Contains(searchText)
        || p.CountryOfOriginCode == searchText
        || p.Category.Breadcrumb.Contains(searchText));
// Order last, immediately before paging.
var products = await query
    .OrderBy(p => p.Name)
    .ToPagedListAsync(page, pageSize);
Leaving the if statement in code ensures these conditions only make it to SQL if they are needed. This is advisable especially where you might have multiple separate search terms and conditionally apply each of them.
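For the multiple-search-term case, a minimal sketch might look like the following. The ProductRow type, the ProductSearch helper and the three separate terms are all made up for illustration; the same pattern applies to any IQueryable, including one that comes from a DbSet.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type used only for this illustration.
public class ProductRow
{
    public string Name { get; set; }
    public string Sku { get; set; }
    public string CountryCode { get; set; }
}

public static class ProductSearch
{
    // Each filter is appended only when its term is present, so an absent
    // term never becomes part of the expression tree (or of the SQL that a
    // database provider would build from it).
    public static IQueryable<ProductRow> Apply(
        IQueryable<ProductRow> query, string name, string sku, string countryCode)
    {
        if (!string.IsNullOrEmpty(name))
            query = query.Where(p => p.Name.Contains(name));

        if (!string.IsNullOrEmpty(sku))
            query = query.Where(p => p.Sku.StartsWith(sku));

        if (!string.IsNullOrEmpty(countryCode))
            query = query.Where(p => p.CountryCode == countryCode);

        return query;
    }
}
```

Successive Where calls combine with AND. When no term is supplied the original query is returned untouched, so no filtering is sent to the database at all.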
The crux of your problem is likely that you are kicking off what will surely be an extremely inefficient query that cannot take advantage of any form of indexing, being a multi LIKE %{term}% query. If there is one way to allow users to trigger an effective denial-of-service against your server, it is this. It will be slow, and users will doubt that it started and kick it off repeatedly (spawning new tabs and such).
Systems often have a text search capability that can search across several different possible values. You don't do yourself or your users any favours by condensing these into one uber search. The trouble is that if the user could direct the search to the most common type or fields 95% of the time, it could be considerably faster, but to cater for the other 5%, all searches must be the (s)lowest common denominator.
Some options to consider:
Let the user specify what they want to search against. This could be a drop-down multi-select or breadcrumb auto-selecting the most common search target (Name & SKU, etc.)
Default to "Begins With" type searching and give the user the option to opt for a slower "Contains" type search.
Apply some logic to pre-inspect the search text for what fields it may match. For instance if some of those values are numeric or follow a particular pattern (that a regex could match) then direct the search at those values, or ignore those values if the search text is not appropriate.
Always enforce a minimum length check on the search text, or at least whether that search text is applied to certain values.
Changes like this can help keep the 95%+ of the searching as fast and efficient as possible. I would also consider employing a queuing mechanism for potentially expensive searches, where the criteria and pagination data are popped onto a queue and a searchQueueID is passed back to the caller, which then polls for results using that ID or can cancel the search. The queue is serviced by a small pool of worker threads that process search requests in sequence and store the results and a completed status against the searchQueueID, to be picked up by the polling loop. This can help ensure that only so many of these potentially expensive searches (i.e. all kicked off by different users) are being executed at any given time.
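To illustrate the pre-inspection and minimum-length options from the list above, here is a small sketch. The SearchTargets enum, the SearchTextInspector class, the patterns (GTINs as 8 to 14 digits, country codes as two letters) and the three-character minimum are all assumptions for the example, not details from the question.

```csharp
using System;
using System.Text.RegularExpressions;

[Flags]
public enum SearchTargets
{
    None = 0,
    Name = 1,
    Sku = 2,
    Gtin = 4,
    CountryCode = 8
}

public static class SearchTextInspector
{
    // Assumed formats: GTINs are 8 to 14 digits, country codes are two letters.
    private static readonly Regex GtinPattern = new Regex("^[0-9]{8,14}$");
    private static readonly Regex CountryPattern = new Regex("^[A-Za-z]{2}$");

    // Assumed threshold below which a free-text "contains" search is refused.
    private const int MinFreeTextLength = 3;

    // Decides which fields are worth searching for the given text.
    public static SearchTargets Classify(string searchText)
    {
        var text = (searchText ?? string.Empty).Trim();

        if (text.Length == 0) return SearchTargets.None;
        if (GtinPattern.IsMatch(text)) return SearchTargets.Gtin;
        if (CountryPattern.IsMatch(text)) return SearchTargets.CountryCode;
        if (text.Length < MinFreeTextLength) return SearchTargets.None;

        return SearchTargets.Name | SearchTargets.Sku;
    }
}
```

The caller would then add only the Where clauses for the flags that come back, so a scanned barcode never triggers a LIKE across Description and a one-character search is rejected before it reaches the database.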

Related

How to Parse an int in an EF Core 3 Query?

Upon upgrading to EF Core 3, I am getting the following error at the following code:
System.InvalidOperationException: 'The LINQ expression 'DbSet
.Max(c => Convert.ToInt32(c.ClaimNumber.Substring(c.ClaimNumber.Length - 6)))'
could not be translated. Either rewrite the query in a form that can
be translated, or switch to client evaluation explicitly by inserting
a call to either AsEnumerable(), AsAsyncEnumerable(), ToList(), or
ToListAsync(). See https://go.microsoft.com/fwlink/?linkid=2101038 for
more information.'
var maxId = Db.Claims
.Select(c => c.ClaimNumber.Substring(c.ClaimNumber.Length - 6))
.Max(x => Convert.ToInt32(x));
I have also tried using int.Parse instead of Convert.ToInt32, and it produces the same error. I understand the error message. However, it's trivial to get SQL Server to parse a string to an int in T-SQL with CAST or CONVERT, I would hope there's a simple way to write the query so that it translates to a server-side operation right?
UPDATE After Claudio's excellent answer, I thought I should add some info for the next person who comes along. The reason I believed the parsing was the problem with the above code is because the following runs without error and produces the right result:
var maxId = Db.Claims
.Select(c => c.ClaimNumber.Substring(c.ClaimNumber.Length - 6))
.AsEnumerable()
.Max(x => int.Parse(x));
However, I dug deeper and found that this is the SQL query EF is executing from that code:
SELECT [c].[ClaimNumber], CAST(LEN([c].[ClaimNumber]) AS int) - 6
FROM [Claims] AS [c]
WHERE [c].[ClaimNumber] IS NOT NULL
That is clearly not doing anything like what I wanted, and therefore, Claudio is right that the call to Substring is, in fact, the problem.
Disclaimer: although feasible, I strongly recommend you do not use type conversion in your query, because it causes heavy query performance degradation.
The fact is that the Convert.ToInt32(x) part is not the problem here. It is c.ClaimNumber.Substring(c.ClaimNumber.Length - 6) that the EF Core translator isn't able to translate to T-SQL.
Although the RIGHT function exists in SQL Server, you won't be able to use it with current versions of EF Core (the latest version is 3.1.2 at the time of writing).
The only solution to get what you want is to create a SQL Server user-defined function, map it with EF Core and use it in your query.
1) Create function via migration
> dotnet ef migrations add CreateRightFunction
In newly created migration file put this code:
public partial class CreateRightFunction : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        // Thin scalar wrapper around T-SQL RIGHT so EF Core can call it.
        migrationBuilder.Sql(@"
            CREATE FUNCTION fn_Right(@input nvarchar(4000), @howMany int)
            RETURNS nvarchar(4000)
            BEGIN
                RETURN RIGHT(@input, @howMany)
            END
        ");
    }
    protected override void Down(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.Sql(@"
            DROP FUNCTION fn_Right
        ");
    }
}
Then run db update:
dotnet ef database update
2) Map function to EF Core context
In your context class:
[DbFunction("fn_Right")]
public static string Right(string input, int howMany)
{
    throw new NotImplementedException(); // this code doesn't get executed; the call is passed through to the database function
}
3) Use function in your query
var maxId = Db.Claims.Select(c => MyContext.Right(c.ClaimNumber, 6)).Max(x => Convert.ToInt32(x));
Generated query:
SELECT MAX(CONVERT(int, [dbo].[fn_Right]([c].[ClaimNumber], 6)))
FROM [Claims] AS [c]
Again, this is far from best practice. I think you should consider adding an int column to your table to store this "number", whatever it represents in your domain.
Also, the first time the last 6 characters of ClaimNumber contain a non-digit character, this won't work anymore. If the ClaimNumber is entered by a human, sooner or later this will happen.
You should code and design your database and application for robustness, even if you're super sure that those 6 characters will always represent a number. They might not do so forever :)
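As an illustration of the robustness point, this is one way the suffix could be parsed defensively in application code, for example when filling the dedicated int column suggested above. The ClaimNumberParser class and its method name are hypothetical.

```csharp
using System.Globalization;

public static class ClaimNumberParser
{
    // Returns the numeric value of the last six characters, or null when the
    // claim number is missing, too short, or the suffix is not all digits.
    public static int? TryGetSequence(string claimNumber)
    {
        if (claimNumber == null || claimNumber.Length < 6)
            return null;

        var suffix = claimNumber.Substring(claimNumber.Length - 6);

        // NumberStyles.None accepts decimal digits only: no sign, no spaces.
        return int.TryParse(suffix, NumberStyles.None, CultureInfo.InvariantCulture, out var value)
            ? value
            : (int?)null;
    }
}
```

A claim number with a stray letter in the suffix then yields null instead of an exception or a failed query, and the caller can decide how to handle it.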
Please change your code as shown below. It works for me in .NET Core 3.1:
var maxId = Db.Claims.Select(c => c.ClaimNumber.Substring(c.ClaimNumber.Length - 6))
.Max(x => (Convert.ToInt32((x == null)? "0" : x.ToString())));

EF Core 3 GroupBy multiple columns Count Throws with extensions but linq works

Here is the query that throws an exception:
var duplicateCountOriginal = _db.TableName
.GroupBy(g => new {g.ColumnA, g.ColumnB, g.ColumnC})
.Count(g => g.Count() > 1);
Exception:
System.ArgumentException: Expression of type 'System.Func`2[System.Linq.IGrouping`2[Microsoft.EntityFrameworkCore.Storage.ValueBuffer,Microsoft.EntityFrameworkCore.Storage.ValueBuffer],Microsoft.EntityFrameworkCore.Storage.ValueBuffer]' cannot be used for parameter of type 'System.Func`2[Microsoft.EntityFrameworkCore.Storage.ValueBuffer,Microsoft.EntityFrameworkCore.Storage.ValueBuffer]' of method 'System.Collections.Generic.IEnumerable`1[Microsoft.EntityFrameworkCore.Storage.ValueBuffer] Select[ValueBuffer,ValueBuffer](System.Collections.Generic.IEnumerable`1[Microsoft.EntityFrameworkCore.Storage.ValueBuffer], System.Func`2[Microsoft.EntityFrameworkCore.Storage.ValueBuffer,Microsoft.EntityFrameworkCore.Storage.ValueBuffer])' (Parameter 'arg1')
But the same thing works when it is written in LINQ query syntax (I prefer extension methods):
var duplicateCount =
from a in _db.TableName
group a by new {a.ColumnA, a.ColumnB, a.ColumnC}
into g
where g.Count() > 1
select g.Key;
duplicateCount.Count()
I am unable to understand why one works or the other doesn't.
Also if I change the first one a little bit based on EF Core 3 changes like the following
var duplicateCountOriginal = _db.TableName
.GroupBy(g => new {g.ColumnA, g.ColumnB, g.ColumnC})
.AsEnumerable()
.Count(g => g.Count() > 1);
I get the following exception:
System.InvalidOperationException: Client projection contains reference to constant expression of 'Microsoft.EntityFrameworkCore.Metadata.IPropertyBase' which is being passed as argument to method 'TryReadValue'. This could potentially cause memory leak. Consider assigning this constant to local variable and using the variable in the query instead. See https://go.microsoft.com/fwlink/?linkid=2103067 for more information.
As far as I can tell, the link given by Microsoft has nothing to do with whatever the problem is here.
Please LMK if there is any logical explanation.
There is no logical explanation. EF Core query translation is simply still far from perfect and has many defects/bugs/unhandled cases.
In this particular case the problem is not query syntax versus method syntax (what you call extensions), but the lack of a Select after GroupBy. If you rewrite the method syntax query to match the one using query syntax, i.e. add .Where, .Select and then Count:
var duplicateCount = _db.TableName
.GroupBy(g => new {g.ColumnA, g.ColumnB, g.ColumnC})
.Where(g => g.Count() > 1)
.Select(g => g.Key)
.Count();
then it will be translated and executed successfully.

Does my "zipLatest" operator already exist?

quick question about an operator I've written for myself.
Please excuse my poor-man's marble diagrams:
zip
aa--bb--cc--dd--ee--ff--------gg
--11----22--33--------44--55----
================================
--a1----b2--c3--------d4--e5----
combineLatest
aa--bb--cc--dd--ee--ff--------gg
--11----22--33--------44--55----
================================
--a1b1--c2--d3--e3--f3f4--f5--g5
zipLatest
aa--bb--cc--dd--ee--ff--------gg
--11----22--33--------44--55----
================================
--a1----c2--d3--------f4------g5
zipLatest (the one I wrote) fires at almost the same times as zip, but without the queueing zip includes.
I've already implemented it, I'm just wondering if this already exists.
I know I wrote a similar method in the past, to discover by random chance that I'd written the sample operator without knowing it.
So, does this already exist in the framework, or exist as a trivial composition of elements I haven't thought of?
Note: I don't want to rely on equality of my inputs to deduplicate (a la distinctUntilChanged).
It should work with a signal that only outputs "a" on an interval.
To give an update on the issue: There is still no operator for this requirement included in RxJS 6 and none seems to be planned for future releases. There are also no open pull requests that propose this operator.
As suggested here, a combination of combineLatest, first and repeat will produce the expected behaviour:
combineLatest(obs1, obs2).pipe(first()).pipe(repeat());
combineLatest will wait for the emission of both Observables - throwing away all emissions apart from the latest of each. first will complete the Observable after the emission and repeat resubscribes on combineLatest, causing it to wait again for the latest values of both observables.
The resubscription behaviour of repeat is not fully documented, but can be found in the GitHub source:
source.subscribe(this._unsubscribeAndRecycle());
Though you specifically mention not wanting to use DistinctUntilChanged, you can use it with a counter to distinguish new values:
public static IObservable<(T, TSecond)> ZipLatest<T, TSecond>(this IObservable<T> source, IObservable<TSecond> second)
{
return source.Select((value, id) => (value, id))
.CombineLatest(second.Select((value, id) => (value, id)), ValueTuple.Create)
.DistinctUntilChanged(x => (x.Item1.id, x.Item2.id), new AnyEqualityComparer<int, int>())
.Select(x => (x.Item1.value, x.Item2.value));
}
public class AnyEqualityComparer<T1, T2> : IEqualityComparer<(T1 a, T2 b)>
{
public bool Equals((T1 a, T2 b) x, (T1 a, T2 b) y) => Equals(x.a, y.a) || Equals(x.b, y.b);
public int GetHashCode((T1 a, T2 b) obj) => throw new NotSupportedException();
}
Note that I've used Int32 here, because that's what Select() gives me, but it might be too small for some use cases. Int64 or Guid might be a better choice.

NHibernate QueryOver Conversion Error On DB2 Date Type

I'm new to NHibernate and I am hoping I can find some assistance in tracking down the source of a conversion error I'm getting when trying to use a DateTime comparison for a predicate.
return _session.QueryOver<ShipmentSegment>()
.Where(ss => ss.SegmentOrigin == selOrig)
// Whenever I add the predicate for the SegmentDate below
// I receive a conversion error
.And(ss => ss.SegmentDate == selDate)
.List<ShipmentSegment>();
Exception
NHibernate.Exceptions.GenericADOException was unhandled by user code
Message=could not execute query
[ SELECT this_.ajpro# as ajpro1_28_0_, this_.ajleg# as ajleg2_28_0_, this_.ajpu# as ajpu3_28_0_, this_.ajlorig as ajlorig28_0_, this_.ajldest as ajldest28_0_, this_.segdate as segdate28_0_, this_.ajldptwin as ajldptwin28_0_, this_.ajlfrtype as ajlfrtype28_0_, this_.ajlfrdest as ajlfrdest28_0_, this_.ajtpmfst# as ajtpmfst10_28_0_, this_.ajspplan as ajspplan28_0_, this_.ajhload as ajhload28_0_ FROM go52cst.tstshprte this_ WHERE this_.ajlorig = #p0 and this_.segdate = #p1 ]
Name:cp0 - Value:WIC Name:cp1 - Value:3/28/2012 12:00:00 AM
Inner Exception
Message=A conversion error occurred.
Source=IBM.Data.DB2.iSeries
ErrorCode=-2147467259
MessageCode=111
MessageDetails=Parameter: 2.
SqlState=""
StackTrace:
- at IBM.Data.DB2.iSeries.iDB2Exception.throwDcException(MpDcErrorInfo
mpEI, MPConnection conn)
- at IBM.Data.DB2.iSeries.iDB2Command.openCursor()
- at IBM.Data.DB2.iSeries.iDB2Command.ExecuteDbDataReader(CommandBehavior
behavior)
- at System.Data.Common.DbCommand.System.Data.IDbCommand.ExecuteReader()
- at NHibernate.AdoNet.AbstractBatcher.ExecuteReader(IDbCommand cmd)
- at NHibernate.Loader.Loader.GetResultSet(IDbCommand st, Boolean autoDiscoverTypes, Boolean callable, RowSelection selection,
ISessionImplementor session)
- at NHibernate.Loader.Loader.DoQuery(ISessionImplementor session, QueryParameters queryParameters, Boolean returnProxies)
- at NHibernate.Loader.Loader.DoQueryAndInitializeNonLazyCollections(ISessionImplementor
session, QueryParameters queryParameters, Boolean returnProxies)
- at NHibernate.Loader.Loader.DoList(ISessionImplementor session, QueryParameters queryParameters)
I appreciate anything that can help point me in the right direction.
My experience with this specific iSeries exception came from when I was creating a parameter (of iDB2TypeParameter) for the ADO.NET command parameter list of a stored procedure type ADO command. I had to explicitly specify which iDB2DbType to use: iDB2DbType.Date, iDB2DbType.Time or iDB2DbType.TimeStamp. The iSeries ADO provider can't possibly know which of the three types to use when you create a parameter from a .NET System.DateTime value.
new iDB2Parameter(parameterName, iDB2DbType.Date){ Value = myValue };
new iDB2Parameter(parameterName, iDB2DbType.Time){ Value = myValue };
new iDB2Parameter(parameterName, iDB2DbType.TimeStamp){ Value = myValue };
I realize that you are not creating your parameters manually as in this example but are using NHibernate instead. So, I would make sure that NHibernate's LINQ provider for DB2/iSeries is aware of this. That's nothing against NHibernate; I just find it nearly impossible to find a good, solid ORM for DB2/iSeries.

EF builds EntityCollection, but I (think I) want IQueryable

I have an entity A with a simple navigation property B. For any given instance of A, we expect several thousand related instances of B.
There is no case where I call something like:
foreach(var x in A.B) { ... }
Instead, I'm only interested in doing aggregate operations such as
var statY = A.B.Where(o => o.Property == "Y");
var statZ = A.B.Where(o => o.CreateDate > DateTime.Now.AddDays(-1));
As far as I can tell, EF instantiates thousands of references to B and does these operations in memory. This is because navigation properties use EntityCollection. Instead, I'd like it to perform these queries at the SQL level if possible.
My current hunch is that Navigation Properties may not be the right way to go. I'm not attached to EF, so I am open to other approaches. But I'd be very interested to know the right way to do this under EF if possible.
(I'm using EF4.)
CreateSourceQuery seems to do the trick.
So my examples would now be:
var statY = A.B.CreateSourceQuery().Where(o => o.Property == "Y");
var statZ = A.B.CreateSourceQuery().Where(o => o.CreateDate > DateTime.Now.AddDays(-1));
There's one thing you should know. Operators applied to an IQueryable<> are translated and executed on the server, not in memory. Operators applied to an IEnumerable<> are executed in memory.
For example:
var someEntities = db.SomeEntities; // An IQueryable<> object. No data is fetched: the SomeEntities table may contain thousands of rows, but we are only building a query.
var filtered = someEntities.Where(s => s.Id > 100 && s.Id < 200); // Adds a where clause to the expression tree. The query is still not executed and no data is fetched; this also returns an IQueryable<>.
var entities = filtered.AsEnumerable(); // From here on, any further LINQ operators run in memory. The SQL query itself (including the where filter) runs when entities is first enumerated.
You can also fetch the data by iterating with foreach or by calling ToArray() or ToList().
Hope you understand what I mean, and sorry for my english :)
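The deferred part of this can be seen without a database. The sketch below is an in-memory stand-in, not EF: it wraps a counting iterator in AsQueryable() and shows that composing a Where pulls nothing from the source until the query is enumerated. The DeferredExecutionDemo class and its members are made up for the illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many items have actually been read from the source.
    public static int ItemsPulled;

    // Yields 1..count, incrementing the counter for every item handed out.
    private static IEnumerable<int> Source(int count)
    {
        for (var i = 1; i <= count; i++)
        {
            ItemsPulled++;
            yield return i;
        }
    }

    // Returns the counter before enumeration, after enumeration, and the
    // number of rows that passed the filter.
    public static (int Before, int After, int Matches) Run()
    {
        ItemsPulled = 0;

        IQueryable<int> query = Source(1000).AsQueryable();
        var filtered = query.Where(x => x > 100 && x < 200); // only builds an expression

        var before = ItemsPulled;        // nothing has been read yet
        var result = filtered.ToList();  // enumeration happens here
        var after = ItemsPulled;

        return (before, after, result.Count);
    }
}
```

With EF the same rule applies, except that the work deferred until enumeration is a SQL statement sent to the server rather than a loop over a local sequence.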