EFCore 2.2.2 filtering on related child entities - entity-framework

I have an entity object of type Complex.
A Complex has a 1:1 relationship to a Forum, which has many Topics, each of which has many Posts. I am trying to page the Posts but I'm getting an error that I don't understand.
Message=The ThenInclude property lambda expression 'p => {p.Posts => Skip((__pageIndex_0 - 1)) => Take(__pageSize_1)}' is invalid. The expression should represent a property access: 't => t.MyProperty'. To target navigations declared on derived types, specify an explicitly typed lambda parameter of the target type, E.g. '(Derived d) => d.MyProperty'.
This works:
public Complex GetComplexWithForumAndPosts(Guid Id, int pageIndex, int pageSize = 10)
{
var complex = CoDBContext.Complexes
.Include(x => x.Forum)
.ThenInclude(x => x.Topics)
.ThenInclude(p => p.Posts)
.Single(x => x.Id == Id);
return complex;
}
but this doesn't:
public Complex GetComplexWithForumAndPosts(Guid Id, int pageIndex, int pageSize = 10)
{
var complex = CoDBContext.Complexes
.Include(x => x.Forum)
.ThenInclude(x => x.Topics)
.ThenInclude(p => p.Posts.Skip((pageIndex-1)*pageSize).Take(pageSize))
.Single(x => x.Id == Id);
return complex;
}

Some background: what Include doesn't include
The Include method is a bitch. The most commonly used overload accepting an expression parameter (existing since Entity Framework 4.1 if memory serves) looks suspiciously like those versatile LINQ methods doing all kinds of wonderful stuff with the most wild expressions we feed them.
In reality, Include and ThenInclude aren't LINQ methods. (Then)Include is nothing but a sturdy old method refusing to do anything outside of its single task: passing a navigation property name to the EF query engine, instructing it to eagerly load a collection or a reference with the root entity. Think of it as the strongly typed version of Include("PropertyName"). Its only purpose is enabling compile-time type checking.
Which means: you can only use expressions that represent a navigation property's name: .ThenInclude(p => p.Posts) is OK. Anything added to it, not.
The expression parameter makes people expect it to do much more than that. And why not? The code compiles alright, why shouldn't it run? But no. Common disappointments are that Include can't be filtered or sorted. Efforts to combine Include and Skip/Take are less common, but equally understandable.
As for me, the EF team might as well consider ditching this overload of (Then)Include altogether now we have the nameof keyword that does a similar thing and could be used likewise. That would put an end to all confusion, to an endless influx of Stack Overflow questions, and to the never-ending requests for change that, so far, have never even been road-mapped. [Also understandable, but that's way beyond the scope of this question].
The issue
In the meantime you still have your issue. The reason that Skip/Take isn't often combined with Include is that it's hard to imagine what it should do. Just suppose you had two nested Includes in there, or ThenInclude(x => x.Topics.Skip().Take().ThenInclude(p => p.Posts)) which, if supported, should all have been equally legal. In fact, only paging on the root entity of a query is well-defined.
So you can only get paged posts by querying posts. For example like so:
CoDBContext.Posts
.Where(p => p.Topic.Forum.ComplexId == Id)
.Skip((pageIndex-1)*pageSize).Take(pageSize)
This is also a much leaner query than the one with Includes.
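As a rough sketch of the "query the posts directly" idea, the helper below pages any IQueryable<Post>. The entity shapes (Forum.ComplexId, Topic.Forum, Post.Id) and the PostPaging helper are hypothetical stand-ins, since the real model isn't shown in the question. The OrderBy is an addition: without an explicit order, the rows that land on a given page are not guaranteed to be the same from one call to the next.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified entity shapes; the real model is not shown in the question.
public class Forum { public Guid ComplexId { get; set; } }
public class Topic { public Forum Forum { get; set; } }
public class Post
{
    public int Id { get; set; }
    public Topic Topic { get; set; }
}

public static class PostPaging
{
    // Pages posts as the root of the query; pageIndex is 1-based, as in the question.
    public static List<Post> GetPostsPage(
        IQueryable<Post> posts, Guid complexId, int pageIndex, int pageSize = 10)
    {
        return posts
            .Where(p => p.Topic.Forum.ComplexId == complexId)
            .OrderBy(p => p.Id)                 // assumed sort key; makes pages deterministic
            .Skip((pageIndex - 1) * pageSize)
            .Take(pageSize)
            .ToList();
    }
}
```

With the context from the question, a call would presumably look like PostPaging.GetPostsPage(CoDBContext.Posts, Id, pageIndex, pageSize).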
If you need more context information, for example on the Complex, you could consider querying the Complex in one call, keep it on the page (if this is a SPA) and query the posts in subsequent ajax calls.

Related

ToAsyncEnumerable().Single() vs SingleAsync()

I'm constructing and executing my queries in a way that's independent of EF-Core, so I'm relying on IQueryable<T> to obtain the required level of abstraction. I'm replacing awaited SingleAsync() calls with awaited ToAsyncEnumerable().Single() calls. I'm also replacing ToListAsync() calls with ToAsyncEnumerable().ToList() calls. But I just happened upon the ToAsyncEnumerable() method so I'm unsure I'm using it correctly or not.
To clarify which extension methods I'm referring to, they're defined as follows:
SingleAsync and ToListAsync are defined on the EntityFrameworkQueryableExtensions class in the Microsoft.EntityFrameworkCore namespace and assembly.
ToAsyncEnumerable is defined on the AsyncEnumerable class in the System.Linq namespace in the System.Interactive.Async assembly.
When the query runs against EF-Core, are the calls ToAsyncEnumerable().Single()/ToList() versus SingleAsync()/ToListAsync() equivalent in function and performance? If not then how do they differ?
For methods returning a sequence (like ToListAsync, ToArrayAsync) I don't expect a difference.
However, for methods returning a single value (the async versions of First, FirstOrDefault, Single, Min, Max, Sum etc.) there will definitely be a difference. It's the same as the difference between executing those methods on IQueryable<T> and on IEnumerable<T>. In the former case they are processed by a database query returning a single value to the client, while in the latter the whole result set is returned to the client and processed in memory.
So, while in general the idea of abstracting EF Core is good, it will cause performance issues with IQueryable<T> because the async processing of queryables is not standardized, and converting to IEnumerable<T> changes the execution context, hence the implementation of single value returning LINQ methods.
P.S. By standardization I mean the following. The synchronous processing of IQueryable is provided by IQueryProvider (standard interface from System.Linq namespace in System.Core.dll assembly) Execute methods. Asynchronous processing would require introducing another standard interface similar to EF Core custom IAsyncQueryProvider (inside Microsoft.EntityFrameworkCore.Query.Internal namespace in Microsoft.EntityFrameworkCore.dll assembly). Which I guess requires cooperation/approval from the BCL team and takes time, that's why they decided to take a custom path for now.
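The IQueryable<T> versus IEnumerable<T> distinction this answer relies on can be seen without a database. In the sketch below (the QueryableVsEnumerable helper is made up for illustration), an operator applied to an IQueryable<T> is recorded in an expression tree that a provider such as EF Core could translate, whereas the same operator applied after AsEnumerable() is just a delegate that runs in memory. ToAsyncEnumerable() changes the context in the same way.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class QueryableVsEnumerable
{
    // True when the outermost operator was recorded as a Queryable.* call in an
    // expression tree, i.e. it is still available for a query provider to translate.
    public static bool IsRecordedForProvider<T>(IEnumerable<T> sequence, string operatorName)
    {
        return sequence is IQueryable<T> query
            && query.Expression is MethodCallExpression call
            && call.Method.DeclaringType == typeof(Queryable)
            && call.Method.Name == operatorName;
    }

    public static (bool Upstream, bool Downstream) Compare()
    {
        IQueryable<int> source = new[] { 1, 2, 3 }.AsQueryable();

        // Still IQueryable<int>: Where binds to Queryable.Where.
        var upstream = source.Where(x => x > 1);

        // After AsEnumerable(): Where binds to Enumerable.Where and runs in memory.
        var downstream = source.AsEnumerable().Where(x => x > 1);

        return (IsRecordedForProvider(upstream, "Where"),
                IsRecordedForProvider(downstream, "Where"));
    }
}
```

Compare() returns (true, false): only the upstream filter is something a provider gets a chance to turn into SQL.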
When the original source is a DbSet, ToAsyncEnumerable().Single() is not as performant as SingleAsync() in the exceptional case where the database contains more than one matching row. But in the more likely scenario, where you both expect and receive only one row, it's the same. Compare the generated SQL:
SingleAsync():
SELECT TOP(2) [l].[ID]
FROM [Ls] AS [l]
ToAsyncEnumerable().Single():
SELECT [l].[ID]
FROM [Ls] AS [l]
ToAsyncEnumerable() breaks the IQueryable call chain and enters LINQ-to-Objects land. Any downstream filtering occurs in memory. You can mitigate this problem by doing your filtering upstream. So instead of:
ToAsyncEnumerable().Single( l => l.Something == "foo" ):
SELECT [l].[ID], [l].[Something]
FROM [Ls] AS [l]
you can do:
Where( l => l.Something == "foo" ).ToAsyncEnumerable().Single():
SELECT [l].[ID], [l].[Something]
FROM [Ls] AS [l]
WHERE [l].[Something] = N'foo'
If that approach still leaves you squeamish then, as an alternative, consider defining extension methods like this one:
using System.Linq;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Query.Internal;
static class Extensions
{
public static Task<T> SingleAsync<T>( this IQueryable<T> source ) =>
source.Provider is IAsyncQueryProvider
? EntityFrameworkQueryableExtensions.SingleAsync( source )
: Task.FromResult( source.Single() );
}
According to the official Microsoft documentation for EF Core (all versions, including the current 2.1 one):
This API supports the Entity Framework Core infrastructure and is not intended to be used directly from your code. This API may change or be removed in future releases.
Source: https://learn.microsoft.com/en-us/dotnet/api/microsoft.entityframeworkcore.query.internal.asynclinqoperatorprovider.toasyncenumerable?view=efcore-2.1
p.s. I personally found it problematic in combination with the AutoMapper tool (at least, until ver. 6.2.2) - it just doesn't map collection of type IAsyncEnumerable (unlike IEnumerable, with which the AutoMapper works seamlessly).
I took a peek at the source code of Single (Line 90).
It clearly illustrates that the enumerator is only advanced once (for a successful operation).
using (var e = source.GetEnumerator())
{
if (!await e.MoveNext(cancellationToken)
.ConfigureAwait(false))
{
throw new InvalidOperationException(Strings.NO_ELEMENTS);
}
var result = e.Current;
if (await e.MoveNext(cancellationToken)
.ConfigureAwait(false))
{
throw new InvalidOperationException(Strings.MORE_THAN_ONE_ELEMENT);
}
return result;
}
Since this kind of implementation is as good as it gets (nowadays), one can say with certainty that using the Ix Single Operator would not harm performance.
As for SingleAsync, you can be sure that it is implemented in a similar manner, and even if it is not (which is doubtful), it could not outperform the Ix Single operator.

What's the recommended way to deal with errors in Scala?

Let's say that I have a method addUser that adds a user to database. When called, the method might:
succeed
fail, because the input was invalid (e.g. the user name already exists)
fail, because the database crashed or whatever
The method would probably consist of a single database API call that would throw an exception in case of failure. If it was in plain Java, I'd probably catch the exception inside the method and examine the reason. If it fell in the second category (invalid input), I would throw a custom checked exception explaining the reason (for example UserAlreadyExistsException). In case of the third category, I'd just re-throw the original exception.
I know that there are strong opinions in Java about error handling so there might be people disagreeing with this approach but I'd like to focus on Scala now.
The advantage of the described approach is that when I call addUser I can choose to catch UserAlreadyExistsException and deal with it (because it's appropriate for my current level of abstraction) but at the same time I can choose to completely ignore any other low-level database exception that might be thrown and let other layers deal with it.
Now, how do I achieve the same thing in Scala? Or what would be the right Scala approach? Obviously, exceptions would work in Scala exactly the same way but I came across opinions that there are better and more suitable ways.
As far as I know, I could go with Option, Either or Try. None of those, however, seems as elegant as good old exceptions.
For example, dealing with the Try result would look like this (borrowed from similar question):
addUser(user) match {
case Success(user) => Ok(user)
case Failure(e: PSQLException) if e.getSQLState == "23505" => InternalServerError("Some sort of unique key violation..")
case Failure(t: PSQLException) => InternalServerError("Some sort of psql error..")
case Failure(_) => InternalServerError("Something else happened.. it was bad..")
}
Those last two lines are exactly something I'd like to avoid because I'd have to add them anywhere I make a database query (and counting on MatchError doesn't seem like a good idea).
Also dealing with multiple error sources seems a bit cumbersome:
(for {
u1 <- addUser(user1)
u2 <- addUser(user2)
u3 <- addUser(user3)
} yield {
(u1, u2, u3)
}) match {
case Success((u1, u2, u3)) => Ok(...)
case Failure(...) => ...
}
How is that better than:
try {
val u1 = addUser(user1)
val u2 = addUser(user2)
val u3 = addUser(user3)
Ok(...)
} catch {
case (e: UserAlreadyExistsException) => ...
}
Has the former approach any advantages that I'm not aware of?
From what I understood, Try is very useful when passing exceptions between different threads but as long as I'm working within a single thread, it doesn't make much sense.
I'd like to hear some arguments and recommendations about this.
Much of this topic is of course a matter of opinion. Still, there are some concrete points that can be made:
You are correct to observe that Option, Either and Try are quite generic; the names do not provide much documentation. Therefore, you could consider a custom sealed trait:
sealed trait UserAddResult
case object UserAlreadyExists extends UserAddResult
case class UserSuccessfullyAdded(user: User) extends UserAddResult
This is functionally equivalent to an Option[User], with the added benefit of documentation.
Exceptions in Scala are always unchecked. Therefore, you would use them in the same cases you use unchecked exceptions in Java, and not for the cases where you would use checked exceptions.
There are monadic error handling mechanisms such as Try, scalaz's Validation, or the monadic projections of Either.
The primary purpose of these monadic tools is to be used by the caller to organize and handle several exceptions together. Therefore, there is not much benefit, either in terms of documentation or behavior, to having your method return these types. Any caller who wants to use Try or Validation can convert your method's return type to their desired monadic form.
As you can maybe guess from the way I phrased these points, I favor defining custom sealed traits, as this provides the best self-documenting code. But, this is a matter of taste.

NDepend Query to show const, enum, struct dependencies

I wrote the following query in order to determine the dependencies between my solution and other assemblies. We have a large library of Internal Nuget shared libraries which is used extensively, and I want to ensure these are included - hence I'm using 't' below to eliminate certain 3rd party assemblies but including our internal libraries.
This query works great, but I have realised that it only shows us dependencies where the dependency is a method call. It doesn't include Constants, enums and structs.
How can I enhance the query below to show us the detail of these and any other dependencies?
let t = Assemblies.WithNameWildcardMatchIn("xunit","RestSharp","NSubstitute","O2S*","EntityFramework","AxInterop*","AutoMapper","Autofac*","ADODB","mscorlib","System*", "Microsoft*","Infra*","Interop*").ToDictionary<IAssembly,String>(c=>c.Name)
from a in Application.Assemblies
from m in a.ChildMethods
from b in m.MethodsCalled
let isThirdParty = t.ContainsKey(b.ParentAssembly.Name)
select new { a,isThirdParty,m.ParentNamespace, m.ParentType,m.Name,DependsOnAssembly=b.ParentAssembly.Name, DependsOnNamespace=b.ParentNamespace,DependsOnParentType=b.ParentType,DependsOnMethod=b.Name}
What about this refactored version of your query:
from a in Application.Assemblies
from m in a.ChildMethods
from b in m.MethodsCalled.Cast<IMember>().Union(m.FieldsUsed.Cast<IMember>())
let isThirdParty = b.IsThirdParty
select new {
a,
isThirdParty,
m.ParentNamespace,
m.ParentType,
m.Name,
DependsOnAssembly=b.ParentAssembly.Name,
DependsOnNamespace=b.ParentNamespace,
DependsOnParentType=b.ParentType,
DependsOnMember=b.Name
}
First we simplified it greatly by using b.IsThirdParty :)
Second we do a Union<IMember>() between MethodsCalled and FieldsUsed. So you also get all fields read and/or assigned, in addition to methods called.
Concerning structure usage, as long as you use a member of the structure (constructor, property, field...) the dependency will be listed.
Concerning enum, if a method uses an enumeration, you'll see a dependency toward the instance field EnumName.value__.
However, you won't see usage of constants or enumeration values. The reason is that this information gets lost in the IL code that NDepend analyzes. Constants (and enumeration values are also constants) are replaced with their values within the IL code.
Hope this helps!
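To illustrate the point about constants being lost in IL, here is a small sketch with made-up names (Limits, Consumer). The C# compiler embeds the value of a const directly into the code that uses it, so the compiled caller holds the literal and no reference to the declaring field. A static readonly field, by contrast, is read through a real field access. Whether NDepend then lists that access under FieldsUsed is an assumption based on the description above and has not been verified here.

```csharp
public static class Limits
{
    // Compile-time constant: consumers embed the literal 100, not a field reference.
    public const int MaxUsers = 100;

    // Runtime field: consumers load it through a field access that remains in the IL.
    public static readonly int MaxSessions = 100;
}

public static class Consumer
{
    // Compiles to a comparison against the literal 100; the link to Limits is gone.
    public static bool IsFull(int users) => users >= Limits.MaxUsers;

    // Compiles to a load of the Limits.MaxSessions field followed by the comparison.
    public static bool IsBusy(int sessions) => sessions >= Limits.MaxSessions;
}
```

If tracking such a dependency matters, switching a shared const to static readonly is one possible trade-off, at the cost of no longer being usable where a compile-time constant is required (attribute arguments, switch labels, default parameter values).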
As a side note, isn't the query result more readable from within the NDepend UI if you write instead:
from m in Application.Methods
select new {
m,
thirdPartyMethodsCalled = m.MethodsCalled.Where(m1 => m1.IsThirdParty),
applicationMethodsCalled = m.MethodsCalled.Where(m1 => !m1.IsThirdParty),
thirdPartyFieldsUsed = m.FieldsUsed.Where(m1 => m1.IsThirdParty),
applicationFieldsUsed = m.FieldsUsed.Where(m1 => !m1.IsThirdParty)
}

Is this a good return-type pattern for designing Scala API's?

I see this type of pattern (found this example here) quite often in Scala:
class UserActor extends Actor {
def receive = {
case GetUser(id) =>
// load the user, reply with None or Some(user)
val user: Option[User] = ...
sender ! user
case FindAll() =>
// find all users
val users: List[User] = ...
sender ! users
case Save(user) =>
// persist the user
sender ! Right(user)
}
}
So depending on the call you get: Option[User], List[User], Right[User]. This approach is fine! I'm just asking out of interest if this is optimal? For example (and this may be a bad one): Will it make API's better or worse to try and generalise by always returning List[User]? So when a user is not found or if a save fails, then the list will simply be empty. I'm just curious.... any other suggestions on how the above 'pattern' may be improved?
I'm just trying to identify a perfect pattern for this style of API where you sometimes get one entity and sometimes none and sometimes a list of them. Is there a 'best' way to do this, or does everyone roll their own?
The return types should help clarify your API's intended behavior.
If GetUser returned a List, developers might get confused and wonder if multiple users could possibly be returned. When they see that an Option is returned, they will immediately understand the intended behavior.
I once had to work with a fairly complex API which provided CRUD operations that had been generalized in the manner you describe. I found it to be difficult to understand, vaguely defined, and hard to work with.
In my opinion it is a very good pattern for API design.
I very often use Option as the return type of my functions when I want to return a single element, because then I don't need to deal with null. Returning a Seq is natural for multiple elements, and Either is ideal if you want to return a failure description; I often use it while parsing I/O. Sometimes I even combine Seq with one of the others. You likely don't know the preferences and goals of a user of your API, so it makes sense to provide all of these return types to make the user feel as comfortable as possible.

Why is IQueryable used instead of IEnumerable with MEF catalogs?

Why is it that the catalogs expose parts through IQueryable and not just IEnumerable? I've been thinking about it but I don't understand how (or if) they actually use any of the services that the IQueryable interface provides.
Because it allows for implementations which don't have to scan all available parts (an O(N) operation) for each query.
To give a concrete example, consider the following query which might be similar to something that MEF does internally to find an export with the right contract:
var matches = catalog.Parts
.Where(part => part.ExportDefinitions.Any(
export => export.ContractName == "foo"));
The catalog implementation of IQueryProvider could recognize the resulting expression tree as "give me parts which export the contract 'foo'" and then retrieve them from a dictionary by using 'foo' as the key, an O(1) operation - instead of actually enumerating all parts and executing the lambda passed to .Where, as would be the case for an IEnumerable.
edit: my example above isn't really a good one because there already is a GetExports method specifically for this case; it wouldn't be necessary to query the Parts property like that. Perhaps a better example would involve export.Metadata.