EFCore 3.0 and usage of Group By - entity-framework

I am using EF Core 3.0 and I am facing a problem with Group By using Linq.
After searching, I came to know that as per design in EF Core 3.0 the Group By is not possible on the client side. Hence when using Group By EF throws an exception.
If client side Group By is not possible, then how to use the SQL Group By?
Can anyone could give an hint of to issue Group By on the server side (i.e. in the database server)?
var results = db.TransferJobs.Include(j => j.TransferObjects)
.Where(j => j.Status == JobStatus.Succeeded)
.GroupBy(j => j.LinkedJobIdentifier).ToList();
TransferJob is a simple objet having a list of TransferObjects.

Related

Entity Framework Core 2.2 nested select generates multiple queries

I have a query similar to this:
var customers = dbContext.Customers.Select(c => new
{
FirstName = c.FirstName,
LoanStatuses = c.LoanRequests.Select(l => l.Status)
}).ToList();
When ToList() is executed, the property LoanStatuses is not materialized, like it happens in EntityFramework, but instead new queries are sent when customer.LoanStatuses is called.
I also tried to add ToList() in .Select method, as suggested on various blogs. That works well, however, 2 queries are sent instead of one (actually, since I have 8 collection properties similar to this, I get 9 queries instead of one.
Is there any way to force EF Core 2.2 to perform a single query, with all the required joins in order to return all the required data in a single hit, like in non core versions of Entity Framework?
Is there any way to force EF Core 2.2 to perform a single query, with all the required joins in order to return all the required data in a single hit, like in non core versions of Entity Framework?
No. Single query mode has been introduced in EF Core 3.0.
In EF Core 2.x you should use the aforementioned ToList in collection projections to get K + 1 queries, where K is the number of the correlated collections. This way at least you avoid the N * K + 1 queries (worst) where N is the number of records returned by the main query.
But note that for many sub collections this actually is better than single query, and EF Core 3.x suffers from that, especially with multiple collection includes.
That's why EF Core 5.0 will introduce split query option to allow bringing EFC 2.x "multi-query" mode back.
To recap, K + 1 query mode is the best you can get in EFC 2.x, and if you have many sub collections, you'd better not upgrade to EFC 3.x, but wait for EFC 5.x and keep that mode.

Accessing DB2-LUW 10 with entity framework 6

I am developing an application in which the database is selected by the end user at runtime. The database can either be on a MS SQL server or an IBM DB2 server. I am currently using IBM DB2 10 Express-c on a windows server for testing. I am developing using Visual Studio 2013 C# and Entity Framework 6. I have installed the EntityFramework.IBM.DB2 Nuget package for the DB2 support. I am using reverse-engineer code-first against an existing SQL server database to generate my base code. The application works fine against a SQL Server database.
I am using System.Data.Common.DbProviderFactories.GetFactory to generate the provider.
System.Data.EntityClient.EntityConnectionStringBuilder connectString = new System.Data.EntityClient.EntityConnectionStringBuilder(a_Connection);
System.Data.Common.DbConnection conn = System.Data.Common.DbProviderFactories.GetFactory(connectString.Provider).CreateConnection();
conn.ConnectionString = connectString.ProviderConnectionString;
LB500Database = new LB402_TestContext(conn, true);
a_Connection is provider=IBM.Data.DB2;provider connection string="Database=LISTBILL;User ID=xxxx;Password=yyyy;Server=db210:50000"
and is being parsed correctly by the EntityConnectionStringBuilder.
I then try to access a table in the database with
LBData500.LB_System oneSystem;
System.Linq.IQueryable<LB_System> allSystem = LB500Database.LB_System.Where(g => g.DatabaseVersion == databaseVersion && g.CompanyID == companyID);
I get an invalid operation exception "Sequence contains no matching element" which means that no elements are returned. If I remove the Where so that all rows are returned (there is one in the table) and try to enumerate the result set using the VS debugger I see the message:
"The context cannot be used while the model is being created. This exception may be thrown if the context is used inside the OnModelCreating method or if the same context instance is accessed by multiple threads concurrently. Note that instance members of DbContext and related classes are not guaranteed to be thread safe."
I am not using multi-threading. I am not inside the OnModelCreating.
Just changing the connect string to point to SQL server works fine, so I think my basic approach is sound. If I were getting some kind of error back from the server I would have something to go on. I can run the query from inside Visual Studio, so I have connectivity.
Any pointers would be appreciated.
UPDATE:
I turns out the EF objects were generated using EF5 and the EF6 runtime was being used. I regenerated the EF objects using EF6 reverse engineer code first. I can now connect to the database and get an error message:
"ERROR [42704] [IBM][DB2/NT64] SQL0204N \"DBO.LB_SYSTEM\" is an undefined name."
The schema in the DB2 database is the same as my userid (in this case, not always). I added the CurrentSchema=xxxx to the provide connection string, but EF is still passing dbo as the schema name.
Now I need a way to change the schema name at run time. I saw a link to codeplex EFModelAdapter (http://efmodeladapter.codeplex.com). So I may give that a try.
Update2 After looking through EFModelAdapter, I decided to take a different route. Since I only need database access and not schema management, I decided to go with Dapper (https://github.com/StackExchange/dapper-dot-net). This works great for what I need and allows me to change the schema name when accessing DB2 databases.
As per my Update 2, Entity Framework was a little overkill for what I needed. I switched to dapper https://github.com/StackExchange/dapper-dot-net and I am working fine against multiple DBMSs.

Entity Framework SQL Profiler - How to trace back to code that generated SQL

We are using entity framework 6.0 for development of our new application. All of our entity queries are generated from a DAL layer. For our current applications that are deployed to production, we use a SQL monitoring tool to track the performance of SQL queries.
My concern is how will I track down the DAL class that is generating the SQL so, I can address performance issues with the entity query. All I have from the tool is the SQL query that was generated by entity framework.
How are others tracking down SQL query issues in production? I know I can use Glimpse but how can you track back to the entity framework query that generated the SQL if you just have the raw SQL? I tried using the predicate builder to add a dummy where clause to see if that would show up in the SQL but it is ignored. like
predicate = predicate.Or(u => "methodName" == "methodName");
Thanks for the help.
You could use New Relic to instrument the application and SQL. They have a feature called "Key Transactions" which can tell you the slow transactions (really set up just for web requests, but you can theoretically get it working for other types of apps) and allow you to see the slow SQL queries within those transactions.
In order to add your data access layer in to the methods being instrumented, you can edit the instrumentation xml files as per https://docs.newrelic.com/docs/dotnet/dotnet-agent-custom-metrics
Note that the Key Transactions feature is in the premium addition, which costs. You do get that free for a while, so it might be worth a look to see if it provides enough value to you.
(I have no affiliation with New Relic, by the way.)
If you have a test suite covering the code that generates the queries, you could use this to save the generated SQL queries out to file along with the DAL method that generated them. You could do this by using the following code (taken from this SO answer regarding viewing the SQL):
var result = from x in appEntities
where x.id = 32
select x;
var sql = ((System.Data.Objects.ObjectQuery)result).ToTraceString();
If you saved these queries and method names in some structured fashion (e.g. CSV), perhaps as a side effect of the test run, you might be able to do a reverse lookup by searching this file for the query you see from production. You might have to do some normalisation, e.g. strip out all non-significant whitespace and in both cases take out the parameter assignments.

Get the latest approved member

I'm using ASP .NET MVC 2.0 with the default membership provider. How would I go about retrieving the name of the latest registered approved member?
As far as I can tell there are no 'out of the box' methods of the Membership class that allow you to do this.
You will need to run a custom query or stored procedure against the aspnet_Membership table using either ADO.NET or your preferred ORM. The key columns you need to query against are CreateDate and IsApproved.

Challenge: Getting Linq-to-Entities to generate decent SQL without unnecessary joins

I recently came across a question in the Entity Framework forum on msdn:
http://social.msdn.microsoft.com/Forums/en-US/adodotnetentityframework/thread/bb72fae4-0709-48f2-8f85-31d0b6a85f68
The person who asked the question tried to do a relatively simple query, involving two tables, a grouping, order by, and an aggregation using Linq-to-Entities. A pretty straightforward Linq query, and straightforward to do in SQL as well - the kind of stuff people try to do every day.
However, when using Linq-to-Entities the outcome is a complex query with lots of unnecessary joins etc. I tried it and wasn't able to get Linq-to-Entities to generate a decent SQL query from it if using just pure Linq against the EF entities.
Having seen a fair share of monster queries from EF I thought maybe the OP (and me, and others) are doing something wrong. Maybe there is a better way to do this?
So here's my challenge: using the example from the EF forum and using just Linq-to-Entities against the two entities, is it possible to get EF to generate a SQL query without unnecessary joins and other complexities?
I'd like to see EF generate something a little bit closer to what Linq-to-SQL does for the same kind of queries, while still using Linq against a EF model.
Restrictions: use EFv1 .net 3.5 SP1 or EFv4 (beta 1 is part of the VS2010/.net4 beta available for download from Microsoft). No CSDL->SSDL mapping tricks, model 'definingqueries', stored procs, db-side functions, or views allowed. Just plain 1:1 mapping between the model and the db and a pure L2E query that does what the original thread on MSDN asked. An association must exist between the two entities (i.e. my "workaround #1" answer to the original thread is not a valid workaround)
Update: 500pt bounty added. Have fun.
Update: As mentioned above, a solution that uses EFv4 / .net 4 (β1 or later) is of course eligible for the bounty. If you're using .net 4 post β1, please include build number (e.g. 4.0.20605), the L2E query you used, and the SQL it generated and sent to the DB.
Update: This issue has been fixed in VS2010 / .net 4 beta 2. Although the generated SQL still has a couple of [relatively harmless] extra levels of nesting, it doesn't do any of the nutty stuff it used to. The final execution plan after SQL Server's optimizer has had a go at it is now as good as it can be. +++ for the dudes and dudettes responsible for the SQL generating part of EFv4...
If I was that worried about the crazy SQL, I just wouldn't do any of the grouping in the database. I would first query all of the data I needed by finishing it off with a ToList() while using the Include function to load all the data in a single select.
Here's my final result:
var list = from o in _entities.orderT.Include("personT")
.Where(p => p.personT.person_id == person_id &&
p.personT.created >= fromTime &&
p.personT.created <= toTime).ToList()
group o by new { o.name, o.personT.created.Year, o.personT.created.Month, o.personT.created.Day } into g
orderby g.Key.name
select new { g.Key, count = g.Sum(x => x.price) };
This results in a much simpler select:
SELECT
1 AS [C1],
[Extent1].[order_id] AS [order_id],
[Extent1].[name] AS [name],
[Extent1].[created] AS [created],
[Extent1].[price] AS [price],
[Extent4].[person_id] AS [person_id],
[Extent4].[first_name] AS [first_name],
[Extent4].[last_name] AS [last_name],
[Extent4].[created] AS [created1]
FROM [dbo].[orderT] AS [Extent1]
LEFT OUTER JOIN [dbo].[personT] AS [Extent2] ON [Extent1].[person_id] = [Extent2].[person_id]
INNER JOIN [dbo].[personT] AS [Extent3] ON [Extent1].[person_id] = [Extent3].[person_id]
LEFT OUTER JOIN [dbo].[personT] AS [Extent4] ON [Extent1].[person_id] = [Extent4].[person_id]
WHERE ([Extent1].[person_id] = #p__linq__1) AND ([Extent2].[created] >= #p__linq__2) AND ([Extent3].[created] <= #p__linq__3)
Additionally, with the example data provided, SQL Profiler only notices a 3 ms increase in duration of the SQL call.
Personally, I think that anyone that whines about not liking the output SQL of an ORM layer should go back to using Stored Procedures and Datasets. They simply aren't ready to evolve yet, and need to spend a few more years in the proverbial oven. :)
Interesting discussion. I have used 2 ORM models so far (NHibernate and LINQ-to-Entities). In my experience, there is always a line where you have to give up on ORM to generated SQL and resort back to stored procedures or views to achieve best scalable queries. Having said that, I personally think that LINQ works better on more normalized databases and all the nested queries/joins are not a major issue. There are some cases where, in order to increase performance or scalability, you have to use DB server features (indexed views for example on SQL 2008 SE works only with query hints) and you simply cannot use an ORM (except iBatis?).
Granted that you won't get the best performance or scalability by using these nested joins/queries generated by linq but please don't forget the advantages and development benefits given by LINQ (or NHibernate) in any project. Surely there must be some merit to it.
Finally, although I risk comparing apple and oranges but isn't think more like asking: Do you want rapid website development (asp.net webforms, swing) or more control on your HTML (asp.net mvc, RoR)? pick the thing that best suits your requirements.
My 2 cents!
The SQL that linq generates is very efficient. It may look bulky but it takes into account relations on tables and constraints etc. In my opinion you should just blindly use the linq commands and not worry about scale. There are benefits of the large queries as its automatically generated. It avoids any slip ups in relational constraints and adds its own wrappers for faults/exceptions.
If however you want to write the SQL's yourself and still want to work behind the confines of an ORM, then try iBatishttp://ibatis.apache.org/ You have to write the SQL's and joins yourself, so it gives you complete control over the backend model.
Personally, just use SQLMetal and linq. Dont worry about performance and scale, unless you need to.