When did EF Core include batch update natively and why is people still recommending 3rd options? - entity-framework

Back in the EF6 era, bulk updates or inserts were very slow because each changing entity triggered a round trip.
Last year when using EF Core 3.1, I found this feature where it automatically batches (at the size of 42 by default) together all updates in a single roundtrip.
https://learn.microsoft.com/en-us/ef/core/performance/efficient-updating
It is working well. An AddRange() inserting ~10K records took ~10 seconds instead of 10 minutes. However, I could not find any more discussion of the feature, like when and from which version it came to exist (except for mentioning it is in EF Core), the recommendation for using it, performance implications, etc.
Questions on this topic on StackOverflow still see answers recommending 3rd party libraries. Could anyone explain why this out-of-box feature isn't more prominent? Are there any captchas? It seems to me a very important feature that deserves more attention.

Related

EF pre-generate view. How to be sure that these views are using by EF

I have several performance issue in my website.
I'm using asp.net mvc 2 and Entity Framework 4.0. I bought a Entity Framework Profiler to see what kind of SQL request that EF generated.
By example, some page take between 3 and 5 seconds to open. This is to much for my client.
To see if it's a performance problem with SQL generated by EF, I used my profiler and Copy / Paste the generated SQL in Sql Management Studio to see the execution plan and the sql statistic. The result show in less than a second.
Now that I eliminated the SQL query, I suspect EF at buidling query step.
I Follow the msdn step by step to pre-generate my view. I didn't see any performance gain.
How to be sure that my query use these Pre-Generated Views ?
Is there anything I can do to increase performance of my website ?
thanks
First of all, keep in mind that the pre-compiled queries still take just as long (in fact a little longer) the first time they are run, because the queries are compiled the first time they are invoked. After the first invocation, you should see a significant performance increase on the individual queries.
However, you will find the best answer to all performance questions is: figure out what's taking the most time first, then work on improving in that area. Until you have run a profiler and know where your system is blocking, any time you spend trying to speed things up is likely to be wasted.
Once you've determined what's taking the most time, there are a lot of possible techniques to use to speed things up:
Caching data that doesn't change often
Restructuring your data accesses so you pull the data you need in fewer round trips.
Ensuring you're not pulling more data than you need when you do your database queries.
Buying better hardware.
... and many others
One last note: In Entity Framework 5, they plan to implement automatic query caching, which will make precompiling queries practically useless. So I'd only recommend doing it where you know for sure that you'll get a significant improvement.

Entity Framework - is it suitable for Enterprise Level application?

I have a web application with:
1 Terabyte DB
200+ tables
At least 50 tables with 1+ million records each
10+ developers
1000s of concurrent users
This project is currently using Ad-Hoc Sql, which is generated by custom ORM solution.
Instead of supporting custom ORM (which is missing a lot of advanced features), I am thinking to switch to Entity Framework.
I used EF 4.1 (Code-First) on a smaller project and it worked pretty well, but is it scalable for a much larger project above?
I (highly) agree with marvelTracker (and Ayende's) thoughts.
Here is some further information though:
Key Strategy
There is a well-known cost when using GUIDs as Primary Keys. It was described by Jimmy Nilsson and it has been publicly available at http://www.informit.com/articles/article.aspx?p=25862. NHibernate supports the GUIDCOMB primary key strategy. However, to achieve that in EntityFramework is a little tricky and requires additional steps.
Enums
EntityFramework doesn’t support enums natively. Until June CTP which adds support for Enums http://blogs.msdn.com/b/adonet/archive/2011/06/30/walkthrough-enums-june-ctp.aspx the only way to map enumerations was using workarounds Please look at: How to work with Enums in Entity Framework?
Queries:
NHibernate offers many ways for querying data:
LINQ (using the re-motion’s re-linq provider, https://www.re-motion.org/web/)
Named Queries encapsulated in query objects
ICriteria/QueryOver for queries where the criteria are not known in advance
Using QueryOver projections and aggregates (In cases, we only need specific properties of an entity. In other cases, we may need the results of an aggregate function, such as average or count):
PagedQueries: In an effort to avoid overwhelming the user, and increase application responsiveness, large result sets are commonly broken into smaller pages of results.
MultiQueries that combine several ICriteria and QueryOver queries into a single database roundtrip
Detached Queries which are query objects in parts of the application without access to the NHibernate session. These objects are then executed elsewhere with a session. This is good because we can avoid complex repositories with many methods.
ISession’s QueryOver:
// Query that depends on a session:
premises = session.QueryOver<Premise>().List();
Detached QueryOver:
// Full reusable query!
var query = QueryOver.Of<Premise>();
// Then later, in some other part of ther application:
premises = query.GetExecutableQueryOver(session).List(); // Could pass IStateleSession too.
Open source
NHibernate has a lot of contribution projects available at http://sourceforge.net/projects/nhcontrib/
This project provides a number of very useful extensions to NHibernate (among other):
Cache Providers (for 2nd-level cache)
Dependency Injection for entities with no default constructor
Full-Text Search (Lucene.NET integration)
Spatial Support (NetTopologySuite integration)
Support
EntityFramework comes with Microsoft support.
NHibernate has an active community:
https://stackoverflow.com/questions/tagged/nhibernate
http://forum.hibernate.org/
http://groups.google.com/group/fluent-nhibernate
Also, have a look at:
http://www.infoq.com/news/2010/01/Comparing-NHibernate-EF-4
NHibernate is the best choice for you because it has good support of complex query, second level cacheing and great support of optimizations. I think EF is getting there. If you are dealing with Legacy systems NHibernate is the best approach.
http://ayende.com/blog/4351/nhibernate-vs-entity-framework-4-0
Suitable is an interesting term. Is it usable? Yes, and you'll find a number of nice features well suited toward rapid application development. That said, it's somewhat of a half baked technology, and lacks many advanced features of its own predecessor, LINQ to SQL (even 3 years after its first release). Here are a few annoyances:
Poor complex LINQ support
No Enum property types
Missing SQL Conversions (parse DateTime, parse int, etc.) (though you can implement these via model defined functions)
Poor SQL readability
Problems keeping multiple ssdl/csdl/msl resources independent for sharding (not really a problem with Code First)
Problems with running multiple concurrent Transactions in different ObjectContext's
Problems with Detached entity scenarios
That said, Microsoft has devoted a lot of effort to it, and hopefully it will continue to improve over time. I personally would spend time implementing a well abstracted Repository/Unit of Work pattern so your code doesn't know it's using EF at all and if necessary you can switch to another LINQ to DB provider in the future.
Most modern ORMs will be a step up from ad-hoc SQL.

How would you use EF in a typical Business Layer/Data Access Layer/Stored Procedures set up?

Whenever I watch a demo regarding the Entity Framework the demonstrator simply sets up some tables and performs Inserts, Updates and Deletes using automatically created code stubs but never shows any use of stored procedures. It seems to me that this is executing SQL from the client.
In my experience this is not particular good practice so I am presuming that my understanding of the Entity Framework is wrong.
Similarly WCF RIA Services demos use the EF and the demos are always the same. Can anyone shed any light on how you would use EF in a typical Business Layer/Data Access Layer/Stored Procedures set up.
I think I am confused and shouldn't be!!?
There's nothing wrong with executing SQL from the client. Most (if not all) of the problems that it might cause are in fact not there when using something like EF. For instance:
Client generated SQL might cause runtime syntax errors. This is not unlikely since the description of your query is mostly checked on compile time (assuming that the generator itself doesn't generate invalid SQL, which is also unlikely)
Client generated SQL might be inefficient. This is not true with modern database software which have query caches. EF works in a way that's compatible with query caches, i.e. it generates the same SQL consistently (as long as you use the same code consistently) and uses parameters for varying data.
Client generated SQL might be insecure (SQL injections and whatnot). This is all handled by the generator, which uses parameters for your values and does not interpolate user input into the query itself.
Back in the old Client / Server days, it used to be considered good practice to do all db updates using stored procedures.
But now, it's perfectly acceptable to have an O/RM generate SQL and run directly against DB.
Well, part of the reason why executing sql in stored procedures is a good idea is that it gives you a level of abstraction - when db changes inevitably occur, you make a change in a single place (the proc) rather than a dozen places (all the places where you were calling the client sql). Entity Framework provides this layer of abstraction through the data model, and you have the same advantage.
There are some other reasons why you might want to look at procs, like security granularity (only allowing certain users the right to execute), and some minor performance differences. Ultimately, you have to decide for yourself what the right trade-off is. EF is an attempt to dramatically reduce the developer time spent creating a data layer, with the trade-offs listed above.
never shows any use of stored procedures
Take a look at this video: Using Your Own Stored Procedures to Insert, Update and Delete Entities in Entity Framework.
Note that there are a lot of other videos on that topic there that are certainly worth watching!
The legend is that Scott Hanselman once said "It's not a real demo unless someone drags a datagrid" (pg 478 Silverlight 4 In Action, Pete Brown)
You have to remember that demos, are all about selling software, and not at all about communicating best practice. So your observations about the demos are absolutely correct, they cover the basics, and leave it to the observer to fill in the blanks.
As to your comment about Stored Procedures, and various answers to your question about the generator. The generator is good, and getting better. Howerver there are certain circumstances when it will generate completely unusable queries. (see my SO question here and discussed on the ADO.NET team blog)
Therefore there are occasions when hand crafted queries are your only recourse (either by way of stored proc, table value functions, views etc)

should I use Entity Framework instead of raw ADO.NET

I am new to CSLA and Entity Framework. I am creating a new CSLA / Silverlight application that will replace a 12 year old Win32 C++ system. The old system uses a custom DCOM business object library and uses ODBC to get to SQL Server. The new system will not immediately replace the old system -- they must coexist against the same database for years to come.
At first I thought EF was the way to go since it is the latest and greatest. After making a small EF model and only 2 CSLA editable root objects (I will eventually have hundreds of objects as my DB has 800+ tables) I am seriously questioning the use of EF.
In the current system I have the need many times to do fine detail performance tuning of the queries which I can do because of 100% control of generated SQL. But it seems in EF that so much happens behind the scenes that I lose that control. Article like http://toomanylayers.blogspot.com/2009/01/entity-framework-and-linq-to-sql.html don't help my impression of EF.
People seem to like EF because of LINQ to EF but since my criteria is passed between client and server as criteria object it seems like I could build queries just as easily without LINQ. I understand in WCF RIA that there is query projection (or something like that) where I can do client side LINQ which does move to the server before translation into actual SQL so in that case I can see the benefit of EF, but not in CSLA.
If I use raw ADO.NET, will I regret my decision 5 years from now?
Has anyone else made this choice recently and which way did you go?
In your case, I would still choose EF over doing it all by hand.
Why? EF - especially in .NET 4 - has matured considerably. It will allow you to do most of your database operations a lot easier and with a lot less code than if you have to all hand-code your data access code.
And in cases where you do need the absolute maximum performance, you can always plug in stored procedures for insert, update, delete which EF4 will then use instead of the default behavior of creating the SQL statements on the fly.
EF4 has a much better stored proc integration, and this really opens up the best of both worlds:
use the high productivity of EF for the 80% cases where performance isn't paramount
fine tune and handcraft stored procs for the remaining 20% and plug them into EF4
See some resources:
Using Stored Procedures for Insert, Update and Delete in an Entity Data Model
Practical Entity Framework for C#: Stored Procedures (video)
You seem to have a mix of requirements and a mix of solutions.
I normally rate each requirement with an essential, nice to have, not essential. And then see what works.
I agree with what #marc_s has said, you can have the best of both worlds.
The only other thing I would say is that if this solution is to be around for the next 5 years, have you considered Unit Testing?
There's plenty of examples on how to set this up using EF. (I personally avoid ADO.Net just because the seperating of concerns is so complicated for Unit Tests.)
There is no easy solution. I would pick a feature in your project that would take you a day or so to do. Try the different methods (raw sql, EF, EF + Stored Procs) and see what works!
Take an objective look at CSLA - invoke the 'DataPortal' and check out the call stack.
Next, put those classes on a CI build server that stores runtime data and provides a scatter plot over a series of runs.
Next, look at the code that gets created. Ask yourself how you can use things like dependecy injection in light of classes that rely on static creators with protected/private constructors.
Next, take a look at how many responsibilities the 'CSLA' classes take on.
Finally ask yourself if creating objects with different constructors per environment make sense, and ask yourself how you will unit test those.

What problems have you had with Entity Framework?

We have used Entity Framework on 2 projects both with several 100 tables.
Our experiance is mainly positive. We have had large productivity gains, compare with using Enterprise Library and stored procedures.
However, when I suggest using EF on stackoverflow, I often get negative comments.
On the negative side we have found that there is a steep learning curve for certain functionality.
Finally, to the question: What problems have people had with EF, why do they prefer other ORMS?
Like you, my experience with the EF is mostly positive. The biggest problem I've had is that very complex queries can take a long time to compile. The visual designer is also much less stable and has fewer features than the framework itself. I wish the framework would put the GeneratedCode attribute on code it generates.
I recently used EF and had a relatively good experience with it. I too see a lot of negative feedback around EF, which I think is unfortunate considering all that it offers.
One issue that surprised me was the performance difference between two strategies of fetching data. Initially, I figured that doing eager loading would be more efficient since it would pull the data via a single query. In this case, the data was an order and I was doing an eager load on 5-8 related tables. During development, we found this query to be unreasonably slow. Using SQL profiler, we watched the traffic and analyzed the resulting queries. The generated SQL statement was huge and SQL Server didn't seem to like it all that much.
To work around the issue, I reverted to a lazy-loading / on-demand mode, which resulted in more queries to the server, but a significant boost in performance. This was not what I initially expected. My take-away, which IMHO holds true for all data access implementations, is that I really need to perf test the data access. This is true regardless of whether I use an ORM or SQL procs or parameterized SQL, etc.
I use Entity Framework too and have found the following disadvantages:
I can't work with Oracle that is really necessary for me.
Model Designer for Entity Framework. During update model from database storage part is regenerated too. It is very uncomfortably.
Doesn't have support for instead of triggers in Entity framework.