I'm experiencing some performance issues with EF and was wondering... well ... why.
The query I am running is simply:
var procs = ctx.Procedures
    .Include(p => p.ProcedureProcedureFields.Select(ppf => ppf.ProcedureField))
    .Where(p => p.IsActive)
    .Where(p => !p.ProcedureLogbookTypes1.Any())
    .ToList();
So I'm not even passing in any parameters, which rules out a lot of the usual issues. If I take the SQL from SQL Profiler and run it directly in SSMS, it takes less than 1s.
The EF call takes about 12s for the procs variable to be populated.
A few more things.
I haven't just run the SQL immediately after the EF call for comparison. I've made sure there was no plan in cache. In fact, I've done both: I've run the SQL when a plan is in cache and when it isn't. Every combination of running the two yields the same result: the raw SQL takes a fraction of the time of the EF query.
I'm all for using stored procedures where the query is uber-complex and customizing the SQL is required for performance.
But the query above is simple. The SQL generated is simple.
I'd rather not litter my database with a million little stored procs, just because I cannot figure out how to make EF perform.
Is there a way to speed this up?
Thanks
You can use AsNoTracking as described by this answer (which is sourced from this article):
Entity Framework exposes a number of performance tuning options to help you optimise the performance of your applications. One of these tuning options is .AsNoTracking(). This optimisation allows you to tell Entity Framework not to track the results of a query. This means that Entity Framework performs no additional processing or storage of the entities which are returned by the query. However, it also means that you can't update these entities without reattaching them to the tracking graph.
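Applied to the original query above, a minimal sketch (assuming the same ctx context and navigation properties from the question) would be:

// Read-only query: AsNoTracking() skips change tracking, so EF does less work
// per materialized entity. Don't use it if you need to update these entities later
// without reattaching them.
var procs = ctx.Procedures
    .Include(p => p.ProcedureProcedureFields.Select(ppf => ppf.ProcedureField))
    .Where(p => p.IsActive)
    .Where(p => !p.ProcedureLogbookTypes1.Any())
    .AsNoTracking()
    .ToList();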
Related
I have taken over an existing MVC website which uses entity framework and hangfire and is hosted on Azure and uses Azure database. Every so often the website times out.
I'm new to Azure portal, entity framework and hangfire.
If I increase the DTUs, the timeout issues clear up.
I'm looking for ways to diagnose why the website times out. I have added error logging using Elmah and checked Hangfire, but this doesn't give me any further information.
Is there anything in the Azure portal that can help?
If it "times out" and if "increasing DTU resolves timeouts" and these observations are true (I think it's on you to really convince yourself this is absolutely true, don't make this assumption lightly) then the usual and obvious candidate is "a slow sql query". Entity Framework is often used with linq to create sql queries without writing sql. These queries are often fine for very simple tasks, such as someData.Where(x=>x.Id == 1).First(), however, if linq is used to join tables, or create complex associations, the generated sql can become monstrously bad, from a performance perspective. You can add logging to write out the sql generated by linq, or you can try to trace the database to see what sql is running on it. If tracing is out of the question, there are still meta queries you can use to view things like cached query plans and SQL Server can give you estimated costs and cached execution counts.
You can still hang yourself without using LINQ. You can still use stored procedures with EF. Way too many developers are still naive about SQL performance; you need to comb over your back end, learn the schema and the stored procedures, and inspect the SQL contents of everything. Check for any database triggers (easy to miss). Red flags are subqueries, too many joins, too many results from a query, lots of string manipulation in a query, joining tables on strings, or XML/JSON-based SQL work.
Be aware that "slow sql queries" will become slower when load is high. And when slow sql queries build up, they only take more time to resolve. This can also cause debilitating table locking, depending on the nature of the query.
But queries can be performant and still cause locking, i.e. one table is being written to often and it's blocking other writes or reads on that table. This is a little harder to diagnose, but you can figure it out by carefully inspecting logs of database calls and how long they take to execute. There are also SQL queries you can run against the database to diagnose long-running queries, or to see what tables are locked at a given point in time.
Finally, check for any back-end webjobs for your application. If timeouts occur at recurring days or times, then somebody's batch SQL could be blocking your production database from being read.
But this is all speculation. I think you need to do more research to determine what is actually causing the site to become unresponsive. If you can log response times for common queries, you can rule SQL-based latency in or out as the culprit and work from there. There's nothing inherently "amiss" about any of the technologies you specified.
If queries are performant but still causing issues, a long-term solution is to add something like a message queue and batch your SQL work intelligently, or just make the database work asynchronous so it doesn't block the UI.
You should correlate any logged timeouts with Azure's monitoring. Azure can give you CPU/RAM/page visits and such on the dashboard.
SQL Azure is a bit of a different beast. It doesn't have the on-demand performance of a dedicated DB unless you're prepared to throw serious $$ at it. And even then ...
EF, when used well, can perform quite well. When used poorly it can be a dog, and those problems are compounded on a platform like SQL Azure.
The first thing is to check that your EF contexts are set up to use an execution strategy suited to Azure: https://learn.microsoft.com/en-us/ef/ef6/fundamentals/connection-resiliency/retry-logic
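In EF6 that is typically a code-based configuration along these lines (a sketch; the class name is arbitrary):

using System.Data.Entity;
using System.Data.Entity.SqlServer;

// EF6 picks this up automatically when it lives in the same assembly as your DbContext.
public class AzureDbConfiguration : DbConfiguration
{
    public AzureDbConfiguration()
    {
        // Retries transient SQL Azure failures with an exponential back-off.
        SetExecutionStrategy("System.Data.SqlClient", () => new SqlAzureExecutionStrategy());
    }
}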
The next thing would be to see what kinds of SQL tracing you can run on Azure. Tracing is essential to see what EF is doing behind the scenes. I'm not familiar with the tools available for Azure; in my case, my Azure experience was running SQL Server on VMs, because SQL Azure at the time was too immature, not HIPAA compliant, and expensive for the DTU estimates we were able to get. Worst case, can you restore a database backup to a SQL Server instance and temporarily point a copy of your application environment at it to run through common usage scenarios? Using a SQL trace you can pick up exactly when and how often EF is executing queries, and what queries it is executing.
Things to look at:
How many queries are running? If you are loading a set of records and expect one query, are there a whole heap of queries getting sent? This would indicate lazy-load calls being triggered.
What queries are being run? Is it selecting a lot more fields than are being displayed? This is potentially a case where entire entities are being loaded when a .Select() could be used to reduce the amount of data (see the sketch after this list). It may even be a case where entire sets of entities are being loaded that aren't relevant to what is displayed/done, such as someone using .ToList() prior to just doing a .Count() or .Any(), or doing a .FirstOrDefault() just to do a != null check.
Is the database properly indexed? Copy some of the heavier queries into SQL Manager and execute them with an execution plan. Are there indexing suggestions?
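To illustrate the projection point, a sketch (entity and property names are made up):

// Loads full entities just to count them - every column of every row is materialized.
var openCount = ctx.Orders.Where(o => o.IsOpen).ToList().Count;

// Lets the database do the counting - one small query, nothing materialized.
var openCount2 = ctx.Orders.Count(o => o.IsOpen);

// Projection: select only the fields the screen actually shows.
var rows = ctx.Orders
    .Where(o => o.IsOpen)
    .Select(o => new { o.Id, o.CustomerName, o.Total })
    .ToList();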
The common sins of developing with EF and other ORMs boil down to "pulling too much, too often." It's surprising how many clients I've worked with have development teams that have not used a profiler to inspect their ORM use efficiency. (and I'm talking 0% so far.)
I would like to ask the Entity Framework Core team what their ambition is for the scope/complexity of query translation compared to EF6.
I've used EF6 extensively and I know that if you can express it in LINQ and don't use any untranslatable functions, EF can probably translate the query correctly.
Will EF Core's translation eventually be as good as that, or is it considered secondary, like the lazy loading feature?
If so, roughly what is the team eventually aiming at compared to EF6?
There's a ticket discussing GroupBy that appears to indicate they deem grouping an advanced type of query, but compared to what EF6 can translate, a normal group-by is pretty average.
(I'm asking here as the EF Core team says on its site that it is monitoring SO for questions.)
We took a very different approach in EF Core. Every LINQ query should work--even if you use untranslatable functions. We do this by translating the parts of the query we can into SQL and processing the rest on the client after the results are returned by the server. As EF Core evolves, we'll translate more and more of the query into SQL (e.g. GROUP BY) which can make it more efficient.
In theory, our goal is to translate everything that the store supports. In some cases however (especially on NoSQL stores) there simply is no translation for a LINQ operator, and we feel it's better to be functional and inefficient than to throw.
If you want to ensure your whole query is translated, you can disable client evaluation. This will cause it to throw like EF6.
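In the EF Core versions that do client evaluation (1.x and 2.x), that looks roughly like this (a sketch; the connection string is a placeholder):

using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Diagnostics;

public class MyContext : DbContext
{
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder
            .UseSqlServer("<connection string>")
            // Turn the "part of this query will be evaluated on the client" warning into an exception.
            .ConfigureWarnings(w => w.Throw(RelationalEventId.QueryClientEvaluationWarning));
    }
}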
I found a very interesting page: http://msdn.microsoft.com/en-us/library/cc853327.aspx
There you can see that among the query stages there is a stage named "Generating views" that costs a lot. EF provides a way to pre-compile it, but if you have many queries without pre-compiling you may still run into problems.
You can find How to: Pre-Generate Views to Improve Query Performance here: http://msdn.microsoft.com/en-us/library/bb896240.aspx
And here you can see that a query without pre-generated views takes about twice as long, so it really does cost a lot: http://blogs.msdn.com/b/appfabriccat/archive/2010/08/06/isolating-performance-with-precompiled-pre-generated-views-in-the-entity-framework-4.aspx
So I have a question: why did EF design this stage? Does NHibernate also have this stage? If so, how is its performance in NHibernate?
EF views have nothing to do with SQL views - EF views are mapping transformations compiled into executable code. EF uses these transformations to convert its query representation into the target SQL representation. The reason for this compilation is the performance of the whole application - you need to invest time in initialization, but all your subsequent queries and updates will use the compiled code instead of doing a lookup in the EDM. If you don't need to modify the mapping at runtime, you can even pre-generate those views during compilation of your application.
EF views are used for query preparation (transforming one representation into another), but the query preparation must be done for each unique query. In EF 4 this preparation is not cached unless you manually use a compiled query. In EF 4.5 and 5.0 (.NET 4.5) all queries are automatically "compiled": there is a cache, and each unique query is really prepared only once. Subsequent executions of the same query use the compiled version from the cache.
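For EF 4 on .NET 4.0 that manual step looks roughly like this (a sketch; MyEntities is a placeholder ObjectContext and Customer a placeholder entity):

using System;
using System.Data.Objects;
using System.Linq;

// Compiled once; each call reuses the prepared query instead of re-translating the LINQ expression.
static readonly Func<MyEntities, int, IQueryable<Customer>> ActiveCustomersAbove =
    CompiledQuery.Compile((MyEntities ctx, int minId) =>
        ctx.Customers.Where(c => c.IsActive && c.Id > minId));

var customers = ActiveCustomersAbove(context, 100).ToList();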
You can read more about performance and EF 5.0 in this beginner guide.
Is there a way to essentially make the EF context stateless so I can insert a bunch of POCOs and not have them remain in memory, kind of the equivalent of a stateless session in NHibernate? This is to try and improve the performance of bulk inserts. I'm going to be inserting 1.7M POCOs into a SQL Server Compact table in the first run, and then inserting/updating records on subsequent runs.
No. EF requires you to load all objects into the context (memory), and after that it will insert/update each object in a separate roundtrip to the database.
Every performance improvement is mostly based on hacking EF and trying to overcome its limitations. In such a case you can write the inserts yourself and batch them as SqlCeCommands manually - you will build such a solution faster, it will have better performance, and it will be less error prone.
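A minimal sketch of that manual approach (table, column and collection names are placeholders):

using System.Data;
using System.Data.SqlServerCe;

using (var conn = new SqlCeConnection(connectionString))   // connectionString: your SQL CE connection string
{
    conn.Open();
    using (var tx = conn.BeginTransaction())
    using (var cmd = conn.CreateCommand())
    {
        cmd.Transaction = tx;
        cmd.CommandText = "INSERT INTO Items (Name, Value) VALUES (@name, @value)";
        var pName = cmd.Parameters.Add("@name", SqlDbType.NVarChar, 100);
        var pValue = cmd.Parameters.Add("@value", SqlDbType.Int);

        foreach (var item in items)   // items: the 1.7M POCOs to insert
        {
            pName.Value = item.Name;
            pValue.Value = item.Value;
            cmd.ExecuteNonQuery();    // one command reused per row, no change-tracking overhead
        }
        tx.Commit();
    }
}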
I am using Entity Framework to layer on my SQL Server 2008 database. The EF is present in my web service and the webservice is invoked by a Silverlight client.
I am seeing a serious performance issue in terms of how long a query takes to execute in EF. This doesn't happen on consecutive calls.
A little bit of googling revealed that it's caused by the in-memory model of the DB objects being constructed once per app domain. I found this Microsoft link explaining pre-generation of views for performance improvement. Even after implementing the steps, performance actually degraded instead of improving. I am curious if anyone has tried this approach successfully, and whether there are any other avenues for improving performance.
I am using .NET 3.5.
A couple of areas to look at for EF performance:
Do as much of the processing as possible before calling things like ToList(). ToList() will bring everything in the set into memory. By default, EF will keep building the expression tree and only actually process it when you need the data in memory. The query will run against the database, but any processing applied afterwards happens in memory. When working with large data, you definitely want as much of the heavy lifting done by the database as possible.
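A quick sketch of the difference (entity names are made up):

// Bad: ToList() pulls the whole table into memory, then filters it there.
var expensive = ctx.Orders.ToList().Where(o => o.Total > 1000).ToList();

// Better: the filter stays in the expression tree, so SQL Server does the filtering.
var cheap = ctx.Orders.Where(o => o.Total > 1000).ToList();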
EF 1 only has the option to pull the entire row back. Therefore if you have a column that is a large string or binary blob, it is going to be pulled down and into memory whether you need it or not. You can create a projection that doesn't include this column, but then you don't get the benefits of having it be an entity.
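For example, a sketch (Documents and its Content blob column are made-up names):

// The large Content column stays in the database; only the metadata crosses the wire.
var docs = ctx.Documents
    .Select(d => new { d.Id, d.Title, d.CreatedOn })
    .ToList();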
You can look at the sql generated by EF using the suggestion in this post
How do I view the SQL generated by the Entity Framework?
The same laws of physics apply for EF queries as they do for ordinary SQL. Check your database tables and make sure that you have indexes on primary and foreign keys, that your database is properly normalized, and so forth. If performance is degrading after Microsoft's suggestions, then that's my guess as to the problem area.
Are you hosting the webservice in IIS? Is it running on the same site as the Silverlight App? What about the database itself? Is it running on a dedicated machine? Are there other apps hitting it? The first call to a dormant database is painful (I've had situations where it would actually time out in my environment.)
There are a number of factors to take into consideration here. But it comes down to more than just EF's overhead.
Edit: I didn't fully qualify this, but the process of opening the first connection to SQL Server is slow regardless of your data access solution.
Use SQL Profiler to check how many queries are executed to retrieve your data. If it's a large number, use the Include() method of ObjectQuery to retrieve child objects with the parent in one query.
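For instance, in the ObjectQuery style (a sketch; entity and navigation property names are made up):

// Without Include: one query for the orders plus one lazy-load query per order for its lines (N+1).
// With Include: a single joined query.
var orders = ctx.Orders
    .Include("OrderLines")
    .Where(o => o.CustomerId == customerId)
    .ToList();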