How to Improve SQL Azure Query Performance - tsql

I am using EF 6 to execute a SqlQuery, which is a SELECT statement returning 47 columns with a max of 500 rows. The query has eight LEFT JOIN statements.
In my local environment, the query only takes around 300 ms, as indicated in Express Profiler.
On SQL Azure, the same query can take 4800 ms, as indicated in the Query Performance > Query Plan Details section on the SQL Azure portal. The tier is S0.
At the time of testing there was only one local session, and there are probably fewer than 10 sessions accessing SQL Azure. This is a development environment only.
What can be the causes of the difference, and what can I do to improve the performance?

On SQL Azure there are many things to consider when a query is not running as expected. Unlike a normal server, SQL Azure performance is expressed in terms of DTUs, which simply represent the amount of work that can be handled at a given performance tier. To quote MSDN: "a performance level with 5 DTUs has five times more power than a performance level with 1 DTU."
Also check the sys.dm_db_resource_stats DMV and look at the resource metrics; if any metric is consistently showing values above 90%, your performance level may need to be upgraded.
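For example, a quick check of the last hour of samples (a minimal sketch; adjust the TOP and the ordering to taste) would look like this:
SELECT TOP 20
    end_time,
    avg_cpu_percent,
    avg_data_io_percent,
    avg_log_write_percent,
    avg_memory_usage_percent
FROM sys.dm_db_resource_stats   -- one row roughly every 15 seconds, retained for about an hour
ORDER BY end_time DESC;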
You can also run the same query in a loop and capture data from sys.dm_exec_requests to look for signs of waits and blocking:
SELECT TOP 10 *
FROM sys.dm_exec_requests
WHERE session_id = <<sessionid from where you are running the query>>;


How to: Change actual execution method from "row" to "batch" - Azure SQL Server

I am having some major issues. When inserting data into my database, I am using an INSTEAD OF INSERT trigger which performs a query.
On my TEST database, this query takes much less than 1 second for insert of a single row. In production however, this query takes MUCH longer (> 30 seconds for 1 row).
When comparing the execution plans for both of them, there seem to be some CLEAR differences:
Test has: "Actual Execution Method: Batch"
Prod has: "Actual Execution Method: Row"
Test has: "Actual number of rows: 1"
Prod has: "Actual number of rows 92.000.000"
Less than a week ago, production was running similarly to test, but not anymore, sadly.
Can any of you help me figure out why?
I believe, if I can just get the same execution plan for both, it should be no problem.
Sometimes using the query hint OPTION (HASH JOIN) helps force a query plan to use batch processing mode. The following query, which uses the AdventureWorks2012 sample database, demonstrates this:
SELECT s.OrderDate, s.ShipDate, SUM(d.OrderQty), AVG(d.UnitPrice), AVG(d.UnitPriceDiscount)
FROM Demo d
JOIN Sales.SalesOrderHeader s
    ON d.SalesOrderID = s.SalesOrderID
WHERE d.OrderQty > 500
GROUP BY s.OrderDate, s.ShipDate
The above query uses row mode. With the query hint added, it uses batch mode:
SELECT s.OrderDate, s.ShipDate, SUM(d.OrderQty), AVG(d.UnitPrice), AVG(d.UnitPriceDiscount)
FROM Demo d
JOIN Sales.SalesOrderHeader s
    ON d.SalesOrderID = s.SalesOrderID
WHERE d.OrderQty > 500
GROUP BY s.OrderDate, s.ShipDate
OPTION (HASH JOIN)
You don't get to force row vs. batch processing directly in SQL Server; it is a cost-based decision in the optimizer. You can (as you have noticed) force a previously generated plan that uses batch mode. However, there is intentionally no "only use batch mode" option, because batch mode is not always the fastest. Batch mode execution is like a turbo on a car engine: it works best when you are working with larger sets of rows, and it can be slower on small-cardinality OLTP queries.
If you have a case where you sometimes process 1 row and sometimes 92M rows, then the bigger problem is the high variance in the number of rows processed by the query. That can make it very hard to find a plan that is optimal for all scenarios, whether the variance comes from parameter sensitivity or from the shape of the query plan itself. Ultimately, the solutions for this kind of problem are either to use OPTION (RECOMPILE), if the cost of compiling is far less than the cost of running a bad plan, or (as you have done) to find a specific plan in the Query Store that works well enough for all cases and force it.
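As a rough illustration of the first option (the table and parameter names here are hypothetical, not taken from the question):
-- Hypothetical parameter-sensitive query: @CustomerId matches 1 row for some
-- values and tens of millions of rows for others.
SELECT o.OrderId, o.OrderDate, o.TotalDue
FROM dbo.Orders AS o
WHERE o.CustomerId = @CustomerId
OPTION (RECOMPILE);  -- compile a fresh plan for the actual parameter value on every execution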
Hope that helps explain what is happening under the hood.
I have found a somewhat satisfying solution to my problem.
By going into the Query Store of the database, using Microsoft SQL Server Management Studio, I was able to force a specific plan for a specific query, but only if that plan had already been generated for the query.
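For reference, the same forcing can be done in T-SQL rather than through the SSMS UI; this is a minimal sketch that assumes you have already identified the query_id and plan_id in Query Store (the ids below are placeholders):
-- Locate candidate queries and their captured plans (filter on your own query text).
SELECT q.query_id, p.plan_id, qt.query_sql_text
FROM sys.query_store_query AS q
JOIN sys.query_store_query_text AS qt ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p ON p.query_id = q.query_id;

-- Force the plan that behaves well (replace the ids with your own).
EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 17;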

EntityFramework taking excessive time to return records for a simple SQL query

I have already combed through this old article:
Why is Entity Framework taking 30 seconds to load records when the generated query only takes 1/2 of a second?
but no success.
I have tested the query:
without lazy loading (not using .Include of related entities) and
without merge tracking (using AsNoTracking)
I do not think I can easily switch to compiled queries in general due to the complexity of queries and using a Code First model, but let me know if you experience otherwise...
Setup
Entity Framework '4.4' (.Net 4.0 with EF 5 install)
Code First model and DbContext
Testing directly on the SQL Server 2008 machine hosting the database
Query
- It's just returning simple fields from one table:
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Active] AS [Active],
[Extent1].[ChangeUrl] AS [ChangeUrl],
[Extent1].[MatchValueSetId] AS [MatchValueSetId],
[Extent1].[ConfigValueSetId] AS [ConfigValueSetId],
[Extent1].[HashValue] AS [HashValue],
[Extent1].[Creator] AS [Creator],
[Extent1].[CreationDate] AS [CreationDate]
FROM [dbo].[MatchActivations] AS [Extent1]
The MatchActivations table has relationships with other tables, but for this purpose using explicit loading of related entities as needed.
Results (from SQL Server Profiler)
For Microsoft SQL Server Management Studio Query: CPU = 78 msec., Duration = 587 msec.
For EntityFrameworkMUE: CPU = 31 msec., Duration = 8216 msec.!
Does anyone know, besides suggesting the use of compiled queries, if there is anything else to be aware of when using Entity Framework for such a simple query?
A number of people have run into problems where cached query execution plans, due to parameter sniffing, cause SQL Server to produce a very inefficient execution plan when running a query through ADO.NET, while running the exact same query directly from SQL Server Management Studio uses a different execution plan because some options on the query are set differently by default.
Some people have reported success in forcing a refresh of the query execution plans by running one or both of the following commands:
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
But a more long-term, targeted solution to this problem would be to use query hints like OPTIMIZE FOR and OPTION (RECOMPILE), as described in this article, to help ensure that good execution plans are chosen more consistently in the first place.
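For example, a hedged sketch of OPTIMIZE FOR on a hypothetical filtered variant of the question's query (the WHERE clause and the parameter are invented for illustration):
-- Build the plan as if @Active were 1, regardless of the value passed in at run time.
SELECT [Extent1].[Id], [Extent1].[HashValue], [Extent1].[CreationDate]
FROM [dbo].[MatchActivations] AS [Extent1]
WHERE [Extent1].[Active] = @Active
OPTION (OPTIMIZE FOR (@Active = 1));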
I think the framework is doing something funky if what you say is true, i.e. running the query in Management Studio takes half a second while Entity Framework takes 8.2 seconds. My hunch is that it's trying to do something with that 25K+ record set (perhaps bind it to something else).
Can you download NP NET profiler and profile your app once? http://www.microsoft.com/en-in/download/details.aspx?id=35370
This nifty little program is going to record every method call and its execution time, and basically give you info from under the hood on where it's spending those 7+ seconds. If that does not help, I also recommend trying out the JetBrains .NET profiler: https://www.jetbrains.com/profiler/
The previous answer suggests that the execution plan can be off, and that's true in many cases, but it's also worth looking under the hood sometimes to determine the cause.
My thanks to Kalagen and others who responded to this - I did come to a conclusion on this, but forgot about this post.
It turns out it is the number of records being returned multiplied by the processing time (LINQ/EF, I presume) to turn the raw SQL data back into objects on the client side. I set up Wireshark on the SQL server to monitor the network traffic between it and client machines post-query and discovered:
There is a constant stream of network traffic between the SQL server and the client machine
the rate of packet processing varies greatly between different machines (8x)
While that is occurring, the SQL server CPU utilization is < 25% and no resource starvation seems to be happening (working set, virtual memory, thread, handle counts, etc.)
so it is basically the constant conversion of the results back into EF objects.
The query in question, BTW, was part of a 'performance' unit test, so we ended up culling it down to a more reasonable, typical web-page load of 100 records in under 1 sec., which passes easily.
If anyone wants to chime in on the details of how Entity Framework processes records post-query, I'm sure that would be useful to know.
It was an interesting discovery that the processing time depended more heavily on the client machine than on the SQL server machine (this is an intranet application).

What is the maximum number of joins allowed in SQL Server 2012? [duplicate]

What is the maximum number of joins allowed in SQL Server 2008?
The other answers already give the direct answer to your question
Limited only by available resources
However, even if SQL Server successfully compiles a plan for your query, that doesn't mean you should use that many joins. The more joins you have, the larger the space of possible query plans becomes (it grows exponentially), and you may well get very sub-optimal plans.
For a query with 12 joins, the number of possible join orders is 28,158,588,057,600. Additionally, each join may use one of three possible physical algorithms (hash, nested loops, merge).
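For reference, that figure matches the standard count of bushy join trees over n relations (n! orderings of the leaves times the Catalan number C(n-1) of tree shapes):
n! * C(n-1) = (2n-2)! / (n-1)!  =  12! * 58,786  =  28,158,588,057,600   for n = 12
and that is before the choice of physical join algorithm for each join is even considered.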
In the book "SQL server 2005 practical troubleshooting" Cesar Galindo-Legaria says
If you are joining over 20 tables, chances are the optimizer is not
reviewing the entire search space but relying more on heuristics ....
we have seen applications that run regular queries dealing with over
100 tables. While it is possible to run such very large queries, you
really are stretching the system in these cases and should be very
careful going this far
The limitations for SQL Server are listed here
The number of tables per query is only limited by the amount of available resources.
In SQL Server 2008, the maximum number of tables you can have in a SELECT is limited only by available resources (source).
In SQL Server 2005, there was a 256 table limit for a single SELECT (source).
Though, if you're getting up to those sorts of numbers, then I'd be getting a bit concerned tbh!
For inner join, max 256 tables can be joined.
For outer join, max 2 tables can be joined.
Source: classroom training.

SQL Server 2008 R2 table access times

Does SQL Server maintain statistics for each table on read, write, update times etc?
What we want to know is which tables our ERP applications spend the most time in, so we can begin looking for ways to optimize those tables.
Well, SQL Server doesn't keep track of those statistics by table name. But you could look at DMVs like sys.dm_exec_query_stats to see which queries are taking the longest.
SELECT
    [sql] = SUBSTRING
    (
        st.[text],
        (s.statement_start_offset / 2) + 1,
        (CASE s.statement_end_offset
            WHEN -1 THEN DATALENGTH(CONVERT(NVARCHAR(MAX), st.[text]))
            ELSE s.statement_end_offset
         END - s.statement_start_offset) / 2
    ),                 -- extract the individual statement text from the cached batch
    s.*
FROM sys.dm_exec_query_stats AS s
CROSS APPLY sys.dm_exec_sql_text(s.[sql_handle]) AS st
WHERE s.execution_count > 1
    AND st.[dbid] = DB_ID('Your_ERP_Database_Name')
ORDER BY total_worker_time * 1.0 / execution_count DESC;   -- average CPU per execution
Of course you can order by any metrics you want, and quickly eyeball the first column to see if you identify anything that looks suspicious.
You can also look at sys.dm_exec_procedure_stats to identify procedures that are consuming high duration or reads.
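A minimal sketch of that (assuming you want the most expensive procedures by total elapsed time; swap in whichever metric you care about):
SELECT TOP 20
    OBJECT_NAME(ps.[object_id], ps.database_id) AS procedure_name,
    ps.execution_count,
    ps.total_worker_time,     -- total CPU, in microseconds
    ps.total_elapsed_time,    -- total duration, in microseconds
    ps.total_logical_reads
FROM sys.dm_exec_procedure_stats AS ps
WHERE ps.database_id = DB_ID('Your_ERP_Database_Name')
ORDER BY ps.total_elapsed_time DESC;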
Keep in mind that these and other DMVs reset for various events including reboots, service restarts, etc. So if you want to keep a running history of these metrics for trending / benchmarking / comparison purposes, you're going to have to snapshot them yourself, or get a 3rd party product (e.g. SQL Sentry Performance Advisor) that can help with that and a whole lot more.
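If you go the do-it-yourself route, a rough sketch (the history table name and column list here are just an example) is to copy the DMV output into your own table on a schedule, e.g. from an Agent job:
IF OBJECT_ID('dbo.ProcStatsHistory') IS NULL
    CREATE TABLE dbo.ProcStatsHistory
    (
        capture_time        DATETIME2 NOT NULL,
        database_id         INT       NOT NULL,
        [object_id]         INT       NOT NULL,
        execution_count     BIGINT    NOT NULL,
        total_worker_time   BIGINT    NOT NULL,
        total_elapsed_time  BIGINT    NOT NULL,
        total_logical_reads BIGINT    NOT NULL
    );

INSERT INTO dbo.ProcStatsHistory
SELECT SYSDATETIME(), ps.database_id, ps.[object_id], ps.execution_count,
       ps.total_worker_time, ps.total_elapsed_time, ps.total_logical_reads
FROM sys.dm_exec_procedure_stats AS ps;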
Disclaimer: I work for SQL Sentry.
You could create a SQL Server Audit as per the following link:
http://msdn.microsoft.com/en-us/library/cc280386(v=sql.105).aspx
SQL Server does capture the information you're asking about, but it's on a per index basis, not per table - look in sys.dm_db_index_operational_stats and sys.dm_db_index_usage_stats. You'll have to aggregate the data based on object_id to get table information. However, there are caveats - for example, if an index is not used (no reads and no writes), it won't show up in the output. These statistics are reset on instance restart, and there's a bug that causes them to be reset in index_usage_stats when an index is rebuilt (https://connect.microsoft.com/SQLServer/feedback/details/739566/rebuilding-an-index-clears-stats-from-sys-dm-db-index-usage-stats). And, there are notable differences between the outputs from the DMVs - check out Craig Freedman's post for more information (http://blogs.msdn.com/b/craigfr/archive/2008/10/30/what-is-the-difference-between-sys-dm-db-index-usage-stats-and-sys-dm-db-index-operational-stats.aspx).
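A hedged sketch of that aggregation for the current database (treating seeks, scans, and lookups as reads and user_updates as writes):
SELECT
    OBJECT_NAME(ius.[object_id]) AS table_name,
    SUM(ius.user_seeks + ius.user_scans + ius.user_lookups) AS total_reads,
    SUM(ius.user_updates) AS total_writes
FROM sys.dm_db_index_usage_stats AS ius
WHERE ius.database_id = DB_ID()
GROUP BY ius.[object_id]
ORDER BY total_reads DESC;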
The bigger question is, what problem are you trying to solve by having this information? I would agree with Aaron that finding queries that are taking a long time is a better place to start in terms of optimization. But, I wanted you to be aware that SQL Server does have this information.
We use sp_WhoIsActive from Adam Machanic's blog.
It gives us a snapshot of what is currently going on on the server, and what execution plan the statements are using.
It is easy to use and free of charge.

SSRS report VERY SLOW in prod but SQL query runs FAST

I've spent hours troubleshooting this and I need some fresh perspective . . .
We have a relatively simple report setup in SSRS, simple matrix with columns across the top and data points going down. The SQL query behind the report is "medium" complexity -- has some subqueries and several joins, but nothing real crazy.
Report has worked fine for months and recently has become REALLY slow. Like, 15-20 minutes to generate the report. I can cut-and-paste the SQL query from the Report Designer into SQL Mgmt Studio, replace the necessary variables, and it returns results in less than 2 seconds. I even went so far as to use SQL Profiler to get the exact query that SSRS is executing, and cut-and-pasted this into Mgmt Studio; still the same thing, sub-second results. The parameters and date ranges specified don't make any difference: I can set parameters to return a small dataset (< 100 rows) or a humongous one (> 10,000 rows) and still get the same results; super-fast in Mgmt Studio but 20 minutes to generate the SSRS report.
Troubleshooting I've attempted so far:
Deleted and re-deployed the report in SSRS.
Tested in Visual Studio IDE on multiple machines and on the SSRS server, same speed (~20 minutes) both places
Used SQL Profiler to monitor the SPID executing the report, captured all SQL statements being executed, and tried them individually (and together) in Mgmt Studio -- runs fast in Mgmt Studio (< 2 seconds)
Monitored server performance during report execution. Processor is pretty darn hammered during the 20 minute report generation, disk I/O is slightly above baseline
Check the execution plans for both to ensure that a combination of parameter sniffing and/or differences in set_options haven't generated two separate execution plans.
This is a scenario I've come across when executing a query from ADO.Net and from SSMS. The problem occurred when the use of different options created different execution plans. SQL Server makes use of the parameter value passed in to attempt to further optimise the execution plan generated. I found that different parameter values were used for each of the generated execution plans, resulting in both an optimal and sub-optimal plan. I can't find my original queries for checking this at the moment but a quick search reveals this article relating to the same issue.
http://www.sqlservercentral.com/blogs/sqlservernotesfromthefield/2011/10/25/multiple-query-plans-for-the-same-query_3F00_/
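One way to check that (a sketch; the LIKE filter is a placeholder for a distinctive fragment of your report query) is to compare the set_options attribute of the cached plans:
SELECT cp.plan_handle, pa.attribute, pa.value
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY sys.dm_exec_plan_attributes(cp.plan_handle) AS pa
WHERE st.[text] LIKE '%<distinctive fragment of the report query>%'
  AND pa.attribute = 'set_options';   -- different set_options values across plan_handles indicate separately cached plans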
If you're using SQL Server 2008, there's also an alternative provided via a query hint called OPTIMIZE FOR UNKNOWN, which essentially disables parameter sniffing. Below is a link to an article that assisted my original research into this feature.
http://blogs.msdn.com/b/sqlprogrammability/archive/2008/11/26/optimize-for-unknown-a-little-known-sql-server-2008-feature.aspx
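Usage is just a hint at the end of the statement; the query below is hypothetical and only shows the shape of it:
SELECT r.SomeColumn
FROM dbo.ReportSource AS r
WHERE r.RegionId = @RegionId
OPTION (OPTIMIZE FOR UNKNOWN);  -- the plan is built from average density statistics, not from the sniffed value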
An alternative to the above for versions earlier than 2008 would be to store the parameter value in a local variable within the procedure. This would behave in the same way as the query hint above. This tip comes from the article below (in the edit).
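A minimal hypothetical sketch of that local-variable pattern inside a stored procedure (table, column, and procedure names invented for illustration):
CREATE PROCEDURE dbo.GetReportData
    @RegionId INT
AS
BEGIN
    -- Copy the parameter into a local variable; the optimizer cannot sniff its
    -- runtime value, so the plan is based on average statistics instead.
    DECLARE @LocalRegionId INT;
    SET @LocalRegionId = @RegionId;

    SELECT r.SomeColumn
    FROM dbo.ReportSource AS r
    WHERE r.RegionId = @LocalRegionId;
END;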
Edit
A little more searching has unearthed an article with a very in-depth analysis of the subject in case it's of any use, link below.
http://www.sommarskog.se/query-plan-mysteries.html
This issue has been a problem for us as well. We are running SSRS reports from CRM 2011. I have tried a number of the solutions suggested (mapping input parameters to local variables, adding WITH RECOMPILE to the stored procedure) without any luck.
This article on report server application memory configuration (http://technet.microsoft.com/en-us/library/ms159206.aspx), and more specifically adding the 4000000 value to our RSReportServer.config file, solved the problem.
Reports which would take 30-60 seconds to render now complete in less than 5 seconds, which is about the same time the underlying stored procedure takes to execute in SSMS.