Does SQL Server maintain statistics for each table on read, write, update times etc?
What we want to know is which tables our ERP application spends the most time in, so we can begin looking for ways to optimize those tables.
Well, SQL Server doesn't keep track of those statistics by table name. But you could look at DMVs like sys.dm_exec_query_stats to see which queries are taking the longest.
-- Statements from the plan cache, ordered by average CPU time per execution
SELECT [sql] = SUBSTRING
       (
           st.[text],
           (s.statement_start_offset / 2) + 1,
           (CASE s.statement_end_offset
                WHEN -1 THEN DATALENGTH(CONVERT(NVARCHAR(MAX), st.[text]))
                ELSE s.statement_end_offset
            END - s.statement_start_offset) / 2
       ),
       s.*
FROM sys.dm_exec_query_stats AS s
CROSS APPLY sys.dm_exec_sql_text(s.[sql_handle]) AS st
WHERE s.execution_count > 1
  AND st.[dbid] = DB_ID('Your_ERP_Database_Name')
ORDER BY total_worker_time * 1.0 / execution_count DESC;
Of course you can order by any metric you want, and quickly eyeball the first column to see if you spot anything that looks suspicious.
You can also look at sys.dm_exec_procedure_stats to identify procedures that are consuming high duration or reads.
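For example, something along these lines (a sketch using the documented columns; 'Your_ERP_Database_Name' is a placeholder, as above) surfaces the procedures with the highest average duration:
-- Most expensive cached procedures by average elapsed time
SELECT TOP (25)
       OBJECT_NAME(ps.[object_id], ps.database_id) AS procedure_name,
       ps.execution_count,
       ps.total_elapsed_time / ps.execution_count  AS avg_elapsed_time_us,
       ps.total_logical_reads / ps.execution_count AS avg_logical_reads
FROM sys.dm_exec_procedure_stats AS ps
WHERE ps.database_id = DB_ID('Your_ERP_Database_Name')
ORDER BY avg_elapsed_time_us DESC;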
Keep in mind that these and other DMVs are reset by various events, including reboots, service restarts, etc. So if you want to keep a running history of these metrics for trending / benchmarking / comparison purposes, you're going to have to snapshot them yourself, or get a third-party product (e.g. SQL Sentry Performance Advisor) that can help with that and a whole lot more.
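If you do snapshot them yourself, it can be as simple as scheduling an INSERT from the DMV into a history table (a minimal sketch; dbo.QueryStatsHistory is a hypothetical table you'd create up front, with a capture_time column plus the DMV's columns):
-- Append the current contents of the DMV, stamped with the collection time
INSERT INTO dbo.QueryStatsHistory
SELECT SYSDATETIME() AS capture_time, qs.*
FROM sys.dm_exec_query_stats AS qs;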
Disclaimer: I work for SQL Sentry.
You could create a SQL Server Audit as per the following link:
http://msdn.microsoft.com/en-us/library/cc280386(v=sql.105).aspx
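A hedged sketch of what that could look like (the audit name, file path, and database name are placeholders; run the server-level statements in master):
-- Server-level audit target
CREATE SERVER AUDIT ErpTableAudit
    TO FILE (FILEPATH = 'D:\Audits\');
ALTER SERVER AUDIT ErpTableAudit WITH (STATE = ON);

-- Database-level specification capturing DML against the ERP database
USE Your_ERP_Database_Name;
CREATE DATABASE AUDIT SPECIFICATION ErpTableAccess
    FOR SERVER AUDIT ErpTableAudit
    ADD (SELECT, INSERT, UPDATE, DELETE ON DATABASE::Your_ERP_Database_Name BY public)
    WITH (STATE = ON);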
SQL Server does capture the information you're asking about, but it's on a per index basis, not per table - look in sys.dm_db_index_operational_stats and sys.dm_db_index_usage_stats. You'll have to aggregate the data based on object_id to get table information. However, there are caveats - for example, if an index is not used (no reads and no writes), it won't show up in the output. These statistics are reset on instance restart, and there's a bug that causes them to be reset in index_usage_stats when an index is rebuilt (https://connect.microsoft.com/SQLServer/feedback/details/739566/rebuilding-an-index-clears-stats-from-sys-dm-db-index-usage-stats). And, there are notable differences between the outputs from the DMVs - check out Craig Freedman's post for more information (http://blogs.msdn.com/b/craigfr/archive/2008/10/30/what-is-the-difference-between-sys-dm-db-index-usage-stats-and-sys-dm-db-index-operational-stats.aspx).
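To roll the per-index numbers up to the table level, a query along these lines works (a sketch against the current database using sys.dm_db_index_usage_stats; the caveats above about unused indexes and resets still apply):
-- Reads vs. writes per user table since the stats were last reset
SELECT t.[name] AS table_name,
       SUM(us.user_seeks + us.user_scans + us.user_lookups) AS reads,
       SUM(us.user_updates)                                 AS writes
FROM sys.dm_db_index_usage_stats AS us
JOIN sys.tables AS t
  ON t.[object_id] = us.[object_id]
WHERE us.database_id = DB_ID()
GROUP BY t.[name]
ORDER BY SUM(us.user_seeks + us.user_scans + us.user_lookups + us.user_updates) DESC;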
The bigger question is, what problem are you trying to solve by having this information? I would agree with Aaron that finding queries that are taking a long time is a better place to start in terms of optimization. But, I wanted you to be aware that SQL Server does have this information.
We use sp_WhoIsActive from Adam Machanic's blog.
It gives us a snapshot of what is currently going on on the server, and which execution plan each statement is using.
It is easy to use and free of charge.
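A typical call, assuming the procedure is installed (@get_plans is the documented parameter that attaches the execution plan of each running statement to the output):
-- Show active requests along with their execution plans
EXEC sp_WhoIsActive @get_plans = 1;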
I have a complex Postgres query that I've optimised with pg_hint_plan. Planning time is about 150 ms while execution time is about 30 ms. The plan should never change, so there's no point in doing all that planning work each and every time the query runs. The structural problem with the query is that it hits too many tables.
Tweaking the join_collapse_limit and from_collapse_limit settings has limited effect.
Most 'enterprise' databases have a shared query cache, but as far as I can see Postgres does not.
What are ways around this? Prepared statements aren't really suitable as their lifetime is bound to the connection.
There is no way around this. The best solution I can think of is to use connection pooling, so that your connections live for a long time, and use a prepared statement.
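A minimal sketch of that combination, with placeholder table, column, and statement names; the statement is prepared once when the pooled connection is set up and then re-executed without paying the full planning cost on every call:
-- Prepared once per (long-lived, pooled) connection
PREPARE report_query(int, date) AS
    SELECT o.id, o.total
    FROM   orders AS o
    WHERE  o.customer_id = $1
      AND  o.created_at >= $2;

-- Re-executed cheaply for each request
EXECUTE report_query(42, DATE '2024-01-01');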
No, PostgreSQL does not have a plan cache area like Microsoft SQL Server or Oracle...
It is one of the many differences from commercial RDBMSs such as SQL Server... For the latter, a complete comparison can be read here
I was wondering if you could tell me which NoSQL database or technology/tools I should use for my scenario. We are looking at replacing our OLAP cubes based on SQL Server Analysis Services with an open-source technology, because the data is getting too huge to manage and queries are taking too long to return. We have followed every rule in the book to shard the data and optimize the design of the cube using aggregations, partitions, etc., and still some of our distinct count queries take 1-2 minutes :( The data size of our fact table is roughly around 250 GB, and there are 10-12 dimensions connected in star-schema fashion.
So we decided to give open-source technologies like Hadoop/HBase/NoSQL databases a try, to see if they can solve our OLAP scenarios with minimal setup and onboarding.
Our main requirements for the new technology are
It has to get blazing fast or instantaneous results for distinct count queries ( < 2 secs)
Supports the concept of measures and dimensions (like in OLAP).
Support SQL like query language as many of our developers are SQL experts.
Ability to connect Excel/Tableau to visualize the data.
As there are so many new technologies and tools in the open-source world today, I was hoping you could help point me in the right direction.
Note: I'm from the Apache Kylin team.
Please refer to the answers below, which may give you some ideas:
Our main requirements for the new technology are
It has to get blazing fast or instantaneous results for distinct count queries ( < 2 secs)
--Luke: A 90th-percentile query latency of under 5 s is our current statistic. For < 2 s on distinct count, how much data will you have? Is an approximate result OK?
Supports the concept of measures and dimensions (like in OLAP).
--Luke: Kylin is a pure OLAP engine with dimension (hierarchies are also supported) and measure (Sum/Count/Min/Max/Avg/DistinctCount) definitions.
Support SQL like query language as many of our developers are SQL experts.
--Luke: Kylin supports an ANSI SQL interface (most SELECT functions).
Ability to connect Excel/Tableau to visualize the data.
--Luke: Kylin has an ODBC driver that works very well with Tableau; Excel/Power BI support is coming soon.
Please let us know if you have more questions.
Thanks.
Looks like "Kylin" http://www.kylin.io/ is my answer. This has all the requirements that I wanted and even more. I'm gonna give it a try now! :)
I have an application written on Play Framework 1.2.4 with Hibernate(default C3P0 connection pooling) and PostgreSQL database (9.1).
Recently I turned on slow query logging (>= 100 ms) in postgresql.conf and found some issues.
But when I tried to analyze and optimize one particular query, I found that it is blazing fast in psql (0.5 - 1 ms) in comparison to 200-250 ms in the log. The same thing happened with the other queries.
The application and database server are running on the same machine and communicate over the localhost interface.
JDBC driver - postgresql-9.0-801.jdbc4
I wonder what could be wrong, because the query duration in the log is calculated from database processing time only, excluding external factors like network round trips.
Possibility 1: If the slow queries occur occasionally or in bursts, it could be checkpoint activity. Enable checkpoint logging (log_checkpoints = on), make sure the log level (log_min_messages) is 'info' or lower, and see what turns up. Checkpoints that're taking a long time or happening too often suggest you probably need some checkpoint/WAL and bgwriter tuning. This isn't likely to be the cause if the same statements are always slow and others always perform well.
Possibility 2: Your query plans are different because you're running them directly in psql while Hibernate, via PgJDBC, will at least sometimes be doing a PREPARE and EXECUTE (at the protocol level so you won't see actual statements). For this, compare query performance with PREPARE test_query(...) AS SELECT ... then EXPLAIN ANALYZE EXECUTE test_query(...). The parameters in the PREPARE are type names for the positional parameters ($1,$2,etc); the parameters in the EXECUTE are values.
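For example (a sketch; the table, column, and parameter are placeholders for your actual query):
-- Plan as PgJDBC would see it
PREPARE test_query(text) AS
    SELECT * FROM users WHERE email = $1;
EXPLAIN ANALYZE EXECUTE test_query('someone@example.com');

-- Plan as psql sees it for the one-off statement
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'someone@example.com';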
If the prepared plan is different to the one-off plan, you can set PgJDBC's prepare threshold via connection parameters to tell it never to use server-side prepared statements.
This difference between the plans of prepared and unprepared statements should go away in PostgreSQL 9.2. It's been a long-standing wart, but Tom Lane dealt with it for the upcoming release.
It's very hard to say for sure without knowing all the details of your system, but I can think of a couple of possibilities:
The query results are cached. If you run the same query twice in a short space of time, it will almost always complete much more quickly on the second pass. PostgreSQL maintains a cache of recently retrieved data for just this purpose. If you are pulling the queries from the tail of your log and executing them immediately this could be what's happening.
Other processes are interfering. The execution time for a query varies depending on what else is going on in the system. If the queries are taking 100ms during peak hour on your website when a lot of users are connected but only 1ms when you try them again late at night this could be what's happening.
The point is you are correct that the query duration isn't affected by which library or application is calling it, so the difference must be coming from something else. Keep looking, good luck!
There are several possible reasons. First, if the database was very busy when the slow queries were executed, they may simply have run more slowly, so you may want to record the OS load at that moment for later analysis.
Second, the historical plan for the SQL may differ from the plan in your current session, so you may need to install auto_explain to see the actual plan used by the slow query.
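A minimal sketch of enabling auto_explain for the current session (these are standard auto_explain settings; loading it may require superuser, and it can also be preloaded for all sessions via shared_preload_libraries):
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 100;  -- log plans for statements slower than 100 ms
SET auto_explain.log_analyze = on;        -- include actual run times, not just estimates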
I am new to DB2. I want to select around 2 million rows with a single query, in such a way that it selects and displays the first 5000 rows and then, in a background process, fetches the next 5000, and keeps going until all the data has been read. Please help me out with how to write such a query, or which function to use.
Sounds like you want what's known as blocking. However, this isn't actually handled (not the way you're thinking of) at the database level - it's handled at the application level. You'd need to specify your platform and programming language for us to help there. Although if you're expecting somebody to actually read 2 million rows, it's going to take a while... At one row a second, that's 23 straight days.
The reason that SQL doesn't really perform this 'natively' is that it's (sort of) less efficient. Also, SQL is (by design) set up to operate over the entire set of data, both conceptually and syntactically.
You can use one of the newer features, which brings paging in the style of Oracle or MySQL: https://www.ibm.com/developerworks/mydeveloperworks/blogs/SQLTips4DB2LUW/entry/limit_offset?lang=en
At the same time, you can influence the optimizer with OPTIMIZE FOR n ROWS and FETCH FIRST n ROWS ONLY. If you are only going to read, it is better to add the FOR READ ONLY clause to the query; this increases concurrency, and the cursor will not be updatable. Also, choose a suitable isolation level; for this case you could use uncommitted read (WITH UR). Issuing a LOCK TABLE beforehand can also help.
Do not forget common practices like an index or clustered index, retrieving only the necessary columns, etc., and always analyze the access plan via the Explain facility.
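Putting those pieces together, one block of 5000 rows might look like the sketch below (table, column, and key names are placeholders; the key-based WHERE clause is what moves the application from one block to the next, and the clause order follows the DB2 LUW select-statement syntax):
-- Fetch the next block of 5000 rows, keyed on the last ORDER_ID already read
SELECT ORDER_ID, CUSTOMER_ID, ORDER_DATE
FROM MYSCHEMA.ORDERS
WHERE ORDER_ID > ?              -- last key from the previous block (0 for the first block)
ORDER BY ORDER_ID
FETCH FIRST 5000 ROWS ONLY
FOR READ ONLY
OPTIMIZE FOR 5000 ROWS
WITH UR;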
How can I find a query that has been running for a long time, and what are the steps for tuning it? (Oracle)
Run EXPLAIN PLAN FOR SELECT .... to see what Oracle is doing with your query.
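For example (a sketch; your_table, some_column, and the bind are placeholders):
EXPLAIN PLAN FOR
    SELECT * FROM your_table WHERE some_column = :some_value;

-- Display the plan that was just explained
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);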
Post your query here so that we can look at it and help you out.
Check out the Oracle Performance Tuning FAQ for some tricks-of-the-trade, if you will.
You can capture the query by selecting from v$sql or v$sqltext.
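A sketch of that, using the standard v$session / v$sql columns to list active sessions with their current SQL, longest-running call first:
SELECT s.sid,
       s.serial#,
       s.username,
       s.last_call_et AS seconds_in_current_call,
       q.sql_id,
       q.sql_text
FROM   v$session s
JOIN   v$sql q
  ON   q.sql_id = s.sql_id
 AND   q.child_number = s.sql_child_number
WHERE  s.status = 'ACTIVE'
AND    s.username IS NOT NULL
ORDER BY s.last_call_et DESC;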
If you are not familiar with it, look up 'Explain Plan' in the Oracle documentation. There should be plenty on it in the performance tuning guide.
Have a look at Quest Software's Toad for a third-party tool that helps in this area too.
K
Unfortunately your question is not expressed clearly. The other answers have already tackled the issue of tuning a known bad query, but another interpretation is that you want to monitor your database to find poorly performing queries.
If you don't have Enterprise Edition with the Diagnostics Pack - and not many of us do - your best bet is to run Statspack snapshots on a regular basis. This will give you a lot of information about your system, including which queries take a long time to complete and which queries consume a lot of your system's resources. You can find out more about Statspack here.
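The mechanics are simple once Statspack is installed (a sketch, run from SQL*Plus as the PERFSTAT user):
-- Take a snapshot now; schedule this (e.g. hourly) for trending
EXEC statspack.snap;

-- Later, generate a report between two snapshot IDs
-- @?/rdbms/admin/spreport.sql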
If you do not want to use OEM, then you can find this out with queries.
First, find the long-running query. If it's currently executing, you can query gv$session to find which sessions have been running for a long time (look at the last_call_et column), then go to gv$sql for the SQL details. If the SQL was executed some time in the past, you can use the dba_hist_snapshot, dba_hist_sqlstat, and dba_hist_sqltext views to find the offending SQL.
Once you have the query, check which plan it is picking: from dba_hist_sql_plan if it executed in the past, or from gv$sql_plan if it's currently executing.
Now analyze the execution plan and see whether it's using the right indexes, join methods, etc.
If not, tune those.
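As a concrete example of mining those dba_hist views (they require the Diagnostics Pack license, and FETCH FIRST assumes 12c or later), a sketch like this lists the SQL that consumed the most elapsed time across the retained snapshots:
SELECT st.sql_id,
       SUM(st.elapsed_time_delta) / 1e6 AS elapsed_seconds,
       SUM(st.executions_delta)         AS executions
FROM   dba_hist_sqlstat st
GROUP  BY st.sql_id
ORDER  BY elapsed_seconds DESC
FETCH  FIRST 20 ROWS ONLY;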
Let me know which step you have a problem with; I can help you with those.