Calculate the time a "SELECT query with huge result" takes in PostgreSQL

I've been trying to calculate the time my queries take to complete in PostgreSQL.
I've written a bash script that issues a query with the command "psql < query1.txt > /dev/null". However, the time measured with EXPLAIN is significantly different from the time measured by my bash script.
For one of the queries, which returns 200,000+ rows, the bash script gives an average elapsed time of 13 seconds, but the JSON output of EXPLAIN says it should take 218.735 milliseconds.
Is there a way to find out where this extra time comes from?
I'm assuming this happens because of the huge number of rows returned by the query. Is there a way to test a SELECT query without outputting its rows?
Note: I've also used a Java application to measure the elapsed time; it reported 1.2 seconds, compared to the 218.735 milliseconds from EXPLAIN.
Can it be that EXPLAIN is not accurate?
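One way to separate server-side execution from client-side transfer and rendering is EXPLAIN (ANALYZE, ...), which actually runs the statement but never ships the rows to the client. A minimal bash sketch (the database name is an assumption):

# Wall-clock time, including network transfer and psql's row formatting:
time psql -d mydb -f query1.txt > /dev/null

# Server-side planning + execution only; the rows are discarded on the
# server, so transfer and formatting costs disappear:
psql -d mydb -c "EXPLAIN (ANALYZE, FORMAT JSON) $(cat query1.txt)"

The gap between the two numbers is roughly the cost of shipping and formatting the 200,000+ rows, which EXPLAIN never measures.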

Related

What is the difference between MongoDB executionTimeMillis and response time?

After reading around the internet, I am still not sure.
Using different tools, such as the mongo command line or the Robo 3T GUI, I see that my query takes about 70 ms to return results.
At the same time, if I use explain, it gives me an executionTimeMillis of 14 ms.
The connection is already established, so there should be no overhead there, and yet the difference is around 5x.
What are your thoughts?
explain.executionStats.executionTimeMillis
Total time in milliseconds required for query plan selection and query execution.
Response time:
Time between the start and the end of the query. It includes:
wait time (processor cycles, if any) + executionTimeMillis (plan selection and execution) + the time taken to return the response (up to its last byte).
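As an illustration, here is a sketch that measures both numbers side by side from the mongo shell (the database name, collection, and filter are made up):

mongo mydb --quiet --eval '
    var t0 = Date.now();
    var docs = db.orders.find({status: "A"}).toArray();   // fetch every result
    var responseMs = Date.now() - t0;                     // client-observed time
    var stats = db.orders.find({status: "A"}).explain("executionStats");
    print("response time (ms):  " + responseMs);
    print("executionTimeMillis: " + stats.executionStats.executionTimeMillis);
'

The first number covers plan selection, execution, and shipping every document back; the second stops at execution, so a ~5x gap on a result-heavy query is plausible.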

How can I benchmark query performance in PostgreSQL? CPU or time, but it needs to be consistent for every run

How can I benchmark SQL performance in PostgreSQL? I tried using EXPLAIN ANALYZE, but that gives a varied execution time every time I repeat the same query.
I am applying some tuning techniques to my query and trying to see whether each technique improves the query's performance. With EXPLAIN ANALYZE's varying execution times I can't benchmark and compare. The tuning has an impact in the range of milliseconds, so I am looking for a benchmark that gives fixed values to compare against.
There will always be variations in the time it takes a statement to complete:
Pages may be cached in memory or have to be read from disk. This is usually the source of the greatest deviations.
Concurrent processes may need CPU time
You may have to wait on internal short-term locks (latches) to access shared data structures.
These are just the first three things that come to my mind.
In short, execution time is always subject to small variations.
Run the query several times and take the median of the execution times (a sketch follows below). That is as good as it gets.
Tuning for milliseconds only makes sense if it is a query that is executed a lot.
Also, tuning only makes sense if you have realistic test data. Don't make the mistake of examining and tuning a query against a handful of test rows when it will have to perform against millions of rows.
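To put the "run it several times and take the median" advice into practice, the loop below executes the statement 20 times and reports the median server-side execution time. It assumes the statement is saved in query.sql, the database is called mydb, and a GNU userland:

for i in $(seq 1 20); do
    psql -d mydb -qAt -c "EXPLAIN (ANALYZE, FORMAT JSON) $(cat query.sql)" |
        grep -o '"Execution Time": [0-9.]*' | grep -o '[0-9.]*$'
done | sort -n | awk '{ t[NR] = $1 } END { print "median:", t[int((NR + 1) / 2)], "ms" }'

Because EXPLAIN ANALYZE discards the result set, this isolates planning and execution from client-side effects.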

Get job average elapsed time

I need to get the average elapsed time for each job in Active Job Environment in order to produce a report.
I've tried to extract it from SMF records, but I don't seem to get the right one. I've also tried keystroke language, but it's too slow: the job takes around 15 minutes to collect all the data. I thought about using CTMJSA, but since I only have examples that UPDATE and DELETE the statistics, I thought it would be wiser not to use it.
There must be a file that loads the Statistics Screen, and I'd like to ask if anyone knows which one it is, or how I could get that information.
Thank you!!
ctmruninf is a better utility to use in this case. I use it on Unix to produce total numbers (via Perl), but you should be able to adapt it to the mainframe and get averages. To list everything between fixed dates, do:
ctmruninf -list 20151101120101 20151109133301 -JOBNAME pdiscm005
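If you just need averages, a pipeline along these lines may work. Be warned that the start/end column positions below are hypothetical, since ctmruninf's listing format differs between versions, so check your own output first (gawk is assumed, for mktime):

ctmruninf -list 20151101120101 20151109133301 -JOBNAME pdiscm005 |
gawk '
function epoch(ts) {    # ts assumed to look like YYYYMMDDhhmmss
    return mktime(substr(ts,1,4) " " substr(ts,5,2) " " substr(ts,7,2) " " \
                  substr(ts,9,2) " " substr(ts,11,2) " " substr(ts,13,2))
}
{
    start = $3; finish = $4    # hypothetical columns; adjust to your listing
    sum += epoch(finish) - epoch(start); runs++
}
END { if (runs) printf "average elapsed: %.0f seconds over %d runs\n", sum / runs, runs }'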

Long runtime when a query is executed for the first time in Redshift

I noticed that the first time I run a query on Redshift, it takes 3-10 seconds. When I run the same query again, even with different arguments in the WHERE condition, it runs fast (0.2 s).
The query I'm talking about runs on a table of ~1M rows, over 3 integer columns.
Is this huge difference in execution times caused by the fact that Redshift compiles the query the first time it's run, and then re-uses the compiled code?
If so, how can I always keep this cache of compiled queries warm?
One more question:
Given queryA and queryB.
Let's assume queryA was compiled and executed first.
How similar should queryB be to queryA, such that execution of queryB will use the code compiled for queryA?
The answer to the first question is yes. Amazon Redshift compiles code for the query and caches it. The compiled code is shared across sessions in a cluster, so the same query, even with different parameters and in a different session, will run faster because the compilation overhead is gone.
They also recommend using the result of the second execution of a query for benchmarks.
The answer to this question, with details, is at the following link:
http://docs.aws.amazon.com/redshift/latest/dg/c-compiled-code.html
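Since Redshift speaks the PostgreSQL wire protocol, one pragmatic way to keep the compile cache warm is to replay your representative queries once after every deploy or cluster restart. A sketch (the host, user, database, and directory are assumptions):

# Run each representative query once so its compiled code gets cached;
# the results themselves are thrown away.
for f in warmup_queries/*.sql; do
    psql -h mycluster.example.redshift.amazonaws.com -p 5439 \
         -U admin -d mydb -f "$f" > /dev/null
done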

Executing the same query gives different execution times in PostgreSQL

I just want to know why executing the same query in PostgreSQL takes a different amount of time each run.
For example: select * from datas;
The first time it takes 45 ms.
The second time the same query takes 55 ms, and the next time it takes yet another value. Can anyone say what the reason is for this non-constant timing?
Simple: every time, the database has to read the whole table and retrieve the rows. There might be a hundred different things happening in the database that cause a difference of a few milliseconds. There is no need to panic; this is bound to happen. You can expect the operation to take the same time only to within some milliseconds of accuracy. If there is a huge difference, then it is something that has to be looked into.
Have you applied indexing to your table? It can also increase speed a great deal!
Compiling the explanations from other answers:
As matt b notes, the EXPLAIN statement displays the execution plan that the PostgreSQL planner generates for the supplied statement.
The execution plan shows how the table(s) referenced by the statement will be scanned — by plain sequential scan, index scan, etc. — and if multiple tables are referenced, what join algorithms will be used to bring together the required rows from each input table.
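For instance, using the table from the question (the plan shown is only the typical shape of the output, not a real measurement):

psql -d mydb -c "EXPLAIN SELECT * FROM datas;"
#                       QUERY PLAN
# ----------------------------------------------------------
#  Seq Scan on datas  (cost=0.00..35.50 rows=2550 width=44)
# Adding ANALYZE (EXPLAIN ANALYZE ...) actually runs the statement and
# reports measured times instead of the planner's estimates.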
And, as Pablo Santa Cruz notes:
You need to change your PostgreSQL configuration file.
Enable this property:
log_min_duration_statement = -1  # -1 is disabled, 0 logs all statements
                                 # and their durations, > 0 logs only
                                 # statements running at least this number
                                 # of milliseconds
After that, execution times will be logged and you will be able to figure out exactly how badly (or how well) your queries are performing.
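On any reasonably recent PostgreSQL (9.4+) you can make that change without editing the file by hand. A minimal sketch (the database name and log path are assumptions):

psql -d mydb -c "ALTER SYSTEM SET log_min_duration_statement = 0;"
psql -d mydb -c "SELECT pg_reload_conf();"    # apply without a restart
psql -d mydb -c "SELECT * FROM datas;" > /dev/null
tail /var/log/postgresql/postgresql.log
# Each statement now leaves a log line like:
#   LOG:  duration: 52.341 ms  statement: SELECT * FROM datas;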
Well, that's the case with just about every app on every computer. Sometimes the operating system is busier than at other times, so it takes longer to hand over the memory you ask for, or your app gets fewer CPU time slices, or whatever.