Logging slow queries on Google Cloud SQL PostgreSQL instances - postgresql

The company I work for uses Google Cloud SQL to manage their SQL databases in production.
We're having performance issues and I thought it'd be a good idea (among other things) to see/monitor all queries above a specific threshold (e.g. 250ms).
By looking at the PostgreSQL documentation I think log_min_duration_statement seems like the flag I need.
log_min_duration_statement (integer)
Causes the duration of each completed statement to be logged if the statement ran for at least the specified number of milliseconds. Setting this to zero prints all statement durations.
But judging from the Cloud SQL documentation I see that is only possible to set a narrow set of database flags (as in for each DB instance) but as you can see from here log_min_duration_statement is not among those supported flags.
So here comes the question. How do I log/monitor my slow PostgreSQL queries with Google Cloud SQL? If not possible then what kind of tool/methodologies do you suggest I use to achieve a similar result?

April 3, 2019 UPDATE
It is now possible to log slow queries on Google Cloud SQL PostgreSQL instances, see https://cloud.google.com/sql/docs/release-notes#april_3_2019:
database_flags = [
{
name = "log_min_duration_statement"
value = "1000"
},
]
Once you enable log_min_duration_statement, you can view the logs using Stackdriver logging. Select Cloud SQL Database -> cloudsql.googleapis.com/postgres.log and you will see the log like this.
[103402]: [9-1] db=cloudsqladmin,user=cloudsqladmin LOG: duration: 11.211 ms statement: [YOUR SQL HERE]
References:
Full list of supported flags (CTRL+F for log_min_duration_statement): https://cloud.google.com/sql/docs/postgres/flags#postgres-l
Issue tracker: https://issuetracker.google.com/issues/74578509#comment54
PostgreSQL docs: https://www.postgresql.org/docs/9.6/runtime-config-logging.html#GUC-LOG-MIN-DURATION-STATEMENT

The possibility of monitoring slow PostgreSQL queries for Cloud SQL instances is currently not available. As you comment, the log_min_duration_statement flag is currently not supported by Cloud SQL.
Right now, work is being made on adding this feature to Cloud SQL, and you can keep track on the progress made through this link. You can click on the star icon on the top left corner to get email notifications whenever any significant progress has been achieved.

There is a way to log slow queries through the pg_stat_statements extension which is supported by Cloud SQL.
Since Cloud SQL doesn't grant superuser right to any of the users you need to use some workaround.
First, you need to enable the extension with
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
then you can check slow queries with a query like
SELECT pd.datname,
us.usename,
pss.userid,
pss.query AS SQLQuery,
pss.rows AS TotalRowCount,
(pss.total_time / 1000) AS TotalSecond,
((pss.total_time / 1000) / calls) as TotalAverageSecond
FROM pg_stat_statements AS pss
INNER JOIN pg_database AS pd
ON pss.dbid = pd.oid
INNER JOIN pg_user AS us
ON pss.userid = us.usesysid
ORDER BY TotalAverageSecond DESC
LIMIT 10;
As postgres user you can have a look on all slow queries, but since the user is not superuser you will see <insufficient privilege> on all other users' queries.
To get around this limitation you can install the extension on other databases too (normally only postgres user has rigths to install extensions) and you can check the query texts with the owner of the db.

Not ideal by any measure, but what we do is run something like this on a cron once a minute and log out the result:
SELECT EXTRACT(EPOCH FROM now() - query_start) AS seconds, query
FROM pg_stat_activity
WHERE state = 'active' AND now() - query_start > interval '1 seconds' AND query NOT LIKE '%pg_stat_activity%'
ORDER BY seconds DESC LIMIT 20
You'd need to fiddle with the query to get millisecond granularity, and even then it'll only catch queries that overlap with your cron frequency, but probably better than nothing?

Related

INSERT OPENQUERY timeout

I'm trying to execute and insert query to a linked server in SQL Server.
For that I'm using INSERT INTO OPENQUERY statement.
The linked server is an Apache HIVE using Cloudera ODBC Provider.
The insert operation takes around 1 minute in my setup when performed from HIVE client.
However, SQL INSERT always times out after 30 seconds.
I set the Query Timeout parameter to 0 but it seems to be not affecting INSERT statement, however, it is working fine for SELECT statements taking longer time.
Is this a known limitation?
Is there a way to change the timeout for the insert statement when using OPENQUERY?
EDIT
I would like to clarify the setup I'm working with.
---------- ---------------------- ---------------
| MS SQL | => Linked Server => | Hive ODBC Provider | => | Hive Server |
---------- ---------------------- ---------------
In Hive, I have a table called calc_result where I would like to periodically store calculation results from the SQL server. For example, I try to insert using a query like this.
insert openquery(HIVE, 'select timestamp timestamp , tag tag, value value from calc_result')
values('2019-04-22 11:50:41', 'test',2.0)
The insert operation is captured correctly by HIVE server and a MapReduce job starts. However, the job will be killed after 30 seconds due to timeout.
The SQL server will show the below error message.
OLE DB provider "MSDASQL" for linked server "HIVE" returned message "[Cloudera][Hardy] (72) Query execution timeout expired.".
However, SELECT OPENQUERY works fine and would follow Query Timeout settings of the linked server (Which is set to 0 in this case).
Edit that is completely different use case from what I've imagined. In that case there should not be any difference in select/insert.
As you have configured your linked server timeout, there is a second place in the linked server properties you can check a Command Timeout setting in the provider string:
Other option that comes into my mind is instance wide timout. Default set for 600 seconds (10 minutes) which is way above your 30 seconds. However, you can still try it to see if there is any impact.
For infinite wait:
sp_configure 'show advanced options',1
go
reconfigure
go
sp_configure 'remote query timeout (s)',0
go
reconfigure
go
I would try using SELECT INTO temporary table and then materializing it using regular INSERT INTO:
SELECT c1, c2
INTO #temp_tab
FROM OPENQUERY(mylinkedserver, 'SELECT c1, c2 FROM remote_table');
INSERT INTO normal_table(col1, col2)
SELECT c1, c2
FROM #temp_tab;
EDIT:
You could try wrapping it with transaction and remove aliases:
BEGIN TRAN;
insert openquery(HIVE, 'select timestamp, tag, value from calc_result')
values('2019-04-22 11:50:41', 'test',2.0);
COMMIT;
If necessary set up DTC: How can I enable distributed transactions for a linked server?
While I didn't find a way to change OPENQUERYtimeout from 30 seconds, I found that using EXEC AT Linked Server to work fine for INSERT queries while adhering to timeout settings.
I accidentally stumbled upon the solution in this 2009 blog post. Databases might not be my strength, but I feel SQL Server documentation can be improved. A simple page that lists possible ways to interact with a Linked Server could've saved me lots of retries.

SQL Server OpenQuery() behaving differently then a direct query from TOAD

The following query works efficiently when run directly against Oracle 11 using TOAD (with native Oracle drivers)
select ... from ... where ...
and srvg_ocd in (
select ocd
from rptofc
where eff_endt = to_date('12/31/9999','mm/dd/yyyy')
and rgn_nm = 'Boston'
) ...
;
The exact same query "never" returns if passed from SQL Server 2008 to the same Oracle database via openquery(). SQL Server has a link to the Oracle database using an Oracle Provider OLE DB driver.
select * from openquery( servername, '
select ... from ... where ...
and srvg_ocd in (
select ocd
from rptofc
where eff_endt = to_date(''12/31/9999'',''mm/dd/yyyy'')
and rgn_nm = ''Boston''
) ...
');
The query doesn't return in a reasonable amount of time, and the user kills the query. I don't know if it would eventually return with the correct result.
This result where the direct TOAD query works efficiently and the openquery() version "never" returns is reproducible.
A small modification to the openquery() gives the correct efficient result: Change eff_endt to trunc(eff_endt).
That is well and good, but it doesn't seem like the change should be necessary.
openquery() is supposed to be pass through, so how can there be a difference between the TOAD and openquery() behavior?
The reason we care is because we frequently develop complex queries with TOAD directly accessing Oracle. Once we have the query functioning and optimized, we convert it to an openquery() string for use in a SQL Server application. It is extremely aggravating to have a query suddenly fail with openquery() when we know it worked as a direct query. Then we have to search for a work-around through trial and error.
I would like to see the Oracle trace files for the two scenarios, but the Oracle server is within another organization, and we are not getting cooperation from the Oracle DBAs.
Does anyone know of any driver, or TOAD, or ??? issues that could account for the discrepancy? Is there any way to eliminate the problem such that both methods always give the same result?
I know you asked this a while ago but I just came across your question.
I agree, they should be the same. Obviously there is a difference. We need to find out where the difference is.
I am thinking out loud as I type...
What happens if you specify just a few column instead of select * from openquery?
How many rows are supposed to be returned?
What if, in the oracle select, you limit the returned rows?
How quickly does the openquery timeout?
Are TOAD and SS on the same machine? Are you RDPing into the SS and running toad from there?
Are they using the same drivers? including bit? (32/64) version?
Are they using the same account on oracle?
It is interesting that using the trunc() makes a difference. I assume [eff_endt] is one of the returned fields?
I am wondering if SS is getting all the rows back but it is choking on doing the date conversions. The date type in oracle may need to be converted to a ss date type before ss shows it to you.
What if you insert the rows from the openquery into a table where the date field is just a (n)varchar. I am thinking ss might just dump the date it is getting back from oracle into that text field without trying to convert it.
something like:
insert into mytable(f1,f2,f3,datetimeX)
select f1,f2,f3,datetimeX from openquery( servername, '
select f1,f2,f3,datetimeX from ... where ...
and srvg_ocd in (
select ocd
from rptofc
where eff_endt = to_date(''12/31/9999'',''mm/dd/yyyy'')
and rgn_nm = ''Boston''
) ...
');
What if toad or ss is modifying the query statement before sending it to oracle. You could fire up wireshark and see what toad and ss are actually sending.
I would be very curious if you get this resolved. I link ss to oracle often and have not run into this issue.
Here are basic things you can check for to see what the database is doing after it receives the query. First, check that the execution plans are the same in TOAD as when the query runs using openquery. You could plan the query yourself in TOAD using:
explain plan set statement_id = 'openquery_test' for <your query here>;
select *
from table(dbms_xplan.display(statement_id => 'openquery_test';
then have someone initiate the query using openquery() and have someone with permissions to view v$ tables to run:
select sql_id from v$session where username = '<user running the query>';
(If there's more than one connection with the same user, you'll have to find an additional attribute to isolate the row representing the session running the query.)
select *
from table(dbms_xplan.display_cursor('<value from query above'));
If those look the same then I'd move on to checking database waits and see what it's stuck on.
select se.username
, sw.event
, sw.p1text
, sw.p2text
, sw.p3text
, sw.wait_time_micro/1000000 as seconds_in_wait
, sw.state
, sw.time_since_last_wait_micro/1000000 as seconds_since_last_wait
from v$session se
inner join
v$session_wait sw
on se.sid = sw.sid
where se.username = '<user running the query>'
;
(again, if there's more than one session with the same username, you'll need another attribute to whittle it down to the one you're interested in.)
If the plans are different, then you need to find out why, or if they're the same, look into what it's waiting on (e.g. SQL*Net message to client ?) and why.
I noticed a difference using OLEDB through MS Access (2013) connecting to Oracle 10g & 11g tables, in that it did not always recognize indexes or primary keys on the Oracle tables properly. The same query through an MS Access 2000 database (using odbc) worked fine / had no problem with indexes & keys. The only way I found to fix the OLEDB version was to include all of the key fields in the SELECT -- which was not a satisfying answer, but it's all I could find. This might be an option to try through SSMS / OpenQuery(...) as well.
Besides that... you can try some alternatives to OPENQUERY, such as:
4-part names: SELECT ... FROM Server..Schema.Table
Execute AT: EXEC ('select...') at linked server
But as for why the OLEDB provider works differently than the native Oracle Provider -- the providers are not identical, and the native provider would be more likely to pave-over Oracle quirks than the more generic OLEDB provider would.

In Postgres SQL Server 8.4 , how to get number of request times to each table?

In Postgres SQL Server 8.4 how to get number of request time to each tables?
For example , what I want is like that
Table_Name request_time
person 50
department 20
Plz give me some guideLine.
You want to use pg_stat_statements and/or csv format logging with log_statement = all or log_min_duration_statement = 0.
There is no way to get statement statistics in a queryable form retroactively. pgFouine can help analyse logs, but only if you've configured PostgreSQL to produce detailed logs.
You probably also want to read about the statistics collector and associated views, which will help provide things like table- and index-utilisation data.

Selecting 2 tables in the same editor

In SQL I can write 2 separate SELECT statements and execute them at the same time, such as:
SELECT * FROM Table_Name
SELECT * FROM Another_Table
Is it possible to do this in PostgreSQL?
It sounds to me like you're really asking a PGAdmin question. In SQL Server's query analyzer / Enterprise Manager if you execute two select statements, two result sets will be displayed in the output area.
PGAdmin doesn't do this; you need two separate query windows to see both result sets. I know of no other free GUI admin tool for PostgreSQL. Maybe someone else will.
EDIT:
I've recently started using Squirrel-SQL and DBeaver. Either of these may be more what you're looking for. I prefer DBeaver myself.
Brian

Creating a connection from Microsoft SQL server to an AS/400

I'm trying to connect from Microsoft SQL server to as AS/400 so i can pull data from the AS/400 then flag the data as being pulled.
I've successfully created and OLE DB "IBMDASQL" connection, and am able to pull data some data, but i'm running into an issue when i try to pull data from a very large table
This runs fine, and returns a count of 170 million:
select count(*)
from transactions
This query executed for 15 hours before i gave up on it. (It should return zero since i haven't flagged anything as 'in process' yet)
select count(*)
from transactions
where processed = 'In process'
I'm a Microsoft guy, but my AS/400 guy says that there is an index on the 'processed' column and that locally, that query run instantaneously.
Any thoughts on what i might be doing wrong? I found a table with only 68 records in it, and was able to run this query in about a second:
select count(*)
from smallTable
where RandomColumn = 'randomValue'
So I know that the AS/400 is at least able to understand that type of query.
I have had to fight this battle many times.
There are two ways of approaching this.
1) Stage your data from the AS400 into SQL server where you can optimize your indexes
2) Ask the AS400 folks to create logical views which speed up data retrieval, your AS400 programmer is correct, index will help but I forget the term they use to define a "view" similar to a sql server view, I beleive its something like "physical" v/s "logical". Logical is what you want.
Thirdly, 170 million is a lot of records, even for a relational database like SQL server, have you considered running an SSIS package nightly that stages your data into your own SQL table to see if it improves performance?
I would suggest this way to have good performance, i suppose you have at least SQL2005, i havent tested yet but this is a tip
Let the AS400 perform the select in native way by creating stored procedure in the AS400
open a AS400 session
launch STRSQL
create an AS400 stored procedure in this way to get/update the recordset
CREATE PROCEDURE MYSELECT (IN PARAM CHAR(10))
LANGUAGE SQL
DYNAMIC RESULT SETS 1
BEGIN
DECLARE C1 CURSOR FOR SELECT * FROM MYLIB.MYFILE WHERE MYFIELD=PARAM;
OPEN C1;
RETURN;
END
create an AS400 stored procedure to update the recordset
CREATE PROCEDURE MYUPDATE (IN PARAM CHAR(10))
LANGUAGE SQL
RESULT SETS 0
BEGIN
UPDATE MYLIB.MYFILE SET MYFIELD='newvalue' WHERE MYFIELD=PARAM;
END
Call those AS400 SP from SQL SERVER
declare #myParam char(10)
set #myParam = 'In process'
-- get the recordset
EXEC ('CALL NAME_AS400.MYLIB.MYSELECT(?) ', #myParam) AT AS400 -- < AS400 = name of linked server
-- update
EXEC ('CALL NAME_AS400.MYLIB.MYUPDATE(?) ', #myParam) AT AS400
Hope it helps
I recommend following the suggestions in the IBM Redbook SQL Performance Diagnosis on IBM DB2 Universal Database for iSeries to determine what's really happening.
IBM technical support can also be extremely helpful in diagnosing issues such as these. Don't be afraid to get in touch with them as the software support is generally included as part of the maintenance contract and there is no charge to talk to them.
I've seen OLEDB connections eat up 100% cpu for hours and when the same query is run through VisualExplain (query analyzer) it estimates mere seconds to execute.
We found that running the query like this performed liked expected:
SELECT *
FROM OpenQuery( LinkedServer,
'select count(*)
from transactions
where processed = ''In process''')
GO
Could this be a collation problem? - your WHERE clause is testing on a text field and if the collations of the two servers don't match this clause will be applied clientside rather than serverside so you are first of all pulling all 170 million records down to the client and then performing the WHERE clause on it there.
Based on the past interactions I have had, the query should take about the same amount of time no matter how you access the data. Another thought would be if you could create a view on the table to get the data you need or use a stored procedure.