Postgres statement_timeout does not work as expected

I am working on a use case where, in my current API application, I need to kill any query that has been running for more than 30 seconds (my server has a 30-second timeout, but the query keeps running on Postgres).
After some research I came across the statement_timeout configuration in Postgres and implemented it in my SQLAlchemy code like this:
from contextlib import contextmanager
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

@contextmanager
def db_session():
    """Executes the query."""
    from my_aws import secretsmanager

    secret_name = '<my_secret_key>'
    secret = secretsmanager.get_secret(secret_name)
    conn = f'{secret["dbname"]}://{secret["username"]}:{secret["password"]}@' \
           f'{secret["host"]}:{secret["port"]}/{secret["dbname"]}'
    eng = create_engine(
        conn,
        connect_args={'options': '-c statement_timeout=30s'})
    connection = eng.connect()
    db_session = scoped_session(sessionmaker(autocommit=False, autoflush=True, bind=eng))
    yield db_session
    db_session.close()
    connection.close()
So my expectation was that any query which cannot complete within 30 seconds would time out and return an error.
To test this, I placed a lock on one of my tables to delay my queries:
BEGIN WORK;
LOCK TABLE <schema>.<table_name> IN ACCESS EXCLUSIVE mode;
Then I trigger an API call which queries the locked table. As expected, the API does not respond, because the query cannot execute within 30 seconds.
However, the query does not terminate, and I can still see it running in pg_stat_activity:
SELECT pid, age(clock_timestamp(), query_start), usename, query
FROM pg_stat_activity
WHERE query != '<IDLE>' AND query NOT ILIKE '%pg_stat_activity%' and usename='api_user'
ORDER BY query_start desc;
The above query gives this response:
pid |age |usename |query
----|---------------|--------|-------------------------------
3334|00:05:17.962059|api_user|SELECT count(*) AS count_1 ¶FRO
1752|00:05:22.577919|api_user|COMMIT
1754|00:05:22.627446|api_user|COMMIT
3270|00:05:22.791417|api_user|SELECT count(*) AS count_1 ¶FRO
1755|00:05:23.058261|api_user|COMMIT
1753|00:05:23.123582|api_user|COMMIT
1689|00:05:24.149163|api_user|SELECT count(*) AS count_1 ¶FRO
1759|00:05:24.579171|api_user|SELECT DISTINCT sum(public.dema
1760|00:05:24.631371|api_user|SELECT count(*) AS count_1 ¶FRO
As you can see, the queries on the locked table have been waiting for more than 5 minutes.
Is there something wrong with my understanding of statement_timeout here?
FYI: I can see that the timeout is set on Postgres, as shown by this query:
show statement_timeout;
Result:
statement_timeout|
-----------------|
30s |

I recommend that you set the parameter in postgresql.conf (then it is valid for the whole PostgreSQL server) or with ALTER DATABASE (then it is valid only for new connections to that database).
If that does not do the trick, the setting must be overridden somewhere. To debug, run the following SQL statement using SQLAlchemy:
SELECT current_setting('statement_timeout');
However, when I look at your query, perhaps everything is working anyway: add the state column to the pg_stat_activity query and check if the state is indeed active. Perhaps the query has already been canceled, and the state is idle or idle in transaction (aborted) (note that query shows the last query on that connection, which need not be active any more).
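For example, here is a minimal sketch of both checks with SQLAlchemy, assuming an engine eng created as in your code (api_user is the user from your pg_stat_activity output):
from sqlalchemy import text

with eng.connect() as check_conn:
    # What the server actually applies for this connection
    print(check_conn.execute(
        text("SELECT current_setting('statement_timeout')")).scalar())

    # state = 'active' means the query really is still running;
    # 'idle' or 'idle in transaction (aborted)' means it already finished or was cancelled
    rows = check_conn.execute(text(
        "SELECT pid, state, age(clock_timestamp(), query_start) AS age, query "
        "FROM pg_stat_activity "
        "WHERE usename = 'api_user' "
        "ORDER BY query_start DESC"))
    for row in rows:
        print(row)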

I think statement_timeout should be a value in milliseconds. If you are really passing in 30s, that might be the wrong parameter value; try 30000 for 30 seconds.
eng = create_engine(
    conn,
    connect_args={'options': '-c statement_timeout=30000'})

Related

If calling PQfinish, is there any benefit of calling PQcancel?

If calling PQfinish, which "closes the connection to the server", is there any benefit to calling PQcancel beforehand?
I.e. if the connection is closed, would the PostgreSQL server cancel any in-progress queries on this connection, just as it would with PQcancel?
I've given this a test with just a SELECT pg_sleep(120) query. Even after PQfinish is called, connecting via psql and running
SELECT pid, age(clock_timestamp(), query_start), usename, query
FROM pg_stat_activity
WHERE query != '<IDLE>' AND query NOT ILIKE '%pg_stat_activity%'
ORDER BY query_start desc
still showed the query to be running.
So I think there is a benefit to calling PQcancel: it increases the chance of queries being cancelled and reduces resource use on the server.
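For what it's worth, the same cancel-before-close order can be sketched from Python with psycopg2, whose conn.cancel() and conn.close() correspond roughly to PQcancel and PQfinish; the connection string and timings below are purely illustrative:
import threading
import time

import psycopg2
import psycopg2.extensions

conn = psycopg2.connect("dbname=test")   # illustrative connection string

def cancel_soon():
    time.sleep(5)
    conn.cancel()                        # roughly PQcancel: ask the server to stop the running query

t = threading.Thread(target=cancel_soon)
t.start()

cur = conn.cursor()
try:
    cur.execute("SELECT pg_sleep(120)")  # long-running query to be cancelled
except psycopg2.extensions.QueryCanceledError:
    print("query was cancelled on the server")
finally:
    t.join()                             # make sure the cancel request has been sent
    conn.close()                         # roughly PQfinish: close the connection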

INSERT OPENQUERY timeout

I'm trying to execute an insert query against a linked server in SQL Server.
For that I'm using an INSERT INTO OPENQUERY statement.
The linked server is an Apache HIVE using Cloudera ODBC Provider.
The insert operation takes around 1 minute in my setup when performed from HIVE client.
However, SQL INSERT always times out after 30 seconds.
I set the Query Timeout parameter to 0, but it does not seem to affect the INSERT statement; it works fine for SELECT statements that take longer.
Is this a known limitation?
Is there a way to change the timeout for the insert statement when using OPENQUERY?
EDIT
I would like to clarify the setup I'm working with.
MS SQL => Linked Server => Hive ODBC Provider => Hive Server
In Hive, I have a table called calc_result where I would like to periodically store calculation results from the SQL server. For example, I try to insert using a query like this:
insert openquery(HIVE, 'select timestamp timestamp , tag tag, value value from calc_result')
values('2019-04-22 11:50:41', 'test',2.0)
The insert operation is captured correctly by the Hive server and a MapReduce job starts. However, the job is killed after 30 seconds due to the timeout.
SQL Server shows the error message below.
OLE DB provider "MSDASQL" for linked server "HIVE" returned message "[Cloudera][Hardy] (72) Query execution timeout expired.".
However, SELECT from OPENQUERY works fine and follows the Query Timeout setting of the linked server (which is set to 0 in this case).
Edit: that is a completely different use case from what I had imagined. In that case there should not be any difference between SELECT and INSERT.
Since you have already configured the linked server timeout, there is a second place to check in the linked server properties: the Command Timeout setting in the provider string.
Another option that comes to mind is the instance-wide 'remote query timeout'. It defaults to 600 seconds (10 minutes), which is well above your 30 seconds, but you can still try it to see if there is any impact.
For infinite wait:
sp_configure 'show advanced options',1
go
reconfigure
go
sp_configure 'remote query timeout (s)',0
go
reconfigure
go
I would try using SELECT INTO a temporary table and then materializing it with a regular INSERT INTO:
SELECT c1, c2
INTO #temp_tab
FROM OPENQUERY(mylinkedserver, 'SELECT c1, c2 FROM remote_table');
INSERT INTO normal_table(col1, col2)
SELECT c1, c2
FROM #temp_tab;
EDIT:
You could try wrapping it in a transaction and removing the aliases:
BEGIN TRAN;
insert openquery(HIVE, 'select timestamp, tag, value from calc_result')
values('2019-04-22 11:50:41', 'test',2.0);
COMMIT;
If necessary set up DTC: How can I enable distributed transactions for a linked server?
While I didn't find a way to change the OPENQUERY timeout from 30 seconds, I found that using EXEC ... AT <linked server> works fine for INSERT queries while respecting the timeout settings.
I accidentally stumbled upon the solution in this 2009 blog post. Databases might not be my strength, but I feel the SQL Server documentation could be improved; a simple page listing the possible ways to interact with a linked server could have saved me lots of retries.

Is postgres caching my query?

I have a pretty simple snippet of Python code that runs a Postgres query and then sends the results to a dashboard. I'm using psycopg2 to periodically run the same query. Let's not worry about the looping mechanism for now.
conn = psycopg2.connect(<connection info>)

while True:
    # Run query and update dashboard
    cur = conn.cursor()
    cur.execute(q_tcc)
    query_results = cur.fetchall()
    update_dashboard(query_results)
    time.sleep(5)
For reference, the actual query is :
q_tcc = """SELECT client_addr, application_name, count(*) cnt FROM pg_stat_activity
GROUP BY client_addr, application_name ORDER BY cnt DESC;"""
When I run this, I keep getting the same results even though they should be changing. If I move the psycopg2.connect() line into the loop with a conn.close(), everything works fine. According to the connection and cursor docs, however, I should be able to keep using the same cursor (and, therefore, connection) the whole time.
Does this mean Postgres is caching my query on a per-client-connection basis?
PostgreSQL doesn't have a query cache.
However, if you're using SERIALIZABLE isolation, you might be seeing the same snapshot of the data, since you appear to do all your queries within a single transaction.
You should really commit (or roll back) the transaction after each query in your loop, for example with conn.rollback().
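For example, a minimal sketch of the corrected loop, reusing conn, q_tcc and update_dashboard from your snippet:
while True:
    cur = conn.cursor()
    cur.execute(q_tcc)
    query_results = cur.fetchall()
    conn.rollback()          # end the read-only transaction so the next
                             # iteration sees fresh pg_stat_activity data
    update_dashboard(query_results)
    time.sleep(5)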

Why psycopg2 returns same result for repeated SELECT?

When I execute an ordinary SELECT, correct results are returned, but when I execute this SELECT for DB uptime, it returns the same first result every time. I checked the Postgres logs and I can see that the SELECT is executed.
#!/usr/bin/python3
import psycopg2
from time import sleep

conn = psycopg2.connect("dbname='MyDB' user='root' host='127.0.0.1' password='********'")
cur = conn.cursor()

def test():
    e = 0
    while e != 100:
        cur.execute("SELECT date_trunc('second', current_timestamp - pg_postmaster_start_time()) as uptime;")
        uptv = cur.fetchone()
        print(uptv)
        e += 1
        sleep(0.1)

test()
Per the documentation, current_timestamp returns the timestamp at the start of the transaction. This behaviour is required by the SQL standard.
psycopg2 begins a transaction when you run a query and does not autocommit. So unless you conn.commit(), the same transaction is still open for the first query and all your later iterations.
You should either:
1) conn.commit() after each query (or, if you like, conn.rollback() if it's read-only and makes no changes); or
2) use clock_timestamp() instead of current_timestamp, since clock_timestamp() changes throughout the transaction.
It's best to avoid leaving transactions running anyway, as they can tie up resources the server needs.
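For illustration, a quick sketch of both options, reusing the conn and cur from the script above:
# Option 1: end the transaction, so the next current_timestamp is taken fresh
cur.execute("SELECT date_trunc('second', current_timestamp - pg_postmaster_start_time()) AS uptime;")
print(cur.fetchone())
conn.commit()    # or conn.rollback(); either one ends the transaction

# Option 2: clock_timestamp() keeps advancing even within one transaction
cur.execute("SELECT date_trunc('second', clock_timestamp() - pg_postmaster_start_time()) AS uptime;")
print(cur.fetchone())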

Transaction time out workaround for PostgreSQL

AFAIK, PostgreSQL 8.3 does not support transaction timeouts. I've read about support for this feature in the future, and there is some discussion about it. However, for specific reasons, I need a solution to this problem now. So what I did is write a script that runs periodically:
1) Based on locks and activity, query for the process ID of the transaction that has been running too long, keeping the oldest (trxTimeOut.sql):
SELECT procpid
FROM
(
SELECT DISTINCT age(now(), query_start) AS age, procpid
FROM pg_stat_activity, pg_locks
WHERE pg_locks.pid = pg_stat_activity.procpid
) AS foo
WHERE age > '30 seconds'
ORDER BY age DESC
LIMIT 1
2) Based on this query, kill the corresponding process (trxTimeOut.sh):
psql -h localhost -U postgres -t -d test_database -f trxTimeOut.sql | xargs kill
Although I've tested it and it seems to work, I'd like to know whether this is an acceptable approach or whether I should consider a different one.
Since version 9.6, PostgreSQL provides idle_in_transaction_session_timeout to automatically terminate transactions that are idle for too long.
It's also possible to set a limit on how long a command can take, through statement_timeout, independently of the duration of the transaction it's in or why it's stuck (busy query or waiting for a lock).
To auto-abort transactions that are stuck specifically waiting for a lock, see lock_timeout.
These settings can be set at the SQL level with a SET command as shown below, or as defaults for a database with ALTER DATABASE, for a user with ALTER USER, or for the entire instance through postgresql.conf.
SET statement_timeout=10000; -- time out after 10 seconds
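For example, a minimal sketch of applying these settings from Python with psycopg2, using the test_database and postgres user from your script; the api_user role and the specific values are just placeholders:
import psycopg2

conn = psycopg2.connect("dbname=test_database user=postgres")
conn.autocommit = True
cur = conn.cursor()

# Default for every new connection to this database
cur.execute("ALTER DATABASE test_database SET statement_timeout = '30s'")

# Default for one role (api_user is a placeholder)
cur.execute("ALTER ROLE api_user SET idle_in_transaction_session_timeout = '5min'")

# Only for the current session: give up if a lock cannot be obtained within 10 seconds
cur.execute("SET lock_timeout = '10s'")

conn.close()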