PostgreSQL risks of force terminating a session using pg_terminate_backend

I have learned that you can find the sessions holding locks on a table by running
SELECT t.relname, l.locktype, l.pid, l.mode, l.granted
FROM pg_locks l
INNER JOIN pg_stat_all_tables t ON l.relation = t.relid
ORDER BY l.relation ASC;
and then terminate the session holding a lock on your relation with
SELECT pg_terminate_backend(xxx);
But what are the potential risks and downsides of doing so?
Is it bad practice to use pg_terminate_backend(xxx) frequently, for example as part of a routine job?
Is it always safe to terminate the session?
Assuming the session is JDBC, would it result in anything other than an SQLException being thrown in the terminated session?
The use case is terminating a session that is reading from a table before the table is refreshed, since reads during the refresh would return bad data anyway. The JDBC session will handle the SQLException accordingly.

The only risk is that any changes not yet committed by that session will be rolled back when you terminate it.
The correct solution, however, is to investigate the root cause of why you need this so frequently. This sounds like a bug in your application.
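If you do keep the termination in a routine job, it is safer to target only the sessions that actually hold a lock on the table being refreshed, rather than terminating PIDs by hand. A rough sketch, assuming the table is called my_table (untested; adapt the name and test before use):
SELECT pg_terminate_backend(l.pid)
FROM pg_locks l
JOIN pg_stat_all_tables t ON l.relation = t.relid
WHERE t.relname = 'my_table'          -- the table about to be refreshed
  AND l.pid <> pg_backend_pid();      -- never terminate yourself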

Related

How to detect what caused a short-lived lock after it was released in PostgreSQL

In a Java application I frequently see errors like this:
org.springframework.r2dbc.UncategorizedR2dbcException: executeMany; SQL [select * from table1 where id=:id and column1 <> :a for update]; could not serialize access due to concurrent update; nested exception is io.r2dbc.postgresql.ExceptionFactory$PostgresqlTransientException: [40001] could not serialize access due to concurrent update
The transaction with the query select * from table1 where id=:id and column1 <> :a for update was rolled back.
The transaction isolation level is REPEATABLE READ.
How can I see what has locked this row? The lock is very short-lived (milliseconds).
I see no helpful information in the Postgres log or the application log.
The problem here is not a concurrent lock, but a concurrent data modification. It is perfectly normal to get that error, and if you get it, you should simply repeat the failed transaction.
There is no way to find out which concurrent transaction updated the row unless you log all DML statements.
If you get a lot of these errors, you might consider switching to pessimistic locking using SELECT ... FOR NO KEY UPDATE.
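A minimal sketch of that pessimistic approach, reusing the query from the error message (the UPDATE is a hypothetical placeholder for whatever the transaction really does, and the sketch assumes the default READ COMMITTED isolation level, where blocking replaces the serialization error):
BEGIN;
-- Take the row lock up front; concurrent writers block and wait here
-- instead of failing later with a serialization error.
SELECT * FROM table1 WHERE id = :id AND column1 <> :a FOR NO KEY UPDATE;
-- Unlike FOR UPDATE, FOR NO KEY UPDATE still allows foreign-key
-- checks (FOR KEY SHARE) against the locked row.
UPDATE table1 SET column1 = :a WHERE id = :id;  -- hypothetical
COMMIT;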

Did this idle query cause my create unique index command to lock up?

I had an open connection from Matlab to my Postgres server, and the last query was insert into table_name ..., which had state idle when I looked at the processes on the database server using:
SELECT datname,pid,state,query FROM pg_stat_activity;
I tried to create a unique index on table_name and it was taking a very long time, with no discernible CPU usage for the pgadmin process. When I closed the connection from Matlab, the query dropped out of pg_stat_activity, and the unique index was immediately built.
Was the idle query the reason why it took so long to build the index?
No, a session in state “idle” does not hold any locks and cannot block anything. It's “idle in transaction” sessions that are usually the problem, because they tend to hold locks for a long time. Such sessions are almost always caused by an application bug.
To investigate problems like that, look at the view pg_locks. A hanging CREATE INDEX statement will usually hang trying to acquire the ACCESS EXCLUSIVE lock on the table to be indexed. You will see that as a value of false in the granted column of pg_locks. Then figure out which backend (pid) has a lock on the table in question, and you have the culprit(s).
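A sketch of that investigation, assuming the table is called my_table (the pid in the last step is a hypothetical example; on 9.6 and later, pg_blocking_pids() does the second step in one call):
-- 1. Find the ungranted lock of the hanging CREATE INDEX
SELECT locktype, relation::regclass, mode, pid, granted
FROM pg_locks
WHERE NOT granted;
-- 2. See which backends hold locks on that table
SELECT pid, mode, granted
FROM pg_locks
WHERE relation = 'my_table'::regclass;
-- On 9.6+: directly list the blockers of the hanging backend
SELECT pg_blocking_pids(12345);  -- hypothetical pid of the CREATE INDEX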

Postgres EXPLAIN never returns a result

I was trying to figure out why a query was taking so long to run. When I run EXPLAIN on the query, it doesn't return (even after waiting 5+ minutes). The query is
SELECT *
FROM Foo JOIN Bar USING (FileNum, LineNum)
The tuple (FileNum, LineNum) is the primary key of both tables. In addition, (FileNum, LineNum) is a foreign key in Bar that references Foo.
I'm having performance issues with the database but this seems to be a new phenomenon.
I've run ANALYZE on the database and have not made any changes since then.
Does anyone have any idea why the query planner doesn't return a result?
The simplest explanation is that another session has grabbed an ACCESS EXCLUSIVE lock on one of the tables involved in the query and won't release it.
This happens if, for instance, the other session does an ALTER TABLE in a transaction and never commits.
You might try the queries in
https://wiki.postgresql.org/wiki/Lock_Monitoring to check if there is such a blocking transaction in progress, then possibly terminate it with pg_cancel_backend() or pg_terminate_backend().
Note that if the blocking transaction is a prepared transaction:
its locks would survive a server restart.
the transaction would not have a corresponding session in pg_stat_activity, so any query that joins through pid would miss it.
Prepared transactions can be seen through pg_prepared_xacts.
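A quick sketch (the gid in the second statement is a hypothetical example):
SELECT gid, prepared, owner, database
FROM pg_prepared_xacts;
-- Discard an orphaned prepared transaction, releasing its locks:
ROLLBACK PREPARED 'some_gid';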

Hanging query in Postgresql 9.2

I have an issue with a postgresql-9.2 DB that is causing what is effectively a deadlock of the entire DB system.
Basically I have a table that acts as an operation queue. Entries are added to the table to indicate the need for an operation to be done. Subsequently one of multiple services will update these entries to indicate that the operation has been picked up for processing, and eventually delete the entry to indicate that the operation has been completed.
All access to the table is through transactions that first acquire a transaction-level advisory lock. This is to ensure that only one service is manipulating the queue at any point in time.
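A simplified sketch of the access pattern (the real lock key and columns differ):
BEGIN;
SELECT pg_advisory_xact_lock(42);           -- serialize all queue access
INSERT INTO queue_table (op) VALUES ('x');  -- or an UPDATE / DELETE
COMMIT;                                     -- advisory lock released automatically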
I have seen instances where queries on this queue will basically lock up and do nothing. I can see from pg_stat_activity that the affected query has state = active, waiting = false. I can also see that all requested locks for the PID in pg_locks have been granted. But the process just sits there and does nothing.
Typically I can precipitate this by repeated mass addition and removal of (several hundred thousand) entries to the queue. All access has to go through the advisory lock, so only one thing is getting done at a time.
Once one of these queries locks up then other queries pile up behind it, waiting on the advisory lock - eventually exhausting all DB connections.
The query that locks up is typically a deletion of all entries from the queue (delete from queue_table;). I have, however, seen one instance where the query that locked up was an update of several tuples within the table.
Right now I don't see anywhere where I could be deadlocking against any other transaction. These are straightforward inserts, deletes and updates. No other tables are involved (except during addition of entries, where the data is selected from other tables).
Other pertinent facts:
All access to the table is in fact through a view (I can't access the table directly, which is why I'm using an advisory lock instead of an exclusive lock on the table or similar).
The table is logged (which is probably a really bad choice in this case; I'm going to try using an unlogged table next week).
I usually also see an autovacuum (analyze) operation, also active, also waiting = false and also apparently locked up. I presume the autovacuum is coming along to re-optimize after my mass additions / removals.
Looking for any suggestions of what else I might look at to debug this issue when I next reproduce it. I'm starting to suspect this might be some kind of performance or DB configuration issue.
Any suggestions of things to look at would be most welcomed!

Dropping index concurrently PostgreSQL 9.1

DROP INDEX CONCURRENTLY first appeared in PostgreSQL 9.2, but my server runs 9.1. Unfortunately a plain DROP INDEX locks out my application for an unpredictable amount of time, which is very painful in production.
Is there a way to drop an index concurrently?
No, there's no simple workaround - otherwise it's rather less likely that DROP INDEX CONCURRENTLY would've been added in 9.2.
However, you can kill all sessions to force the drop to occur promptly.
What you want to avoid is the DROP waiting on a partially acquired exclusive lock: while it waits for other transactions to finish and release their share locks, it prevents new transactions from proceeding, yet cannot proceed itself. The best way to ensure that doesn't happen is to kill all concurrent sessions.
So, in one session:
DROP INDEX my_index;
In another session, as a superuser, terminate all other sessions using the following untested query, which you'll need to adapt appropriately and test before use:
SELECT pg_terminate_backend(procpid)
FROM pg_stat_activity
WHERE procpid <> (
    SELECT procpid
    FROM pg_stat_activity
    WHERE current_query = 'DROP INDEX my_index;')
AND procpid <> pg_backend_pid();
Your well-written and well-tested application will immediately reconnect and retry its queries without bothering the user or getting upset, because it knows that transient errors are something it has to cope with, so it runs all its database access in retry loops. If it isn't well written, you'll discover that with a flood of user-visible error messages. If it's really not well written, you'll have to restart it before it gets its head together, but it's rare to see apps that are quite that broken.
This is a heavy-handed approach. You can be rather softer about it by joining against pg_locks and terminating only sessions that actually hold a lock on the relation you're interested in or the index you wish to modify. You get to enjoy writing that query, because my interest in working around limitations of older database versions is limited.
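For what it's worth, a rough, untested sketch of that softer variant (assuming the index's table is my_table; the excluded pid is a hypothetical placeholder for the session running the DROP; adapt and test before use):
SELECT pg_terminate_backend(l.pid)
FROM pg_locks l
WHERE l.relation IN ('my_table'::regclass, 'my_index'::regclass)
  AND l.pid <> pg_backend_pid()  -- not ourselves
  AND l.pid <> 54321;            -- hypothetical pid of the DROP INDEX session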