TRUNCATE on one table blocked by SELECT on another - PostgreSQL

Postgres 9.4, Ubuntu 10
I have been unable to find this exact problem here, so here it goes:
For each table t in my database, I have a table t_audit. Each delete, insert, and update on table t triggers a function that inserts a record into table t_audit.
Each night, a process truncates each t_audit table.
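For illustration, the per-table setup is roughly along these lines (a sketch; the function names and the column layout of t_audit are assumptions):

-- Audit trigger function, modeled on the audit example in the PostgreSQL docs;
-- assumes t_audit's columns are a timestamp, the operation, then t's columns.
CREATE OR REPLACE FUNCTION t_audit_fn() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO t_audit SELECT now(), TG_OP, OLD.*;
    ELSE
        INSERT INTO t_audit SELECT now(), TG_OP, NEW.*;
    END IF;
    RETURN NULL;  -- AFTER trigger: the return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_audit_trg
AFTER INSERT OR UPDATE OR DELETE ON t
FOR EACH ROW EXECUTE PROCEDURE t_audit_fn();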
Last night, a select on table t prevented the truncate on t_audit from proceeding. I did not save what was in pg_stat_activity at the time, but I did save the output from blocking_locks().
Blocking pid: RowExclusiveLock, t, select * from t where ...,
Waiting pid: AccessExclusiveLock, t_audit, truncate table t_audit,
I am uncertain why a select on t would block the truncate on t_audit. As I did not save pg_stat_activity, the best I can assume is that the session was "idle in transaction". I asked the person who was running the query at the time, and he said he was not running the update as part of a transaction. He did update table t just prior to the select. He did not close his connection, as the pid was still active until I ran pg_terminate_backend on it.
Has anyone experienced this issue before? Is there a recommended procedure for this other than running pg_terminate_backend on any pids which are "idle in transaction" just prior to calling truncates?
Thank you for reading and taking time to respond.

Are there any triggers in place that might cause even something as innocuous as a SELECT on the audit table to run at the same time as the TRUNCATE? (Although the fact that it's a ROW EXCLUSIVE lock suggests that whatever is being triggered is something like an UPDATE instead.) Per the PG 9.4 locking documentation, a SELECT and a TRUNCATE would indeed block each other; that is expected behavior. The relevant tidbits are these:
ACCESS SHARE
Conflicts with the ACCESS EXCLUSIVE lock mode only.
The SELECT command acquires a lock of this mode on referenced tables. In general, any query that only reads a table and does not modify it will acquire this lock mode.
ACCESS EXCLUSIVE
Conflicts with locks of all modes (ACCESS SHARE, ROW SHARE, ROW EXCLUSIVE, SHARE UPDATE EXCLUSIVE, SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, and ACCESS EXCLUSIVE). This mode guarantees that the holder is the only transaction accessing the table in any way.
Acquired by the DROP TABLE, TRUNCATE, REINDEX, CLUSTER, and VACUUM FULL commands. Many forms of ALTER TABLE also acquire a lock at this level.
And even more specifically telling is this explicit tip on that page:
Tip: Only an ACCESS EXCLUSIVE lock blocks a SELECT (without FOR UPDATE/SHARE) statement.
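This is easy to reproduce in two psql sessions (a sketch, using the table name from the question):

-- Session 1: a transaction that has merely read the audit table
BEGIN;
SELECT count(*) FROM t_audit;  -- acquires ACCESS SHARE, held until COMMIT/ROLLBACK
-- ...session left idle in transaction...

-- Session 2: the nightly job
TRUNCATE TABLE t_audit;        -- needs ACCESS EXCLUSIVE; waits until session 1 ends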
As for what to do in this scenario, if your use case is tolerant of unceremonious terminations of (possibly idle) connections, that is certainly a straightforward way of ensuring that the TRUNCATE succeeds.
A more flexible alternative may be to clear out the table with DELETE instead, and then follow up with some variation of VACUUM afterwards. (DELETE and SELECT will not block each other, though DELETE will block concurrent UPDATEs of the same rows.) The suitability of this approach depends a lot on things like the table's growth pattern from day to day (a plain VACUUM may be enough if its maximum size does not differ much from day to day) and on how badly you need the space reclaimed in the short term. If it is a huge table and you need the space quickly and badly, you may need to VACUUM FULL it during a quiet window, but VACUUM FULL is not a gentle hammer to swing by any means. A minimal sketch of the alternative follows.
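Sketch of the DELETE-then-VACUUM variant (table name from the question; note that VACUUM cannot run inside a transaction block):

DELETE FROM t_audit;     -- takes only ROW EXCLUSIVE, so concurrent SELECTs proceed
VACUUM t_audit;          -- marks the dead rows' space as reusable
-- VACUUM FULL t_audit;  -- reclaims disk space outright, but takes ACCESS EXCLUSIVE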

Related

How does PostgreSQL lock tables when inserting and selecting?

I'm migrating data from one table to another in an environment where any long locks or downtime are not acceptable, about 80000 rows in total. Essentially the query boils down to this simple case:
INSERT INTO table_2
SELECT * FROM table_1
JOIN table_3 ON table_1.id = table_3.id
All 3 tables are being read from and could have an insert at any time. I want to just run the query above, but I'm not sure how the locking works and whether the tables will be totally inaccessible during the operation. My understanding tells me that only the affected rows (newly inserted) will be locked. Table 1 is just being selected, so no harm, and concurrent inserts are safe so table 2 should be freely accessible.
Is this understanding correct, and can I run this query in a production environment without fear? If it's not safe, what is the standard way to accomplish this?
You're fine.
If you're interested in the details, you can read up on multiversion concurrency control, or on the details of the Postgres MVCC implementation, or how its various locking modes interact, but the implications for your case are nicely summarised in the docs:
reading never blocks writing and writing never blocks reading
In short, every record stored in the database has some version number attached to it, and every query knows which versions to consider and which to ignore.
This means that an INSERT can safely write to a table without locking it, as any concurrent queries will simply ignore the new rows until the inserting transaction decides to commit.
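A sketch of what that means for your query in practice (column lists assumed compatible, as in your simplified case):

-- Session A
BEGIN;
INSERT INTO table_2
SELECT * FROM table_1
JOIN table_3 ON table_1.id = table_3.id;
-- rows are written, but not yet committed

-- Session B, at the same time: not blocked
SELECT count(*) FROM table_2;  -- simply ignores A's uncommitted rows

-- Session A
COMMIT;  -- the new rows become visible to subsequent snapshots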

Hanging query in PostgreSQL 9.2

I have an issue with a PostgreSQL 9.2 database that is causing what is effectively a deadlock of the entire DB system.
Basically I have a table that acts as an operation queue. Entries are added to the table to indicate the need for an operation to be done. Subsequently one of multiple services will update these entries to indicate that the operation has been picked up for processing, and eventually delete the entry to indicate that the operation has been completed.
All access to the table is through transactions that first acquire a transaction-level advisory lock. This is to ensure that only one service is manipulating the queue at any point in time.
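Roughly this pattern (the lock key and the queue manipulation are illustrative):

BEGIN;
SELECT pg_advisory_xact_lock(42);  -- waits until no other transaction holds key 42
-- ...inserts / updates / deletes against the queue...
COMMIT;  -- the advisory lock is released automatically here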
I have seen instances where queries on this queue will basically lock up and do nothing. I can see from pg_stat_activity that the affected query is state = active, waiting = false. I can also see that all requested locks for the PID in pg_locks have been granted. But the process just sits there and does nothing.
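For reference, this is the kind of inspection I'm doing (the pid is illustrative):

SELECT pid, state, waiting, query
FROM pg_stat_activity
WHERE pid = 12345;

SELECT locktype, relation::regclass, mode, granted
FROM pg_locks
WHERE pid = 12345;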
Typically I can precipitate this by repeated mass addition and removal of (several hundred thousand) entries to the queue. All access has to go through the advisory lock, so only one thing is getting done at a time.
Once one of these queries locks up then other queries pile up behind it, waiting on the advisory lock - eventually exhausting all DB connections.
The query that locks up is typically a deletion of all entries from the queue (delete from queue_table;). I have, however, seen one instance where the query that locked up was an update of several tuples within the table.
Right now I don't see anywhere that I could be deadlocking against any other transaction. These are straightforward inserts, deletes and updates. No other tables are involved (except during addition of entries, where the data is selected from other tables).
Other pertinent facts:
All access to the table is in fact through a view (I can't access the table directly, which is why I'm using an advisory lock instead of an exclusive lock on the table or similar).
The table is logged (which is probably a really bad choice in this case; I'm going to try using an unlogged table next week).
I usually also see an autovacuum (analyze) operation, also active, also waiting = false and also apparently locked up. I presume the autovacuum is coming along to re-optimize after my mass additions / removals.
Looking for any suggestions of what else I might look at to debug this issue when I next reproduce it. I'm kind of starting to feel that this might be some kind of performance optimization / DB configuration related issue.
Any suggestions of things to look at would be most welcomed!

Does DROP COLUMN block a PostgreSQL database

I have the following column in a PostgreSQL database:
column | character varying(10) | not null default 'default'::character varying
I want to drop it, but the database is huge and if it blocks updates for an extended period of time I will be publicly flogged, and likely drawn and quartered. I found a blog post from Braintree which suggests this is a safe operation, but it's a little vague.
The ALTER TABLE command needs to acquire an ACCESS EXCLUSIVE lock on the table, which will block everything trying to access that table, including SELECTs, and, as the name implies, needs to wait for existing operations to finish so it can be exclusive.
So, if your table is extremely busy, it may not get an opportunity to actually acquire the exclusive lock, and will simply block for what is functionally forever.
It also depends on whether this column has a lot of indexes and dependencies. If there are dependencies (e.g. foreign keys or views), you'll need to add CASCADE to the DROP COLUMN, and this will increase the work that needs to be done and the amount of time the exclusive lock is held.
So, it's not risk free. However, you should know fairly quickly after trying it whether it's likely to block for a long time. If you can try it and safely take a minute or two of potentially blocking that table, it's worth a shot -- try the drop and see. If it doesn't complete within a relatively short period of time, abort the command and you'll likely need to schedule some downtime of at least the app(s) that are hammering the table. (You can take a look at the server activity and the lock activity to try to surmise what's hammering that table.)
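One way to bound that risk (lock_timeout is available in 9.3+; the names and the timeout are illustrative):

SET lock_timeout = '2s';
ALTER TABLE big_table DROP COLUMN my_column;
-- If the ACCESS EXCLUSIVE lock can't be acquired within 2s, the ALTER fails with
-- an error instead of queueing indefinitely and blocking everything behind it.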
does drop column block a PostgreSQL database
The answer to that is no, because it does not block the database.
However, any DDL statement requires an exclusive lock on the table being changed, which means no other transaction can access that table. So the table is "blocked", not the database.
However the time to drop a column is really very short, because the column isn't physically removed from the table but only marked as no longer there.
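You can see this in the catalog: after the drop, the column lingers as a hidden placeholder rather than disappearing (table name illustrative):

SELECT attname, attisdropped
FROM pg_attribute
WHERE attrelid = 'my_table'::regclass
  AND attnum > 0;
-- the dropped column shows up with attisdropped = true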
And don't forget to commit the DDL statement (if you have turned autocommit off), otherwise the table will be blocked until you commit your change.

How to drop a trigger in a resilient manner in postgresql

I'm looking to drop a trigger that is currently in production because it's no longer needed, but the problem is that when I try the simplest way, which is something like
drop trigger <triggername> on <tablename>
It caused a huge table lock and everything froze!
What the trigger does is:
When a row is inserted or updated, check for a field's contents, split it and populate another table.
How should I proceed to disable it instantly (and drop it afterwards) without causing hassle in our production environment?
Thanks in advance and sorry for my English ;)
You could try ALTER TABLE ... DISABLE TRIGGER - but it requires the same strength of lock, so I don't think it'll do you much good.
There's work in PostgreSQL 9.4 to make ALTER TABLE take weaker locks for some operations. It might help with this.
In the meantime, I'd use CREATE OR REPLACE FUNCTION to replace the trigger's function with a simple no-op.
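A minimal sketch, assuming the trigger's function is named populate_other_table() (the name is an assumption):

CREATE OR REPLACE FUNCTION populate_other_table() RETURNS trigger AS $$
BEGIN
    RETURN NEW;  -- no-op: let the INSERT/UPDATE through, do nothing else
END;
$$ LANGUAGE plpgsql;

Replacing the function takes no lock on the table itself, so the trigger is effectively neutralized immediately.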
Then, to actually drop the trigger, I'd probably write a script that does:
BEGIN;
LOCK TABLE the_table IN ACCESS EXCLUSIVE MODE NOWAIT;
DROP TRIGGER ...;
COMMIT;
If anybody's using the table the script will abort at the LOCK TABLE.
I'd then run it in a loop until it succeeded.
If that didn't work (if the table is always busy) but if most transactions were really short, I might attempt a LOCK TABLE without NOWAIT, but set a short statement_timeout. So the script would be something like:
BEGIN;
SET LOCAL statement_timeout = '5s';
LOCK TABLE the_table IN ACCESS EXCLUSIVE MODE;
DROP TRIGGER ...;
COMMIT;
That ensures a fairly short disruption by failing if it can't complete the job in time. Again, I'd run it periodically until it succeeded.
If neither approach was effective (say, due to lots of long-running transactions), I'd probably just accept the need to lock the table for a little while. I'd start the DROP TRIGGER, then pg_terminate_backend all concurrent transactions that held locks on the table, so their connections dropped and their transactions terminated. That would let the DROP TRIGGER proceed promptly, at the cost of greater disruption. You can only consider an approach like this if your apps are well-written, so that they simply retry transactions on transient errors like connection drops.
Another possible approach is to disable (not drop) the trigger by modifying the system catalogs directly.
According to the docs, since 9.5 alter table ... disable trigger now takes a SHARE ROW EXCLUSIVE lock, so that might be the way to go now.
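A sketch of that (names illustrative):

ALTER TABLE the_table DISABLE TRIGGER the_trigger;
-- On 9.5+ this takes SHARE ROW EXCLUSIVE: it still blocks writes briefly,
-- but it no longer blocks SELECTs.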

Online backup blocking truncate table

It's documented that in DB2 the TRUNCATE statement is not compatible with online backup because it gets a Z lock on the table and prevents an online backup from running concurrently.
The lock wait happens when a truncate tries to get a shared lock on an internal online backup object.
Since this is by design in the product I will have to go for workarounds, so this thread is not about a solution, but about why they can't work together. I didn't find a reasonable explanation for why there is such a limitation in DB2.
Any insights?
Thanks,
Luciano Moreira
from http://www.ibm.com/developerworks/data/library/techarticle/dm-0501melnyk/
When a table holds a Z lock, no concurrent application can read or update data in that table.
So now we know that a Z lock means exclusive access to a table, denying both read and write access.
from http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.sql.ref.doc/doc/r0053474.html
Exclusive Access: No other session can have a cursor open on the table, or a lock held on the table (SQLSTATE 25001).
from https://sites.google.com/site/umeshdanderdbms/difference-between-truncate-and-delete
Delete is a logged operation, whereas Truncate makes the table empty at the container level.
(Logged operation: DML operations are written to the logs (redo log in Oracle, transaction log in DB2, etc.) so they can be committed or rolled back.)
This is the most interesting part. Truncate just 'forgets' the content of the table, whereas delete removes rows line by line, processing all triggers, bells, and whistles. As a consequence, when you truncate, all cursors reading the table become invalid. To prevent stupid stuff like that, you can only completely empty a table when nobody is trying to access it. Online backup obviously needs to read the table; therefore it is not possible to have both accessing the same table at the same time.