For example, I want to iterate over 5000k rows via without holder cursor inside a readOnly transaction, it will definitely run for a long period.
Will such kind transaction slow down other requests on the same table ?
No, it won't, unless concurrent transaction require an ACCESS EXCLUSIVE lock on the table (they are running something like DROP TABLE, ALTER TABLE or CREATE INDEX). Such transactons would hang until the read-only transaction is done.
The problem with long transactions is that they keep autovacuum from doing its work, and if there is a lot of concurrent data modifying activity, you could end up with bloated tables and indexes.
Related
I'm pretty new to PostgreSQL and I'm sure I'm missing something here.
The scenario is with version 11, executing a big drop table and insert transaction on a given table with the nodejs driver, which may take 30 minutes.
While doing that, if I try to query with select on that table using the jdbc driver, the query execution waits for the transaction to finish. If I close the transaction (by finishing it or by forcing it to exit), the jdbc query becomes responsive.
I thought I can read a table with one connection while performing a transaction with another one.
What am I missing here?
Should I keep the table (without dropping it at the beginning of the transaction) ?
DROP TABLE takes an ACCESS EXCLUSIVE lock on the table, which is there precisely to prevent it from taking place concurrently with any other operation on the table. After all, DROP TABLE physically removes the table.
Since all locks are held until the end of the database transaction, all access to the dropped table is blocked until the transaction ends.
Of course the files are only removed when the transaction commits, so you might wonder why PostgreSQL doesn't let concurrent transactions read in the mean time. But that would mean that COMMIT may be blocked by a concurrent reader, or a SELECT might cause a system error in the middle of reading, both of which don't sound appealing.
If we perform alter query (like adding constraint or adding column) or update the column, Postgres hangs and keeps on processing the query taking never ending time.
We have to kill the query explicitly.
Why do our ALTER statements frequently get stuck?
There are two scenarios in which such a query will hang for a long time:
Such an ALTER TABLE will require an ACCESS EXCLUSIVE lock on the table, which will block all concurrent activity and will be blocked by all concurrent activity.
The lock request will be queued at the end of the queue of transactions waiting for a lock on that table, so if these are many and take a long time to finish, the ALTER TABLE will have to wait for a long time. Other transactions that request locks on that table later will also hang because they queue after the ALTER TABLE statement.
Note that sessions that are “idle in transacrion” or prepared statements can hold locks for a very long time. It is an error to leave such things lying around.
Many forms of ALTER TABLE have to either rewrite the table (for example if a new row with a non-NULL default values is added) or scan the whole table (for example if a constraint is added, it will have to be checked for each row).
This can take a long time to finish if the table is large.
To disambiguate between these two cases, look at the locks in the pg_lockssystem view.
I have an issue with a postgresql-9.2 DB that is causing what is effectively a deadlock of the entire DB system.
Basically I have a table that acts as an operation queue. Entries are added to the table to indicate the need for an operation to be done. Subsequently one of multiple services will update these entries to indicate that the operation has been picked up for processing, and eventually delete the entry to indicate that the operation has been completed.
All access to the table is through transactions that first acquire an transactional advisory lock. This is to ensure that only one service is manipulating the queue at any point in time.
I have seen instances where queries on this queue will basically lock up and do nothing. I can see from pg_stat_activity that the affected query is state = active, waiting = false. I can also see that all requested locks for the PID in pg_locks have been granted. But the process just sits there and does nothing.
Typically I can precipitate this by repeated mass addition and removal of (several hundred thousand) entries to the queue. All access has to go through the advisory lock, so only one thing is getting done at a time.
Once one of these queries locks up then other queries pile up behind it, waiting on the advisory lock - eventually exhausting all DB connections.
The query that locks up is typically a deletion of all entries from the queue (delete from queue_table;). I have, however, seen one instance where the query that locked up was an update of several tuples within the table.
Right now I don't see anywhere where I could be deadlocking against any other transaction. These are straightforward inserts, deletes and updates. No other tables are involved (accept during addition of entries where the data is selected from other tables).
Other pertinent facts:
All access to the table is in fact through a view (can't access the table directly which is why i'm using an advisory lock instead of an exclusive lock on the table or similar).
The table is logged (which is probably a really bad choice in this case, i'm going to try using an unlogged table next week).
I usually also see an autovacuum (analyze) operation, also active, also waiting = false and also apparently locked up. I presume the autovacuum is coming along to re-optimize after my mass additions / removals.
Looking for any suggestions of what else I might look at to debug this issue when I next reproduce it. I'm kind of starting to feel that this might be some kind of performance optimization / DB configuration related issue.
Any suggestions of things to look at would be most welcomed!
I have the following column in a postgreSQL database
column | character varying(10) | not null default 'default'::character varying
I want to drop it, but the database is huge and if it blocks updates for an extended period of time I will be publicly flogged, and likely drawn and quartered. I found a blog from braintree, here, which suggests this is a safe operation but its a little vague.
The ALTER TABLE command needs to acquire an ACCESS EXCLUSIVE lock on the table, which will block everything trying to access that table, including SELECTs, and, as the name implies, needs to wait for existing operations to finish so it can be exclusive.
So, if your table is extremely busy, it may not get an opportunity to actually acquire the exclusive lock, and will simply block for what is functionally forever.
It also depends whether this column has a lot of indexes and dependencies. If there are dependencies (i.e. foreign keys or views), you'll need to add CASCADE to the DROP COLUMN, and this will increase the work that needs to be done, and the amount of time it will need to hold the exclusive lock.
So, it's not risk free. However, you should know fairly quickly after trying it whether it's likely to block for a long time. If you can try it and safely take a minute or two of potentially blocking that table, it's worth a shot -- try the drop and see. If it doesn't complete within a relatively short period of time, abort the command and you'll likely need to schedule some downtime of at least the app(s) that are hammering the table. (You can take a look at the server activity and the lock activity to try to surmise what's hammering that table.)
does drop column block a PostgreSQL database
The answer to that is no, because it does not block the database.
However any DDL statement requires an exclusive lock on the table being changed. Which means no other transaction can access the table. So the table is "blocked", not the database.
However the time to drop a column is really very short, because the column isn't physically removed from the table but only marked as no longer there.
And don't forget to commit the DDL statement (if you have turned autocommit off), otherwise the table will be blocked until you commit your change.
I'm using PostgreSQL 9.2 in a Windows environment.
I'm in a 2PC (2 phase commit) environment using MSDTC.
I have a client application, that starts a transaction at the SERIALIZABLE isolation level, inserts a new row of data in a table for a specific foreign key value (there is an index on the column), and vote for completion of the transaction (The transaction is PREPARED). The transaction will be COMMITED by the Transaction Coordinator.
Immediatly after that, outside of a transaction, the same client requests all the rows for this same specific foreign key value.
Because there may be a delay before the previous transaction is really commited, the SELECT clause may return a previous snapshot of the data. In fact, it does happen sometimes, and this is problematic. Of course the application may be redesigned but until then, I'm looking for a lock solution. Advisory Lock ?
I already solved the problem while performing UPDATE on specific rows, then using SELECT...FOR SHARE, and it works well. The SELECT waits until the transaction commits and return old and new rows.
Now I'm trying to solve it for INSERT.
SELECT...FOR SHARE does not block and return immediatley.
There is no concurrency issue here as only one client deals with a specific set of rows. I already know about MVCC.
Any help appreciated.
To wait for a not-yet-committed INSERT you'd need to take a predicate lock. There's limited predicate locking in PostgreSQL for the serializable support, but it's not exposed directly to the user.
Simple SERIALIZABLE isolation won't help you here, because SERIALIZABLE only requires that there be an order in which the transactions could've occurred to produce a consistent result. In your case this ordering is SELECT followed by INSERT.
The only option I can think of is to take an ACCESS EXCLUSIVE lock on the table before INSERTing. This will only get released at COMMIT PREPARED or ROLLBACK PREPARED time, and in the mean time any other queries will wait for the lock. You can enforce this via a BEFORE trigger to avoid the need to change the app. You'll probably get the odd deadlock and rollback if you do it that way, though, because INSERT will take a lower lock then you'll attempt lock promotion in the trigger. If possible it's better to run the LOCK TABLE ... IN ACCESS EXCLUSIVE MODE command before the INSERT.
As you've alluded to, this is mostly an application mis-design problem. Expecting to see not-yet-committed rows doesn't really make any sense.