PostgreSQL Equivalent of SQLServer's NoLock Hint - postgresql

In SQLServer, you can use syntax "(nolock)" to ensure the query doesn't lock the table or isn't blocked by other queries locking the same table.
e.g.
SELECT * FROM mytable (nolock) WHERE id = blah
What's the equivalent syntax in Postgres? I found some documentation on table locking in PG (http://www.postgresql.org/docs/8.1/interactive/sql-lock.html), but it all seems geared at how to lock a table, not ensure it's not locked.

A SELECT doesn't lock any table in PostgreSQL, unless you want a lock:
SELECT * FROM tablename FOR UPDATE;
PostgreSQL uses MVCC to minimize lock contention in order to allow for reasonable performance in multiuser environments. Readers do not conflict with writers nor other readers.

I've done some research and it appears that the NOLOCK hint in SQL Server is roughly the same as READ UNCOMMITTED transaction isolation level. In PostgreSQL, you can set READ UNCOMMITTED, but it silently upgrades the level to READ COMMITTED. READ UNCOMMITTED is not supported.
PostgreSQL 8.4 documentation for Transaction Isolation: http://www.postgresql.org/docs/8.4/static/transaction-iso.html

This is an old question, but I think the actual question has not been answer.
A SELECT query (that does not contain an for update clause) will never lock any rows (or the table) nor will it block concurrent access to the table. Concurrent DML (INSERT, UPDATE, DELETE) will also not block a SELECT statement.
Simply put: there is no need for (nolock) in Postgres.
Readers never block writers and writers never block readers

The purpose of the nolock or readpast is to see if the record is currenlty locked. The user can use this in an update to see if the record identified was changed (rowsaffected); if the record was not locked, then therowsaffected would be 1; if o, then the record is locked
Based upon that outcome, then the user can use a select for update to lock it for their own use.

Every SQL statement is an implicit transaction. The NOLOCK hint corresponds to READ UNCOMMITTED (DIRTY READ) transaction isolation level.
BEGIN TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT COUNT(1) FROM my_table;
END;
Actually, this code do the same that BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED but guarantees the expected behavior further.
Also avoid to use COUNT(*) except you really need it

Related

How to detect what caused a short time lock after it was released in PostgreSQL

In Java application I frequently see such errors:
org.springframework.r2dbc.UncategorizedR2dbcException: executeMany; SQL [select * from table1 where id=:id and column1 <> :a for update]; could not serialize access due to concurrent update; nested exception is io.r2dbc.postgresql.ExceptionFactory$PostgresqlTransientException: [40001] could not serialize access due to concurrent update
Transaction with query select * from table1 where id=:id and column1 <> :a for update was rollbacked.
Transaction isolation level - REPEATABLE READ.
How can I see what has locked this row? Lock is very short (milliseconds).
I see no helpful information in Postgres log and application log.
The problem here is not a concurrent lock, but a concurrent data modification. It is perfectly normal to get that error, and if you get it, you should simply repeat the failed transaction.
There is no way to find out which concurrent transaction updated the row unless you log all DML statements.
If you get a lot of these errors, you might consider switching to pessimistic locking using SELECT ... FOR NO KEY UPDATE.

Postgres locks within a transaction

I'm having trouble understanding how locks interact with transactions in Postgres.
When I run this (long) query, I am surprised by the high degree of locking that occurs:
BEGIN;
TRUNCATE foo;
\COPY foo FROM 'backup.txt';
COMMIT;
The documentation for \COPY doesn't mention what level of lock it requires, but this post indicates that it only gets a RowExclusiveLock. But when I run this query during the \COPY:
SELECT mode, granted FROM pg_locks
WHERE relation='foo'::regclass::oid;
I get this:
mode granted
RowExclusiveLock true
ShareLock true
AccessExclusiveLock true
Where the heck is that AccessExclusiveLock coming from? I assume it's coming from the TRUNCATE, which requires an AccessExclusiveLock. But the TRUNCATE finishes quickly, so I'd expect the lock to release quickly as well. This leaves me with a few questions.
When a lock is acquired by a command within a transaction, is that lock released at the end of the command (before the end of the transaction)? If so, why do I observe the above behavior? If not, why not? In fact, since transactions don't touch the table until the COMMIT, why does a TRUNCATE in a transaction need to block the table at all?
I don't see any discussion of this in the documentation for transactions in PG.
There are a couple of misconceptions to be cleared up here.
First, a transaction does touch the table before it is committed. The comment you are quoting says that a ROLLBACK (and a COMMIT as well) don't touch the table, which is something different. They record the transaction state in the commit log (in pg_clog), and COMMIT flushes the transaction log to disk (one notable exception to this is TRUNCATE, which is relevant for your question: the old table is kept around until the end of the transaction and gets deleted during COMMIT).
If all changes were held back until COMMIT and no locks would be taken, COMMIT would be quite expensive and would routinely fail because of concurrent modifications. The transaction would have to remember the state of the database as it was before and check if the changes still apply. This way of handling concurrency is called optimistic concurreny control, and while it is a decent strategy for an application, it won't work well for a relational database, where COMMIT should be efficient and should not fail (unless there are major problems with the infrastructure).
So what relational databases use is pessimistic concurrency control or locking, i.e. they lock a database object before they access it to prevent concurrent activity from getting in their way.
Second, relational databases use two-phase locking, in which locks (at least the user-visible, so-called heavyweight locks) are always held until the end of the transaction.
This is necessary (but not sufficient) to keep transactions in a logical order (serializable) and consistent. What if you release a lock, and somebody else removes the row that your inserted, but uncommitted row refers to via a foreign key constraint?
Answer to the question
The upshot of all this is that your table will keep the ACCESS EXCLUSIVE lock from TRUNCATE until the end of the transaction. Isn't it evident why that is necessary? If other transactions were allowed to even read the table after the (as of yet uncommitted) TRUNCATE, they would find it empty, since TRUNCATE really empties the table and does not adhere to MVCC semantics. Such a dirty read (of uncommitted data that might yet be rolled back) cannot be allowed.
If you really need read access to the table during the refill, you could use DELETE instead of TRUNCATE. The downside is that this is a much more expensive operation that will leave the table with a lot of “dead tuples” that have to be removed by autovacuum, resulting in a lot of empty space (table bloat). But if you are willing to live with a table and indexes that are bloated such that table and index scans will take at least twice as long, it is an option.

How do transactions work in the context of reads to the database?

I am using transactions to make changes to a SQL database. As I understand it, this means that changes to the database will happen in an all-or-nothing fashion. What I want to know is, does this have any guarantees for reads? For example, suppose I have some (pseudo)-code like this:
1) start TRANSACTION
2) INSERT INTO users ... // insert some data
3) count = SELECT COUNT(*) FROM ... // count something in the database
4) if count > 10: // do something based on the read
5) INSERT INTO other_table ... // write based on the read
6) COMMMIT TRANSACTION
In this code, I'm doing an INSERT, followed by a SELECT, and then conditionally doing another INSERT based on the outcome of the SELECT.
So my question is, if another process modifies the database between steps (3) and (5), what happens to the count variable, and to my transaction?
If it makes a difference, I am using PostgreSQL.
As Xin pointed out, it depends on the isolation level.
At the default READ COMMITTED level, records from other sessions will become visible as they are committed; you would see the same records if you didn't start a transaction at all (though of course, other processes would see your inserts appear at different times).
With REPEATABLE READ, your queries will not see any records committed by other sessions after your transaction starts. But while you don't have to worry about the result of SELECT COUNT(*) changing during your transaction, you can't assume that this result will still be accurate by the time you commit.
Using SERIALIZABLE provides the strongest guarantee: if your script does the right thing when given exclusive access to the database, then it will do the right thing in the presence of other serialisable transactions (or it will fail outright). However, this means that all transactions which might interfere with yours must be using the same isolation level (which comes at a cost), and all must be prepared to retry their transaction in the event of a serialisation failure.
When serialisable transactions are not an option, you generally guard against race conditions by explicitly locking things against concurrent writes. It's often enough to lock a selection of records, but you can't exactly lock the result of a COUNT(*); in your case, you'd probably need to lock the whole table.
I am not working on postgreSQL, but I think I can answer your question. Think of every query is parallel. I am saying so, because there are 2 transactions: when you insert into a; others can insert into b; then when you check b; whether you can see the new data depends on your isolation setting (read committed or just dirty read).
Also please note that, in database, there is a technology called lock: you can lock a table so that prevent altering it from others before committing your transaction.
Hope

FDW seems to lock table on foreign server

I try to use foreign table to link 2 postgresql databases
everything is fine and I can retrieve all data I want
the only issue is that the data wrapper seems to lock tables in foreign server and it's very annoying when I unit test my code
if I don't do any select request I can initialize data and truncate both tables in local server and tables in remote server
but I execute one select statement the truncate command on remote server seems to be in deadlock state
do you know how I can avoid this lock?
thanks
[edit]
I use this data wrapper to link 2 postgresql databases: http://interdbconnect.sourceforge.net/pgsql_fdw/pgsql_fdw-en.html
I use table1 of db1 as foreign table in db2
when I execute a select query in foreign_table1 in db2, there is an AccessShareLock for table1 in db1
the query is very simple: select * from foreign_table1
the lock is never released so when I execute a truncate command at the end of my unit test, there is a conflict because the truncate add an AccessExclusiveLock
I don't know how to release the first AccessShareLock but I think it would be done automatically by the wrapper...
hope this help
AccessExclusiveLock and AccessShareLock aren't generally obtained explicitly. They're obtained automatically by certain normal statements. See locking - the lock list says which statements acquire which locks, which says:
ACCESS SHARE
Conflicts with the ACCESS EXCLUSIVE lock mode only.
The SELECT command acquires a lock of this mode on referenced tables.
In general, any query that only reads a table and does not modify it
will acquire this lock mode.
What this means is that your 1st transaction hasn't committed or rolled back (thus releasing its locks) yet, so the 2nd can't TRUNCATE the table because TRUNCATE requires ACCESS EXCLUSIVE which conflicts with ACCESS SHARE.
Make sure the 1st transaction commits or rolls back.
BTW, is the "foreign" database actually the local database, ie are you using pgsql_fdw as an alternative to dblink to simulate autonomous transactions?

Advisory locks in postgres and evaluation order (how to acquire lock without using a separate query)

Is there any safe way of acquiring an advisory lock before executing a particular statement without using two separate queries? E.g., I assume that if I do something like the following, there is no guarantee that the lock will be acquired before the insert:
WITH x AS (SELECT pg_advisory_lock(1,2)) INSERT ...
But is there some similar way of getting the desired effect?
I'm pretty sure that SQL standards require implementations to behave as if the very first thing they do is to effectively materialize the common table expressions in the WITH clause. PostgreSQL complies with this requirement.
Common table expressions behave (mostly) as named objects. Multiple CTEs are materialized in the order they're declared. Backward references by name work as you'd expect, and forward references by name raise an error.
So I'm pretty sure that, in the general case, the CTE will have to materialize before the INSERT statement will run. But in your case, using PostgreSQL, I'm not dead certain, and here's why.
PostgreSQL's implementation [of common table expressions]
evaluates only as many rows of a WITH query as are actually
fetched by the parent query.
I'm not sure an INSERT statement fetches a row in this sense.
You haven't really told us enough about your use case to be sure, but in general for explicit locks to be useful in PostgreSQL they need to be acquired before the transaction acquires its snapshot. Advisory locks can be acquired before you start the transaction, and most locks with transactional scope should be acquired right after your begin your transaction; before anything which will need a transaction ID.
If you really don't need to acquire the lock before you have your transaction ID assigned and your snapshot set, and it is important to you that you issue one statement to acquire the lock and perform the insert, create a function which does both.