Lightest lock for exclusive inserts in PostgreSQL

I want to limit INSERTs to whichever transaction gets the lock (with the others waiting in line, not failing) while allowing concurrent reads, updates, and deletes (but obviously not of the data being inserted, which is impossible in PG anyway).
What is the lightest LOCK to achieve this?

Yes, if you want to lock a table against all concurrent modifications, SHARE ROW EXCLUSIVE is the cheapest lock: it is the weakest table lock mode that conflicts both with itself and with the ROW EXCLUSIVE lock that every INSERT, UPDATE, and DELETE takes. Note that no table lock mode can block only INSERTs from other sessions while still allowing their UPDATEs and DELETEs, since all three take the same ROW EXCLUSIVE lock.
I'm not going to ask why you want to restrict concurrency in that way...
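A minimal sketch of what that looks like (mytable is a placeholder name):

BEGIN;
-- Only one transaction at a time can hold this mode, so concurrent
-- INSERT/UPDATE/DELETE queue up behind it; plain SELECTs proceed normally.
LOCK TABLE mytable IN SHARE ROW EXCLUSIVE MODE;
INSERT INTO mytable (col) VALUES ('value');
COMMIT;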

Related

In databases, is row level locking an example of ACID, optimistic concurrency, or both?

Simultaneous writes.
Also, what happens in a NoSQL database?
I'll ignore the NoSQL part, otherwise I would have to close the question as too unfocused.
Row level locking is a technique that relational databases use to provide isolation, which is the I of ACID. Isolation means that concurrent database sessions are isolated from each other – the database tries to keep them from being influenced by each other's activities.
Specifically, if two concurrent sessions try to modify the same data row, they have to “take turns”: the second one has to wait until the transaction of the first session is done. This wait is usually very short and does not hurt, but it prevents inconsistencies (consistency is the C of ACID).
Row level locking, and locking in general, are part of pessimistic locking: you lock a row to prevent other sessions from messing with the row while you are working on it. It is done with SELECT ... FOR UPDATE. It is called “pessimistic” because it reflects a mindset like “I expect someone will try to modify the row while I am working on it, so let's lock it to be sure”.
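For example (a sketch with assumed table and column names):

BEGIN;
-- The row is now locked; concurrent writers and lockers must wait.
SELECT * FROM accounts WHERE id = 42 FOR UPDATE;
UPDATE accounts SET balance = balance - 100 WHERE id = 42;
COMMIT;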
Optimistic locking is ill-named, because no locks are actually taken. You don't prevent concurrent transactions from modifying the row you are interested in. Instead you check afterwards if the row has been modified by a concurrent transaction or not, and if it has, you try the operation again.
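A sketch of the optimistic pattern, using an assumed version column:

SELECT balance, version FROM accounts WHERE id = 42;  -- say it returns version 7
-- ... compute the new balance in the application, then:
UPDATE accounts
SET balance = 150, version = version + 1
WHERE id = 42 AND version = 7;
-- UPDATE 0 means a concurrent transaction got there first: re-read and retry.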

Lock for SELECT so another process doesn't get old data

I have a table that could have two threads reading data from it. If the data is in a certain state (let's say state 1) then the process will do something (not relevant to this question) and then update the state to 2.
It seems to me that there could be a case where thread 1 and thread 2 both perform a select within microseconds of one another and both see that the row is in state 1, and then both do the same thing and 2 updates occur after locks have been released.
Question is: Is there a way in Postgres to prevent the second thread from modifying this data, i.e., force it to do another SELECT after the first thread's lock is released, so it knows to bail out and prevent duplicate updates?
I looked into row locking, but it says you cannot prevent select statements which sounds like it won't work for my condition here. Is my only option to use advisory locks?
Your question, referencing an unknown source:
I looked into row locking, but it says you cannot prevent select
statements which sounds like it won't work for my condition here. Is
my only option to use advisory locks?
The official documentation on the matter:
Row-level locks do not affect data querying; they block only writers
and lockers to the same row.
Concurrent attempts will not just select but try to take out the same row-level lock with SELECT ... FOR UPDATE - which causes them to wait for any previous transaction holding a lock on the same row to either commit or roll back. Just what you wanted.
However, in versions before 9.5, many use cases are better solved with advisory locks. You can still lock rows being processed with FOR UPDATE additionally to be safe. But if the next transaction just wants to process the "next free row", it is often much more efficient not to wait for the same row (which is almost certainly unavailable once the lock is released) but to skip to the "next free" row immediately.
In Postgres 9.5+ consider FOR UPDATE SKIP LOCKED for this. Like @Craig commented, this can largely replace advisory locks.
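A sketch of the 9.5+ approach, with assumed table and column names:

BEGIN;
-- Claim the next unprocessed row; rows already locked by other
-- workers are skipped instead of waited on.
SELECT id FROM jobs
WHERE state = 1
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- ... process the returned row, then:
UPDATE jobs SET state = 2 WHERE id = [returned id];
COMMIT;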
Related question stumbling over the same performance hog:
Function taking forever to run for large number of records
Explanation and code example for advisory locks or FOR UPDATE SKIP LOCKED in Postgres 9.5+:
Postgres UPDATE ... LIMIT 1
To lock many rows at once:
How to mark certain nr of rows in table on concurrent access
What you want is the fairly common SQL SELECT ... FOR UPDATE. The Postgres-specific documentation covers the details.
Using SELECT FOR UPDATE will lock the selected records for the span of the transaction, giving you time to update them before another transaction can lock or modify them. Plain SELECTs are not blocked, but a competing SELECT ... FOR UPDATE on the same rows will wait.
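For the state-1-to-state-2 scenario in the question, that looks roughly like this (names assumed):

BEGIN;
SELECT state FROM tasks WHERE id = 123 FOR UPDATE;  -- the second thread blocks here
-- Re-check after acquiring the lock: if state is no longer 1,
-- another thread won the race and we bail out.
UPDATE tasks SET state = 2 WHERE id = 123 AND state = 1;
COMMIT;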

postgresql concurrent queries as stored procedures

I have two stored procedures that interact with the same tables.
The first one executes for several hours; the second one is instant.
So if I run the first one and then the second one (on a second connection), the second procedure will wait for the first one to end.
It is harmless for my data if both run at the same time. How can I do that?
The fact that the shorter query is blocked even though it runs on a second connection suggests that the longer query is taking an exclusive lock on the table.
That suggests it is doing writes; if both were only reading, there should not be any locking issues. pgAdmin can show which locks are active during the longer query, and whether the shorter query is indeed blocked on it.
If the longer query is indeed doing writes, it's possible that you may be able to reduce the lock contention -- by chunking it, for example, which could allow readers in between chunked updates/inserts -- but if it's an operation that requires an exclusive write lock, then it will block everybody until it's done.
It's also possible that you may be able to optimize the query such that it needs to be a lower-level lock that isn't exclusive, but that would all depend on the specifics of what the query is doing and your data.
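A hedged sketch of the chunking idea, assuming a big_table with a processed flag; the application runs this in a loop, committing each round, until it updates zero rows:

UPDATE big_table
SET processed = true
WHERE id IN (SELECT id FROM big_table
             WHERE NOT processed
             LIMIT 10000);

Because each round is its own short transaction, readers and the quick procedure can slip in between batches instead of waiting for hours.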

Controlling duration of PostgreSQL lock waits

I have a table called deposits
When a deposit is made, the table is locked, so the query looks something like:
SELECT * FROM deposits WHERE id=123 FOR UPDATE
I assume FOR UPDATE is locking the table so that we can manipulate it without another thread stomping on the data.
The problem occurs though, when other deposits are trying to get the lock for the table. What happens is, somewhere in between locking the table and calling psql_commit() something is failing and keeping the lock for a stupidly long amount of time. There are a couple of things I need help addressing:
Subsequent queries trying to get the lock should fail, I have tried achieving this with NOWAIT but would prefer a timeout method (because it may be ok to wait, just not wait for a 'stupid amount of time')
Ideally I would head this off at the pass, and have my initial query only hold the lock for a certain amount of time. Is this possible with PostgreSQL?
Is there some other magic function I can tack onto the query (similar to NOWAIT) which will only wait for the lock for 4 seconds before failing?
Due to the painfully monolithic spaghetti-code nature of the code base, it's not simply a matter of changing global configs; it kinda needs to be a per-query based solution
Thanks for your help guys, I will keep poking around but I haven't had much luck. Is this a non-existent feature of Postgres? Because I found this: http://www.postgresql.org/message-id/40286F1F.8050703@optusnet.com.au
I assume FOR UPDATE is locking the table so that we can manipulate it without another thread stomping on the data.
Nope. FOR UPDATE locks only those rows, so that another transaction that attempts to lock them (with FOR SHARE, FOR UPDATE, UPDATE or DELETE) blocks until your transaction commits or rolls back.
If you want a whole table lock that blocks inserts/updates/deletes you probably want LOCK TABLE ... IN EXCLUSIVE MODE.
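For example, with the deposits table from the question:

BEGIN;
-- EXCLUSIVE mode blocks INSERT/UPDATE/DELETE and row locks,
-- but plain SELECTs (without FOR UPDATE/SHARE) still run.
LOCK TABLE deposits IN EXCLUSIVE MODE;
-- ... do the work ...
COMMIT;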
Subsequent queries trying to get the lock should fail, I have tried achieving this with NOWAIT but would prefer a timeout method (because it may be ok to wait, just not wait for a 'stupid amount of time')
See the lock_timeout setting. This was added in 9.3 and is not available in older versions.
Crude approximations for older versions can be achieved with statement_timeout, but that can lead to statements being cancelled unnecessarily. If statement_timeout is 1s and a statement waits 950ms on a lock, it might then get the lock and proceed, only to be immediately cancelled by a timeout. Not what you want.
There's no query-level way to set lock_timeout, but you can and should just:
SET LOCAL lock_timeout = '1s';
after you BEGIN a transaction.
Ideally I would head this off at the pass, and have my initial query only hold the lock for a certain amount of time. Is this possible with PostgreSQL?
There is a statement timeout, but locks are held at transaction level. There's no transaction timeout feature.
If you're running single-statement transactions you can just set a statement_timeout before running the statement to limit how long it can run for. This isn't quite the same thing as limiting how long it can hold a lock, though, because it might wait 900ms of an allowed 1s for the lock, only actually hold the lock for 100ms, then get cancelled by the timeout.
Is there some other magic function I can tack onto the query (similar to NOWAIT) which will only wait for the lock for 4 seconds before failing?
No. You must:
BEGIN;
SET LOCAL lock_timeout = '4s';
SELECT ....;
COMMIT;
Due to the painfully monolithic spaghetti-code nature of the code base, it's not simply a matter of changing global configs; it kinda needs to be a per-query based solution
SET LOCAL is suitable, and preferred, for this.
There's no way to do it in the text of the query, it must be a separate statement.
The mailing list post you linked to is a proposal for syntax that was never implemented, at least in any public PostgreSQL release; that syntax does not exist.
In a situation like this you may want to consider "optimistic concurrency control", often called "optimistic locking". It gives you greater control over locking behaviour at the cost of increased rates of query repetition and the need for more application logic.

PostgreSQL Lock Row on indefinitely time

I want to let a user lock one row for an indefinite time while he works with it, and he must unlock it when done, so that no other user can lock that row for himself in the meantime. Is it possible to do this at the database level?
You can do it with a long-lived transaction, but there'll be performance issues with that. This sounds like more of a job for optimistic concurrency control.
You can just open a transaction and do a SELECT 1 FROM mytable WHERE <clause to match the row> FOR UPDATE;, then keep the transaction open until you're done. The problem with this is that it can cause issues with vacuum that result in table and index bloat, where tables get filled with dead rows and indexes fill up with entries pointing to obsolete blocks.
It'd be much better to use an advisory lock. You still have to hold the connection that holds the lock open, but it doesn't have to keep an idle transaction open, so it's much lower impact. Transactions that wish to update the row must explicitly check for a conflicting advisory lock, though; otherwise they can just proceed as if it weren't locked. This approach also scales poorly to lots of tables (due to the limited advisory lock namespace) or lots of concurrent locks (due to the number of connections).
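A minimal sketch (the two key values are just a convention you choose, e.g. a number identifying the table plus the row id):

SELECT pg_try_advisory_lock(1, 123);  -- true if acquired, false if another session holds it
-- ... the user works on row 123 for as long as needed ...
SELECT pg_advisory_unlock(1, 123);    -- release when done; also released if the session disconnects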
You can use a trigger to check for the advisory lock and wait for it if you can't make sure your client apps will always get the advisory lock explicitly. However, this can create deadlock issues.
For that reason, the best approach is probably to have a locked_by field that records a user ID, and a locked_time field that records when it was locked. Do it at the application level and/or with triggers. To deal with concurrent attempts to obtain the lock, you can use optimistic concurrency control techniques: the WHERE clause on the UPDATE that sets locked_by and locked_time will not match if someone else gets there first, so the rowcount will be zero and you'll know you lost the race for the lock and have to re-check. That WHERE clause usually tests locked_by and locked_time. So you'd write something like:
UPDATE t
SET locked_by = 'me',
    locked_time = current_timestamp
WHERE locked_by IS NULL AND locked_time IS NULL
AND id = [ID of row to update];
(This is a simplified optimistic locking mode for grabbing a lock, where you don't mind if someone else jumped in and did an entire transaction. If you want stricter ordering, you use a row-version column or you check that a last_modified column hasn't changed.)
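A sketch of the stricter variant, checking an assumed last_modified column instead:

UPDATE t
SET locked_by = 'me', locked_time = current_timestamp
WHERE id = [ID of row to update]
AND last_modified = [timestamp read when the row was fetched];
-- zero rows updated means the row changed since you read it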