postgres SKIP LOCKED not working

Below are the steps I followed to test SKIP LOCKED:
Open a SQL console in a Postgres UI client.
Connect to the Postgres DB.
Execute the queries:
CREATE TABLE t_demo AS
SELECT *
FROM generate_series(1, 4) AS id;
Check that the rows were created in that table:
TABLE t_demo;
Select rows using the query below:
SELECT *
FROM t_demo
WHERE id = 2
FOR UPDATE SKIP LOCKED;
It returns the row with id = 2.
Now execute the above query again:
SELECT *
FROM t_demo
WHERE id = 2
FOR UPDATE SKIP LOCKED;
This second query should not return any rows, but it again returns the row with id = 2.

https://www.postgresql.org/docs/current/static/sql-select.html#SQL-FOR-UPDATE-SHARE
To prevent the operation from waiting for other transactions to
commit, use either the NOWAIT or SKIP LOCKED option
(emphasis mine)
If you run both queries in one window, then either you ran both in one transaction (in which case your second statement is not "another transaction"), or you are autocommitting after each statement (the default). In the autocommit case, the first statement's transaction commits before the second one starts, so the lock is already released and you observe no effect.
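A minimal two-session sketch of the intended test, using the t_demo table from the question (the key point is that both sessions must hold their transactions open at the same time):
Session 1:
BEGIN;
SELECT * FROM t_demo WHERE id = 2 FOR UPDATE SKIP LOCKED;
-- returns the row with id = 2 and keeps it locked; do not commit yet
Session 2 (a separate connection, while session 1 is still open):
BEGIN;
SELECT * FROM t_demo WHERE id = 2 FOR UPDATE SKIP LOCKED;
-- returns 0 rows: the only matching row is locked by session 1 and skipped
COMMIT;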

Related

How to avoid deadlock when deleting/updating the same record in Postgres

I ran into a scenario while working with Postgres.
We have one table with a primary key, and there are two concurrent processes: one can update a record, the other can delete a record.
We are now facing deadlocks when the two processes update/delete the same record in the table.
I googled how to avoid deadlocks, and someone suggested using "SELECT FOR UPDATE".
Suppose there are two statements as following
update table_A set name='aaaa' where cid=1;
delete from table_A where cid=1;
My question is,
(1) Do I need to add "SELECT FOR UPDATE" to both statements or just one statement in order to avoid deadlock?
(2) Could you give a complete example of how to add "SELECT FOR UPDATE"? I mean, what do the statements look like after you add it? I have never done this before and want to learn how.
SELECT ... FOR UPDATE locks the selected rows so that no other transaction can either update them or run SELECT ... FOR UPDATE on them. Those transactions must wait until the transaction holding the first SELECT ... FOR UPDATE releases the lock on the rows.
If SELECT ... FOR UPDATE is the first statement in every transaction, no deadlock can occur, because no transaction can already hold a lock on rows that another transaction will need later on.
So your two transactions should look like this:
BEGIN;
SELECT * FROM table_A WHERE cid = 1 FOR UPDATE;
-- some other statements
UPDATE table_A SET name = 'aaaa' WHERE cid = 1;
END;
and:
BEGIN;
SELECT * FROM table_A WHERE cid = 1 FOR UPDATE;
-- some other statements
DELETE FROM table_A WHERE cid = 1;
END;
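If each transaction touches more than one row, deadlocks typically come from two transactions locking the same rows in different orders. A hedged sketch (the cid values here are made up for illustration) that takes all locks up front in a consistent order:
BEGIN;
-- Lock every row this transaction will touch, in a fixed order,
-- so two concurrent transactions can never each hold a lock
-- that the other one still needs.
SELECT * FROM table_A WHERE cid IN (1, 2) ORDER BY cid FOR UPDATE;
UPDATE table_A SET name = 'aaaa' WHERE cid = 1;
DELETE FROM table_A WHERE cid = 2;
COMMIT;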

Force a "lock" with Postgres and GO

I am new to Postgres so this may be obvious (or very difficult, I am not sure).
I would like to force a table or row to be "locked" for at least a few seconds at a time, so that a second operation has to "wait".
I am using golang with "github.com/lib/pq" to interact with the database.
I need this because I am working on a project that monitors PostgreSQL. Thanks for any help.
You can also use select ... for update to lock a row or rows for the length of the transaction.
Basically, it's like:
begin;
select * from foo where quatloos = 100 for update;
update foo set feens = feens + 1 where quatloos = 100;
commit;
This takes an exclusive row-level lock on the rows of foo where quatloos = 100. Once the SELECT ... FOR UPDATE has run, any other transaction attempting to update, delete, or lock those rows will block until a COMMIT or ROLLBACK is issued.
Ideally, these locks should be held for as short a time as possible.
See: https://www.postgresql.org/docs/current/static/explicit-locking.html
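To hold a lock for "at least a few seconds", as the question asks, one option is to keep the transaction open across a pg_sleep() call. A sketch using the hypothetical foo table from the answer above:
BEGIN;
-- take an exclusive row-level lock on the matching rows
SELECT * FROM foo WHERE quatloos = 100 FOR UPDATE;
-- hold the lock for roughly five seconds
SELECT pg_sleep(5);
COMMIT;  -- the lock is released here
From Go with lib/pq, run all three statements on the same transaction (db.Begin(), then tx.Query()/tx.Exec(), then tx.Commit()), since the lock belongs to that transaction's connection.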

apparent transaction isolation violation in postgresql

I am using PostgreSQL 9.3.12 on CentOS Linux.
I have two processes connecting to the same database, using a default transaction isolation level of "read committed". According to the postgres docs, one process in a transaction should not "see" changes made by another process in a transaction until they are committed.
A sequence I am seeing is:
process A starts its transaction
process A deletes everything from table T
process B starts its transaction
process B attempts a select for update on one row in table T
process B comes up empty (0 rows) and calls rollback
process A repopulates table T from incoming data
process A commits its transaction
Now, table T should have been populated before both transactions began, and process B's query should have turned up one row. And it does if these processes do not run concurrently.
My understanding is that process B should see the old copy of the desired row in table T, make its changes, and then have those changes clobbered by process A's deletion and repopulation of table T. I can't figure out why process B is coming up empty.
Beyond a complete misunderstanding by myself of these preconditions, can anyone think of another reason why I would see this behaviour?
Worry not about the lousy architecture, it is going away. I'm just trying to understand why this situation seems to violate the "read committed" transaction isolation as I understand it.
Thanks.
According to the postgres docs, one process in a transaction should
not "see" changes made by another process in a transaction until they
are committed.
Yes and no - as usual, it depends. Strictly speaking, the documentation says:
Read Committed is the default isolation level in PostgreSQL.
When a transaction uses this isolation level, a SELECT query (without
a FOR UPDATE/SHARE clause) sees only data committed before the query
began; it never sees either uncommitted data or changes committed
during query execution by concurrent transactions. In effect, a SELECT
query sees a snapshot of the database as of the instant the query
begins to run. However, SELECT does see the effects of previous
updates executed within its own transaction, even though they are not
yet committed. Also note that two successive SELECT commands can see
different data, even though they are within a single transaction, if
other transactions commit changes after the first SELECT starts and
before the second SELECT starts.
UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands
behave the same as SELECT in terms of searching for target rows: they
will only find target rows that were committed as of the command start
time. However, such a target row might have already been updated (or
deleted or locked) by another concurrent transaction by the time it is
found. In this case, the would-be updater will wait for the first
updating transaction to commit or roll back (if it is still in
progress). If the first updater rolls back, then its effects are
negated and the second updater can proceed with updating the
originally found row. If the first updater commits, the second updater
will ignore the row if the first updater deleted it, otherwise it will
attempt to apply its operation to the updated version of the row. The
search condition of the command (the WHERE clause) is re-evaluated to
see if the updated version of the row still matches the search
condition. If so, the second updater proceeds with its operation using
the updated version of the row. In the case of SELECT FOR UPDATE and
SELECT FOR SHARE, this means it is the updated version of the row that
is locked and returned to the client.
In other words, a plain SELECT behaves differently from SELECT FOR UPDATE, DELETE, and UPDATE.
You can create a simple test case to observe this behaviour:
Session 1
test=> START TRANSACTION;
START TRANSACTION
test=> SELECT * FROM test;
x
----
1
2
3
4
5
6
7
8
9
10
(10 rows)
test=> DELETE FROM test;
DELETE 10
test=>
Now connect in another session, Session 2:
test=> START TRANSACTION;
START TRANSACTION
test=> SELECT * FROM test;
x
----
1
2
3
4
5
6
7
8
9
10
(10 rows)
test=> SELECT * FROM test WHERE x = 5 FOR UPDATE;
After the last command, the SELECT ... FOR UPDATE, session 2 "hangs" and is waiting for something ...
Back in session 1
test=> insert into test select * from generate_series(1,10);
INSERT 0 10
test=> commit;
COMMIT
And now when you go back to session 2 you will see this:
test=> SELECT * FROM test WHERE x = 5 FOR UPDATE;
x
---
(0 rows)
test=> select * from test;
x
----
1
2
3
4
5
6
7
8
9
10
(10 rows)
That is, the plain SELECT still shows ten rows, while SELECT ... FOR UPDATE does see that the original rows were deleted. The FOR UPDATE cannot return the newly inserted rows either, because they were not visible in its snapshot, which was taken before session 1 committed. (Strictly speaking, under READ COMMITTED the later plain SELECT takes a fresh per-statement snapshot, so the ten rows it shows are the newly inserted ones; they merely look identical to the deleted ones.)
In fact, the sequence you are seeing is:
process A starts its transaction
process A deletes everything from table T
process B starts its transaction
process B attempts a select for update on one row in table T
process B "hangs" and is waiting until session A does a commit or rollback
process A repopulates table T from incoming data
process A commits its transaction
process B comes up empty (0 rows, after process A's commit) and calls rollback
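If process B needs to detect this situation rather than silently coming up empty, one option (a sketch, assuming table T has a key column named id) is to run process B at REPEATABLE READ. There the blocked FOR UPDATE aborts with a serialization error once process A commits, and the application can retry the whole transaction:
BEGIN ISOLATION LEVEL REPEATABLE READ;
-- If process A deletes this row and commits while the statement below
-- is waiting, it fails with
--   ERROR: could not serialize access due to concurrent update
-- instead of returning 0 rows.
SELECT * FROM T WHERE id = 1 FOR UPDATE;
COMMIT;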

SQL Isolation levels or locks in large procedures

I have big stored procedures that handle user actions.
They consist of multiple SELECT statements. These are filtered, most of the time returning only one row. The SELECT results are copied into temp tables or otherwise evaluated.
Finally, a MERGE statement makes the needed changes in the DB.
All of this is encapsulated in a transaction.
I have concurrent input from users, and the rows selected by the SELECT statements must be locked to preserve data integrity.
How can I lock the selected rows of all the SELECT statements, so that they aren't updated by other transactions while the current transaction is in progress?
Does a table hint combination of ROWLOCK and HOLDLOCK work in a way that only the selected rows are locked, or are the whole tables locked because of the HOLDLOCK?
SELECT *
FROM dbo.Test
WITH (ROWLOCK, HOLDLOCK)
WHERE id = @testId
Can I instead use
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
right after the start of the transaction? Or does this lock the whole tables?
I am using SQL2008 R2, but would also be interested if things work differently in SQL2012.
PS: I just read about the table hints UPDLOCK and SERIALIZABLE. UPDLOCK seems to be a way to lock only one row, and it seems that UPDLOCK always takes a lock, whereas ROWLOCK only specifies that locks are row-based if locks are taken at all. I am still confused about the best way to solve this...
Changing the isolation level fixed the problem (and locked on row level):
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Here is how I tested it.
I created a statement in a blank query window of SQL Server Management Studio:
begin tran
select
*
into #message
from dbo.MessageBody
where MessageBody.headerId = 28
WAITFOR DELAY '0:00:05'
update dbo.MessageBody set [message] = 'message1'
where headerId = (select headerId from #message)
select * from dbo.MessageBody where headerId = (select headerId from #message)
drop table #message
commit tran
While executing this statement (which takes at least 5 seconds due to the delay), I ran the second query in another window:
begin tran
select
*
into #message
from dbo.MessageBody
where MessageBody.headerId = 28
update dbo.MessageBody set [message] = 'message2'
where headerId = (select headerId from #message)
select * from dbo.MessageBody where headerId = (select headerId from #message)
drop table #message
commit tran
and I was rather surprised that it executed instantaneously. This was due to the default SQL Server transaction isolation level "Read Committed" http://technet.microsoft.com/en-us/library/ms173763.aspx . Since the update in the first script happens after the delay, there are no uncommitted changes yet while the second script runs, so row 28 is read and updated.
Changing the isolation level to SERIALIZABLE prevented this, but it also prevented concurrency - both scripts were executed consecutively.
That was OK, since both scripts read and changed the same row (via headerId = 28). After changing headerId to another value in the second script, the statements were executed in parallel. So the locks taken under SERIALIZABLE seem to be at row level.
Adding the table hint
WITH (SERIALIZABLE)
to the first SELECT of the first statement also prevents further reads of the selected row.
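On the UPDLOCK question from the PS: a hedged alternative is to take update locks at the first read. Update locks are incompatible with each other, so the second script blocks at its SELECT instead of both scripts acquiring shared locks and then colliding when they try to write. A sketch based on the first script above:
begin tran
select
*
into #message
from dbo.MessageBody with (updlock, rowlock)
where MessageBody.headerId = 28
-- a concurrent script now waits here until this transaction commits
update dbo.MessageBody set [message] = 'message1'
where headerId = (select headerId from #message)
drop table #message
commit tran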

How can I see the number of rows a query affects when deleting

I want to see how many rows my delete query affects so I know it's correct.
Is this possible using pgadmin?
Start a transaction, delete, and then roll back.
In psql :
test1=> begin;
BEGIN
test1=> delete from test1 where test1_id = 1;
DELETE 2
test1=> rollback;
ROLLBACK
In pgAdmin (in the "History" tab on the "Output pane"):
-- Executing query:
begin;
Query returned successfully with no result in 16 ms.
-- Executing query:
delete from test1 where test1_id = 1;
Query returned successfully: 2 rows affected, 16 ms execution time.
-- Executing query:
rollback;
Query returned successfully with no result in 16 ms.
I'm not sure how to do this automatically, but you can always do a SELECT and then the DELETE.
SELECT COUNT(*) FROM foo WHERE delete_me=true;
DELETE FROM foo WHERE delete_me=true;
As Andrew said, when doing interactive administration, you can just replace DELETE by SELECT COUNT(*).
If you want this information in a program of yours (after executing the DELETE), many programming languages provide a construct for this. For example, in PHP it's pg_affected_rows and in .NET it's the return value of ExecuteNonQuery.
Use RETURNING and fetch the result like you would fetch a SELECT result:
DELETE FROM test1 WHERE test1_id = 1 RETURNING *;
This works as of PostgreSQL 8.2.
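If you want the count as a query result rather than reading the command tag, a sketch using a data-modifying CTE (available since PostgreSQL 9.1):
WITH deleted AS (
    DELETE FROM test1
    WHERE test1_id = 1
    RETURNING *
)
SELECT count(*) AS rows_deleted FROM deleted;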