How to lock a SELECT in PostgreSQL?

I am used to doing this in MySQL:
INSERT INTO ... SELECT ...
which would lock the table I SELECT from.
Now I am trying to do something similar in PostgreSQL: I select a set of rows from a table, and then I insert rows into other tables based on those rows' values. I want to prevent working with outdated data, so I am wondering how I can lock a SELECT in PostgreSQL.

There is no need to explicitly lock anything. A SELECT statement will always see a consistent snapshot of the table, no matter how long it runs.
The result will be no different if you lock the table against concurrent modifications before starting the SELECT, but you will harm concurrency unnecessarily.
If you need several queries to see a consistent state of the database, start a transaction with the REPEATABLE READ isolation level. Then all statements in the transaction will see the same state of the database.
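For example, a minimal sketch of that approach (the table and column names here are made up):
BEGIN ISOLATION LEVEL REPEATABLE READ;
-- Both statements below see the same snapshot of the database
SELECT id, price FROM products WHERE price > 100;
-- This INSERT works with exactly the rows the SELECT above saw
INSERT INTO expensive_products (product_id)
SELECT id FROM products WHERE price > 100;
COMMIT;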

Related

Is there any smart way for me to lock only the needed tables in my database?

I have a Postgres database in which all my tables have a history table. To avoid concurrent writes, I want to lock a table while an editor is editing it. The issue is that the edits come from a front-end service, so the editor may touch any of the tables, and the Postgres database is not aware of which one. I don't want to lock all the tables while the editor is editing. Is there any smart way for me to lock only the needed tables?
I understand your use case, but I strongly recommend that you do not use table locks for that. That would have unpleasant side effects.
You can hold a table lock only for the duration of a database transaction, so you would have to keep a transaction open for the whole time that the table is edited.
Now the user that edits the table may leave without finishing the editing session, and the transaction and lock stay active indefinitely. That has the following consequences:
The necessary autovacuum maintenance job can no longer remove dead rows in any database table, leading to bloat.
Worse, if the transaction stays open long enough, you may get problems with transaction ID wraparound.
Nobody else can edit the table.
So I recommend some “optimistic locking” approach:
Do not really lock the table.
When a user tries to edit a row, make sure that nobody else has changed the row since the user last read it:
UPDATE mytable
SET col1 = ..., ...
WHERE col1 = <original col1 value>
AND col2 = <original col2 value>
...;
If the values are no longer the original values, that UPDATE will change 0 rows. Then notify the user that the data were changed concurrently and load the current data again.
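A common variant of the same idea, sketched here with a hypothetical version column, compares a single counter instead of every column value:
UPDATE mytable
SET col1 = ..., version = version + 1
WHERE id = <row id>
  AND version = <version the user originally read>;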
If you really want to exclude others from modifying the table, use synchronization methods other than table locks: either something in the application code, or PostgreSQL advisory locks.
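A sketch of the advisory lock approach (the key 42 stands in for whatever number you map the edited table to):
-- Session-level advisory lock: returns true if nobody else holds key 42
SELECT pg_try_advisory_lock(42);
-- ... the editing session happens here, with no open transaction ...
-- Release the lock when the editor is done
SELECT pg_advisory_unlock(42);
Session-level advisory locks are released automatically when the database session ends, so an editor who disappears does not block others forever, and no transaction has to stay open.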

PostgreSQL select query on table that is being updated

I assume this question has been asked before, but unfortunately I cannot find the answer to my question.
I have a table, and I am using an UPDATE statement to update one of its columns. Simultaneously, I am running a CREATE TABLE ... AS query with a SELECT statement that retrieves data from the same table and column that is being updated.
My questions are: can this lead to wrong results in the output of the CREATE TABLE statement? Does the UPDATE query finish first, and only then does the CREATE TABLE with the SELECT execute? All I know is that the CREATE TABLE statement is taking much longer than expected to execute.
In PostgreSQL, readers never block writers and vice versa. This is guaranteed by PostgreSQL's MVCC implementation, which keeps old row versions around.
If the updating transaction isn't finished yet, the reading transaction will see the old value, and the result is consistent.
There is nothing inside PostgreSQL that should slow down the SELECT statement noticeably, but of course I/O contention is a possible explanation.
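You can observe this with a sketch like the following (table and values are made up):
-- Session 1: start an UPDATE, but do not commit yet
BEGIN;
UPDATE t SET val = 'new' WHERE id = 1;

-- Session 2: not blocked; it reads the old, committed row version,
-- so t_copy ends up containing val = 'old'
CREATE TABLE t_copy AS SELECT * FROM t;

-- Session 1:
COMMIT;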

How to obtain an indefinite lock on database in Rails (postgres database) for QA purposes?

I'm trying to obtain an indefinite lock on my Postgresql database (specifically on a table called orders) for QA purposes. In short, I want to know if certain locks on a table prevent or indefinitely block database migrations for adding columns (I think ALTER TABLE grabs an ACCESS EXCLUSIVE LOCK).
My plan is to:
grab a table lock or a row lock on the orders table
run the migration to add a column (an ALTER TABLE statement that grabs an ACCESS EXCLUSIVE LOCK)
issue a read statement to see if (2) is blocked (the ACCESS EXCLUSIVE LOCK blocks reads, and so this would be a problem that I'm trying to QA).
How would one do this? How do I grab a row lock on a table called orders via the Rails Console? How else could I do this?
Does my plan make sense?
UPDATE
It turns out that transactions holding row-level locks do indeed block ALTER TABLE statements that need an ACCESS EXCLUSIVE lock, such as migrations that add columns. For example, when I run this code in one process:
Order.first.with_lock do
binding.pry
end
It blocks my migration in another process that adds a column to the orders table. That migration's pending ACCESS EXCLUSIVE lock in turn blocks all reads and SELECT statements against the orders table, causing problems for end users.
Why is this?
Let's say you're in a transaction, selecting rows from a table with various where clauses. Halfway through, some other transaction adds a column to that table. Now you are getting back more fields than you did previously. How is your application supposed to handle this?
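If you want to reproduce this outside of Rails, a plain SQL sketch of the QA plan could look like this (using the orders table from the question; the column name is made up):
-- Session 1: take a row lock and keep the transaction open
BEGIN;
SELECT * FROM orders LIMIT 1 FOR UPDATE;
-- (do not COMMIT yet)

-- Session 2: the migration; it waits for its ACCESS EXCLUSIVE lock
ALTER TABLE orders ADD COLUMN test_col integer;

-- Session 3: lock waiters queue up, so this read now waits
-- behind the pending ALTER TABLE
SELECT count(*) FROM orders;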

Move truncated records to another table in Postgresql 9.5

The problem is the following: remove all records from one table and insert them into another.
I have a table that is partitioned by a date criterion. To avoid sorting each record into its partition one by one, I'm collecting the data in one table and periodically moving it to another table. The copied records have to be removed from the first table. I'm using a DELETE query with RETURNING, but the side effect is that autovacuum has a lot of work to do to clean up the mess in the original table.
I'm trying to achieve the same effect (copy and remove records), but without creating additional work for the vacuum mechanism.
As I'm removing all rows (a DELETE without a WHERE condition), I was thinking about TRUNCATE, but it does not support a RETURNING clause. Another idea was to somehow configure the table to remove a tuple from its page immediately on DELETE, without waiting for vacuum, but I have not found whether that is possible.
Can you suggest something, that I could use to solve my problem?
You need to use something like:
--Open your transaction
BEGIN;
--Prevent concurrent writes, but allow concurrent data access
LOCK TABLE table_a IN SHARE MODE;
--Copy the data from table_a to table_b; you could also use CREATE TABLE ... AS to do this
INSERT INTO table_b SELECT * FROM table_a;
--Empty table_a
TRUNCATE TABLE table_a;
--Commit and release the lock
COMMIT;

Possible to let stored procedures run one by one even if multiple sessions are calling them in PostgreSQL?

In PostgreSQL: multiple sessions each want to get one record from a table, but we need to make sure they don't interfere with each other. I could do it with a message queue: put the data in a queue, and then let each session get its data from the queue. But is it doable in PostgreSQL itself? It would be easier for the SQL folks to simply call a stored procedure. Is there any way to configure a stored procedure so that no concurrent calls happen, or to use some special lock?
I would recommend making sure the stored procedure uses SELECT FOR UPDATE, which should prevent the same row in the table from being accessed by multiple transactions.
Per the Postgres doc:
FOR UPDATE causes the rows retrieved by the SELECT statement to be locked as though for update. This prevents them from being modified or deleted by other transactions until the current transaction ends. That is, other transactions that attempt UPDATE, DELETE, SELECT FOR UPDATE, SELECT FOR NO KEY UPDATE, SELECT FOR SHARE or SELECT FOR KEY SHARE of these rows will be blocked until the current transaction ends. The FOR UPDATE lock mode is also acquired by any DELETE on a row, and also by an UPDATE that modifies the values of certain columns. Currently, the set of columns considered for the UPDATE case are those that have a unique index on them that can be used in a foreign key (so partial indexes and expressional indexes are not considered), but this may change in the future.
See the SELECT documentation for more information.
So that you don't end up locking all of the rows in the table at once (i.e., by SELECTing all of the records), I would recommend you use ORDER BY to sort the table in a consistent manner, and then LIMIT 1 so that each call only gets the next row in the queue. Also add a WHERE clause that checks a status column (e.g., processed), and once a row has been handled, set that column to a value that prevents the WHERE clause from picking it up again.
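A minimal sketch of that pattern (the tasks table, its columns, and the ordering key are all hypothetical):
BEGIN;
-- Take the next unprocessed row and lock it; a concurrent session
-- running the same query blocks here until this transaction ends
SELECT id, payload
FROM tasks
WHERE processed = false
ORDER BY id
LIMIT 1
FOR UPDATE;

-- ... process the row ...

-- Mark the row so the WHERE clause skips it from now on
UPDATE tasks SET processed = true WHERE id = <id from the SELECT above>;
COMMIT;
One caveat: a session that was blocked can get zero rows back once the winning transaction commits, because the row it waited on no longer matches the WHERE clause; callers should simply retry in that case.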