I have a question. The transaction isolation level is set to SERIALIZABLE. When one user opens a transaction and INSERTs or UPDATEs data in "table1", and then another user opens a transaction and tries to INSERT data into the same table, does the second user need to wait until the first user commits the transaction?
Generally, no. The second transaction is only inserting, so unless there is a unique index check or a trigger that needs to fire, the data can be inserted unconditionally. In the case of a unique index (including a primary key), the insert will block if both transactions insert rows with the same key value, e.g.:
-- Session 1:
CREATE TABLE t (x INT PRIMARY KEY);
BEGIN;
INSERT INTO t VALUES (1);

-- Session 2:
BEGIN;
INSERT INTO t VALUES (1); -- blocks here

-- Session 1:
COMMIT;

-- Session 2:
-- the INSERT finally completes with a duplicate key error
Things are less obvious in the case of updates that may affect insertions by the other transaction. I understand PostgreSQL does not yet support "true" serialisability in this case. I do not know how commonly it is supported by other SQL systems.
See http://www.postgresql.org/docs/current/interactive/mvcc.html
The second user will be blocked until the first user commits or rolls back his/her changes.
Related
Please help with my understanding of how triggers and locks can interact
I bulk load records into a table with statements something like this:
BEGIN;
INSERT INTO table_a VALUES (record1), (record2), (record3), ...;
INSERT INTO table_a VALUES (record91), (record92), (record93), ...;
-- ...
COMMIT;
There can be several hundred records in a single INSERT, and there can be several dozen INSERT statements between COMMITs.
table_a has a trigger on it, defined as:
AFTER INSERT ON table_a FOR EACH ROW EXECUTE PROCEDURE foo();
The procedure foo() parses each new row as it's added and will (amongst other stuff) update a record in a summary table, table_b (uniquely identified by its primary key). So, for every record inserted into table_a, a corresponding record is updated in table_b.
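For illustration, foo() presumably looks something like the sketch below; the summary columns (group_id, amount, total, id) are assumptions made up for the sketch, only table_a, table_b and the AFTER INSERT trigger come from the question:

-- Hypothetical sketch of foo(); column names are assumptions.
CREATE OR REPLACE FUNCTION foo() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- This UPDATE takes a row lock on the matching table_b row; the lock
    -- is held until the surrounding transaction commits or rolls back.
    UPDATE table_b
       SET total = total + NEW.amount
     WHERE id = NEW.group_id;
    RETURN NEW;
END;
$$;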
I have a 2nd process that also attempts to (occasionally) update records in table_b. On very rare occasions it may attempt to update the same row in table_b that the bulk process is updating.
Questions: should anything in the bulk insert statements affect my 2nd process being able to update records in table_b? I understand that the bulk insert process will obtain a row lock each time it updates a row in table_b, but when will that row lock be released? When the individual record (record1, record2, record3, etc.) has been inserted? When the entire INSERT statement has completed? Or when the COMMIT is reached?
Some more info - my overall purpose for this question is to try to understand why my 2nd process occasionally pauses for a minute or more when trying to update a row in table_b that is also being updated by the bulk-load process. What appears to be happening is that the lock on the target record in table_b isn't actually being released until the COMMIT has been reached - which is contrary to what I think ought to be happening. (I think a row-lock should be released as soon as the UPDATE on that row is done)
UPDATE after answer(s) - yes of course you're both right. In my mind I had somehow convinced myself that the individual updates performed within the trigger were somehow separate from the overall BEGIN and COMMIT of the whole transaction. Silly me.
The practice of adding multiple records with one INSERT, and multiple INSERTs between COMMITs, was introduced to improve the bulk load speed (which it does). I had forgotten about the side-effect of increasing the time before locks are released.
What should happen when the transaction is rolled back? It is rather obvious that all inserts on table_a, as well as all updates on table_b, should be rolled back. This is why all rows of table_b updated by the trigger will be locked until the transaction completes.
Committing after each insert (reducing the number of rows inserted in a single transaction) will reduce the chance of conflicts with concurrent processes.
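For instance, the same load could be broken into one (or a few) INSERT statements per transaction, so the trigger's row locks on table_b are released at every COMMIT rather than only at the end of the whole batch; a rough sketch (record1 etc. are placeholders, as in the question):

BEGIN;
INSERT INTO table_a VALUES (record1), (record2), (record3), ...;
COMMIT;   -- locks on table_b rows taken by the trigger are released here

BEGIN;
INSERT INTO table_a VALUES (record91), (record92), (record93), ...;
COMMIT;   -- and here, instead of only once at the very end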
I am trying to get a better grasp on how transactions work in PostgreSQL. I did a lot of research, but I could not find any answers to the following questions.
question 1
I have two transactions with isolation set to read committed, the default. I also have the following table:
create table test(a integer primary key);
Let's start the first transaction:
begin;
insert into test(a) values(1);
Now let's start the second transaction and do the same:
begin;
insert into test(a) values(1);
Now I notice that the second transaction blocks until the first transaction either commits or rolls back. Why is that? Why isn't it possible for the second transaction to simply continue after the insert and throw a unique-constraint violation when the transaction is committed, instead of throwing the exception directly after the insert call?
question 2
Now, a second scenario. Let's start from scratch with the first transaction:
begin;
insert into test(a) values(1);
delete from test where a = 1;
Now let's go to the second transaction:
begin;
insert into test(a) values(1);
Now I notice that the second transaction is also blocking. Why is it blocking on a row which does not exist anyway?
Why is it blocking on a row which does not exist anyway?
Because both transactions are inserting the same value for the primary key. The second transaction needs to wait for the outcome of the first transaction to know whether its insert can succeed. In the first scenario, if the first transaction commits, the second will fail with a primary key violation; if the first transaction rolls back, the insert in the second transaction succeeds. In the second scenario, the first transaction has already deleted the row again, so once it completes (whether by commit or rollback) the conflicting row is gone and the blocked insert can proceed.
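For example, in the first scenario, rolling back the first transaction lets the blocked insert go through; a sketch of the interleaving:

-- Transaction 1:
begin;
insert into test(a) values(1);

-- Transaction 2:
begin;
insert into test(a) values(1);   -- blocks, waiting for transaction 1

-- Transaction 1:
rollback;

-- Transaction 2:
-- the blocked insert now succeeds
commit;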
Update: Potential solution below
I have a large corpus of configuration files consisting of key/value pairs that I'm trying to push into a database. A lot of the keys and values are repeated across configuration files, so I'm storing the data using three tables: one for all unique keys, one for all unique values, and one listing the key/value pairs in each file.
Problem:
I'm using multiple concurrent processes (and therefore connections) to add the raw data to the database. Unfortunately I get a lot of deadlocks when trying to add values to the keys and values tables. I have tried a few different methods of inserting the data (shown below), but I always end up with a "deadlock detected" error:
TransactionRollbackError: deadlock detected DETAIL: Process 26755
waits for ShareLock on transaction 689456; blocked by process 26754.
Process 26754 waits for ShareLock on transaction 689467; blocked by
process 26755.
I was wondering if someone could shed some light on exactly what could be causing these deadlocks, and possibly point me towards some way of fixing the issue. Looking at the SQL statements I'm using (listed below), I don't really see why there is any co-dependency at all. Thanks for reading!
Example config file:
example_key this_is_the_value
other_example other_value
third example yet_another_value
Table definitions:
CREATE TABLE keys (
    id SERIAL PRIMARY KEY,
    hash UUID UNIQUE NOT NULL,
    key TEXT);

-- "values" has to be quoted because VALUES is a reserved word
CREATE TABLE "values" (
    id SERIAL PRIMARY KEY,
    hash UUID UNIQUE NOT NULL,
    value TEXT);

CREATE TABLE keyvalue_pairs (
    id SERIAL PRIMARY KEY,
    file_id INTEGER REFERENCES filenames,
    key_id INTEGER REFERENCES keys,
    value_id INTEGER REFERENCES "values");
SQL Statements:
Initially I was trying to use this statement to avoid any exceptions:
WITH s AS (
    SELECT id, hash, key FROM keys
    WHERE hash = 'hash_value'
), i AS (
    INSERT INTO keys (hash, key)
    SELECT 'hash_value', 'key_value'
    WHERE NOT EXISTS (SELECT 1 FROM s)
    RETURNING id, hash, key
)
SELECT id, hash, key FROM i
UNION ALL
SELECT id, hash, key FROM s;
But even something as simple as this causes the deadlocks:
INSERT INTO keys (hash, key)
VALUES ('hash_value', 'key_value')
RETURNING id;
In both cases, if I get an exception thrown because the inserted hash value is not unique, I use savepoints to roll back the change and another statement to just select the id I'm after.
I'm using hashes for the unique field, as some of the keys and values are too long to be indexed.
Full example of the python code (using psycopg2) with savepoints:
key_value = 'this_key'
hash_val = generate_uuid(key_value)   # generate_uuid() is my own helper

try:
    cursor.execute(
        '''
        SAVEPOINT duplicate_hash_savepoint;
        INSERT INTO keys (hash, key)
        VALUES (%s, %s)
        RETURNING id;
        ''',
        (hash_val, key_value)
    )
    result = cursor.fetchone()[0]
    cursor.execute('''RELEASE SAVEPOINT duplicate_hash_savepoint''')
    return result
except psycopg2.IntegrityError as e:
    cursor.execute(
        '''
        ROLLBACK TO SAVEPOINT duplicate_hash_savepoint;
        '''
    )

    # TODO: Should ensure that values match and this isn't just
    # a hash collision
    cursor.execute(
        '''
        SELECT id FROM keys WHERE hash=%s LIMIT 1;
        ''',
        (hash_val,)
    )
    return cursor.fetchone()[0]
Update:
So I believe I found a hint on another Stack Exchange site:
Specifically:
UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the command start time. However, such a target row might have already been updated (or deleted or locked) by another concurrent transaction by the time it is found. In this case, the would-be updater will wait for the first updating transaction to commit or roll back (if it is still in progress). If the first updater rolls back, then its effects are negated and the second updater can proceed with updating the originally found row. If the first updater commits, the second updater will ignore the row if the first updater deleted it, otherwise it will attempt to apply its operation to the updated version of the row.
While I'm still not exactly sure where the co-dependency is, it seems that processing a large number of key/value pairs without committing could well result in something like this. Sure enough, if I commit after each individual configuration file is added, the deadlocks don't occur.
It looks like you're in this situation:
The table to INSERT into has a primary key (or unique index(es) of any sort).
Several INSERTs into that table are performed within one transaction (as opposed to committing immediately after each one)
The rows to insert come in random order (with regard to the primary key)
The rows are inserted in concurrent transactions.
This situation creates the following opportunity for deadlock:
Assume there are two sessions, each of which has started a transaction.
Session #1: insert row with PK 'A'
Session #2: insert row with PK 'B'
Session #1: try to insert row with PK 'B'
=> Session #1 is put to wait until Session #2 commits or rolls back
Session #2: try to insert row with PK 'A'
=> Session #2 is put to wait for Session #1.
Shortly thereafter, the deadlock detector becomes aware that both sessions are now waiting for each other, and terminates one of them with a fatal "deadlock detected" error.
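A minimal reproduction of that interleaving (the table and key values are illustrative only, not from the question):

CREATE TABLE pk_demo (id text PRIMARY KEY);   -- illustrative table

-- Session 1:
BEGIN;
INSERT INTO pk_demo VALUES ('A');

-- Session 2:
BEGIN;
INSERT INTO pk_demo VALUES ('B');

-- Session 1:
INSERT INTO pk_demo VALUES ('B');   -- waits for session 2

-- Session 2:
INSERT INTO pk_demo VALUES ('A');   -- waits for session 1
-- shortly afterwards, one of the two sessions is aborted with "deadlock detected"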
If you're in this scenario, the simplest solution is to COMMIT after a new entry is inserted, before attempting to insert any new row into the table.
Postgres is known for this type of deadlock, to be honest. I often encounter such problems when different workers update information about interleaving entities. Recently I had the task of importing a big list of scientific paper metadata from multiple JSON files. I was using parallel processes via joblib to read from several files at the same time. Deadlocks kept occurring on the authors(id bigint primary key, name text) table, because many files contained papers by the same authors and therefore produced inserts with oftentimes the same authors. I was using insert into authors (id, name) values %s on conflict (id) do nothing, but that was not helping. I tried sorting the tuples before sending them to the Postgres server, with little success. What really helped me was keeping a list of known authors in a Redis set (accessible to all processes):
if not rexecute("sismember", "known_authors", author_id):
    # your logic...
    rexecute("sadd", "known_authors", author_id)
I recommend this approach to everyone. Use Memurai if you are limited to Windows. Sad but true, there are not a lot of other options for Postgres.
In PostgreSQL: multiple sessions want to get one record from the table, but we need to make sure they don't interfere with each other. I could do it using a message queue: put the data in a queue, and then let each session get data from the queue. But is it doable in PostgreSQL, since it would be easier for the SQL guys to call a stored procedure? Is there any way to configure a stored procedure so that no concurrent calls will happen, or to use some special lock?
I would recommend making sure the stored procedure uses SELECT FOR UPDATE, which should prevent the same row in the table from being picked up by multiple transactions at the same time.
Per the Postgres doc:
FOR UPDATE causes the rows retrieved by the SELECT statement to be locked as though for update. This prevents them from being modified or deleted by other transactions until the current transaction ends. That is, other transactions that attempt UPDATE, DELETE, SELECT FOR UPDATE, SELECT FOR NO KEY UPDATE, SELECT FOR SHARE or SELECT FOR KEY SHARE of these rows will be blocked until the current transaction ends. The FOR UPDATE lock mode is also acquired by any DELETE on a row, and also by an UPDATE that modifies the values on certain columns. Currently, the set of columns considered for the UPDATE case are those that have a unique index on them that can be used in a foreign key (so partial indexes and expressional indexes are not considered), but this may change in the future.
More SELECT info.
So that you don't end up locking all of the rows in the table at once (i.e. by SELECTing all of the records), I would recommend you use ORDER BY to sort the table in a consistent manner and then LIMIT 1, so that it only gets the next row in the queue. Also add a WHERE clause that checks for a certain column value (e.g. processed), and once a row has been processed, set that column to a value that will prevent the WHERE clause from picking it up again.
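Putting those pieces together, a stored procedure along these lines would hand each caller a different row; the jobs table, its columns and the function name take_next_job() are assumptions made up for this sketch:

-- Sketch only: "jobs", its columns and take_next_job() are hypothetical names.
CREATE TABLE jobs (
    id        bigserial PRIMARY KEY,
    payload   text,
    processed boolean NOT NULL DEFAULT false
);

CREATE OR REPLACE FUNCTION take_next_job() RETURNS jobs
LANGUAGE plpgsql AS $$
DECLARE
    j jobs;
BEGIN
    -- Lock the next unprocessed row; a concurrent caller that picks the
    -- same row blocks here until this transaction ends.
    SELECT * INTO j
      FROM jobs
     WHERE processed = false
     ORDER BY id
     LIMIT 1
       FOR UPDATE;

    IF FOUND THEN
        -- Mark the row so the WHERE clause skips it from now on.
        UPDATE jobs SET processed = true WHERE id = j.id;
    END IF;

    RETURN j;
END;
$$;

-- Usage: SELECT * FROM take_next_job();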
I'd like to create a web service that allows a client to fetch all rows in a table, and then later allows the client to only fetch new or updated rows.
The simplest implementation seems to be to send the current timestamp to the client, and then have the client ask for rows that are newer than the timestamp in the following request.
It seems that this is doable by keeping an "updated_at" column with a timestamp set to NOW() in update and insert triggers, and then querying newer rows, and also passing down the value of NOW().
The problem is that if there are uncommitted transactions, these transactions will set updated_at to the start time of the transaction, not the commit time.
As a result, this simple implementation doesn't work, because rows can be lost since they can appear with a timestamp in the past.
I have been unable to find any simple solution to this problem, despite the fact that it seems to be a very common need: any ideas?
Possible solutions:
Keep a monotonic timestamp in a table, update it at the start of every transaction to MAX(NOW(), last_timestamp + 1), and use it as the row timestamp (see the sketch after this list). Problem: this effectively means that all write transactions are fully serialized and lock the whole database, since they conflict on the update-time table.
At the end of the transaction, add a mapping from NOW() to the time in an update table like the above solution. This seems to require taking an explicit lock and using a sequence to generate non-temporal "timestamps", because just using an UPDATE on a single row would cause rollbacks in SERIALIZABLE mode.
Somehow have PostgreSQL, at commit time, iterate over all updated rows and set updated_at to a monotonic timestamp
Somehow have PostgreSQL itself maintain a table of transaction commit times, which it doesn't seem to do at the moment
Using the built-in xmin column also seems impossible, because VACUUM can trash it.
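For reference, the first option above would look roughly like the sketch below (the table and column names are made up); it makes the problem visible: every writer has to update the same single row, so writers queue up behind its row lock.

-- Illustrative sketch of option 1; last_update_time is a made-up name.
CREATE TABLE last_update_time (ts timestamptz NOT NULL);
INSERT INTO last_update_time VALUES (now());

-- At the start of every write transaction:
UPDATE last_update_time
   SET ts = greatest(now(), ts + interval '1 microsecond')
RETURNING ts;   -- use the returned value as updated_at for this transaction's rows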
It would be nice to be able to do this in the database without modifications to all updates in the application.
What is the usual way this is done?
The problem with the naive solution
In case it's not obvious, this is the problem with using NOW() or CLOCK_TIMESTAMP():
At time 1, we run NOW() or CLOCK_TIMESTAMP() in a transaction, which gives 1, and we update a row, setting time 1 as the update time
At time 2, a client fetches all rows, and we tell it that we gave it all rows up to time 2
At time 3, the transaction commits with "time 1" in the updated_at field
The client asks for rows updated since time 2 (the time it got from the previous full fetch request); we query for updated_at >= 2 and return nothing, instead of returning the row that was just added
That row is lost and will never be seen by the client
Your whole proposition goes against some of the underlying fundamentals of an ACID-compliant RDBMS like PostgreSQL. Time of transaction start (e.g. current_timestamp()) and other time-based metrics are meaningless as a measure of what a particular client has received or not. Abandon the whole idea.
Assuming that your clients connect through a persistent session to the database you can follow this procedure:
When the session starts, CREATE a TEMP table for the session user (a temp table is not WAL-logged anyway, so UNLOGGED is implied). This table contains nothing but the PK and the last update time of the table you want to fetch the data from.
The client polls for new data and receives only those records that have a PK not yet in the temp table or an existing PK but a newer last update time. Currently uncommitted transactions are invisible but will be retrieved at the next poll for new or updated records. The update time is required because there is no way to delete records from the temp tables of all concurrent clients.
The PK and last update time of each retrieved record are stored in the temp table.
When the user closes the session, the temp table is deleted.
If you want to persist the retrieved records over multiple sessions for each client, or the client disconnects after every query, then you need a regular table. In that case I would suggest also adding the oid of the user, so that all users can use a single table for keeping track of the retrieved records. In that latter case you can create an AFTER UPDATE trigger on the table with your data which deletes the PK from the table of fetched records for all users in one sweep. On their next poll the clients will then get the updated record.
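A rough sketch of the per-session bookkeeping, where data_table(pk, updated_at, ...) stands in for the real table (the poll query uses ON CONFLICT, so it assumes PostgreSQL 9.5 or later):

-- Placeholder names: data_table(pk, updated_at, ...) is the real table.
-- 1. At session start:
CREATE TEMP TABLE fetched (
    pk         bigint PRIMARY KEY,
    updated_at timestamptz NOT NULL
);

-- 2. On every poll: return new or changed rows and remember what was handed out.
WITH new_or_updated AS (
    SELECT d.*
      FROM data_table d
      LEFT JOIN fetched f ON f.pk = d.pk
     WHERE f.pk IS NULL
        OR d.updated_at > f.updated_at
), remember AS (
    INSERT INTO fetched (pk, updated_at)
    SELECT pk, updated_at FROM new_or_updated
    ON CONFLICT (pk) DO UPDATE SET updated_at = EXCLUDED.updated_at
)
SELECT * FROM new_or_updated;

-- 3. The temp table goes away automatically when the session ends.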
Add a column, which will be used to track which records have been sent to a client:
alter table table_under_view
add column access_order int null;
create sequence table_under_view_access_order_seq
owned by table_under_view.access_order;
create function table_under_view_reset_access_order()
returns trigger
language plpgsql
as $func$
begin
    new.access_order := null;
    return new;
end;
$func$;
create trigger table_under_view_reset_access_order_before_update
before update on table_under_view
for each row execute procedure table_under_view_reset_access_order();
create index table_under_view_access_order_idx
on table_under_view (access_order);
create index table_under_view_access_order_where_null_idx
on table_under_view (access_order)
where (access_order is null);
(You could use a before insert on table_under_view trigger too, to ensure only NULL values are inserted into access_order).
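Such a before-insert trigger could simply reuse the same function, for example:

create trigger table_under_view_reset_access_order_before_insert
before insert on table_under_view
for each row execute procedure table_under_view_reset_access_order();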
You need to update this column after transactions with INSERTs & UPDATEs on this table have finished, but before any client queries your data. You cannot do anything just after a transaction has finished, so let's do it before a query happens. You can do this with a function, f.ex:
create function table_under_access(from_access int)
returns setof table_under_view
language sql
as $func$
update table_under_view
set access_order = nextval('table_under_view_access_order_seq'::regclass)
where access_order is null;
select *
from table_under_view
where access_order > from_access;
$func$;
Now, your first "chunk" of data (which fetches all rows in the table) looks like:
select *
from table_under_access(0);
The key element after this is that your client needs to process every "chunk" of data to determine the greatest access_order it last got (unless you include it in your result with f.ex. window functions, but if you're going to process the results - which seems highly likely - you don't need that). Always use that value for the subsequent calls.
You can add an updated_at column too for ordering your results, if you want to.
You can also use a view + rule(s) for the last part (instead of the function), to make it more transparent.