PostgreSQL trigger race condition updating a balance table from transactions - postgresql

I have a financial system where users have tokens and can add transactions. The system has to calculate the balance and mean acquisition price of each token. Data integrity is of utmost importance, and it should be impossible to end up with incorrect balances or mean prices in the system.
To comply with these requirements I've come up with the following tables:
token (to hold each token)
transaction (to hold each transaction of a token)
balance (to hold the token balances without having to calculate each time using all transactions)
The token and transaction tables are straightforward. The balance table is automatically updated by a PostgreSQL trigger and holds each change of balance in a token. This table exists so that every time we need to know something like "What was the balance/mean price of token A on 2023-01-05?" we don't need to sum all transactions and calculate from scratch.
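For example, the intent is that such a question can be answered with a single lookup along these lines (a sketch using the column names that appear in the trigger below; token_id 1 is a placeholder):

-- Latest balance row for a token on or before a given date
SELECT amount, mean_price, local_mean_price
FROM balance
WHERE token_id = 1
  AND date <= '2023-01-05'
ORDER BY date DESC
LIMIT 1;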
Trigger
Enough of explanation, this is the trigger I've come up with. It fires AFTER every INSERT in the transaction table.
DECLARE
    old_balance NUMERIC(17, 8);
    old_mean_price NUMERIC(17, 8);
    old_local_mean_price NUMERIC(17, 8);
    new_balance NUMERIC(17, 8);
    new_mean_price NUMERIC(17, 8);
    new_local_mean_price NUMERIC(17, 8);
BEGIN
    -- Prevent the creation of retroactive transactions since they would mess up the balance table
    IF EXISTS (
        SELECT * FROM transaction
        WHERE
            token_id = NEW.token_id
            AND date > NEW.date
    ) THEN
        RAISE EXCEPTION 'There is already a newer transaction for token %', NEW.token_id;
    END IF;

    -- Fetch the latest balance of this token
    SELECT
        amount,
        mean_price,
        local_mean_price
    INTO
        old_balance, old_mean_price, old_local_mean_price
    FROM balance
    WHERE
        token_id = NEW.token_id
        AND date <= NEW.date
    ORDER BY date DESC
    LIMIT 1;

    -- If there's no balance in the table then set everything to zero
    old_balance := COALESCE(old_balance, 0);
    old_mean_price := COALESCE(old_mean_price, 0);
    old_local_mean_price := COALESCE(old_local_mean_price, 0);

    -- Calculate the new values
    IF NEW.side = 'buy' THEN
        new_balance := old_balance + NEW.quantity;
        new_mean_price := (old_balance * old_mean_price + NEW.quantity * NEW.unit_price) / new_balance;
        new_local_mean_price := (old_balance * old_local_mean_price + NEW.quantity * NEW.local_unit_price) / new_balance;
    ELSIF NEW.side = 'sell' THEN
        new_balance := old_balance - NEW.quantity;
        new_mean_price := old_mean_price;
        new_local_mean_price := old_local_mean_price;
    ELSE
        RAISE EXCEPTION 'Side is invalid %', NEW.side;
    END IF;

    -- Update the balance table
    IF NOT EXISTS (
        SELECT * FROM balance
        WHERE
            date = NEW.date
            AND token_id = NEW.token_id
    ) THEN
        -- Create a row in the balance table
        INSERT INTO balance
            (date, token_id, amount, mean_price, local_mean_price)
        VALUES
            (
                NEW.date,
                NEW.token_id,
                new_balance,
                new_mean_price,
                new_local_mean_price
            );
    ELSE
        -- There's already a row for this token and date in the balance table. We should update it.
        UPDATE balance
        SET
            amount = new_balance,
            mean_price = new_mean_price,
            local_mean_price = new_local_mean_price
        WHERE
            date = NEW.date
            AND token_id = NEW.token_id;
    END IF;

    RETURN NULL;
END;
This trigger does some things:
Prevents the insertion of retroactive transactions, since this means we would have to update all the following balances
Adds a new row in the balance table with the updated balance and mean prices of the token
Or updates the row in balance if one already exists for the same datetime
Race condition
This works fine, but it has a race condition when executing 2 concurrent transactions. Imagine the following scenario:
Start T1 using BEGIN
Start T2 using BEGIN
T1 inserts a row in the transaction table
The trigger is fired inside T1 and it inserts a row in balance
T2 inserts a row in the transaction table
The trigger is fired inside T2, but it cannot see the changes made by the T1 trigger since T1 has not committed yet
The balance created by T2 is incorrect because it used stale data
Imperfect solution 1
Maybe I could change the SELECT statement in the trigger (the one that selects the previous balance) to use SELECT ... FOR UPDATE. This way the trigger is blocked until the concurrent transaction commits. This doesn't work, for three reasons:
If it's the first transaction, then the balance table doesn't have a row for that particular token yet (this could be solved by locking the token table instead)
Even if we lock and wait for the concurrent transaction to commit, because of the way transactions work in PostgreSQL we would still fetch stale data, since inside a transaction we only see the data that was there when the transaction started.
Even if we managed to get the most up-to-date information, there's still the issue that T1 can roll back, which means the balance generated in T2 would still be incorrect
Imperfect solution 2
Another solution would be to scrap the FOR UPDATE and just defer the trigger execution to the transaction commit. This solves the race condition since the trigger is executed at the end of the transaction and sees the most recent changes. The only issue is that it leaves me unable to use the balance table inside the transaction (since it will only be updated when the transaction commits).
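For reference, a deferred trigger of this kind would be declared as a constraint trigger, roughly as in the sketch below (assuming the trigger function above is called update_balance(), which is a name I made up):

-- Runs when the transaction commits instead of immediately after each INSERT
CREATE CONSTRAINT TRIGGER update_balance_deferred
    AFTER INSERT ON transaction
    DEFERRABLE INITIALLY DEFERRED
    FOR EACH ROW
    EXECUTE PROCEDURE update_balance();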
Question
I have two questions regarding this:
Does Imperfect solution 2 really solve all the race condition problems, or am I missing something?
Is there a way to solve this problem and also update the balance table ASAP?

Your solution 2 only narrows the race condition, but does not fix it. Both transactions could commit at the same time.
There are only two ways to prevent such a race condition:
use the SERIALIZABLE transaction isolation level (you can set that as the default with the parameter default_transaction_isolation)
lock whatever is necessary to prevent concurrent operations (for example, the corresponding row in balance)
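As a sketch of those two options (the ALTER DATABASE target and the assumption that token has an id column are mine, not from the question); note that with SERIALIZABLE the application must be prepared to retry transactions that fail with a serialization error:

-- Option 1: make SERIALIZABLE the default isolation level (mydb is a placeholder)
ALTER DATABASE mydb SET default_transaction_isolation = 'serializable';

-- Option 2: at the top of the trigger body, before reading the previous balance,
-- serialize all writers for the same token
PERFORM 1 FROM token WHERE id = NEW.token_id FOR UPDATE;
-- or, without touching the token row, a transaction-scoped advisory lock keyed on the token id
-- PERFORM pg_advisory_xact_lock(NEW.token_id);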
Besides, your code can be improved: You should check for the existence of a balance only once, and you could use INSERT ... ON CONFLICT.
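For instance, the IF NOT EXISTS / INSERT / UPDATE block at the end of the trigger could be collapsed into a single statement, roughly like this (a sketch, assuming a unique constraint on (token_id, date) in balance):

INSERT INTO balance (date, token_id, amount, mean_price, local_mean_price)
VALUES (NEW.date, NEW.token_id, new_balance, new_mean_price, new_local_mean_price)
ON CONFLICT (token_id, date) DO UPDATE
SET amount           = EXCLUDED.amount,
    mean_price       = EXCLUDED.mean_price,
    local_mean_price = EXCLUDED.local_mean_price;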
You could read my article for a more detailed analysis.

You could also consider either creating an extra table containing the running transactions and throwing an error if an insert into it is not possible, or simply locking the relevant balance rows and forcing the transactions to run fully sequentially that way. Either way, you force conflicting statements to run one at a time, which resolves the race condition.
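A sketch of the first idea (the table and column names here are made up): a guard table with a uniqueness guarantee per token, so that a second concurrent transaction blocks on the insert and then fails with a unique_violation as soon as the first one commits.

CREATE TABLE balance_update_in_progress (
    token_id bigint PRIMARY KEY
);

-- At the start of the trigger (or of the application transaction):
INSERT INTO balance_update_in_progress (token_id) VALUES (NEW.token_id);
-- The guard rows have to be removed again afterwards, e.g. by the application or a scheduled job.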

Related

Prevent Deadlock Errors with Trigger on high concurrent write table

I have a table that is getting around 1000+ inserts per minute. There is a trigger on it to update a column on another table.
CREATE or replace FUNCTION clothing_price_update() RETURNS trigger AS $clothing_price_update$
BEGIN
    INSERT INTO clothes (clothing_id, last_price, sale_date)
    VALUES (NEW.clothing_id, NEW.price, NEW."timestamp")
    ON CONFLICT (clothing_id) DO UPDATE
        SET last_price = NEW.price, sale_date = NEW."timestamp";
    RETURN NEW;
END;
$clothing_price_update$ LANGUAGE plpgsql;

CREATE TRIGGER clothing_price_update_trigger BEFORE INSERT OR UPDATE ON sales
    FOR EACH ROW EXECUTE PROCEDURE clothing_price_update();
However, I'm randomly getting a Deadlock error. This seems pretty straightforward and there are no other triggers in play. Am I missing something?
sales has data constantly being inserted into it, but it relies on no other tables and no updates occur once data has been added.
Going out on a limb, the typical root cause for deadlocks is that the order of written (locked) rows is inconsistent among concurrent transactions.
Imagine two exactly concurrent transactions:
T1:
INSERT INTO sales(clothing_id, price, timestamp) VALUES
(1, 11, '2000-1-1')
, (2, 22, '2000-2-1');
T2:
INSERT INTO sales(clothing_id, price, timestamp) VALUES
(2, 23, '2000-2-1')
, (1, 12, '2000-1-1');
T1 locks the row with `clothing_id = 1` in `sales` and `clothes`.
T2 locks the row with `clothing_id = 2` in `sales` and `clothes`.
T1 waits for T2 to release locks for `clothing_id = 2`.
T2 waits for T1 to release locks for `clothing_id = 1`.
💣 Deadlock.
Typically, deadlocks are still extremely unlikely as the time window is so narrow, but with bigger sets / more concurrent transactions / longer transactions / more expensive writes / added cycles for triggers (!) etc. it gets more likely.
The trigger itself is not the cause in this scenario (unless it introduces writes out of order!), it only increases the probability of a deadlock actually happening.
The cure is to insert rows in consistent sort order within the same transaction. Most importantly within the same command. Then the next transaction will wait in line until the first one finishes (COMMIT or ROLLBACK) and releases its locks. The manual:
The best defense against deadlocks is generally to avoid them by being
certain that all applications using a database acquire locks on
multiple objects in a consistent order.
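For instance, when several rows go into sales in one statement, a deterministic order can be enforced like this (a sketch; with a plain VALUES list, sorting the rows in the application before sending them works just as well):

INSERT INTO sales (clothing_id, price, "timestamp")
SELECT v.clothing_id, v.price, v.ts
FROM (
    VALUES
        (2, 23, timestamp '2000-02-01')
      , (1, 12, timestamp '2000-01-01')
) v (clothing_id, price, ts)
ORDER BY v.clothing_id;  -- consistent lock order in sales and, via the trigger, in clothes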
See:
How to simulate deadlock in PostgreSQL?
Long-running transactions typically add to the problem. See:
Table Locking in PostgreSQL
Aside, you use:
ON CONFLICT (clothing_id) DO UPDATE set last_price = NEW.price ...
You may want to use EXCLUDED instead of NEW here:
ON CONFLICT (clothing_id) DO UPDATE set last_price = EXCLUDED.price ...
Subtle difference: this way, effects of possible triggers ON INSERT are carried over, while pasting NEW again overwrites that. Related:
How to UPSERT multiple rows with individual values in one statement?

Lock row, release later

I'm trying to understand how to lock a row, and only release that lock later.
I have a table like this :
create table testTable (Name varchar(100));
Some test data
insert into testTable (name) select 'Bob';
insert into testTable (name) select 'John';
insert into testTable (name) select 'Steve';
Now, I want to select one of those rows, and prevent other queries from seeing this row. I achieve that like this:
begin transaction;
select * from testTable where name = 'Bob' for update;
In another window, I do this :
select * from testTable for update skip locked;
Great, I don't see 'Bob' in that result set. Now, I want to do something with the retrieved row (Bob), and after I've done my work, I want to release that row again. The simple answer would be to do:
commit transaction
However, I am running multiple transactions on the same connection, so I can't just begin and commit transactions all over the show. Ideally I would like to have a "named" transaction, something like :
begin transaction 'myTransaction';
select * from testTable where name = 'Bob' for update;
//do stuff with the data, outside sql then later call ...
commit transaction 'myTransaction';
But Postgres doesn't support that. I have found "prepare transaction", but that seems to be a pear-shaped path I don't want to go down, especially as these transactions seem to persist even through restarts.
Is there any way I can have a reference to commit/rollback a specific transaction?
You can have only one open transaction at a time in a database session, so the question as such is moot.
But I assume that you do not really want to run a transaction, you want to block access to a certain row for a while.
It is usually not a good idea to use regular database locks for such a purpose (the exception is advisory locks, which serve exactly that purpose, but are not tied to table rows). The problem is that long database transactions keep autovacuum from doing its job.
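For completeness, session-level advisory locks are independent of transactions, so they can be taken and released at any time; a sketch, keyed on an arbitrary integer identifying the row (42 is a placeholder):

-- In one session: take the lock (blocks if another session already holds it)
SELECT pg_advisory_lock(42);

-- ... do the work, without keeping a transaction open ...

-- Later, in the same session: release it
SELECT pg_advisory_unlock(42);

-- Non-blocking variant that returns true/false instead of waiting
SELECT pg_try_advisory_lock(42);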
I recommend that you add a status column to the table and change the status rather than locking the row. That would serve the same purpose in a more natural fashion and make your problem go away.
If you are concerned that the status flag might not get cleared due to application logic problems, replace it with a visible_from column of type timestamp with time zone that initially contains -infinity. Instead of locking the row, set the value to current_timestamp + INTERVAL '5 minutes'. Only select rows that fulfill WHERE visible_from < current_timestamp. That way the “lock” will automatically expire after 5 minutes.
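A sketch of that expiring “lock” pattern, using the table from the question (the 5-minute interval is arbitrary):

ALTER TABLE testTable ADD COLUMN visible_from timestamptz NOT NULL DEFAULT '-infinity';

-- "Lock": claim the row and hide it for 5 minutes
UPDATE testTable
SET visible_from = current_timestamp + INTERVAL '5 minutes'
WHERE name = 'Bob';

-- Readers only consider rows that are not currently claimed
SELECT * FROM testTable WHERE visible_from < current_timestamp;

-- Optional early "unlock" once the work is done
UPDATE testTable SET visible_from = '-infinity' WHERE name = 'Bob';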

Error during execution of trigger. How to rework select statement?

I'm relatively new to triggers, so forgive me if this doesn't look how it should. I am creating a trigger that checks a user account for the last payment date and sets a value to 0 if they haven't paid in a while. I created what I thought was a correct trigger, but I get the error "error during execution of trigger" when it's triggered. From what I understand, the select statement is causing the error, as it is selecting values which are in the process of being changed. Here is my code.
CREATE OR REPLACE TRIGGER t
BEFORE
UPDATE OF LASTLOGINDATE
ON USERS
FOR EACH ROW
DECLARE
USER_CHECK NUMBER;
PAYMENTDATE_CHECK DATE;
ISACTIVE_CHECK CHAR(1);
BEGIN
SELECT U.USERID, U.ISACTIVE, UP.PAYMENTDATE
INTO USER_CHECK, PAYMENTDATE_CHECK, ISACTIVE_CHECK
FROM USERS U JOIN USERPAYMENTS UP ON U.USERID = UP.USERID
WHERE UP.PAYMENTDATE < TRUNC(SYSDATE-60);
IF ISACTIVE_CHECK = 1 THEN
UPDATE USERS U
SET ISACTIVE = 0
WHERE U.USERID = USER_CHECK;
INSERT INTO DEACTIVATEDUSERS
VALUES(USER_CHECK,SYSDATE);
END IF;
END;
From what I thought, since the select is in the begin block, it would run before the update; nothing would be changing about the tables until after it runs through the trigger. I tried using :old in front of the select variables, but that doesn't seem to be the right use.
And here is the update statement I was trying.
UPDATE USERS
SET LASTLOGINDATE = SYSDATE
WHERE USERID = 5;
Some issues:
The select you do in the trigger sets the variable isactive_check to a payment date, and vice versa. There is an accidental switch there, which will have a negative effect on the next if;
The same select should return exactly one record, which by the looks of it is not guaranteed, since you join with table userpayments, which may have several payments for the selected user that meet the condition, or none at all. Change that select to do an aggregation.
If a user has more than one payment record, the condition might be true for one, but not for another. So if you are interested only in users who have not paid in a long while, such a user should not be included, even though they have an old payment record. Instead you should check whether all records meet the condition. This you can do with a having clause.
As the table users is mutating (the update trigger is on that table), you cannot perform every action on that same table, as it would otherwise lead to a kind of deadlock. This means you need to rethink what the purpose is of the trigger. As this is about an update for a specific user, you actually don't need to check the whole table, but only the record that is being changed. For that you can use the special new variable.
I would suggest this SQL instead:
SELECT MAX(UP.PAYMENTDATE)
INTO PAYMENTDATE_CHECK
FROM USERPAYMENTS
WHERE USERID = :NEW.USERID
and then continue with the checks:
IF :NEW.ISACTIVE = 1 AND PAYMENTDATE_CHECK < TRUNC(SYSDATE-60) THEN
:NEW.ISACTIVE := 0;
INSERT INTO DEACTIVATEDUSERS (USER_ID, DEACTIVATION_DATE)
VALUES (:NEW.USERID, SYSDATE);
END IF;
Now you have avoided to do anything in the table users and have made the checks and modification via the :new "record".
Also, it is good practice to mention the column names in an insert statement, which I have done in the above code (adapt column names as needed).
Make sure the trigger is compiled and produces no compilation errors.

Purge data using a while loop.

I have a database that gets populated daily with incremental data, and then at the end of each month a full download of the month's data is put into the system. Our business wants each day put into the system, and then at the end of the month the daily data removed so that only the full month's data is left. I have written the query below; if you could help I'd appreciate it.
DECLARE @looper INT
DECLARE @totalindex int;

select name, (substring(name,17,8)) as Attempt, substring(name,17,4) as [year], substring(name,21,2) as [month], create_date
into #work_to_do_for
from sys.databases d
where name like 'Snapshot%' and
    d.database_id > 4 and
    (substring(name,21,2) = DATEPART(m, DATEADD(m, -1, getdate()))) AND (substring(name,17,4) = DATEPART(yyyy, DATEADD(m, -1, getdate())))
order by d.create_date asc

SELECT @totalindex = COUNT(*) from #work_to_do_for
SET @looper = 1 -- reset and reuse counter
WHILE (@looper < @totalindex)
BEGIN;
    set @looper = @looper + 1
END;
DROP TABLE #work_to_do_for;
I'd need to perform the purge on several tables.
Thanks in advance.
When I delete large numbers of records, I always do it in batches and off-hours so as not to use up resources during production processes. To accomplish this, you incorporate a loop and some testing to find the optimal number to delete at a time.
begin transaction del -- I always use transactions as a safeguard
declare @count int = 1
while @count > 0
begin
    delete top (100000) t
    from dbo.MyTable t -- JOIN if necessary
    -- WHERE if necessary
    set @count = @@ROWCOUNT
end
Run this manually (without the WHILE loop) once with 100000 records in parentheses and see what your execution time is. Write it down. Run it again with 200000 records. Check the time; write it down. Run it with 500000 records. What you're looking for is a trend in the execution time. As long as the time required to delete 100000 records is decreasing as you increase the batch size, keep increasing it. You might end at 500k, but this method will help you find the optimal number to delete per batch. Then, run it as a loop.
That being said, if you are literally deleting MILLIONS of records, it might make more sense to drop and recreate the table as long as you aren't going to interfere with other processes. If you needed to save some of the data, you could insert what you needed into a new table (eg MyTable_New), drop the original table (MyTable), and rename MyTable_New to MyTable.
The script you've posted iterating through with a while loop to delete the rows should be changed to a set-based operation if at all possible. Relational database engines excel at set-based operations like
Delete dbo.table WHERE yourcolumn = 5
as opposed to iterating through one at a time. Especially if it will be for "several million" rows as you indicated in the comments above.
@rwking Where are you putting the COMMIT for the transaction? I mean, are you keeping all the eligible deletes in a single transaction and doing one final COMMIT?
I have a similar type of requirement where I have to delete in batches, and also track the number of rows affected at the end.
My sample code is as follows:
Declare @count int
Declare @deletecount int
set @count = 0

While (1=1)
BEGIN
    BEGIN TRY
        BEGIN TRAN
        DELETE TOP (1000) FROM --CONDITION
        SET @COUNT = @COUNT + @@ROWCOUNT
        IF (@@ROWCOUNT) = 0
            Break;
        COMMIT
    END TRY
    BEGIN CATCH
        ROLLBACK;
    END CATCH
END

set @deletecount = @COUNT
The above code works fine, but how do I keep track of @deletecount if a ROLLBACK happens in one of the batches?

How to ensure a unique number field with zero order

Here is a table that has fields id, id_user, order_id.
When creating a record, I need to find the user's last order number and insert the new record with the next number in sequence.
I wrote a stored procedure that takes the next order number for the user, but even it does not guarantee a unique order number.
CREATE OR REPLACE FUNCTION get_next_order()
    RETURNS TRIGGER
    LANGUAGE plpgsql
AS $function$
DECLARE
    next_order_num bigint;
BEGIN
    SELECT order_id + 1 INTO next_order_num
    FROM payment_out
    WHERE payment_out.id_usr = NEW.id_usr
      AND payment_out.order_id IS NOT NULL
    ORDER BY payment_out.order_id DESC
    LIMIT 1;

    -- if no payments exist, return 1
    NEW.order_id := coalesce(next_order_num, 1);
    RETURN NEW;
END;
$function$;

CREATE TRIGGER get_next_order
    BEFORE INSERT ON payment_out
    FOR EACH ROW
    EXECUTE PROCEDURE get_next_order();
How can I avoid duplicate order numbers?
For this to work in the presence of multiple concurrent transactions inserting orders for the same user, you need a lock on a particular record to make them wait and execute serially.
e.g., before the first SELECT, you might:
PERFORM 1 FROM "users" where id_user = NEW.id_user FOR UPDATE;
where you lock the parent "users" record that owns the orders.
Otherwise, multiple concurrent transactions could execute your procedure at the same time, but they can't see each other's inserted values, so they'll pick the same numbers.
However, beware: A foreign key constraint will cause a SHARE lock to be taken on the users entry already when you insert into a table that references it. Your trigger will try to upgrade that into an UPDATE lock, but multiple transactions might already hold the SHARE lock, so this will block. You'll end up with transactions all waiting for each other, until PostgreSQL kills all but one of them with a deadlock abort error. The only way to avoid this is for the application to SELECT 1 FROM users WHERE id_user = blahblah FOR UPDATE before it creates the orders for that user.
A variant is to keep a next_order_id field in users and do an UPDATE users SET next_order_id = next_order_id + 1 RETURNING next_order_id, and use the result of that to set the order ID. The same lock upgrade problem applies.
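A sketch of that variant inside the trigger function (assuming a next_order_id bigint column on users and the column names from the snippet above; the same lock-upgrade caveat applies):

-- In place of the SELECT ... LIMIT 1: atomically grab the next number.
-- The row lock taken by the UPDATE makes concurrent inserts for the same user wait.
UPDATE users
SET next_order_id = next_order_id + 1
WHERE id_user = NEW.id_user
RETURNING next_order_id INTO NEW.order_id;

RETURN NEW;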