I am in the unfortunate situation of needing to add triggers to a table to track changes in a legacy system. I have insert, update, and delete triggers on TABLE_A; each of them writes the values of two columns to TABLE_B, along with a bit flag that is set to 1 only by the delete trigger.
Every entry in TABLE_B shows up twice. An insert creates two rows, an update creates two rows (we believe), and a delete creates an insert row followed by a delete row.
Is the legacy application doing this, or is SQL doing it?
EDIT (adding more detail):
body of triggers:
-- after delete
INSERT INTO TableB (col1, isdelete) SELECT col1, 1 FROM DELETED
-- after insert
INSERT INTO TableB (col1, isdelete) SELECT col1, 0 FROM INSERTED
-- after update
INSERT INTO TableB (col1, isdelete) SELECT col1, 0 FROM DELETED
I have tried SQL Server Profiler and do not see any duplicate statements being executed.
It may be that the application is changing the data again when it sees the operations on its data.
It's also possible that triggers exist elsewhere - is there any possibility that there is a trigger on TableB that is creating the extra rows?
More detail would be needed to address the question more fully.
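One quick way to rule out a trigger on TableB is to query the catalog for its triggers; a sketch for SQL Server (the dbo schema is an assumption):

-- List any triggers defined on TableB (SQL Server)
SELECT t.name AS trigger_name, t.is_disabled
FROM sys.triggers t
WHERE t.parent_id = OBJECT_ID('dbo.TableB');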
I'm using a PostgreSQL RDS instance in AWS. Basically, there is a query that inserts data into a first table, let's call it table. The data there can have duplicates in some fields (except for the primary key obviously).
Then there is the trigger that updates another table, infotable, allowing no duplicates.
The trigger:
CREATE TRIGGER insert_infotable AFTER INSERT ON table
FOR EACH ROW EXECUTE PROCEDURE insert_infotable();
The relevant part of the trigger function looks like this:
CREATE OR REPLACE FUNCTION insert_infotable() RETURNS trigger AS $insert_infotable$
BEGIN
    --some irrelevant code
    IF NOT EXISTS (SELECT * FROM infotable WHERE col1 = NEW.col1 AND col2 = NEW.col2) THEN
        INSERT INTO infotable(col1, col2, col3, col4, col5, col6) values (--some values--);
    END IF;
    RETURN NEW;
END;
$insert_infotable$ LANGUAGE plpgsql;
The table infotable has a UNIQUE constraint on the columns col1 and col2.
In general everything works fine, but rarely, about once in 1,000 inserts, the trigger returns the error 'duplicate key value violates unique constraint "unique_col1_and_col2"' for table infotable. That shouldn't happen, since the trigger function has the IF NOT EXISTS check.
The first question is: what might be the cause of this? The only thing I can think of is a race where two users insert the same info simultaneously; both fire the trigger, one of them updates the second table, and the other gets the duplicate-key error. Because of that, that user's whole insert query fails, including the insert into the main table.
If that's the case, what can I do about it? Is using a lock on insert a good idea for a table that is supposed to have 100+ users inserting data simultaneously?
And if yes, what type of lock should I use and what table should I lock -- the main table, or the second one, which gets modified by the trigger? (or I guess should I have the lock with my main insert statement or inside the trigger function?)
Yes, this is a race condition. Two such triggers running concurrently won't see each other's modifications, because the transactions are not yet committed.
Since you have a unique constraint on infotable, you can simply use
INSERT INTO infotable ...
ON CONFLICT (col1, col2) DO NOTHING;
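Applied to the trigger function, that might look like this (a sketch; it assumes the inserted values simply come from NEW, and it requires PostgreSQL 9.5 or later for ON CONFLICT):

CREATE OR REPLACE FUNCTION insert_infotable() RETURNS trigger AS $insert_infotable$
BEGIN
    -- Let the unique constraint arbitrate between concurrent transactions
    -- instead of the racy IF NOT EXISTS check
    INSERT INTO infotable (col1, col2, col3, col4, col5, col6)
    VALUES (NEW.col1, NEW.col2, NEW.col3, NEW.col4, NEW.col5, NEW.col6)
    ON CONFLICT (col1, col2) DO NOTHING;
    RETURN NEW;
END;
$insert_infotable$ LANGUAGE plpgsql;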
Please help with my understanding of how triggers and locks can interact
I bulk load records into a table with statements something like this:
BEGIN;
INSERT INTO table_a VALUES (record1), (record2), (record3), ...;
INSERT INTO table_a VALUES (record91), (record92), (record93), ...;
-- ... several dozen more INSERT statements ...
COMMIT;
There can be several hundred records in a single INSERT, and there can be several dozen INSERT statements between COMMITs.
table_a has a trigger on it defined as:
AFTER INSERT ON table_a FOR EACH ROW EXECUTE PROCEDURE foo();
The procedure foo() parses each new row as it's added, and will (amongst other stuff) update a record in a summary table_b (uniquely identified by primary key). So, for every record inserted into table_a, a corresponding record is updated in table_b.
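For concreteness, foo() does something along these lines (a sketch only; the summary column total, the amount field, and the key columns are assumptions):

CREATE OR REPLACE FUNCTION foo() RETURNS trigger AS $$
BEGIN
    -- Update the corresponding summary row, identified by its primary key
    UPDATE table_b
    SET total = total + NEW.amount
    WHERE id = NEW.b_id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;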
I have a 2nd process that also attempts to (occasionally) update records in table_b. On very rare occasions it may attempt to update the same row in table_b that the bulk process is updating.
Questions: should anything in the bulk insert statements affect my 2nd process being able to update records in table_b? I understand that the bulk insert process will obtain a row lock each time it updates a row in table_b, but when will that row lock be released? When the individual record (record1, record2, record3, etc.) has been inserted? When the entire INSERT statement has completed? Or when the COMMIT is reached?
Some more info: my overall purpose for this question is to understand why my 2nd process occasionally pauses for a minute or more when trying to update a row in table_b that is also being updated by the bulk-load process. What appears to be happening is that the lock on the target record in table_b isn't actually released until the COMMIT is reached, which is contrary to what I think ought to happen. (I think a row lock should be released as soon as the UPDATE on that row is done.)
UPDATE after answer(s) - yes of course you're both right. In my mind I had somehow convinced myself that the individual updates performed within the trigger were somehow separate from the overall BEGIN and COMMIT of the whole transaction. Silly me.
The practice of adding multiple records with one INSERT, and multiple INSERTs between COMMITs, was introduced to improve the bulk-load speed (which it does). I had forgotten about the side effect of increasing the time before locks are released.
What should happen when the transaction is rolled back? It is rather obvious that all inserts on table_a, as well as all updates on table_b, should be rolled back. This is why all rows of table_b updated by the trigger will be locked until the transaction completes.
Committing after each insert (reducing the number of rows inserted in a single transaction) will reduce the chance of conflicts with concurrent processes.
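If lock hold time on table_b matters more than raw load speed, the load can be split into smaller transactions; a sketch:

BEGIN;
INSERT INTO table_a VALUES (record1), (record2), (record3), ...;
COMMIT; -- the row locks the trigger took on table_b are released here

BEGIN;
INSERT INTO table_a VALUES (record91), (record92), (record93), ...;
COMMIT;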
The problem is the following: remove all records from one table and insert them into another.
I have a table that is partitioned by date criteria. To avoid routing each record to its partition one by one, I'm collecting the data in one table and periodically moving it to another table. Copied records have to be removed from the first table. I'm using a DELETE query with RETURNING, but the side effect is that autovacuum has a lot of work to do to clean up the mess in the original table.
I'm trying to achieve the same effect (copy and remove records), but without creating additional work for the vacuum mechanism.
Since I'm removing all rows (a DELETE without a WHERE condition), I was thinking about TRUNCATE, but it does not support a RETURNING clause. Another idea was to somehow configure the table to remove tuples from pages immediately on delete, without waiting for vacuum, but I could not find out whether that is possible.
Can you suggest something that I could use to solve my problem?
You need to use something like:

-- Open your transaction
BEGIN;
-- Prevent concurrent writes, but allow concurrent reads
LOCK TABLE table_a IN SHARE MODE;
-- Copy the data from table_a to table_b (CREATE TABLE ... AS would also work)
INSERT INTO table_b SELECT * FROM table_a;
-- Empty table_a; TRUNCATE reclaims the space immediately, leaving no dead tuples for vacuum
TRUNCATE TABLE table_a;
-- Commit and release the lock
COMMIT;
I am using PostgreSQL 9.2 and I need to write an INSERT statement which copies data from table A to table B without firing the INSERT trigger defined on table B (maybe some sort of bulk insertion operation??).
On this specific table (table B) many INSERT, UPDATE and DELETE operations are executed, and during each and every one of these executions a trigger must fire.
I cannot temporarily disable the triggers, because the standard day-to-day DML operations must keep firing them.
Can anyone help me with the syntax for this non-trigger-firing INSERT statement?
Run your "privileged" inserts as a different user. That way your trigger can check the current user and exit if it shouldn't do anything.
Consider an AFTER UPDATE trigger on table A.
For every update, the trigger should update all the records in table B.
Then consider this query:
UPDATE A SET X = Y
Apparently, many rows are updated. After the update, the trigger fires.
Now, if the trigger uses the inserted pseudo-table, and you want to update table B based on every single row of inserted, and MSDN recommends against using cursors, how would you do that?
Thank you
I don't know what exactly you want to do in your update trigger, but you could e.g.
UPDATE b
SET b.someColumn = i.Anothervalue
FROM dbo.B b
INNER JOIN Inserted i ON b.Criteria = i.Criteria
or something else; you need to tell us a bit more about what it is you want to do with table B! But it's definitely possible to update, insert into, or do other things without a cursor, while handling multiple rows from the Inserted table.
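For example, a set-based insert driven by all affected rows (the column names are carried over from the sketch above and are assumptions):

INSERT INTO dbo.B (Criteria, someColumn)
SELECT i.Criteria, i.Anothervalue
FROM Inserted i;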
I will assume that table A is related to table B via a key (have to assume, as you posted no details).
If that is the case, you can use either sub-queries or joins with inserted to select the rows that need changing on table B.
UPDATE tableB
SET colx = someValue
WHERE id IN
(
    SELECT b_id
    FROM Inserted
);
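The equivalent join form would be (same assumed key names):

UPDATE b
SET b.colx = someValue
FROM tableB b
INNER JOIN Inserted i ON b.id = i.b_id;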