Use a trigger to insert into a table if the data is not already present

I have two tables with the same structure. Table 1 has multiple rows, some of which can hold the same values. I want to insert those rows into table 2, excluding duplicates. I can do this normally using 'minus', but I want to write a trigger so that when a new row is inserted into table 1 and is not present in table 2, it is inserted into table 2, and otherwise it is not. I am new to triggers. The trigger I have written gives me a "trigger is mutating" error when I insert into table 1.
INSERT INTO t3(name1,name2,num1,num2) select name1,name2,num1,num2 from t1 group by name1,name2,num1,num2 minus select * from t3
When I write the above code it works fine, but when I include it in my trigger it gives an error. How do I perform the above with the help of a trigger?
Please help,
Thanks
Pranay

You don't need to requery the table from a row-level trigger. That's what the :NEW. syntax is for, e.g.:
INSERT INTO t3(name1,name2,num1,num2)
select :NEW.name1,:NEW.name2,:NEW.num1,:NEW.num2 from DUAL
minus select name1,name2,num1,num2 from t3;
Although I think the above code looks a bit silly. I'd prefer to put a unique constraint on t3 then add a handler in the trigger to take care of any DUP_VAL_ON_INDEX exceptions.
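As a rough illustration of the "unique constraint plus duplicate handling" idea, here is a minimal sketch in Python using the standard sqlite3 module. It runs on SQLite rather than Oracle, so the syntax differs: SQLite's INSERT OR IGNORE stands in for an Oracle handler that catches DUP_VAL_ON_INDEX, and the trigger name t1_copy_to_t3 and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (name1 TEXT, name2 TEXT, num1 INTEGER, num2 INTEGER);
    CREATE TABLE t3 (
        name1 TEXT, name2 TEXT, num1 INTEGER, num2 INTEGER,
        UNIQUE (name1, name2, num1, num2)   -- the unique constraint on t3
    );
    -- Row-level trigger (hypothetical name): copy each new t1 row into t3,
    -- silently skipping rows that would violate the unique constraint.
    CREATE TRIGGER t1_copy_to_t3 AFTER INSERT ON t1
    BEGIN
        INSERT OR IGNORE INTO t3 (name1, name2, num1, num2)
        VALUES (NEW.name1, NEW.name2, NEW.num1, NEW.num2);
    END;
""")

# Two identical rows and one distinct row (sample data, not from the question).
rows = [("a", "b", 1, 2), ("a", "b", 1, 2), ("c", "d", 3, 4)]
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?, ?)", rows)

t1_count = conn.execute("SELECT COUNT(*) FROM t1").fetchone()[0]
t3_rows = conn.execute("SELECT * FROM t3 ORDER BY name1").fetchall()
```

The trigger only looks at the NEW row and never queries t1, which is the same reason the :NEW-based version avoids the mutating-table error.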

Related

My PSQL after insert trigger fails to insert into another table when ON DUPLICATE encounters a duplicate

I am slowly working through a feature where I import large CSV files. There is a chance that the contents of an uploaded CSV file will trigger a uniqueness conflict. I've combed Stack Overflow for similar resources, but I still can't get my trigger to update another table when a duplicate entry is found. The following code is what I have currently implemented, along with my line of logic for this process. This is implemented in a Rails app, but the underlying SQL is the following.
When a user uploads a file, the following happens when it is processed.
CREATE TEMP TABLE codes_temp ON COMMIT DROP AS SELECT * FROM codes WITH NO DATA;
create or replace function log_duplicate_code()
returns trigger
language plpgsql
as
$$
begin
insert into duplicate_codes(id, campaign_id, code_batch_id, code, code_id, created_at, updated_at)
values (gen_random_uuid(), excluded.campaign_id, excluded.code_batch_id, excluded.code, excluded.code_id, now(), now());
return null;
end;
$$
create trigger log_duplicate_code
after insert on codes
for each row execute procedure log_duplicate_code();
INSERT INTO codes SELECT * FROM codes_temp ct
ON CONFLICT (campaign_id, code)
DO update set updated_at = excluded.updated_at;
DROP TRIGGER log_duplicate_code ON codes;
When I run this process, nothing happens at all. If I upload a CSV file containing the value CODE01 and then upload it again with CODE01, the duplicate_codes table doesn't get populated at all, and I don't understand why. No error is raised, so it seems like DO UPDATE is doing something. What am I missing here?
I also have some questions that come to my mind even if this were to work as intended. For example, I am uploading millions of these codes, etc.
1). Should my trigger be a statement trigger instead of a row for scalability?
2). What if someone else tries to upload another file that has millions of codes? I have my code wrapped in a transaction. Would a new separate trigger be created? Will this conflict with a previously processing process?
####### EDIT #1 #######
Thanks to Adriens' comment I do see that an AFTER INSERT trigger does not have the OLD keyword. I updated my code to use EXCLUDED and I receive the following error for the trigger.
ERROR: missing FROM-clause entry for table "excluded" (PG::UndefinedTable)
Finally, here are the S.O. posts I've used to try to tailor my code but I just can't seem to make it work.
####### EDIT #2 #######
I have a little more context on to how this is implemented.
When the CSV is loaded, a staging table called codes_temp is created and dropped at the end of the transaction. This table contains no unique constraints. From what I read only the actual table that I want to insert codes should have the unique constraint error.
In my INSERT statement, the DO update set updated_at = excluded.updated_at; doesn't trigger a unique constraint error. As of right now, I don't know if it should or not. I borrowed this logic from the S.O. question "postgresql log into another table with on conflict"; it seemed to me that I had to update something if I specify the DO UPDATE SET clause.
Last, the correct criteria for codes in the database is the following:
For example, this is an example entry in my codes table
id, campaign_id, code
1, 1, CODE01
2, 1, CODE02
3, 1, CODE03
If any of these codes appear again somewhere, they should not be inserted into the codes table, but they need to be inserted into the duplicate_codes table because they were already uploaded before.
id, campaign_id, code
1, 1, CODE01
2, 1, CODE02
3, 1, CODE03
As for the codes_temp table, I don't have any unique constraints, so there are no criteria to select the right one.
postgresql log into another table with on conflict
Postgres insert on conflict update using other table
Postgres on conflict - insert to another table
How to do INSERT INTO SELECT and ON DUPLICATE UPDATE in PostgreSQL 9.5?
Seems to me something like:
INSERT INTO
codes
SELECT
distinct on(campaign_id, code) *
FROM
codes_temp ct
ORDER BY
campaign_id, code, id DESC;
Assuming id was assigned sequentially, the above would select the most recent row into codes.
Then:
INSERT INTO
duplicate_codes
SELECT
ct.*
FROM
codes_temp AS ct
LEFT JOIN
codes
ON
ct.id = codes.id
WHERE
codes.id IS NULL;
The above would select the rows in codes_temp that were not selected into codes and insert them into the duplicates table.
Obviously not tested on your data set. I would create a small test data set that has uniqueness conflicts and test with.
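For anyone who wants a small test data set to try this on, here is a minimal sketch of the same two-step idea in Python with the standard sqlite3 module. It is not PostgreSQL: SQLite has no DISTINCT ON, so the first step picks the highest id per (campaign_id, code) with GROUP BY instead. The table layouts are cut down to three columns and the sample rows are invented. Like the queries above, it assumes codes holds no conflicting rows before the load.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE codes (id INTEGER PRIMARY KEY, campaign_id INTEGER, code TEXT,
                        UNIQUE (campaign_id, code));
    CREATE TABLE duplicate_codes (id INTEGER, campaign_id INTEGER, code TEXT);
    CREATE TABLE codes_temp (id INTEGER, campaign_id INTEGER, code TEXT);

    -- Sample staging data: CODE01 appears twice for campaign 1.
    INSERT INTO codes_temp VALUES
        (1, 1, 'CODE01'), (2, 1, 'CODE02'), (3, 1, 'CODE01'), (4, 1, 'CODE03');

    -- Step 1: keep one row per (campaign_id, code), the one with the highest id.
    INSERT INTO codes
    SELECT id, campaign_id, code FROM codes_temp
    WHERE id IN (SELECT MAX(id) FROM codes_temp GROUP BY campaign_id, code);

    -- Step 2: staged rows that did not make it into codes are the duplicates.
    INSERT INTO duplicate_codes
    SELECT ct.id, ct.campaign_id, ct.code
    FROM codes_temp AS ct
    LEFT JOIN codes ON ct.id = codes.id
    WHERE codes.id IS NULL;
""")

kept = [r[0] for r in conn.execute("SELECT id FROM codes ORDER BY id")]
dupes = conn.execute("SELECT id, code FROM duplicate_codes").fetchall()
```

Both steps are plain set-based statements, so no trigger has to be created and dropped per upload.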

PSQL: use Where clause inside a called Trigger

I want to execute a trigger after every 'DELETE' operation on the table 'table1', because I need to delete the corresponding row in 'table2', too.
So I want to type:
DELETE FROM table1 WHERE x;
And the trigger should do the following in addition to the above delete operation:
DELETE FROM table2 WHERE x;
So how can I access/reuse the 'WHERE' clause inside the trigger?
P.S.: I have the same problem with the UPDATE operation, but I think the solution will be the same...
P.P.S.: More context for you: I have one table which stores some encrypted data and another table which stores the encryption keys. So if someone deletes a row in table1, I want to delete all corresponding key rows in table2 (there could be one for every user who has access to this row).
Thanks for your help :)

Updating an "Inserted" column inside my "Insert" Trigger - a little different

I have researched quite a bit but couldn't find what I wanted-
(I have shallow knowledge on TRIGGERS in SQL- pardon me!)
Qn: I have all the THREE Triggers on my table (Insert, Update & Delete)
In my AFTER INSERT Trigger: I need to "update" the "inserted" column
and I was using:
UPDATE Table_name
SET Column_name = @Input
(currently)
But I was requested to use something like:
UPDATE "Inserted.column_name"
SET Column_name = @Input
But this generally cannot happen as it throws me an error:
The logical tables INSERTED and DELETED cannot be updated
Can someone help me out please?
I have seen posts on using INSTEAD OF TRIGGER but that doesn't serve my purpose.. Thanks in advance! Appreciate your help!
You need to update the actual, underlying table - not the Inserted pseudo table....
You need to join the tables on the primary key, and then update your actual data table - something like
CREATE TRIGGER trg_Insert_Sample
ON dbo.YourTableName
AFTER INSERT
AS
UPDATE dbo.YourTableName
SET SomeColumn = i.SomeValue
FROM Inserted i
WHERE dbo.YourTableName.PrimaryKey = i.PrimaryKey
or something along those lines....
You also need to be aware that the trigger is called once per statement, not once per row. If your INSERT statement inserts 10 rows at once (from e.g. a SELECT), your trigger is called once, and Inserted will contain 10 rows. Make sure your trigger code can handle this situation and is written in a proper, set-based manner (no SELECT @Value = SomeColumn FROM Inserted - that won't work!)

Explain the effect of a parent column in a nested select

I have a scenario where I need to delete rows from a table using the outcome of a nested select. Like this:
DECLARE @tbl_big TABLE (bigID int);
INSERT INTO @tbl_big (bigID)
VALUES (1),(2),(3),(4),(5);
DECLARE @tbl_small TABLE (smallID int);
INSERT INTO @tbl_small (smallID)
VALUES (1),(2),(3);
DELETE FROM @tbl_big
WHERE (bigID IN (SELECT smallID FROM @tbl_small));
SELECT *
FROM @tbl_big; -- shows 4,5 as expected
However, during development I accidentally made a typo:
DELETE FROM @tbl_big WHERE (bigID IN (SELECT bigID FROM @tbl_small)); --bigID used instead of smallID
SELECT *
FROM @tbl_big; -- no rows
The result was that all rows within the parent table were deleted.
While this may be completely acceptable T-SQL, I've never seen it applied like this, nor would I expect the statement to even compile given that @tbl_small does not contain a bigID column.
Can anybody please clarify why/how this works, and is it valid T-SQL? Also, can you provide a real-world example where this is more useful than risky(!)?
bigID in the DELETE statement you mentioned refers to @tbl_big, because it is legal to reference columns from the main table in the subqueries you write in the WHERE clause. For example, you can write the below:
DELETE FROM @tbl_big WHERE (bigID IN (SELECT smallID FROM @tbl_small WHERE smallID = bigID));
So, in your case, the subquery simply returned each row's own bigID value as if it were a constant, which always matches.
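The same name-resolution behaviour can be reproduced outside SQL Server. The sketch below uses Python's standard sqlite3 module with ordinary tables in place of table variables (an assumption made only so the example is runnable). It also shows one way to guard against the typo: qualifying the column with the inner table's alias, which turns the mistake into an error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_big (bigID INTEGER);
    CREATE TABLE tbl_small (smallID INTEGER);
    INSERT INTO tbl_big VALUES (1), (2), (3), (4), (5);
    INSERT INTO tbl_small VALUES (1), (2), (3);
""")

# Unqualified bigID is not found in tbl_small, so it binds to the outer
# tbl_big.bigID and the IN test is true for every row.
conn.execute("DELETE FROM tbl_big WHERE bigID IN (SELECT bigID FROM tbl_small)")
remaining = conn.execute("SELECT bigID FROM tbl_big").fetchall()

# Qualifying the column with the inner table's alias makes the typo fail
# instead of silently matching everything.
try:
    conn.execute(
        "DELETE FROM tbl_big WHERE bigID IN (SELECT s.bigID FROM tbl_small AS s)"
    )
    typo_caught = False
except sqlite3.OperationalError:
    typo_caught = True
```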

how to emulate "insert ignore" and "on duplicate key update" (sql merge) with postgresql?

Some SQL servers have a feature where INSERT is skipped if it would violate a primary/unique key constraint. For instance, MySQL has INSERT IGNORE.
What's the best way to emulate INSERT IGNORE and ON DUPLICATE KEY UPDATE with PostgreSQL?
With PostgreSQL 9.5, this is now native functionality (like MySQL has had for several years):
INSERT ... ON CONFLICT DO NOTHING/UPDATE ("UPSERT")
9.5 brings support for "UPSERT" operations.
INSERT is extended to accept an ON CONFLICT DO UPDATE/IGNORE clause. This clause specifies an alternative action to take in the event of a would-be duplicate violation.
...
Further example of new syntax:
INSERT INTO user_logins (username, logins)
VALUES ('Naomi',1),('James',1)
ON CONFLICT (username)
DO UPDATE SET logins = user_logins.logins + EXCLUDED.logins;
Edit: in case you missed warren's answer, PG9.5 now has this natively; time to upgrade!
Building on Bill Karwin's answer, to spell out what a rule based approach would look like (transferring from another schema in the same DB, and with a multi-column primary key):
CREATE RULE "my_table_on_duplicate_ignore" AS ON INSERT TO "my_table"
WHERE EXISTS(SELECT 1 FROM my_table
WHERE (pk_col_1, pk_col_2)=(NEW.pk_col_1, NEW.pk_col_2))
DO INSTEAD NOTHING;
INSERT INTO my_table SELECT * FROM another_schema.my_table WHERE some_cond;
DROP RULE "my_table_on_duplicate_ignore" ON "my_table";
Note: The rule applies to all INSERT operations until the rule is dropped, so not quite ad hoc.
For those of you that have Postgres 9.5 or higher, the new ON CONFLICT DO NOTHING syntax should work:
INSERT INTO target_table (field_one, field_two, field_three )
SELECT field_one, field_two, field_three
FROM source_table
ON CONFLICT (field_one) DO NOTHING;
For those of us who have an earlier version, this left join will work instead:
INSERT INTO target_table (field_one, field_two, field_three )
SELECT source_table.field_one, source_table.field_two, source_table.field_three
FROM source_table
LEFT JOIN target_table ON source_table.field_one = target_table.field_one
WHERE target_table.field_one IS NULL;
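To check that the two forms skip the same rows, here is a small sketch in Python with the standard sqlite3 module. SQLite (3.24 or later) accepts a similar ON CONFLICT clause, so it serves as a stand-in for PostgreSQL here; the two-column tables and sample rows are invented for the example.

```python
import sqlite3

def fresh_db():
    """Build an in-memory database with one pre-existing target row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE source_table (field_one INTEGER, field_two TEXT);
        CREATE TABLE target_table (field_one INTEGER PRIMARY KEY, field_two TEXT);
        INSERT INTO source_table VALUES (1, 'a'), (2, 'b'), (3, 'c');
        INSERT INTO target_table VALUES (2, 'existing');
    """)
    return conn

# Variant 1: let the database skip conflicting rows.
on_conflict = fresh_db()
on_conflict.execute("""
    INSERT INTO target_table (field_one, field_two)
    SELECT field_one, field_two FROM source_table
    WHERE true  -- SQLite needs a WHERE here to parse ON CONFLICT unambiguously
    ON CONFLICT (field_one) DO NOTHING
""")

# Variant 2: anti-join, inserting only rows with no match in the target.
anti_join = fresh_db()
anti_join.execute("""
    INSERT INTO target_table (field_one, field_two)
    SELECT s.field_one, s.field_two
    FROM source_table AS s
    LEFT JOIN target_table AS t ON s.field_one = t.field_one
    WHERE t.field_one IS NULL
""")

query = "SELECT field_one, field_two FROM target_table ORDER BY field_one"
result_a = on_conflict.execute(query).fetchall()
result_b = anti_join.execute(query).fetchall()
```

One difference to keep in mind: the anti-join only filters against rows already in the target, so duplicate keys within the source itself would still collide, whereas ON CONFLICT DO NOTHING skips those as well.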
Try to do an UPDATE. If it doesn't modify any row that means it didn't exist, so do an insert. Obviously, you do this inside a transaction.
You can of course wrap this in a function if you don't want to put the extra code on the client side. You also need a loop for the very rare race condition in that thinking.
There's an example of this in the documentation: http://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html, example 40-2 right at the bottom.
That's usually the easiest way. You can do some magic with rules, but it's likely going to be a lot messier. I'd recommend the wrap-in-function approach over that any day.
This works for single-row or few-row values. If you're dealing with large amounts of rows, for example from a subquery, you're best off splitting it into two queries, one for INSERT and one for UPDATE (as an appropriate join/subselect of course - no need to write your main filter twice).
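A minimal sketch of the update-then-insert loop, written client-side in Python with the standard sqlite3 module rather than as a PL/pgSQL function. The table kv and the helper update_or_insert are hypothetical names for this example. The retry on IntegrityError mirrors the loop that handles the rare race where another writer inserts the key between the UPDATE and the INSERT.

```python
import sqlite3

def update_or_insert(conn, key, value):
    """Update the row for `key`; if no row was updated, insert it instead."""
    while True:
        cur = conn.execute("UPDATE kv SET v = ? WHERE k = ?", (value, key))
        if cur.rowcount > 0:
            return "updated"
        try:
            conn.execute("INSERT INTO kv (k, v) VALUES (?, ?)", (key, value))
            return "inserted"
        except sqlite3.IntegrityError:
            # Someone else inserted the same key first: loop and retry the UPDATE.
            continue

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v TEXT)")

first = update_or_insert(conn, 1, "david")
second = update_or_insert(conn, 1, "dennis")
rows = conn.execute("SELECT k, v FROM kv").fetchall()
conn.commit()
```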
To get the insert ignore logic you can do something like below. I found simply inserting from a select statement of literal values worked best, then you can mask out the duplicate keys with a NOT EXISTS clause. To get the update on duplicate logic I suspect a pl/pgsql loop would be necessary.
INSERT INTO manager.vin_manufacturer
(SELECT * FROM( VALUES
('935',' Citroën Brazil','Citroën'),
('ABC', 'Toyota', 'Toyota'),
('ZOM',' OM','OM')
) as tmp (vin_manufacturer_id, manufacturer_desc, make_desc)
WHERE NOT EXISTS (
--ignore anything that has already been inserted
SELECT 1 FROM manager.vin_manufacturer m where m.vin_manufacturer_id = tmp.vin_manufacturer_id)
)
INSERT INTO mytable(col1,col2)
SELECT 'val1','val2'
WHERE NOT EXISTS (SELECT 1 FROM mytable WHERE col1='val1')
As @hanmari mentioned in his comment, when inserting into a postgres table, ON CONFLICT (..) DO NOTHING is the best code to use for not inserting duplicate data:
query = """INSERT INTO db_table_name(column_name)
VALUES(%s) ON CONFLICT (column_name) DO NOTHING;"""
With the ON CONFLICT clause, the insert statement still inserts the rows that do not conflict. The query above is an example of inserting data from an Excel file into a postgres db table.
I have constraints added to a postgres table to make sure the ID field is unique. Instead of running a delete on rows of data that are the same, I add a line of SQL that restarts the ID column's sequence at 1.
Example:
q = 'ALTER SEQUENCE db_table_name_id_column_seq RESTART WITH 1'  # sequence name assumed; a serial column's sequence is named <table>_<column>_seq by default
If my data has an ID field, I do not use it as the primary ID/serial ID; I create a separate ID column and set it to serial.
I hope this information is helpful to everyone.
*I have no college degree in software development/coding. Everything I know in coding, I study on my own.
Looks like PostgreSQL supports a schema object called a rule.
http://www.postgresql.org/docs/current/static/rules-update.html
You could create a rule ON INSERT for a given table, making it do NOTHING if a row exists with the given primary key value, or else making it do an UPDATE instead of the INSERT if a row exists with the given primary key value.
I haven't tried this myself, so I can't speak from experience or offer an example.
This solution avoids using rules:
BEGIN
INSERT INTO tableA (unique_column,c2,c3) VALUES (1,2,3);
EXCEPTION
WHEN unique_violation THEN
UPDATE tableA SET c2 = 2, c3 = 3 WHERE unique_column = 1;
END;
but it has a performance drawback (see PostgreSQL.org):
A block containing an EXCEPTION clause is significantly more expensive
to enter and exit than a block without one. Therefore, don't use
EXCEPTION without need.
On bulk loads, you can always delete the row before the insert. Deleting a row that doesn't exist doesn't cause an error, so it's safely skipped.
For data import scripts, to replace "IF NOT EXISTS", in a way, there's a slightly awkward formulation that nevertheless works:
DO
$do$
BEGIN
PERFORM id
FROM whatever_table;
IF NOT FOUND THEN
-- INSERT stuff
END IF;
END
$do$;