Using a SQL Server 2008 R2 (November release) database and a .NET 4.0 Beta 2 Azure worker role application. The worker role collects data and inserts it into a single SQL table that has one identity column. Because there will likely be multiple instances of this worker role running, I created an INSTEAD OF INSERT trigger on the table. The trigger performs upsert functionality using the SQL MERGE statement. Using T-SQL I was able to verify that the trigger functions correctly: new rows were inserted while existing rows were updated.
This is the code for my trigger:
Create Trigger [dbo].[trgInsteadOfInsert] on [dbo].[Cars] Instead of Insert
as
begin
    set nocount on
    merge into Cars as Target
    using inserted as Source
        on Target.id = Source.id AND Target.Manufactureid = Source.Manufactureid
    when matched then
        update set Target.Model = Source.Model,
                   Target.NumDoors = Source.NumDoors,
                   Target.Description = Source.Description,
                   Target.LastUpdateTime = Source.LastUpdateTime,
                   Target.EngineSize = Source.EngineSize
    when not matched then
        INSERT ([Manufactureid]
               ,[Model]
               ,[NumDoors]
               ,[Description]
               ,[ID]
               ,[LastUpdateTime]
               ,[EngineSize])
        VALUES (Source.Manufactureid,
                Source.Model,
                Source.NumDoors,
                Source.Description,
                Source.ID,
                Source.LastUpdateTime,
                Source.EngineSize);
End
Within the worker role I am using Entity Framework for the object model. When I call the SaveChanges method I receive the following exception:
OptimisticConcurrencyException
Store update, insert, or delete statement affected an unexpected number of rows (0). Entities may have been modified or deleted since entities were loaded. Refresh ObjectStateManager entries.
I understand this is likely because SQL does not report a SCOPE_IDENTITY() value back for each newly inserted/updated row. EF then thinks the rows were not inserted and the transaction is ultimately not committed.
What is the best way to handle this exception? Maybe using OUTPUT from the SQL MERGE statement?
Thanks!
-Paul
As you suspected, the problem is that any insertion into a table with an identity column is immediately followed by a select of scope_identity() to populate the associated value in the Entity Framework. The INSTEAD OF trigger causes this second step to be missed, which leads to the "unexpected number of rows (0)" error.
I found an answer in this StackOverflow thread that suggested adding the following line at the end of your trigger (in the case where the item is not matched and the Insert is performed).
select [Id] from [dbo].[TableXXX] where @@ROWCOUNT > 0 and [Id] = scope_identity()
I tested this with Entity Framework 4.1, and it solved the problem for me. I have copied my entire trigger creation here for completeness. With this trigger definition I was able to add rows to the table by adding Address entities to the context and saving them using context.SaveChanges().
ALTER TRIGGER [dbo].[CalcGeoLoc]
ON [dbo].[Address]
INSTEAD OF INSERT
AS
BEGIN
    -- NOCOUNT is left OFF so the rows-affected count reaches the caller
    -- (Entity Framework checks it).
    SET NOCOUNT OFF;
    -- Replacement insert: compute the GeoLoc point from Latitude/Longitude.
    INSERT INTO Address (Street, Street2, City, StateProvince, PostalCode, Latitude, Longitude, GeoLoc, Name)
    SELECT Street, Street2, City, StateProvince, PostalCode, Latitude, Longitude, geography::Point(Latitude, Longitude, 4326), Name
    FROM Inserted;
    -- Return the generated identity so Entity Framework can populate the entity.
    select AddressId from [dbo].Address where @@ROWCOUNT > 0 and AddressId = scope_identity();
END
I had almost exactly the same scenario: Entity Framework-driven inserts to a view with an INSTEAD OF INSERT trigger on it were resulting in the "...unexpected number of rows (0)..." exception. Thanks to Ryan Gross's answer I fixed it by adding
SELECT SCOPE_IDENTITY() AS CentrePersonID;
at the end of my trigger, where CentrePersonID is the name of the key field of the underlying table, which has an auto-incrementing identity. This way EF can discover the ID of the newly inserted record.
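Pulling the two answers together: applied to the Cars trigger from the question, the fix is a trailing SELECT after the MERGE. This is an untested sketch; it assumes [ID] is the identity column that EF reads back, so adjust the column name if your identity lives elsewhere.
ALTER TRIGGER [dbo].[trgInsteadOfInsert] on [dbo].[Cars] Instead of Insert
as
begin
    set nocount on
    -- the MERGE upsert, exactly as in the question
    merge into Cars as Target
    using inserted as Source
        on Target.id = Source.id AND Target.Manufactureid = Source.Manufactureid
    when matched then
        update set Target.Model = Source.Model,
                   Target.NumDoors = Source.NumDoors,
                   Target.Description = Source.Description,
                   Target.LastUpdateTime = Source.LastUpdateTime,
                   Target.EngineSize = Source.EngineSize
    when not matched then
        insert (Manufactureid, Model, NumDoors, Description, ID, LastUpdateTime, EngineSize)
        values (Source.Manufactureid, Source.Model, Source.NumDoors, Source.Description,
                Source.ID, Source.LastUpdateTime, Source.EngineSize);
    -- hand the generated identity back so EF's follow-up read finds a row
    select [ID] from [dbo].[Cars] where @@ROWCOUNT > 0 and [ID] = scope_identity()
end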
I am slowly working through a feature that imports large CSV files. There is a chance that the contents of an uploaded file will trigger a uniqueness conflict. I've combed Stack Overflow for similar resources but I still can't get my trigger to update another table when a duplicate entry is found. The following is what I have currently implemented, with my line of logic for this process. This is implemented in a Rails app, but the underlying SQL is the following.
When a user uploads a file, the following happens when it's processed.
CREATE TEMP TABLE codes_temp ON COMMIT DROP AS SELECT * FROM codes WITH NO DATA;
create or replace function log_duplicate_code()
returns trigger
language plpgsql
as
$$
begin
insert into duplicate_codes(id, campaign_id, code_batch_id, code, code_id, created_at, updated_at)
values (gen_random_uuid(), excluded.campaign_id, excluded.code_batch_id, excluded.code, excluded.code_id, now(), now());
return null;
end;
$$;
create trigger log_duplicate_code
after insert on codes
for each row execute procedure log_duplicate_code();
INSERT INTO codes SELECT * FROM codes_temp ct
ON CONFLICT (campaign_id, code)
DO update set updated_at = excluded.updated_at;
DROP TRIGGER log_duplicate_code ON codes;
When I try to run this process nothing happens at all. If I upload a CSV file with the value CODE01 and then upload it again, the duplicate_codes table doesn't get populated at all and I don't understand why. No error gets raised or anything, so it seems like DO UPDATE ... is doing something. What am I missing here?
I also have some questions that come to mind even if this were to work as intended, given that I am uploading millions of these codes:
1) Should my trigger be a statement trigger instead of a row trigger for scalability?
2) What if someone else uploads another file with millions of codes? I have my code wrapped in a transaction. Would a new, separate trigger be created? Would it conflict with a previously running upload?
####### EDIT #1 #######
Thanks to Adriens' comment I do see that AFTER INSERT does not have the OLD keyword available. I updated my code to use EXCLUDED and I receive the following error for the trigger.
ERROR: missing FROM-clause entry for table "excluded" (PG::UndefinedTable)
Finally, here are the S.O. posts I've used to try to tailor my code but I just can't seem to make it work.
####### EDIT #2 #######
I have a little more context on how this is implemented.
When the CSV is loaded, a staging table called codes_temp is created and dropped at the end of the transaction. This table has no unique constraints. From what I read, only the actual table I want to insert codes into should raise the unique-constraint error.
In my INSERT statement, the DO UPDATE SET updated_at = excluded.updated_at; doesn't trigger a unique-constraint error. As of right now, I don't know if it should or not. I borrowed this logic from the S.O. question "postgresql log into another table with on conflict"; it seemed to me that I had to update something if I specify the DO UPDATE SET clause.
Last, the correct criteria for codes in the database is the following:
For example, these are example entries in my codes table:
id, campaign_id, code
1, 1, CODE01
2, 1, CODE02
3, 1, CODE03
If any of these codes appear again somewhere, they should not be inserted into the codes table; they need to be inserted into the duplicate_codes table instead, because they were already uploaded before.
id, campaign_id, code
1, 1, CODE01
2, 1, CODE02
3, 1, CODE03
As for the codes_temp table, I don't have any unique constraints, so there are no criteria to select the right one.
postgresql log into another table with on conflict
Postgres insert on conflict update using other table
Postgres on conflict - insert to another table
How to do INSERT INTO SELECT and ON DUPLICATE UPDATE in PostgreSQL 9.5?
Seems to me something like:
INSERT INTO
codes
SELECT
distinct on(campaign_id, code) *
FROM
codes_temp ct
ORDER BY
campaign_id, code, id DESC;
Assuming id was assigned sequentially, the above would select the most recent row into codes.
Then:
INSERT INTO
duplicate_codes
SELECT
ct.*   -- just the staging columns, not the joined codes columns
FROM
codes_temp AS ct
LEFT JOIN
codes
ON
ct.id = codes.id
WHERE
codes.id IS NULL;
The above selects the rows in codes_temp that were not selected into codes and puts them into the duplicates table.
Obviously not tested on your data set. I would create a small test data set that has uniqueness conflicts and test with.
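For example, a minimal conflict set (hypothetical values, using just the three columns shown in the question; the real codes table has more) might look like:
-- stand-in staging table with two rows colliding on (campaign_id, code)
CREATE TEMP TABLE codes_temp (id int, campaign_id int, code text);
INSERT INTO codes_temp VALUES
    (10, 1, 'CODE01'),
    (11, 1, 'CODE01'),   -- duplicate pair within the load
    (12, 1, 'CODE99');
Running the two INSERTs above against this set should leave one CODE01 row in codes and the other in duplicate_codes.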
I am learning to use triggers in PostgreSQL but run into an issue with this code:
CREATE OR REPLACE FUNCTION checkAdressen() RETURNS TRIGGER AS $$
DECLARE
adrCnt int = 0;
BEGIN
SELECT INTO adrCnt count(*) FROM Adresse
WHERE gehoert_zu = NEW.kundenId;
IF adrCnt < 1 OR adrCnt > 3 THEN
RAISE EXCEPTION 'Customer must have 1 to 3 addresses.';
ELSE
RAISE EXCEPTION 'No exception';
END IF;
END;
$$ LANGUAGE plpgsql;
I create a trigger with this procedure after freshly creating all my tables so they are all empty. However the count(*) function in the above code returns 1.
When I run SELECT count(*) FROM adresse; outside of PL/pgSQL, I get 0.
I tried using the FOUND variable but it is always true.
Even more strangely, when I insert some values into my tables and then delete them again so that they are empty again, the code works as intended and count(*) returns 0.
Also, if I leave out the WHERE gehoert_zu = NEW.kundenId, count(*) returns 0, which means I get more results with the WHERE clause than without.
--Edit:
Here is an example of how I use the procedure:
CREATE TABLE kunde (
kundenId int PRIMARY KEY
);
CREATE TABLE adresse (
id int PRIMARY KEY,
gehoert_zu int REFERENCES kunde
);
CREATE CONSTRAINT TRIGGER adressenKonsistenzTrigger AFTER INSERT ON Kunde
DEFERRABLE INITIALLY DEFERRED
FOR EACH ROW
EXECUTE PROCEDURE checkAdressen();
INSERT INTO kunde VALUES (1);
INSERT INTO adresse VALUES (1,1);
It looks like I am getting the DEFERRABLE INITIALLY DEFERRED part wrong. I assumed the trigger would be executed after the first INSERT statement, but it happens after the second one, although the inserts are not inside a BEGIN; ... COMMIT; block.
According to the PostgreSQL documentation, inserts are committed automatically each time when not inside such a block, and thus there shouldn't be an entry in adresse when the first INSERT statement is committed.
Can anyone point out my mistake?
--Edit:
The trigger and DEFERRABLE INITIALLY DEFERRED seem to be working all right.
My mistake was to assume that, since I am not using a BEGIN-COMMIT block, each insert would be executed in its own transaction with the trigger being executed afterwards every time.
However even without the BEGIN-COMMIT all inserts get bundled into one transaction and the trigger is executed afterwards.
Given this behaviour, what is the point in using BEGIN-COMMIT?
You need a transaction plus the "DEFERRABLE INITIALLY DEFERRED" because of the chicken and egg problem.
starting with two empty tables:
you cannot insert a single row into the person table, because it needs at least one address.
you cannot insert a single row into the address table, because the FK constraint needs a corresponding row on the person table to exist
This is why you need to bundle the two inserts into one operation: the transaction. You need the BEGIN + COMMIT, and DEFERRABLE allows transient forbidden database states to exist: it causes the check to be evaluated at commit time.
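Concretely, with the kunde/adresse tables from the question:
BEGIN;
INSERT INTO kunde VALUES (1);       -- would fail the check if evaluated now: no address yet
INSERT INTO adresse VALUES (1, 1);  -- satisfies the 1-to-3 addresses rule
COMMIT;                             -- the deferred trigger fires here and sees both rows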
This may seem a bit silly, but the answer is you need to stop deferring the trigger and run it BEFORE the insert. If you run it after the insert, of course there is data in the table.
As far as I can tell this is working as expected.
One further note: you probably don't mean
RAISE EXCEPTION 'No Exception';
You probably want
RAISE INFO 'No Exception';
Then you can change your settings and run queries in transactions to test that the trigger does what you want it to do. As it is, every insert is going to fail, and you have no way to move this into production without editing your procedure.
I've been searching for a while trying to figure this out. I'm trying to create a table with a composite primary key. The first part of the key is also a foreign key to the parent table. The second part is autogenerated on the SQL Server. So, I have a table that should look like this:
ParentId ChildId
-------- -------
1 1
1 2
1 3
2 1
2 2
2 3
2 4
The ChildId column is only unique within the context of the ParentId. The values are autogenerated on the server using an INSTEAD OF INSERT trigger so that each ChildId has its own sequence.
My issue is that while this works grand within SQL Server, and classic ADO.NET SqlCommand statements, Entity Framework does not want to work with this.
If I set the ChildId column's StoreGeneratedPattern to be an Identity then EF generates SQL that looks like this:
insert [dbo].[ChildTable]([ParentId], [Name])
values (@0, @1)
select [ChildId]
from [dbo].[ChildTable]
where @@ROWCOUNT > 0 and [ParentId] = @0 and [Id] = scope_identity()
This just generates an error:
System.Data.Entity.Infrastructure.DbUpdateConcurrencyException : Store
update, insert, or delete statement affected an unexpected number of
rows (0). Entities may have been modified or deleted since entities
were loaded. Refresh ObjectStateManager entries.
---->
System.Data.OptimisticConcurrencyException : Store update, insert, or
delete statement affected an unexpected number of rows (0). Entities
may have been modified or deleted since entities were loaded. Refresh
ObjectStateManager entries.
However, if I create a test table with a key based on a GUID and set the StoreGeneratedPattern to be an Identity then the SQL generated looks like this:
declare @generated_keys table([Id] uniqueidentifier)
insert [dbo].[GuidTable]([Name])
output inserted.[Id] into @generated_keys
values (@0)
select t.[Id]
from @generated_keys as g join [dbo].[GuidTable] as t on g.[Id] = t.[Id]
where @@ROWCOUNT > 0
And the entity in my application is updated with the value of the GUID that the SQL Server generated.
So that suggests the column does not have to be an IDENTITY column for Entity Framework to get a value back. However, since it uses the logical inserted table, the value of ChildId won't be the value the trigger changed it to. Also, the inserted table cannot have an UPDATE operation applied to it to push the values back inside the trigger (tried that; it said "The logical tables INSERTED and DELETED cannot be updated.").
I feel that I've kind of backed myself in to a corner here, but before I rethink the design is there any way to get the ChildId value(s) back into the application via Entity Framework?
I found this article which offered a suggestion: http://wiki.alphasoftware.com/Scope_Identity+in+SQL+Server+with+nested+and+INSTEAD+OF+triggers
The TL;DR version is that the INSTEAD OF INSERT trigger performs a SELECT at the end to return the keys. The article was about the loss of the SCOPE_IDENTITY() value due to the trigger, but it works here too.
So, what I did was this:
The trigger now reads:
ALTER TRIGGER dbo.IOINS_ChildTable
ON dbo.ChildTable
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- Acquire the lock so that no one else can generate a key at the same time.
    -- If the transaction fails then the lock will automatically be released.
    -- If the acquisition takes longer than 15 seconds an error is raised.
    DECLARE @res INT;
    EXEC @res = sp_getapplock @Resource = 'IOINS_ChildTable',
        @LockMode = 'Exclusive', @LockOwner = 'Transaction', @LockTimeout = '15000',
        @DbPrincipal = 'public'
    IF (@res < 0)
    BEGIN
        RAISERROR('Unable to acquire lock to update ChildTable.', 16, 1);
    END
    -- Work out what the current maximum Ids are for each parent that is being
    -- inserted in this operation.
    DECLARE @baseId TABLE(BaseId int, ParentId int);
    INSERT INTO @baseId
    SELECT MAX(ISNULL(c.Id, 0)) AS BaseId, i.ParentId
    FROM inserted i
    LEFT OUTER JOIN ChildTable c ON i.ParentId = c.ParentId
    GROUP BY i.ParentId
    -- The replacement insert operation
    DECLARE @keys TABLE (Id INT);
    INSERT INTO ChildTable
    OUTPUT inserted.Id INTO @keys
    SELECT
        i.ParentId,
        ROW_NUMBER() OVER(PARTITION BY i.ParentId ORDER BY i.ParentId) + b.BaseId
            AS Id,
        Name
    FROM inserted i
    INNER JOIN @baseId b ON b.ParentId = i.ParentId
    -- Release the lock.
    EXEC @res = sp_releaseapplock @Resource = 'IOINS_ChildTable',
        @DbPrincipal = 'public', @LockOwner = 'Transaction'
    SELECT Id FROM @keys
END
GO
The entity model has the Id column's StoreGeneratedPattern set to Identity. This means that when Entity Framework attempts to read the SCOPE_IDENTITY() value, it gets the value supplied by the SELECT statement in the trigger, rather than the value its own SELECT ... scope_identity() would supply; that one is now in a second result set, which EF wasn't expecting and will ignore.
This has some obvious issues.
Because the trigger now selects data to be returned, other code, say a stored proc, that inserts some data and performs its own select is going to have an extra result set pushed out ahead of its own. So code that expects only one result set from a database operation now gets an additional one.
If you are only ever going to use the entity framework then this may all be fine. However, I can't say what the future holds so I'm not entirely comfortable with this solution.
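As a quick smoke test of the trigger above (column names assumed from the trigger body; note that sp_getapplock with the 'Transaction' owner needs an open transaction):
BEGIN TRAN
INSERT INTO ChildTable (ParentId, Name) VALUES (1, 'a'), (1, 'b'), (2, 'a')
COMMIT
-- expected: the generated Id sequence restarts for each ParentId
SELECT ParentId, Id, Name FROM ChildTable ORDER BY ParentId, Id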
This is an odd one...
I have two tables tableA and tableB
tableB has a foreign key to tableA.
I have 2 sprocs, one inserts to tableA, the other to tableB.
Using ODP.NET I run the first sproc, inserting a record into tableA. I can then open SQL*Plus and select this record.
I then run the second sproc, inserting into tableB.
It fails with "ORA-02291: integrity constraint violated - parent key not found".
I have double, triple, quadruple checked for typos etc... nothing.
To make things even more odd, when I do this same operation manually in SQL*Plus, with the same sprocs, it works without a problem.
This is killing me: 12+ hours looking for something I know has to be simple.
Here are the sprocs.
SPROCA
CREATE OR REPLACE PROCEDURE genData_TestTrackerSegment
(
INTX_ID IN IntxSegment.IntxID%TYPE,
siteid IN INT
)
AS
BEGIN
INSERT INTO INTXSEGMENT(INTXID,INTXTYPEID,VERSION,ISPRIVATE,
SEGMENTTYPE,STARTDATETIME,INTXDIRECTION,SITEID)
VALUES(INTX_ID,1,1,0,1,SYSDATE,1,siteid);
COMMIT;
END;
SPROCB
CREATE OR REPLACE PROCEDURE genData_TestTrackerPart
(
INTX_ID IN IntxSegment.IntxID%TYPE,
INTX_PART_ID IN INTX_PARTICIPANT.INTX_PART_ID%TYPE,
INDIVID IN INDIVIDUAL.INDIVID%TYPE,
CALLID IN INTX_PARTICIPANT.CALLIDKEY%TYPE
)AS
BEGIN
INSERT INTO
INTX_PARTICIPANT(INTXID,INTX_PART_ID,INDIVID,ROLE,
CALLIDKEY,RECORDED,VERSION,STARTDATETIME)
VALUES(INTX_ID,INTX_PART_ID,INDIVID,1,CALLID,1,1,SYSDATE);
COMMIT;
END;
Yeah, I am more than certain; it is without a shadow of a doubt that FKEY.
That being said, I fixed this... and this is sooooooo stupid, by the way.
I was under the (mistaken) assumption that 'named parameters' in ODP.NET meant I did not have to add these parameters in the same order they are referenced in the stored procedure. (By default ODP.NET binds parameters by position, not by name, unless OracleCommand.BindByName is set to true.) Long story short: after I rewrote this about four times, I modified the order of the parameters and it is now fixed.
For me, an ORA-02291 error means that one of the FK values used is not a valid or existing PK in the parent table.
Just make sure that each FK value inserted into your table is a valid one.
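A quick sanity check, using the tables from the question, is to confirm the parent row is visible in the same session before calling the second sproc:
-- should return 1 for the INTX_ID you are about to pass to genData_TestTrackerPart
SELECT COUNT(*) FROM INTXSEGMENT WHERE INTXID = :intx_id;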
Some SQL servers have a feature where INSERT is skipped if it would violate a primary/unique key constraint. For instance, MySQL has INSERT IGNORE.
What's the best way to emulate INSERT IGNORE and ON DUPLICATE KEY UPDATE with PostgreSQL?
With PostgreSQL 9.5, this is now native functionality (like MySQL has had for several years):
INSERT ... ON CONFLICT DO NOTHING/UPDATE ("UPSERT")
9.5 brings support for "UPSERT" operations.
INSERT is extended to accept an ON CONFLICT DO UPDATE/IGNORE clause. This clause specifies an alternative action to take in the event of a would-be duplicate violation.
...
Further example of new syntax:
INSERT INTO user_logins (username, logins)
VALUES ('Naomi',1),('James',1)
ON CONFLICT (username)
DO UPDATE SET logins = user_logins.logins + EXCLUDED.logins;
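And the IGNORE-style counterpart, which simply skips the conflicting rows:
INSERT INTO user_logins (username, logins)
VALUES ('Naomi',1),('James',1)
ON CONFLICT (username) DO NOTHING;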
Edit: in case you missed warren's answer, PG9.5 now has this natively; time to upgrade!
Building on Bill Karwin's answer, to spell out what a rule-based approach would look like (transferring from another schema in the same DB, and with a multi-column primary key):
CREATE RULE "my_table_on_duplicate_ignore" AS ON INSERT TO "my_table"
WHERE EXISTS(SELECT 1 FROM my_table
WHERE (pk_col_1, pk_col_2)=(NEW.pk_col_1, NEW.pk_col_2))
DO INSTEAD NOTHING;
INSERT INTO my_table SELECT * FROM another_schema.my_table WHERE some_cond;
DROP RULE "my_table_on_duplicate_ignore" ON "my_table";
Note: The rule applies to all INSERT operations until the rule is dropped, so not quite ad hoc.
For those of you that have Postgres 9.5 or higher, the new ON CONFLICT DO NOTHING syntax should work:
INSERT INTO target_table (field_one, field_two, field_three )
SELECT field_one, field_two, field_three
FROM source_table
ON CONFLICT (field_one) DO NOTHING;
For those of us who have an earlier version, this left join will work instead:
INSERT INTO target_table (field_one, field_two, field_three )
SELECT source_table.field_one, source_table.field_two, source_table.field_three
FROM source_table
LEFT JOIN target_table ON source_table.field_one = target_table.field_one
WHERE target_table.field_one IS NULL;
Try to do an UPDATE. If it doesn't modify any row that means it didn't exist, so do an insert. Obviously, you do this inside a transaction.
You can of course wrap this in a function if you don't want to put the extra code on the client side. You also need a loop to handle the very rare race condition in that approach.
There's an example of this in the documentation: http://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html, example 40-2 right at the bottom.
That's usually the easiest way. You can do some magic with rules, but it's likely going to be a lot messier. I'd recommend the wrap-in-function approach over that any day.
This works for single-row, or few-row, values. If you're dealing with large numbers of rows, for example from a subquery, you're best off splitting it into two queries, one for INSERT and one for UPDATE (as an appropriate join/subselect of course; no need to write your main filter twice).
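For reference, the documentation's merge_db example boils down to a retry loop like this (adapted from the manual; assumes a table db(a int primary key, b text)):
CREATE TABLE db (a INT PRIMARY KEY, b TEXT);

CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- first try to update the key
        UPDATE db SET b = data WHERE a = key;
        IF found THEN
            RETURN;
        END IF;
        -- not there, so try to insert the key;
        -- if someone else inserts the same key concurrently,
        -- we could get a unique-key failure
        BEGIN
            INSERT INTO db(a,b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing, and loop to try the UPDATE again
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;

SELECT merge_db(1, 'david');
SELECT merge_db(1, 'dennis');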
To get the INSERT IGNORE logic you can do something like below. I found that simply inserting from a SELECT of literal values worked best; then you can mask out the duplicate keys with a NOT EXISTS clause. To get the update-on-duplicate logic I suspect a pl/pgsql loop would be necessary.
INSERT INTO manager.vin_manufacturer
(SELECT * FROM( VALUES
('935',' Citroën Brazil','Citroën'),
('ABC', 'Toyota', 'Toyota'),
('ZOM',' OM','OM')
) as tmp (vin_manufacturer_id, manufacturer_desc, make_desc)
WHERE NOT EXISTS (
--ignore anything that has already been inserted
SELECT 1 FROM manager.vin_manufacturer m where m.vin_manufacturer_id = tmp.vin_manufacturer_id)
)
INSERT INTO mytable(col1,col2)
SELECT 'val1','val2'
WHERE NOT EXISTS (SELECT 1 FROM mytable WHERE col1='val1')
As @hanmari mentioned in his comment: when inserting into a postgres table, ON CONFLICT (..) DO NOTHING is the best code to use for not inserting duplicate data:
query = "INSERT INTO db_table_name(column_name)
VALUES(%s) ON CONFLICT (column_name) DO NOTHING;"
The ON CONFLICT line of code allows the insert statement to still insert the non-conflicting rows of data. The query-and-values code is an example of inserting data from an Excel file into a postgres db table.
I have constraints added to a postgres table I use to make sure the ID field is unique. Instead of running a delete on duplicate rows of data, I add a line of SQL code that renumbers the ID column starting at 1.
Example:
q = 'ALTER SEQUENCE db_table_name_id_column_seq RESTART WITH 1'
If my data has an ID field, I do not use it as the primary/serial ID; instead I create a separate ID column and set it to serial.
I hope this information is helpful to everyone.
*I have no college degree in software development/coding. Everything I know in coding, I study on my own.
Looks like PostgreSQL supports a schema object called a rule.
http://www.postgresql.org/docs/current/static/rules-update.html
You could create a rule ON INSERT for a given table, making it do NOTHING if a row exists with the given primary key value, or else making it do an UPDATE instead of the INSERT if a row exists with the given primary key value.
I haven't tried this myself, so I can't speak from experience or offer an example.
This solution avoids using rules:
BEGIN
INSERT INTO tableA (unique_column,c2,c3) VALUES (1,2,3);
EXCEPTION
WHEN unique_violation THEN
UPDATE tableA SET c2 = 2, c3 = 3 WHERE unique_column = 1;
END;
but it has a performance drawback (see PostgreSQL.org):
A block containing an EXCEPTION clause is significantly more expensive
to enter and exit than a block without one. Therefore, don't use
EXCEPTION without need.
On bulk, you can always delete the rows before the insert. A deletion of a row that doesn't exist doesn't cause an error, so it's safely skipped.
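A sketch of that pattern (hypothetical table and column names; wrapped in a transaction so readers never see the gap):
BEGIN;
-- remove rows that would collide with the incoming batch
DELETE FROM target_table t
USING staging_table s
WHERE t.key_col = s.key_col;
INSERT INTO target_table SELECT * FROM staging_table;
COMMIT;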
For data import scripts, to replace "IF NOT EXISTS", in a way, there's a slightly awkward formulation that nevertheless works:
DO
$do$
BEGIN
PERFORM id
FROM whatever_table;
IF NOT FOUND THEN
-- INSERT stuff
END IF;
END
$do$;