Batch INSERT on multiple queries is throwing foreign key violation - postgresql

I am following this to do batch INSERT
with two queries. The first query inserts into <tableone> and the second query inserts into <tabletwo>.
The second table has a foreign key constraint that references <tableone>.
The following code is how I am handling the batch inserts:
batchQuery.push(
insertTableOne,
insertTableTwo
);
const query = pgp.helpers.concat(batchQuery);
db.none(query)
insertTableOne looks like
INSERT INTO tableone (id, att2, att3) VALUES
(1, 'a', 'b'), (2, 'c', 'd'), (3, 'e', 'f'), ...
insertTableTwo looks like
INSERT INTO tabletwo (id, tableone_id) VALUES
(10, 1), (20, 2), (30, 3), ...
with a constraint on <tabletwo>
CONSTRAINT fk_tabletwo_tableone_id
FOREIGN KEY (tableone_id)
REFERENCES Tableone (id)
Upon db.none(query) I am getting the error: violates foreign key constraint "fk_tabletwo_tableone_id"
Do the above queries not execute in sequence, first the insert into table one, then the insert into table two?
Is this an issue with how the query is being committed? I have also tried using a transaction, as shown by the example in the linked page above.
Any thoughts?

The documentation for the spex.batch() method (which is used by the pgp.helpers.concat() method from your linked example) says of the values argument:
Array of mixed values (it can be empty), to be resolved
asynchronously, in no particular order.
See http://vitaly-t.github.io/spex/global.html#batch
You probably need to use another method rather than batch().
I'd suggest chaining the dependent query with .then() so that it runs only after the first insert has completed, i.e. something like db.none(insertTableOne).then(() => db.none(insertTableTwo))
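
The ordering requirement itself can be reproduced outside pg-promise. The following is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for Postgres, purely so the example is self-contained. Table and column names follow the question; the helper names open_db and insert_in_order are hypothetical. It runs the statements strictly one after another in a single transaction and reports whether a constraint was violated.

```python
import sqlite3

SCHEMA = """
CREATE TABLE tableone (id INTEGER PRIMARY KEY, att2 TEXT, att3 TEXT);
CREATE TABLE tabletwo (
    id INTEGER PRIMARY KEY,
    tableone_id INTEGER NOT NULL,
    CONSTRAINT fk_tabletwo_tableone_id
        FOREIGN KEY (tableone_id) REFERENCES tableone (id)
);
"""

INSERT_TABLE_ONE = (
    "INSERT INTO tableone (id, att2, att3) "
    "VALUES (1, 'a', 'b'), (2, 'c', 'd'), (3, 'e', 'f')"
)
INSERT_TABLE_TWO = (
    "INSERT INTO tabletwo (id, tableone_id) VALUES (10, 1), (20, 2), (30, 3)"
)


def open_db():
    """Hypothetical helper: in-memory database with the two tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def insert_in_order(conn, statements):
    """Run the statements sequentially in one transaction.

    Returns True if everything was committed, False if a constraint
    was violated (in which case the transaction is rolled back).
    """
    try:
        with conn:  # commits on success, rolls back on an exception
            for sql in statements:
                conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False
```

Calling insert_in_order(conn, [INSERT_TABLE_TWO, INSERT_TABLE_ONE]) fails on the foreign key, while the parent-first order succeeds, which is the same dependency the .then() chain expresses.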

Related

How to avoid unnecessary updates when using on conflict with Postgres?

My use case involves syncing a table with an upstream source on a recurring schedule.
Each row has a unique identifier and other columns, and I want to make sure I'm inserting any new upstream rows, and updating any changed upstream rows. And there could be thousands of rows to sync.
But I'd like to avoid unnecessary updates where the row in the database doesn't differ from what's upstream.
Currently I'm using ON CONFLICT UPDATE like so:
INSERT INTO symbols (id, name, status)
VALUES
(1, 'one', 'online'),
(2, 'two', 'offline'),
...
ON CONFLICT (id) DO
UPDATE SET (id, name, status) = (excluded.id, excluded.name, excluded.status)
RETURNING *
But this will write the updates even when nothing is changing. How should I tweak the UPDATE to performantly check and apply to rows that need it?
You can add a WHERE clause so that only rows that actually differ are updated:
INSERT INTO symbols (id, name, status)
VALUES
(1, 'one', 'online'),
(2, 'two', 'offline'),
...
ON CONFLICT (id) DO
UPDATE SET (id, name, status) = (excluded.id, excluded.name, excluded.status)
WHERE (symbols.id, symbols.name, symbols.status) IS DISTINCT FROM (excluded.id, excluded.name, excluded.status)
RETURNING *
Note, however, that RETURNING will then only return rows that were actually inserted or updated; conflicting rows skipped by the WHERE clause are not returned, which may affect how you use the RETURNING clause.
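
To see the effect of the conditional update, here is a minimal sketch. It assumes Python's built-in sqlite3 module (SQLite 3.24 or later) as a stand-in for Postgres, with SQLite's null-safe IS NOT operator playing the role of IS DISTINCT FROM. The writes column and the function names are hypothetical; the counter exists only to make skipped updates visible.

```python
import sqlite3

# Upsert that only touches the row when name or status really changed.
UPSERT = """
INSERT INTO symbols (id, name, status) VALUES (?, ?, ?)
ON CONFLICT (id) DO UPDATE
SET name = excluded.name,
    status = excluded.status,
    writes = symbols.writes + 1
WHERE symbols.name IS NOT excluded.name
   OR symbols.status IS NOT excluded.status
"""


def sync_symbols(conn, rows):
    """Apply one upstream snapshot in a single transaction."""
    with conn:
        conn.executemany(UPSERT, rows)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE symbols ("
        " id INTEGER PRIMARY KEY, name TEXT, status TEXT,"
        " writes INTEGER NOT NULL DEFAULT 1)"  # hypothetical write counter
    )
    # First sync inserts two rows.
    sync_symbols(conn, [(1, "one", "online"), (2, "two", "offline")])
    # Second sync: row 1 unchanged, row 2 changed, row 3 new.
    sync_symbols(
        conn,
        [(1, "one", "online"), (2, "two", "online"), (3, "three", "online")],
    )
    return conn.execute(
        "SELECT id, name, status, writes FROM symbols ORDER BY id"
    ).fetchall()
```

After demo(), only row 2 has its writes counter incremented; the unchanged row 1 is left alone.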

How can I remove rows that are 100% duplicates in a PostgreSQL table without a primary key? [duplicate]

I have a PostgreSQL table with a very large number of columns. The table does not have a primary key and now contains several rows that are 100% duplicates of another row.
How can I remove those duplicates without deleting the original along with them?
I found this answer on a related question, but I'd have to spell out each and every column name, which is error-prone. How can I avoid having to know anything about the table structure?
Example:
Given
create table duplicated (
id int,
name text,
description text
);
insert into duplicated
values (1, 'A', null),
(2, 'B', null),
(2, 'B', null),
(3, 'C', null),
(3, 'C', null),
(3, 'C', 'not a DUPE!');
after deletion, the following rows should remain:
(1, 'A', null)
(2, 'B', null)
(3, 'C', null)
(3, 'C', 'not a DUPE!')
As proposed in this answer, use the system column ctid to distinguish the physical copies of otherwise identical rows.
To avoid having to spell out a non-existing 'key' for the rows, simply use the row constructor row(table), which returns a row value containing the entire row as returned by select * from table:
DELETE FROM duplicated
USING (
SELECT MIN(ctid) as ctid, row(duplicated) as row
FROM duplicated
GROUP BY row(duplicated) HAVING COUNT(*) > 1
) uniqued
WHERE row(duplicated) = uniqued.row
AND duplicated.ctid <> uniqued.ctid;
You can try it in this DbFiddle.
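
A comparable cleanup can also be sketched by looking the column names up at run time instead of typing them. This is not the row(table)/ctid query above but a swapped-in variant: it assumes Python's built-in sqlite3 module as a stand-in database, uses SQLite's rowid where Postgres would use ctid, and reads the column names via pragma_table_info. The function name delete_exact_duplicates is hypothetical.

```python
import sqlite3


def delete_exact_duplicates(conn, table):
    """Delete all but the first physical copy of fully identical rows.

    The table name is interpolated into the SQL text, so it must come
    from trusted code, never from user input.
    """
    # Discover the column names so the caller does not have to list them.
    columns = [
        row[0]
        for row in conn.execute("SELECT name FROM pragma_table_info(?)", (table,))
    ]
    group_by = ", ".join(f'"{c}"' for c in columns)
    with conn:
        # GROUP BY treats NULLs as equal, so rows that match in every
        # column (including NULL columns) land in the same group.
        cur = conn.execute(
            f'DELETE FROM "{table}" WHERE rowid NOT IN '
            f'(SELECT MIN(rowid) FROM "{table}" GROUP BY {group_by})'
        )
    return cur.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE duplicated (id INT, name TEXT, description TEXT)")
    conn.executemany(
        "INSERT INTO duplicated VALUES (?, ?, ?)",
        [
            (1, "A", None),
            (2, "B", None),
            (2, "B", None),
            (3, "C", None),
            (3, "C", None),
            (3, "C", "not a DUPE!"),
        ],
    )
    deleted = delete_exact_duplicates(conn, "duplicated")
    remaining = conn.execute("SELECT * FROM duplicated ORDER BY rowid").fetchall()
    return deleted, remaining
```

With the sample data from the question, demo() removes the two redundant copies and leaves the four rows listed as the expected result.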

Cascading delete on table that references itself

I have a table (say 'MyTable') that can reference itself, i.e. it has a ParentId that can point to another record in the same table (so you can store a tree of related nodes).
The problem is that when I try to delete all records that are children of a specific parent, I get the following exception (using EF 6):
The DELETE statement conflicted with the SAME TABLE REFERENCE constraint "FK_dbo.MyTable_dbo.MyTable_ParentId". The conflict occurred in database "foo", table "dbo.MyTable", column 'ParentId'.
The statement has been terminated.
(I'm executing a SQL command like this: context.Database.ExecuteSqlCommand("DELETE FROM [MyTable] WHERE ParentId = {0}", parentId);)
I tried to fix it by adding a Children property and using the fluent API to set up a cascading delete like this:
modelBuilder.Entity<MyTable>()
.HasMany(t => t.Children)
.WithOptional(t => t.Parent)
.WillCascadeOnDelete(true);
But that gives the following error:
Introducing FOREIGN KEY constraint 'FK_dbo.MyTable_dbo.MyTable_ParentId' on table 'MyTable' may cause cycles or multiple cascade paths. Specify ON DELETE NO ACTION or ON UPDATE NO ACTION, or modify other FOREIGN KEY constraints.
Also, when I manually remove the FK and recreate it with ON DELETE CASCADE, I get the same error.
I'm a bit lost now on how to fix this, so any ideas are welcome :)
USE tempdb;
GO
IF OBJECT_ID('tempdb..#Employees') IS NOT NULL
DROP TABLE #Employees;
GO
CREATE TABLE #Employees
(
empid INT PRIMARY KEY,
mgrid INT NULL REFERENCES #Employees,
empname VARCHAR(25) NOT NULL
);
CREATE UNIQUE INDEX idx_unc_mgrid_empid ON #Employees(mgrid, empid);
INSERT INTO #Employees(empid, mgrid, empname) VALUES
(1, NULL, 'David'),
(2, 1, 'Eitan'),
(3, 1, 'Ina'),
(4, 2, 'Seraph'),
(5, 2, 'Jiru'),
(6, 2, 'Steve'),
(7, 3, 'Aaron'),
(8, 5, 'Lilach'),
(9, 7, 'Rita'),
(10, 5, 'Sean'),
(11, 7, 'Gabriel'),
(12, 9, 'Emilia'),
(13, 9, 'Michael'),
(14, 9, 'Didi');
GO
DELETE FROM #Employees WHERE mgrid = 9;
SELECT * FROM #Employees;

Bulk insert and update in one query sqlite

Is there any way to insert and update bulk data in the same query? I have seen many links but have not found a solution. I found this code, but it's not working:
INSERT INTO `demo1` (`id`,`uname`,`address`)
VALUES (1, 2, 3),
VALUES (6, 5, 4),
VALUES (7, 8, 9)
ON DUPLICATE KEY UPDATE `id` = VALUES(32), `uname` = VALUES (b),`address` = VALUES(c)
Can anyone help me?
SQLite has the REPLACE statement (which is an alias for INSERT OR REPLACE), but this just deletes the old row if a duplicate key is found.
If you want to keep data from the old row(s), you must use two statements for each row:
db.execute("UPDATE demo1 SET ... WHERE id = 1")
if db.affected_rows == 0:
db.execute("INSERT ...")
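
A runnable version of this update-then-insert pattern might look like the following minimal sketch. It assumes Python's built-in sqlite3 module and the demo1 columns from the question; the function name update_or_insert and the extra note column are hypothetical, the latter added only to show that existing data in the old row survives.

```python
import sqlite3


def update_or_insert(conn, row_id, uname, address):
    """Update the row if it exists, otherwise insert it."""
    with conn:  # both statements run in one transaction
        cur = conn.execute(
            "UPDATE demo1 SET uname = ?, address = ? WHERE id = ?",
            (uname, address, row_id),
        )
        # rowcount is the number of rows the UPDATE matched.
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO demo1 (id, uname, address) VALUES (?, ?, ?)",
                (row_id, uname, address),
            )


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE demo1 ("
        " id INTEGER PRIMARY KEY, uname TEXT, address TEXT,"
        " note TEXT)"  # hypothetical column not touched by the update
    )
    conn.execute("INSERT INTO demo1 VALUES (1, 'old', 'old street', 'keep me')")
    update_or_insert(conn, 1, "new", "new street")  # existing row: updated
    update_or_insert(conn, 6, "five", "four")       # missing row: inserted
    return conn.execute("SELECT * FROM demo1 ORDER BY id").fetchall()
```

Unlike REPLACE, the existing row keeps its note value because it is updated in place rather than deleted and re-created.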

generate insert script from model in entity framework

Is there any way to generate a data insert script from the model in EF? For example, I have a table (object) structure like: People->Customers->Orders...
I want to load one instance of People recursively, e.g. People people = peopleRepository.GetByKey(1),
and from this people instance I want to generate an insert script for all child objects, like:
insert into people(id, name, ...) values (1, john...)
insert into customers(id, peopleid) values(1, 1)
insert into orders(id, customerid) values (1, 1)
insert into orders(id, customerid) values (1, 2)...
Is this possible in EF?
Thanks
No, there is no such tool. You must write it yourself. Simply call
var person = context.People.Include("Customers.Orders").Single(p => p.Id == 1);
and use the loaded data to create the insert scripts.
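
The "write it yourself" step amounts to walking the loaded object graph parent-first and emitting one INSERT per object. The following is a language-neutral sketch in Python, not EF code: the dataclasses stand in for the loaded entities, and every class, field and function name here is hypothetical, with table and column names taken from the question's example.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    id: int
    customer_id: int


@dataclass
class Customer:
    id: int
    people_id: int
    orders: list = field(default_factory=list)


@dataclass
class Person:
    id: int
    name: str
    customers: list = field(default_factory=list)


def sql_literal(value):
    """Render a Python value as a SQL literal (numbers, strings, NULL)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    # Escape single quotes by doubling them.
    return "'" + str(value).replace("'", "''") + "'"


def insert_statement(table, values):
    """Build one INSERT from a column -> value mapping."""
    cols = ", ".join(values)
    vals = ", ".join(sql_literal(v) for v in values.values())
    return f"INSERT INTO {table} ({cols}) VALUES ({vals});"


def insert_script(person):
    """Emit parents before children so foreign keys are satisfied."""
    lines = [insert_statement("people", {"id": person.id, "name": person.name})]
    for c in person.customers:
        lines.append(
            insert_statement("customers", {"id": c.id, "peopleid": c.people_id})
        )
        for o in c.orders:
            lines.append(
                insert_statement("orders", {"id": o.id, "customerid": o.customer_id})
            )
    return "\n".join(lines)
```

For example, insert_script(Person(1, "john", [Customer(1, 1, [Order(1, 1), Order(2, 1)])])) yields one people row, one customers row and two orders rows, in that order.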