Why can't I use WHERE NOT EXISTS with INSERT? - postgresql

I want to insert a tag named "foo", unless it already exists. So I constructed the following query:
INSERT INTO "tag" ("name") VALUES ('foo')
WHERE NOT EXISTS (SELECT 1 FROM "tag" WHERE ("tag"."name" = 'foo'));
But this will fail with the following error:
ERROR: syntax error at or near "WHERE"
LINE 1: INSERT INTO "tag" ("name") VALUES ('foo') WHERE NOT EXISTS (...
^
I don't understand where the problem with that query is. Especially, since I can provide a subquery instead of VALUES and suddenly the query is perfectly fine:
INSERT INTO "tag" ("name") SELECT 'foo' AS name
WHERE NOT EXISTS (SELECT 1 FROM "tag" WHERE ("tag"."name" = 'foo'));
This results in:
Query returned successfully: 0 rows affected, 11 ms execution time.
It's 0 rows, because the tag already exists.

You can; you just need to use the INSERT INTO ... SELECT ... form. VALUES is a bare row constructor and does not accept a WHERE clause, whereas SELECT does.
INSERT INTO "tag" ("name")
SELECT 'foo'
WHERE NOT EXISTS (SELECT 1 FROM "tag" WHERE ("tag"."name" = 'foo'));
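However, it doesn't do what you want, at least not under concurrent workloads. Two sessions can both pass the NOT EXISTS check before either one commits, so you can still get unique violations (or duplicate rows, if there is no unique constraint).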
See:
How to UPSERT (MERGE, INSERT ... ON DUPLICATE UPDATE) in PostgreSQL?
Insert, on duplicate update in PostgreSQL?
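As a minimal sketch of the race-free alternative those links discuss: on PostgreSQL 9.5 or later, ON CONFLICT DO NOTHING lets the unique constraint itself arbitrate instead of a separate existence check. The table definition here is a hypothetical stand-in; the only real requirement is a unique constraint or unique index on "name".

```sql
-- Hypothetical minimal definition (an assumption, not the asker's schema).
-- ON CONFLICT ("name") needs a unique constraint or index on "name".
CREATE TABLE "tag" ("name" text NOT NULL UNIQUE);

-- First run: the row is inserted.
INSERT INTO "tag" ("name") VALUES ('foo')
ON CONFLICT ("name") DO NOTHING;

-- Second run: the conflicting row is skipped without raising an error,
-- even if another session inserted 'foo' concurrently.
INSERT INTO "tag" ("name") VALUES ('foo')
ON CONFLICT ("name") DO NOTHING;
```

Unlike the WHERE NOT EXISTS form, this keeps the plain VALUES syntax from the question.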

Related

Why does PostgreSQL report a duplicate key when the key does not exist?

When I insert data into PostgreSQL (9.6), it throws this error:
ERROR: duplicate key value violates unique constraint "book_intial_name_isbn_isbn10_key"
DETAIL: Key (name, isbn, isbn10)=(三銃士, , ) already exists.
SQL state: 23505
I added a unique constraint on the columns name, isbn, isbn10. But when I check the destination table, it does not contain the record:
select * from public.book where name like '%三銃%';
How can I fix this? This is my insert SQL:
insert into public.book
select *
from public.book_backup20190405 legacy
where legacy."name" not in
(
select name
from public.book
)
limit 1000
An educated guess: there may be more than one row in the source table book_backup20190405 that has the unique key tuple ('三銃士', '', '').
Since the bulk INSERT INTO ... SELECT ... is a single atomic statement, the whole insert is rolled back when the constraint fails. That is why none of the rows, including the offending one, show up in the target table afterwards.
You can verify this by running a dupe check on the source table:
SELECT name, isbn, isbn10, COUNT(*)
FROM public.book_backup20190405
WHERE name = '三銃士'
GROUP BY name, isbn, isbn10
HAVING COUNT(*) > 1;
Any row this returns is a key tuple that appears more than once in the source table.
Here's an example of how the source table can be the sole source of duplicates:
http://sqlfiddle.com/#!17/29ba3
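If dropping the duplicate source rows is acceptable, one possible approach is to let ON CONFLICT DO NOTHING skip them (available since PostgreSQL 9.5, so it applies to 9.6). This is a sketch under two assumptions not confirmed by the question: that the unique constraint from the error message is defined on public.book, and that the backup table has the same column layout.

```sql
-- Assumption: public.book has the unique constraint on (name, isbn, isbn10)
-- and the same columns, in the same order, as the backup table.
INSERT INTO public.book
SELECT *
FROM public.book_backup20190405
-- With no conflict target, any unique violation causes the row to be
-- skipped, whether it clashes with an existing row or with an earlier
-- row of this same statement.
ON CONFLICT DO NOTHING;
```

Which of several duplicate source rows ends up in the table is not defined here; add an explicit de-duplication step if that matters.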

Teradata MERGE with DELETE and INSERT - syntax?

I have been trying to find the correct syntax for the following case (if it is possible?):
MERGE INTO TAB_A tgt
USING TAB_B src ON (src.F1 = tgt.F1 AND src.F2 = tgt.F2
WHEN MATCHED THEN DELETE
ELSE INSERT (tgt.*) VALUES (src.*)
Background: the temp table contains a fix for the target table, as in it contains two types of rows:
the incorrect rows that are to be removed (they match with rows in the target table), and the 'corrected' row that should be inserted (it replaces all the 'delete' rows).
So essentially: remove anything that matches;
insert anything that does not match.
The current error I am getting is:
"Syntax error: expected something between the 'DELETE' keyword and the 'ELSE' keyword"
Any help appreciated, thanks!
You can use a multi-statement request, a DELETE followed by an INSERT, to apply the corrections from the temp table to the target table:
DELETE FROM TAB_A WHERE EXISTS (SELECT 1 FROM TAB_B WHERE TAB_A.F1 = TAB_B.F1 AND TAB_A.F2 = TAB_B.F2)
;INSERT INTO TAB_A SELECT * FROM TAB_B;

Moving "Craig's" gapless sequence to other schema fails?

While trying to move Craig Ringer's PostgreSQL gapless sequence example into another schema, 'test', I get an error. The query
INSERT INTO test.dummy(id, blah)
VALUES ( test.get_next_id('test.thetable_id_counter','last_id'), 42 );
errors with
relation "test.thetable_id_counter" does not exist
UPDATE "test.thetable_id_counter" SET "test.last_id" = "test...
^
yet the query:
SELECT to_regclass('test.thetable_id_counter');
does not result in NULL.
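The function body is not shown here, but the error output, where "test.thetable_id_counter" appears as one double-quoted identifier, is what you would see if get_next_id builds its UPDATE with format() and the %I specifier. Assuming that is the case, this sketch shows why a schema-qualified name breaks, while to_regclass, which parses the qualified name, still finds the table:

```sql
-- %I quotes its whole argument as ONE identifier, so the dot becomes part
-- of the table name instead of separating schema and table.
SELECT format('UPDATE %I SET %I = %I + 1',
              'test.thetable_id_counter', 'last_id', 'last_id');
-- → UPDATE "test.thetable_id_counter" SET last_id = last_id + 1

-- Passing schema and table as separate arguments keeps the qualification.
SELECT format('UPDATE %I.%I SET %I = %I + 1',
              'test', 'thetable_id_counter', 'last_id', 'last_id');
-- → UPDATE test.thetable_id_counter SET last_id = last_id + 1
```

Another option under the same assumption would be to declare the table parameter as regclass and interpolate it with %s, since a regclass value renders as a properly quoted, schema-qualified name where needed.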

Upsert error (On Conflict Do Update) pointing to duplicate constrained values

I have a problem with ON CONFLICT DO UPDATE in Postgres 9.5 when I try to use more than one source table in the FROM clause.
Example of working code:
INSERT INTO new.bookmonographs (citavi_id, abstract, createdon, edition, title, year)
SELECT "ID", "Abstract", "CreatedOn"::timestamp, "Edition", "Title", "Year"
FROM old."Reference"
WHERE old."Reference"."ReferenceType" = 'Book'
AND old."Reference"."Year" IS NOT NULL
AND old."Reference"."Title" IS NOT NULL
ON CONFLICT (citavi_id) DO UPDATE
SET (abstract, createdon, edition, title, year) = (excluded.abstract, excluded.createdon, excluded.edition, excluded.title, excluded.year)
;
Faulty code:
INSERT INTO new.bookmonographs (citavi_id, abstract, createdon, edition, title, year)
SELECT "ID", "Abstract", "CreatedOn"::timestamp, "Edition", "Title", "Year"
FROM old."Reference", old."ReferenceAuthor"
WHERE old."Reference"."ReferenceType" = 'Book'
AND old."Reference"."Year" IS NOT NULL
AND old."Reference"."Title" IS NOT NULL
AND old."ReferenceAuthor"."ReferenceID" = old."Reference"."ID"
--Year, Title and Author must be present in the data, otherwise the entry is deemed useless, hence won't be included
ON CONFLICT (citavi_id) DO UPDATE
SET (abstract, createdon, edition, title, year) = (excluded.abstract, excluded.createdon, excluded.edition, excluded.title, excluded.year)
;
I added an additional source table to the FROM clause and one more WHERE condition to make sure only entries that have a title, year and author are inserted into the new database. (If old."Reference"."ID" exists in old."ReferenceAuthor" as "ReferenceID", then an author exists.) Even without the additional WHERE condition the query is faulty. The columns I specified in SELECT are only present in old."Reference", not in old."ReferenceAuthor".
Currently old."ReferenceAuthor" and old."Reference" don't have a UNIQUE constraint. The constraints on bookmonographs are:
CONSTRAINT bookmonographs_pk PRIMARY KEY (bookmonographsid),
CONSTRAINT bookmonographs_bookseries FOREIGN KEY (bookseriesid)
REFERENCES new.bookseries (bookseriesid) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT bookmonographs_citaviid_unique UNIQUE (citavi_id)
The error PSQL throws:
ERROR: ON CONFLICT DO UPDATE command cannot affect row a second time
HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
********** Error **********
ERROR: ON CONFLICT DO UPDATE command cannot affect row a second time
SQL state: 21000
Hint: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
I don't know what's wrong, or why the hint points to a duplicated constrained value.
The problem is that some entries apparently have multiple authors. The inner join in your SELECT therefore returns several rows for the same entry, and INSERT ... ON CONFLICT DO UPDATE refuses to update the same target row twice in one statement. Since you only use the ReferenceAuthor table for filtering, you can rewrite the query to filter out entries without an author using EXISTS on a correlated subquery, which returns each Reference row at most once. Here's how:
INSERT INTO new.bookmonographs (citavi_id, abstract, createdon, edition, title, year)
SELECT "ID", "Abstract", "CreatedOn"::timestamp, "Edition", "Title", "Year"
FROM old."Reference"
WHERE old."Reference"."ReferenceType" = 'Book'
AND old."Reference"."Year" IS NOT NULL
AND old."Reference"."Title" IS NOT NULL
AND exists(SELECT FROM old."ReferenceAuthor" WHERE old."ReferenceAuthor"."ReferenceID" = old."Reference"."ID")
--Year, Title and Author must be present in the data, otherwise the entry is deemed useless, hence won't be included
ON CONFLICT (citavi_id) DO UPDATE
SET (abstract, createdon, edition, title, year) = (excluded.abstract, excluded.createdon, excluded.edition, excluded.title, excluded.year)
;
Use an explicit INNER JOIN to join the two source tables together:
INSERT INTO new.bookmonographs (citavi_id, abstract, createdon, edition, title, year)
SELECT "ID", "Abstract", "CreatedOn"::timestamp, "Edition", "Title", "Year"
FROM old."Reference"
INNER JOIN old."ReferenceAuthor" -- explicit join
ON old."ReferenceAuthor"."ReferenceID" = old."Reference"."ID" -- ON condition
WHERE old."Reference"."ReferenceType" = 'Book' AND
old."Reference"."Year" IS NOT NULL AND
old."Reference"."Title" IS NOT NULL
ON CONFLICT (citavi_id) DO UPDATE
SET (abstract, createdon, edition, title, year) =
(excluded.abstract, excluded.createdon, excluded.edition, excluded.title,
excluded.year)
There's a great explanation of the issue in postgres' docs (ctrl + f: "Cardinality violation" errors in detail, as there's no direct link).
To quote from the docs:
The idea of raising "cardinality violation" errors is to ensure that any one row is affected no more than once per statement executed. In the lexicon of the SQL standard's discussion of SQL MERGE, the SQL statement is "deterministic". The user ought to be confident that a row will not be affected more than once - if that isn't the case, then it isn't predictable what the final value of a row affected multiple times will be.
To replay their simpler example: on a table named upsert, the query below cannot work, because we could not reliably know whether select val from upsert where key = 1 would return 'Foo' or 'Bar':
INSERT INTO upsert(key, val)
VALUES(1, 'Foo'), (1, 'Bar')
ON CONFLICT (key) DO UPDATE SET val = EXCLUDED.val;
ERROR: ON CONFLICT DO UPDATE command cannot affect row a second time
HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
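One way to satisfy that hint, sketched here with a hypothetical definition of the upsert table, is to collapse the proposed rows to one per key before they reach ON CONFLICT. DISTINCT ON is a PostgreSQL extension, and the ORDER BY decides which row of each group is kept:

```sql
-- Hypothetical table matching the example above (types are assumptions).
CREATE TABLE upsert (key int PRIMARY KEY, val text);

INSERT INTO upsert (key, val)
SELECT DISTINCT ON (key) key, val          -- keep one row per key
FROM (VALUES (1, 'Foo'), (1, 'Bar')) AS src (key, val)
ORDER BY key, val                          -- first row per key in this order wins
ON CONFLICT (key) DO UPDATE SET val = EXCLUDED.val;
```

Because each key is now proposed only once, the statement is deterministic and the cardinality violation cannot occur.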

On Insert: column reference "score" is ambiguous

I have the following command in postgresql:
INSERT INTO word_relations(word1_id, word2_id, score) VALUES($1, $2, $3)
ON CONFLICT (word1_id, word2_id) DO UPDATE SET score = score + $3
I get the following error:
column reference "score" is ambiguous
I thought it was odd as I am only using one table. Any ideas?
On the right side of the = in the set clause, there are two possibilities for score: EXCLUDED.score and word_relations.score. The former is a way of accessing the value being inserted; the latter a way of accessing the value stored in the row.
I would write this as:
ON CONFLICT (word1_id, word2_id) DO
UPDATE SET score = word_relations.score + EXCLUDED.score
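For completeness, a minimal self-contained sketch of the whole statement. The table definition and the literal values are assumptions for illustration, and the alias wr is optional shorthand for the table name:

```sql
-- Hypothetical definition; the conflict target needs a unique constraint
-- or index on (word1_id, word2_id).
CREATE TABLE word_relations (
    word1_id int,
    word2_id int,
    score    int,
    PRIMARY KEY (word1_id, word2_id)
);

-- wr.score is the value already stored in the row; EXCLUDED.score is the
-- value this INSERT proposed. Qualifying both removes the ambiguity.
INSERT INTO word_relations AS wr (word1_id, word2_id, score)
VALUES (1, 2, 5)
ON CONFLICT (word1_id, word2_id) DO UPDATE
SET score = wr.score + EXCLUDED.score;
```

Running the INSERT a second time adds the proposed score to the stored one instead of failing on the primary key.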