Partial unique index not working with ON CONFLICT clause in PostgreSQL

Table Structure:
create table example (a_id integer, b_id integer, c_id integer, flag integer);
Partial Index:
create unique index u_idx on example (a_id, b_id, c_id) where flag = 0;
My Code for upsert:
with a_ins_upd as (
    insert into example (a_id, b_id, c_id, flag)
    select x.a_id, x.b_id, x.c_id, x.flag
    from <input_tableType> x
    on conflict (a_id, b_id, c_id) where flag = 0
    do update
    set
        a_id = excluded.a_id,
        b_id = excluded.b_id,
        c_id = excluded.c_id,
        flag = excluded.flag
    returning *
)
insert into <some_other_table>
Result entries, as of now:
Entry_1 is the data already in the table; Entry_2 is the new input. The new input should update Entry_1 rather than insert a new entry.
Where am I going wrong? I want the second entry to be updated, not inserted.
I tried including the flag as part of the index; records still get inserted. Does a partial index not work with the ON CONFLICT clause?
How do I achieve this? I want to ignore the records with flag = 1 that already exist in the table (be blind to flag = 1) and do the merge operation on the other records (the ones with flag = 0, with the unique constraint remaining on (a_id, b_id, c_id)). I thought of creating a partial constraint/index and using ON CONFLICT, but it seems tricky now.
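For reference, a partial unique index can act as the arbiter for ON CONFLICT when the conflict target repeats the index predicate, but the conflict is only detected when the incoming row itself satisfies that predicate. A minimal standalone sketch (table and values are illustrative, separate from the setup above):
create table demo (a_id integer, b_id integer, c_id integer, flag integer);
create unique index demo_u_idx on demo (a_id, b_id, c_id) where flag = 0;

insert into demo values (1, 2, 3, 0);

-- updates the existing (1, 2, 3, 0) row instead of inserting a duplicate
insert into demo (a_id, b_id, c_id, flag)
values (1, 2, 3, 0)
on conflict (a_id, b_id, c_id) where flag = 0
do update set flag = excluded.flag;

-- a row arriving with flag = 1 never matches the partial index,
-- so it is inserted as a new row rather than triggering the update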

Related

postgresql DO UPDATE ON CONFLICT with multiple constraints [duplicate]

I have two columns in a table, col1 and col2; both are uniquely indexed (col1 is unique and so is col2).
I need to insert into this table, use the ON CONFLICT syntax and update other columns, but I can't use both columns in the conflict_target clause.
It works:
INSERT INTO table
...
ON CONFLICT ( col1 )
DO UPDATE
SET
-- update needed columns here
But how to do this for several columns, something like this:
...
ON CONFLICT ( col1, col2 )
DO UPDATE
SET
....
ON CONFLICT requires a unique index* to do the conflict detection. So you just need to create a unique index on both columns:
t=# create table t (id integer, a text, b text);
CREATE TABLE
t=# create unique index idx_t_id_a on t (id, a);
CREATE INDEX
t=# insert into t values (1, 'a', 'foo');
INSERT 0 1
t=# insert into t values (1, 'a', 'bar') on conflict (id, a) do update set b = 'bar';
INSERT 0 1
t=# select * from t;
id | a | b
----+---+-----
1 | a | bar
* In addition to unique indexes, you can also use exclusion constraints. These are a bit more general than unique constraints. Suppose your table had columns for id and valid_time (and valid_time is a tsrange), and you wanted to allow duplicate ids, but not for overlapping time periods. A unique constraint won't help you, but with an exclusion constraint you can say "exclude new records if their id equals an old id and also their valid_time overlaps its valid_time."
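A hedged sketch of such an exclusion constraint (table name is illustrative; the btree_gist extension is needed so the plain integer id can take part in the GiST index):
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE reservations (
    id         int,
    valid_time tsrange,
    -- reject a new row whose id equals an existing id AND whose
    -- valid_time overlaps that row's valid_time
    EXCLUDE USING gist (id WITH =, valid_time WITH &&)
);
-- Note: exclusion constraints can only be used with ON CONFLICT DO NOTHING,
-- not with ON CONFLICT DO UPDATE.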
A sample table and data
CREATE TABLE dupes(col1 int primary key, col2 int, col3 text,
CONSTRAINT col2_unique UNIQUE (col2)
);
INSERT INTO dupes values(1,1,'a'),(2,2,'b');
Reproducing the problem
INSERT INTO dupes values(3,2,'c')
ON CONFLICT (col1) DO UPDATE SET col3 = 'c', col2 = 2
Let's call this Q1. The result is
ERROR: duplicate key value violates unique constraint "col2_unique"
DETAIL: Key (col2)=(2) already exists.
What the documentation says
conflict_target can perform unique index inference. When performing
inference, it consists of one or more index_column_name columns and/or
index_expression expressions, and an optional index_predicate. All
table_name unique indexes that, without regard to order, contain
exactly the conflict_target-specified columns/expressions are inferred
(chosen) as arbiter indexes. If an index_predicate is specified, it
must, as a further requirement for inference, satisfy arbiter indexes.
This gives the impression that the following query should work, but it does not, because it would actually require a unique index on col1 and col2 together. However, such an index would not guarantee that col1 and col2 are unique individually, which is one of the OP's requirements.
INSERT INTO dupes values(3,2,'c')
ON CONFLICT (col1,col2) DO UPDATE SET col3 = 'c', col2 = 2
Let's call this query Q2. (This fails because there is no unique or exclusion constraint matching the ON CONFLICT specification.)
Why?
PostgreSQL behaves this way because what should happen when a conflict occurs on the second column is not well defined. There are a number of possibilities. For example, in the Q1 query above, should PostgreSQL update col1 when there is a conflict on col2? But what if that leads to another conflict on col1? How is PostgreSQL expected to handle that?
A solution
A solution is to combine ON CONFLICT with old fashioned UPSERT.
CREATE OR REPLACE FUNCTION merge_db(key1 INT, key2 INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- first try to update the key
        UPDATE dupes SET col3 = data WHERE col1 = key1 AND col2 = key2;
        IF found THEN
            RETURN;
        END IF;
        -- not there, so try to insert the key
        -- if someone else inserts the same key concurrently, or key2
        -- already exists in col2,
        -- we could get a unique-key failure
        BEGIN
            INSERT INTO dupes VALUES (key1, key2, data) ON CONFLICT (col1) DO UPDATE SET col3 = data;
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            BEGIN
                INSERT INTO dupes VALUES (key1, key2, data) ON CONFLICT (col2) DO UPDATE SET col3 = data;
                RETURN;
            EXCEPTION WHEN unique_violation THEN
                -- Do nothing, and loop to try the UPDATE again.
            END;
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;
You would need to modify the logic of this stored function so that it updates the columns exactly the way you want it to. Invoke it like
SELECT merge_db(3,2,'c');
SELECT merge_db(1,2,'d');
These days it seems to be impossible. The current ON CONFLICT syntax does not permit repeating the clause, nor is it possible with a CTE: there is no way to break the INSERT apart from the ON CONFLICT in order to add more conflict targets.
If you are using Postgres 9.5, you can use the EXCLUDED pseudo-table.
Example taken from What's new in PostgreSQL 9.5:
INSERT INTO user_logins (username, logins)
VALUES ('Naomi',1),('James',1)
ON CONFLICT (username)
DO UPDATE SET logins = user_logins.logins + EXCLUDED.logins;
Vlad got the right idea.
First you have to create a table unique constraint on the columns col1, col2. Then, once you do that, you can do the following:
INSERT INTO dupes values(3,2,'c')
ON CONFLICT ON CONSTRAINT dupes_pkey
DO UPDATE SET col3 = 'c', col2 = 2
ON CONFLICT ( col1, col2 )
DO UPDATE
SET
works fine (given a unique index or constraint covering both columns, as described above), but you should not update col1 or col2 in the SET section.
Create a constraint (foreign index, for example).
OR/AND
Look at existing constraints (\d in psql).
Use ON CONSTRAINT(constraint_name) in the INSERT clause.
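For example, against the dupes table from above (a small sketch reusing its existing col2_unique constraint, which \d dupes would list):
INSERT INTO dupes VALUES (4, 2, 'd')
ON CONFLICT ON CONSTRAINT col2_unique
DO UPDATE SET col3 = EXCLUDED.col3;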
You can typically (I would think) generate a statement with only one on conflict that specifies the one and only constraint that is of relevance, for the thing you are inserting.
Because typically, only one constraint is the "relevant" one, at a time. (If many, then I'm wondering if something is weird / oddly-designed, hmm.)
Example:
(License: Not CC0, only CC-By)
// there're these unique constraints:
// unique (site_id, people_id, page_id)
// unique (site_id, people_id, pages_in_whole_site)
// unique (site_id, people_id, pages_in_category_id)
// and only *one* of page-id, category-id, whole-site-true/false
// can be specified. So only one constraint is "active", at a time.
val thingColumnName = thingColumnName(notfificationPreference)
val insertStatement = s"""
insert into page_notf_prefs (
site_id,
people_id,
notf_level,
page_id,
pages_in_whole_site,
pages_in_category_id)
values (?, ?, ?, ?, ?, ?)
-- There can be only one on-conflict clause.
on conflict (site_id, people_id, $thingColumnName) <—— look
do update set
notf_level = excluded.notf_level
"""
val values = List(
siteId.asAnyRef,
notfPref.peopleId.asAnyRef,
notfPref.notfLevel.toInt.asAnyRef,
// Only one of these is non-null:
notfPref.pageId.orNullVarchar,
if (notfPref.wholeSite) true.asAnyRef else NullBoolean,
notfPref.pagesInCategoryId.orNullInt)
runUpdateSingleRow(insertStatement, values)
And:
private def thingColumnName(notfPref: PageNotfPref): String =
if (notfPref.pageId.isDefined)
"page_id"
else if (notfPref.pagesInCategoryId.isDefined)
"pages_in_category_id"
else if (notfPref.wholeSite)
"pages_in_whole_site"
else
die("TyE2ABK057")
The on conflict clause is dynamically generated, depending on what I'm trying to do. If I'm inserting a notification preference for a page, then there can be a unique conflict on the site_id, people_id, page_id constraint. And if I'm configuring notification prefs for a category, then instead I know that the constraint that can get violated is site_id, people_id, category_id.
So I can (and fairly likely you can too, in your case?) generate the correct on conflict (... columns), because I know what I want to do, and then I know which single one of the many unique constraints is the one that can get violated.
It's kind of hacky, but I solved this by concatenating the two values from col1 and col2 into a new column, col3 (kind of like an index of the two), and comparing against that. This only works if you need it to match BOTH col1 and col2.
INSERT INTO table
...
ON CONFLICT ( col3 )
DO UPDATE
SET
-- update needed columns here
Where col3 = the concatenation of the values from col1 and col2.
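A hedged sketch of that idea (table and column names are illustrative; a stored generated column needs PostgreSQL 12 or later, otherwise col3 has to be maintained by the application or a trigger):
CREATE TABLE pairs (
    col1    text NOT NULL,
    col2    text NOT NULL,
    -- concatenation of col1 and col2, kept unique
    col3    text GENERATED ALWAYS AS (col1 || '|' || col2) STORED UNIQUE,
    payload text
);

INSERT INTO pairs (col1, col2, payload)
VALUES ('a', 'b', 'new value')
ON CONFLICT (col3)
DO UPDATE SET payload = EXCLUDED.payload;
Note that the concatenation should use a separator that cannot appear in the values themselves, otherwise distinct pairs could collide.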
I know I am late to the party, but for people looking for answers I found this:
INSERT INTO tbl_Employee
VALUES (6,'Noor')
ON CONFLICT (EmpID,EmpName)
DO NOTHING;
ON CONFLICT is a very clumsy solution. Run
UPDATE dupes SET key1=$1, key2=$2 WHERE key3=$3
and if rowcount = 0
INSERT INTO dupes (key1, key2, key3) VALUES ($1,$2,$3);
This works on Oracle, Postgres and every other database.
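A concrete sketch of that update-first pattern against the dupes table from earlier (wrapped in a PL/pgSQL DO block; without a retry loop or stricter isolation it is not safe against concurrent inserts):
DO $$
BEGIN
    -- try the update first
    UPDATE dupes SET col3 = 'c' WHERE col1 = 3 AND col2 = 2;
    IF NOT FOUND THEN
        -- nothing was updated, so insert instead
        INSERT INTO dupes (col1, col2, col3) VALUES (3, 2, 'c');
    END IF;
END
$$;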

Sqlite trigger oddly fails on unique constraint

This is SQLite version 3.16.0 2016-11-04
Here's my test schema:
PRAGMA foreign_keys = ON;
CREATE TABLE A (a_id integer primary key, x int);
CREATE TABLE B (b_id integer primary key, a_id integer not null references A(a_id) ON DELETE RESTRICT ON UPDATE CASCADE, descr text);
CREATE TABLE C (id integer primary key, dt DATETIME DEFAULT CURRENT_TIMESTAMP);
CREATE TRIGGER Tb after update on B begin
replace into C (id) select NEW.b_id;
end;
Here's some test data (as per .dump):
INSERT INTO "A" VALUES(1, 5000);
INSERT INTO "B" VALUES(1, 1, 'none');
Now, if I manually update a row in table B, like so:
UPDATE B SET descr = "test1" WHERE b_id = 1;
I can see a new row (which gets updated every time I touch B) in table C:
1|2017-04-17 21:59:42
I can also manually "touch" C the same way it's done in the trigger:
replace into C (id) select 1;
As expected, the dt column gets updated. All good so far.
But suddenly, if I change a_id, referenced from B:
UPDATE A SET a_id = 2 WHERE a_id = 1;
I get Error: UNIQUE constraint failed: C.id
In my understanding, the foreign key from B.a_id to A.a_id updates B's row. That leads to the trigger trying to touch C, which then for some reason fails, even though it worked perfectly fine in the previous scenarios.
Why does that happen and how can I fix it?
You are using the REPLACE command, which
is an alias for the "INSERT OR REPLACE" variant of the INSERT command.
The trigger documentation says:
An ON CONFLICT clause may be specified as part of an UPDATE or INSERT action within the body of the trigger. However if an ON CONFLICT clause is specified as part of the statement causing the trigger to fire, then conflict handling policy of the outer statement is used instead.
So when the outer command is an UPDATE using the default ABORT conflict resolution, your REPLACE also becomes an ABORT, and the insert fails on the unique constraint instead of replacing the row.
As a workaround, replace the REPLACE with the appropriate DELETE and INSERT commands.
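A minimal sketch of that rewrite for the trigger above (the DELETE removes any existing row first, so the plain INSERT never conflicts regardless of the outer statement's conflict policy):
DROP TRIGGER Tb;
CREATE TRIGGER Tb AFTER UPDATE ON B BEGIN
    DELETE FROM C WHERE id = NEW.b_id;
    INSERT INTO C (id) VALUES (NEW.b_id);
END;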

Create unique constraint initially disabled

This is my table :
CREATE TABLE [dbo].[TestTable]
(
[Name1] varchar(50) COLLATE French_CI_AS NOT NULL,
[Name2] varchar(255) COLLATE French_CI_AS NULL,
CONSTRAINT [TestTable_uniqueName1] UNIQUE ([Name1]),
CONSTRAINT [TestTable_uniqueName1Name2] UNIQUE ([Name1], [Name2])
)
ALTER TABLE [dbo].[TestTable]
ADD CONSTRAINT [TestTable_uniqueName1]
UNIQUE NONCLUSTERED ([Name1])
ALTER TABLE [dbo].[TestTable]
ADD CONSTRAINT [TestTable_uniqueName1Name2]
UNIQUE NONCLUSTERED ([Name1], [Name2])
GO
ALTER INDEX [TestTable_uniqueName1]
ON [dbo].[TestTable]
DISABLE
GO
My idea is to enable/disable one or the other unique constraint depending on the customer application. This way, I can catch the thrown exception in my C# code and display a specific error message in the GUI.
Now, my problem is to alter the collation of columns Name1 & Name2; I need to make them case sensitive (French_CS_AS). To alter these fields, I have to drop the two constraints and recreate them. Given the schema above, I cannot create an enabled constraint and then disable it, because for some customers I have duplicate keys for one or the other constraint.
For my update script, my idea number 1 was
Save the name of enabled constraints in a temp table
Drop the constraints
Alter columns
Create DISABLED unique constraints
Enable specific constraints according to the values saved in point 1.
My problem is with point 4: I can't find a way to create a disabled unique constraint with an ALTER TABLE statement. Is it possible to create it directly in the sys.indexes table?
My idea number 2 was
Rename TestTable to TestTableCopy
Recreate TestTable with the new fields collation, and otherwise the same schema (indexes, FK, triggers, ...)
Disable specifical unique contraints in TestTable
Migrate data from TestTableCopy to TestTable
Drop TestTableCopy
In this way, my fear is losing some links with other tables/dependencies, because it is a central table in my database.
Is there any other way to achieve my goal?
If necessary, I can use unique indexes instead of unique constraints.
It looks like it is impossible to create a unique index on a column that already has duplicate values.
So, rather than having a disabled unique index, either:
not have an index at all (which is the same as having a disabled index from the query processor's point of view),
or create a non-unique index.
For those instances where your client has unique data, create a unique index. For those instances where your client has non-unique data, create a non-unique index.
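A short sketch of that per-customer setup on the table above (the index name is illustrative):
-- customer whose data is unique on Name1:
CREATE UNIQUE NONCLUSTERED INDEX IX_TestTable_Name1
    ON [dbo].[TestTable] ([Name1]);

-- customer with duplicates on Name1:
-- CREATE NONCLUSTERED INDEX IX_TestTable_Name1
--     ON [dbo].[TestTable] ([Name1]);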
CREATE PROCEDURE [dbo].[spUsers_AddUsers]
    @Name1 varchar(50),
    @Name2 varchar(50),
    @Unique bit
AS
declare @err int
begin tran
if @Unique = 1 begin
    -- (Name1, Name2) must be unique together
    if not exists (SELECT * FROM Users WHERE Name1 = @Name1 and Name2 = @Name2)
    begin
        INSERT INTO Users (Name1, Name2)
        VALUES (@Name1, @Name2)
        set @err = @@ERROR
    end else
    begin
        UPDATE Users
        set Name1 = @Name1,
            Name2 = @Name2
        where Name1 = @Name1 and Name2 = @Name2
        set @err = @@ERROR
    end
end else begin
    -- only Name1 must be unique
    if not exists (SELECT * FROM Users WHERE Name1 = @Name1)
    begin
        INSERT INTO Users (Name1, Name2)
        VALUES (@Name1, @Name2)
        set @err = @@ERROR
    end else
    begin
        UPDATE Users
        set Name1 = @Name1,
            Name2 = @Name2
        where Name1 = @Name1
        set @err = @@ERROR
    end
end
if @err = 0 commit tran
else rollback tran
So first you check whether you need Name1 and Name2 to be unique together, or just Name1. Then you do an insert/update based on which constraint you have.

Foreign key constraints involving multiple tables

I have the following scenario in a Postgres 9.3 database:
Tables B and C reference Table A.
Table C has an optional field that references table B.
I would like to ensure that for each row of table C that references table B, c.b.a = c.a. That is, if C has a reference to B, both rows should point at the same row in table A.
I could refactor table C so that if c.b is specified, c.a is null but that would make queries joining tables A and C awkward.
I might also be able to make table B's primary key include its reference to table A and then make table C's foreign key to table B include table C's reference to table A but I think this adjustment would be too awkward to justify the benefit.
I think this can be done with a trigger that runs before insert/update on table C and rejects operations that violate the specified constraint.
Is there a better way to enforce data integrity in this situation?
There is a very simple, bullet-proof solution. Works for Postgres 9.3 - when the original question was asked. Works for the current Postgres 13 - when the question in the bounty was added:
Would like information on if this is possible to achieve without database triggers
FOREIGN KEY constraints can span multiple columns. Just include the ID of table A in the FK constraint from table C to table B. This enforces that linked rows in B and C always point to the same row in A. Like:
CREATE TABLE a (
a_id int PRIMARY KEY
);
CREATE TABLE b (
b_id int PRIMARY KEY
, a_id int NOT NULL REFERENCES a
, UNIQUE (a_id, b_id) -- redundant, but required for FK
);
CREATE TABLE c (
c_id int PRIMARY KEY
, a_id int NOT NULL REFERENCES a
, b_id int
, CONSTRAINT fk_simple_and_safe_solution
FOREIGN KEY (a_id, b_id) REFERENCES b(a_id, b_id) -- THIS !
);
Minimal sample data:
INSERT INTO a(a_id) VALUES
(1)
, (2);
INSERT INTO b(b_id, a_id) VALUES
(1, 1)
, (2, 2);
INSERT INTO c(c_id, a_id, b_id) VALUES
(1, 1, NULL) -- allowed
, (2, 2, 2); -- allowed
Disallowed as requested:
INSERT INTO c(c_id, a_id, b_id) VALUES (3,2,1);
ERROR: insert or update on table "c" violates foreign key constraint "fk_simple_and_safe_solution"
DETAIL: Key (a_id, b_id)=(2, 1) is not present in table "b".
db<>fiddle here
The default MATCH SIMPLE behavior of FK constraints works like this (quoting the manual):
MATCH SIMPLE allows any of the foreign key columns to be null; if any of them are null, the row is not required to have a match in the referenced table.
So NULL values in c(b_id) are still allowed (as requested: "optional field"). The FK constraint is "disabled" for this special case.
We need the logically redundant UNIQUE constraint on b(a_id, b_id) to allow the FK reference to it. But by defining it on (a_id, b_id) instead of (b_id, a_id), it is also useful in its own right, providing a useful index on b(a_id) that supports the other FK constraint, among other things. See:
Is a composite index also good for queries on the first field?
(An additional index on c(a_id) is typically useful accordingly.)
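For completeness, that supporting index is just (index name is illustrative):
CREATE INDEX c_a_id_idx ON c (a_id);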
Further reading:
Differences between MATCH FULL, MATCH SIMPLE, and MATCH PARTIAL?
Enforcing constraints “two tables away”
I ended up creating a trigger as follows:
create function "check C.A = C.B.A"()
returns trigger
as $$
begin
if NEW.b is not null then
if NEW.a != (select a from B where id = NEW.b) then
raise exception 'a != b.a';
end if;
end if;
return NEW;
end;
$$
language plpgsql;
create trigger "ensure C.A = C.B.A"
before insert or update on C
for each row
execute procedure "check C.A = C.B.A"();
Would like information on if this is possible to achieve without database triggers
Yes, it is possible. The mechanism is called ASSERTION and it is defined in the SQL-92 standard (though it is not implemented by any major RDBMS).
In short, it allows you to create multi-row or multi-table check constraints.
As for PostgreSQL, it can be emulated by using a view WITH CHECK OPTION and performing the operations on the view instead of the base table.
WITH CHECK OPTION
This option controls the behavior of automatically updatable views. When this option is specified, INSERT and UPDATE commands on the view will be checked to ensure that new rows satisfy the view-defining condition (that is, the new rows are checked to ensure that they are visible through the view). If they are not, the update will be rejected.
Example:
CREATE TABLE a(id INT PRIMARY KEY, cola VARCHAR(10));
CREATE TABLE b(id INT PRIMARY KEY, colb VARCHAR(10), a_id INT REFERENCES a(id) NOT NULL);
CREATE TABLE c(id INT PRIMARY KEY, colc VARCHAR(10),
a_id INT REFERENCES a(id) NOT NULL,
b_id INT REFERENCES b(id));
Sample inserts:
INSERT INTO a(id, cola) VALUES (1, 'A');
INSERT INTO a(id, cola) VALUES (2, 'A2');
INSERT INTO b(id, colb, a_id) VALUES (12, 'B', 1);
INSERT INTO c(id, colc, a_id) VALUES (15, 'C', 2);
Violating the condition (connecting C to a B row with a different a_id):
UPDATE c SET b_id = 12 WHERE id = 15;
-- no issues whatsoever
Creating view:
CREATE VIEW view_c
AS
SELECT *
FROM c
WHERE NOT EXISTS(SELECT 1
FROM b
WHERE c.b_id = b.id
AND c.a_id != b.a_id) -- here is the clue, we want a_id to be the same
WITH CHECK OPTION ;
Trying the same update again, this time through the view (error):
UPDATE view_c SET b_id = 12 WHERE id = 15;
--ERROR: new row violates check option for view "view_c"
--DETAIL: Failing row contains (15, C, 2, 12).
Trying brand new inserts with incorrect data (also errors):
INSERT INTO b(id, colb, a_id) VALUES (20, 'B2', 2);
INSERT INTO view_c(id, colc, a_id, b_id) VALUES (30, 'C2', 1, 20);
--ERROR: new row violates check option for view "view_c"
--DETAIL: Failing row contains (30, C2, 1, 20)
db<>fiddle demo

SELECT or INSERT a row in one command

I'm using PostgreSQL 9.0 and I have a table with just an artificial key (auto-incrementing sequence) and another unique key. (Yes, there is a reason for this table. :)) I want to look up an ID by the other key or, if it doesn't exist, insert it:
SELECT id
FROM mytable
WHERE other_key = 'SOMETHING'
Then, if no match:
INSERT INTO mytable (other_key)
VALUES ('SOMETHING')
RETURNING id
The question: is it possible to save a round-trip to the DB by doing both of these in one statement? I can insert the row if it doesn't exist like this:
INSERT INTO mytable (other_key)
SELECT 'SOMETHING'
WHERE NOT EXISTS (SELECT * FROM mytable WHERE other_key = 'SOMETHING')
RETURNING id
... but that doesn't give the ID of an existing row. Any ideas? There is a unique constraint on other_key, if that helps.
Have you tried to union it?
Edit - this requires Postgres 9.1:
create table mytable (id serial primary key, other_key varchar not null unique);
WITH new_row AS (
INSERT INTO mytable (other_key)
SELECT 'SOMETHING'
WHERE NOT EXISTS (SELECT * FROM mytable WHERE other_key = 'SOMETHING')
RETURNING *
)
SELECT * FROM new_row
UNION
SELECT * FROM mytable WHERE other_key = 'SOMETHING';
results in:
id | other_key
----+-----------
1 | SOMETHING
(1 row)
No, there is no special SQL syntax that allows you to do select-or-insert. You can do what Ilia mentions and create a sproc, which means it will not do a round trip from the client to the server, but it will still result in two queries (three actually, if you count the call to the sproc itself).
Using 9.5, I successfully tried this, based on Denis de Bernardy's answer:
only 1 parameter
no union
no stored procedure
atomic, thus no concurrency problems (I think...)
The Query:
WITH neworexisting AS (
INSERT INTO mytable(other_key) VALUES('hello 2')
ON CONFLICT(other_key) DO UPDATE SET existed=true -- need some update to return sth
RETURNING *
)
SELECT * FROM neworexisting
first call:
id|other_key|created |existed|
--|---------|-------------------|-------|
6|hello 1 |2019-09-11 11:39:29|false |
second call:
id|other_key|created |existed|
--|---------|-------------------|-------|
6|hello 1 |2019-09-11 11:39:29|true |
First create your table ;-)
CREATE TABLE mytable (
id serial NOT NULL,
other_key text NOT NULL,
created timestamptz NOT NULL DEFAULT now(),
existed bool NOT NULL DEFAULT false,
CONSTRAINT mytable_pk PRIMARY KEY (id),
CONSTRAINT mytable_uniq UNIQUE (other_key) --needed for on conflict
);
You can use a stored procedure:
IF (SELECT id FROM mytable WHERE other_key = 'SOMETHING' LIMIT 1) IS NULL THEN
    INSERT INTO mytable (other_key) VALUES ('SOMETHING');
END IF;
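A fuller sketch of that idea as a PL/pgSQL function (the function name is illustrative; without ON CONFLICT or a retry loop it can still lose a race against a concurrent insert):
CREATE OR REPLACE FUNCTION get_or_create(p_key varchar)
RETURNS integer AS $$
DECLARE
    v_id integer;
BEGIN
    SELECT id INTO v_id FROM mytable WHERE other_key = p_key;
    IF v_id IS NULL THEN
        INSERT INTO mytable (other_key) VALUES (p_key) RETURNING id INTO v_id;
    END IF;
    RETURN v_id;
END;
$$ LANGUAGE plpgsql;

-- SELECT get_or_create('SOMETHING');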
I have an alternative to Denis' answer that I think is less database-intensive, although a bit more complex:
create table mytable (id serial primary key, other_key varchar not null unique);
WITH table_sel AS (
SELECT id
FROM mytable
WHERE other_key = 'test'
UNION
SELECT NULL AS id
ORDER BY id NULLS LAST
LIMIT 1
), table_ins AS (
INSERT INTO mytable (id, other_key)
SELECT
COALESCE(id, NEXTVAL('mytable_id_seq'::REGCLASS)),
'test'
FROM table_sel
ON CONFLICT (id) DO NOTHING
RETURNING id
)
SELECT * FROM table_ins
UNION ALL
SELECT * FROM table_sel
WHERE id IS NOT NULL;
In the table_sel CTE I'm looking for the right row. If I don't find it, I make sure that table_sel still returns at least one row, via a union with SELECT NULL.
In the table_ins CTE I try to insert the same row I was looking for earlier. COALESCE(id, NEXTVAL('mytable_id_seq'::REGCLASS)) says: if id is already defined, use it; if id is null, advance the sequence and use the new value to insert a row. The ON CONFLICT clause ensures
that if the id is already in mytable, nothing is inserted.
At the end I put everything together with a UNION between table_ins and table_sel, so that I'm sure to get my id value and both CTEs are executed.
This query needs to search for the other_key value only once, and it is a "search for this value" rather than a "check that this value does not exist in the table", which is much heavier; in Denis' alternative, other_key is used in both types of searches. In my query the "check that a value does not exist" is done only on id, which is an integer primary key and therefore, by construction, fast.
Minor tweak a decade late to Denis's excellent answer:
-- Create the table with a unique constraint
CREATE TABLE mytable (
id serial PRIMARY KEY
, other_key varchar NOT NULL UNIQUE
);
WITH new_row AS (
-- Only insert when we don't find anything, avoiding a table lock if
-- possible.
INSERT INTO mytable ( other_key )
SELECT 'SOMETHING'
WHERE NOT EXISTS (
SELECT *
FROM mytable
WHERE other_key = 'SOMETHING'
)
RETURNING *
)
(
-- This comes first in the UNION ALL since it'll almost certainly be
-- in the query cache. Marginally slower for the insert case, but also
-- marginally faster for the much more common read-only case.
SELECT *
FROM mytable
WHERE other_key = 'SOMETHING'
-- Don't check for duplicates to be removed
UNION ALL
-- If we reach this point in iteration, we needed to do the INSERT and
-- lock after all.
SELECT *
FROM new_row
) LIMIT 1 -- Just return whatever comes first in the results and allow
-- the query engine to cut processing short for the INSERT
-- calculation.
;
The UNION ALL tells the planner it doesn't have to collect results for de-duplication. The LIMIT 1 at the end allows the planner to short-circuit further processing/iteration once it knows there's an answer available.
NOTE: There is a race condition present here and in the original answer. If the entry does not already exist, the INSERT will fail with a unique constraint violation. The error can be suppressed with ON CONFLICT DO NOTHING, but the query will return an empty set instead of the new row. This is a difficult problem because getting that info from another transaction would violate the I in ACID.