Coalesce sentence containing an insert into clause fails in PostgreSQL - postgresql

This is my trivial test table,
create table test (
id int not null generated always as identity,
first_name. varchar,
primary key (id),
unique(first_name)
);
As an alternative to insert-into-on-conflict sentences, I was trying to use the coalesce laziness to execute a select whenever possible or an insert, only when select fails to find a row.
coalesce laziness is described in documentation. See https://www.postgresql.org/docs/current/functions-conditional.html
Like a CASE expression, COALESCE only evaluates the arguments that are needed to determine the result; that is, arguments to the right of the first non-null argument are not evaluated. This SQL-standard function provides capabilities similar to NVL and IFNULL, which are used in some other database systems.
I also want to get back the id value of the row, having being inserted or not.
I started with:
select coalesce (
(select id from test where first_name='carlos'),
(insert into test(first_name) values('carlos') returning id)
);
but an error syntax error at or near "into" was found.
See it on this other DBFiddle
https://www.db-fiddle.com/f/t7TVkoLTtWU17iaTAbEhDe/0
Then I tried:
select coalesce (
(select id from test where first_name='carlos'),
(with r as (
insert into test(first_name) values('carlos') returning id
) select id from r
)
);
Here I am getting a WITH clause containing a data-modifying statement must be at the top level error that I don't understand, as insert is the first and only sentence within the with.
I am testing this with DBFiddle and PostgreSQL 13. The source code can be found at
https://www.db-fiddle.com/f/hp8T1iQ8eS4wozDCBhBXDw/5

Different method: chained CTEs:
CREATE TABLE test
( id INTEGER NOT NULL GENERATED ALWAYS AS IDENTITY PRIMARY KEY
, first_name VARCHAR UNIQUE
);
WITH sel AS (
SELECT id FROM test WHERE first_name = 'carlos'
)
, ins AS (
INSERT INTO test(first_name)
SELECT 'carlos'
WHERE NOT EXISTS (SELECT 1 FROM test WHERE first_name = 'carlos')
RETURNING id
)
, omg AS (
SELECT id FROM sel
UNION ALL
SELECT id FROM ins
)
SELECT id
FROM omg
;

It seems that the returning value from the insert into clause is not equivalent in nature to the scalar query of a select clause. So I try encapsulating the insert into into an SQL function and it worked.
create or replace function insert_first_name(
_first_name varchar
) returns int
language sql as $$
insert into test (first_name)
values (_first_name)
returning id;
$$;
select coalesce (
(select id from test where first_name='carlos'),
(select insert_first_name('carlos'))
);
See it on https://www.db-fiddle.com/f/73rVXgqGfrG4VmjrAk6Z3i/2

This is a refinement on #wildplasser accepted answer. it avoids comparing first_name twice and uses coalesce instead of union all. Kind of an selsert in just one sentence.
with sel as (
select id from test where first_name = 'carlos'
)
, ins as (
insert into test(first_name)
select 'carlos'
where (select id from sel) is null
returning id
)
select coalesce (
(select id from sel),
(select id from ins)
);
See it at https://www.db-fiddle.com/f/goRh4TyAebTkEZFHk6WbtK/6

Related

Postgres Crosstab query with CTE (with clause)

Recently started working on Postgres and need to pivot data.
I wrote the following query:
select *
from crosstab (
$$
with tmp_kv as (
select distinct pat_id
,col.name as key, replace(replace(replace(value, '[',''), ']', ''),'"','') as value
from (
select p.Id as pat_id, nullif(kv.key,'undefined')::int as key, trim(kv.value::text,'"') as value
from pat_table p
left join e_table e on e.pat_id = p.id and e.id is null
,jsonb_each_text(p.data) as kv
) t
left join lateral (
select name::text as name from public.config_fields fld
where id = t.key
) col on true
)
select pat_id, key, value
from tmp_kv
where nullif(trim(key),'') is not null
order by pat_id, key
$$,$$
select distinct key from tmp_kv -- (Get error "relation "tmp_kv" does not exist" )
where nullif(trim(key),'') is not null
order by 1
$$
) as (
pat_id bigint
...
...
);
Query works if I take the WITH clause out into temporary table. But will be deploying it to production with read replicas, so need it to be working with a CTE. Is there a way?
The two queries passed as strings to the crosstab() function are separate queries.
A CTE can only be attached to a single query.
What you ask for is strictly impossible.
Since you have to spell out the (static) return type for crosstab() anyway, and the result of the query in the 2nd parameter has to match that, it's pointless to use a query with a dynamic result as 2nd parameter to begin with.

PostgreSQL -- JOIN UNNEST output with CTE INSERT ID -- INSERT many to many

In a PostgreSQL function, is it possible to join the result of UNNEST, which is an integer array from function input, with an ID returned from a CTE INSERT?
I have PostgreSQL tables like:
CREATE TABLE public.message (
id SERIAL PRIMARY KEY,
content TEXT
);
CREATE TABLE public.message_tag (
id SERIAL PRIMARY KEY,
message_id INTEGER NOT NULL CONSTRAINT message_tag_message_id_fkey REFERENCES public.message(id) ON DELETE CASCADE,
tag_id INTEGER NOT NULL CONSTRAINT message_tag_tag_id_fkey REFERENCES public.tag(id) ON DELETE CASCADE
);
I want to create a PostgreSQL function which takes input of content and an array of tag_id. This is for graphile. I want to do it all in one function, so I get a mutation.
Here's what I got so far. I don't know how to join an UNNEST across an id returned from a CTE.
CREATE FUNCTION public.create_message(content text, tags Int[])
RETURNS public.message
AS $$
-- insert to get primary key of message, for many to many message_id
WITH moved_rows AS (
INSERT INTO public.message (content)
RETURNING *;
)
-- many to many relation
INSERT INTO public.message_tag
SELECT moved_rows.id as message_id, tagInput.tag_id FROM moved_rows, UNNEST(tags) as tagInput;
RETURNING *
$$ LANGUAGE sql VOLATILE STRICT;
You're not that far from your goal:
the semicolon placement in the CTE is wrong
the first INSERT statement lacks a SELECT or VALUES clause to specify what should be inserted
the INSERT into tag_message should specify the columns in which to insert (especially if you have that unnecessary serial id)
you specified a relation alias for the UNNEST call already, but none for the column tag_id
your function was RETURNING a set of message_tag rows but was specified to return a single message row
To fix these:
CREATE FUNCTION public.create_message(content text, tags Int[])
RETURNS public.message
AS $$
-- insert to get primary key of message, for many to many message_id
WITH moved_rows AS (
INSERT INTO public.message (content)
VALUES ($1)
RETURNING *
),
-- many to many relation
_ AS (
INSERT INTO public.message_tag (message_id, tag_id)
SELECT moved_rows.id, tagInput.tag_id
FROM moved_rows, UNNEST($2) as tagInput(tag_id)
)
TABLE moved_rows;
$$ LANGUAGE sql VOLATILE STRICT;
(Online demo)

Avoid putting PostgreSQL function result into one field

The end result of what I am after is a query that calls a function and that function returns a set of records that are in their own separate fields. I can do this but the results of the function are all in one field.
ie: http://i.stack.imgur.com/ETLCL.png and the results I am after are: http://i.stack.imgur.com/wqRQ9.png
Here's the code to create the table
CREATE TABLE tbl_1_hm
(
tbl_1_hm_id bigserial NOT NULL,
tbl_1_hm_f1 VARCHAR (250),
tbl_1_hm_f2 INTEGER,
CONSTRAINT tbl_1_hm PRIMARY KEY (tbl_1_hm_id)
)
-- do that for a few times to get some data
INSERT INTO tbl_1_hm (tbl_1_hm_f1, tbl_1_hm_f2)
VALUES ('hello', 1);
CREATE OR REPLACE FUNCTION proc_1_hm(id BIGINT)
RETURNS TABLE(tbl_1_hm_f1 VARCHAR (250), tbl_1_hm_f2 int AS $$
SELECT tbl_1_hm_f1, tbl_1_hm_f2
FROM tbl_1_hm
WHERE tbl_1_hm_id = id
$$ LANGUAGE SQL;
--And here is the current query I am running for my results:
SELECT t1.tbl_1_hm_id, proc_1_hm(t1.tbl_1_hm_id) AS t3
FROM tbl_1_hm AS t1
Thanks for having a read. Please if you want to haggle about the semantics of what I am doing by hitting the same table twice or my naming convention --> this is a simplified test.
When a function returns a set of records, you should treat it as a table source:
SELECT t1.tbl_1_hm_id, t3.*
FROM tbl_1_hm AS t1, proc_1_hm(t1.tbl_1_hm_id) AS t3;
Note that functions are implicitly using a LATERAL join (scroll down to sub-sections 4 and 5) so you can use fields from tables listed previously without having to specify an explicit JOIN condition.

Get row to swap tables on a certain condition

I currently have a parent table:
CREATE TABLE members (
member_id SERIAL NOT NULL, UNIQUE, PRIMARY KEY
first_name varchar(20)
last_name varchar(20)
address address (composite type)
contact_numbers varchar(11)[3]
date_joined date
type varchar(5)
);
and two related tables:
CREATE TABLE basic_member (
activities varchar[3])
INHERITS (members)
);
CREATE TABLE full_member (
activities varchar[])
INHERITS (members)
);
If the type is full the details are entered to the full_member table or if type is basic into the basic_member table. What I want is that if I run an update and change the type to basic or full the tuple goes into the corresponding table.
I was wondering if I could do this with a rule like:
CREATE RULE tuple_swap_full
AS ON UPDATE TO full_member
WHERE new.type = 'basic'
INSERT INTO basic_member VALUES (old.member_id, old.first_name, old.last_name,
old.address, old.contact_numbers, old.date_joined, new.type, old.activities);
... then delete the record from the full_member
Just wondering if my rule is anywhere near or if there is a better way.
You don't need
member_id SERIAL NOT NULL, UNIQUE, PRIMARY KEY
A PRIMARY KEY implies UNIQUE NOT NULL automatically:
member_id SERIAL PRIMARY KEY
I wouldn't use hard coded max length of varchar(20). Just use text and add a check constraint if you really must enforce a maximum length. Easier to change around.
Syntax for INHERITS is mangled. The key word goes outside the parens around columns.
CREATE TABLE full_member (
activities text[]
) INHERITS (members);
Table names are inconsistent (members <-> member). I use the singular form everywhere in my test case.
Finally, I would not use a RULE for the task. A trigger AFTER UPDATE seems preferable.
Consider the following
Test case:
Tables:
CREATE SCHEMA x; -- I put everything in a test schema named "x".
-- DROP TABLE x.members CASCADE;
CREATE TABLE x.member (
member_id SERIAL PRIMARY KEY
,first_name text
-- more columns ...
,type text);
CREATE TABLE x.basic_member (
activities text[3]
) INHERITS (x.member);
CREATE TABLE x.full_member (
activities text[]
) INHERITS (x.member);
Trigger function:
Data-modifying CTEs (WITH x AS ( DELETE ..) are the best tool for the purpose. Requires PostgreSQL 9.1 or later.
For older versions, first INSERT then DELETE.
CREATE OR REPLACE FUNCTION x.trg_move_member()
RETURNS trigger AS
$BODY$
BEGIN
CASE NEW.type
WHEN 'basic' THEN
WITH x AS (
DELETE FROM x.member
WHERE member_id = NEW.member_id
RETURNING *
)
INSERT INTO x.basic_member (member_id, first_name, type) -- more columns
SELECT member_id, first_name, type -- more columns
FROM x;
WHEN 'full' THEN
WITH x AS (
DELETE FROM x.member
WHERE member_id = NEW.member_id
RETURNING *
)
INSERT INTO x.full_member (member_id, first_name, type) -- more columns
SELECT member_id, first_name, type -- more columns
FROM x;
END CASE;
RETURN NULL;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
Trigger:
Note that it is an AFTER trigger and has a WHEN condition.
WHEN condition requires PostgreSQL 9.0 or later. For earlier versions, you can just leave it away, the CASE statement in the trigger itself takes care of it.
CREATE TRIGGER up_aft
AFTER UPDATE
ON x.member
FOR EACH ROW
WHEN (NEW.type IN ('basic ','full')) -- OLD.type cannot be IN ('basic ','full')
EXECUTE PROCEDURE x.trg_move_member();
Test:
INSERT INTO x.member (first_name, type) VALUES ('peter', NULL);
UPDATE x.member SET type = 'full' WHERE first_name = 'peter';
SELECT * FROM ONLY x.member;
SELECT * FROM x.basic_member;
SELECT * FROM x.full_member;

SELECT or INSERT a row in one command

I'm using PostgreSQL 9.0 and I have a table with just an artificial key (auto-incrementing sequence) and another unique key. (Yes, there is a reason for this table. :)) I want to look up an ID by the other key or, if it doesn't exist, insert it:
SELECT id
FROM mytable
WHERE other_key = 'SOMETHING'
Then, if no match:
INSERT INTO mytable (other_key)
VALUES ('SOMETHING')
RETURNING id
The question: is it possible to save a round-trip to the DB by doing both of these in one statement? I can insert the row if it doesn't exist like this:
INSERT INTO mytable (other_key)
SELECT 'SOMETHING'
WHERE NOT EXISTS (SELECT * FROM mytable WHERE other_key = 'SOMETHING')
RETURNING id
... but that doesn't give the ID of an existing row. Any ideas? There is a unique constraint on other_key, if that helps.
Have you tried to union it?
Edit - this requires Postgres 9.1:
create table mytable (id serial primary key, other_key varchar not null unique);
WITH new_row AS (
INSERT INTO mytable (other_key)
SELECT 'SOMETHING'
WHERE NOT EXISTS (SELECT * FROM mytable WHERE other_key = 'SOMETHING')
RETURNING *
)
SELECT * FROM new_row
UNION
SELECT * FROM mytable WHERE other_key = 'SOMETHING';
results in:
id | other_key
----+-----------
1 | SOMETHING
(1 row)
No, there is no special SQL syntax that allows you to do select or insert. You can do what Ilia mentions and create a sproc, which means it will not do a round trip fromt he client to server, but it will still result in two queries (three actually, if you count the sproc itself).
using 9.5 i successfully tried this
based on Denis de Bernardy's answer
only 1 parameter
no union
no stored procedure
atomic, thus no concurrency problems (i think...)
The Query:
WITH neworexisting AS (
INSERT INTO mytable(other_key) VALUES('hello 2')
ON CONFLICT(other_key) DO UPDATE SET existed=true -- need some update to return sth
RETURNING *
)
SELECT * FROM neworexisting
first call:
id|other_key|created |existed|
--|---------|-------------------|-------|
6|hello 1 |2019-09-11 11:39:29|false |
second call:
id|other_key|created |existed|
--|---------|-------------------|-------|
6|hello 1 |2019-09-11 11:39:29|true |
First create your table ;-)
CREATE TABLE mytable (
id serial NOT NULL,
other_key text NOT NULL,
created timestamptz NOT NULL DEFAULT now(),
existed bool NOT NULL DEFAULT false,
CONSTRAINT mytable_pk PRIMARY KEY (id),
CONSTRAINT mytable_uniq UNIQUE (other_key) --needed for on conflict
);
you can use a stored procedure
IF (SELECT id FROM mytable WHERE other_key = 'SOMETHING' LIMIT 1) < 0 THEN
INSERT INTO mytable (other_key) VALUES ('SOMETHING')
END IF
I have an alternative to Denis answer, that I think is less database-intensive, although a bit more complex:
create table mytable (id serial primary key, other_key varchar not null unique);
WITH table_sel AS (
SELECT id
FROM mytable
WHERE other_key = 'test'
UNION
SELECT NULL AS id
ORDER BY id NULLS LAST
LIMIT 1
), table_ins AS (
INSERT INTO mytable (id, other_key)
SELECT
COALESCE(id, NEXTVAL('mytable_id_seq'::REGCLASS)),
'test'
FROM table_sel
ON CONFLICT (id) DO NOTHING
RETURNING id
)
SELECT * FROM table_ins
UNION ALL
SELECT * FROM table_sel
WHERE id IS NOT NULL;
In table_sel CTE I'm looking for the right row. If I don't find it, I assure that table_sel returns at least one row, with a union with a SELECT NULL.
In table_ins CTE I try to insert the same row I was looking for earlier. COALESCE(id, NEXTVAL('mytable_id_seq'::REGCLASS)) is saying: id could be defined, if so, use it; whereas if id is null, increment the sequence on id and use this new value to insert a row. The ON CONFLICT clause assure
that if id is already in mytable I don't insert anything.
At the end I put everything together with a UNION between table_ins and table_sel, so that I'm sure to take my sweet id value and execute both CTE.
This query needs to search for the value other_key only once, and is a "search this value" not a "check if this value not exists in the table", that is very heavy; in Denis alternative you use other_key in both types of searches. In my query you "check if a value not exists" only on id that is a integer primary key, that, for construction, is fast.
Minor tweak a decade late to Denis's excellent answer:
-- Create the table with a unique constraint
CREATE TABLE mytable (
id serial PRIMARY KEY
, other_key varchar NOT NULL UNIQUE
);
WITH new_row AS (
-- Only insert when we don't find anything, avoiding a table lock if
-- possible.
INSERT INTO mytable ( other_key )
SELECT 'SOMETHING'
WHERE NOT EXISTS (
SELECT *
FROM mytable
WHERE other_key = 'SOMETHING'
)
RETURNING *
)
(
-- This comes first in the UNION ALL since it'll almost certainly be
-- in the query cache. Marginally slower for the insert case, but also
-- marginally faster for the much more common read-only case.
SELECT *
FROM mytable
WHERE other_key = 'SOMETHING'
-- Don't check for duplicates to be removed
UNION ALL
-- If we reach this point in iteration, we needed to do the INSERT and
-- lock after all.
SELECT *
FROM new_row
) LIMIT 1 -- Just return whatever comes first in the results and allow
-- the query engine to cut processing short for the INSERT
-- calculation.
;
The UNION ALL tells the planner it doesn't have to collect results for de-duplication. The LIMIT 1 at the end allows the planner to short-circuit further processing/iteration once it knows there's an answer available.
NOTE: There is a race condition present here and in the original answer. If the entry does not already exist, the INSERT will fail with a unique constraint violation. The error can be suppressed with ON CONFLICT DO NOTHING, but the query will return an empty set instead of the new row. This is a difficult problem because getting that info from another transaction would violate the I in ACID.