Need row returned from upsert - postgresql

I have a table that I need to upsert. If the row already exists then I want to update and return the row. If the row doesn't already exist then I need to insert and return the row. With the query I have below I get the row returned on insert, but not on update.
Table "main.message_account_seen"
Column | Type | Modifiers
----------------+--------------------------+-------------------------------------------------------------------
id | integer | not null default nextval('message_account_seen_id_seq'::regclass)
field_config_id | integer | not null
edit_stamp | timestamp with time zone | not null default now()
audit_stamp | timestamp with time zone |
message_id | integer | not null
account_id | integer |
Here's the SQL:
with upsert as (
update message_account_seen set (message_id, account_id, field_config_id ) = (1, 60, 980)
where message_id = 1 and account_id = 60 and field_config_id = 980 returning *
)
insert into message_account_seen (message_id, account_id, field_config_id)
select 1, 60, 980
where not exists (select message_id, account_id, field_config_id from upsert) returning *;
I can't do a postgres function, it needs to be handled in a regular sql query. Also, there is no constraint on the table for uniqueness of row otherwise I would use on conflict. But I'm willing to scrap this query and go with something else if need be.
These are the results when I run the query, and then run it again. You can see that on the insert or first run I get the row returned. However on subsequent runs of the query I get 0 rows returned. I know that it's working because the edit_stamp increases in time. That's a good thing.
# with upsert as (
update message_account_seen set (message_id, account_id, field_config_id ) = (1, 60, 980)
where message_id = 1 and account_id = 60 and field_config_id = 980 returning *
)
insert into message_account_seen (message_id, account_id, field_config_id)
select 1, 60, 980
where not exists (select message_id, account_id, field_config_id from upsert) returning *;
id | field_config_id | edit_stamp | audit_stamp | message_id | account_id
--+-----------------+--------------------------------+-------------+------------+------------
38 | 980 | 09/27/2016 11:43:22.153908 MDT | | 1 | 60
(1 row)
INSERT 0 1
# with upsert as (
update message_account_seen set (message_id, account_id, field_config_id ) = (1, 60, 980)
where message_id = 1 and account_id = 60 and field_config_id = 980 returning *
)
insert into message_account_seen (message_id, account_id, field_config_id)
select 1, 60, 980
where not exists (select message_id, account_id, field_config_id from upsert) returning *;
id | field_config_id | edit_stamp | audit_stamp | message_id | account_id
----+-----------------+------------+-------------+------------+------------
(0 rows)
INSERT 0 0

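The outer statement in your query is the INSERT, so its RETURNING only reports inserted rows. When the update succeeds, the rows returned inside the CTE are never selected. Put the insert in a CTE as well and select from both: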
with upsert as (
update message_account_seen
set (message_id, account_id, field_config_id ) = (1, 60, 980)
where (message_id, account_id, field_config_id) = (1, 60, 980)
returning *
), ins as (
insert into message_account_seen (message_id, account_id, field_config_id)
select 1, 60, 980
where not exists (select 1 from upsert)
returning *
)
select * from upsert
union all
select * from ins
;

The best option here is the upsert syntax (INSERT ... ON CONFLICT) that PostgreSQL 9.5 introduced, but it requires a unique index on (message_id, account_id, field_config_id). It can be used like this:
INSERT INTO message_account_seen(message_id, account_id, field_config_id)
VALUES (1, 60, 980)
ON CONFLICT (message_id, account_id, field_config_id)
DO UPDATE
SET edit_stamp=now() -- adjust here
RETURNING *;
This is probably the fastest way to do this and guarantees that nothing unexpected will happen if two processes try to upsert into the same table at the same time (your approach doesn't guarantee that).
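Since the question states that the table has no uniqueness constraint, the ON CONFLICT form needs the unique index to be created first. A minimal sketch, assuming the existing rows hold no duplicates for these three columns; the index name message_account_seen_uq is hypothetical:

```sql
-- Hypothetical index name. Creation fails if duplicate rows already exist.
CREATE UNIQUE INDEX message_account_seen_uq
    ON message_account_seen (message_id, account_id, field_config_id);
```

Note that account_id is nullable in the table shown. By default a unique index treats NULLs as distinct, so two rows with a NULL account_id would not conflict with each other.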

Related

Upsert always updates postgresql even not matching where clause

I have this SQL query:
INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
WHERE version IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
DO UPDATE SET version = books.version + 1, updated = CURRENT_TIMESTAMP;
However, even when the WHERE clause is not true, the row is updated. Here's an example: https://dbfiddle.uk/CPHvZDm3
I can't understand what is wrong here.
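The location of the WHERE clause is the issue. A WHERE placed directly after the conflict target (ON CONFLICT (id) WHERE ...) is an index predicate used to infer a partial unique index; it does not filter the update. To make the update conditional, the WHERE has to come after DO UPDATE SET. Corrected statement below.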
CREATE TABLE books (
id int4 NOT NULL,
version int8 NOT NULL,
updated timestamp NULL,
CONSTRAINT books_pkey PRIMARY KEY (id)
);
INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP
WHERE books.version IS NULL OR books.updated + INTERVAL '2min' < CURRENT_TIMESTAMP;
select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' < CURRENT_TIMESTAMP from books where id = 12;
id | version | updated | current_timestamp | ?column?
----+---------+----------------------------+--------------------------------+----------
12 | 0 | 11/13/2022 10:34:06.028222 | 11/13/2022 10:34:06.055526 PST | f
INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP
WHERE books.version IS NULL OR books.updated + INTERVAL '2min' < CURRENT_TIMESTAMP;
select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' < CURRENT_TIMESTAMP from books where id = 12;
id | version | updated | current_timestamp | ?column?
----+---------+----------------------------+--------------------------------+----------
12 | 0 | 11/13/2022 10:34:06.028222 | 11/13/2022 10:34:08.668121 PST | f
From docs INSERT:
and conflict_action is one of:
DO NOTHING
DO UPDATE SET { column_name = { expression | DEFAULT } |
( column_name [, ...] ) = [ ROW ] ( { expression | DEFAULT } [, ...] ) |
( column_name [, ...] ) = ( sub-SELECT )
} [, ...]
[ WHERE condition ]
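To make the two WHERE positions concrete, here is the corrected statement again with each position annotated. The RETURNING clause is an addition for illustration and is not part of the original answer: it returns a row only when a row was actually inserted or updated, so an empty result means the existing row was left alone.

```sql
INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
-- A WHERE placed here would be an index predicate: it only helps
-- PostgreSQL pick a partial unique index and does not filter the update.
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP
-- This WHERE belongs to DO UPDATE and decides whether the existing row changes.
WHERE books.version IS NULL OR books.updated + INTERVAL '2min' < CURRENT_TIMESTAMP
-- Added for illustration: no row comes back when the update was skipped.
RETURNING id, version, updated;
```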

How do I simplify this complex query?

The relevant table definitions are:
CREATE TABLE message
(
id BIGINT PRIMARY KEY,
user_id BIGINT NOT NULL,
guild_id BIGINT NOT NULL,
content TEXT NOT NULL,
created_at TIMESTAMP NOT NULL
);
CREATE TABLE d_user
(
id BIGINT PRIMARY KEY
);
CREATE TABLE vcsession
(
id BIGINT PRIMARY KEY,
user_id BIGINT NOT NULL,
guild_id BIGINT NOT NULL,
duration INTEGER NOT NULL,
began_at TIMESTAMP NOT NULL,
last_active TIMESTAMP NOT NULL
);
The expected result set of this query should consist of a row for each user in the guild provided with columns for:
user_id: The user id
message_count: The number of messages sent by each user within an interval defined by two datetimes (this should be 0 if no messages were sent)
voice_time: Sum of each voice session's duration last active within an interval defined by two datetimes (this should be 0 if no voice sessions were active)
active_days: Days in which the user either sent a message or had an active voice session (this should be 0 if user wasn't active in the time interval provided)
This is the query I wrote:
select
activity.user_id,
message_count,
voice_time,
coalesce(active_days, 0) as active_days
from (
select
d_user.id as user_id,
coalesce(messages.count, 0) as message_count,
coalesce(vcsessions.duration, 0) as voice_time
from d_user left join (
select
user_id,
count(*) as "count"
from message where (
(guild_id = $1) and
(created_at >= $2) and
(created_at < $3)
) group by user_id
) as messages on messages.user_id = d_user.id left join (
select
user_id,
sum(duration) as "duration"
from vcsession where (
(guild_id = $1) and
(last_active >= $2) and
(last_active < $3)
) group by user_id
) as vcsessions on vcsessions.user_id = d_user.id
) as activity left join (
select user_id, count(*) as active_days from (
select * from (
select
user_id,
(cast(extract(EPOCH from message.created_at) as int) - cast(extract(EPOCH from $2) as int)) / 86400 as day_offset
from message where (
(created_at >= $2) and
(created_at < $3)
) group by user_id, day_offset
) as message_days union (
select
user_id,
(cast(extract(EPOCH from vcsession.last_active) as int) - cast(extract(EPOCH from $2) as int)) / 86400 as day_offset
from vcsession where (
(last_active >= $2) and
(last_active < $3)
) group by user_id, day_offset
)
) as active_days group by user_id
) as active_days on active_days.user_id = activity.user_id
And this is what the result set looks like:
|user_id |message_count |voice_time |active_days |
|--------------------|--------------------|--------------------|--------------------|
|1 |752 |694 |1 |
|2 |12 |543 |2 |
|3 |323 |7163 |4 |
|4 |56 |870 |3 |
It looks reasonably readable to me.
Maybe you could pull the two subselects in the first FROM clause into the main query:
SELECT ...
FROM ((SELECT ...) AS messages
LEFT JOIN
(SELECT ...) AS vcsessions
) AS ...
LEFT JOIN ...
could become
SELECT ...
FROM (SELECT ...) AS messages
LEFT JOIN
(SELECT ...) AS vcsessions
LEFT JOIN ...
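Applied to the whole query, the same flattening might look like the sketch below, which names each subselect as a CTE and joins all three to d_user directly. It makes two assumptions that go beyond the original query: the active_days branches also filter on guild_id (the original filters only by time there), and the day offset is computed from the interval since $2 with floor() rather than by casting each epoch to int, which can differ for timestamps close to a day boundary.

```sql
WITH msg AS (
    SELECT user_id, count(*) AS message_count
    FROM message
    WHERE guild_id = $1 AND created_at >= $2 AND created_at < $3
    GROUP BY user_id
), vc AS (
    SELECT user_id, sum(duration) AS voice_time
    FROM vcsession
    WHERE guild_id = $1 AND last_active >= $2 AND last_active < $3
    GROUP BY user_id
), days AS (
    -- UNION removes duplicate (user_id, day_offset) pairs,
    -- so the per-branch GROUP BY is not needed.
    SELECT user_id, count(*) AS active_days
    FROM (
        SELECT user_id,
               floor(extract(epoch FROM created_at - $2::timestamp) / 86400) AS day_offset
        FROM message
        WHERE guild_id = $1 AND created_at >= $2 AND created_at < $3  -- guild filter is an assumption
        UNION
        SELECT user_id,
               floor(extract(epoch FROM last_active - $2::timestamp) / 86400)
        FROM vcsession
        WHERE guild_id = $1 AND last_active >= $2 AND last_active < $3  -- guild filter is an assumption
    ) AS d
    GROUP BY user_id
)
SELECT u.id AS user_id,
       coalesce(msg.message_count, 0) AS message_count,
       coalesce(vc.voice_time, 0)     AS voice_time,
       coalesce(days.active_days, 0)  AS active_days
FROM d_user AS u
LEFT JOIN msg  ON msg.user_id  = u.id
LEFT JOIN vc   ON vc.user_id   = u.id
LEFT JOIN days ON days.user_id = u.id;
```

As in the original, every row of d_user is returned, including users with no activity in the guild.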

PostgreSQL recursive parent/child query

I'm having some trouble working out the PostgreSQL documentation for recursive queries, and wonder if anyone might be able to offer a suggestion for the following.
Here's the data:
Table "public.subjects"
Column | Type | Collation | Nullable | Default
-------------------+-----------------------------+-----------+----------+--------------------------------------
id | bigint | | not null | nextval('subjects_id_seq'::regclass)
name | character varying | | |
Table "public.subject_associations"
Column | Type | Collation | Nullable | Default
------------+-----------------------------+-----------+----------+--------------------------------------------------
id | bigint | | not null | nextval('subject_associations_id_seq'::regclass)
parent_id | integer | | |
child_id | integer | | |
Here, a "subject" may have many parents and many children. Of course, at the top level a subject has no parents and at the bottom no children. For example:
parent_id | child_id
------------+------------
2 | 3
1 | 4
1 | 3
4 | 8
4 | 5
5 | 6
6 | 7
What I'm looking for is starting with a child_id to get all the ancestors, and with a parent_id, all the descendants. Therefore:
parent_id 1 -> children 3, 4, 5, 6, 7, 8
parent_id 2 -> children 3
child_id 3 -> parents 1, 2
child_id 4 -> parents 1
child_id 7 -> parents 6, 5, 4, 1
Though there seem to be a lot of examples of similar things about I'm having trouble making sense of them, so any suggestions I can try out would be welcome.
To get all children for subject 1, you can use
WITH RECURSIVE c AS (
SELECT 1 AS id
UNION ALL
SELECT sa.child_id
FROM subject_associations AS sa
JOIN c ON c.id = sa.parent_id
)
SELECT id FROM c;
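The question also asks for the opposite direction, from a child up to all of its ancestors. A sketch of that case, mirroring the query above with the join reversed; the starting child_id 7 is taken from the question's example, and UNION is used instead of UNION ALL so that a subject reached through several paths appears once and a cycle in the data cannot loop forever:

```sql
WITH RECURSIVE p AS (
    -- Anchor: direct parents of subject 7
    SELECT sa.parent_id AS id
    FROM subject_associations AS sa
    WHERE sa.child_id = 7
  UNION
    -- Recursive step: parents of every ancestor found so far
    SELECT sa.parent_id
    FROM subject_associations AS sa
    JOIN p ON p.id = sa.child_id
)
SELECT id FROM p;
```

With the sample rows from the question this yields 6, 5, 4 and 1. Unlike the descendants query above, the starting subject itself is not part of the result.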
CREATE OR REPLACE FUNCTION func_finddescendants(start_id integer)
RETURNS SETOF subject_associations
AS $$
DECLARE
BEGIN
RETURN QUERY
WITH RECURSIVE t
AS
(
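-- Anchor: associations whose parent is the starting subject
SELECT sa.*
FROM subject_associations sa
WHERE sa.parent_id = start_id
UNION ALL
-- Recursive step: associations whose parent is a child found so far
SELECT next.*
FROM t prev
JOIN subject_associations next ON (next.parent_id = prev.child_id)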
)
SELECT * FROM t;
END;
$$ LANGUAGE PLPGSQL;
Try this
--- Table
-- DROP SEQUENCE public.data_id_seq;
CREATE SEQUENCE "data_id_seq"
INCREMENT 1
MINVALUE 1
MAXVALUE 9223372036854775807
START 1
CACHE 1;
ALTER TABLE public.data_id_seq
OWNER TO postgres;
CREATE TABLE public.data
(
id integer NOT NULL DEFAULT nextval('data_id_seq'::regclass),
name character varying(50) NOT NULL,
label character varying(50) NOT NULL,
parent_id integer NOT NULL,
CONSTRAINT data_pkey PRIMARY KEY (id),
CONSTRAINT data_name_parent_id_unique UNIQUE (name, parent_id)
)
WITH (
OIDS=FALSE
);
INSERT INTO public.data(id, name, label, parent_id) VALUES (1,'animal','Animal',0);
INSERT INTO public.data(id, name, label, parent_id) VALUES (5,'birds','Birds',1);
INSERT INTO public.data(id, name, label, parent_id) VALUES (6,'fish','Fish',1);
INSERT INTO public.data(id, name, label, parent_id) VALUES (7,'parrot','Parrot',5);
INSERT INTO public.data(id, name, label, parent_id) VALUES (8,'barb','Barb',6);
--- Function
CREATE OR REPLACE FUNCTION public.get_all_children_of_parent(use_parent integer) RETURNS integer[] AS
$BODY$
DECLARE
process_parents INT4[] := ARRAY[ use_parent ];
children INT4[] := '{}';
new_children INT4[];
BEGIN
WHILE ( array_upper( process_parents, 1 ) IS NOT NULL ) LOOP
new_children := ARRAY( SELECT id FROM data WHERE parent_id = ANY( process_parents ) AND id <> ALL( children ) );
children := children || new_children;
process_parents := new_children;
END LOOP;
RETURN children;
END;
$BODY$
LANGUAGE plpgsql VOLATILE COST 100;
ALTER FUNCTION public.get_all_children_of_parent(integer) OWNER TO postgres;
--- Test
SELECT * FROM data WHERE id = any(get_all_children_of_parent(1));
SELECT * FROM data WHERE id = any(get_all_children_of_parent(5));
SELECT * FROM data WHERE id = any(get_all_children_of_parent(6));

Select row position in filtered and ordered row list PostgreSQL

I have this query,
SELECT s.pos
FROM (SELECT t.guild_id, t.user_id,
ROW_NUMBER() OVER(ORDER BY t.reputation DESC) AS pos
FROM users t) s
WHERE (s.guild_id, s.user_id) = ($2, $3)
that gets a user's "rank" in a guild, but I want to filter the results by entries that are in an array of t.user_id values (like {'1', '64', '83'}) and have this affect the resulting pos value. I found FILTER and WITHIN GROUP, but I'm not sure how to fit one of those into this query. How would I do that?
Here's the full table if that helps at all:
Table "public.users"
Column | Type | Collation | Nullable | Default
------------+-----------------------+-----------+----------+---------
guild_id | character varying(20) | | not null |
user_id | character varying(20) | | not null |
reputation | real | | not null | 0
Indexes:
"users_pkey" PRIMARY KEY, btree (guild_id, user_id)
Why not select on those first?
WITH UsersWeCareAbout AS (
SELECT * FROM users u WHERE u.user_id = ANY(subgroup_array)
), RepUsers AS (
SELECT t.guild_id, t.user_id, ROW_NUMBER() OVER(ORDER BY t.reputation DESC) AS pos
FROM UsersWeCareAbout t
) SELECT s.pos FROM RepUsers s WHERE (s.guild_id, s.user_id) = ($2, $3)
(untested if only because I didn't really have enough context to test with)
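The reason filtering first works is that WHERE is evaluated before window functions, so rows removed in the inner query no longer count toward ROW_NUMBER(). The same idea can be written without CTEs, as in the sketch below. It assumes the array arrives as a hypothetical fourth parameter $4, and it also restricts the ranking to the requested guild, which the original query does not do.

```sql
SELECT s.pos
FROM (
    SELECT t.user_id,
           ROW_NUMBER() OVER (ORDER BY t.reputation DESC) AS pos
    FROM users AS t
    WHERE t.guild_id = $2                  -- assumption: rank within one guild only
      AND t.user_id = ANY ($4::varchar[])  -- hypothetical array parameter
) AS s
WHERE s.user_id = $3;
```

If the user in $3 is not in the array, the query returns no row.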

Postgres insert trigger fills id

I have a BEFORE trigger which should fill record's root ID which, of course, would point to rootmost entry. I.e:
id | parent_id | root_id
-------------------------
a | null | a
a.1 | a | a
a.1.1 | a.1 | a
b | null | b
If entry's parent_id is null, it would point to record itself.
Question is - inside BEFORE INSERT trigger, if parent_id is null, can I or should I fetch next sequence value, fill id and root_id in order to avoid filling root_id in AFTER trigger?
According to your own definition:
if entry's parent_id is null, it would point to record itself
then you have to do:
if new.parent_id is null then
new.root_id = new.id ;
else
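WITH RECURSIVE p (parent_id, level) AS
(
-- Base case: start from the new row's parent, because in a
-- BEFORE INSERT trigger the new row is not in the table yet
SELECT
new.parent_id, 0 as level
UNION ALL
-- Walk up one level at a time until a row without a parent is reached
SELECT
t.parent_id, level + 1
FROM
t JOIN p ON t.id = p.parent_id
WHERE
t.parent_id IS NOT NULL
)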
SELECT
parent_id
INTO
new.root_id
FROM
p
ORDER BY
level DESC
LIMIT
1 ;
end if ;
RETURN new ;
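On the question of fetching the sequence value yourself: column defaults are applied before BEFORE ROW triggers fire, so if id has a default such as nextval(...), new.id is already filled in when the trigger runs. The sketch below wraps the logic in a complete trigger function. It is a simplification and not the code above: it assumes every existing row already has a correct root_id (for example because the same trigger maintained it), so the parent's root_id can be copied instead of walking the tree. The table name t comes from the answer; the function and trigger names are hypothetical.

```sql
CREATE OR REPLACE FUNCTION set_root_id() RETURNS trigger AS $$
BEGIN
    IF NEW.parent_id IS NULL THEN
        -- Root row: NEW.id already holds its default value here.
        NEW.root_id := NEW.id;
    ELSE
        -- Assumption: the parent row's root_id is already correct.
        SELECT t.root_id INTO NEW.root_id
        FROM t
        WHERE t.id = NEW.parent_id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_set_root_id
BEFORE INSERT ON t
FOR EACH ROW EXECUTE PROCEDURE set_root_id();
```

If the parent row does not exist, root_id is left NULL, so a foreign key or NOT NULL constraint on the table would be the place to reject such inserts.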