POSTGRES INSERT/UPDATE ON CONFLICT using WITH CTE - postgresql

I have a table like the one below. I am trying to merge into this table based on the values in a CTE, but when I try to update the table on a conflict, the update cannot see the values in the CTE.
CREATE TABLE IF NOT EXISTS master_config_details
(
master_config_id INT NOT NULL,
account_id INT NOT NULL,
date_value TIMESTAMP(3) NULL,
number_value BIGINT NULL,
string_value VARCHAR(50) NULL,
row_status SMALLINT NOT NULL,
created_date TIMESTAMP(3) NOT NULL,
modified_date TIMESTAMP(3) NULL,
CONSTRAINT pk_master_config_details PRIMARY KEY (master_config_id, account_id, row_status)
);
INSERT INTO master_config_details VALUES (
1, 11, NULL,100,NULL, 0, '2020-11-18 12:01:18', '2020-11-18 12:02:31');
select * from master_config_details;
Now I want to insert or update records in this table using a CTE. Below is the code I am using. When the record already exists in the table, I want to update it based on the data_type_id value in the CTE (cte_input_data.data_type_id), but it fails with the error:
SQL Error [42703]: ERROR: column excluded.data_type_id does not exist
What it should achieve is:
if cte_input_data.data_type_id = 1, update master_config_details set date_value = cte.value
if cte_input_data.data_type_id = 2, update master_config_details set number_value = cte.value
if cte_input_data.data_type_id = 3, update master_config_details set string_value = cte.value
The code below should update master_config_details.number_value to 22, because there is already a record with that combination of (master_config_id, account_id, row_status), which is (1,11,1) (run select * from master_config_details; to see the record), but it throws an error instead:
SQL Error [42703]: ERROR: column excluded.data_type_id does not exist
WITH cte_input_data AS (
select
1 AS master_config_id
,11 AS account_id
,2 AS data_type_id
,'22' AS value
,1 AS row_status)
INSERT INTO master_config_details
SELECT
cte.master_config_id
,cte.account_id
,CASE WHEN cte.data_type_id = 1 THEN cte.value::timestamp(3) ELSE NULL END AS date_time_value
,CASE WHEN cte.data_type_id = 2 THEN cte.value::integer ELSE NULL END AS number_value
,CASE WHEN cte.data_type_id = 3 THEN cte.value ELSE NULL END AS string_value
,1
,NOW() AT TIME ZONE 'utc'
,NOW() AT TIME ZONE 'utc'
FROM cte_input_data cte
ON CONFLICT (master_config_id,account_id,row_status)
DO UPDATE SET
date_value = CASE WHEN excluded.data_type_id = 1 THEN excluded.date_time_value::timestamp(3) ELSE NULL END
,number_value = CASE WHEN excluded.data_type_id = 2 THEN excluded.number_value::integer ELSE NULL END
,string_value = CASE WHEN excluded.data_type_id = 3 THEN excluded.string_value ELSE NULL END
,modified_date = NOW() AT TIME ZONE 'utc';

The special excluded table is used to reference the values originally proposed for insertion.
You are getting this error because data_type_id does not exist in your target table, and therefore not in the special excluded table either. It exists only in your CTE.
As a workaround, you can select it from the CTE with a nested SELECT inside the ON CONFLICT clause.
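The workaround above could look roughly like the sketch below. It assumes the master_config_details table from the question; the target alias t, the CTE alias c and the use of cte.row_status instead of a hard-coded 1 are my own choices for illustration. The nested SELECT reads data_type_id from the CTE, while the converted values come from excluded, whose columns are the target table's columns (date_value, not date_time_value).

```sql
WITH cte_input_data AS (
    SELECT 1    AS master_config_id,
           11   AS account_id,
           2    AS data_type_id,
           '22' AS value,
           1    AS row_status
)
INSERT INTO master_config_details AS t
    (master_config_id, account_id, date_value, number_value, string_value,
     row_status, created_date, modified_date)
SELECT cte.master_config_id,
       cte.account_id,
       CASE WHEN cte.data_type_id = 1 THEN cte.value::timestamp(3) END,
       CASE WHEN cte.data_type_id = 2 THEN cte.value::bigint END,
       CASE WHEN cte.data_type_id = 3 THEN cte.value END,
       cte.row_status,
       NOW() AT TIME ZONE 'utc',
       NOW() AT TIME ZONE 'utc'
FROM cte_input_data cte
ON CONFLICT (master_config_id, account_id, row_status)
DO UPDATE SET
    (date_value, number_value, string_value, modified_date) = (
        -- data_type_id is visible here through the CTE, not through excluded
        SELECT CASE WHEN c.data_type_id = 1 THEN excluded.date_value   ELSE t.date_value   END,
               CASE WHEN c.data_type_id = 2 THEN excluded.number_value ELSE t.number_value END,
               CASE WHEN c.data_type_id = 3 THEN excluded.string_value ELSE t.string_value END,
               NOW() AT TIME ZONE 'utc'
        FROM cte_input_data c
        -- correlate on the conflict key so this also works for multi-row input
        WHERE c.master_config_id = excluded.master_config_id
          AND c.account_id       = excluded.account_id
          AND c.row_status       = excluded.row_status
    );
```

Here the ELSE branches keep the existing column values, so only the column selected by data_type_id changes; use ELSE NULL instead if the other two columns should be cleared. Note that the sample row inserted in the question has row_status 0, so with that data this statement inserts a new row; the update branch runs only when a row with the same (master_config_id, account_id, row_status) already exists.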


Is it possible to find duplicating records in two columns simultaneously in PostgreSQL?

I have the following database schema (oversimplified):
create sequence partners_partner_id_seq;
create table partners
(
partner_id integer default nextval('partners_partner_id_seq'::regclass) not null primary key,
name varchar(255) default NULL::character varying,
company_id varchar(20) default NULL::character varying,
vat_id varchar(50) default NULL::character varying,
is_deleted boolean default false not null
);
INSERT INTO partners(name, company_id, vat_id) VALUES('test1','1010109191191', 'BG1010109191192');
INSERT INTO partners(name, company_id, vat_id) VALUES('test2','1010109191191', 'BG1010109191192');
INSERT INTO partners(name, company_id, vat_id) VALUES('test3','3214567890102', 'BG1010109191192');
INSERT INTO partners(name, company_id, vat_id) VALUES('test4','9999999999999', 'GE9999999999999');
I am trying to figure out how to return test1 and test2 (because they share the same company_id value) as well as test3 (because it shares the same vat_id value).
In other words, I need to find records with duplicate company_id or vat_id values and group them together, so that test1, test2 and test3 end up together because they are duplicates by company_id or vat_id.
So far I have the following query:
SELECT *
FROM (
SELECT *, LEAD(row, 1) OVER () AS nextrow
FROM (
SELECT *, ROW_NUMBER() OVER (w) AS row
FROM partners
WHERE is_deleted = false
AND ((company_id != '' AND company_id IS NOT null) OR (vat_id != '' AND vat_id IS NOT NULL))
WINDOW w AS (PARTITION BY company_id, vat_id ORDER BY partner_id DESC)
) x
) y
WHERE (row > 1 OR nextrow > 1)
AND is_deleted = false
This successfully shows all company_id duplicates, but it does not show the vat_id ones: the test3 row is missing. Can this be done within one query?
Here is a db-fiddle with the schema, data and predefined query reproducing my result.
You can do this with recursion, but depending on the size of your data you may want to iterate, instead.
The trick is to make the name just another match key instead of treating it differently than the company_id and vat_id:
create table partners (
partner_id integer generated always as identity primary key,
name text,
company_id text,
vat_id text,
is_deleted boolean not null default false
);
insert into partners (name, company_id, vat_id) values
('test1','1010109191191', 'BG1010109191192'),
('test2','1010109191191', 'BG1010109191192'),
('test3','3214567890102', 'BG1010109191192'),
('test4','9999999999999', 'GE9999999999999'),
('test5','3214567890102', 'BG8888888888888'),
('test6','2983489023408', 'BG8888888888888')
;
I added a couple of test cases and left in the lone partner.
with recursive keys as (
select partner_id,
array['n_'||name, 'c_'||company_id, 'v_'||vat_id] as matcher,
array[partner_id] as matchlist,
1 as size
from partners
), matchers as (
select *
from keys
union all
select p.partner_id, c.matcher,
p.matchlist||c.partner_id as matchlist,
p.size + 1
from matchers p
join keys c
on c.matcher && p.matcher
and not p.matchlist @> array[c.partner_id]
), largest as (
-- sort() on integer arrays comes from the intarray extension
select distinct sort(matchlist) as matchlist
from matchers m
where not exists (select 1
from matchers
where matchlist @> m.matchlist
and size > m.size)
-- and size > 1
)
select *
from largest
;
matchlist
{1,2,3,5,6}
{4}
EDIT UPDATE
Since recursion did not perform well, here is an iterative example in plpgsql that uses a temporary table:
create temporary table match1 (
partner_id int not null,
group_id int not null,
matchkey uuid not null
);
create index on match1 (matchkey);
create index on match1 (group_id);
insert into match1
select partner_id, partner_id, md5('n_'||name)::uuid from partners
union all
select partner_id, partner_id, md5('c_'||company_id)::uuid from partners
union all
select partner_id, partner_id, md5('v_'||vat_id)::uuid from partners;
do $$
declare _cnt bigint;
begin
loop
with consolidate as (
select group_id,
min(group_id) over (partition by matchkey) as new_group_id
from match1
), minimize as (
select group_id, min(new_group_id) as new_group_id
from consolidate
group by group_id
), doupdate as (
update match1
set group_id = m.new_group_id
from minimize m
where m.group_id = match1.group_id
and m.new_group_id != match1.group_id
returning *
)
select count(*) into _cnt from doupdate;
if _cnt = 0 then
exit;
end if;
end loop;
end;
$$;

trigger to set date automatic after update

Some background info: I have a table named defects which has a column named status_id and another column named date_closed. I want to set date_closed after status_id has been updated.
I already tried to do this using an after update trigger with the following code:
after update on eba_bt_sw_defects
for each row
declare
l_status number(20) := null;
begin
select status_id into l_status from eba_bt_sw_defects D,eba_bt_status S where D.status_id = S.id;
if l_status in ( select id from eba_bt_status where is_open = 'N' and NVL(is_enhancement,'N')='N') then
:NEW.DATE_CLOSED := LOCALTIMESTAMP ;
end if;
end;
but an error occurred ("subquery not allowed in this context", "Compilation failed").
Any help would be appreciated.
A couple of things need fixing in your code:
In a trigger, do not select from the table the trigger is defined on. This will probably raise an ORA-04091: table name is mutating, trigger/function may not see it error.
IF l_variable IN (SELECT ...) is not valid Oracle syntax. It raises PLS-00405: subquery not allowed in this context.
I don't have your data, so here is a similar example:
drop table todos;
drop table statuses;
-- create tables
create table statuses (
id number generated by default on null as identity
constraint statuses_id_pk primary key,
status varchar2(60 char),
is_open varchar2(1 char) constraint statuses_is_open_ck
check (is_open in ('Y','N'))
)
;
create table todos (
id number generated by default on null as identity
constraint todos_id_pk primary key,
name varchar2(255 char) not null,
close_date timestamp with local time zone,
status_id number
constraint todos_status_id_fk
references statuses on delete cascade
)
;
-- load data
insert into statuses (id, status, is_open ) values (1, 'OPEN', 'Y' );
insert into statuses (id, status, is_open ) values (2, 'COMPLETE', 'N' );
insert into statuses (id, status, is_open ) values (3, 'ON HOLD', 'Y' );
insert into statuses (id, status, is_open ) values (4, 'CANCELLED', 'N' );
commit;
insert into todos (name, close_date, status_id ) values ( 'Y2 Security Review', NULL, 1 );
-- triggers
CREATE OR REPLACE TRIGGER todos_biu BEFORE
INSERT OR UPDATE ON todos
FOR EACH ROW
DECLARE
l_dummy NUMBER;
BEGIN
SELECT
1
INTO l_dummy
FROM
statuses
WHERE
is_open = 'N' AND
id = :new.status_id;
:new.close_date := localtimestamp;
EXCEPTION
WHEN no_data_found THEN
-- I'm assuming you want close_date to NULL if todo is re-opened.
:new.close_date := NULL;
END todos_biu;
/
update todos set status_id = 2;
select * from todos;
id name close_date status_id
1 Y2 Security Review 11-MAY-22 05.27.04.987117000 PM 2

Postgres exclude using gist across different tables

I have 2 tables like this
drop table if exists public.table_1;
drop table if exists public.table_2;
CREATE TABLE public.table_1 (
id serial NOT NULL,
user_id bigint not null,
status varchar(255) not null,
date_start date NOT NULL,
date_end date NULL
);
CREATE TABLE public.table_2 (
id serial NOT NULL,
user_id bigint not null,
status varchar(255) not null,
date_start date NOT NULL,
date_end date NULL
);
alter table public.table_1
add constraint my_constraint_1
EXCLUDE USING gist (user_id with =, daterange(date_start, date_end, '[]') WITH &&)
where (status != 'deleted');
alter table public.table_2
add constraint my_constraint_2
EXCLUDE USING gist (user_id with =, daterange(date_start, date_end, '[]') WITH &&)
where (status != 'deleted');
Each table contains rows that belong to a user, and the rows of the same user must not overlap in date range. In addition, some rows may be logically deleted, so I added a where condition.
So far it works without problems, but the 2 constraints work separately for each table.
I need a constraint that covers both tables, so that a single daterange (of the same user and not deleted) may appear only once across the 2 tables.
Can the EXCLUDE notation be extended to work across different tables, or do I need to check it with a trigger? If a trigger is the answer, what is the simplest way to do this? Create a temporary table with the union of the 2, add the constraint on it and check whether it fails?
Starting from @Laurenz Albe's suggestion, this is what I made:
-- #################### SETUP SAMPLE TABLES ####################
drop table if exists public.table_1;
drop table if exists public.table_2;
CREATE TABLE public.table_1 (
id serial NOT NULL,
user_id bigint not null,
status varchar(255) not null,
date_start date NOT NULL,
date_end date NULL
);
CREATE TABLE public.table_2 (
id serial NOT NULL,
user_id bigint not null,
status varchar(255) not null,
date_start date NOT NULL,
date_end date NULL
);
alter table public.table_1
add constraint my_constraint_1
EXCLUDE USING gist (user_id with =, daterange(date_start, date_end, '[]') WITH &&)
where (status != 'deleted');
alter table public.table_2
add constraint my_constraint_2
EXCLUDE USING gist (user_id with =, daterange(date_start, date_end, '[]') WITH &&)
where (status != 'deleted');
-- #################### SETUP TRIGGER ####################
create or REPLACE FUNCTION check_date_overlap_trigger_hook()
RETURNS trigger as
$body$
DECLARE
l_table text;
l_sql text;
l_row record;
begin
l_table := TG_ARGV[0];
l_sql := format('
select *
from public.%I as t
where
t.user_id = %s -- Include only records of the same user
and t.status != ''deleted'' -- Include only records that are active
', l_table, new.user_id);
for l_row in execute l_sql
loop
-- Same inclusive bounds as the exclusion constraints; a NULL date_end means unbounded
IF daterange(l_row.date_start, l_row.date_end, '[]') && daterange(new.date_start, new.date_end, '[]')
THEN
RAISE EXCEPTION 'Date interval is overlapping with another one in table %', l_table
USING HINT = 'You can''t have the same interval across table1 AND table2';
END IF;
end loop;
RETURN NEW;
end
$body$
LANGUAGE plpgsql;
-- #################### INSTALL TRIGGER ####################
create trigger check_date_overlap
BEFORE insert or update
ON public.table_1
FOR EACH row
EXECUTE PROCEDURE check_date_overlap_trigger_hook('table_2');
create trigger check_date_overlap
BEFORE insert or update
ON public.table_2
FOR EACH row
EXECUTE PROCEDURE check_date_overlap_trigger_hook('table_1');
-- #################### INSERT DEMO ROWS ####################
insert into public.table_1 (user_id, status, date_start, date_end) values (1, 'active', '2020-12-10', '2020-12-20');
insert into public.table_1 (user_id, status, date_start, date_end) values (1, 'deleted', '2020-12-15', '2020-12-25');
insert into public.table_1 (user_id, status, date_start, date_end) values (2, 'active', '2020-12-10', '2020-12-20');
insert into public.table_1 (user_id, status, date_start, date_end) values (2, 'deleted', '2020-12-15', '2020-12-25');
-- This will fail for overlap on the same table
-- insert into public.table_1 (user_id, status, date_start, date_end) values (1, 'active', '2020-12-15', '2020-12-25');
-- This will fail as the user 1 already has an overlapping period on table 1
-- insert into public.table_2 (user_id, status, date_start, date_end) values (1, 'active', '2020-12-15', '2020-12-25');
-- This insert succeeds because the row is deleted; re-activating it with the update below fails due to the overlap on table 1
insert into public.table_2 (user_id, status, date_start, date_end) values (1, 'deleted', '2020-12-15', '2020-12-25');
update public.table_2 set status = 'active' where id = 1;
select 'table_1' as src_table, * from public.table_1
union
select 'table_2', * from public.table_2
You can probably use a trigger, but triggers are always vulnerable to race conditions (unless you are using SERIALIZABLE isolation).
If your tables really have the same columns, why don't you use a single table (and perhaps add a type column to disambiguate)?
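A minimal sketch of the single-table idea, under some assumptions: the table name table_all, the column row_type and its two allowed values are made up for illustration, and the btree_gist extension is assumed to be available (it provides the = operator class on bigint that the GiST exclusion constraint needs). One exclusion constraint then covers what the two tables and the triggers were enforcing.

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE public.table_all (
    id         serial PRIMARY KEY,
    -- hypothetical discriminator replacing the table_1 / table_2 split
    row_type   text NOT NULL CHECK (row_type IN ('type_1', 'type_2')),
    user_id    bigint NOT NULL,
    status     varchar(255) NOT NULL,
    date_start date NOT NULL,
    date_end   date NULL,
    -- row_type is deliberately not part of the constraint, so ranges
    -- of the same user cannot overlap regardless of the row type
    CONSTRAINT table_all_no_overlap
        EXCLUDE USING gist (
            user_id WITH =,
            daterange(date_start, date_end, '[]') WITH &&
        ) WHERE (status != 'deleted')
);

INSERT INTO public.table_all (row_type, user_id, status, date_start, date_end)
VALUES ('type_1', 1, 'active', '2020-12-10', '2020-12-20');

-- Expected to be rejected by table_all_no_overlap: same user, overlapping range
-- INSERT INTO public.table_all (row_type, user_id, status, date_start, date_end)
-- VALUES ('type_2', 1, 'active', '2020-12-15', '2020-12-25');
```

Because the check is a real constraint rather than a trigger, it does not depend on the isolation level to be safe against concurrent inserts.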

Postgresql not choosing rows grouping

I have a query built like the example below (online demo).
You will see the created_at field in the result. I need to filter the query by created_at, so I have to include created_at in the SELECT. I don't want to include created_at in the SELECT, because there are millions of records in the deposits table. How can I avoid this problem?
(Note: I have many tables to query that are like the "deposits" table. This is just a short example.)
create table payment_methods
(
payment_method_id bigserial not null
constraint payment_methods_pkey
primary key
);
create table currencies_of_payment_methods
(
copm_id bigserial not null
constraint currencies_of_payment_methods_pkey
primary key,
payment_method_id integer not null
);
create table deposits
(
deposit_id bigserial not null
constraint deposits_pkey
primary key,
amount numeric(18,2) not null,
copm_id integer not null,
created_at timestamp(0)
);
INSERT INTO payment_methods (payment_method_id) VALUES (1);
INSERT INTO payment_methods (payment_method_id) VALUES (2);
INSERT INTO currencies_of_payment_methods (copm_id, payment_method_id) VALUES (1, 1);
INSERT INTO deposits (amount, copm_id, created_at) VALUES (100, 1, '2020-09-10 08:49:37');
INSERT INTO deposits (amount, copm_id, created_at) VALUES (200, 1, '2020-09-10 08:49:37');
INSERT INTO deposits (amount, copm_id, created_at) VALUES (40, 1, '2020-09-10 08:49:37');
Query:
SELECT payment_methods.payment_method_id,
deposit_copm_id.deposit_copm_id,
manuel_deposit_amount.manuel_deposit_amount,
manuel_deposit_amount.created_at
FROM payment_methods
CROSS JOIN lateral
(
SELECT currencies_of_payment_methods.copm_id AS deposit_copm_id
FROM currencies_of_payment_methods
WHERE currencies_of_payment_methods.payment_method_id = payment_methods.payment_method_id) deposit_copm_id
CROSS JOIN lateral
(
SELECT sum(deposits.amount) AS manuel_deposit_amount,
array_agg(deposits.created_at) AS created_at
FROM deposits
WHERE deposits.copm_id = deposit_copm_id.deposit_copm_id) manuel_deposit_amount
WHERE payment_methods.payment_method_id = 1

How to do multiple columns update on different where condition using PostgreSQL Upsert Using INSERT ON CONFLICT statement

Suppose I have a table like this
create schema test;
CREATE TABLE test.customers (
customer_id serial PRIMARY KEY,
name VARCHAR UNIQUE,
email VARCHAR NOT NULL,
active bool NOT NULL DEFAULT TRUE,
is_active_datetime TIMESTAMP(3) NOT NULL DEFAULT '1900-01-01T00:00:00.000Z'::timestamp(3),
updated_datetime TIMESTAMP(3) NOT NULL DEFAULT '1900-01-01T00:00:00.000Z'::timestamp(3)
);
Now I want to update email on a conflict on name, under this condition:
WHERE $tableName.updated_datetime < excluded.updated_datetime
I also want to update is_active_datetime on a conflict on name, but only when the active flag has changed:
WHERE customer.active != excluded.active
Basically, I want to track when the active status changes. Can I do that in a single statement like the one below?
Initial insert :
insert INTO test.customers (NAME, email)
VALUES
('IBM', 'contact@ibm.com'),
(
'Microsoft',
'contact@microsoft.com'
),
(
'Intel',
'contact@intel.com'
);
To achieve my purpose I am trying something like this :
select * from test.customers;
INSERT INTO customers (name, email)
VALUES
(
'Microsoft',
'hotline@microsoft.com'
)
ON CONFLICT (name)
DO
UPDATE
SET customers.email = EXCLUDED.email
WHERE $tableName.updated_datetime < excluded.updated_datetime
on CONFLICT (name)
do
update
set is_active_datetime = current_timestamp()
WHERE customer.active != excluded.active ;
Is it possible to do this? If so, how can it be done with this method?
You could update multiple columns with CASE conditions in a single DO UPDATE clause.
INSERT INTO customers (
name
,email
,updated_datetime
)
VALUES (
'Microsoft'
,'hotline@microsoft.com'
,now()
) ON CONFLICT(name) DO
UPDATE
SET email = CASE
WHEN customers.updated_datetime < excluded.updated_datetime
THEN excluded.email
ELSE customers.email --default when condition not satisfied
END
,is_active_datetime = CASE
WHEN customers.active != excluded.active
THEN current_timestamp
ELSE customers.is_active_datetime
END;