Trigger taking time to insert data in postgres (column count 300) - postgresql

I have created a trigger, and it takes a long time when inserting multiple records.
Inserting 1 or 2 records works fine, but with more than 1000 records it is slow; the query has now been running for 2 hours.
The table below has only 15 columns; my actual table has 300 columns.
Is there any other way to insert multiple records into the table with the trigger?
Table
create table patients (
    id serial,
    name character varying (50),
    daily varchar (8),
    month varchar (6),
    quarter varchar (6),
    registration_date timestamp,
    age integer,
    address text,
    country text,
    city text,
    phone_number integer,
    education text,
    occupation text,
    marital_status text,
    "E-mail" text
);
Trigger function
CREATE OR REPLACE FUNCTION update_data_after_insert_data_into_patients()
RETURNS trigger AS
$$
BEGIN
    update patients t1
    set quarter = t2.quarter
    from (select (extract(year from registration_date)::text || 'Q' || extract(quarter from registration_date)::text) as quarter, registration_date
          from patients) t2
    where t1.registration_date = t2.registration_date;

    update patients t1
    set month = t2.month
    from (select (extract(year from registration_date)::text || '' || to_char(registration_date, 'MM')) as month, registration_date
          from patients) t2
    where t1.registration_date = t2.registration_date;

    update patients t1
    set daily = t2.daily
    from (select extract(year from registration_date) || '' || to_char(registration_date, 'MM') || '' || to_char(registration_date, 'DD') as daily, registration_date
          from patients) t2
    where t1.registration_date = t2.registration_date;

    RETURN new;
END;
$$ LANGUAGE plpgsql;
Trigger definition
create TRIGGER trigger_update_data_after_insert_patients
AFTER insert ON patients
FOR EACH ROW
EXECUTE PROCEDURE update_data_after_insert_data_into_patients();
Insert multiple records into the patients table
INSERT INTO public.patients
("name", daily, "month", quarter, registration_date, age, address, country, city, phone_number, education, occupation, marital_status, "E-mail")
VALUES('Adam', '20221215', '202212', '2022Q4', '2022-08-17 19:01:10-08', 24, '', '', '', 1245578, '', '', '', '');
Select statement
select * from patients;

You are updating all rows in the table that have the same registration date as the inserted row, three times, just to calculate those generated columns.
You can do this far more efficiently by assigning the generated values to the NEW record in a BEFORE trigger.
CREATE OR REPLACE FUNCTION update_data_after_insert_data_into_patients()
RETURNS trigger AS
$$
BEGIN
new.quarter := to_char(new.registration_date, 'yyyy"Q"q');
new.month := to_char(new.registration_date, 'yyyymm');
new.daily := to_char(new.registration_date, 'yyyymmdd');
RETURN new;
END;
$$
LANGUAGE plpgsql;
create TRIGGER trigger_update_data_after_insert_patients
BEFORE insert ON patients
FOR EACH ROW
EXECUTE PROCEDURE update_data_after_insert_data_into_patients();
However, I don't see the need to store these calculated values at all when you can easily format registration_date while retrieving the data. I would get rid of those columns and the trigger and create a VIEW that does the formatting.
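A minimal sketch of such a view (the name patients_formatted is illustrative):
-- Sketch: compute daily/month/quarter on the fly instead of storing them.
CREATE VIEW patients_formatted AS
SELECT id,
       name,
       registration_date,
       to_char(registration_date, 'yyyymmdd') AS daily,
       to_char(registration_date, 'yyyymm')   AS month,
       to_char(registration_date, 'yyyy"Q"q') AS quarter,
       age, address, country, city, phone_number,
       education, occupation, marital_status, "E-mail"
FROM patients;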

Related

insert total row in a table using trigger PostgreSQL

This is my first post on Stack Overflow.
I have a problem with my insert trigger.
I have two tables:
table raw_data
create table raw_data(
id varchar(20),
name varchar(20),
duration integer
)
and table total_duration
create table total_duration(
id_raw varchar(20),
name varchar(20),
duration integer
)
When I insert data into raw_data, I use this trigger:
begin
    insert into total_duration (id_raw, name, duration)
    values (new.id, new.name, new.duration);
    return new;
end;
My question is how to insert a row that stores the sum of duration; the trigger above only copies the inserted data from raw_data.
I have tried making another trigger to insert the sum row, but the result is not as expected.
The expected result is like this:
id_raw | name | total
-------+------+------
1001   | a    |   450
1001   | b    |   450
1002   | c    |   450
1001   | a,b  |   900
Thank you for your help
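A minimal sketch of one possible approach, assuming an AFTER INSERT trigger and assuming summary rows can be recognized by their comma-separated name list (both are assumptions):
-- Sketch: copy each inserted row, then rebuild the summary row
-- for that id from all of its raw rows.
CREATE OR REPLACE FUNCTION raw_data_insert_totals()
RETURNS trigger AS
$$
BEGIN
    -- copy the inserted row as-is
    INSERT INTO total_duration (id_raw, name, duration)
    VALUES (new.id, new.name, new.duration);

    -- drop the old summary row for this id (identified by the comma in name)
    DELETE FROM total_duration
    WHERE id_raw = new.id AND position(',' in name) > 0;

    -- re-insert the summary: combined name list plus summed duration
    INSERT INTO total_duration (id_raw, name, duration)
    SELECT id, string_agg(name, ',' ORDER BY name), sum(duration)
    FROM raw_data
    WHERE id = new.id
    GROUP BY id
    HAVING count(*) > 1;

    RETURN new;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER raw_data_totals
AFTER INSERT ON raw_data
FOR EACH ROW EXECUTE PROCEDURE raw_data_insert_totals();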

Weighted Random Selection

I have two tables containing the most common first and last names. Each table has basically two fields:
Tables
CREATE TABLE "common_first_name" (
"first_name" text PRIMARY KEY, --The text representing the name
"ratio" numeric NOT NULL, -- the % of how many times it occurs compared to the other names.
"inserted_at" timestamp WITH time zone DEFAULT timezone('utc'::text, now()) NOT NULL,
"updated_at" timestamp WITH time zone DEFAULT timezone('utc'::text, now()) NOT NULL
);
CREATE TABLE "common_last_name" (
"last_name" text PRIMARY KEY, --The text representing the name
"ratio" numeric NOT NULL, -- the % of how many times it occurs compared to the other names.
"inserted_at" timestamp WITH time zone DEFAULT timezone('utc'::text, now()) NOT NULL,
"updated_at" timestamp WITH time zone DEFAULT timezone('utc'::text, now()) NOT NULL
);
P.S.: The top name occurs only ~1.8% of the time. The tables have 1000 rows each.
Function (Pseudo, not READY)
CREATE OR REPLACE FUNCTION create_sample_data(p_number_of_records INT)
RETURNS VOID
AS $$
DECLARE
SUM_OF_WEIGHTS CONSTANT INT := 100;
BEGIN
FOR i IN 1..coalesce(p_number_of_records, 0) LOOP
    -- Get a random first and last name, taking their probability (ratio)
    -- into consideration, e.g. via round(random() * SUM_OF_WEIGHTS)
    -- create_person(random_first_name || ' ' || random_last_name);
END LOOP;
END;
$$
LANGUAGE plpgsql VOLATILE;
P.S.: The ratios in each table sum up to 100%.
I want to run a function N times and get a name and a surname each time to create sample data; both tables have 1000 rows each.
The sample size can be anywhere from 1,000 to 1,000,000 full names, so if there is a "fast" way of doing this weighted random selection, even better.
Any suggestion of how to do it in PL/pgSQL?
I am using PG 13.3 on SUPABASE.IO.
Thanks
Given the small input dataset, it's straightforward to do this in pure SQL. Use CTEs to build lower and upper bound columns for each row in each of the common_FOO_name tables, then use generate_series() to generate sets of random numbers. Join everything together, and use the requirement that the random value lie between the bounds as the WHERE clause.
with first_names_weighted as (
select first_name,
sum(ratio) over (order by first_name) - ratio as lower_bound,
sum(ratio) over (order by first_name) as upper_bound
from common_first_name
),
last_names_weighted as (
select last_name,
sum(ratio) over (order by last_name) - ratio as lower_bound,
sum(ratio) over (order by last_name) as upper_bound
from common_last_name
),
randoms as (
select random() * (select sum(ratio) from common_first_name) as f_random,
random() * (select sum(ratio) from common_last_name) as l_random
from generate_series(1, 32)
)
select r, first_name, last_name
from randoms r
cross join first_names_weighted f
cross join last_names_weighted l
where f.lower_bound <= r.f_random and r.f_random <= f.upper_bound
and l.lower_bound <= r.l_random and r.l_random <= l.upper_bound;
Change the value passed to generate_series() to control how many names to generate. If it's important that it be a function, you can just use a LANGUAGE SQL function definition to parameterize that number:
https://www.db-fiddle.com/f/mmGQRhCP2W1yfhZTm1yXu5/3
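For example, a sketch of such a wrapper (the function name is illustrative):
-- Sketch: the same query wrapped in a LANGUAGE SQL function,
-- with the row count as a parameter.
CREATE OR REPLACE FUNCTION generate_weighted_names(p_count int)
RETURNS TABLE (fname text, lname text)
LANGUAGE sql VOLATILE
AS $$
with first_names_weighted as (
  select first_name,
         sum(ratio) over (order by first_name) - ratio as lower_bound,
         sum(ratio) over (order by first_name) as upper_bound
  from common_first_name
),
last_names_weighted as (
  select last_name,
         sum(ratio) over (order by last_name) - ratio as lower_bound,
         sum(ratio) over (order by last_name) as upper_bound
  from common_last_name
),
randoms as (
  select random() * (select sum(ratio) from common_first_name) as f_random,
         random() * (select sum(ratio) from common_last_name) as l_random
  from generate_series(1, p_count)
)
select f.first_name, l.last_name
from randoms r
cross join first_names_weighted f
cross join last_names_weighted l
where f.lower_bound <= r.f_random and r.f_random <= f.upper_bound
  and l.lower_bound <= r.l_random and r.l_random <= l.upper_bound;
$$;

-- usage: select * from generate_weighted_names(1000);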

Postgres exclude using gist across different tables

I have 2 tables like this
drop table if exists public.table_1;
drop table if exists public.table_2;
CREATE TABLE public.table_1 (
id serial NOT NULL,
user_id bigint not null,
status varchar(255) not null,
date_start date NOT NULL,
date_end date NULL
);
CREATE TABLE public.table_2 (
id serial NOT NULL,
user_id bigint not null,
status varchar(255) not null,
date_start date NOT NULL,
date_end date NULL
);
alter table public.table_1
add constraint my_constraint_1
EXCLUDE USING gist (user_id with =, daterange(date_start, date_end, '[]') WITH &&)
where (status != 'deleted');
alter table public.table_2
add constraint my_constraint_2
EXCLUDE USING gist (user_id with =, daterange(date_start, date_end, '[]') WITH &&)
where (status != 'deleted');
Every table contains rows that are related to a user, and the rows of the same user cannot overlap in range. In addition, some rows may be logically deleted, so I added a WHERE condition.
So far this works without problems, but the two constraints act separately on each table.
I need to create a constraint that covers both tables, so that a single daterange (for the same user, and not deleted) may appear only once across the two different tables.
Can the EXCLUDE constraint be extended to work across different tables, or do I need to check this with a trigger? If a trigger is the answer, what is the simplest way to do this? Create a temporary table with the union of the two, add the constraint on it, and check whether it fails?
Starting from @Laurenz Albe's suggestion, this is what I made:
-- #################### SETUP SAMPLE TABLES ####################
drop table if exists public.table_1;
drop table if exists public.table_2;
CREATE TABLE public.table_1 (
id serial NOT NULL,
user_id bigint not null,
status varchar(255) not null,
date_start date NOT NULL,
date_end date NULL
);
CREATE TABLE public.table_2 (
id serial NOT NULL,
user_id bigint not null,
status varchar(255) not null,
date_start date NOT NULL,
date_end date NULL
);
alter table public.table_1
add constraint my_constraint_1
EXCLUDE USING gist (user_id with =, daterange(date_start, date_end, '[]') WITH &&)
where (status != 'deleted');
alter table public.table_2
add constraint my_constraint_2
EXCLUDE USING gist (user_id with =, daterange(date_start, date_end, '[]') WITH &&)
where (status != 'deleted');
-- #################### SETUP TRIGGER ####################
create or REPLACE FUNCTION check_date_overlap_trigger_hook()
RETURNS trigger as
$body$
DECLARE
l_table text;
l_sql text;
l_row record;
begin
l_table := TG_ARGV[0];
l_sql := format('
    select *
    from public.%I as t                  -- %%I quotes the table name safely
    where
        t.user_id = %s                   -- include only records of the same user
        and t.status != ''deleted''      -- include only records that are active
', l_table, new.user_id);
for l_row in execute l_sql
loop
IF daterange(l_row.date_start, COALESCE(l_row.date_end, 'infinity'::date)) && daterange(new.date_start, COALESCE(new.date_end, 'infinity'::date))
THEN
RAISE EXCEPTION 'Date interval is overlapping with another one in table %', l_table
USING HINT = 'You can''t have the same interval across table1 AND table2';
END IF;
end loop;
RETURN NEW;
end
$body$
LANGUAGE plpgsql;
-- #################### INSTALL TRIGGER ####################
create trigger check_date_overlap
BEFORE insert or update
ON public.table_1
FOR EACH row
EXECUTE PROCEDURE check_date_overlap_trigger_hook('table_2');
create trigger check_date_overlap
BEFORE insert or update
ON public.table_2
FOR EACH row
EXECUTE PROCEDURE check_date_overlap_trigger_hook('table_1');
-- #################### INSERT DEMO ROWS ####################
insert into public.table_1 (user_id, status, date_start, date_end) values (1, 'active', '2020-12-10', '2020-12-20');
insert into public.table_1 (user_id, status, date_start, date_end) values (1, 'deleted', '2020-12-15', '2020-12-25');
insert into public.table_1 (user_id, status, date_start, date_end) values (2, 'active', '2020-12-10', '2020-12-20');
insert into public.table_1 (user_id, status, date_start, date_end) values (2, 'deleted', '2020-12-15', '2020-12-25');
-- This will fail for overlap on the same table
-- insert into public.table_1 (user_id, status, date_start, date_end) values (1, 'active', '2020-12-15', '2020-12-25');
-- This will fail as the user 1 already has an overlapping period on table 1
-- insert into public.table_2 (user_id, status, date_start, date_end) values (1, 'active', '2020-12-15', '2020-12-25');
-- This insert succeeds because the row is 'deleted'...
insert into public.table_2 (user_id, status, date_start, date_end) values (1, 'deleted', '2020-12-15', '2020-12-25');
-- ...but this update will fail, as user 1 already has an overlapping period on table_1
update public.table_2 set status = 'active' where id = 1;
select 'table_1' as src_table, * from public.table_1
union
select 'table_2', * from public.table_2
You can probably use a trigger, but triggers are always vulnerable to race conditions (unless you are using SERIALIZABLE isolation).
If your tables really have the same columns, why don't you use a single table (and perhaps add a type column to disambiguate)?
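A sketch of that single-table variant (the table and column names are illustrative):
-- Sketch: one table with a discriminator column; a single exclusion
-- constraint then covers what used to be two tables. It deliberately
-- does NOT include table_type, so ranges cannot overlap across the two
-- logical tables. Requires btree_gist, as the original constraints already do.
CREATE TABLE public.table_combined (
    id serial NOT NULL,
    table_type varchar(10) NOT NULL,  -- e.g. 'table_1' or 'table_2'
    user_id bigint NOT NULL,
    status varchar(255) NOT NULL,
    date_start date NOT NULL,
    date_end date NULL
);

ALTER TABLE public.table_combined
    ADD CONSTRAINT my_constraint_combined
    EXCLUDE USING gist (user_id WITH =, daterange(date_start, date_end, '[]') WITH &&)
    WHERE (status != 'deleted');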

Copying records in a table with self referencing ids

I have a table whose records can reference another row in the same table, so there is a parent-child relationship between rows in the same table.
What I am trying to achieve is to create the same data for another user, so that they can see and manage their own version of this structure through the web UI, where these rows are displayed as a tree.
The problem is that when I bulk-insert this data changing only user_id, I lose the relationships between rows, because the parent_id values will be invalid for the new records; they need to be updated with the newly generated ids as well.
Here is what I tried (it did not work):
Iterate over main_table
Copy the static values for each row
Do another insert into a temp table holding the old and new ids
Update the old parent_ids with the new ids after the loop ends
My attempt at doing this (the last step is not included here):
create or replace function test_x()
returns void as
$BODY$
declare
    r RECORD;
    userId int8;
    rowPK int8;
begin
    userId := 1;  -- the target user id
    create table if not exists id_map (old_id int8, new_id int8);
    create table if not exists temp_table as select * from main_table;
    for r in select * from temp_table
    loop
        -- insert the copied row and capture its newly generated id
        insert into main_table (id, user_id, code, description, parent_id)
        values (nextval('hibernate_sequence'), userId, r.code, r.description, r.parent_id)
        returning id into rowPK;
        -- remember the old -> new id mapping for the later parent_id fix-up
        insert into id_map (old_id, new_id) values (r.id, rowPK);
    end loop;
end;
$BODY$
language plpgsql;
My PostgreSQL version is 9.6.14.
DDL below for testing.
create table main_table(
id bigserial not null,
user_id int8 not null,
code varchar(3) not null,
description varchar(100) not null,
parent_id int8 null,
constraint mycompkey unique (user_id, code, parent_id),
constraint mypk primary key (id),
constraint myfk foreign key (parent_id) references main_table(id)
);
insert into main_table (id, user_id, code, description, parent_id)
values(0, 0, '01', 'Root row', null);
insert into main_table (id, user_id, code, description, parent_id)
values(1, 0, '001', 'Child row 1', 0);
insert into main_table (id, user_id, code, description, parent_id)
values(2, 0, '002', 'Child row 2', 0);
insert into main_table (id, user_id, code, description, parent_id)
values(3, 0, '002', 'Grand child row 1', 2);
How to write a procedure to accomplish this?
Thanks in advance.
It appears your task is copying all data for a given user to another user while maintaining the hierarchical relationship within the new rows. The following accomplishes that.
It begins by creating a new copy of the existing rows with the new user_id, still carrying the old parent_id. That old parent_id is used in the next (update) step.
The CTE starts from the new rows that have a parent_id and joins them to the old parent row, then joins the old parent row to the new parent row using code and description. At that point we have the new id along with the new parent id, and the final UPDATE simply applies those values. For the update the CTE only needs those two columns, but I've left the intermediate columns in so you can trace through it if you wish.
create or replace function copy_user_data_to_user(
source_user_id bigint
, target_user_id bigint
)
returns void
language plpgsql
as $$
begin
insert into main_table ( user_id,code, description, parent_id )
select target_user_id, code, description, parent_id
from main_table
where user_id = source_user_id ;
with n_list as
(select mt.id, mt.code, mt.description, mt.parent_id
, mtp.id p_id,mtp.code p_code,mtp.description p_des
, mtc.id c_id, mtc.code c_code, mtc.description c_description
from main_table mt
join main_table mtp on mtp.id = mt.parent_id
join main_table mtc on ( mtc.user_id = target_user_id
and mtc.code = mtp.code
and mtc.description = mtp.description
)
where mt.parent_id is not null
and mt.user_id = target_user_id
)
update main_table mt
set parent_id = n_list.c_id
from n_list
where mt.id = n_list.id;
return;
end ;
$$;
-- test
select * from copy_user_data_to_user(0,1);
select * from main_table;
You can also create a new table directly from a query result:
CREATE TABLE new_table AS SELECT * FROM myset;
The new table takes its column names and types from the query. You can list specific columns instead of *, as long as those columns exist in the source; otherwise you get an error.
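For example, a sketch against the main_table above (the new table name is illustrative):
-- Sketch: snapshot one user's rows into a brand-new table;
-- the new table's columns come from the SELECT list.
CREATE TABLE main_table_user0 AS
SELECT id, user_id, code, description, parent_id
FROM main_table
WHERE user_id = 0;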

I'm having an issue with this code when I try to input values into the transactions table

So I'm setting up a schema in which I can input the transactions of a journal entry independently of each other, while still ensuring they reconcile with each other (mainly that debits = credits). I set up the tables, the function, and the trigger. Then, when I try to insert values into the transactions table, I get the error below. I'm doing all of this in pgAdmin 4.
CREATE TABLE transactions (
transactions_id UUID PRIMARY KEY DEFAULT uuid_generate_v1(),
entry_id INTEGER NOT NULL,
post_date DATE NOT NULL,
account_id INTEGER NOT NULL,
contact_id INTEGER NULL,
description TEXT NOT NULL,
reference_id UUID NULL,
document_id UUID NULL,
amount NUMERIC(12,2) NOT NULL
);
CREATE TABLE entries (
id UUID PRIMARY KEY,
test_date DATE NOT NULL,
balance NUMERIC(12,2)
CHECK (balance = 0.00)
);
CREATE OR REPLACE FUNCTION transactions_biut()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $$
BEGIN
EXECUTE 'INSERT INTO entries (id,test_date,balance)
SELECT
entry_id,
post_date,
SUM(amount) AS ''balance''
FROM
transactions
GROUP BY
entry_id;';
END;
$$;
CREATE TRIGGER transactions_biut
BEFORE INSERT OR UPDATE ON transactions
FOR EACH ROW EXECUTE PROCEDURE transactions_biut();
INSERT INTO transactions (
entry_id,
post_date,
account_id,
description,
amount
)
VALUES
(
'1',
'2019-10-01',
'101',
'MISC DEBIT: PAID FOR FACEBOOK ADS',
-200.00
),
(
'1',
'2019-10-01',
'505',
'MISC DEBIT: PAID FOR FACEBOOK ADS',
200.00
);
After I execute this input, I get the following error:
ERROR: column "id" of relation "entries" does not exist
LINE 1: INSERT INTO entries (id,test_date,balance)
^
QUERY: INSERT INTO entries (id,test_date,balance)
SELECT
entry_id,
post_date,
SUM(amount) AS "balance"
FROM
transactions
GROUP BY
entry_id;
CONTEXT: PL/pgSQL function transactions_biut() line 2 at EXECUTE
SQL state: 42703
There are a few problems here:
You're not returning anything from the trigger function => it should probably be return NEW or return OLD, since you're not modifying anything
Since the trigger runs before each row, it's bound to fail for any entry whose balance isn't 0 yet => maybe you want a deferred constraint trigger? (see the sketch after this list)
You're not grouping by post_date, so your SELECT will fail
You've defined entry_id as INTEGER, but entries.id is of type UUID
Also note that this isn't really going to scale (you're summing up all transactions of all days, so this will get slower and slower...)
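A sketch of that deferred variant (the names are illustrative); the check then runs at commit time, when all rows of the journal entry are present:
-- Sketch: a deferrable constraint trigger that verifies the entry
-- balances to zero at commit rather than per row at insert time.
CREATE OR REPLACE FUNCTION check_entry_balance()
RETURNS trigger
LANGUAGE plpgsql
AS $$
DECLARE
    v_balance numeric(12,2);
BEGIN
    SELECT sum(amount) INTO v_balance
    FROM transactions
    WHERE entry_id = NEW.entry_id;

    IF v_balance <> 0.00 THEN
        RAISE EXCEPTION 'entry % does not balance (sum = %)', NEW.entry_id, v_balance;
    END IF;
    RETURN NULL;  -- result is ignored for AFTER triggers
END;
$$;

CREATE CONSTRAINT TRIGGER transactions_balance_check
AFTER INSERT OR UPDATE ON transactions
DEFERRABLE INITIALLY DEFERRED
FOR EACH ROW EXECUTE PROCEDURE check_entry_balance();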
@chirs I was able to figure out how to create a functioning solution using statement-level triggers:
CREATE TABLE transactions (
transactions_id UUID PRIMARY KEY DEFAULT uuid_generate_v1(),
entry_id INTEGER NOT NULL,
post_date DATE NOT NULL,
account_id INTEGER NOT NULL,
contact_id INTEGER NULL,
description TEXT NOT NULL,
reference_id UUID NULL,
document_id UUID NULL,
amount NUMERIC(12,2) NOT NULL
);
CREATE TABLE entries (
entry_id INTEGER PRIMARY KEY,
post_date DATE NOT NULL,
balance NUMERIC(12,2),
CHECK (balance = 0.00)
);
CREATE OR REPLACE FUNCTION transactions_entries() RETURNS TRIGGER AS $$
BEGIN
IF (TG_OP = 'DELETE') THEN
INSERT INTO entries
SELECT o.entry_id, o.post_date, SUM(o.amount) FROM old_table o GROUP BY o.entry_id, o.post_date;
ELSIF (TG_OP = 'UPDATE') THEN
INSERT INTO entries
SELECT o.entry_id, n.post_date, SUM(n.amount) FROM new_table n, old_table o GROUP BY o.entry_id, n.post_date;
ELSIF (TG_OP = 'INSERT') THEN
INSERT INTO entries
SELECT n.entry_id,n.post_date, SUM(n.amount) FROM new_table n GROUP BY n.entry_id, n.post_date;
END IF;
RETURN NULL; -- result is ignored since this is an AFTER trigger
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER transactions_ins
AFTER INSERT ON transactions
REFERENCING NEW TABLE AS new_table
FOR EACH STATEMENT EXECUTE PROCEDURE transactions_entries();
CREATE TRIGGER transactions_upd
AFTER UPDATE ON transactions
REFERENCING OLD TABLE AS old_table NEW TABLE AS new_table
FOR EACH STATEMENT EXECUTE PROCEDURE transactions_entries();
CREATE TRIGGER transactions_del
AFTER DELETE ON transactions
REFERENCING OLD TABLE AS old_table
FOR EACH STATEMENT EXECUTE PROCEDURE transactions_entries();
Any thoughts on optimization?
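One possible direction, sketched for the INSERT case only and under the assumption that entries should hold a running balance per entry_id: upsert the per-statement delta instead of inserting fresh aggregate rows, which also avoids primary-key conflicts when the same entry is touched by several statements.
-- Sketch: maintain a running balance per entry with an upsert over the
-- transition table; only entries touched by this statement are updated.
CREATE OR REPLACE FUNCTION transactions_entries_upsert() RETURNS TRIGGER AS $$
BEGIN
    INSERT INTO entries (entry_id, post_date, balance)
    SELECT n.entry_id, min(n.post_date), sum(n.amount)
    FROM new_table n
    GROUP BY n.entry_id
    ON CONFLICT (entry_id)
    DO UPDATE SET balance = entries.balance + EXCLUDED.balance;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;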