Auto-partitioning trigger doesn't work as expected - postgresql

I'm trying to implement auto-partitioning of a table:
CREATE TABLE incoming_ais_messages (
id uuid NOT NULL,
"source" int4 NOT NULL,
ais_channel varchar(8) NOT NULL,
is_read bool NOT NULL,
"time_stamp" timestamptz NOT null,
address_type varchar(32) NOT NULL,
"text" varchar NOT NULL,
CONSTRAINT incoming_ais_messages_pkey PRIMARY KEY (id,time_stamp)
) partition by range ("time_stamp");
For that I use a function:
create or replace function create_partition() returns trigger as $auto_partition$
begin
raise notice 'create_partion called';
execute 'create table if not exists incoming_ais_messages_partition_' || to_char(now()::date, 'yyyy_mm_dd') || ' partition of incoming_ais_messages
for values from (''' || to_char(now()::date, 'yyyy-mm-dd') || ''') to (''' || to_char((now() + interval '1 day')::date, 'yyyy-mm-dd') || ''');';
return new;
end;
$auto_partition$ language plpgsql;
And a trigger that should call it before any inserts:
create trigger auto_partition
before insert on incoming_ais_messages
for each row
execute procedure create_partition();
However when I insert something like:
INSERT INTO incoming_ais_messages (id, "source", ais_channel, is_read, "time_stamp", address_type, "text")
VALUES('123e4567-e89b-12d3-a456-426614174000'::uuid, 0, 'A', false, now(), 'DIRECT', 'text');
I get the error:
SQL Error [23514]: ERROR: no partition of relation "incoming_ais_messages" found for row
Detail: Partition key of the failing row contains (time_stamp) = (2022-07-21 18:01:41.787604+03).
After that I created the partition manually:
create table if not exists incoming_ais_messages_partition_1970_01_01 partition of incoming_ais_messages
for values from (now()::date) to ((now() + interval '1 day')::date);
executed the same insert statement and got the error:
SQL Error [55006]: ERROR: cannot CREATE TABLE .. PARTITION OF "incoming_ais_messages" because it is being used by active queries in this session
Where: SQL statement "create table if not exists incoming_ais_messages_partition_2022_07_21 partition of incoming_ais_messages
for values from ('2022-07-21') to ('2022-07-22');"
PL/pgSQL function create_partition() line 4 at EXECUTE
Would be great to know what is wrong here. My solution is based on the approach described here https://evilmartians.com/chronicles/a-slice-of-life-table-partitioning-in-postgresql-databases
(Section: Bonus: how to create partitions)

PostgreSQL wants to know which partition the new rows will go into before it calls BEFORE ROW triggers, so the error is thrown before the CREATE gets a chance to run. (Note that the blog example uses a trigger on one table to create a partition for a different table.)
Doing what you want is possible (the timescaledb extension does it, and you could research how if you want), but do yourself a favor and just pre-create a lot of partitions, then add a note to your calendar to add more in the future (and to drop old ones). Or write a cron job to do it.
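As a rough illustration of the cron-job route, here is a minimal Python sketch that only builds the DDL text for a run of daily partitions. The table name and the naming pattern are taken from the question. The helper names `partition_ddl` and `upcoming_partitions` are hypothetical, and how the statements reach the server (for example by piping them to psql) is left as an assumption.

```python
from datetime import date, timedelta

PARENT = "incoming_ais_messages"  # parent table name from the question


def partition_ddl(day: date) -> str:
    """Build the CREATE TABLE statement for the one-day partition starting at `day`."""
    next_day = day + timedelta(days=1)
    name = f"{PARENT}_partition_{day:%Y_%m_%d}"
    return (
        f"create table if not exists {name} partition of {PARENT} "
        f"for values from ('{day.isoformat()}') to ('{next_day.isoformat()}');"
    )


def upcoming_partitions(start: date, days_ahead: int) -> list[str]:
    """DDL for `days_ahead` consecutive daily partitions, beginning at `start`."""
    return [partition_ddl(start + timedelta(days=i)) for i in range(days_ahead)]
```

Calling `upcoming_partitions(date.today(), 30)` from a scheduled script yields thirty statements. Because each one uses `if not exists`, running the job again should be harmless.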

Related

Get row number of row to be inserted in Postgres trigger that gives no collisions when inserting multiple rows

Given the following (simplified) schema:
CREATE TABLE period (
id UUID NOT NULL DEFAULT uuid_generate_v4(),
name TEXT
);
CREATE TABLE course (
id UUID NOT NULL DEFAULT uuid_generate_v4(),
name TEXT
);
CREATE TABLE registration (
id UUID NOT NULL DEFAULT uuid_generate_v4(),
period_id UUID NOT NULL REFERENCES period(id),
course_id UUID NOT NULL REFERENCES course(id),
inserted_at timestamptz NOT NULL DEFAULT now()
);
I now want to add a new column client_ref, which identifies a registration unique within a period, but consists of only a 4-character string. I want to use pg_hashids - which requires a unique integer input - to base the column value on.
I was thinking of setting up a trigger on the registration table that runs on inserting a new row. I came up with the following:
CREATE OR REPLACE FUNCTION set_client_ref()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
DECLARE
next_row_number integer;
BEGIN
WITH rank AS (
SELECT
period.id AS period_id,
row_number() OVER (PARTITION BY period.id ORDER BY registration.inserted_at)
FROM
registration
JOIN period ON registration.period_id = period.id ORDER BY
period.id,
row_number
)
SELECT
COALESCE(rank.row_number, 0) + 1 INTO next_row_number
FROM
period
LEFT JOIN rank ON (rank.period_id = period.id)
WHERE
period.id = NEW.period_id
ORDER BY
rank.row_number DESC
LIMIT 1;
NEW.client_ref = id_encode (next_row_number);
RETURN NEW;
END
$function$
;
The trigger is set up like this: CREATE TRIGGER set_client_ref BEFORE INSERT ON registration FOR EACH ROW EXECUTE FUNCTION set_client_ref();
This works as expected when inserting a single row into registration, but if I insert multiple rows within one statement, they all end up with the same client_ref. I can reason about why this happens (the rows don't know about each other's existence, so each assumes it is next in line when retrieving its row_number), but I am not sure how to prevent it. I tried setting up the trigger as an AFTER trigger, but it resulted in the same (duplicated) behaviour.
What would be a better way to get the lowest possible, unique integer for the rows to be inserted (to base the hash function on) that also works when inserting multiple rows?

Trying to automatically insert into a table using triggers in POSTGRESQL

I am trying to make a trigger and function that inserts into the table purchases the values which have been inserted into the table customers.
Columns of table customers:
1- customer_id serial PK (references customer_id in purchases)
2- c_name VARCHAR
3- amount DOUBLE PRECISION
Columns of table purchases:
1- customer_id serial PK
2- amount DOUBLE PRECISION
The code for the trigger and the function:
CREATE OR REPLACE FUNCTION auto_insert_purchases()
RETURNS TRIGGER
LANGUAGE PLPGSQL
AS
$body$
BEGIN
insert into purchases(customer_id,purchase) values
(NEW.customer_id,NEW.purchase);
END
$body$
CREATE TRIGGER tr_auto_insert_purchases
AFTER INSERT ON customers
EXECUTE PROCEDURE auto_insert_purchases()
As you can see, it's supposed to take the new row's data and insert it into the table, but after doing an insertion into customers like this:
insert into customers values(2,'Stewie Griffin',4.99);
I get this error message:
ERROR: null value in column "customer_id" of relation "purchases" violates not-null constraint
DETAIL: Failing row contains (null, null).
CONTEXT: SQL statement "insert into purchases(customer_id,purchase) values
(NEW.customer_id,NEW.purchase)"
PL/pgSQL function auto_insert_purchases() line 3 at SQL statement
SQL state: 23502
Why does the failing row contain null? Am I using the NEW keyword incorrectly?
CREATE TABLE customers (
customer_id int4 NULL,
c_name varchar NULL,
amount float8 NULL
);
CREATE TABLE purchases (
customer_id int4 NULL,
amount float8 NULL
);
CREATE OR REPLACE FUNCTION auto_insert_purchases()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
BEGIN
insert into purchases(customer_id, amount) values
(NEW.customer_id, NEW.amount);
return new;
END;
$function$
;
create trigger tr_auto_insert_purchases
after insert ON customers
for each row
execute procedure auto_insert_purchases();
insert into customers(customer_id, c_name, amount) values (2,'Stewie Griffin', 4.99);
select * from purchases;
-- Result:
customer_id|amount|
-----------+------+
2| 4.99|
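Maybe you just forgot to write the FOR EACH ROW clause after CREATE TRIGGER tr_auto_insert_purchases AFTER INSERT ON customers. Without it the trigger defaults to FOR EACH STATEMENT, where NEW is not populated, hence the nulls. Note that the version above also inserts NEW.amount instead of the non-existent NEW.purchase and returns a value from the function.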

Postgres SQL Table Partitioning by Range Timestamp not Unique key Collision

I have an issue when trying to modify an existing PostgreSQL (version 13.3) table to support partitioning. It gets stuck when inserting the data from the old table into the new one, because the inserted timestamp may not be unique in some cases, so the insert fails.
Partitioning seems to force me to make the range (timestamp) column the primary key. You can see the new table definition below:
CREATE TABLE "UserFavorites_master" (
"Id" int4 NOT NULL GENERATED BY DEFAULT AS IDENTITY,
"UserId" int4 NOT NULL,
"CardId" int4 NOT NULL,
"CreationDate" timestamp NOT NULL,
CONSTRAINT "PK_UserFavorites_CreationDate" PRIMARY KEY ("CreationDate")
) partition by range ("CreationDate");
The original table didn't have a constraint requiring the timestamp to be unique or a primary key, nor would we particularly want one, but that seems to be a requirement of partitioning. I'm looking for alternatives or good ideas to solve the issue.
You can see the full code below:
alter table "UserFavorites" rename to "UserFavorites_old";
CREATE TABLE "UserFavorites_master" (
"Id" int4 NOT NULL GENERATED BY DEFAULT AS IDENTITY,
"UserId" int4 NOT NULL,
"CardId" int4 NOT NULL,
"CreationDate" timestamp NOT NULL,
CONSTRAINT "PK_UserFavorites_CreationDate" PRIMARY KEY ("CreationDate")
) partition by range ("CreationDate");
-- From reference: https://stackoverflow.com/a/53600145/1190540
create or replace function createPartitionIfNotExists(forDate timestamp) returns void
as $body$
declare yearStart date := date_trunc('year', forDate);
declare yearEndExclusive date := yearStart + interval '1 year';
declare tableName text := 'UserFavorites_Partition_' || to_char(forDate, 'YYYY');
begin
if to_regclass(tableName) is null then
execute format('create table %I partition of "UserFavorites_master" for values from (%L) to (%L)', tableName, yearStart, yearEndExclusive);
-- Unfortunately Postgres forces us to define an index for each table individually:
--execute format('create unique index on %I (%I)', tableName, 'UserId'::text);
end if;
end;
$body$ language plpgsql;
do
$$
declare rec record;
begin
-- loop over the years to cover (the FOR loop declares its own integer rec)
for rec in 2015..2030 loop
-- ... and create a partition for them
perform createPartitionIfNotExists(to_date(rec::varchar,'yyyy'));
end loop;
end
$$;
create or replace view "UserFavorites" as select * from "UserFavorites_master";
insert into "UserFavorites" ("Id", "UserId", "CardId", "CreationDate") select * from "UserFavorites_old";
It fails on the last line with the following error:
SQL Error [23505]: ERROR: duplicate key value violates unique constraint "UserFavorites_Partition_2020_pkey"
Detail: Key ("CreationDate")=(2020-11-02 09:38:54.997) already exists.
No, partitioning doesn't force you to create a primary key. Just omit that line, and your example should work.
However, you definitely always should have a primary key on your tables. Otherwise, you can end up with identical rows, which is a major headache in a relational database. You might have to clean up your data.
@Laurenz Albe is correct. It seems I can also specify a primary key over multiple columns, though it may affect performance as referenced here: Multiple Keys Performance. Even indexing the creation date of the partition seemed to make the performance worse.
You can see an example with a multi-column key below; your mileage may vary.
CREATE TABLE "UserFavorites_master" (
"Id" int4 NOT NULL GENERATED BY DEFAULT AS IDENTITY,
"UserId" int4 NOT NULL,
"CardId" int4 NOT NULL,
"CreationDate" timestamp NOT NULL,
CONSTRAINT "PK_UserFavorites" PRIMARY KEY ("Id", "CreationDate")
) partition by range ("CreationDate");

I'm having an issue with this code when I try to input values into the transactions table

So I'm setting up a schema in which I can input transactions of a journal entry independent of each other but also that rely on each other (mainly to ensure that debits = credits). I set up the tables, function, and trigger. Then, when I try to input values into the transactions table, I get the error below. I'm doing all of this in pgAdmin4.
CREATE TABLE transactions (
transactions_id UUID PRIMARY KEY DEFAULT uuid_generate_v1(),
entry_id INTEGER NOT NULL,
post_date DATE NOT NULL,
account_id INTEGER NOT NULL,
contact_id INTEGER NULL,
description TEXT NOT NULL,
reference_id UUID NULL,
document_id UUID NULL,
amount NUMERIC(12,2) NOT NULL
);
CREATE TABLE entries (
id UUID PRIMARY KEY,
test_date DATE NOT NULL,
balance NUMERIC(12,2)
CHECK (balance = 0.00)
);
CREATE OR REPLACE FUNCTION transactions_biut()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $$
BEGIN
EXECUTE 'INSERT INTO entries (id,test_date,balance)
SELECT
entry_id,
post_date,
SUM(amount) AS ''balance''
FROM
transactions
GROUP BY
entry_id;';
END;
$$;
CREATE TRIGGER transactions_biut
BEFORE INSERT OR UPDATE ON transactions
FOR EACH ROW EXECUTE PROCEDURE transactions_biut();
INSERT INTO transactions (
entry_id,
post_date,
account_id,
description,
amount
)
VALUES
(
'1',
'2019-10-01',
'101',
'MISC DEBIT: PAID FOR FACEBOOK ADS',
-200.00
),
(
'1',
'2019-10-01',
'505',
'MISC DEBIT: PAID FOR FACEBOOK ADS',
200.00
);
After I execute this input, I get the following error:
ERROR: column "id" of relation "entries" does not exist
LINE 1: INSERT INTO entries (id,test_date,balance)
^
QUERY: INSERT INTO entries (id,test_date,balance)
SELECT
entry_id,
post_date,
SUM(amount) AS "balance"
FROM
transactions
GROUP BY
entry_id;
CONTEXT: PL/pgSQL function transactions_biut() line 2 at EXECUTE
SQL state: 42703
There are a few problems here:
You're not returning anything from the trigger function => should probably be return NEW or return OLD since you're not modifying anything
Since you're executing the trigger before each row, it's bound to fail for any transaction that isn't 0 => maybe you want a deferred constraint trigger?
You're not grouping by post_date, so your select should fail
You've defined entry_id as INTEGER, but entries.id is of type UUID
Also note that this isn't really going to scale (you're summing up all transactions of all days, so this will get slower and slower...)
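To make the second and third points concrete outside the database, here is a small Python sketch of the invariant being enforced: amounts are summed per (entry_id, post_date) and a group counts as balanced only when the total is zero. The function names and the tuple layout are hypothetical. This only models the arithmetic and is not a replacement for a trigger.

```python
from collections import defaultdict
from decimal import Decimal


def entry_balances(transactions):
    """Sum amounts per (entry_id, post_date), like GROUP BY entry_id, post_date."""
    totals = defaultdict(Decimal)  # missing keys start at Decimal('0')
    for entry_id, post_date, amount in transactions:
        totals[(entry_id, post_date)] += Decimal(amount)
    return dict(totals)


def unbalanced(transactions):
    """Return the groups whose debits and credits do not cancel out."""
    return {
        key: total
        for key, total in entry_balances(transactions).items()
        if total != 0
    }
```

With the two rows from the question, `unbalanced` returns an empty dict once both rows are present, but it reports a total of -200.00 when only the first row is considered. That is why a check that runs before each row rejects the first half of a journal entry.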
@chirs I was able to figure out how to create a functioning solution using statement-level triggers:
CREATE TABLE transactions (
transactions_id UUID PRIMARY KEY DEFAULT uuid_generate_v1(),
entry_id INTEGER NOT NULL,
post_date DATE NOT NULL,
account_id INTEGER NOT NULL,
contact_id INTEGER NULL,
description TEXT NOT NULL,
reference_id UUID NULL,
document_id UUID NULL,
amount NUMERIC(12,2) NOT NULL
);
CREATE TABLE entries (
entry_id INTEGER PRIMARY KEY,
post_date DATE NOT NULL,
balance NUMERIC(12,2),
CHECK (balance = 0.00)
);
CREATE OR REPLACE FUNCTION transactions_entries() RETURNS TRIGGER AS $$
BEGIN
IF (TG_OP = 'DELETE') THEN
INSERT INTO entries
SELECT o.entry_id, o.post_date, SUM(o.amount) FROM old_table o GROUP BY o.entry_id, o.post_date;
ELSIF (TG_OP = 'UPDATE') THEN
INSERT INTO entries
SELECT o.entry_id, n.post_date, SUM(n.amount) FROM new_table n, old_table o GROUP BY o.entry_id, n.post_date;
ELSIF (TG_OP = 'INSERT') THEN
INSERT INTO entries
SELECT n.entry_id,n.post_date, SUM(n.amount) FROM new_table n GROUP BY n.entry_id, n.post_date;
END IF;
RETURN NULL; -- result is ignored since this is an AFTER trigger
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER transactions_ins
AFTER INSERT ON transactions
REFERENCING NEW TABLE AS new_table
FOR EACH STATEMENT EXECUTE PROCEDURE transactions_entries();
CREATE TRIGGER transactions_upd
AFTER UPDATE ON transactions
REFERENCING OLD TABLE AS old_table NEW TABLE AS new_table
FOR EACH STATEMENT EXECUTE PROCEDURE transactions_entries();
CREATE TRIGGER transactions_del
AFTER DELETE ON transactions
REFERENCING OLD TABLE AS old_table
FOR EACH STATEMENT EXECUTE PROCEDURE transactions_entries();
Any thoughts on optimization?

After using a trigger - ERROR: null value in column "group_id" violates not-null constraint

I'm using PostgreSQL 8.1.23 on x86_64-redhat-linux-gnu
I have to write a database for reserving seats on language courses, and one requirement is a trigger that checks whether the lector we're trying to assign to a new group already has another group at the same time. I have this table:
CREATE TABLE groups (
group_id serial PRIMARY KEY,
lang varchar(3) NOT NULL,
level varchar(3),
seats int4,
lector int4,
start time,
day varchar(3),
FOREIGN KEY (lang) REFERENCES languages(lang) ON UPDATE CASCADE ON DELETE CASCADE,
FOREIGN KEY (lector) REFERENCES lectors(lector_id) ON UPDATE CASCADE ON DELETE SET NULL);
and this trigger:
CREATE FUNCTION if_available () RETURNS trigger AS '
DECLARE
r groups%rowtype;
c groups%rowtype;
BEGIN
FOR r IN SELECT * FROM groups WHERE r.lector=NEW.lector ORDER BY group_id LOOP
IF (r.start = NEW.start AND r.day = NEW.day) THEN
RAISE NOTICE ''Lector already has a group at this time!'';
c = NULL;
EXIT;
ELSE
c = NEW;
END IF;
END LOOP;
RETURN c;
END;
' LANGUAGE 'plpgsql';
CREATE TRIGGER if_available_t
BEFORE INSERT OR UPDATE ON groups
FOR EACH ROW EXECUTE PROCEDURE if_available();
When inserting a new row into the groups table, e.g.:
INSERT INTO groups (lang, level, seats, lector, start, day) values ('ger','A-2',12,2,'11:45','wed');
I get an error like this:
ERROR: null value in column "group_id" violates not-null constraint
Without this trigger everything is OK. Could anybody help me make it work?
Finally, I have solved it! After BEGIN there should be c = NEW;, because when the groups table is empty at the beginning, the FOR loop doesn't run and NULL is returned. I also changed the condition in the FOR loop to ...WHERE lector = NEW.lector.... And finally, I changed the condition in the IF to IF (r.group_id <> NEW.group_id AND r.start = NEW.start AND r.day = NEW.day) THEN..., because I didn't want an update of one particular row to be blocked by that row itself. Maybe this will be helpful for someone :)
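The three fixes described above can be sketched as plain Python, purely to show the decision logic. The function name `accept_row` and the dict-based rows are hypothetical stand-ins for the trigger and the table. Returning False corresponds to the trigger returning NULL and skipping the row.

```python
def accept_row(existing, new):
    """Return True if `new` may be stored, False if the lector is already booked.

    `existing` is a list of dicts with keys group_id, lector, start, day;
    `new` is a dict of the same shape (hypothetical stand-ins for table rows).
    """
    for row in existing:
        if row["lector"] != new["lector"]:
            continue  # only this lector's groups matter
        if row["group_id"] == new["group_id"]:
            continue  # the row being updated must not conflict with itself
        if row["start"] == new["start"] and row["day"] == new["day"]:
            return False  # clash: same lector, same day and time
    return True  # also reached when the table is empty
```

The final `return True` plays the role of the initial c = NEW;: with no existing rows the loop body never runs, and the new row is accepted rather than dropped.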