Race condition in partitioning with dynamic table creation


I'm trying to implement table partitioning with dynamic table creation, using a BEFORE INSERT trigger to create new tables and indexes when necessary. This is the solution I'm using:
create table mylog (
mylog_id serial not null primary key,
ts timestamp(0) not null default now(),
data text not null
);
CREATE OR REPLACE FUNCTION mylog_insert() RETURNS trigger AS
$BODY$
DECLARE
_name text;
_from timestamp(0);
_to timestamp(0);
BEGIN
SELECT into _name 'mylog_'||replace(substring(date_trunc('day', new.ts)::text from 0 for 11), '-', '');
IF NOT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name=_name) then
SELECT into _from date_trunc('day', new.ts)::timestamp(0);
SELECT into _to _from + INTERVAL '1 day';
EXECUTE 'CREATE TABLE '||_name||' () INHERITS (mylog)';
EXECUTE 'ALTER TABLE '||_name||' ADD CONSTRAINT ts_check CHECK (ts >= '||quote_literal(_from)||' AND ts < '||quote_literal(_to)||')';
EXECUTE 'CREATE INDEX '||_name||'_ts_idx on '||_name||'(ts)';
END IF;
EXECUTE 'INSERT INTO '||_name||' (ts, data) VALUES ($1, $2)' USING
new.ts, new.data;
RETURN null;
END;
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER mylog_insert
BEFORE INSERT
ON mylog
FOR EACH ROW
EXECUTE PROCEDURE mylog_insert();
Everything works as expected, but each day, when concurrent INSERT statements fire for the first time that day, one of them fails while trying to create a table that already exists. I suspect this happens because the triggers fire concurrently, both try to create the new table, and only one can succeed.
I could use CREATE TABLE IF NOT EXISTS, but then I cannot detect the outcome, so I cannot reliably create the constraints and indexes.
What can I do to avoid this problem? Is there any way to signal to other concurrent triggers that the table has already been created? Or is there a way of knowing whether CREATE TABLE IF NOT EXISTS created a new table or not?

What I do is create a pgAgent job that runs every day and creates the tables 3 months ahead of time.
CREATE OR REPLACE FUNCTION avl_db.create_alltables()
RETURNS numeric AS
$BODY$
DECLARE
rec record;
BEGIN
FOR rec IN
SELECT date_trunc('day', i::timestamp without time zone) as table_day
FROM generate_series(now()::date,
now()::date + '3 MONTH'::interval,
'1 DAY'::interval) as i
LOOP
PERFORM avl_db.create_table (rec.table_day);
END LOOP;
PERFORM avl_db.avl_partition(now()::date,
now()::date + '3 MONTH'::interval);
RETURN 0;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION avl_db.create_alltables()
OWNER TO postgres;
create_table is very similar to your CREATE TABLE code.
avl_partition updates the BEFORE INSERT trigger, but I see you do that part with a dynamic query. I will have to check that again.
Also, I see you are using inheritance, but you are missing a very important constraint:
CONSTRAINT route_sources_20170601_event_time_check CHECK (
event_time >= '2017-06-01 00:00:00'::timestamp without time zone
AND event_time < '2017-06-02 00:00:00'::timestamp without time zone
)
This improves queries that search on event_time a lot, because the planner doesn't have to check every table.
See how it doesn't check all the tables for the month:

Eventually I wrapped CREATE TABLE in a BEGIN ... EXCEPTION block that catches the duplicate_table exception. This did the trick, but creating the tables up front in a cron job is a much better approach performance-wise.
CREATE OR REPLACE FUNCTION mylog_insert() RETURNS trigger AS
$BODY$
DECLARE
_name text;
_from timestamp(0);
_to timestamp(0);
BEGIN
SELECT into _name 'mylog_'||replace(substring(date_trunc('day', new.ts)::text from 0 for 11), '-', '');
IF NOT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name=_name) then
SELECT into _from date_trunc('day', new.ts)::timestamp(0);
SELECT into _to _from + INTERVAL '1 day';
BEGIN
EXECUTE 'CREATE TABLE '||_name||' () INHERITS (mylog)';
EXECUTE 'ALTER TABLE '||_name||' ADD CONSTRAINT ts_check CHECK (ts >= '||quote_literal(_from)||' AND ts < '||quote_literal(_to)||')';
EXECUTE 'CREATE INDEX '||_name||'_ts_idx on '||_name||'(ts)';
EXCEPTION WHEN duplicate_table THEN
RAISE NOTICE 'table exists -- ignoring';
END;
END IF;
EXECUTE 'INSERT INTO '||_name||' (ts, data) VALUES ($1, $2)' USING
new.ts, new.data;
RETURN null;
END;
$BODY$
LANGUAGE plpgsql;
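
The question also asks whether concurrent triggers can signal each other. One option, shown here only as a sketch, is a transaction-scoped advisory lock keyed on the table name, followed by a second existence check once the lock is held. The function name mylog_ensure_partition is hypothetical. The sketch assumes PostgreSQL 9.6 or later for to_regclass(text), and it relies on hashtext(), an internal hashing function that exists in PostgreSQL but is not documented as a stable API.

```sql
-- Hypothetical helper: creates the daily child table at most once, even when
-- several sessions ask for the same day at the same time.
CREATE OR REPLACE FUNCTION mylog_ensure_partition(_day date) RETURNS text AS
$BODY$
DECLARE
    _name text := 'mylog_' || to_char(_day, 'YYYYMMDD');
BEGIN
    -- Sessions asking for the same table name queue up here. The lock is
    -- released automatically when the holding transaction commits or rolls back.
    PERFORM pg_advisory_xact_lock(hashtext(_name));

    -- Check again after acquiring the lock: another session may have created
    -- the table while this one was waiting.
    IF to_regclass(_name) IS NULL THEN
        EXECUTE format('CREATE TABLE %I () INHERITS (mylog)', _name);
        EXECUTE format(
            'ALTER TABLE %I ADD CONSTRAINT ts_check CHECK (ts >= %L AND ts < %L)',
            _name, _day::timestamp(0), (_day + 1)::timestamp(0));
        EXECUTE format('CREATE INDEX %I ON %I (ts)', _name || '_ts_idx', _name);
    END IF;

    RETURN _name;
END;
$BODY$
LANGUAGE plpgsql;
```

A trigger could call it as _name := mylog_ensure_partition(new.ts::date) before running the dynamic INSERT. Every insert then takes the advisory lock, so pre-creating the tables remains the cheaper option.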

Related

Create partition table using execute

I would like to create N partition tables for the last N days. I have created a table like the following
create table metrics.my_table (
id bigserial NOT NULL primary key,
...
logdate date NOT NULL
) PARTITION BY LIST (logdate);
Then I have the following function to create those tables:
CREATE OR REPLACE function metrics.create_my_partitions(init_date numeric default 30, current_date_parameter timestamp default current_date)
returns void as $$
DECLARE
partition_date TEXT;
partition_name TEXT;
begin
for cnt in 0..init_date loop
partition_date := to_char((current_date_parameter - (cnt * interval '1 day')),'YYYY-MM-DD');
raise notice 'cnt: %', cnt;
raise notice 'partition_date: %', partition_date;
partition_name := 'my_table_' || partition_date;
raise notice 'partition_name: %', partition_name;
EXECUTE format('CREATE table if not exists metrics.%I PARTITION OF metrics.my_table for VALUES IN ($1)', partition_name) using partition_date;
end loop;
END
$$
LANGUAGE plpgsql;
select metrics.create_my_partitions(30, current_date);
But it throws the following error in the EXECUTE format line:
SQL Error [42P02]: ERROR: there is no parameter $1
Any idea on how to create those tables?

The EXECUTE ... USING ... option only works for data values in DML commands (SELECT, INSERT, etc.). Since CREATE TABLE is a DDL command, pass the value through format() instead:
execute format(
'create table if not exists metrics.%I partition of metrics.my_table for values in (%L)',
partition_name, partition_date);
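
For context, here is a sketch of how the whole function might look with that change applied. The name create_my_partitions_sketch and the parameters days_back and end_date are made up for illustration, and metrics.my_table is assumed to exist already, list-partitioned on logdate as in the question.

```sql
-- Hypothetical rewrite of the function from the question; the function name
-- and parameter names are illustrative, not taken from the original post.
CREATE OR REPLACE FUNCTION metrics.create_my_partitions_sketch(
    days_back integer DEFAULT 30,
    end_date  date    DEFAULT current_date)
RETURNS void AS $$
DECLARE
    partition_day  date;
    partition_name text;
BEGIN
    FOR cnt IN 0..days_back LOOP
        -- date minus integer yields a date
        partition_day  := end_date - cnt;
        partition_name := 'my_table_' || to_char(partition_day, 'YYYY_MM_DD');
        -- %I quotes the identifier and %L quotes the literal, so no bind
        -- parameter ($1) is needed inside the DDL statement.
        EXECUTE format(
            'CREATE TABLE IF NOT EXISTS metrics.%I PARTITION OF metrics.my_table FOR VALUES IN (%L)',
            partition_name, partition_day);
    END LOOP;
END
$$ LANGUAGE plpgsql;
```

It would be called the same way as the original, for example select metrics.create_my_partitions_sketch(30, current_date);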

New columns can't be updated if a trigger BEFORE UPDATE is triggered

I came across a strange behavior (at least to me) with PostgreSQL and a BEFORE UPDATE trigger.
I have a table which has an updated_at column that is set by a BEFORE UPDATE trigger.
I need to add new columns to this table and set their values with an UPDATE query (not with DEFAULT).
It works just fine, except when I do an UPDATE just before adding those columns.
Here's an example:
ALTER TABLE my_schema.my_table ADD COLUMN new_column varchar(50);
UPDATE my_schema.my_table SET new_column = 'new_column_update' WHERE id = xxxxxx;
This script works fine.
But if I do an UPDATE before:
UPDATE my_schema.my_table SET other_column = 'other_column_update' WHERE id = xxxxxx; -- the TRIGGER is triggered
ALTER TABLE my_schema.my_table ADD COLUMN new_column varchar(50);
UPDATE my_schema.my_table SET new_column = 'new_column_update' WHERE id = xxxxxx; -- this UPDATE doesn't do anything
It doesn't work anymore.
After many hours, I found that the BEFORE UPDATE trigger is responsible, but I can't find out why.
I found a workaround: temporarily disabling the trigger.
ALTER TABLE my_table DISABLE TRIGGER update_date;
Here is the code from a dbfiddle that reproduces this behaviour:
CREATE TABLE my_table (
other_column varchar(50),
updated_at timestamp
);
CREATE OR REPLACE FUNCTION update_date()
RETURNS trigger
LANGUAGE plpgsql
COST 1
AS '
BEGIN
IF row(NEW.*) IS DISTINCT FROM row(OLD.*) THEN
NEW.updated_at = now();
RETURN NEW;
ELSE
RETURN OLD;
END IF;
END;
'
;
CREATE TRIGGER update_date BEFORE
UPDATE
ON
my_table FOR EACH ROW EXECUTE PROCEDURE update_date();
INSERT INTO my_table VALUES ('other_column_insert');
UPDATE my_table SET other_column = 'other_column_update';
ALTER TABLE my_table ADD COLUMN new_colum varchar(50);
UPDATE my_table SET new_colum = 'new_colum_update'; -- this UPDATE doesn't work because of the trigger BEFORE UPDATE
-- It is possible to make it work by disabling the trigger BEFORE the first UPDATE
-- ALTER TABLE my_table DISABLE TRIGGER update_date;
Have you ever encountered this behavior?

It's something to do with the (unnecessary) wrapping of NEW/OLD with a ROW(...) constructor:
BEGIN
IF row(NEW.*) IS DISTINCT FROM row(OLD.*) THEN
-- IF NEW IS DISTINCT FROM OLD THEN
NEW.updated_at = now();
ELSE
RAISE EXCEPTION $$NOT DISTINCT: % / %$$, NEW, OLD;
END IF;
RETURN NEW;
END;
I've also moved the RETURN NEW to the end. If you try your version, you should see the exceptions. If you replace the condition with the commented-out one below it, it works.
As to why this fails when you compare rows, I'm not sure, and I'm afraid it's too hot and late on a Friday afternoon where I am to figure it out.

I am going to say this is a caching problem. I modified the function to see what is going on:
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table (
other_column varchar(50),
updated_at timestamp
);
CREATE OR REPLACE FUNCTION public.update_date()
RETURNS trigger
LANGUAGE plpgsql
COST 1
AS $function$
BEGIN
RAISE NOTICE 'New row %', ROW(NEW.*);
RAISE NOTICE 'Old row%', ROW(OLD.*);
RAISE NOTICE 'New.* %', (NEW.*)::text;
RAISE NOTICE 'Old.* %', (OLD.*)::text;
IF NEW.* IS DISTINCT FROM OLD.* THEN
NEW.updated_at = now();
RETURN NEW;
ELSE
RETURN OLD;
END IF;
END;
$function$;
CREATE TRIGGER update_date BEFORE
UPDATE
ON
my_table FOR EACH ROW EXECUTE PROCEDURE update_date();
INSERT INTO my_table VALUES ('other_column_insert');
UPDATE my_table SET other_column = 'other_column_update';
NOTICE: New row (other_column_update,)
NOTICE: Old row(other_column_insert,)
NOTICE: New.* (other_column_update,)
NOTICE: Old.* (other_column_insert,)
ALTER TABLE my_table ADD COLUMN new_colum varchar(50);
UPDATE my_table SET new_colum = 'new_colum_update';
NOTICE: New row (other_column_update,"2022-08-12 10:38:54.815831")
NOTICE: Old row(other_column_update,"2022-08-12 10:38:54.815831")
NOTICE: New.* (other_column_update,"2022-08-12 10:38:54.815831",new_colum_update)
NOTICE: Old.* (other_column_update,"2022-08-12 10:38:54.815831",)
It has to do with the ROW(). Even doing ROW(NEW.*)::my_table or using EXECUTE to make the query dynamic and not use caching does not work.
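
Putting the two answers together, a minimal sketch of the trigger function without the ROW(...) wrapper could look like this. It follows the commented-out comparison suggested in the first answer and is not a verified explanation of the underlying cause.

```sql
-- Sketch: compare the record variables directly instead of row(NEW.*) / row(OLD.*).
CREATE OR REPLACE FUNCTION update_date()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
BEGIN
    IF NEW IS DISTINCT FROM OLD THEN
        NEW.updated_at = now();
    END IF;
    -- Always return NEW so the UPDATE itself is never silently discarded.
    RETURN NEW;
END;
$function$;
```

Because the function name and signature are unchanged, CREATE OR REPLACE swaps the body in place and the existing update_date trigger keeps working.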

Postgresql trigger IF condition to see if destination table has this record

I have a straightforward trigger function that is set to run on UPDATE or INSERT on a table.
When this trigger runs, I want to insert the record into another table, but only if it doesn't already exist there.
I could simply skip the check and let the insert fail, but I feel that's not the best approach.
-- Trigger
CREATE TRIGGER archivelogic_trigger AFTER INSERT OR UPDATE ON entsf.et4ae5__individualemailresult__c
FOR EACH ROW EXECUTE PROCEDURE entsf.archivelogicfunc();
-- Function
CREATE OR REPLACE FUNCTION entsf.archivelogicfunc() RETURNS TRIGGER AS $result_table$
BEGIN
BEGIN
IF (DATE(NEW.et4ae5__datesent__c) < NOW() - INTERVAL '180 days'
AND DATE(NEW.et4ae5__datesent__c) > NOW() - INTERVAL '540 days'
AND NEW.id NOT IN (SELECT id FROM archive.individualemailresult__c)) -- this seems expensive
THEN
INSERT INTO archive.individualemailresult__c
(dateopened__c,
numberoftotalclicks__c,
datebounced__c,
fromname__c,
hardbounce__c,
fromaddress__c,
softbounce__c,
name,
lastmodifieddate,
opened__c,
ownerid,
subjectline__c,
isdeleted,
contact__c,
systemmodstamp,
lastmodifiedbyid,
datesent__c,
dateunsubscribed__c,
createddate,
createdbyid,
lead__c,
tracking_as_of__c,
numberofuniqueclicks__c,
senddefinition__c,
mergeid__c,
triggeredsenddefinition__c,
sfid,
id,
_hc_lastop,
_hc_err)
VALUES
(NEW.et4ae5__dateopened__c,
NEW.et4ae5__numberoftotalclicks__c,
NEW.et4ae5__datebounced__c,
NEW.et4ae5__fromname__c,
NEW.et4ae5__hardbounce__c,
NEW.et4ae5__fromaddress__c,
NEW.et4ae5__softbounce__c,
NEW.name,
NEW.lastmodifieddate,
NEW.et4ae5__opened__c,
NEW.ownerid,
NEW.et4ae5__subjectline__c,
NEW.isdeleted,
NEW.et4ae5__contact__c,
NEW.systemmodstamp,
NEW.lastmodifiedbyid,
NEW.et4ae5__datesent__c,
NEW.et4ae5__dateunsubscribed__c,
NEW.createddate,
NEW.createdbyid,
NEW.et4ae5__lead__c,
NEW.et4ae5__tracking_as_of__c,
NEW.et4ae5__numberofuniqueclicks__c,
NEW.et4ae5__senddefinition__c,
NEW.et4ae5__mergeid__c,
NEW.et4ae5__triggeredsenddefinition__c,
NEW.sfid,
NEW.id,
NEW._hc_lastop,
NEW._hc_err);
END IF;
RETURN NULL;
END;
END;
$result_table$ LANGUAGE plpgsql;
I added a line to my logic that checks whether that ID already exists in the other table, but I'm not sure this is the best way to handle it:
AND NEW.id NOT IN (SELECT id FROM archive.individualemailresult__c)) -- this seems expensive
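
One commonly used alternative to the NOT IN subquery is to let the INSERT itself skip duplicates with ON CONFLICT DO NOTHING. The sketch below assumes PostgreSQL 9.5 or later and a primary key or unique index on archive.individualemailresult__c (id); neither is confirmed by the question. The function name archivelogicfunc_sketch is hypothetical, and the column list is cut down to three of the question's columns purely to keep the example short.

```sql
-- Hypothetical variant of the trigger function with a shortened column list.
CREATE OR REPLACE FUNCTION entsf.archivelogicfunc_sketch() RETURNS trigger AS $$
BEGIN
    IF DATE(NEW.et4ae5__datesent__c) < NOW() - INTERVAL '180 days'
       AND DATE(NEW.et4ae5__datesent__c) > NOW() - INTERVAL '540 days'
    THEN
        INSERT INTO archive.individualemailresult__c (id, sfid, datesent__c)
        VALUES (NEW.id, NEW.sfid, NEW.et4ae5__datesent__c)
        -- Assumes a unique index or primary key on (id); without one this
        -- statement raises an error instead of skipping the row.
        ON CONFLICT (id) DO NOTHING;
    END IF;
    RETURN NULL;  -- the return value of an AFTER row trigger is ignored
END;
$$ LANGUAGE plpgsql;
```

With a unique index in place, the duplicate check becomes an index lookup performed as part of the insert, and two concurrent transactions cannot both insert the same id.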

Transpose generate series date postgresql

Questions about transposing have been asked many times before, but I cannot find a good answer for the case of generate_series with dates, where the columns may vary.
WITH range AS
(SELECT to_char(generate_series('2015-01-01','2015-01-05', interval '1 day'),'YYYY-MM-DD'))
SELECT * FROM range;
The normal output from generate series is:
2015-12-01
2015-12-02
2015-12-03
... and so on
http://sqlfiddle.com/#!15/9eecb7db59d16c80417c72d1e1f4fbf1/5478
But I want it to be columns instead
2015-12-01 2015-12-02 2015-12-03 ...and so on
It seems that crosstab should do the trick, but I only get errors:
select * from crosstab('(SELECT to_char(generate_series('2015-01-01','2015-01-05', interval '1 day'),'YYYY-MM-DD'))')
as ct (dynamic columns?)
How do I get crosstab to work with generate_series(date-date) and different intervals dynamically?
TIA

Taking the question "PostgreSQL query with generated columns" as a reference, you can generate the columns dynamically:
create or replace function sp_test()
returns void as
$$
declare cases character varying;
declare sql_statement text;
begin
drop table if exists temp_series;
create temporary table temp_series as
SELECT to_char(generate_series('2015-01-01','2015-01-02', interval '1 day'),'YYYY-MM-DD') as series;
select string_agg(concat('max(case when t1.series=','''',series,'''',' then t1.series else ''0000-00-00'' end) as ','"', series,'"'),',') into cases from temp_series;
drop table if exists temp_data;
sql_statement=concat('create temporary table temp_data as select ',cases ,'
from temp_series t1');
raise notice '%',sql_statement;
execute sql_statement;
end;
$$
language 'plpgsql';
Call the function in the following way to get the output:
select sp_test(); select * from temp_data;
Updated function, which takes two date parameters:
create or replace function sp_test(start_date timestamp without time zone,end_date timestamp without time zone)
returns void as
$$
declare cases character varying;
declare sql_statement text;
begin
drop table if exists temp_series;
create temporary table temp_series as
SELECT to_char(generate_series(start_date,end_date, interval '1 day'),'YYYY-MM-DD') as series;
select string_agg(concat('max(case when t1.series=','''',series,'''',' then t1.series else ''0000-00-00'' end) as ','"', series,'"'),',') into cases from temp_series;
drop table if exists temp_data;
sql_statement=concat('create temporary table temp_data as select ',cases ,'
from temp_series t1');
raise notice '%',sql_statement;
execute sql_statement;
end;
$$
language 'plpgsql';
Function call:
select sp_test('2015-01-01','2015-01-10'); select * from temp_data;

Infinite recursion detected when updating from function (psycopg2, python 2.7, postgres 9.3)

I have a simple table:
CREATE TABLE IF NOT EXISTS someTable (
row_id smallserial PRIMARY KEY,
name text NOT NULL,
creation_date timestamp with time zone DEFAULT current_timestamp,
last_updated_date timestamp with time zone DEFAULT current_timestamp,
created_by text DEFAULT "current_user"(),
last_updated_by text DEFAULT "current_user"()
);
with the following rule:
CREATE OR REPLACE RULE log_update_some_table AS
ON UPDATE TO someTable
DO ALSO
UPDATE someTable
SET last_updated_date = current_timestamp,
last_updated_by = current_user;
and a very simple function in plpgsql:
CREATE OR REPLACE FUNCTION test_update ()
RETURNS void AS $$
BEGIN
UPDATE someTable
SET name = 'test'
WHERE row_id = 1;
END;
$$ LANGUAGE plpgsql;
One would think the function would run just fine, but I get the following error:
psycopg2.ProgrammingError: infinite recursion detected in rules for relation "sometable"
CONTEXT: SQL statement "UPDATE someTable
SET name = 'test'
WHERE row_id = 1"
PL/pgSQL function test_update() line 3 at SQL statement
Why isn't this working and how do I fix it? Thanks!

So your update rule on someTable triggers an update on someTable which executes the rule which updates someTable which executes the rule...
I'd use a simple trigger instead, something like this:
create or replace function log_update_some_table() returns trigger as $$
begin
NEW.last_updated_date = current_timestamp;
NEW.last_updated_by = current_user;
return NEW;
end;
$$ language plpgsql;
create trigger log_update_some_table_trigger
before update on someTable
for each row execute procedure log_update_some_table();
That should do the trick. It modifies the row before the update happens, rather than adding another update (which triggers the recursion problem) to the queue.
