Commit changes after each iteration inside a function in PostgreSQL

I am trying to commit the changes after each iteration inside this function, so that my progress is saved if the function is stopped in the middle of its work. What should I do?
CREATE OR REPLACE FUNCTION delete_in_loop()
RETURNS INTEGER AS $$
DECLARE
counter INTEGER = 0 ;
i INTEGER = 0 ;
BEGIN
i = (select COUNT("ID") from "AwsSesNotification"
where "UTADateCreatedOn" < (now() - interval '3 month'))/1000 + 1 ;
LOOP
EXIT WHEN counter > i ;
counter = counter+1;
delete from "AwsSesNotification"
where "ID" in(
select "ID" from "AwsSesNotification"
where "UTADateCreatedOn" < (now() - interval '3 month')
limit 1000
);
RAISE NOTICE 'Counter: %', counter;
RAISE NOTICE 'From: %', i;
PERFORM pg_sleep(2);
END LOOP;
return counter;
END;
$$ LANGUAGE plpgsql;

This is known as autonomous transactions, which PostgreSQL unfortunately does not support inside functions.
The only way I know to achieve a similar effect is dblink(): essentially writing over a separate connection (possibly to the same database), which has its own transaction context. It is much slower than normal writing.
Best regards,
Bjarni
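For readers on PostgreSQL 11 or later, a procedure (unlike a function) may issue COMMIT inside its body, which covers the batch-by-batch case from the question without dblink(). The following is only a sketch under that version assumption; the procedure name delete_in_batches and the batch_size parameter are hypothetical, and the table and column names are taken from the question.

```sql
-- Sketch only: requires PostgreSQL 11+ (transaction control in procedures).
-- delete_in_batches and batch_size are hypothetical names.
CREATE OR REPLACE PROCEDURE delete_in_batches(batch_size integer DEFAULT 1000)
LANGUAGE plpgsql
AS $$
DECLARE
    deleted integer;
BEGIN
    LOOP
        DELETE FROM "AwsSesNotification"
        WHERE "ID" IN (
            SELECT "ID"
            FROM "AwsSesNotification"
            WHERE "UTADateCreatedOn" < now() - interval '3 month'
            LIMIT batch_size
        );
        -- Number of rows removed by the DELETE above.
        GET DIAGNOSTICS deleted = ROW_COUNT;
        EXIT WHEN deleted = 0;
        COMMIT;  -- this batch is now permanent, even if the loop is interrupted later
        RAISE NOTICE 'Deleted % rows', deleted;
        PERFORM pg_sleep(2);
    END LOOP;
END;
$$;

-- Must be run with CALL, outside an explicit BEGIN ... COMMIT block.
CALL delete_in_batches();
```

Looping until a batch deletes nothing avoids the up-front COUNT from the original function, at the cost of not knowing the total number of iterations in advance.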

Related

PostgreSQL PgAgent syntax error at or near DECLARE

Could someone tell me what is wrong, please?
I am trying to create a PgAgent job that declares some variables. When I run this code manually, it works. But when I put it in a job step and save it, it throws an error.
DO $$
DECLARE
start_date date;
dates date;
d SMALLINT;
counter integer := 0;
res date[];
treshold bigint;
BEGIN
TRUNCATE ditdemo.daily;
start_date:= now();
dates := start_date;
while counter <= 14 loop
dates := dates - INTERVAL '1 DAY';
select cal.is_holiday into d from ditdemo.calendar as cal where cal.calendardate = dates;
if d=0 then
res := array_append(res,dates);
counter := counter + 1;
end if;
/*
raise notice 'dates %', dates;
raise notice 'is holiday %', d;
raise notice 'result %', res;
*/
end loop;
insert into ditdemo.daily
select
time_bucket('1 day', j."timestamp") as day,
j.account,
count(*) as cnt
from ditdemo.jrnl as j
where
cast(j."timestamp" as date) in (select unnest(res)) AND
j.account not in (select account from ditdemo.user where is_service = 1)
group by day, j.account;
SELECT
round(PERCENTILE_CONT(0.95) WITHIN GROUP(ORDER BY d.cnt))
into treshold
FROM ditdemo.daily as d;
UPDATE ditdemo.calendar
SET daily_treshold = treshold
WHERE calendardate > start_date and calendardate <=(start_date::date + interval '7 day');
END $$;
It seems that PgAgent translates your code into another format, perhaps a string, and then cannot parse it. To narrow it down, try to:
Remove special symbols from your code, such as brackets and quotes
Work out whether the error comes from pgAgent or from PostgreSQL
Good luck!

FOR loop over a date range in Postgres

In one of my functions in Postgres, I am trying to loop over a range of dates using the following code:
FOR timesheet_date IN select generate_series('2012-11-24'::date,'2012-12-03','1 day'::interval)::date LOOP
-- My code goes here
END LOOP;
But I am getting an error.
Since the query returns dates, I think the loop variable is not a record variable, hence the error.
But how can I loop through a date range? I am very new to Postgres.
DO $$
declare
dt record;
begin
FOR dt IN SELECT generate_series('2023-02-10'::date, '2023-02-15'::date, '1 day'::interval) LOOP
RAISE NOTICE 'Processing date: %', dt.generate_series;
END LOOP;
end; $$;
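A record variable is not strictly required: a FOR loop over a query can also target a scalar variable, as long as that variable is declared and the query returns a single column of a compatible type. A minimal sketch, assuming the original error came from timesheet_date not being declared (the question does not show the error text):

```sql
DO $$
DECLARE
    timesheet_date date;  -- the loop target must be declared
BEGIN
    FOR timesheet_date IN
        SELECT generate_series(date '2012-11-24', date '2012-12-03', interval '1 day')::date
    LOOP
        RAISE NOTICE 'Processing date: %', timesheet_date;
    END LOOP;
END;
$$;
```

The cast to date matters because generate_series with an interval step returns timestamps.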

Measure the time it takes to execute a PostgreSQL query

Based on Measure the time it takes to execute a t-sql query, how would one time several trials of a query in PostgreSQL?
A general outline would be
-- set up number of trials (say 1000)
SELECT CURRENT_DATE ; -- save start time
BEGIN
LOOP
-- execute query to be tested
END LOOP;
END;
SELECT CURRENT_DATE ; -- save end time
That is, I want a PostgreSQL equivalent of the following TSQL code, taken from an answer by HumbleWebDev to the linked TSQL question: see [reference for code]
declare @tTotal int = 0
declare @i integer = 0
declare @itrs integer = 100
while @i < @itrs
begin
declare @t0 datetime = GETDATE()
--your query here
declare @t1 datetime = GETDATE()
set @tTotal = @tTotal + DATEDIFF(MICROSECOND,@t0,@t1)
set @i = @i + 1
end
select @tTotal/@itrs
-- your query here: standard SQL queries such as SELECT * FROM table1 INNER JOIN table2, or executing a stored procedure, etc.
Coming from an MSSQL background myself and now more often working in Postgres I feel your pain =)
The "trouble" with Postgres is that plain SQL supports only the 'basic' commands (SELECT, INSERT, UPDATE, CREATE, ALTER, etc.); the moment you want to add logic (IF THEN, WHILE, variables, etc.) you need to switch to pl/pgsql, which you can only use inside functions and anonymous DO blocks. From a TSQL POV there are quite some limitations, and some things suddenly don't work anymore or need to be done differently: e.g., inside pl/pgsql, SELECT * INTO TEMPORARY TABLE tempTable FROM someTable will not work, but CREATE TABLE tempTable AS SELECT * FROM someTable will.
Something I also learned the hard way is that CURRENT_TIMESTAMP (or now()) returns the same value throughout a transaction. Since everything inside a function runs inside a transaction, you have to use clock_timestamp() instead.
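The difference between the two clocks is easy to see in a small anonymous block. This is just an illustrative sketch; the variable names are made up for the example.

```sql
DO $$
DECLARE
    tx_before   timestamptz := now();              -- transaction start time
    wall_before timestamptz := clock_timestamp();  -- actual wall-clock time
BEGIN
    PERFORM pg_sleep(1);
    -- now() is frozen for the whole transaction, so this difference stays at zero.
    RAISE NOTICE 'now() advanced by %', now() - tx_before;
    -- clock_timestamp() keeps moving, so this difference is roughly one second.
    RAISE NOTICE 'clock_timestamp() advanced by %', clock_timestamp() - wall_before;
END;
$$;
```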
Anyway, to answer your question, I think this should get you going:
CREATE OR REPLACE FUNCTION fn_test ( nbrOfIterations int)
RETURNS TABLE (iterations int, totalTime interval, secondsPerIteration int)
AS $$
DECLARE
i int;
startTime TIMESTAMP;
endTime TIMESTAMP;
dummy text;
BEGIN
i := 1;
startTime := clock_timestamp();
WHILE ( i <= nbrOfIterations) LOOP
-- your query here
-- (note: make sure to not return anything or you'll get an error)
-- example:
SELECT pg_sleep INTO dummy FROM pg_sleep(1);
i := i + 1;
END LOOP;
endTime := clock_timestamp();
iterations := nbrOfIterations;
totalTime := (endTime - startTime);
secondsPerIteration := (EXTRACT(EPOCH FROM endTime) - EXTRACT(EPOCH FROM startTime)) / iterations;
RETURN NEXT;
END;
$$ language plpgsql;
SELECT * FROM fn_test(5);
While the accepted answer is correct, this tweak of it worked better in my own situation. It is almost entirely based on the accepted answer and would not have been possible without it.
I only changed how the result is returned and switched from seconds to milliseconds:
----------------------------------------------------------------------------------------------------
-- fn__myFunction_Q.sql
----------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------
-- DROP FUNCTION mySchema.fn__myFunction
--------------------------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION mySchema.fn__myFunction ( nbrOfIterations int)
RETURNS TABLE (iterations int, totalTime interval, millisecondsPerIteration int) -- interval --
AS $$
declare
i int;
startTime TIMESTAMP;
endTime TIMESTAMP;
-- dummy text;
iterations int;
millisecondsPerIteration int;
totalTime interval;
BEGIN
i := 1;
startTime := clock_timestamp();
WHILE ( i <= nbrOfIterations) LOOP
--------------------------------------------------------------------------------------------
-- Put your query here, replacing SELECT with PERFORM.
-- The pg_sleep call below is only a placeholder so that the function runs as written.
PERFORM pg_sleep(0.001);
--------------------------------------------------------------------------------------------
i := i + 1; -- very important to increment loop counter, else one gets an infinite loop!!!
END LOOP;
endTime := clock_timestamp();
iterations := nbrOfIterations;
totalTime := (endTime - startTime);
millisecondsPerIteration := 1000 * (EXTRACT(EPOCH FROM endTime) - EXTRACT(EPOCH FROM startTime)) / iterations;
RETURN QUERY select iterations, totalTime, millisecondsPerIteration;
-- RETURNS TABLE (iterations int, totalTime interval, secondsPerIteration int) -- interval --
-- RETURN NEXT;
END;
$$ language plpgsql;
--------------------------------------------------------------------------------------------
To call this function, just use:
SELECT * from mySchema.fn__myFunction(1000) as ourTableResult;

PostgreSQL AFTER INSERT trigger prevents insert

I have a PostgreSQL 9.3 database. I use a log4net configuration to insert the errors in a table: log_messages.
I have a webpage that shows the errors in a nice way with charts and such.
Because I use a rather complex view this webpage is very slow, so I moved to a materialized view. My page is fast again.
Now I need to keep my materialized view in sync with my table/view. So I created an AFTER INSERT trigger on the table:
CREATE TRIGGER refresh_mv_insert
AFTER INSERT
ON log_messages
FOR EACH ROW
EXECUTE PROCEDURE refresh_mv();
My refresh_mv() is more complicated but even this simplified version doesn't work:
CREATE OR REPLACE FUNCTION refresh_mv()
RETURNS trigger AS
$BODY$
DECLARE
l_view_name character varying := 'mv_log_messages';
begin
EXECUTE 'REFRESH MATERIALIZED VIEW ' || l_view_name;
RETURN NEW;
end;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
When I change it to an anonymous procedure the full and simplified version do work. So it seems I have an error in the Trigger part.
I've been reading documentation and similar Q&A for two days now but I can't get it to work.
Any help is much appreciated.
Edit: Clarification
In my full trigger procedure I use a config table to store the refresh timestamp and I don't refresh within an hour.
When I enable the trigger the record is not saved into the log_messages table.
I don't know how to read any trigger errors. Where can I find them?
Here's my full code:
CREATE OR REPLACE FUNCTION refresh_mv()
RETURNS trigger AS
$BODY$
DECLARE
l_last_refresh timestamp;
l_view_name character varying := 'mv_log_messages';
l_num_new smallint;
l_refresh boolean := false;
begin
l_refresh := false;
-- check the last time:
select last_refresh, num_new into l_last_refresh, l_num_new from config where view_name = l_view_name;
-- refresh every hour
if (l_last_refresh + interval '1 hour' < current_timestamp) then
l_refresh := true;
end if;
-- refresh every 10 inserts, but not more often than every 10 minutes:
if (l_num_new > 9 and l_last_refresh + interval '10 minutes' < current_timestamp) then
l_refresh := true;
end if;
if l_refresh then
-- Reset config and do refresh:
update config set last_refresh = current_timestamp, num_new = 0 where view_name = l_view_name;
-- this line prevents the insertion of the record:
-- EXECUTE 'REFRESH MATERIALIZED VIEW ' || l_view_name;
else
-- Update counter:
update config set num_new = l_num_new + 1 where view_name = l_view_name;
end if;
RETURN NULL;
end;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
This trigger works only because I commented out the line EXECUTE 'REFRESH MATERIALIZED VIEW ' || l_view_name;

Count number of rows to be affected before update in trigger

I want to know the number of rows that will be affected by an UPDATE query in a BEFORE per-statement trigger. Is that possible?
The problem is that I want to allow only queries that update up to 4 rows. If the affected row count is 5 or more, I want to raise an error.
I don't want to do this in application code because I need this check at the database level.
Is this possible at all?
Thanks in advance for any clues.
Write a function that runs the update for you and rolls it back if too many rows are affected. Sorry for the poor formatting.
create function update_max(varchar, int)
RETURNS void AS
$BODY$
DECLARE
sql ALIAS FOR $1;
max ALIAS FOR $2;
rcount INT;
BEGIN
EXECUTE sql;
GET DIAGNOSTICS rcount = ROW_COUNT;
IF rcount > max THEN
--ROLLBACK;
RAISE EXCEPTION 'Too many rows affected (%).', rcount;
END IF;
--COMMIT;
END;
$BODY$ LANGUAGE plpgsql;
Then call it like this:
select update_max('update t1 set id=id+10 where id < 4', 3);
where the first parameter is your SQL statement and the second is the maximum number of rows.
Simon had a good idea but his implementation is unnecessarily complicated. This is my proposition:
create or replace function trg_check_max_4()
returns trigger as $$
begin
perform true from pg_class
where relname='check_max_4' and relnamespace=pg_my_temp_schema();
if not FOUND then
create temporary table check_max_4
(value int check (value<=4))
on commit drop;
insert into check_max_4 values (0);
end if;
update check_max_4 set value=value+1;
return new;
end; $$ language plpgsql;
I've created something like this:
begin;
create table test (
id integer
);
insert into test(id) select generate_series(1,100);
create or replace function trg_check_max_4_updated_records()
returns trigger as $$
declare
counter_ integer := 0;
tablename_ text := 'temptable';
begin
raise notice 'trigger fired';
select count(42) into counter_
from pg_catalog.pg_tables where tablename = tablename_;
if counter_ = 0 then
raise notice 'Creating table %', tablename_;
execute 'create temporary table ' || tablename_ || ' (counter integer) on commit drop';
execute 'insert into ' || tablename_ || ' (counter) values(1)';
execute 'select counter from ' || tablename_ into counter_;
raise notice 'Actual value for counter= [%]', counter_;
else
execute 'select counter from ' || tablename_ into counter_;
execute 'update ' || tablename_ || ' set counter = counter + 1';
raise notice 'updating';
execute 'select counter from ' || tablename_ into counter_;
raise notice 'Actual value for counter= [%]', counter_;
if counter_ > 4 then
raise exception 'Cannot change more than 4 rows in one transaction';
end if;
end if;
return new;
end; $$ language plpgsql;
create trigger trg_bu_test before
update on test
for each row
execute procedure trg_check_max_4_updated_records();
update test set id = 10 where id <= 1;
update test set id = 10 where id <= 2;
update test set id = 10 where id <= 3;
update test set id = 10 where id <= 4;
update test set id = 10 where id <= 5;
rollback;
The main idea is to have a 'before update for each row' trigger that creates, if necessary, a temporary table (which is dropped at the end of the transaction). This table holds just one row with one value: the number of rows updated in the current transaction. For each updated row the value is incremented. If the value becomes bigger than 4, the transaction is aborted.
But I think this is the wrong solution to your problem. What stops someone from running the offending query twice, so that 8 rows are changed? And what about deleting rows, or truncating the table?
PostgreSQL has two types of triggers: row and statement triggers. Row triggers only work within the context of a row so you can't use those. Unfortunately, "before" statement triggers don't see what kind of change is about to take place so I don't believe you can use those, either.
Based on that, I would say it's unlikely you'll be able to build that kind of protection into the database using triggers, unless you don't mind using an "after" trigger and rolling back the transaction if the condition isn't satisfied. Wouldn't mind being proved wrong. :)
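On PostgreSQL 10 or later, the "after" trigger route can be done at statement level with a transition table, which exposes all rows changed by the statement. The sketch below assumes that version and reuses the test table from the earlier answer; the function name, trigger name, and the updated_rows alias are hypothetical.

```sql
-- Sketch only: requires PostgreSQL 10+ (REFERENCING ... transition tables).
CREATE OR REPLACE FUNCTION trg_limit_updated_rows()
RETURNS trigger AS $$
DECLARE
    affected bigint;
BEGIN
    -- updated_rows holds the new versions of every row the statement changed.
    SELECT count(*) INTO affected FROM updated_rows;
    IF affected > 4 THEN
        -- Raising an exception aborts the statement and undoes its changes.
        RAISE EXCEPTION 'Cannot update more than 4 rows in one statement (got %)', affected;
    END IF;
    RETURN NULL;  -- the return value is ignored for AFTER statement triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_au_test_limit
AFTER UPDATE ON test
REFERENCING NEW TABLE AS updated_rows
FOR EACH STATEMENT
EXECUTE PROCEDURE trg_limit_updated_rows();
```

Note that this limits rows per statement, not per transaction, so running two separate four-row updates would still pass.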
Have a look at using Serializable Isolation Level. I believe this will give you a consistent view of the database data within your transaction. Then you can use option #1 that MusiGenesis mentioned, without the timing vulnerability. Test it of course to validate.
I've never worked with PostgreSQL, so my answer may not apply. In SQL Server, your trigger can call a stored procedure which would do one of two things:
Perform a SELECT COUNT(*) to determine the number of records that will be affected by the UPDATE, and then only execute the UPDATE if the count is 4 or less
Perform the UPDATE within a transaction, and only commit the transaction if the returned number of rows affected is 4 or less
No. 1 is vulnerable to timing (the number of records affected by the UPDATE may change between the COUNT(*) check and the actual UPDATE). No. 2 is pretty inefficient if there are many cases where the number of rows updated is greater than 4.