PL/pgSQL "for loop" + select basic example ("hello world") - postgresql

I've been using Postgres for a while, but I'm totally new to PL/pgSQL.
I'm struggling to get a basic for loop to work.
This works fine:
-- Without SELECT
DO $$
BEGIN
FOR counter IN 1..6 BY 2 LOOP
RAISE NOTICE 'Counter: %', counter;
END LOOP;
END; $$;
But what I really want is to iterate through the result of a SELECT query.
I keep running into this error:
Error in query: ERROR: loop variable of loop over rows must be a record or row variable or list of scalar variables
Sounds pretty obscure to me and googling did not help.
There's a table from my own data I want to use (I was hoping to use a SELECT * FROM mytable WHERE ‹whatever›), but I realize I can't even get the for loop to work with simpler data.
Take this:
-- with a SELECT
DO $$
BEGIN
RAISE NOTICE 'Get ready to be amazed…';
FOR target IN SELECT * FROM generate_series(1,2) LOOP
RAISE NOTICE 'hello';
END LOOP;
END; $$;
This generates the error above too. I'd like to get a simple thing printed to get the hang of the loop syntax, something like:
hello 1
hello 2
What am I doing wrong?

The loop variable must be declared before it can be used in a loop over query results:
DO $$
DECLARE
target record;
BEGIN
RAISE NOTICE 'Get ready to be amazed…';
FOR target IN SELECT * FROM generate_series(1,2) LOOP
RAISE NOTICE 'hello';
END LOOP;
END; $$;
NOTICE: Get ready to be amazed…
NOTICE: hello
NOTICE: hello
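To get the numbered output the question asks for, the loop body can read the value out of the record. A minimal sketch, assuming the column keeps its default name generate_series (no alias is given in the query):

```sql
DO $$
DECLARE
   target record;  -- loop variable for a loop over query rows must be declared
BEGIN
   FOR target IN SELECT * FROM generate_series(1,2) LOOP
      -- the single output column is named after the function by default
      RAISE NOTICE 'hello %', target.generate_series;
   END LOOP;
END; $$;
```

Since the query returns a single scalar column, declaring target as integer and printing target directly should work as well.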

Related

calling a function in sql within nested loop

My query doesn't even work, but hopefully the logic comes through. Basically, I'm using datavault.dimdates_csv to produce a row for each date/day. Then, for each day, I'm trying to get all account ids and, for each one, call a function with the date and the account.
Is there a better approach to getting my data? I know nested loops aren't that great for SQL.
do
$$
declare
date_record record;
account record;
begin
for date_record in select d."date" from datavault.dimdates_csv d
for account in select sad.id from datavault.sat_account_details sad
select datavault.account_active_for_date(date_record , account)
loop
loop
end loop;
end;
$$
It's hard to follow your business logic but syntax-wise your block needs correction. Please note that d."date" and sad.id are scalars (I assume a date and an integer) and not records.
do
$$
declare
running_date date;
running_id integer;
begin
for running_date in select d."date" from datavault.dimdates_csv d loop
for running_id in select sad.id from datavault.sat_account_details sad loop
perform datavault.account_active_for_date(running_date, running_id);
end loop;
end loop;
end;
$$;
As far as I can see, you are calling the function datavault.account_active_for_date for every pair of d."date" and sad.id. If this is true, then you can simply run
select datavault.account_active_for_date(d."date", sad.id)
from datavault.dimdates_csv d, datavault.sat_account_details sad;
and ignore the result set.
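If the call has to stay inside a PL/pgSQL block, the same cross join can be run with PERFORM, which discards the result set. A sketch only, reusing the table and function names from the question and assuming the function takes a date and an integer:

```sql
DO $$
BEGIN
   -- PERFORM executes the query and throws away the rows it returns
   PERFORM datavault.account_active_for_date(d."date", sad.id)
   FROM   datavault.dimdates_csv d
   CROSS  JOIN datavault.sat_account_details sad;
END; $$;
```

This replaces both nested loops with a single set-based statement.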

Amazon redshift stored procedure, CONTINUE cannot be used outside a loop;

I'm building a stored procedure in Amazon Redshift.
This is an example of what I'm trying to do.
CREATE OR REPLACE PROCEDURE test_sp1()
LANGUAGE plpgsql
AS $$
DECLARE v_test RECORD;
BEGIN
FOR v_test IN select * from pg_user
LOOP
RAISE INFO 'before loop';
CONTINUE;
RAISE INFO 'after loop';
END LOOP;
END;
$$;
CALL test_sp1();
This piece of code gives me an exception
"[42601][500310] Amazon Invalid operation: CONTINUE cannot be used outside a loop".
Why can I not use CONTINUE in this loop?
The error appears to be happening when CONTINUE passes control to the beginning of the loop but there's no more iterating to be done.

Why is catching errors inside a LOOP causing performance issues?

I had a function with a performance issue:
totalCharge := 0;
FOR myRecord IN ... LOOP
......
IF severalConditionsAreMet THEN
BEGIN
SELECT t1.charge INTO STRICT recordCharge
FROM t1
WHERE t1.id = myRecord.id AND otherComplexConditionsHere;
totalCharge := totalCharge + recordCharge;
...........
EXCEPTION
WHEN OTHERS THEN
NULL;
END;
END IF;
END LOOP;
The function was being called 232 times (not counting the number of times the code from the FOR was accessed).
The IF from the FOR LOOP ended up being accessed 4466 times and was taking 561 seconds to complete all 4466 iterations.
For the particular data set that I had, the IF was always entered, the SELECT above never returned data, and the code reached the EXCEPTION branch every time.
I have changed the code to:
totalCharge := 0;
FOR myRecord IN ... LOOP
......
IF severalConditionsAreMet THEN
SELECT t1.charge INTO recordCharge
FROM t1
WHERE t1.id = myRecord.id AND otherComplexConditionsHere;
IF (recordCharge IS NULL) THEN
CONTINUE;
END IF;
totalCharge := totalCharge + recordCharge;
...........
END IF;
END LOOP;
Please note that for the table t1, the t1.charge column has a NOT NULL constraint defined on it.
This time, the code from the IF takes 1-2 seconds to complete all 4466 iterations.
Basically, all I did was replace the
BEGIN
…
EXCEPTION
….
END;
With
IF conditionIsNotMet THEN
CONTINUE;
END IF;
Can someone please explain to me why this worked?
What happened behind the scenes?
I suspect that when you catch exceptions inside of a LOOP and the code ends up generating an exception, Postgres can’t use cached plans to optimize that code so it ends up planning the code at each iteration and this causes performance issues.
Is my assumption correct?
Later Edit:
I altered the example provided by Vao Tsun to reflect the case that I want to illustrate.
CREATE OR REPLACE FUNCTION initialVersion()
RETURNS VOID AS $$
declare
testDate DATE;
begin
for i in 1..999999 loop
begin
select now() into strict testDate where 1=0;
exception when others
then null;
end;
end loop;
end;
$$ Language plpgsql;
CREATE OR REPLACE FUNCTION secondVersion()
RETURNS VOID AS $$
declare
testDate DATE;
begin
for i in 1..999999 loop
select now() into testDate where 1=0;
if testDate is null then
continue;
end if;
end loop;
end;
$$ Language plpgsql;
select initialVersion(); -- 19.7 seconds
select secondVersion(); -- 5.2
As you can see there is a difference of almost 15 seconds.
In the example that I provided initially, the difference is bigger because the SELECT FROM t1 runs against complex data and takes more time to execute than the simple SELECT in this second example.
I asked the same question on the PostgreSQL general mailing list and got some responses that elucidated this "mystery" for me:
David G. Johnston:
"Tip: A block containing an EXCEPTION clause is significantly
more expensive to enter and exit than a block without one. Therefore,
don't use EXCEPTION without need."
I'm somewhat doubting "plan caching" has anything to do with this; I
suspect its basically that there is high memory and runtime overhead
to deal with the possibilities of needing to convert a exception into
a branch instead of allowing it to be fatal.
Tom Lane:
Yeah, it's about the overhead of setting up and ending a
subtransaction. That's a fairly expensive mechanism, but we don't have
anything cheaper that is able to recover from arbitrary errors.
and an addition from David G. Johnston:
[...] setting up the pl/pgsql execution layer to trap "arbitrary SQL-layer
exceptions" is fairly expensive. Even if the user specifies specific
errors the error handling mechanism in pl/pgsql is code for generic
(arbitrary) errors being given to it.
These answers helped me understand a bit how things work.
I am posting this answer here 'cause I hope that this answer will help someone else.
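When the only condition being guarded against is a missing row, the subtransaction can be avoided by dropping STRICT and testing the special variable FOUND. A minimal sketch, assuming no other errors need to be swallowed; the variable names test_date and misses are illustrative:

```sql
DO $$
DECLARE
   test_date date;
   misses    integer := 0;
BEGIN
   FOR i IN 1..1000 LOOP
      -- without STRICT, a query that returns no row sets the target to NULL
      -- and FOUND to false instead of raising an error
      SELECT now() INTO test_date WHERE 1 = 0;
      IF NOT FOUND THEN
         misses := misses + 1;
         CONTINUE;
      END IF;
   END LOOP;
   RAISE NOTICE 'rows not found: %', misses;
END; $$;
```

Unlike a check for NULL, FOUND also works when the selected column is nullable.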
With the given details, I can't reproduce this:
t=# do
$$
declare
begin
for i in 1..999999 loop
perform now();
/* exception when others then null; */
if null then null; end if;
end loop;
end;
$$
;
DO
Time: 1920.568 ms
t=# do
$$
declare
begin
for i in 1..999999 loop
begin
perform now();
exception when others then null;
end;
end loop;
end;
$$
;
DO
Time: 2417.425 ms
As you can see, with about a million iterations the difference is clear but insignificant. Please try the same test on your machine. If you get the same results, you need to provide more details...

Cursor retrieves a deleted row in table created within plpgsql function

Within a plpgsql function I create a table and use a cursor to access its rows. While at the first row, I delete a later one and, surprisingly (to me at least), the cursor still fetches it. When I repeat the loop within the same function, it works as I expected.
However, if the table pre-exists and is not created within the function, the deleted row is never fetched.
What am I missing?
DECLARE
curs1 refcursor;
rec record;
BEGIN
CREATE TABLE test as select generate_series(1,5,1) test_id;
OPEN curs1 FOR SELECT * FROM test ORDER BY test_id;
LOOP
FETCH curs1 INTO rec;
EXIT WHEN NOT FOUND;
RAISE NOTICE 'ROW:%',rec.test_id;
IF rec.test_id=1 THEN
DELETE FROM TEST WHERE test_id=3;
END IF;
END LOOP;
CLOSE curs1;
RAISE NOTICE 'AGAIN';
--just repeating without deleting
OPEN curs1 FOR SELECT * FROM test ORDER BY test_id;
LOOP
FETCH curs1 INTO rec;
EXIT WHEN NOT FOUND;
RAISE NOTICE 'ROW:%',rec.test_id;
END LOOP;
CLOSE curs1;
Output is:
NOTICE: ROW:1
NOTICE: ROW:2
NOTICE: ROW:3
NOTICE: ROW:4
NOTICE: ROW:5
NOTICE: AGAIN
NOTICE: ROW:1
NOTICE: ROW:2
NOTICE: ROW:4
NOTICE: ROW:5
The reason is that Postgres cursors are "insensitive" by default. The documentation:
The SQL standard says that it is implementation-dependent whether
cursors are sensitive to concurrent updates of the underlying data by
default. In PostgreSQL, cursors are insensitive by default, and can be
made sensitive by specifying FOR UPDATE. Other products may work differently.
So try the same with using the FOR UPDATE clause:
DO
$$
DECLARE
curs1 refcursor;
rec record;
BEGIN
CREATE TABLE test AS SELECT generate_series(1,5) test_id;
OPEN curs1 FOR SELECT * FROM test ORDER BY test_id FOR UPDATE;
DELETE FROM test WHERE test_id = 3;
LOOP
FETCH curs1 INTO rec;
EXIT WHEN NOT FOUND;
RAISE NOTICE 'ROW:%',rec.test_id;
END LOOP;
CLOSE curs1;
END
$$;
And you get:
NOTICE: ROW:1
NOTICE: ROW:2
NOTICE: ROW:4
NOTICE: ROW:5
Row 3 is not visible any more.

How to get a statement calling the function from inside the function itself?

Let's say I have a function show_files(IN file text, IN suffix text, OUT statement text). In next step the function is called:
SELECT * FROM show_files(file := 'example', suffix := '.png');
My question is: Is there any solution that I could get statement that has called this function from inside that function?
I mean, after running the SELECT the output of function (OUT statement text) should be: 'SELECT * FROM show_files(file := 'example', suffix := '.png');', or is it possible to assign this statement to the variable inside the function?
I need the functionality like those with TG_NAME, TG_OP, etc. in trigger procedures.
Maybe it is possible to retrieve this statement with SELECT current_query FROM pg_stat_activity?
When I'm trying to use it inside a function I've got an empty record:
CREATE OR REPLACE FUNCTION f_snitch(text)
RETURNS text AS
$BODY$
declare
rr text;
BEGIN
RAISE NOTICE '.. from f_snitch.';
-- do stuff
SELECT current_query into rr FROM pg_stat_activity
WHERE current_query ilike 'f_snitch';
RETURN rr;
END
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
Any help and suggestions would be happily welcome!
TG_NAME and friends are special variables that only exist for trigger functions. Regular plpgsql functions don't have anything like that. I am fresh out of ideas how you could possibly get this inside the called function in plpgsql.
You could add RAISE NOTICE to your function so you get the desired information:
CREATE OR REPLACE FUNCTION f_snitch(text)
RETURNS text LANGUAGE plpgsql AS
$func$
BEGIN
RAISE NOTICE '.. from f_snitch.';
-- do stuff
RETURN 'Snitch says hi!';
END
$func$;
Call:
SELECT f_snitch('foo');
In addition to the result, this returns a notice:
NOTICE: .. from f_snitch.
This fails to please in two respects:
1. The calling statement is not in the notice.
2. There is no CONTEXT in the notice.
For 1. you can use RAISE LOG instead (or set your cluster up to log NOTICES, too - which I usually don't, too verbose for me). With standard settings, you get an additional line with the STATEMENT in the database log:
LOG: .. from f_snitch.
STATEMENT: SELECT f_snitch('foo')
For 2., have a look at this related question at dba.SE. CONTEXT would look like:
CONTEXT: SQL statement "SELECT f_raise('LOG', 'My message')"
PL/pgSQL function "f_snitch" line 5 at PERFORM
Ok, I've got it!
CREATE OR REPLACE FUNCTION f_snitch(text)
RETURNS setof record AS
$BODY$
BEGIN
RETURN QUERY
SELECT current_query
FROM pg_stat_activity
-- ORDER BY length(current_query) DESC LIMIT 1;  (first attempt, struck out)
where current_query ilike 'select * from f_snitch%';
-- much more reliable solution
END
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
select * from f_snitch('koper') AS (tt text);
It's probably not a 100% reliable solution, but for small systems (with few users) it's quite OK.
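Note that since PostgreSQL 9.2 the pg_stat_activity column is named query rather than current_query, so the function above does not run unchanged on current versions. An alternative sketch uses the built-in current_query() function, which returns the text of the currently executing query as submitted by the client. It reuses the f_snitch name from above purely for illustration:

```sql
CREATE OR REPLACE FUNCTION f_snitch(text)
  RETURNS text
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- current_query() reports the top-level text sent by the client,
   -- which might contain more than one statement
   RETURN current_query();
END
$func$;

SELECT f_snitch('foo');
```

This avoids pattern matching against other sessions' queries, so it does not depend on how many users are connected.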