PostgreSQL: Iterating over array of text and executing SQL

I am copying tables from one schema to another, and I want to pass the names of the tables to copy as an argument. But no table is created in the target schema when I execute the CALL.
Command: CALL copy_table('firstname', 'tableName1,tableName2,tableName3');
CREATE OR REPLACE PROCEDURE copy_table(user VARCHAR(50), strs TEXT)
LANGUAGE PLPGSQL
AS $$
DECLARE
my_array TEXT;
BEGIN
FOR my_array IN
SELECT string_to_array(strs, ',')
LOOP
EXECUTE 'CREATE TABLE ' || user || '.' || my_array || ' (LIKE public.' || my_array || ' INCLUDING ALL)';
END LOOP;
$$
Could you please help? Thank you.

The function string_to_array returns an array value. Looping through arrays is performed by FOREACH command, not FOR.
See documentation:
https://www.postgresql.org/docs/14/plpgsql-control-structures.html#PLPGSQL-FOREACH-ARRAY
CREATE FUNCTION sum(int[]) RETURNS int8 AS $$
DECLARE
s int8 := 0;
x int;
BEGIN
FOREACH x IN ARRAY $1
LOOP
s := s + x;
END LOOP;
RETURN s;
END;
$$ LANGUAGE plpgsql;
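Applied to the procedure from the question, the FOREACH loop could look roughly like the sketch below. It is only an illustration: the parameter is renamed to _schema (a hypothetical name, chosen because user is a reserved key word), the loop variable _tbl is likewise made up, and format() with %I stands in for the plain string concatenation.

```sql
CREATE OR REPLACE PROCEDURE copy_table(_schema text, strs text)
LANGUAGE plpgsql
AS $$
DECLARE
    _tbl text;  -- holds one table name per iteration
BEGIN
    -- string_to_array() yields text[]; FOREACH walks its elements
    FOREACH _tbl IN ARRAY string_to_array(strs, ',')
    LOOP
        -- %I quotes each identifier where necessary
        EXECUTE format('CREATE TABLE %I.%I (LIKE public.%I INCLUDING ALL)',
                       _schema, _tbl, _tbl);
    END LOOP;
END;
$$;
```

The call from the question stays the same: CALL copy_table('firstname', 'tableName1,tableName2,tableName3');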

Loop over an array with FOREACH like Simon suggested. Or with FOR in old (or any) versions. See:
Iterating over integer[] in PL/pgSQL
Typically, a set-based solution is shorter and faster, though:
CREATE OR REPLACE PROCEDURE copy_tables(_schema text, VARIADIC _tables text[])
LANGUAGE plpgsql AS
$proc$
BEGIN
EXECUTE
(SELECT string_agg(format('CREATE TABLE %1$I.%2$I (LIKE public.%2$I INCLUDING ALL)', _schema, t), E';\n')
FROM unnest(_tables) t);
END
$proc$;
About VARIADIC:
Return rows matching elements of input array in plpgsql function
Call, passing list of table names:
CALL copy_tables('firstname', 'tableName1', 'tableName2', 'tableName3');
Or, passing genuine array:
CALL copy_tables('foo', VARIADIC '{tableName1,tableName2,tableName3}');
Or, passing (and converting) comma-separated string (your original input):
CALL copy_tables('foo', VARIADIC string_to_array('tableName1,tableName2,tableName3', ','));
I use format() to concatenate the SQL string safely. Note that identifiers must be passed as case-sensitive strings! See:
Define table and column names as arguments in a plpgsql function?
SQL injection in Postgres functions vs prepared queries
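A quick way to see what "case-sensitive" means here is to look at what %I produces for a mixed-case and a lower-case name (the column aliases are only for illustration):

```sql
SELECT format('%I', 'tableName1') AS mixed_case,   -- "tableName1" (double-quoted, case preserved)
       format('%I', 'tablename1') AS lower_case;   -- tablename1  (no quoting needed)
```

So 'tableName1' only matches a table that was created with that exact mixed-case name in double quotes; a table created without quotes is stored in lower case and has to be passed as 'tablename1'.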

Related

postgres plpgsql how to properly convert function to use FORMAT in DECLARE

I am writing a function in POSTGRES v13.3 that when passed an array of column names returns an array of JSONB objects each with the distinct values of one of the columns. I have an existing script that I wish to refactor using FORMAT in the declaration portion of the function.
The existing and working function looks like below. It is passed an array of column names and a table name. Then a loop passes each column name to an EXECUTE statement that uses JSONB_AGG on the distinct values in the column, creates a JSONB object, and appends that to an array. The array is returned on completion. This is the function:
CREATE OR REPLACE FUNCTION foo1(text[], text)
RETURNS text[] as $$
declare
col text;
interim jsonb;
temp jsonb;
y jsonb[];
begin
foreach col in array $1
loop
execute
'select jsonb_agg(distinct '|| col ||') from ' || $2 into interim;
temp := jsonb_build_object(col, interim);
y := array_append(y,temp);
end loop;
return y;
end;
$$ LANGUAGE plpgsql;
I have refactored the function to the following. The script is now in the DECLARE portion of the function.
CREATE OR REPLACE FUNCTION foo2(_cols text[], _db text)
RETURNS jsonb[]
LANGUAGE plpgsql as
$func$
DECLARE
_script text := format(
'select jsonb_agg( distinct $1) from %1$I', _db
);
col text;
interim jsonb;
temp jsonb;
y jsonb[];
BEGIN
foreach col in array _cols
loop
EXECUTE _script USING col INTO interim;
temp := jsonb_build_object(col, interim);
y := array_append(y,temp);
end loop;
return y;
END
$func$;
Unfortunately the two functions give different results on a toy data set (see bottom):
Original: {"{\"id\": [1, 2, 3]}","{\"val\": [1, 2]}"}
Refactored: {{"id": ["id"]},{"val": ["val"]}}
Here is a db<>fiddle of the preceding.
The challenge is in the EXECUTE. In the first instance the col argument is treated as a column identifier. In the refactored function it seems to be treated as just a text string. I think my approach is consistent with the docs and tutorials (example), and the answer from this forum here and the links included therein. I have tried playing around with combinations of ", ', and || but those were unsuccessful and don't make sense in a format statement.
Where should I be looking for the error in my use of FORMAT?
NOTE 1: I found the following in the docs, so possibly jsonb_agg() and DISTINCT are what's preventing the behaviour I want:
Another restriction on parameter symbols is that they only work in SELECT, INSERT, UPDATE, and DELETE commands. In other statement types (generically called utility statements), you must insert values textually even if they are just data values.
TOY DATA SET:
drop table if exists example;
create temporary table example(id int, str text, val integer);
insert into example values
(1, 'a', 1),
(2, 'a', 2),
(3, 'b', 2);
https://www.postgresql.org/docs/14/plpgsql-statements.html#PLPGSQL-STATEMENTS-SQL-ONEROW
The command string can use parameter values, which are referenced in
the command as $1, $2, etc. These symbols refer to values supplied in
the USING clause.
What you want is to parameterize an SQL identifier (a column name). You cannot do that with USING. See:
Access column using variable instead of explicit column name
In select jsonb_agg( distinct $1) from %1$I, the "$1" would have to be a %I placeholder filled in by format(). The USING expression from the manual (EXECUTE command-string [ INTO [STRICT] target ] [ USING expression [, ... ] ];) passes a literal value instead. The statement still runs without error because select distinct 'hello world' from validtable is valid SQL; it just aggregates the column name as a string.
In other words, the column name must be inserted the same way as the table name: as an SQL identifier via format().
--
The following debug code shows how to solve your problem:
CREATE OR REPLACE FUNCTION foo2(_cols text[], _db text)
RETURNS void
LANGUAGE plpgsql as
$func$
DECLARE
col text;
interim jsonb;temp jsonb; y jsonb[];
BEGIN
foreach col in array _cols
loop
EXECUTE format( 'select jsonb_agg( distinct ( %1$I ) ) from %2$I', col,_db) INTO interim;
raise info 'interim: %', interim;
temp := jsonb_build_object(col, interim);
raise info 'temp: %', temp;
y := array_append(y,temp);
raise info 'y: %',y;
end loop;
END
$func$;
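Once the debug output looks right, the function can return the array again. A minimal sketch, assuming the same inputs as foo2; the name foo3 and the parameter name _tbl are made up for illustration:

```sql
CREATE OR REPLACE FUNCTION foo3(_cols text[], _tbl text)
  RETURNS jsonb[]
  LANGUAGE plpgsql AS
$func$
DECLARE
   col     text;
   interim jsonb;
   y       jsonb[];
BEGIN
   FOREACH col IN ARRAY _cols
   LOOP
      -- both the column and the table are identifiers, so both go through %I
      EXECUTE format('SELECT jsonb_agg(DISTINCT %I) FROM %I', col, _tbl)
      INTO interim;
      y := array_append(y, jsonb_build_object(col, interim));
   END LOOP;
   RETURN y;
END
$func$;
```

With the toy data set it would be called as: SELECT foo3('{id,val}', 'example');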

In clause in postgres

Need Output from table with in clause in PostgreSQL
I tried to loop over the ids passed from my code. I did the same to update rows dynamically, but for the SELECT I am not getting any values from the DB.
CREATE OR REPLACE FUNCTION dashboard.rspgetpendingdispatchbyaccountgroupidandbranchid(
IN accountgroupIdCol numeric(8,0),
IN branchidcol character varying
)
RETURNS void
AS
$$
DECLARE
ArrayText text[];
i int;
BEGIN
select string_to_array(branchidcol, ',') into ArrayText;
i := 1;
loop
if i > array_upper(ArrayText, 1) then
exit;
else
SELECT
pd.branchid,pd.totallr,pd.totalarticle,pd.totalweight,
pd.totalamount
FROM dashboard.pendingdispatch AS pd
WHERE
pd.accountgroupid = accountgroupIdCol AND pd.branchid IN(ArrayText[i]::numeric);
i := i + 1;
end if;
END LOOP;
END;
$$ LANGUAGE 'plpgsql' VOLATILE;
There is no need for a loop (or PL/pgSQL actually)
You can use the array directly in the query, e.g.:
where pd.branchid = any (string_to_array(branchidcol, ',')::numeric[]);
But your function does not return anything, so obviously you won't get a result.
If you want to return the result of that SELECT query, you need to define the function as returns table (...) and then use return query - or even better make it a SQL function:
CREATE OR REPLACE FUNCTION dashboard.rspgetpendingdispatchbyaccountgroupidandbranchid(
IN accountgroupIdCol numeric(8,0),
IN branchidcol character varying )
RETURNS table(branchid integer, totallr integer, totalarticle integer, totalweight numeric, totalamount integer)
AS
$$
SELECT pd.branchid,pd.totallr,pd.totalarticle,pd.totalweight, pd.totalamount
FROM dashboard.pendingdispatch AS pd
WHERE pd.accountgroupid = accountgroupIdCol
AND pd.branchid = any (string_to_array(branchidcol, ',')::numeric[]);
$$
LANGUAGE sql
VOLATILE;
Note that I guessed the data types for the columns of the query based on their names. You have to adjust the line with returns table (...) to match the data types of the select columns.
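For completeness, the PL/pgSQL variant with RETURN QUERY mentioned above might look like the sketch below. The function name is hypothetical, and the column types are the same guesses as in the SQL function; RETURN QUERY needs them to match the query's column types exactly.

```sql
CREATE OR REPLACE FUNCTION dashboard.pending_dispatch_plpgsql(  -- hypothetical name
    accountgroupidcol numeric(8,0),
    branchidcol character varying)
  RETURNS TABLE (branchid integer, totallr integer, totalarticle integer,
                 totalweight numeric, totalamount integer)  -- guessed types
  LANGUAGE plpgsql STABLE AS
$$
BEGIN
   -- table-qualify the columns so they cannot be confused with the
   -- output columns declared in RETURNS TABLE
   RETURN QUERY
   SELECT pd.branchid, pd.totallr, pd.totalarticle, pd.totalweight, pd.totalamount
   FROM   dashboard.pendingdispatch AS pd
   WHERE  pd.accountgroupid = accountgroupidcol
   AND    pd.branchid = ANY (string_to_array(branchidcol, ',')::numeric[]);
END
$$;
```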

Passing table names in an array

I need to do the same deletion or purge operation (based on several conditions) on a set of tables. For that I am trying to pass the table names in an array to a function. I am not sure if I am doing it right. Or is there a better way?
I am pasting just a sample example this is not the real function I have written but the basic is same as below:
CREATE OR REPLACE FUNCTION test (tablename text[]) RETURNS int AS
$func$
BEGIN
execute 'delete * from '||tablename;
RETURN 1;
END
$func$ LANGUAGE plpgsql;
But when I call the function I get an error:
select test( {'rajeev1'} );
ERROR: syntax error at or near "{"
LINE 10: select test( {'rajeev1'} );
^
********** Error **********
ERROR: syntax error at or near "{"
SQL state: 42601
Character: 179
Array syntax
'{rajeev1, rajeev2}' or ARRAY['rajeev1', 'rajeev2']. Read the manual.
TRUNCATE
Since you are deleting all rows from the tables, consider TRUNCATE instead. Per documentation:
Tip: TRUNCATE is a PostgreSQL extension that provides a faster
mechanism to remove all rows from a table.
Be sure to study the details. If TRUNCATE works for you, the whole operation becomes very simple, since the command accepts multiple tables:
TRUNCATE rajeev1, rajeev2, rajeev3, ..
Dynamic DELETE
Else you need dynamic SQL like you already tried. The scary missing detail: you are completely open to SQL injection and catastrophic syntax errors. Use format() with %I (not %s) to sanitize identifiers like table names. Or, better yet in this particular case, use an array of regclass as parameter instead:
CREATE OR REPLACE FUNCTION f_del_all(_tbls regclass[])
RETURNS void AS
$func$
DECLARE
_tbl regclass;
BEGIN
FOREACH _tbl IN ARRAY _tbls LOOP
EXECUTE format('DELETE FROM %s', _tbl);
END LOOP;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT f_del_all('{rajeev1,rajeev2,rajeev3}');
Explanation here:
Table name as a PostgreSQL function parameter
You used wrong syntax for text array constant in the function call. But even if it was right, your function is not correct.
If your function has text array as argument you should loop over the array to execute query for each element.
CREATE OR REPLACE FUNCTION test (tablenames text[]) RETURNS int AS
$func$
DECLARE
tablename text;
BEGIN
FOREACH tablename IN ARRAY tablenames LOOP
EXECUTE format('DELETE FROM %I', tablename);
END LOOP;
RETURN 1;
END
$func$ LANGUAGE plpgsql;
You can then call the function for several tables at once, not only for one.
SELECT test( '{rajeev1, rajeev2}' );
If you do not need this feature, simply change the argument type to text.
CREATE OR REPLACE FUNCTION test (tablename text) RETURNS int AS
$func$
BEGIN
EXECUTE format('DELETE FROM %I', tablename);
RETURN 1;
END
$func$ LANGUAGE plpgsql;
SELECT test('rajeev1');
I recommend using the format function.
If you want to execute a function (say purge_this_one_table(tablename)) on a group of tables identified by similar names you can use this construction:
create or replace function purge_all_these_tables(mask text)
returns void language plpgsql
as $$
declare
tabname text;
begin
for tabname in
select relname
from pg_class
where relkind = 'r' and relname like mask
loop
execute format(
'select purge_this_one_table(%L)',
tabname);
end loop;
end $$;
select purge_all_these_tables('agg_weekly_%');
It should be:
select test('{rajeev1}');

Create a function to get column from multiple tables in PostgreSQL

I'm trying to create a function to get a field value from multiple tables in my database. I made script like this:
CREATE OR REPLACE FUNCTION get_all_changes() RETURNS SETOF RECORD AS
$$
DECLARE
tblname VARCHAR;
tblrow RECORD;
row RECORD;
BEGIN
FOR tblrow IN SELECT tablename FROM pg_catalog.pg_tables WHERE schemaname='public' LOOP /*FOREACH tblname IN ARRAY $1 LOOP*/
RAISE NOTICE 'r: %', tblrow.tablename;
FOR row IN SELECT MAX("lastUpdate") FROM tblrow.tablename LOOP
RETURN NEXT row;
END LOOP;
END LOOP;
END
$$
LANGUAGE 'plpgsql' ;
SELECT get_all_changes();
But it is not working, everytime it shows this error
tblrow.tablename" not defined in line "FOR row IN SELECT MAX("lastUpdate") FROM tblrow.tablename LOOP"
Your inner FOR loop must use the FOR...EXECUTE syntax as shown in the manual:
FOR target IN EXECUTE text_expression [ USING expression [, ... ] ] LOOP
statements
END LOOP [ label ];
In your case something along this line:
FOR row IN EXECUTE 'SELECT MAX("lastUpdate") FROM ' || quote_ident(tblrow.tablename) LOOP
RETURN NEXT row;
END LOOP;
The reason for this is explained in the manual somewhere else:
Oftentimes you will want to generate dynamic commands inside your PL/pgSQL functions, that is, commands that will involve different tables or different data types each time they are executed. PL/pgSQL's normal attempts to cache plans for commands (as discussed in Section 39.10.2) will not work in such scenarios. To handle this sort of problem, the EXECUTE statement is provided[...]
Answer to your new question (mislabeled as answer):
This can be much simpler. You do not need to create a table just to define a record type.
If at all, you would better create a type with CREATE TYPE, but that's only efficient if you need the type in multiple places. For just a single function, you can use RETURNS TABLE instead:
CREATE OR REPLACE FUNCTION get_all_changes(text[])
RETURNS TABLE (tablename text
,"lastUpdate" timestamp with time zone
,nums integer) AS
$func$
DECLARE
tblname text;
BEGIN
FOREACH tblname IN ARRAY $1 LOOP
RETURN QUERY EXECUTE format(
$f$SELECT %L::text, MAX("lastUpdate"), COUNT(*)::int FROM %1$I
$f$, tblname);
END LOOP;
END
$func$ LANGUAGE plpgsql;
A couple more points:
Use RETURN QUERY EXECUTE instead of the nested loop. Much simpler and faster.
Column aliases would only serve as documentation, those names are discarded in favor of the names declared in the RETURNS clause (directly or indirectly).
Use format() with %I to replace the concatenation with quote_ident() and %1$I to refer to the same parameter another time.
count() returns type bigint. Cast it to integer, since you defined the column in the return type as such: count(*)::int.
Thanks,
I finally made my script like:
CREATE TABLE IF NOT EXISTS __rsdb_changes (tablename text,"lastUpdate" timestamp with time zone, nums bigint);
CREATE OR REPLACE FUNCTION get_all_changes(varchar[]) RETURNS SETOF __rsdb_changes AS /*TABLE (tablename varchar(40),"lastUpdate" timestamp with time zone, nums integer)*/
$$
DECLARE
tblname VARCHAR;
tblrow RECORD;
row RECORD;
BEGIN
FOREACH tblname IN ARRAY $1 LOOP
/*RAISE NOTICE 'r: %', tblrow.tablename;*/
FOR row IN EXECUTE 'SELECT CONCAT('''|| quote_ident(tblname) ||''') AS tablename, MAX("lastUpdate") AS "lastUpdate",COUNT(*) AS nums FROM ' || quote_ident(tblname) LOOP
/*RAISE NOTICE 'row.tablename: %',row.tablename;*/
/*RAISE NOTICE 'row.lastUpdate: %',row."lastUpdate";*/
/*RAISE NOTICE 'row.nums: %',row.nums;*/
RETURN NEXT row;
END LOOP;
END LOOP;
RETURN;
END
$$
LANGUAGE 'plpgsql' ;
Well, it works. But it seems I can only create a table to define the return structure instead of just RETURNS SETOF RECORD. Am I right?
Thanks again.

Postgres how to evaluate expression from query to variables in the function

I would like to be able to get values of function variables whose names are queried from a table
Edited to show querying a table instead of query from static values:
create table __test__
(
_col text
);
insert into __test__
(_col)
values('_a');
create or replace function __test()
returns void
language 'plpgsql' as
$$
declare
_r record;
_a int;
_b int;
_sql text;
begin
_a = 1;
_b = 0;
for _r in select _col as _nam from __test__ a loop
-- query returns one row valued "_a"
_sql = 'select ' || _r._nam ;
execute _sql into _b;
end loop;
raise info 'value of _b %', _b;
end;
$$;
select __test()
I want the function to execute so that _b = 1. Is it possible?
same error ...
ERROR: column "_a" does not exist
LINE 1: select _a
^
QUERY: select _a
CONTEXT: PL/pgSQL function "__test" line 15 at EXECUTE statement
You could create a temporary table, insert your variable names and values in it, and then execute a select against that. Just clean up after. I have used approaches like that before. It works ok. It does have extra overhead though.
Edit: adding an example
CREATE FUNCTION switch (in_var text) RETURNS text
LANGUAGE PLPGSQL VOLATILE AS $$
declare t_test text;
switch_vals text[];
BEGIN
CREATE TEMPORARY TABLE switch_values (var text, value text);
EXECUTE $e$ INSERT INTO switch_values VALUES
('a', '1'), ('b', '2'), ('c', '3') $e$;
EXECUTE $e$ SELECT value FROM switch_values WHERE var = $e$ || quote_literal(in_var)
INTO t_test;
DROP TABLE switch_values;
RETURN t_test;
END; $$;
postgres=# select switch('a');
switch
--------
1
(1 row)
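If the temporary table feels too heavy, a similar lookup can be done in memory. The sketch below swaps in a jsonb object for the temporary table (a different technique from the one above, not something the question used); the function name __test_lookup and the variable _vars are made up for illustration, and the table __test__ is the one from the question.

```sql
CREATE OR REPLACE FUNCTION __test_lookup()
  RETURNS void
  LANGUAGE plpgsql AS
$$
DECLARE
   _r    record;
   -- the "variables" live as keys of a jsonb object instead of PL/pgSQL variables
   _vars jsonb := jsonb_build_object('_a', 1);
   _b    int := 0;
BEGIN
   FOR _r IN SELECT _col AS _nam FROM __test__ LOOP
      -- look the name up as a key; yields NULL if no such key exists
      _b := (_vars ->> _r._nam)::int;
   END LOOP;
   RAISE INFO 'value of _b %', _b;
END
$$;
```

This avoids dynamic SQL entirely, at the price of keeping the values in a key/value structure rather than in separate variables.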
Let's try to reframe the question: what you're after would be the equivalent of the Perl eval() function, with its ability to execute a dynamically generated piece of code for which "any outer lexical variables are visible to it". In your example, the variable would be _a, but as you can see from the error message, it can't be interpolated by a dynamic SQL statement. The reason is that the SQL interpreter has no visibility on the current pl/pgsql variables, or even the knowledge that such variables exist. They are confined to pl/pgsql.
What would be needed here is a context-aware dynamically-generated pl/pgsql statement, but this language does not have this feature. It's doubtful that a trick could be found to achieve the result without this feature. For all its ability to interface nicely with SQL, other than that it's a fairly static language.
On the other hand, this would be no problem for pl/perl.