Invalid input syntax for type integer: "(2,2)" with composite data type while executing function - postgresql

begin;
create type public.ltree as (a int, b int);
create table public.parent_tree(parent_id int,l_tree ltree);
insert into public.parent_tree values(1,(2,2)),(2,(1,2)),(3, (1,28));
commit;
Trying to replicate the solution in this answer:
Format specifier for integer variables in format() for EXECUTE?
For a function with composite type:
CREATE OR REPLACE FUNCTION public.get_parent_ltree
(_parent_id int, tbl_name regclass , OUT _l_tree ltree)
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE format('SELECT l_tree FROM %s WHERE parent_id = $1', tbl_name)
INTO _l_tree
USING _parent_id;
END
$func$;
The effective query executed:
select l_tree from parent_tree where parent_id = 1;
Executing the function:
select get_parent_ltree(1,'parent_tree');
select get_parent_ltree(1,'public.parent_tree');
I get this error:
ERROR: invalid input syntax for type integer: "(2,2)"
CONTEXT: PL/pgSQL function get_parent_ltree(integer,regclass) line 3 at EXECUTE
Context of line 3:

The output parameter _l_tree is a "row variable". (A composite type is treated as row variable.) SELECT INTO assigns fields of a row variable one-by-one. The manual:
The optional target is a record variable, a row variable, or a comma-separated list of simple variables and record/row fields, [...]
So, currently (pg 14), a row or record variables must stand alone as target. Or as the according Postgres error message would put it:
ERROR: record variable cannot be part of multiple-item INTO list
This works:
CREATE OR REPLACE FUNCTION public.get_parent_ltree (IN _parent_id int
, IN _tbl_name regclass
, OUT _l_tree ltree)
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE format('SELECT (l_tree).* FROM %s WHERE parent_id = $1', _tbl_name)
INTO _l_tree
USING _parent_id;
END
$func$;
Or this:
CREATE OR REPLACE FUNCTION public.get_parent_ltree2 (IN _parent_id int
, IN _tbl_name regclass
, OUT _l_tree ltree)
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE format('SELECT (l_tree).a, (l_tree).b FROM %s WHERE parent_id = $1', _tbl_name)
INTO _l_tree.a, _l_tree.b
USING _parent_id;
END
$func$;
db<>fiddle here
I agree that this is rather tricky. One might expect that a composite field is treated as a single field (like a simple type). But that's currently not so in PL/pgSQL assignments.
A related quote from the manual about composite types:
A composite type represents the structure of a row or record; it is
essentially just a list of field names and their data types.
PostgreSQL allows composite types to be used in many of the same ways that simple types can be used.
Bold emphasis mine.
Many. Not all.
Related:
Use of custom return types in a FOR loop in plpgsql
PostgreSQL: ERROR: 42601: a column definition list is required for functions returning "record"
Aside: Consider the additional module ltree instead of "growing your own". And if you continue working with your own composite type, consider a different name to avoid confusion / conflict with that module.

It looks like a Postgres bug but Erwin clarifies the issue in the adjacent answer. One of the natural workarounds is to use an auxiliary text variable in the way like this:
create or replace function get_parent_ltree(_parent_id int, tbl_name regclass)
returns ltree language plpgsql as
$func$
declare
rslt text;
begin
execute format('select l_tree from %s where parent_id = $1', tbl_name)
into rslt
using _parent_id;
return rslt::ltree;
end
$func$;

Related

how do I pass a user defined type variable to a function as a parameter?

I want to pass a user defined type parameter to a PLPGSQL function, but I am getting this error at runtime:
dev=# select process_shapes();
ERROR: invalid input syntax for integer: "(,,7)"
CONTEXT: PL/pgSQL function process_shapes() line 9 at SQL statement
dev=#
For some reason, the parameters are not passed correctly and I have no idea why it doesn't work.
My functions are:
CREATE OR REPLACE FUNCTION join_shapes(first_shape shape_t,second_shape shape_t,OUT new_shape shape_t)
AS $$
DECLARE
BEGIN -- simplified join_shape()s function
new_shape.num_lines:=first_shape.num_lines+second_shape.num_lines;
END;
$$ LANGUAGE PLPGSQL;
CREATE OR REPLACE FUNCTION process_shapes()
RETURNS void AS $$
DECLARE
rectangle shape_t;
triangle shape_t;
produced_shape shape_t;
BEGIN
rectangle.num_lines:=4;
triangle.num_lines:=3;
SELECT join_shapes(rectangle,triangle) INTO produced_shape;
RAISE NOTICE 'produced shape = %s',produced_shape;
END;
$$ LANGUAGE PLPGSQL;
Type definition:
CREATE TYPE shape_t AS (
shape_id integer,
shape_name varchar,
num_lines integer
);
Postgres version: 9.6.1
When the target of a SELECT ... INTO statement is of a composite type, it will assign each of the columns returned by the SELECT to a different field in the target.
However, SELECT join_shapes(rectangle,triangle) returns a single column of type shape_t, and it's trying to cram the whole thing into the first column of the target, i.e. produced_shape.shape_id (hence the error message about a failed integer conversion).
Instead, you need a SELECT statement which returns three columns. Just replace
SELECT join_shapes(rectangle,triangle)
with
SELECT * FROM join_shapes(rectangle,triangle)
Alternatively, you could use
produced_shape := (SELECT join_shapes(rectangle,triangle));
which performs a single assignment, rather than trying to assign the target fields individually.
For other people whom want to pass composite types to functions:
create type pref_public.create_test_row_input as (
name text
);
create or replace function pref_public.create_test_row(test_row pref_public.create_test_row_input) returns pref_public.test_rows as $$
insert into pref_public.test_rows (name)
values
(test_row.name)
returning *;
$$ language sql strict security definer;
grant execute on function pref_public.create_test_row to pref_user;
You'll need to use row()
select * from pref_public.create_test_row(row('new row'));
More info here

Execute a dynamic crosstab query

I implemented this function in my Postgres database: http://www.cureffi.org/2013/03/19/automatically-creating-pivot-table-column-names-in-postgresql/
Here's the function:
create or replace function xtab (tablename varchar, rowc varchar, colc varchar, cellc varchar, celldatatype varchar) returns varchar language plpgsql as $$
declare
dynsql1 varchar;
dynsql2 varchar;
columnlist varchar;
begin
-- 1. retrieve list of column names.
dynsql1 = 'select string_agg(distinct '||colc||'||'' '||celldatatype||''','','' order by '||colc||'||'' '||celldatatype||''') from '||tablename||';';
execute dynsql1 into columnlist;
-- 2. set up the crosstab query
dynsql2 = 'select * from crosstab (
''select '||rowc||','||colc||','||cellc||' from '||tablename||' group by 1,2 order by 1,2'',
''select distinct '||colc||' from '||tablename||' order by 1''
)
as ct (
'||rowc||' varchar,'||columnlist||'
);';
return dynsql2;
end
$$;
So now I can call the function:
select xtab('globalpayments','month','currency','(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)','text');
Which returns (because the return type of the function is varchar):
select * from crosstab (
'select month,currency,(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)
from globalpayments
group by 1,2
order by 1,2'
, 'select distinct currency
from globalpayments
order by 1'
) as ct ( month varchar,CAD text,EUR text,GBP text,USD text );
How can I get this function to not only generate the code for the dynamic crosstab, but also execute the result? I.e., the result when I manually copy/paste/execute is this. But I want it to execute without that extra step: the function shall assemble the dynamic query and execute it:
Edit 1
This function comes close, but I need it to return more than just the first column of the first record
Taken from: Are there any way to execute a query inside the string value (like eval) in PostgreSQL?
create or replace function eval( sql text ) returns text as $$
declare
as_txt text;
begin
if sql is null then return null ; end if ;
execute sql into as_txt ;
return as_txt ;
end;
$$ language plpgsql
usage: select * from eval($$select * from analytics limit 1$$)
However it just returns the first column of the first record :
eval
----
2015
when the actual result looks like this:
Year, Month, Date, TPV_USD
---- ----- ------ --------
2016, 3, 2016-03-31, 100000
What you ask for is impossible. SQL is a strictly typed language. PostgreSQL functions need to declare a return type (RETURNS ..) at the time of creation.
A limited way around this is with polymorphic functions. If you can provide the return type at the time of the function call. But that's not evident from your question.
Refactor a PL/pgSQL function to return the output of various SELECT queries
You can return a completely dynamic result with anonymous records. But then you are required to provide a column definition list with every call. And how do you know about the returned columns? Catch 22.
There are various workarounds, depending on what you need or can work with. Since all your data columns seem to share the same data type, I suggest to return an array: text[]. Or you could return a document type like hstore or json. Related:
Dynamic alternative to pivot with CASE and GROUP BY
Dynamically convert hstore keys into columns for an unknown set of keys
But it might be simpler to just use two calls: 1: Let Postgres build the query. 2: Execute and retrieve returned rows.
Selecting multiple max() values using a single SQL statement
I would not use the function from Eric Minikel as presented in your question at all. It is not safe against SQL injection by way of maliciously malformed identifiers. Use format() to build query strings unless you are running an outdated version older than Postgres 9.1.
A shorter and cleaner implementation could look like this:
CREATE OR REPLACE FUNCTION xtab(_tbl regclass, _row text, _cat text
, _expr text -- still vulnerable to SQL injection!
, _type regtype)
RETURNS text
LANGUAGE plpgsql AS
$func$
DECLARE
_cat_list text;
_col_list text;
BEGIN
-- generate categories for xtab param and col definition list
EXECUTE format(
$$SELECT string_agg(quote_literal(x.cat), '), (')
, string_agg(quote_ident (x.cat), %L)
FROM (SELECT DISTINCT %I AS cat FROM %s ORDER BY 1) x$$
, ' ' || _type || ', ', _cat, _tbl)
INTO _cat_list, _col_list;
-- generate query string
RETURN format(
'SELECT * FROM crosstab(
$q$SELECT %I, %I, %s
FROM %I
GROUP BY 1, 2 -- only works if the 3rd column is an aggregate expression
ORDER BY 1, 2$q$
, $c$VALUES (%5$s)$c$
) ct(%1$I text, %6$s %7$s)'
, _row, _cat, _expr -- expr must be an aggregate expression!
, _tbl, _cat_list, _col_list, _type);
END
$func$;
Same function call as your original version. The function crosstab() is provided by the additional module tablefunc which has to be installed. Basics:
PostgreSQL Crosstab Query
This handles column and table names safely. Note the use of object identifier types regclass and regtype. Also works for schema-qualified names.
Table name as a PostgreSQL function parameter
However, it is not completely safe while you pass a string to be executed as expression (_expr - cellc in your original query). This kind of input is inherently unsafe against SQL injection and should never be exposed to the general public.
SQL injection in Postgres functions vs prepared queries
Scans the table only once for both lists of categories and should be a bit faster.
Still can't return completely dynamic row types since that's strictly not possible.
Not quite impossible, you can still execute it (from a query execute the string and return SETOF RECORD.
Then you have to specify the return record format. The reason in this case is that the planner needs to know the return format before it can make certain decisions (materialization comes to mind).
So in this case you would EXECUTE the query, return the rows and return SETOF RECORD.
For example, we could do something like this with a wrapper function but the same logic could be folded into your function:
CREATE OR REPLACE FUNCTION crosstab_wrapper
(tablename varchar, rowc varchar, colc varchar,
cellc varchar, celldatatype varchar)
returns setof record language plpgsql as $$
DECLARE outrow record;
BEGIN
FOR outrow IN EXECUTE xtab($1, $2, $3, $4, $5)
LOOP
RETURN NEXT outrow
END LOOP;
END;
$$;
Then you supply the record structure on calling the function just like you do with crosstab.
Then when you all the query you would have to supply a record structure (as (col1 type, col2 type, etc) like you do with connectby.

PL/pgSQL column name the same as variable

I'm new to plpgsql and I'm trying to create function that will check if a certain value exists in table and if not will add a row.
CREATE OR REPLACE FUNCTION hire(
id_pracownika integer,
imie character varying,
nazwisko character varying,
miasto character varying,
pensja real)
RETURNS TEXT AS
$BODY$
DECLARE
wynik TEXT;
sprawdzenie INT;
BEGIN
sprawdzenie = id_pracownika;
IF EXISTS (SELECT id_pracownika FROM pracownicy WHERE id_pracownika=sprawdzenie) THEN
wynik = "JUZ ISTNIEJE";
RETURN wynik;
ELSE
INSERT INTO pracownicy(id_pracownika,imie,nazwisko,miasto,pensja)
VALUES (id_pracownika,imie,nazwisko,miasto,pensja);
wynik = "OK";
RETURN wynik;
END IF;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
The issue is that I'm getting errors saying that id_pracownika is a column name and a variable.
How to specify that "id_pracownika" in such context refers to column name?
Assuming id_pracownika is The PRIMARY KEY of the table. Or at least defined UNIQUE. (If it's not NOT NULL, NULL is a corner case.)
SELECT or INSERT
Your function is another implementation of "SELECT or INSERT" - a variant of the UPSERT problem, which is more complex in the face of concurrent write load than it might seem. See:
Is SELECT or INSERT in a function prone to race conditions?
With UPSERT in Postgres 9.5 or later
In Postgres 9.5 or later use UPSERT (INSERT ... ON CONFLICT ...) Details in the Postgres Wiki. This new syntax does a clean job:
CREATE OR REPLACE FUNCTION hire(
_id_pracownika integer
, _imie varchar
, _nazwisko varchar
, _miasto varchar
, _pensja real)
RETURNS text
LANGUAGE plpgsql AS
$func$
BEGIN
INSERT INTO pracownicy
( id_pracownika, imie, nazwisko, miasto, pensja)
VALUES (_id_pracownika,_imie,_nazwisko,_miasto,_pensja)
ON CONFLICT DO NOTHING;
IF FOUND THEN
RETURN 'OK';
ELSE
RETURN 'JUZ ISTNIEJE'; -- already exists
END IF;
END
$func$;
About the special variable FOUND:
Why is IS NOT NULL false when checking a row type?
Table-qualify column names to disambiguate where necessary. (You can also prefix function parameters with the function name, but that gets awkward quickly.)
But column names in the target list of an INSERT may not be table-qualified. Those are never ambiguous anyway.
Best avoid ambiguities a priori. Some (including me) like to prefix all function parameters and variables with an underscore.
If you positively need a column name as function parameter name, one way to avoid naming collisions is to use an ALIAS inside the function. One of the rare cases where ALIAS is actually useful.
Or reference function parameters by ordinal position: $1 for id_pracownika in this case.
If all else fails, you can decide what takes precedence by setting #variable_conflict. See:
Naming conflict between function parameter and result of JOIN with USING clause
There is more:
There are intricacies to the RETURNING clause in an UPSERT. See:
How to use RETURNING with ON CONFLICT in PostgreSQL?
String literals (text constants) must be enclosed in single quotes: 'OK', not "OK". See:
Insert text with single quotes in PostgreSQL
Assigning variables is comparatively more expensive than in other programming languages. Keep assignments to a minimum for best performance in plpgsql. Do as much as possible in SQL statements directly.
VOLATILE COST 100 are default decorators for functions. No need to spell those out.
Without UPSERT in Postgres 9.4 or older
...
IF EXISTS (SELECT FROM pracownicy p
WHERE p.id_pracownika = hire.id_pracownika) THEN
RETURN 'JUZ ISTNIEJE';
ELSE
INSERT INTO pracownicy(id_pracownika,imie,nazwisko,miasto,pensja)
VALUES (hire.id_pracownika,hire.imie,hire.nazwisko,hire.miasto,hire.pensja);
RETURN 'OK';
END IF;
...
But there is a tiny race condition between the SELECT and the INSERT, so not bullet-proof under heavy concurrent write-load.
In an EXISTS expression, the SELECT list does not matter. SELECT id_pracownika, SELECT 1, or even SELECT 1/0 - all the same. Just use an empty SELECT list. Only the existence of any qualifying row matters. See:
What is easier to read in EXISTS subqueries?
It is a example tested by me where I use EXECUTE to run a select and put its result in a cursor, using dynamic column names.
1. Create the table:
create table people (
nickname varchar(9),
name varchar(12),
second_name varchar(12),
country varchar(30)
);
2. Create the function:
CREATE OR REPLACE FUNCTION fun_find_people (col_name text, col_value varchar)
RETURNS void AS
$BODY$
DECLARE
local_cursor_p refcursor;
row_from_people RECORD;
BEGIN
open local_cursor_p FOR
EXECUTE 'select * from people where '|| col_name || ' LIKE ''' || col_value || '%'' ';
raise notice 'col_name: %',col_name;
raise notice 'col_value: %',col_value;
LOOP
FETCH local_cursor_p INTO row_from_people; EXIT WHEN NOT FOUND;
raise notice 'row_from_people.nickname: %', row_from_people.nickname ;
raise notice 'row_from_people.name: %', row_from_people.name ;
raise notice 'row_from_people.country: %', row_from_people.country;
END LOOP;
END;
$BODY$ LANGUAGE 'plpgsql'
3. Run the function
select fun_find_people('name', 'Cristian');
select fun_find_people('country', 'Chile');
inspire with Erwin Brandstetter's answers.
CREATE OR REPLACE FUNCTION test_upsert(
_parent_id int,
_some_text text)
RETURNS text
LANGUAGE plpgsql AS
$func$
DECLARE a text;
BEGIN
INSERT INTO parent_tree (parent_id, some_text)
VALUES (_parent_id,_some_text)
ON CONFLICT DO NOTHING
RETURNING 'ok' into a;
return a;
IF NOT FOUND THEN
return 'JUZ ISTNIEJE';
END IF;
END
$func$;
Follow Erwin's answer. I make a variable hold the return type text.
If conflict do nothing then the function will return nothing. For example, already have parent_id = 10, Then the result would be as following:
test_upsert
------------
(1 row)
NOT Sure the usage of:
IF NOT FOUND THEN
return 'JUZ ISTNIEJE';
END IF;

Passing table names in an array

I need to do the same deletion or purge operation (based on several conditions) on a set of tables. For that I am trying to pass the table names in an array to a function. I am not sure if I am doing it right. Or is there a better way?
I am pasting just a sample example this is not the real function I have written but the basic is same as below:
CREATE OR REPLACE FUNCTION test (tablename text[]) RETURNS int AS
$func$
BEGIN
execute 'delete * from '||tablename;
RETURN 1;
END
$func$ LANGUAGE plpgsql;
But when I call the function I get an error:
select test( {'rajeev1'} );
ERROR: syntax error at or near "{"
LINE 10: select test( {'rajeev1'} );
^
********** Error **********
ERROR: syntax error at or near "{"
SQL state: 42601
Character: 179
Array syntax
'{rajeev1, rajeev2}' or ARRAY['rajeev1', 'rajeev2']. Read the manual.
TRUNCATE
Since you are deleting all rows from the tables, consider TRUNCATE instead. Per documentation:
Tip: TRUNCATE is a PostgreSQL extension that provides a faster
mechanism to remove all rows from a table.
Be sure to study the details. If TRUNCATE works for you, the whole operation becomes very simple, since the command accepts multiple tables:
TRUNCATE rajeev1, rajeev2, rajeev3, ..
Dynamic DELETE
Else you need dynamic SQL like you already tried. The scary missing detail: you are completely open to SQL injection and catastrophic syntax errors. Use format() with %I (not %s to sanitize identifiers like table names. Or, better yet in this particular case, use an array of regclass as parameter instead:
CREATE OR REPLACE FUNCTION f_del_all(_tbls regclass)
RETURNS void AS
$func$
DECLARE
_tbl regclass;
BEGIN
FOREACH _tbl IN ARRAY _tbls LOOP
EXECUTE format('DELETE * FROM %s', _tbl);
END LOOP;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT f_del_all('{rajeev1,rajeev2,rajeev3}');
Explanation here:
Table name as a PostgreSQL function parameter
You used wrong syntax for text array constant in the function call. But even if it was right, your function is not correct.
If your function has text array as argument you should loop over the array to execute query for each element.
CREATE OR REPLACE FUNCTION test (tablenames text[]) RETURNS int AS
$func$
DECLARE
tablename text;
BEGIN
FOREACH tablename IN ARRAY tablenames LOOP
EXECUTE FORMAT('delete * from %s', tablename);
END LOOP;
RETURN 1;
END
$func$ LANGUAGE plpgsql;
You can then call the function for several tables at once, not only for one.
SELECT test( '{rajeev1, rajeev2}' );
If you do not need this feature, simply change the argument type to text.
CREATE OR REPLACE FUNCTION test (tablename text) RETURNS int AS
$func$
BEGIN
EXECUTE format('delete * from %s', tablename);
RETURN 1;
END
$func$ LANGUAGE plpgsql;
SELECT test('rajeev1');
I recommend using the format function.
If you want to execute a function (say purge_this_one_table(tablename)) on a group of tables identified by similar names you can use this construction:
create or replace function purge_all_these_tables(mask text)
returns void language plpgsql
as $$
declare
tabname text;
begin
for tabname in
select relname
from pg_class
where relkind = 'r' and relname like mask
loop
execute format(
'purge_this_one_table(%s)',
tabname);
end loop;
end $$;
select purge_all_these_tables('agg_weekly_%');
It should be:
select test('{rajeev1}');

Unexpected behaviour for custom type returned from a function

I have created a custom type
CREATE TYPE rc_test_type AS (a1 bigint);
and a function
CREATE OR REPLACE FUNCTION public.rc_test_type_function(test_table character varying, dummy integer)
RETURNS rc_test_type AS
$BODY$
DECLARE
ret rc_test_type;
query text;
BEGIN
query := 'SELECT count(*) from ' || test_table ;
EXECUTE query into ret.a1;
RETURN ret;
END $BODY$
LANGUAGE plpgsql VOLATILE
If I run
SELECT * FROM rc_test_type_function('some_table', 1);
I get
"a1"
1389
So far so good.
If I run
SELECT p FROM (SELECT rc_test_type_function('some_table', s.step) AS p
FROM some_other_table s) foo;
I get
"p"
"(1389)"
"(1389)"
since 'some_other_table' has just two records. Fine.
But then if I try
SELECT p.a1 FROM (select rc_test_type_function('some_table', s.step) AS p
FROM some_other_table s) foo;
I get the error
missing FROM-clause entry in subquery for table »p«
which I find strange since the subquery has not changed.
Two questions:
Can anyone explain what's going on?
How do I extract the field value a1 from the returned array?
Use parentheses around the composite type:
SELECT (p).a1
FROM (SELECT rc_test_type_function('some_table', s.step) AS p
FROM some_other_table s
) foo;
Even though your type has just a single column is still a composite type - with its own column name. Doesn't make a lot of sense, but that's how you built it.
(You might want to just use a simple type or maybe a DOMAIN instead.)
Quoting the manual here:
(compositecol).somefield
(mytable.compositecol).somefield
The parentheses are required here to show that compositecol is a column name not a
a table name, or that mytable is a table name not a schema name in the second case.
Proper function
Omitting the part with the composite type, your function would be safer, simpler and faster this way:
CREATE OR REPLACE FUNCTION foo(test_table varchar, dummy int, OUT p bigint)
AS
$func$
BEGIN
EXECUTE format('SELECT count(*) from %I', test_table) -- !avoid SQLi!
INTO p;
END
$func$ LANGUAGE plpgsql;
Avoid SQL injection with dynamic SQL!
An OUT parameter simplifies the syntax in this case. You don't need a DECLARE clause at all, and no RETURN either
Even better
CREATE OR REPLACE FUNCTION foo(test_table regclass, dummy int, OUT p bigint)
AS
$func$
BEGIN
EXECUTE 'SELECT count(*) from ' || test_table
INTO p;
END
$func$ LANGUAGE plpgsql;
By using the object identifier regclass this would also work with schema-qualified table names. And SQLi is not possible to begin with. The function would fail immediately if the table name is illegal and it is quoted automatically when converted to text automatically.