PostgreSQL: store function in column as value - postgresql

Can functions be stored as anonymous functions directly in column as its value?
Let's say I want this function be stored in column.
Example (pseudocode):
Table my_table: pk (int), my_function (func)
func ( x ) { return x * 100 }
And later use it as:
select
t.my_function(some_input) AS output
from
my_table as t
where t.pk = 1999
Function may vary for each pk.

Your title asks something else than your example.
A function has to be created before you can call it. (title)
An expression has to be evaluated. You would need a meta-function for that. (example)
Here are solutions for both:
1. Evaluate expressions dynamically
You have to take into account that the resulting type can vary. I use polymorphic types for that.
CREATE OR REPLACE FUNCTION f1(int)
RETURNS int
LANGUAGE sql IMMUTABLE AS
'SELECT $1 * 100;';
CREATE OR REPLACE FUNCTION f2(text)
RETURNS text
LANGUAGE sql IMMUTABLE AS
$$SELECT $1 || '_foo';$$;
CREATE TABLE my_expr (
expr text PRIMARY KEY
, def text
, rettype regtype
);
INSERT INTO my_expr VALUES
('x', 'f1(3)' , 'int')
, ('y', $$f2('bar')$$, 'text')
, ('z', 'now()' , 'timestamptz')
;
CREATE OR REPLACE FUNCTION f_eval(text, _type anyelement = 'NULL'::text, OUT _result anyelement)
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE
'SELECT ' || (SELECT def FROM my_expr WHERE expr = $1)
INTO _result;
END
$func$;
Related:
Refactor a PL/pgSQL function to return the output of various SELECT queries
Call:
SQL is strictly typed, the same result column can only have one data type. For multiple rows with possibly heterogeneous data types, you might settle for type text, as every data type can be cast to and from text:
SELECT *, f_eval(expr) AS result -- default to type text
FROM my_expr;
Or return multplce columns like:
SELECT *
, CASE WHEN rettype = 'text'::regtype THEN f_eval(expr) END AS text_result -- default to type text
, CASE WHEN rettype = 'int'::regtype THEN f_eval(expr, NULL::int) END AS int_result
, CASE WHEN rettype = 'timestamptz'::regtype THEN f_eval(expr, NULL::timestamptz) END AS tstz_result
-- , more?
FROM my_expr;
db<>fiddle here
2. Create and use functions dynamically
It is possible to create functions dynamically and then use them. You cannot do that with plain SQL, however. You will have to use another function to do that or at least an anonymous code block (DO statement), introduced in PostgreSQL 9.0.
It can work like this:
CREATE TABLE my_func (func text PRIMARY KEY, def text);
INSERT INTO my_func VALUES
('f'
, $$CREATE OR REPLACE FUNCTION f(int)
RETURNS int
LANGUAGE sql IMMUTABLE AS
'SELECT $1 * 100;'$$);
CREATE OR REPLACE FUNCTION f_create_func(text)
RETURNS void
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE (SELECT def FROM my_func WHERE func = $1);
END
$func$;
Call:
SELECT f_create_func('f');
SELECT f(3);
db<>fiddle here
You may want to drop the function afterwards.
In most cases you should just create the functions instead and be done with it. Use separate schemas if you have problems with multiple versions or privileges.
For more information on the features I used here, see my related answer on dba.stackexchange.com.

Related

Cannot run Postgres Stored Procedure [duplicate]

I have a Postgres function which is returning a table:
CREATE OR REPLACE FUNCTION testFunction() RETURNS TABLE(a int, b int) AS
$BODY$
DECLARE a int DEFAULT 0;
DECLARE b int DEFAULT 0;
BEGIN
CREATE TABLE tempTable AS SELECT a, b;
RETURN QUERY SELECT * FROM tempTable;
DROP TABLE tempTable;
END;
$BODY$
LANGUAGE plpgsql;
This function is not returning data in row and column form. Instead it returns data as:
(0,0)
That is causing a problem in Coldfusion cfquery block in extracting data. How do I get data in rows and columns when a table is returned from this function? In other words: Why does the PL/pgSQL function not return data as columns?
To get individual columns instead of the row type, call the function with:
SELECT * FROM testfunction();
Just like you would select all columns from a table.
Also consider this reviewed form of your test function:
CREATE OR REPLACE FUNCTION testfunction()
RETURNS TABLE(a int, b int)
LANGUAGE plpgsql AS
$func$
DECLARE
_a int := 0;
_b int := 0;
BEGIN
CREATE TABLE tempTable AS SELECT _a, _b;
RETURN QUERY SELECT * FROM tempTable;
DROP TABLE tempTable;
END
$func$;
In particular:
The DECLARE key word is only needed once.
Avoid declaring parameters that are already (implicitly) declared as OUT parameters in the RETURNS TABLE (...) clause.
Don't use unquoted CaMeL-case identifiers in Postgres. It works, unquoted identifiers are cast to lower case, but it can lead to confusing errors. See:
Are PostgreSQL column names case-sensitive?
The temporary table in the example is completely useless (probably over-simplified). The example as given boils down to:
CREATE OR REPLACE FUNCTION testfunction(OUT a int, OUT b int)
LANGUAGE plpgsql AS
$func$
BEGIN
a := 0;
b := 0;
END
$func$;
Of course you can do this by putting the function call in the FROM clause, like Eric Brandstetter correctly answered.
However, this is sometimes complicating in a query that already has other things in the FROM clause.
To get the individual columns that the function returns, you can use this syntax:
SELECT (testfunction()).*
Or to get only the column called "a":
SELECT (testfunction()).a
Place the whole function, including the input value(s) in parenteses, followed by a dot and the desired column name, or an asterisk.
To get the column names that the function returns, you'll have to either:
check the source code
inspect the result of the function first, like so : SELECT * FROM testfunction() .
The input values can still come out of a FROM clause.
Just to illustrate this, consider this function and test data:
CREATE FUNCTION funky(a integer, b integer)
RETURNS TABLE(x double precision, y double precision) AS $$
SELECT a*random(), b*random();
$$ LANGUAGE SQL;
CREATE TABLE mytable(a integer, b integer);
INSERT INTO mytable
SELECT generate_series(1,100), generate_series(101,200);
You could call the function "funky(a,b)", without the need to put it in the FROM clause:
SELECT (funky(mytable.a, mytable.b)).*
FROM mytable;
Which would result in 2 columns:
x | y
-------------------+-------------------
0.202419687062502 | 55.417385618668
1.97231830470264 | 63.3628275180236
1.89781916560605 | 1.98870931006968
(...)

Execute a dynamic crosstab query

I implemented this function in my Postgres database: http://www.cureffi.org/2013/03/19/automatically-creating-pivot-table-column-names-in-postgresql/
Here's the function:
create or replace function xtab (tablename varchar, rowc varchar, colc varchar, cellc varchar, celldatatype varchar) returns varchar language plpgsql as $$
declare
dynsql1 varchar;
dynsql2 varchar;
columnlist varchar;
begin
-- 1. retrieve list of column names.
dynsql1 = 'select string_agg(distinct '||colc||'||'' '||celldatatype||''','','' order by '||colc||'||'' '||celldatatype||''') from '||tablename||';';
execute dynsql1 into columnlist;
-- 2. set up the crosstab query
dynsql2 = 'select * from crosstab (
''select '||rowc||','||colc||','||cellc||' from '||tablename||' group by 1,2 order by 1,2'',
''select distinct '||colc||' from '||tablename||' order by 1''
)
as ct (
'||rowc||' varchar,'||columnlist||'
);';
return dynsql2;
end
$$;
So now I can call the function:
select xtab('globalpayments','month','currency','(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)','text');
Which returns (because the return type of the function is varchar):
select * from crosstab (
'select month,currency,(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)
from globalpayments
group by 1,2
order by 1,2'
, 'select distinct currency
from globalpayments
order by 1'
) as ct ( month varchar,CAD text,EUR text,GBP text,USD text );
How can I get this function to not only generate the code for the dynamic crosstab, but also execute the result? I.e., the result when I manually copy/paste/execute is this. But I want it to execute without that extra step: the function shall assemble the dynamic query and execute it:
Edit 1
This function comes close, but I need it to return more than just the first column of the first record
Taken from: Are there any way to execute a query inside the string value (like eval) in PostgreSQL?
create or replace function eval( sql text ) returns text as $$
declare
as_txt text;
begin
if sql is null then return null ; end if ;
execute sql into as_txt ;
return as_txt ;
end;
$$ language plpgsql
usage: select * from eval($$select * from analytics limit 1$$)
However it just returns the first column of the first record :
eval
----
2015
when the actual result looks like this:
Year, Month, Date, TPV_USD
---- ----- ------ --------
2016, 3, 2016-03-31, 100000
What you ask for is impossible. SQL is a strictly typed language. PostgreSQL functions need to declare a return type (RETURNS ..) at the time of creation.
A limited way around this is with polymorphic functions. If you can provide the return type at the time of the function call. But that's not evident from your question.
Refactor a PL/pgSQL function to return the output of various SELECT queries
You can return a completely dynamic result with anonymous records. But then you are required to provide a column definition list with every call. And how do you know about the returned columns? Catch 22.
There are various workarounds, depending on what you need or can work with. Since all your data columns seem to share the same data type, I suggest to return an array: text[]. Or you could return a document type like hstore or json. Related:
Dynamic alternative to pivot with CASE and GROUP BY
Dynamically convert hstore keys into columns for an unknown set of keys
But it might be simpler to just use two calls: 1: Let Postgres build the query. 2: Execute and retrieve returned rows.
Selecting multiple max() values using a single SQL statement
I would not use the function from Eric Minikel as presented in your question at all. It is not safe against SQL injection by way of maliciously malformed identifiers. Use format() to build query strings unless you are running an outdated version older than Postgres 9.1.
A shorter and cleaner implementation could look like this:
CREATE OR REPLACE FUNCTION xtab(_tbl regclass, _row text, _cat text
, _expr text -- still vulnerable to SQL injection!
, _type regtype)
RETURNS text
LANGUAGE plpgsql AS
$func$
DECLARE
_cat_list text;
_col_list text;
BEGIN
-- generate categories for xtab param and col definition list
EXECUTE format(
$$SELECT string_agg(quote_literal(x.cat), '), (')
, string_agg(quote_ident (x.cat), %L)
FROM (SELECT DISTINCT %I AS cat FROM %s ORDER BY 1) x$$
, ' ' || _type || ', ', _cat, _tbl)
INTO _cat_list, _col_list;
-- generate query string
RETURN format(
'SELECT * FROM crosstab(
$q$SELECT %I, %I, %s
FROM %I
GROUP BY 1, 2 -- only works if the 3rd column is an aggregate expression
ORDER BY 1, 2$q$
, $c$VALUES (%5$s)$c$
) ct(%1$I text, %6$s %7$s)'
, _row, _cat, _expr -- expr must be an aggregate expression!
, _tbl, _cat_list, _col_list, _type);
END
$func$;
Same function call as your original version. The function crosstab() is provided by the additional module tablefunc which has to be installed. Basics:
PostgreSQL Crosstab Query
This handles column and table names safely. Note the use of object identifier types regclass and regtype. Also works for schema-qualified names.
Table name as a PostgreSQL function parameter
However, it is not completely safe while you pass a string to be executed as expression (_expr - cellc in your original query). This kind of input is inherently unsafe against SQL injection and should never be exposed to the general public.
SQL injection in Postgres functions vs prepared queries
Scans the table only once for both lists of categories and should be a bit faster.
Still can't return completely dynamic row types since that's strictly not possible.
Not quite impossible, you can still execute it (from a query execute the string and return SETOF RECORD.
Then you have to specify the return record format. The reason in this case is that the planner needs to know the return format before it can make certain decisions (materialization comes to mind).
So in this case you would EXECUTE the query, return the rows and return SETOF RECORD.
For example, we could do something like this with a wrapper function but the same logic could be folded into your function:
CREATE OR REPLACE FUNCTION crosstab_wrapper
(tablename varchar, rowc varchar, colc varchar,
cellc varchar, celldatatype varchar)
returns setof record language plpgsql as $$
DECLARE outrow record;
BEGIN
FOR outrow IN EXECUTE xtab($1, $2, $3, $4, $5)
LOOP
RETURN NEXT outrow
END LOOP;
END;
$$;
Then you supply the record structure on calling the function just like you do with crosstab.
Then when you all the query you would have to supply a record structure (as (col1 type, col2 type, etc) like you do with connectby.

How to return uncertain number columns of a table from a postgresql function?

As we know, plpgsql functions can return a table like this:
RETURNS table(int, char(1), ...)
But how to write this function, when the list of columns is uncertain at the time of creating the function.
When a function returns anonymous records
RETURNS SETOF record
you have to provide a column definition list when calling it with SELECT * FROM. SQL demands to know column names and types to interpret *. For registered tables and types this is provided by the system catalog. For functions you need to declare it yourself one way or the other. Either in the function definition or in the call. The call could look like #Craig already provided. You probably didn't read his answer carefully enough.
Depending on what you need exactly, there are a number of ways around this, though:
1) Return a single anonymous record
Example:
CREATE OR REPLACE FUNCTION myfunc_single() -- return a single anon rec
RETURNS record AS
$func$
DECLARE
rec record;
BEGIN
SELECT into rec 1, 'foo'; -- note missing type for 'foo'
RETURN rec;
END
$func$ LANGUAGE plpgsql;
This is a very limited niche. Only works for a single anonymous record from a function defined with:
RETURNS record
Call without * FROM:
SELECT myfunc_single();
Won't work for a SRF (set-returning function) and only returns a string representation of the whole record (type record). Rarely useful.
To get individual cols from a single anonymous record, you need to provide a column definition list again:
SELECT * FROM myfunc_single() AS (id int, txt unknown); -- note "unknown" type
2) Return well known row type with a super-set of columns
Example:
CREATE TABLE t (id int, txt text, the_date date);
INSERT INTO t VALUES (3, 'foz', '2014-01-13'), (4, 'baz', '2014-01-14');
CREATE OR REPLACE FUNCTION myfunc_tbl() -- return well known table
RETURNS SETOF t AS
$func$
BEGIN
RETURN QUERY
TABLE t;
-- SELECT * FROM t; -- equivalent
END
$func$ LANGUAGE plpgsql;
The function returns all columns of the table. This is short and simple and performance won't suffer as long as your table doesn't hold a huge number of columns or huge columns.
Select individual columns on call:
SELECT id, txt FROM myfunc_tbl();
SELECT id, the_date FROM myfunc_tbl();
-> SQLfiddle demonstrating all.
3) Advanced solutions
This answer is long enough already. And this closely related answer has it all:
Refactor a PL/pgSQL function to return the output of various SELECT queries
Look to the last chapter in particular: Various complete table types
If the result is of uncertain/undefined format you must use RETURNS record or (for a multi-row result) RETURNS SETOF record.
The calling function must then specify the table format, eg:
SELECT my_func() AS result(a integer, b char(1));
BTW, char is an awful data type with insane space-padding rules that date back to the days of fixed-width file formats. Don't use it. Always just use text or varchar.
Given comments, let's make this really explicit:
regress=> CREATE OR REPLACE FUNCTION f_something() RETURNS SETOF record AS $$
SELECT 1, 2, TEXT 'a';
$$ LANGUAGE SQL;
CREATE FUNCTION
regress=> SELECT * FROM f_something();
ERROR: a column definition list is required for functions returning "record"
LINE 1: SELECT * FROM f_something();
regress=> SELECT * FROM f_something() AS x(a integer, b integer, c text);
a | b | c
---+---+---
1 | 2 | a
(1 row)

Return SETOF rows from PostgreSQL function

I have a situation where I want to return the join between two views. and that's a lot of columns. It was pretty easy in sql server. But in PostgreSQL when I do the join. I get the error "a column definition list is required".
Is there any way I can bypass this, I don't want to provide the definitions of returning columns.
CREATE OR REPLACE FUNCTION functionA(username character varying DEFAULT ''::character varying, databaseobject character varying DEFAULT ''::character varying)
RETURNS SETOF ???? AS
$BODY$
Declare
SqlString varchar(4000) = '';
BEGIN
IF(UserName = '*') THEN
Begin
SqlString := 'select * from view1 left join ' + databaseobject + ' as view2 on view1.id = view2.id';
End;
ELSE
Begin
SqlString := 'select * from view3 left join ' + databaseobject + ' as view2 on view3.id = view2.id';
End;
END IF;
execute (SqlString );
END;
$BODY$
Sanitize function
What you currently have can be simplified / sanitized to:
CREATE OR REPLACE FUNCTION func_a (username text = '', databaseobject text = '')
RETURNS ????
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE
format ('SELECT * FROM %s v1 LEFT JOIN %I v2 USING (id)'
, CASE WHEN username = '*' THEN 'view1' ELSE 'view3' END
, databaseobject);
END
$func$;
You only need additional instances of BEGIN ... END in the function body to start separate code blocks with their own scope, which is rarely needed.
The standard SQL concatenation operator is ||. + is a "creative" addition of your former vendor.
Don't use CaMeL-case identifiers unless you double-quote them. Best don't use them at all See:
Are PostgreSQL column names case-sensitive?
varchar(4000) is also tailored to a specific limitation of SQL Server. It has no specific significance in Postgres. Only use varchar(4000) if you actually need a limit of 4000 characters. I would just use text - except that we don't need any variables at all here, after simplifying the function.
If you have not used format(), yet, consult the manual here.
Return type
Now, for your actual question: The return type for a dynamic query can be tricky since SQL requires that to be declared at call time at the latest. If you have a table or view or composite type in your database already matching the column definition list, you can just use that:
CREATE FUNCTION foo()
RETURNS SETOF my_view AS
...
Else, spell the column definition list with out with (simplest) RETURNS TABLE:
CREATE FUNCTION foo()
RETURNS TABLE (col1 int, col2 text, ...) AS
...
If you are making the row type up as you go, you can return anonymous records:
CREATE FUNCTION foo()
RETURNS SETOF record AS
...
But then you have to provide a column definition list with every call, so I hardly ever use that.
I wouldn't use SELECT * to begin with. Use a definitive list of columns to return and declare your return type accordingly:
CREATE OR REPLACE FUNCTION func_a(username text = '', databaseobject text = '')
RETURNS TABLE(col1 int, col2 text, col3 date)
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE
format ($f$SELECT v1.col1, v1.col2, v2.col3
FROM %s v1 LEFT JOIN %I v2 USING (id)$f$
, CASE WHEN username = '*' THEN 'view1' ELSE 'view3' END
, databaseobject);
END
$func$;
For completely dynamic queries, consider building the query in your client to begin with, instead of using a function.
You need to understand basics first:
Refactor a PL/pgSQL function to return the output of various SELECT queries
PL/pgSQL in the Postgres manual
Then there are more advanced options with polymorphic types, which allow you to pass the return type at call time. More in the last chapter of:
Refactor a PL/pgSQL function to return the output of various SELECT queries

Optional argument in PL/pgSQL function

I am trying to write a PL/pgSQL function with optional arguments. It performs a query based on a filtered set of records (if specified), otherwise performs a query on the entire data set in a table.
For example (PSEUDO CODE):
CREATE OR REPLACE FUNCTION foofunc(param1 integer, param2 date, param2 date, optional_list_of_ids=[]) RETURNS SETOF RECORD AS $$
IF len(optional_list_of_ids) > 0 THEN
RETURN QUERY (SELECT * from foobar where f1=param1 AND f2=param2 AND id in optional_list_of_ids);
ELSE
RETURN QUERY (SELECT * from foobar where f1=param1 AND f2=param2);
ENDIF
$$ LANGUAGE SQL;
What would be the correct way to implement this function?
As an aside, I would like to know how I could call such a function in another outer function. This is how I would do it - is it correct, or is there a better way?
CREATE FUNCTION foofuncwrapper(param1 integer, param2 date, param2 date) RETURNS SETOF RECORD AS $$
BEGIN
CREATE TABLE ids AS SELECT id from foobar where id < 100;
RETURN QUERY (SELECT * FROM foofunc(param1, param2, ids));
END
$$ LANGUAGE SQL
Since PostgreSQL 8.4 (which you seem to be running), there are default values for function parameters. If you put your parameter last and provide a default, you can simply omit it from the call:
CREATE OR REPLACE FUNCTION foofunc(_param1 integer
, _param2 date
, _ids int[] DEFAULT '{}')
RETURNS SETOF foobar -- declare return type!
LANGUAGE plpgsql AS
$func$
BEGIN -- required for plpgsql
IF _ids <> '{}'::int[] THEN -- exclude empty array and NULL
RETURN QUERY
SELECT *
FROM foobar
WHERE f1 = _param1
AND f2 = _param2
AND id = ANY(_ids); -- "IN" is not proper syntax for arrays
ELSE
RETURN QUERY
SELECT *
FROM foobar
WHERE f1 = _param1
AND f2 = _param2;
END IF;
END -- required for plpgsql
$func$;
Major points:
The keyword DEFAULT is used to declare parameter defaults. Short alternative: =.
I removed the redundant param1 from the messy example.
Since you return SELECT * FROM foobar, declare the return type as RETURNS SETOF foobar instead of RETURNS SETOF record. The latter form with anonymous records is very unwieldy, you'd have to provide a column definition list with every call.
I use an array of integer (int[]) as function parameter. Adapted the IF expression and the WHERE clause accordingly.
IF statements are not available in plain SQL. Has to be LANGUAGE plpgsql for that.
Call with or without _ids:
SELECT * FROM foofunc(1, '2012-1-1'::date);
Effectively the same:
SELECT * FROM foofunc(1, '2012-1-1'::date, '{}'::int[]);
You have to make sure the call is unambiguous. If you have another function of the same name and two parameters, Postgres might not know which to pick. Explicit casting (like I demonstrate) narrows it down. Else, untyped string literals work, too, but being explicit never hurts.
Call from within another function:
CREATE FUNCTION foofuncwrapper(_param1 integer, _param2 date)
RETURNS SETOF foobar
LANGUAGE plgpsql AS
$func$
DECLARE
_ids int[] := '{1,2,3}';
BEGIN
-- whatever
RETURN QUERY
SELECT * FROM foofunc(_param1, _param2, _ids);
END
$func$;
Elaborating on Frank's answer on this thread:
The VARIADIC agument doesn't have to be the only argument, only the last one.
You can use VARIADIC for functions that may take zero variadic arguments, it's just a little fiddlier in that it requires a different calling style for zero args. You can provide a wrapper function to hide the ugliness. Given an initial varardic function definition like:
CREATE OR REPLACE FUNCTION foofunc(param1 integer, param2 date, param2 date, optional_list_of_ids VARIADIC integer[]) RETURNS SETOF RECORD AS $$
....
$$ language sql;
For zero args use a wrapper like:
CREATE OR REPLACE FUNCTION foofunc(integer, date, date) RETURNS SETOF RECORD AS $body$
SELECT foofunc($1,$2,$3,VARIADIC ARRAY[]::integer[]);
$body$ LANGUAGE 'sql';
or just call the main func with an empty array like VARIADIC '{}'::integer[] directly. The wrapper is ugly, but it's contained ugliness, so I'd recommend using a wrapper.
Direct calls can be made in variadic form:
SELECT foofunc(1,'2011-01-01','2011-01-01', 1, 2, 3, 4);
... or array call form with array ctor:
SELECT foofunc(1,'2011-01-01','2011-01-01', VARIADIC ARRAY[1,2,3,4]);
... or array text literal form:
SELECT foofunc(1,'2011-01-01','2011-01-01', VARIADIC '{1,2,3,4}'::int[]);
The latter two forms work with empty arrays.
You mean SQL Functions with Variable Numbers of Arguments? If so, use VARIADIC.