Can I declare a Postgres function that takes an array of any type?

I.e. I need a function that can be called like this:
select myfunc({1,'foo', true})
or
select myfunc({42.0,7, false, x'ff'})
To be 100% clear, I actually want
select myfunc(array[col1,col2,col3])
where col1, col2, col3 are of different types. Maybe that makes a difference to the answers.

https://www.postgresql.org/docs/current/static/extend-type-system.html#EXTEND-TYPES-POLYMORPHIC
Each position (either argument or return value) declared as anyelement is allowed to have any specific actual data type, but in any given call they must all be the same actual type. Each position declared as anyarray can have any array data type, but similarly they must all be the same type.
A function can accept anyarray, which is effectively an array of values of any single type, not an array with different types mixed together.
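For contrast, a minimal sketch (the function name aa is made up) showing that an anyarray parameter only accepts arrays whose elements share one type:

create function aa(a anyarray) returns text as $$
begin
  return a::text;  -- any array type can be rendered as text
end;
$$ language plpgsql;

select aa(array[1, 2, 3]);           -- OK: integer[]
-- select aa(array[1, 'foo', true]); -- fails: no common element type can be resolved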
What you are probably looking for instead is something like this:
so=# create function ae(i anyelement) returns anyelement as $$
begin
raise info '%',i::text; return i;
end;
$$ language plpgsql
;
CREATE FUNCTION
so=# create table pm100(f float,b bool, t bytea);
CREATE TABLE
so=# select ae((42.0, false, '\xff')::pm100);
INFO: (42,f,"\\xff")
ae
----------------
(42,f,"\\xff")
(1 row)

No, you cannot do it; @Vao Tsun's reply is absolutely correct. The PostgreSQL SQL language is pretty static, like C or Pascal. There are a few dynamic features, but they are limited.
Any query has two stages: planning and execution. The data type of every value must be known at planning time (dynamic queries and the record type are an exception, but only locally in PL/pgSQL). Because all types must be known before execution, PostgreSQL doesn't allow features that would hold values of dynamic type, such as polymorphic collections.
For constant values there can be a workaround for your case. You can write a variadic function with parameters of type "any". This makes sense only for constant values, whose types are known at planning time, and such functions can be implemented only in the C language. For example, the format function is of this kind.
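For illustration, the built-in format function is declared as format(text, VARIADIC "any"), so a single call can mix constant types:

select format('%s | %s | %s', 42.0, false, 'foo');  -- returns '42.0 | f | foo'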
The need to do some dynamic work is a signal of a "broken" design, "broken" from the PostgreSQL perspective. Some patterns cannot be implemented in Postgres, and it is better to implement them outside the database or with a different kind of software.

Related

How to pass a Table's Column into a plpgsql Function while performing a SELECT... statement

I googled, but everyone was asking how to pass tables or how to use the returned result in a Function; I want to do neither. I simply want to take the value of a Column of a table (let's assume col2 below is of the text datatype) and pass that data into a Function, so I can manipulate the data, but in the SELECT... statement itself, i.e.
SELECT t.col1, "myCustomFunction"(t.col2)
FROM tbl t
WHERE t.col1 = 'someCondition';
CREATE OR REPLACE FUNCTION myCustomFunction(myArg text)
RETURNS text AS $$
DECLARE
BEGIN
RETURN UPPER(myArg);
END
$$ LANGUAGE plpgsql;
... So if myCustomFunction()'s job was to capitalize letters (it's not, just an example), the output would be the table with col2 data all capitalized.
Is this possible? I supposed it would be no different than embedding a CASE expression there, which I know works, and a Function returns a result, so I assumed it would be the same, but I am getting an SQL error.
You cannot pass a named column to a function, and you cannot return this named column as a table with that column. A table is a set of rows, and almost all processing in Postgres is based on row processing. Usually you need to hold only the data of one row in memory, so you can process a dataset much bigger than your memory.
Inside a PL/pgSQL function you have no information about the outer query. You can only get data of scalar types, arrays of scalars, or composites and arrays of composites (plus ranges and multiranges, which are special kinds of composite and array of composite). Nothing else.
Theoretically you can aggregate the data of one column into an array, and later expand this array back into a table. But these operations are memory expensive and can be slow. You need this only in a few cases (like computing a median), but it is slow, and there is a risk of an out-of-memory exception.
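A minimal sketch of that aggregate-and-expand pattern, using the standard array_agg() and unnest() functions (tbl and col are placeholder names):

-- collapse one column into a single array value (the whole column ends up in memory)
SELECT array_agg(col) AS all_values FROM tbl;

-- expand an array back into a set of rows
SELECT unnest(ARRAY[3, 1, 2]) AS col;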
When object names are not double quoted, Postgres processes them internally as lower case. When double quoted, they are processed exactly as quoted. The thing is, these may not be the same. You defined the function as FUNCTION myCustomFunction(myArg text), not double quoted, but attempt to call it via "myCustomFunction"(t.col2). Unfortunately, myCustomFunction is processed as mycustomfunction, but "myCustomFunction" is processed exactly as it appears. Those are NOT the same. Either change your select to:
SELECT t.col1,myCustomFunction(t.col2)
FROM tbl t
WHERE t.col1 = 'someCondition';
or change the function definition to:
CREATE OR REPLACE FUNCTION "myCustomFunction"(myArg text)
RETURNS text AS $$
DECLARE
BEGIN
RETURN UPPER(myArg);
END
$$ LANGUAGE plpgsql;
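As a quick sanity check, you can look at how the name was actually stored in the catalog; assuming the unquoted definition above, the stored name is folded to lower case:

SELECT proname FROM pg_proc WHERE proname ILIKE 'mycustomfunction';
-- returns mycustomfunction for the unquoted definition,
-- or myCustomFunction if the quoted variant was created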

PostgreSQL - Auto Cast for types?

I'm working on porting a database from Firebird to PostgreSQL and have many errors related to type casts. For example, let's take one simple function:
CREATE OR REPLACE FUNCTION f_Concat3 (
s1 varchar, s2 varchar, s3 varchar
)
RETURNS varchar AS
$body$
BEGIN
return s1||s2||s3;
END;
$body$ LANGUAGE 'plpgsql' IMMUTABLE CALLED ON NULL INPUT SECURITY INVOKER LEAKPROOF COST 100;
As Firebird is quite flexible with types, this function was called in different ways: some of the arguments might be of another type: integer/double precision/timestamp. And of course in Postgres the function call f_Concat3('1', 2, 345.345) causes an error like:
function f_Concat3(unknown, integer, numeric) not found.
The documentation recommends using an explicit cast like:
f_Concat3 ('1'::varchar, 2::varchar, 345.345::varchar)
I could also create function clones for all possible combinations of types that might occur, and it would work. An example that resolves the error:
CREATE OR REPLACE FUNCTION f_Concat3 (
s1 varchar, s2 integer, s3 numeric
)
RETURNS varchar AS
$body$
BEGIN
return s1::varchar||s2::varchar||s3::varchar;
END;
However, this is very bad and ugly and it won't work with big functions.
Important: We have one general code base for all DBs and use our own language to create application objects (forms, reports, etc.) which contain select queries. It is not possible to use explicit casts on function calls because we would lose compatibility with other DBs.
I am confused that an integer argument cannot be cast to numeric or double precision, or a date/number to a string. I even face problems with integer to smallint, and vice versa. Most databases do not act like this.
Is there any best practice for such a situation?
Are there any alternatives to explicit casts?
SQL is a typed language, and PostgreSQL takes that more seriously than other relational databases. Unfortunately that means extra effort when porting an application with sloppy coding.
It is tempting to add implicit casts, but the documentation warns you against creating casts between built-in data types:
Additional casts can be added by the user with the CREATE CAST command. (This is usually done in conjunction with defining new data types. The set of casts between built-in types has been carefully crafted and is best not altered.)
This is not an idle warning, because function resolution and other things may suddenly fail or misbehave if you create new casts between existing types.
I think that if you really don't want to clean up the code (which would make it more portable for the future), you have no choice but to add more versions of your functions.
Fortunately PostgreSQL has function overloading which makes that possible.
You can make the job easier by using one argument with a polymorphic type in your function definition, like this:
CREATE OR REPLACE FUNCTION f_concat3 (
s1 text, s2 integer, s3 anyelement
) RETURNS text
LANGUAGE sql IMMUTABLE LEAKPROOF AS
'SELECT f_concat3(s1, s2::text, s3::text)';
You cannot use more than one anyelement argument though, because that will only work if all such parameters are of the same type.
If you use function overloading, be careful that you don't create ambiguities that would make function resolution fail.
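As a usage sketch (assuming the original f_Concat3(varchar, varchar, varchar) from the question is also installed), the call from the question now resolves through the overloaded version, which casts and delegates:

SELECT f_concat3('1', 2, 345.345);  -- matches (text, integer, anyelement), returns '12345.345'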

Syntax error in create aggregate

Trying to create an aggregate function:
create aggregate min (my_type) (
sfunc = least,
stype = my_type
);
ERROR: syntax error at or near "least"
LINE 2: sfunc = least,
^
What am I missing?
Although the manual calls least a function:
The GREATEST and LEAST functions select the largest or smallest value from a list of any number of expressions.
I can not find it:
\dfS least
List of functions
Schema | Name | Result data type | Argument data types | Type
--------+------+------------------+---------------------+------
(0 rows)
Like CASE, COALESCE and NULLIF, GREATEST and LEAST are listed in the chapter Conditional Expressions. These SQL constructs are not implemented as functions, as @Laurenz explained in the meantime.
The manual advises:
Tip: If your needs go beyond the capabilities of these conditional
expressions, you might want to consider writing a stored procedure in
a more expressive programming language.
The terminology is a bit off here as well, since Postgres does not support true "stored procedures", just functions. (Which is why there is an open TODO item "Implement stored procedures".)
This manual page might be sharpened to avoid confusion ...
@Laurenz also provided an example. I would just use LEAST in the function to get identical functionality:
CREATE FUNCTION f_least(anyelement, anyelement)
RETURNS anyelement LANGUAGE sql IMMUTABLE AS
'SELECT LEAST($1, $2)';
Do not make it STRICT, that would be incorrect. LEAST(1, NULL) returns 1 and not NULL.
Even if STRICT was correct, I would not use it, because it can prevent function inlining.
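To see the difference STRICT would make (f_least_strict is a hypothetical variant, only for demonstration):

SELECT LEAST(1, NULL);             -- 1, NULL is ignored

CREATE FUNCTION f_least_strict(anyelement, anyelement)
RETURNS anyelement LANGUAGE sql IMMUTABLE STRICT AS
'SELECT LEAST($1, $2)';

SELECT f_least_strict(1, NULL);    -- NULL, because STRICT short-circuits on any NULL input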
Note that this function is limited to exactly two parameters while LEAST takes any number of parameters. You might overload the function to cover 3, 4 etc. input parameters. Or you could write a VARIADIC function for up to 100 parameters.
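A minimal variadic sketch (f_least_v is a made-up name; it relies on min() being defined for the element type, and min() skips NULLs just like LEAST does):

CREATE FUNCTION f_least_v(VARIADIC vals anyarray)
RETURNS anyelement LANGUAGE sql IMMUTABLE AS
'SELECT min(v) FROM unnest(vals) v';

SELECT f_least_v(3, 1, 2);      -- 1
SELECT f_least_v(3, NULL, 2);   -- 2, NULLs ignored like LEAST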
LEAST and GREATEST are not real functions; internally they are parsed as MinMaxExpr (see src/include/nodes/primnodes.h).
You could achieve what you want with a generic function like this:
CREATE FUNCTION my_least(anyelement, anyelement) RETURNS anyelement
LANGUAGE sql IMMUTABLE CALLED ON NULL INPUT
AS 'SELECT LEAST($1, $2)';
(thanks to Erwin Brandstetter for the CALLED ON NULL INPUT and the idea to use LEAST.)
Then you can create your aggregate as
CREATE AGGREGATE min(my_type) (sfunc = my_least, stype = my_type);
This will only work if there are comparison functions for my_type, otherwise you have to come up with a different my_least function.
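A minimal end-to-end sketch, assuming my_type is a composite type whose fields all have default comparison operators, so the generic composite comparison used by LEAST applies (all names are illustrative):

CREATE TYPE my_type AS (a int, b text);
CREATE TABLE t (v my_type);
INSERT INTO t VALUES ((2,'x')::my_type), ((1,'y')::my_type);

-- with my_least and the aggregate from above in place:
SELECT min(v) FROM t;   -- (1,y), composite values compare field by field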

Executing queries dynamically in PL/pgSQL

I have found solutions (I think) to the problem I'm about to ask for on Oracle and SQL Server, but can't seem to translate this into a Postgres solution. I am using Postgres 9.3.6.
The idea is to be able to generate "metadata" about the table content for profiling purposes. This can only be done (AFAIK) by having queries run for each column so as to find out, say... min/max/count values and such. In order to automate the procedure, it is preferable to have the queries generated by the DB, then executed.
With an example salesdata table, I'm able to generate a select query for each column, returning the min() value, using the following snippet:
SELECT 'SELECT min('||column_name||') as minval_'||column_name||' from salesdata '
FROM information_schema.columns
WHERE table_name = 'salesdata'
The advantage being that the db will generate the code regardless of the number of columns.
Now there's a myriad places I had in mind for storing these queries, either a variable of some sort, or a table column, the idea being to then have these queries execute.
I thought of storing the generated queries in a variable and then executing them using the EXECUTE (or EXECUTE IMMEDIATE) statement, which is the approach employed here (see right pane), but Postgres won't let me declare a variable outside a function, and I've been scratching my head over how this would fit together, whether that's even the direction to follow; perhaps there's something simpler.
Would you have any pointers? I'm currently trying something like this, inspired by this other question, but have no idea whether I'm headed in the right direction:
CREATE OR REPLACE FUNCTION foo()
RETURNS void AS
$$
DECLARE
dyn_sql text;
BEGIN
dyn_sql := SELECT 'SELECT min('||column_name||') from salesdata'
FROM information_schema.columns
WHERE table_name = 'salesdata';
execute dyn_sql
END
$$ LANGUAGE PLPGSQL;
System statistics
Before you roll your own, have a look at the system table pg_statistic or the view pg_stats:
This view allows access only to rows of pg_statistic that correspond
to tables the user has permission to read, and therefore it is safe to
allow public read access to this view.
It might already have some of the statistics you are about to compute. It's populated by ANALYZE, so you might run that for new (or any) tables before checking.
-- ANALYZE tbl; -- optionally, to init / refresh
SELECT * FROM pg_stats
WHERE tablename = 'tbl'
AND schemaname = 'public';
Generic dynamic plpgsql function
You want to return the minimum value for every column in a given table. This is not a trivial task, because a function (like SQL in general) demands to know the return type at creation time - or at least at call time with the help of polymorphic data types.
This function does everything automatically and safely. Works for any table, as long as the aggregate function min() is allowed for every column. But you need to know your way around PL/pgSQL.
CREATE OR REPLACE FUNCTION f_min_of(_tbl anyelement)
  RETURNS SETOF anyelement
  LANGUAGE plpgsql AS
$func$
BEGIN
   RETURN QUERY EXECUTE (
   SELECT format('SELECT (t::%2$s).* FROM (SELECT min(%1$s) FROM %2$s) t'
               , string_agg(quote_ident(attname), '), min(' ORDER BY attnum)
               , pg_typeof(_tbl)::text)
   FROM   pg_attribute
   WHERE  attrelid = pg_typeof(_tbl)::text::regclass
   AND    NOT attisdropped  -- no dropped (dead) columns
   AND    attnum > 0        -- no system columns
   );
END
$func$;
Call (important!):
SELECT * FROM f_min_of(NULL::tbl); -- tbl being the table name
You need to understand these concepts:
Dynamic SQL in plpgsql with EXECUTE
Polymorphic types
Row types and table types in Postgres
How to defend against SQL injection
Aggregate functions
System catalogs
Related answer with detailed explanation:
Table name as a PostgreSQL function parameter
Refactor a PL/pgSQL function to return the output of various SELECT queries
Postgres data type cast
How to set value of composite variable field using dynamic SQL
How to check if a table exists in a given schema
Select columns with particular column names in PostgreSQL
Generate series of dates - using date type as input
Special difficulty with type mismatch
I am taking advantage of Postgres defining a row type for every existing table. Using the concept of polymorphic types I am able to create one function that works for any table.
However, some aggregate functions return related but different data types as compared to the underlying column. For instance, min(varchar_column) returns text, which is bit-compatible, but not exactly the same data type. PL/pgSQL functions have a weak spot here and insist on data types exactly as declared in the RETURNS clause. No attempt to cast, not even implicit casts, not to speak of assignment casts.
That should be improved. Tested with Postgres 9.3. Did not retest with 9.4, but I am pretty sure nothing has changed in this area.
That's where this construct comes in as workaround:
SELECT (t::tbl).* FROM (SELECT ... FROM tbl) t;
By casting the whole row to the row type of the underlying table explicitly we force assignment casts to get original data types for every column.
This might fail for some aggregate functions. sum() returns numeric for sum(bigint_column) to accommodate a sum overflowing the base data type. Casting back to bigint might fail ...
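For illustration, the same cast trick in isolation, on a hypothetical two-column table:

CREATE TABLE tbl (id int, name varchar);
INSERT INTO tbl VALUES (1, 'a'), (2, 'b');

-- min(name) yields text, not varchar; casting the whole row back to tbl
-- forces assignment casts to the original column types
SELECT (t::tbl).* FROM (SELECT min(id), min(name) FROM tbl) t;
-- id | name  =>  1 | a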
@Erwin Brandstetter, many thanks for the extensive answer. pg_stats does indeed provide a few things, but what I really need to draw a complete profile is a variety of things: min and max values, counts, counts of nulls, mean, etc., so a bunch of queries have to be run for each column, some with GROUP BY and such.
Also, thanks for highlighting the importance of data types; I was sort of expecting this to throw a spanner in the works at some point. My main concern was with how to automate the query generation and, above all, its execution.
I have tried the function you provided (I probably will need to start learning some PL/pgSQL) but get an error at the SELECT (t::tbl):
ERROR: type "tbl" does not exist
By the way, what is the (t::abc) notation referred to as? In Python this would be a slice, but that's probably not the case in PL/pgSQL.

Simple PostgreSQL function to return rows

How do I convert a simple select query like select * from customers into a stored procedure / function in pg?
I'm new to Postgres, and create function customers() as returns table/setof just didn't feel right, hence the question here.
I understand procs are called "functions" in pg land. Thus create procedure does not exist, and my only options are to either create a view or a function. The issue is that create function x() returns setof y returns a parenthesized, comma-separated row of values which can't be used without further processing (at least that's what I'm seeing in pgAdmin and Ruby/Sequel).
create function x() returns table(...) requires that I embed the row definition, which I don't want to.
I'm sure there's a reason behind all this but I'm surprised that the most common use case is this tricky.
Untested but should be about right:
CREATE OR REPLACE FUNCTION getcustomers() RETURNS SETOF customers AS $$
SELECT * FROM customers;
$$ LANGUAGE sql;
The issue is "create function x() returns setof y" returns a parenthesized,
comma-separated row of values which can't be used without further processing
The function returns a row. To decompose into individual columns, call it with:
SELECT * FROM getcustomers();
That's assuming the function defines a proper return type. See:
How to return multiple rows from PL/pgSQL function?
The manual on CREATE FUNCTION should be a good starting point. The example section covers this topic.
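To illustrate the difference the question describes (the column names shown are hypothetical):

-- calling the function in the SELECT list yields one composite column:
SELECT getcustomers();          -- (1,Alice,...)   the "paren'd" row

-- calling it in FROM decomposes the row into individual columns:
SELECT * FROM getcustomers();   -- id | name | ...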