Test for null in function with varying parameters - postgresql

I have a Postgres function:
create function myfunction(integer, text, text, text, text, text, text) RETURNS
table(id int, match text, score int, nr int, nr_extra character varying, info character varying, postcode character varying,
street character varying, place character varying, country character varying, the_geom geometry)
AS $$
BEGIN
return query (select a.id, 'address' as match, 1 as score, a.ad_nr, a.ad_nr_extra,a.ad_info,a.ad_postcode, s.name as street, p.name place , c.name country, a.wkb_geometry as wkb_geometry from "Addresses" a
left join "Streets" s on a.street_id = s.id
left join "Places" p on s.place_id = p.id
left join "Countries" c on p.country_id = c.id
where c.name = $7
and p.name = $6
and s.name = $5
and a.ad_nr = $1
and a.ad_nr_extra = $2
and a.ad_info = $3
and ad_postcode = $4);
END;
$$
LANGUAGE plpgsql;
This function fails to give the right result when one or more of the variables entered are NULL because ad_postcode = NULL will fail.
What can I do to test for NULL inside the query?

I disagree with some of the advice in other answers. This can be done with PL/pgSQL, and I think it is mostly far superior to assembling queries in a client application. It is faster and cleaner, and the app only sends the bare minimum across the wire in requests. SQL statements are saved inside the database, which makes them easier to maintain, unless you want to collect all business logic in the client application. That depends on the general architecture.
PL/pgSQL function with dynamic SQL
CREATE OR REPLACE FUNCTION func(
_ad_nr int = NULL
, _ad_nr_extra text = NULL
, _ad_info text = NULL
, _ad_postcode text = NULL
, _sname text = NULL
, _pname text = NULL
, _cname text = NULL)
RETURNS TABLE(id int, match text, score int, nr int, nr_extra text
, info text, postcode text, street text, place text
, country text, the_geom geometry)
LANGUAGE plpgsql AS
$func$
BEGIN
-- RAISE NOTICE '%', -- for debugging
RETURN QUERY EXECUTE concat(
$$SELECT a.id, 'address'::text, 1 AS score, a.ad_nr, a.ad_nr_extra
, a.ad_info, a.ad_postcode$$
, CASE WHEN (_sname, _pname, _cname) IS NULL THEN ', NULL::text' ELSE ', s.name' END -- street
, CASE WHEN (_pname, _cname) IS NULL THEN ', NULL::text' ELSE ', p.name' END -- place
, CASE WHEN _cname IS NULL THEN ', NULL::text' ELSE ', c.name' END -- country
, ', a.wkb_geometry'
, concat_ws('
JOIN '
, '
FROM "Addresses" a'
, CASE WHEN NOT (_sname, _pname, _cname) IS NULL THEN '"Streets" s ON s.id = a.street_id' END
, CASE WHEN NOT (_pname, _cname) IS NULL THEN '"Places" p ON p.id = s.place_id' END
, CASE WHEN _cname IS NOT NULL THEN '"Countries" c ON c.id = p.country_id' END
)
, concat_ws('
AND '
, '
WHERE TRUE'
, CASE WHEN $1 IS NOT NULL THEN 'a.ad_nr = $1' END
, CASE WHEN $2 IS NOT NULL THEN 'a.ad_nr_extra = $2' END
, CASE WHEN $3 IS NOT NULL THEN 'a.ad_info = $3' END
, CASE WHEN $4 IS NOT NULL THEN 'a.ad_postcode = $4' END
, CASE WHEN $5 IS NOT NULL THEN 's.name = $5' END
, CASE WHEN $6 IS NOT NULL THEN 'p.name = $6' END
, CASE WHEN $7 IS NOT NULL THEN 'c.name = $7' END
)
)
USING $1, $2, $3, $4, $5, $6, $7;
END
$func$;
Call:
SELECT * FROM func(1, '_ad_nr_extra', '_ad_info', '_ad_postcode', '_sname');
SELECT * FROM func(1, _pname := 'foo');
Since all function parameters have default values, you can use positional notation, named notation or mixed notation at your choosing in the function call. See:
Functions with variable number of input parameters
More explanation for basics of dynamic SQL:
Refactor a PL/pgSQL function to return the output of various SELECT queries
The concat() function is instrumental for building the string. It was introduced with Postgres 9.1.
The ELSE branch of a CASE expression defaults to NULL when not present, which simplifies the code.
The USING clause of EXECUTE rules out SQL injection because values are passed as values, and it lets you use parameter values directly, exactly as in prepared statements.
NULL values are used to ignore parameters here. They are not actually used to search.
You don't need parentheses around the SELECT with RETURN QUERY.
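To make the string assembly easier to follow, here is a minimal Python sketch of the same idea. It is only a model of the logic, not a suggestion to move query building into the client. The helper concat_ws() imitates the Postgres function of the same name, and where_clause() with its three parameters is a hypothetical, cut-down stand-in for the WHERE part of the function above. Values never enter the string; only the $n placeholders do, just as with USING.

```python
def concat_ws(sep, *parts):
    """Imitate Postgres concat_ws(): join the non-NULL (non-None) arguments with sep."""
    return sep.join(str(p) for p in parts if p is not None)


def where_clause(ad_nr=None, ad_postcode=None, sname=None):
    """Emit one predicate per parameter that was actually supplied.

    A conditional without a match yields None, like CASE without ELSE yields NULL,
    and concat_ws() silently skips it.
    """
    return concat_ws(
        "\nAND    ",
        "WHERE  TRUE",
        "a.ad_nr = $1" if ad_nr is not None else None,
        "a.ad_postcode = $4" if ad_postcode is not None else None,
        "s.name = $5" if sname is not None else None,
    )


clause = where_clause(ad_nr=1, sname="Main Street")
print(clause)
```

With only ad_nr and sname supplied, the postcode predicate is left out of the generated text entirely, which is why the planner never sees it.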
Simple SQL function
You could do it with a plain SQL function and avoid dynamic SQL. For some cases this may be faster. But I wouldn't expect it in this case. Planning the query without unnecessary joins and predicates typically produces best results. Planning cost for a simple query like this is almost negligible.
CREATE OR REPLACE FUNCTION func_sql(
_ad_nr int = NULL
, _ad_nr_extra text = NULL
, _ad_info text = NULL
, _ad_postcode text = NULL
, _sname text = NULL
, _pname text = NULL
, _cname text = NULL)
RETURNS TABLE(id int, match text, score int, nr int, nr_extra text
, info text, postcode text, street text, place text
, country text, the_geom geometry)
LANGUAGE sql AS
$func$
SELECT a.id, 'address' AS match, 1 AS score, a.ad_nr, a.ad_nr_extra
, a.ad_info, a.ad_postcode
, s.name AS street, p.name AS place
, c.name AS country, a.wkb_geometry
FROM "Addresses" a
LEFT JOIN "Streets" s ON s.id = a.street_id
LEFT JOIN "Places" p ON p.id = s.place_id
LEFT JOIN "Countries" c ON c.id = p.country_id
WHERE ($1 IS NULL OR a.ad_nr = $1)
AND ($2 IS NULL OR a.ad_nr_extra = $2)
AND ($3 IS NULL OR a.ad_info = $3)
AND ($4 IS NULL OR a.ad_postcode = $4)
AND ($5 IS NULL OR s.name = $5)
AND ($6 IS NULL OR p.name = $6)
AND ($7 IS NULL OR c.name = $7)
$func$;
Identical call.
To effectively ignore parameters with NULL values:
($1 IS NULL OR a.ad_nr = $1)
To actually use NULL values as parameters, use this construct instead:
($1 IS NULL AND a.ad_nr IS NULL OR a.ad_nr = $1) -- AND binds before OR
This also allows for indexes to be used.
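The difference between the two constructs can be tried out quickly. The following sketch uses SQLite from the Python standard library as a stand-in for Postgres (an assumption made only to keep it self-contained; the NULL semantics of these predicates are the same). The table addresses and its three rows are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE addresses (id INTEGER PRIMARY KEY, ad_nr INTEGER)")
conn.executemany(
    "INSERT INTO addresses (id, ad_nr) VALUES (?, ?)",
    [(1, 10), (2, 20), (3, None)],
)

# NULL parameter means "do not filter on this column".
IGNORE_NULL = (
    "SELECT id FROM addresses WHERE (:nr IS NULL OR ad_nr = :nr) ORDER BY id"
)
# NULL parameter means "find rows where the column is NULL" (AND binds before OR).
MATCH_NULL = (
    "SELECT id FROM addresses "
    "WHERE (:nr IS NULL AND ad_nr IS NULL OR ad_nr = :nr) ORDER BY id"
)


def ids(sql, nr):
    """Run one of the queries above with the given parameter and return the ids."""
    return [row[0] for row in conn.execute(sql, {"nr": nr})]


print(ids(IGNORE_NULL, None))  # every row qualifies
print(ids(MATCH_NULL, None))   # only the row with ad_nr IS NULL
print(ids(IGNORE_NULL, 10), ids(MATCH_NULL, 10))
```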
For the case at hand, replace all instances of LEFT JOIN with JOIN.
Asides
Don't use name and id as column names. They are not descriptive, and when you join a bunch of tables (like you do a lot in a relational database), you end up with several columns all named name or id and have to attach aliases to sort out the mess.
Please format your SQL properly, at least when asking public questions. But do it privately as well, for your own good.

If you can modify the query, you could do something like
and (ad_postcode = $4 OR $4 IS NULL)

You can use
c.name IS NOT DISTINCT FROM $7
It will return true if c.name and $7 are equal or both are null.
Or you can use
(c.name = $7 or $7 is null )
It will return true if c.name and $7 are equal or $7 is null.
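The two options behave differently once NULLs are involved. As a rough model (plain Python, with None standing in for NULL and hypothetical function names), the comparison rules look like this:

```python
def sql_eq(a, b):
    """SQL '=': the result is unknown (None) as soon as either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def is_not_distinct_from(a, b):
    """SQL 'IS NOT DISTINCT FROM': NULL-safe equality, always True or False."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


def passes_eq_or_param_null(column, param):
    """Does a row pass WHERE (column = param OR param IS NULL)?

    WHERE keeps a row only when the condition is true, so unknown counts as no.
    """
    return param is None or sql_eq(column, param) is True


# With a NULL parameter the two predicates diverge:
print(is_not_distinct_from("NL", None))     # only rows with a NULL column would match
print(passes_eq_or_param_null("NL", None))  # every row matches, the filter is ignored
```

So IS NOT DISTINCT FROM searches for NULL, while the OR form treats a NULL parameter as "no filter".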

Several things...
First, as a side note: the semantics of your query might need a revisit. Some of the stuff in your where clauses might actually belong in your join clauses, like:
from ...
left join ... on ... and ...
left join ... on ... and ...
When they don't, you should most probably be using an inner join rather than a left join.
Second, there is an is not distinct from operator, which can occasionally be handy in place of =. a is not distinct from b is basically equivalent to a = b or a is null and b is null.
Note, however, that is not distinct from does NOT use an index, whereas = and is null actually do. You could use (field = $i or $i is null) instead in your particular case, and it will yield the optimal plan if you're using the latest version of Postgres:
https://gist.github.com/ddebernardy/5884267

Related

Prepared Statement Does Not Exists, PostgreSQL

I created the stored procedure with this code
CREATE PROCEDURE get_conferences_for_attendee
(
IN start_time TIMESTAMP,
IN end_time TIMESTAMP,
IN email VARCHAR(255),
IN deleted BOOLEAN
)
AS
$$
SELECT c.localuuid, c.title, i.id, i.start_time, i.end_time, i.status, a.email, a.deleted
FROM Conference c
INNER JOIN Instance i ON i.conference_localuuid = c.localuuid
INNER JOIN Conference_Attendees ca ON ca.conference_localuuid = c.localuuid
INNER JOIN Attendee a ON ca.attendees_localuuid = a.localuuid
WHERE i.start_time BETWEEN start_time AND end_time
AND a.email = email
AND a.deleted = deleted
$$ LANGUAGE SQL;
and this returned
CREATE PROCEDURE
I can see my procedure
SELECT proname, prorettype
FROM pg_proc
WHERE pronamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public');
proname | prorettype
------------------------------+------------
get_conferences_for_attendee | 2278
When I try to execute it, I get the error in the title.
EXECUTE get_conferences_for_attendee ('2022-12-26T00:00:00', '2023-01-01T23:59:59', 'yacs.demo2#abc.com', false);
ERROR: prepared statement "get_conferences_for_attendee" does not exist
Update
I found a solution but I'm not sure if it's the proper way to create this. It looks too complicated for me.
CREATE TYPE conference_record AS (
localuuid VARCHAR(255),
title VARCHAR(255),
id VARCHAR(255),
start_time TIMESTAMP,
end_time TIMESTAMP,
status VARCHAR(255),
email VARCHAR(255),
deleted BOOLEAN
);
CREATE FUNCTION get_conferences_for_attendee
(
IN start_time TIMESTAMP,
IN end_time TIMESTAMP,
IN email VARCHAR(255),
IN deleted BOOLEAN
)
RETURNS SETOF conference_record AS $$
BEGIN
RETURN QUERY
SELECT c.localuuid, c.title, i.id, i.start_time, i.end_time, i.status, a.email, a.deleted
FROM Conference c
INNER JOIN Instance i ON i.conference_localuuid = c.localuuid
INNER JOIN Conference_Attendees ca ON ca.conference_localuuid = c.localuuid
INNER JOIN Attendee a ON ca.attendees_localuuid = a.localuuid
WHERE i.start_time BETWEEN $1 AND $2
AND a.email = $3
AND a.deleted = $4;
END;
$$ LANGUAGE plpgsql;
SELECT * FROM get_conferences_for_attendee ('2022-12-26T00:00:00', '2023-01-01T23:59:59', 'yacs.demo1#abc.com', false);
As pointed out in the comments, to use a procedure, you need to CALL your_procedure();.
The code you presented looks like you're trying to get something from it, so a function is more suitable - procedures can return data through out and inout parameters or side-effects, like dumping them to an outside table.
The function and type definitions you later added in an edit look fine. If you're planning to feed the result directly into a table, you don't need to define the custom type; you can specify RETURNS SETOF your_target_table_name instead.
You can also make it LANGUAGE sql - since you're not using anything plpgsql-specific, you don't need the additional overhead that comes with it. You'll just have to remove BEGIN RETURN QUERY and END, leaving just the bare-bones query.
You can also use a regular prepared statement for this:
PREPARE get_conferences_for_attendee(
TIMESTAMP,
TIMESTAMP,
VARCHAR(255),
BOOLEAN ) AS
SELECT
c.localuuid,
c.title,
i.id,
i.start_time,
i.end_time,
i.status,
a.email,
a.deleted
FROM Conference c
INNER JOIN Instance i ON i.conference_localuuid = c.localuuid
INNER JOIN Conference_Attendees ca ON ca.conference_localuuid = c.localuuid
INNER JOIN Attendee a ON ca.attendees_localuuid = a.localuuid
WHERE i.start_time BETWEEN $1 AND $2
AND a.email = $3
AND a.deleted = $4;
And use it exactly like you initially planned to, with an EXECUTE:
EXECUTE get_conferences_for_attendee(
'2022-12-26T00:00:00',
'2023-01-01T23:59:59',
'yacs.demo1#abc.com',
false);
I found a solution but I'm not sure if it's the proper way to create this.
A function is the correct way to do this.
It looks too complicated for me.
You are indeed over-complicating the implementation. You don't need to create a type; this can be simplified by using returns table() instead.
You also don't need PL/pgSQL for this. A SQL function will be enough:
CREATE FUNCTION get_conferences_for_attendee
(
p_start_time TIMESTAMP,
p_end_time TIMESTAMP,
p_email text,
p_deleted BOOLEAN
)
RETURNS table(localuuid text, title text, id text, start_time timestamp, end_time timestamp, status text, email text, deleted boolean)
AS
$$
SELECT c.localuuid, c.title, i.id, i.start_time, i.end_time, i.status, a.email, a.deleted
FROM Conference c
INNER JOIN Instance i ON i.conference_localuuid = c.localuuid
INNER JOIN Conference_Attendees ca ON ca.conference_localuuid = c.localuuid
INNER JOIN Attendee a ON ca.attendees_localuuid = a.localuuid
WHERE i.start_time BETWEEN p_start_time AND p_end_time
AND a.email = p_email
AND a.deleted = p_deleted
$$
LANGUAGE sql
stable;
I renamed the parameters with a prefix to avoid a name clash with columns of the same name.
Note that using BETWEEN with timestamp values is usually a bad idea. It's better to use a range query with >= for the lower bound and < for the "next day" after the upper bound,
e.g. i.start_time >= '2022-12-26 00:00:00' and i.start_time < '2023-01-02 00:00:00'
Your condition would not return rows where start_time is e.g. 2023-01-01 23:59:59.999
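The boundary problem is easy to reproduce outside the database. A small sketch with Python datetime values (the sample timestamp is the one mentioned above, the variable names are made up):

```python
from datetime import datetime

lower = datetime(2022, 12, 26)
upper_inclusive = datetime(2023, 1, 1, 23, 59, 59)  # upper bound as passed to BETWEEN
upper_exclusive = datetime(2023, 1, 2)              # start of the "next day"

# A row stamped in the last second of the period, with fractional seconds.
ts = datetime(2023, 1, 1, 23, 59, 59, 999000)

in_between = lower <= ts <= upper_inclusive   # what BETWEEN checks
in_half_open = lower <= ts < upper_exclusive  # >= lower AND < next day

print(in_between, in_half_open)
```

The closed range misses the row because of the fractional seconds; the half-open range does not depend on the timestamp precision at all.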

Concatenate string instead of just replacing it

I have a table with standard columns where I want to perform regular INSERTs.
But one of the columns is of type varchar with special semantics. It's a string that's supposed to behave as a set of strings, where the elements of the set are separated by commas.
Eg. if one row has in that varchar column the value fish,sheep,dove, and I insert the string ,fish,eagle, I want the result to be fish,sheep,dove,eagle (ie. eagle gets added to the set, but fish doesn't because it's already in the set).
I have here this Postgres code that does the "set concatenation" that I want:
SELECT string_agg(unnest, ',') AS x FROM (SELECT DISTINCT unnest(string_to_array('fish,sheep,dove' || ',fish,eagle', ','))) AS x;
But I can't figure out how to apply this logic to insertions.
What I want is something like:
CREATE TABLE IF NOT EXISTS t00(
userid int8 PRIMARY KEY,
a int8,
b varchar);
INSERT INTO t00 (userid,a,b) VALUES (0,1,'fish,sheep,dove');
INSERT INTO t00 (userid,a,b) VALUES (0,1,',fish,eagle')
ON CONFLICT (userid)
DO UPDATE SET
a = EXCLUDED.a,
b = SELECT string_agg(unnest, ',') AS x FROM (SELECT DISTINCT unnest(string_to_array(t00.b || EXCLUDED.b, ','))) AS x;
How can I achieve something like that?
Storing comma separated values is a huge mistake to begin with. But if you really want to make your life harder than it needs to be, you might want to create a function that merges two comma separated lists:
create function merge_lists(p_one text, p_two text)
returns text
as
$$
select string_agg(item, ',')
from (
select e.item
from unnest(string_to_array(p_one, ',')) as e(item)
where e.item <> '' --< necessary because of the leading , in your data
union
select t.item
from unnest(string_to_array(p_two, ',')) t(item)
where t.item <> ''
) t;
$$
language sql;
If you are using Postgres 14 or later, unnest(string_to_array(..., ',')) can be replaced with string_to_table(..., ',').
Then your INSERT statement gets a bit simpler:
INSERT INTO t00 (userid,a,b) VALUES (0,1,',fish,eagle')
ON CONFLICT (userid)
DO UPDATE SET
a = EXCLUDED.a,
b = merge_lists(excluded.b, t00.b);
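For readers who want to check the intended semantics of merge_lists() without a database, here is a rough Python equivalent. It is a sketch under one assumption that differs from the SQL version: it keeps items in first-seen order, whereas UNION in SQL guarantees no particular order.

```python
def merge_lists(one, two):
    """Merge two comma-separated lists into one without duplicates.

    Empty items (e.g. from a leading comma) are dropped, and None is treated
    like an empty list.
    """
    seen = {}  # dict preserves insertion order, so it doubles as an ordered set
    for item in ((one or "") + "," + (two or "")).split(","):
        if item:
            seen.setdefault(item, None)
    return ",".join(seen)


print(merge_lists("fish,sheep,dove", ",fish,eagle"))
```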
I think I was only missing parentheses around the SELECT statement:
INSERT INTO t00 (userid,a,b) VALUES (0,1,',fish,eagle')
ON CONFLICT (userid)
DO UPDATE SET
a = EXCLUDED.a,
b = (SELECT string_agg(unnest, ',') AS x FROM (SELECT DISTINCT unnest(string_to_array(t00.b || EXCLUDED.b, ','))) AS x);

Postgres function passing array of strings

I have the Postgres function below to return some info from my DB. I need the p_ic parameter to be able to take an array of strings.
CREATE OR REPLACE FUNCTION eddie.getinv(
IN p_ic character varying[],
IN p_id character varying)
RETURNS TABLE(cnt bigint, actualid text, actualcompany text, part text, daysinstock double precision, condition text,
ic text, price numeric, stock text, quantity bigint, location text, comments text) AS
$
BEGIN
RETURN QUERY
WITH cte AS (
SELECT
CASE WHEN partnerslist IS NULL OR partnerslist = '' THEN
'XX99'
ELSE
partnerslist
END AS a
FROM support.members WHERE id = p_id
), ctegroup AS
(
SELECT
u.id AS actualid,
(SELECT m.company || ' (' || m.id ||')' FROM support.members m WHERE m.id = u.id) AS actualcompany,
u.itemname AS part,
DATE_PART('day', CURRENT_TIMESTAMP - u.datein::timestamp) AS daysinstock,
TRIM(u.grade)::character varying AS condition,
u.vstockno::text AS stock,
u.holl::text AS ic,
CASE WHEN u.rprice > 0 THEN
u.rprice
ELSE
NULL
END AS price,
u.quantity,
u.location,
u.comments::text
FROM public.net u
WHERE u.holl in (p_ic)
AND visibledate <= now()
AND u.id = ANY(REGEXP_SPLIT_TO_ARRAY(p_id ||','|| (SELECT a FROM cte), ','))
ORDER BY u.itemname, u.id
)
SELECT
COUNT(ctegroup.ic) OVER(PARTITION BY ctegroup.ic ORDER BY ctegroup.ic) AS cnt,
actualid,
MAX(actualcompany) AS actualcompany,
MAX(part) AS part,
MAX(daysinstock) AS daysinstock,
STRING_AGG(condition,',') AS condition,
MAX(ic) AS ic,
MAX(price) AS price,
STRING_AGG(stock,',') AS stock,
SUM(quantity) AS qty,
STRING_AGG(location,',') AS location,
STRING_AGG(comments,';') AS comments
FROM ctegroup
GROUP BY part, actualid, ic
ORDER BY actualid;
END; $
LANGUAGE 'plpgsql';
I am calling it from the pgAdminIII Query window like this:
SELECT * FROM eddie.getinv(array['536-01036','536-01033L','536-01037'], 'N40')
But it is returning this error:
ERROR: operator does not exist: text = character varying[]
LINE 28: WHERE u.holl in (p_ic)
How do I fix this, or am I calling it incorrectly? I will be calling it from a PHP API function similar to this:
$id = 'N40';
$ic = array('536-01036','536-01033L','536-01037');
$sql = "SELECT * FROM eddie.getinv(array['". implode("','",$ic)."'], '".$id."');";
try
{
$results = pg_query($sql);
if(pg_num_rows($results) == 0) {
$rows = [];
}
else
{
$data = pg_fetch_all($results);
foreach($data as $item)
{
$rows[$item["ic"]][] = $item;
}
}
pg_free_result($results);
}
catch (Exception $e)
{
$err = array("message"=>$e->getMessage(), "code"=> $e->getCode(), "error"=>$e->__toString().",\n".print_r($_REQUEST, true));
echo json_encode($err);
}
echo json_encode($rows);
It looks like your array is being passed to the function just fine. The problem is in your query.
IN () clauses expect a comma-separated list of values. When you put an array in there, it's interpreted as a one-element list, where the value is the whole array. In other words, u.holl in (p_ic) will check if u.holl is equal to p_ic, and the comparison fails due to the type mismatch.
If you want to test the value against the contents of the array, use u.holl = ANY(p_ic).
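A loose analogy in Python may help to see the difference. It is only an analogy: Python quietly returns False where Postgres raises the type error shown in the question.

```python
p_ic = ["536-01036", "536-01033L", "536-01037"]
holl = "536-01033L"

# Like u.holl IN (p_ic): a one-element list whose single value is the whole array.
as_in_list = holl in [p_ic]

# Like u.holl = ANY(p_ic): the value is compared against each array element.
as_any = holl in p_ic

print(as_in_list, as_any)
```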

Dynamic RETURN EXECUTE QUERY with ILIKE in Postgresql

The search filter in the function below doesn't seem to work. If I don't provide a search parameter, it works, otherwise I get no recordset back. I'm assuming I am making a mess of the single-quoting and ILIKE, but not sure how to re-write this properly. Suggestions?
CREATE OR REPLACE FUNCTION get_operator_basic_by_operator(
_operator_id UUID DEFAULT NULL,
_search TEXT DEFAULT NULL,
_page_number INTEGER DEFAULT 1,
_page_size INTEGER DEFAULT 10,
_sort_col TEXT DEFAULT 'username',
_sort_dir TEXT DEFAULT 'asc',
_include_deleted BOOLEAN DEFAULT FALSE
)
RETURNS TABLE (
id UUID,
party_id UUID,
party_name TEXT,
username VARCHAR(32),
profile_picture_uri VARCHAR(512),
first_name VARCHAR(64),
last_name VARCHAR(64),
street VARCHAR(128),
specifier VARCHAR(128),
city VARCHAR(64),
state VARCHAR(2),
zipcode VARCHAR(9),
primary_email CITEXT,
primary_phone VARCHAR(10),
secondary_email CITEXT,
secondary_phone VARCHAR(10),
last_login TIMESTAMP WITH TIME ZONE,
created TIMESTAMP WITH TIME ZONE,
deleted TIMESTAMP WITH TIME ZONE
)
AS $$
DECLARE
_offset BIGINT;
BEGIN
IF (_page_number < 1 OR _page_number IS NULL) THEN
RAISE EXCEPTION '_page_number cannot be null or less than 1.';
END IF;
IF (_page_size < 1 OR _page_size IS NULL) THEN
RAISE EXCEPTION '_page_size cannot be null or less than 1.';
END IF;
IF (_sort_dir <> 'asc' AND _sort_dir <> 'desc') THEN
RAISE EXCEPTION '_sort_dir must be "asc" or "desc".';
END IF;
_offset := (_page_size * (_page_number-1));
RETURN QUERY EXECUTE '
SELECT
o.id,
p.id,
p.party_name,
o.username,
o.profile_picture_uri,
o.first_name,
o.last_name,
o.street,
o.specifier,
o.city,
o.state,
o.zipcode,
o.primary_email,
o.primary_phone,
o.secondary_email,
o.secondary_phone,
o.last_login,
o.created,
o.deleted
FROM
operator o
LEFT JOIN
party p ON o.party_id = p.id
WHERE ( -- include all or only those active, based on _include_deleted
$1 OR o.deleted > statement_timestamp()
)
AND o.party_id IN ( -- limit to operators in same party
SELECT oi.party_id FROM operator oi WHERE oi.id = $2
)
AND ( -- use optional search filter
$3 IS NULL
OR
o.username ILIKE ''%$3%''
OR
o.first_name ILIKE ''%$3%''
OR
o.last_name ILIKE ''%$3%''
OR
o.primary_email ILIKE ''%$3%''
)
ORDER BY ' || quote_ident(_sort_col) || ' ' || _sort_dir || '
LIMIT
$4
OFFSET
$5'
USING _include_deleted, _operator_id, _search, _page_size, _offset;
END;
Inside the quoted string, $3 is just literal text, not a parameter reference, so the pattern being searched for is the literal %$3%. Build the pattern from the parameter instead:
...
o.username ILIKE concat(''%'', $3, ''%'')
...
Personally I would use format() and dollar-quotes.
Strings are handled as is, so you should pass ready-to-use values:
EXECUTE '... o.username ILIKE $3 ...' using ..., '%' || _search || '%', ...
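Both suggestions can be tried with SQLite from the Python standard library, used here only as a self-contained stand-in for Postgres. Assumptions for the demo: the table operator_demo and its rows are made up, SQLite's LIKE (case-insensitive for ASCII) plays the role of ILIKE, and ? plays the role of $3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE operator_demo (username TEXT)")
conn.executemany(
    "INSERT INTO operator_demo (username) VALUES (?)",
    [("alice",), ("Malice",), ("bob",), ("cost$3x",)],
)

search = "ali"

# Placeholder text inside a string literal is never substituted:
# this looks for usernames containing the characters "$3".
literal = conn.execute(
    "SELECT username FROM operator_demo WHERE username LIKE '%$3%'"
).fetchall()

# Option 1: concatenate the wildcards around the parameter inside the query.
concatenated = conn.execute(
    "SELECT username FROM operator_demo WHERE username LIKE '%' || ? || '%'",
    (search,),
).fetchall()

# Option 2: pass a ready-to-use pattern as the parameter value.
prebuilt = conn.execute(
    "SELECT username FROM operator_demo WHERE username LIKE ?",
    (f"%{search}%",),
).fetchall()

print(literal)
print(sorted(concatenated), sorted(prebuilt))
```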

Get complex output type of function that returns record in postgresql

I want to write a nice and detailed report on the functions in my postgresql database.
I built the following query:
SELECT routine_name, data_type, proargnames
FROM information_schema.routines
join pg_catalog.pg_proc on pg_catalog.pg_proc.proname = information_schema.routines.routine_name
WHERE specific_schema = 'public'
ORDER BY routine_name;
It works as it should (basically returns me what I want it to: function name, output data type and input data type) except one thing:
I have relatively complicated functions and many of them return record.
The thing is, data_type returns me record as well for such functions, while I want detailed list of function output types.
For instance, I have something like this in one of my functions:
RETURNS TABLE("Res" integer, "Output" character varying) AS
How can I make query above (or, perhaps, a new query, if it will solve the problem) return something like
integer, character varying instead of record for such functions?
I am using postgresql 9.2
Thanks in advance!
A RECORD return value is resolved at runtime, so there is no way to retrieve that information this way.
BUT, if RETURNS TABLE("Res" integer, "Output" character varying) AS is used, there is a solution.
The test functions I used:
-- first function, uses RETURNS TABLE
CREATE FUNCTION test_ret(a TEXT, b TEXT)
RETURNS TABLE("Res" integer, "Output" character varying) AS $$
DECLARE
ret RECORD;
BEGIN
-- test
END;$$ LANGUAGE plpgsql;
-- second function, test some edge cases
-- same name as above, returns simple integer
CREATE FUNCTION test_ret(a TEXT)
RETURNS INTEGER AS $$
DECLARE
ret RECORD;
BEGIN
-- test
END;$$ LANGUAGE plpgsql;
Retrieving the function's return datatypes is easy, as they are stored in pg_catalog.pg_proc.proallargtypes. The problem is that this is an array of OIDs. We must unnest it and join it to pg_catalog.pg_type.oid.
-- edit: add support for function not returning tables, thx Tommaso Di Bucchianico
WITH pg_proc_with_unnested_proallargtypes AS (
SELECT
pg_catalog.pg_proc.oid,
pg_catalog.pg_proc.proname,
CASE WHEN proallargtypes IS NOT NULL THEN unnest(proallargtypes) ELSE null END AS proallargtype
FROM pg_catalog.pg_proc
JOIN pg_catalog.pg_namespace ON pg_catalog.pg_proc.pronamespace = pg_catalog.pg_namespace.oid
WHERE pg_catalog.pg_namespace.nspname = 'public'
),
pg_proc_with_proallargtypes_names AS (
SELECT
pg_proc_with_unnested_proallargtypes.oid,
pg_proc_with_unnested_proallargtypes.proname,
array_agg(pg_catalog.pg_type.typname) AS proallargtypes
FROM pg_proc_with_unnested_proallargtypes
LEFT JOIN pg_catalog.pg_type ON pg_catalog.pg_type.oid = proallargtype
GROUP BY
pg_proc_with_unnested_proallargtypes.oid,
pg_proc_with_unnested_proallargtypes.proname
)
SELECT
information_schema.routines.specific_name,
information_schema.routines.routine_name,
information_schema.routines.routine_schema,
information_schema.routines.data_type,
pg_proc_with_proallargtypes_names.proallargtypes
FROM information_schema.routines
-- we can declare many function with the same name and schema as long as arg types are different
-- This is the only right way to join pg_catalog.pg_proc and information_schema.routines, sadly
JOIN pg_proc_with_proallargtypes_names
ON pg_proc_with_proallargtypes_names.proname || '_' || pg_proc_with_proallargtypes_names.oid = information_schema.routines.specific_name
;
Any refactoring is welcome :)
Here is the result:
specific_name | routine_name | routine_schema | data_type | proallargtypes
----------------+--------------+----------------+-----------+--------------------------
test_ret_16633 | test_ret | public | record | {text,text,int4,varchar}
test_ret_16635 | test_ret | public | integer | {NULL}
(2 rows)
EDIT
Identification of input and output arguments is not trivial, here is my solution for pg 9.2
-- https://gist.github.com/subssn21/e9e121f6fd5ff50f688d
-- Allow us to use array_remove in pg < 9.3
CREATE OR REPLACE FUNCTION array_remove(a ANYARRAY, e ANYELEMENT)
RETURNS ANYARRAY AS $$
BEGIN
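-- IS DISTINCT FROM rather than <>, so that removing NULL works (x <> NULL is never true)
RETURN array(SELECT x FROM unnest(a) x WHERE x IS DISTINCT FROM e);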
END;
$$ LANGUAGE plpgsql;
-- edit: add support for function not returning tables, thx Tommaso Di Bucchianico
WITH pg_proc_with_unnested_proallargtypes AS (
SELECT
pg_catalog.pg_proc.oid,
pg_catalog.pg_proc.proname,
pg_catalog.pg_proc.proargmodes,
CASE WHEN proallargtypes IS NOT NULL THEN unnest(proallargtypes) ELSE null END AS proallargtype
FROM pg_catalog.pg_proc
JOIN pg_catalog.pg_namespace ON pg_catalog.pg_proc.pronamespace = pg_catalog.pg_namespace.oid
WHERE pg_catalog.pg_namespace.nspname = 'public'
),
pg_proc_with_unnested_proallargtypes_names_and_mode AS (
SELECT
pg_proc_with_unnested_proallargtypes.oid,
pg_proc_with_unnested_proallargtypes.proname,
pg_catalog.pg_type.typname,
-- we can't unnest multiple array of same length the way we expect in pg 9.2
-- just retrieve each mode manually using type row_number
pg_proc_with_unnested_proallargtypes.proargmodes[row_number() OVER w] AS proargmode
FROM pg_proc_with_unnested_proallargtypes
LEFT JOIN pg_catalog.pg_type ON pg_catalog.pg_type.oid = proallargtype
WINDOW w AS (PARTITION BY pg_proc_with_unnested_proallargtypes.oid) -- per function, so overloads sharing a name stay separate
),
pg_proc_with_input_and_output_type_names AS (
SELECT
pg_proc_with_unnested_proallargtypes_names_and_mode.oid,
pg_proc_with_unnested_proallargtypes_names_and_mode.proname,
array_agg(pg_proc_with_unnested_proallargtypes_names_and_mode.typname) AS proallargtypes,
-- we should use FILTER, but that's not available in pg 9.2 :(
array_remove(array_agg(
-- see documentation for proargmodes here: http://www.postgresql.org/docs/9.2/static/catalog-pg-proc.html
CASE WHEN pg_proc_with_unnested_proallargtypes_names_and_mode.proargmode = ANY(ARRAY['i', 'b', 'v'])
THEN pg_proc_with_unnested_proallargtypes_names_and_mode.typname
ELSE NULL END
), NULL) AS proinputargtypes,
array_remove(array_agg(
-- see documentation for proargmodes here: http://www.postgresql.org/docs/9.2/static/catalog-pg-proc.html
CASE WHEN pg_proc_with_unnested_proallargtypes_names_and_mode.proargmode = ANY(ARRAY['o', 'b', 't'])
THEN pg_proc_with_unnested_proallargtypes_names_and_mode.typname
ELSE NULL END
), NULL) AS prooutputargtypes
FROM pg_proc_with_unnested_proallargtypes_names_and_mode
GROUP BY
pg_proc_with_unnested_proallargtypes_names_and_mode.oid,
pg_proc_with_unnested_proallargtypes_names_and_mode.proname
)
SELECT
*
FROM pg_proc_with_input_and_output_type_names
;
And here is my sample output:
oid | proname | proallargtypes | proinputargtypes | prooutputargtypes
-------+--------------+--------------------------+------------------+-------------------
16633 | test_ret | {text,text,int4,varchar} | {text,text} | {int4,varchar}
16634 | array_remove | {NULL} | {} | {}
16635 | test_ret | {NULL} | {} | {}
(3 rows)
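The splitting step itself is simple once the types and modes are lined up pairwise. A minimal Python sketch of that step, using a hand-written sample row modelled on test_ret(a TEXT, b TEXT) from above (the OIDs are the standard pg_type OIDs for text, int4 and varchar; the function name split_args is made up):

```python
# Standard pg_type OIDs for the three types used by the test function.
TYPE_NAMES = {25: "text", 23: "int4", 1043: "varchar"}

# Assumed catalog content for test_ret(a TEXT, b TEXT) RETURNS TABLE("Res" integer, ...).
proallargtypes = [25, 25, 23, 1043]
proargmodes = ["i", "i", "t", "t"]  # i = IN, o = OUT, b = INOUT, v = VARIADIC, t = TABLE


def split_args(types, modes):
    """Pair each argument type with its mode and split into inputs and outputs."""
    pairs = list(zip(types, modes))
    inputs = [TYPE_NAMES[t] for t, m in pairs if m in ("i", "b", "v")]
    outputs = [TYPE_NAMES[t] for t, m in pairs if m in ("o", "b", "t")]
    return inputs, outputs


inputs, outputs = split_args(proallargtypes, proargmodes)
print(inputs, outputs)
```

The query above does the same pairing with row_number(), because unnesting two arrays in parallel is not available in pg 9.2.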
Hope that helps :)
The answer provided by Clément Prévost is detailed and educational, so I marked it as best. However, when I executed the suggested script, I ended up with empty proinputargtypes and prooutputargtypes columns (filled only with {}) on my machine. So I did a little research of my own, using hints from the answer above, and wrote the following query:
WITH pg_proc_with_unhandled_proallargtypes AS (
SELECT
pg_catalog.pg_proc.oid,
pg_catalog.pg_proc.proname,
pg_catalog.pg_proc.proargmodes,
CASE WHEN proallargtypes IS NOT NULL THEN cast(proallargtypes AS text) ELSE NULL END AS proallargtype,
CASE WHEN array_agg(proargtypes) IS NOT NULL THEN replace(string_agg(proargtypes::text, ','), ' ', ',') ELSE NULL END AS proargtype
FROM pg_catalog.pg_proc
JOIN pg_catalog.pg_namespace ON pg_catalog.pg_proc.pronamespace = pg_catalog.pg_namespace.oid
WHERE pg_catalog.pg_namespace.nspname = 'public'
GROUP BY pg_catalog.pg_proc.oid, pg_catalog.pg_proc.proname, pg_catalog.pg_proc.proargmodes, proallargtype
),
pg_proc_with_unnested_proallargtypes AS(
SELECT proname, CASE WHEN char_length(proargtype) =0 THEN NULL
ELSE ('{' || proargtype || '}')::oid[] end AS inp,
CASE WHEN proallargtype is NULL THEN NULL ELSE replace(proallargtype, proargtype || ',', '')::oid[] end AS output
FROM pg_proc_with_unhandled_proallargtypes
),
smth_input AS(
SELECT proname, unnest(inp) AS inp FROM pg_proc_with_unnested_proallargtypes
),
smth_output AS(
SELECT proname, unnest(output) AS output FROM pg_proc_with_unnested_proallargtypes
),
input_unnested AS(
SELECT proname, string_agg(pg_catalog.pg_type.typname::text, ',') AS fin_input FROM smth_input
JOIN pg_catalog.pg_type ON pg_catalog.pg_type.oid = inp
GROUP BY proname
),
output_unnested AS (
SELECT proname, string_agg(pg_catalog.pg_type.typname::text, ',') AS fin_output FROM smth_output
JOIN pg_catalog.pg_type ON pg_catalog.pg_type.oid = output
GROUP BY proname
)
SELECT input_unnested.proname, fin_input, CASE WHEN fin_output IS NOT NULL THEN fin_output ELSE information_schema.routines.data_type END AS fin_output
FROM input_unnested
LEFT JOIN output_unnested ON input_unnested.proname = output_unnested.proname
JOIN information_schema.routines ON information_schema.routines.routine_name = input_unnested.proname
It might be a bit inefficient, and I probably used too much explicit type casting, but it worked.