Return situation-dependent messages (exceptions) - postgresql

I've got stored procedures in my PostgreSQL environment. One of these procedures stores user defined data.
In plpgsql-language, I would like to realize the following processing:
Because there are unique constraints in the destination-table, I first check, if the affected attributes are already in this table. If so, I want the stored procedure to return these affected attributes, so the user knows: "Uh, some of my data already exists, let's try something different!"
For example a part of the affected table:
CREATE TABLE XYZ (
Name varchar UNIQUE
...
)
Now I want to add a new row to XYZ. If the function notices that the name already exists it should return a message with the offending name. There can be multiple duplicates and the function should return all of them. How can this be implemented?
I first thought to check every attribute individually, but that's very slow:
-- Check, if abbreviation or name already exist.
SELECT EXISTS(
SELECT
1
FROM
"sample-scheme"."sample-table" AS tableName
WHERE
tableName.abbreviation = argAbbreviation
OR
tableName.name = argName
)
INTO
varEntryExists;
-- If abbreviation or name exists, return error message.
IF varEntryExists THEN
RETURN 'Already exists.';
END IF;
This sample also doesn't return the offending attribute.

All you need is this simple query:
SELECT name
FROM XYZ
JOIN unnest ('{name1, name2, name3}'::varchar[]) AS n(name) USING (name);
Or use the ANY construct (typically faster for small arrays):
SELECT name
FROM XYZ
WHERE name = ANY ('{name1, name2, name3}'::varchar[]);
Wrapped into a function, to return multiple finds, you can either return an array or RETURNS SETOF ... or RETURNS TABLE(...) to return a set. Demonstrating a plpgsql function (could also just be SQL), with a convenient VARIADIC parameter:
CREATE OR REPLACE FUNCTION f_name_dupes(VARIADIC _names varchar[])
RETURNS SETOF varchar AS
$func$
BEGIN
-- return every passed name that already exists in the table
RETURN QUERY
SELECT x.name
FROM XYZ x
WHERE x.name = ANY ($1);
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM f_name_dupes('name1', 'name2', 'name3');
Returns all names that already exist in table XYZ. More details for VARIADIC:
How to do WHERE x IN (val1, val2,…) in plpgsql
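To return the offending names as an error message rather than a result set, the same lookup can feed a RAISE. This is only a minimal sketch: the function name f_insert_names is hypothetical, and it assumes the table XYZ from the question can be filled by inserting name alone.

```sql
-- Hypothetical helper, not part of the original answer.
-- Assumes table xyz(name varchar UNIQUE, ...) where the other columns are optional.
CREATE OR REPLACE FUNCTION f_insert_names(VARIADIC _names varchar[])
  RETURNS void
  LANGUAGE plpgsql AS
$func$
DECLARE
   _dupes varchar[];
BEGIN
   -- collect every requested name that already exists
   SELECT array_agg(x.name)
   INTO   _dupes
   FROM   xyz x
   WHERE  x.name = ANY (_names);

   -- array_agg() yields NULL when no row matched
   IF _dupes IS NOT NULL THEN
      RAISE EXCEPTION 'Names already exist: %', array_to_string(_dupes, ', ')
         USING ERRCODE = 'unique_violation';
   END IF;

   INSERT INTO xyz (name)
   SELECT unnest(_names);
END
$func$;
```

A check followed by an insert is not safe against concurrent writers, so the UNIQUE constraint remains the final guard; the check only produces a friendlier message.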

Related

Setting return table data type be the same type as another table

I have a postgres table function currently declared as such:
CREATE OR REPLACE FUNCTION apolloqa.my_func(arguments...)
RETURNS TABLE(first_name text, last_name text, age int)
LANGUAGE plpgsql AS $function$
BEGIN
RETURN QUERY
select first_name, last_name, age
from person_table ;
END
$function$ ;
When I run this code, Postgres complains that first_name and last_name in the return table do not match the query's return type. That is true. But how do I declare first_name and last_name so that they match either the query's return type or the underlying person_table's column type without repeating the same type? Is there a way to say something like:
RETURNS TABLE(first_name TYPE is person_table.first_name, ... ) ?
Postgres has a 'like' feature, but it selects all columns from a given table. I want to select just a few from one table, and a few from another. My solutions in the past would be to hard code the datatype from the underlying table, so varchar(150), or something. But, I'd like to have the type reference another type, if that's possible.
Yes, you can do almost exactly what you indicated; the syntax is just a little different. Use
returns table(first_name person_table.first_name%type
,last_name person_table.last_name%type
,age int
);
Since your function has just SQL you can also define it as an SQL function:
create or replace function my_func(arguments text)
returns table(first_name person_table.first_name%type
,last_name person_table.last_name%type
,age int
)
language sql as $function$
select first_name, last_name, age
from person_table ;
$function$ ;
You can use type copying with %TYPE:
By using %TYPE you don't need to know the data type of the structure
you are referencing, and most importantly, if the data type of the
referenced item changes in the future (for instance: you change the
type of user_id from integer to real), you might not need to change
your function definition.
https://www.postgresql.org/docs/current/plpgsql-declarations.html#PLPGSQL-DECLARATION-TYPE
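Also, your function's input and output parameters have the same names as the columns in the query body. In PL/pgSQL that makes the column references ambiguous and may lead to mistakes later, so give the parameters distinct names. See the DB fiddle (last code block):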
CREATE OR REPLACE FUNCTION apolloqa.my_func(p_age person_table.age%type)
RETURNS TABLE(_first_name person_table.first_name%type,
_last_name person_table.last_name%type,
_age person_table.age%type)
LANGUAGE plpgsql AS $function$
BEGIN
RETURN QUERY
select first_name, last_name, age from person_table where age = p_age;
END
$function$;

Access column using variable instead of explicit column name

I would like to access a column by using variable instead of a static column name.
Example:
variable := 'customer';
SELECT table.variable (this is what I would prefer) instead of table.customer
I need this functionality as records in my table vary in terms of data length (eg. some have data in 10 columns, some in 14 or 16 columns etc.) so I need to address columns dynamically. As I understand, I can't address columns by their index (eg. select 8-th column of the table) right?
I can loop and put the desired column name in a variable for the given iteration. However, I get errors when I try to access a column using that variable (e.g. table_name.variable is not working).
For the sake of simplicity, I paste just some dummy code to illustrate the issue:
CREATE OR REPLACE FUNCTION dynamic_column_name() returns text
LANGUAGE PLPGSQL
AS $$
DECLARE
col_name text;
return_value text;
BEGIN
create table customer (
id bigint,
name varchar
);
INSERT INTO customer VALUES(1, 'Adam');
col_name := 'name';
-- SELECT customer.name INTO return_value FROM customer WHERE id = 1; -- WORKING, returns 'Adam' but it is not DYNAMIC.
-- SELECT customer.col_name INTO return_value FROM customer WHERE id = 1; -- ERROR: column customer.col_name does not exist
-- SELECT 'customer.'||col_name INTO return_value FROM customer WHERE id = 1; -- NOT working, returns 'customer.name'
-- SELECT customer||'.'||col_name INTO return_value FROM customer WHERE id = 1; -- NOT working, returns whole record + .name, i.e.: (1,Adam).name
DROP TABLE customer;
RETURN return_value;
END;
$$;
SELECT dynamic_column_name();
So how to obtain 'Adam' string with SQL query using col_name variable when addressing column of customer table?
SQL does not allow parameterizing identifiers (including column names) or syntax elements. Only values can be parameters.
You need dynamic SQL for that. (Basically, build the SQL string and execute.) Use EXECUTE in a plpgsql function. There are multiple syntax variants. For your simple example:
CREATE OR REPLACE FUNCTION dynamic_column_name(_col_name text, OUT return_value text)
RETURNS text
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE format('SELECT %I FROM customer WHERE id = 1', _col_name)
INTO return_value;
END
$func$;
Call:
SELECT dynamic_column_name('name');
db<>fiddle here
Data types have to be compatible, of course.
More examples:
How to use text input as column name(s) in a Postgres function?
https://stackoverflow.com/search?q=%5Bpostgres%5D+%5Bdynamic-sql%5D+parameter+column+code%3AEXECUTE
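Values, unlike identifiers, can still be passed as real parameters to the dynamic statement. The following sketch extends the example above under a few assumptions: the function name dynamic_column_value and the _id parameter are hypothetical, and the cast to text is added so columns of other types fit the text output.

```sql
-- Hypothetical variant: column name via format('%I'), row id via USING.
CREATE OR REPLACE FUNCTION dynamic_column_value(_col_name text, _id bigint
                                              , OUT return_value text)
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- %I quotes the identifier safely; $1 is bound to _id by the USING clause
   EXECUTE format('SELECT %I::text FROM customer WHERE id = $1', _col_name)
   INTO  return_value
   USING _id;
END
$func$;

-- Call:
SELECT dynamic_column_value('name', 1);
```

Passing the id with USING keeps it out of the query string, so only the column name needs identifier quoting.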

postgresSQL insert multiple rows, of id returned from select queries

I have a complex query that join multiple tables and return many member ids (line 5)
For each memberId I want to insert a memberSegment record, consisting of the memberId (new for each insert) and a segmentId (always the same/not list)
INSERT INTO db."memberSegment"(
"memberId",
"segmentId")
VALUES (
(select table."memberId" complex query returns many ids ),
(SELECT id FROM db.segment where "idName" = 'due-for-360')
);
From reading on SO this is how I interpret it should look, but I am getting following error message, making me think that my query is not expecting a list in either values.
ERROR: more than one row returned by a subquery used as an expression
SQL state: 21000
Each query on its own returns the following:
You might be able to phrase this as an INSERT INTO ... SELECT:
INSERT INTO db."memberSegment" ("memberId", "segmentId")
SELECT
t."memberId",
(SELECT id FROM db.segment WHERE "idName" = 'due-for-360')
FROM ( ... ) AS t; -- replace ( ... ) with the complex query that returns many ids
This would at the very least get around your current error, which is stemming from the query returning more than one id. The only possible issue would be if the subquery on db.segment also returns more than a single value. If it does not, then the above should work. If it does return more than one value, then your logic needs to be reconsidered.
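The same INSERT ... SELECT idea can also be written with a CTE and a join instead of a scalar subquery. This is only a sketch: db.member is a hypothetical stand-in for the complex query, which the question does not show.

```sql
WITH ids AS (
   -- hypothetical stand-in for the complex query that returns many member ids
   SELECT m."memberId"
   FROM   db.member m
)
INSERT INTO db."memberSegment" ("memberId", "segmentId")
SELECT ids."memberId", s.id
FROM   ids
CROSS  JOIN db.segment s
WHERE  s."idName" = 'due-for-360';
```

With the join, a missing segment inserts no rows at all, whereas the scalar subquery would insert rows with a NULL "segmentId"; if several segments match, the join inserts one row per member and segment instead of raising an error.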
For example:
CREATE OR REPLACE FUNCTION f_get(ikey text)
returns integer
AS
$func$
DECLARE
l_id integer;
BEGIN
LOCK TABLE foo IN SHARE ROW EXCLUSIVE MODE;
INSERT INTO foo (type)
SELECT ikey
WHERE NOT EXISTS (
SELECT * FROM foo WHERE type=ikey
)
returning id into l_id; --< store the returned ID in local variable
return l_id; --< return this variable
END
$func$ LANGUAGE plpgsql;

Execute a dynamic crosstab query

I implemented this function in my Postgres database: http://www.cureffi.org/2013/03/19/automatically-creating-pivot-table-column-names-in-postgresql/
Here's the function:
create or replace function xtab (tablename varchar, rowc varchar, colc varchar, cellc varchar, celldatatype varchar) returns varchar language plpgsql as $$
declare
dynsql1 varchar;
dynsql2 varchar;
columnlist varchar;
begin
-- 1. retrieve list of column names.
dynsql1 = 'select string_agg(distinct '||colc||'||'' '||celldatatype||''','','' order by '||colc||'||'' '||celldatatype||''') from '||tablename||';';
execute dynsql1 into columnlist;
-- 2. set up the crosstab query
dynsql2 = 'select * from crosstab (
''select '||rowc||','||colc||','||cellc||' from '||tablename||' group by 1,2 order by 1,2'',
''select distinct '||colc||' from '||tablename||' order by 1''
)
as ct (
'||rowc||' varchar,'||columnlist||'
);';
return dynsql2;
end
$$;
So now I can call the function:
select xtab('globalpayments','month','currency','(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)','text');
Which returns (because the return type of the function is varchar):
select * from crosstab (
'select month,currency,(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)
from globalpayments
group by 1,2
order by 1,2'
, 'select distinct currency
from globalpayments
order by 1'
) as ct ( month varchar,CAD text,EUR text,GBP text,USD text );
How can I get this function to not only generate the code for the dynamic crosstab, but also execute the result? I.e., the result when I manually copy/paste/execute is this. But I want it to execute without that extra step: the function shall assemble the dynamic query and execute it:
Edit 1
This function comes close, but I need it to return more than just the first column of the first record
Taken from: Are there any way to execute a query inside the string value (like eval) in PostgreSQL?
create or replace function eval( sql text ) returns text as $$
declare
as_txt text;
begin
if sql is null then return null ; end if ;
execute sql into as_txt ;
return as_txt ;
end;
$$ language plpgsql
usage: select * from eval($$select * from analytics limit 1$$)
However it just returns the first column of the first record :
eval
----
2015
when the actual result looks like this:
Year, Month, Date, TPV_USD
---- ----- ------ --------
2016, 3, 2016-03-31, 100000
What you ask for is impossible. SQL is a strictly typed language. PostgreSQL functions need to declare a return type (RETURNS ..) at the time of creation.
A limited way around this is with polymorphic functions, if you can provide the return type at the time of the function call. But that's not evident from your question.
Refactor a PL/pgSQL function to return the output of various SELECT queries
You can return a completely dynamic result with anonymous records. But then you are required to provide a column definition list with every call. And how do you know about the returned columns? Catch 22.
There are various workarounds, depending on what you need or can work with. Since all your data columns seem to share the same data type, I suggest to return an array: text[]. Or you could return a document type like hstore or json. Related:
Dynamic alternative to pivot with CASE and GROUP BY
Dynamically convert hstore keys into columns for an unknown set of keys
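To illustrate the document-type workaround, here is a minimal sketch that returns each row of an arbitrary query as jsonb. The function name eval_json is hypothetical, it assumes Postgres 9.5 or later for to_jsonb(), and the query text must not end with a semicolon.

```sql
-- Hypothetical helper: one jsonb value per result row, so the return type is fixed.
-- Executes arbitrary SQL, so it must never be exposed to untrusted input.
CREATE OR REPLACE FUNCTION eval_json(_sql text)
  RETURNS SETOF jsonb
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- wrap the query as a subquery and convert each row to a jsonb object
   RETURN QUERY EXECUTE format('SELECT to_jsonb(sub) FROM (%s) sub', _sql);
END
$func$;

-- Call:
SELECT * FROM eval_json('SELECT * FROM analytics LIMIT 1');
```

The caller gets all columns of all rows, with column names as JSON keys, and no column definition list is needed.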
But it might be simpler to just use two calls: 1: Let Postgres build the query. 2: Execute and retrieve returned rows.
Selecting multiple max() values using a single SQL statement
I would not use the function from Eric Minikel as presented in your question at all. It is not safe against SQL injection by way of maliciously malformed identifiers. Use format() to build query strings unless you are running an outdated version older than Postgres 9.1.
A shorter and cleaner implementation could look like this:
CREATE OR REPLACE FUNCTION xtab(_tbl regclass, _row text, _cat text
, _expr text -- still vulnerable to SQL injection!
, _type regtype)
RETURNS text
LANGUAGE plpgsql AS
$func$
DECLARE
_cat_list text;
_col_list text;
BEGIN
-- generate categories for xtab param and col definition list
EXECUTE format(
$$SELECT string_agg(quote_literal(x.cat), '), (')
, string_agg(quote_ident (x.cat), %L)
FROM (SELECT DISTINCT %I AS cat FROM %s ORDER BY 1) x$$
, ' ' || _type || ', ', _cat, _tbl)
INTO _cat_list, _col_list;
-- generate query string
RETURN format(
'SELECT * FROM crosstab(
$q$SELECT %I, %I, %s
FROM %s
GROUP BY 1, 2 -- only works if the 3rd column is an aggregate expression
ORDER BY 1, 2$q$
, $c$VALUES (%5$s)$c$
) ct(%1$I text, %6$s %7$s)'
, _row, _cat, _expr -- expr must be an aggregate expression!
, _tbl, _cat_list, _col_list, _type);
END
$func$;
Same function call as your original version. The function crosstab() is provided by the additional module tablefunc which has to be installed. Basics:
PostgreSQL Crosstab Query
This handles column and table names safely. Note the use of object identifier types regclass and regtype. Also works for schema-qualified names.
Table name as a PostgreSQL function parameter
However, it is not completely safe while you pass a string to be executed as expression (_expr - cellc in your original query). This kind of input is inherently unsafe against SQL injection and should never be exposed to the general public.
SQL injection in Postgres functions vs prepared queries
Scans the table only once for both lists of categories and should be a bit faster.
Still can't return completely dynamic row types since that's strictly not possible.
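If the calls are made from psql (9.6 or later), the two steps mentioned above can be combined with the \gexec meta-command, which sends each value of the preceding query's result back to the server as a statement. A sketch, reusing the call from the question:

```sql
-- Step 1 builds the crosstab query string; \gexec then executes it (psql only)
SELECT xtab('globalpayments', 'month', 'currency'
          , '(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)', 'text') \gexec
```

This happens client-side in psql, so the function itself still only returns text and the strict typing rules are not violated.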
Not quite impossible: you can still execute it from a function that runs the query string with EXECUTE and returns SETOF RECORD.
Then you have to specify the return record format when calling it. The reason in this case is that the planner needs to know the return format before it can make certain decisions (materialization comes to mind).
So in this case you would EXECUTE the query and return the rows as SETOF RECORD.
For example, we could do something like this with a wrapper function but the same logic could be folded into your function:
CREATE OR REPLACE FUNCTION crosstab_wrapper
(tablename varchar, rowc varchar, colc varchar,
cellc varchar, celldatatype varchar)
returns setof record language plpgsql as $$
DECLARE outrow record;
BEGIN
FOR outrow IN EXECUTE xtab($1, $2, $3, $4, $5)
LOOP
RETURN NEXT outrow;
END LOOP;
END;
$$;
Then, when you call the function, you have to supply the record structure (AS (col1 type, col2 type, etc.)), just like you do with crosstab or connectby.
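A call of the wrapper might then look like the sketch below. It assumes the asker's original xtab function and the four currencies shown in the question; the column list has to match what the generated crosstab query actually returns.

```sql
SELECT *
FROM   crosstab_wrapper('globalpayments', 'month', 'currency'
                      , '(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)', 'text')
       AS ct(month varchar, CAD text, EUR text, GBP text, USD text);
```

The caller still needs to know the set of currencies in advance, which is the catch with every SETOF record approach.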

How to return uncertain number columns of a table from a postgresql function?

As we know, plpgsql functions can return a table like this:
RETURNS table(int, char(1), ...)
But how to write this function, when the list of columns is uncertain at the time of creating the function.
When a function returns anonymous records
RETURNS SETOF record
you have to provide a column definition list when calling it with SELECT * FROM. SQL demands to know column names and types to interpret *. For registered tables and types this is provided by the system catalog. For functions you need to declare it yourself one way or the other: either in the function definition or in the call. The call could look like what @Craig already provided. You probably didn't read his answer carefully enough.
Depending on what you need exactly, there are a number of ways around this, though:
1) Return a single anonymous record
Example:
CREATE OR REPLACE FUNCTION myfunc_single() -- return a single anon rec
RETURNS record AS
$func$
DECLARE
rec record;
BEGIN
SELECT into rec 1, 'foo'; -- note missing type for 'foo'
RETURN rec;
END
$func$ LANGUAGE plpgsql;
This is a very limited niche. Only works for a single anonymous record from a function defined with:
RETURNS record
Call without * FROM:
SELECT myfunc_single();
Won't work for a SRF (set-returning function) and only returns a string representation of the whole record (type record). Rarely useful.
To get individual cols from a single anonymous record, you need to provide a column definition list again:
SELECT * FROM myfunc_single() AS (id int, txt unknown); -- note "unknown" type
2) Return well known row type with a super-set of columns
Example:
CREATE TABLE t (id int, txt text, the_date date);
INSERT INTO t VALUES (3, 'foz', '2014-01-13'), (4, 'baz', '2014-01-14');
CREATE OR REPLACE FUNCTION myfunc_tbl() -- return well known table
RETURNS SETOF t AS
$func$
BEGIN
RETURN QUERY
TABLE t;
-- SELECT * FROM t; -- equivalent
END
$func$ LANGUAGE plpgsql;
The function returns all columns of the table. This is short and simple and performance won't suffer as long as your table doesn't hold a huge number of columns or huge columns.
Select individual columns on call:
SELECT id, txt FROM myfunc_tbl();
SELECT id, the_date FROM myfunc_tbl();
-> SQLfiddle demonstrating all.
3) Advanced solutions
This answer is long enough already. And this closely related answer has it all:
Refactor a PL/pgSQL function to return the output of various SELECT queries
Look to the last chapter in particular: Various complete table types
If the result is of uncertain/undefined format you must use RETURNS record or (for a multi-row result) RETURNS SETOF record.
The calling query must then specify the table format with a column definition list in the FROM clause, e.g.:
SELECT * FROM my_func() AS result(a integer, b char(1));
BTW, char is an awful data type with insane space-padding rules that date back to the days of fixed-width file formats. Don't use it. Always just use text or varchar.
Given comments, let's make this really explicit:
regress=> CREATE OR REPLACE FUNCTION f_something() RETURNS SETOF record AS $$
SELECT 1, 2, TEXT 'a';
$$ LANGUAGE SQL;
CREATE FUNCTION
regress=> SELECT * FROM f_something();
ERROR: a column definition list is required for functions returning "record"
LINE 1: SELECT * FROM f_something();
regress=> SELECT * FROM f_something() AS x(a integer, b integer, c text);
a | b | c
---+---+---
1 | 2 | a
(1 row)