Passing a column as a function parameter that creates a table - postgresql

I need to create a function that will generate a table.
This table will have columns brought with several left joins.
At some point i need to filter the data based on the value of a dynamic column (i need to have: WHERE table_name.dynamic_column_name = 1)(that column has 1s and 0s, i need to keep only the 1s)
So when I 'call' the function, user should type like this: SELECT * FROM function_name(dynamic_column_name)
What i did:
CREATE OR REPLACE FUNCTION check_gap (_col_name varchar)
RETRUNS TABLE ( ... here i have several columns with their type ...)
LANGUAGE plpsql
AS $function$
BEGIN
RETURN QUERY
SELECT ... a bunch of columns ...
FROM ... a bunch of tables left joined ...
WHERE _col_name = 1;
END;
$function$
;
I even tried with table_name._col_name .. though it wasn't necessary, the query (select from) works just as fine without
** I found some solutions for dynamic value but not for a dynamic column
*** I am using PostgreSQL (from DBeaver)

You need dynamic SQL for that. Pls. note that the code below is SQL injection prone.
CREATE OR REPLACE FUNCTION check_gap (col_name varchar)
RETURNS TABLE (... several columns with their type ...) LANGUAGE plpgsql AS
$$
BEGIN
RETURN QUERY EXECUTE format
(
'SELECT ... a bunch of columns ...
FROM ... a bunch of tables left joined ...
WHERE %I = 1;', col_name
);
END;
$$;

Related

PostgreSQL function taking column name as argument

I have a table tab1 with four columns col1, col2, col3 and col4.
I want to create a function like f4(a) where a is defined by user and if user types select f4(col1) he gets column tab1.col1.
Is there any way to create such function in PostgreSQL?
There is really not good reason to complicate matters with a function here.
What you should do instead:
SELECT col1 FROM tab1;
What you ask for:
CREATE OR REPLACE FUNCTION f4(_col text)
RETURNS TABLE (col_x text)
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE
format('SELECT %I FROM tab1', _col);
END
$func$;
Call:
SELECT * FROM f4('col1');
You need dynamic SQL because SQL does not allow to parameterize identifiers.
Further reading:
Using variable for fieldname in postgresql
Define table and column names as arguments in a plpgsql function?

PostgreSQL: Run function many times with different arguments (of same type)

I have a function that takes a parameter uid and returns a table. I'd like to add call this function multiple times with different arguments (but of the same type, VARCHAR(11)) and put those results in a temporary table (so I can pull a CSV out of it).
My function does not seem to be able to loop through it properly
DROP TABLE IF EXISTS programs_to_analyze;
CREATE TEMP TABLE IF NOT EXISTS programs_to_analyze (
uid VARCHAR(11)
);
INSERT INTO programs_to_analyze(uid)
VALUES ('lUGqq3na2wn');
CREATE OR REPLACE FUNCTION myfunction()
RETURNS TABLE(identifier VARCHAR(11)) AS $BODY$
DECLARE
program programs_to_analyze%ROWTYPE;
BEGIN
FOR program IN
SELECT uid
FROM programs_to_analyze
LOOP
RETURN QUERY SELECT * from some_query(program);
END LOOP;
END;
$BODY$ LANGUAGE plpgsql VOLATILE;
When running select * from myfunction(); I receive 0 rows (expected: 18 rows as tested).
How can I put those results into one table when declaring multiple rows in programs_to_analyze?

Execute a dynamic crosstab query

I implemented this function in my Postgres database: http://www.cureffi.org/2013/03/19/automatically-creating-pivot-table-column-names-in-postgresql/
Here's the function:
create or replace function xtab (tablename varchar, rowc varchar, colc varchar, cellc varchar, celldatatype varchar) returns varchar language plpgsql as $$
declare
dynsql1 varchar;
dynsql2 varchar;
columnlist varchar;
begin
-- 1. retrieve list of column names.
dynsql1 = 'select string_agg(distinct '||colc||'||'' '||celldatatype||''','','' order by '||colc||'||'' '||celldatatype||''') from '||tablename||';';
execute dynsql1 into columnlist;
-- 2. set up the crosstab query
dynsql2 = 'select * from crosstab (
''select '||rowc||','||colc||','||cellc||' from '||tablename||' group by 1,2 order by 1,2'',
''select distinct '||colc||' from '||tablename||' order by 1''
)
as ct (
'||rowc||' varchar,'||columnlist||'
);';
return dynsql2;
end
$$;
So now I can call the function:
select xtab('globalpayments','month','currency','(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)','text');
Which returns (because the return type of the function is varchar):
select * from crosstab (
'select month,currency,(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)
from globalpayments
group by 1,2
order by 1,2'
, 'select distinct currency
from globalpayments
order by 1'
) as ct ( month varchar,CAD text,EUR text,GBP text,USD text );
How can I get this function to not only generate the code for the dynamic crosstab, but also execute the result? I.e., the result when I manually copy/paste/execute is this. But I want it to execute without that extra step: the function shall assemble the dynamic query and execute it:
Edit 1
This function comes close, but I need it to return more than just the first column of the first record
Taken from: Are there any way to execute a query inside the string value (like eval) in PostgreSQL?
create or replace function eval( sql text ) returns text as $$
declare
as_txt text;
begin
if sql is null then return null ; end if ;
execute sql into as_txt ;
return as_txt ;
end;
$$ language plpgsql
usage: select * from eval($$select * from analytics limit 1$$)
However it just returns the first column of the first record :
eval
----
2015
when the actual result looks like this:
Year, Month, Date, TPV_USD
---- ----- ------ --------
2016, 3, 2016-03-31, 100000
What you ask for is impossible. SQL is a strictly typed language. PostgreSQL functions need to declare a return type (RETURNS ..) at the time of creation.
A limited way around this is with polymorphic functions. If you can provide the return type at the time of the function call. But that's not evident from your question.
Refactor a PL/pgSQL function to return the output of various SELECT queries
You can return a completely dynamic result with anonymous records. But then you are required to provide a column definition list with every call. And how do you know about the returned columns? Catch 22.
There are various workarounds, depending on what you need or can work with. Since all your data columns seem to share the same data type, I suggest to return an array: text[]. Or you could return a document type like hstore or json. Related:
Dynamic alternative to pivot with CASE and GROUP BY
Dynamically convert hstore keys into columns for an unknown set of keys
But it might be simpler to just use two calls: 1: Let Postgres build the query. 2: Execute and retrieve returned rows.
Selecting multiple max() values using a single SQL statement
I would not use the function from Eric Minikel as presented in your question at all. It is not safe against SQL injection by way of maliciously malformed identifiers. Use format() to build query strings unless you are running an outdated version older than Postgres 9.1.
A shorter and cleaner implementation could look like this:
CREATE OR REPLACE FUNCTION xtab(_tbl regclass, _row text, _cat text
, _expr text -- still vulnerable to SQL injection!
, _type regtype)
RETURNS text
LANGUAGE plpgsql AS
$func$
DECLARE
_cat_list text;
_col_list text;
BEGIN
-- generate categories for xtab param and col definition list
EXECUTE format(
$$SELECT string_agg(quote_literal(x.cat), '), (')
, string_agg(quote_ident (x.cat), %L)
FROM (SELECT DISTINCT %I AS cat FROM %s ORDER BY 1) x$$
, ' ' || _type || ', ', _cat, _tbl)
INTO _cat_list, _col_list;
-- generate query string
RETURN format(
'SELECT * FROM crosstab(
$q$SELECT %I, %I, %s
FROM %I
GROUP BY 1, 2 -- only works if the 3rd column is an aggregate expression
ORDER BY 1, 2$q$
, $c$VALUES (%5$s)$c$
) ct(%1$I text, %6$s %7$s)'
, _row, _cat, _expr -- expr must be an aggregate expression!
, _tbl, _cat_list, _col_list, _type);
END
$func$;
Same function call as your original version. The function crosstab() is provided by the additional module tablefunc which has to be installed. Basics:
PostgreSQL Crosstab Query
This handles column and table names safely. Note the use of object identifier types regclass and regtype. Also works for schema-qualified names.
Table name as a PostgreSQL function parameter
However, it is not completely safe while you pass a string to be executed as expression (_expr - cellc in your original query). This kind of input is inherently unsafe against SQL injection and should never be exposed to the general public.
SQL injection in Postgres functions vs prepared queries
Scans the table only once for both lists of categories and should be a bit faster.
Still can't return completely dynamic row types since that's strictly not possible.
Not quite impossible, you can still execute it (from a query execute the string and return SETOF RECORD.
Then you have to specify the return record format. The reason in this case is that the planner needs to know the return format before it can make certain decisions (materialization comes to mind).
So in this case you would EXECUTE the query, return the rows and return SETOF RECORD.
For example, we could do something like this with a wrapper function but the same logic could be folded into your function:
CREATE OR REPLACE FUNCTION crosstab_wrapper
(tablename varchar, rowc varchar, colc varchar,
cellc varchar, celldatatype varchar)
returns setof record language plpgsql as $$
DECLARE outrow record;
BEGIN
FOR outrow IN EXECUTE xtab($1, $2, $3, $4, $5)
LOOP
RETURN NEXT outrow
END LOOP;
END;
$$;
Then you supply the record structure on calling the function just like you do with crosstab.
Then when you all the query you would have to supply a record structure (as (col1 type, col2 type, etc) like you do with connectby.

How to execute a string result of a stored procedure in postgres

I have created the following stored procedure, which basically receives a name of table, and a prefix. The function then finds all columns that share this prefix and returns as an output a 'select' query command ('myoneliner').
as follows:
CREATE OR REPLACE FUNCTION mytext (mytable text, myprefix text)
RETURNS text AS $myoneliner$
declare
myoneliner text;
BEGIN
SELECT 'SELECT ' || substr(cols,2,length(cols)-2) ||' FROM '||mytable
INTO myoneliner
FROM (
SELECT array(
SELECT DISTINCT quote_ident(column_name::text)
FROM information_schema.columns
WHERE table_name = mytable
AND column_name LIKE myprefix||'%'
order by quote_ident
)::text cols
) sub;
RETURN myoneliner;
END;
$myoneliner$ LANGUAGE plpgsql;
Call:
select mytext('dkj_p_k27ac','enri');
As a result of running this stored procedure and the 'select' that is following it, I get the following output at the Data Output window (all within one cell, named "mytext text"):
'SELECT enrich_d_dkj_p_k27ac,enrich_lr_dkj_p_k27ac,enrich_r_dkj_p_k27ac
FROM dkj_p_k27ac'
I would like to basically be able to take the output command line that I received as an output and execute it. In other words, I would like to be able and execute the output of my stored procedure.
How can I do so?
I tried the following:
CREATE OR REPLACE FUNCTION mytext (mytable text, myprefix text)
RETURNS SETOF RECORD AS $$
declare
smalltext text;
myoneliner text;
BEGIN
SELECT 'SELECT ' || substr(cols,2,length(cols)-2) ||' FROM '||mytable
INTO myoneliner
FROM (
SELECT array(
SELECT DISTINCT quote_ident(column_name::text)
FROM information_schema.columns
WHERE table_name = mytable
AND column_name LIKE myprefix||'%'
order by quote_ident
)::text cols
) sub;
smalltext=lower(myoneliner);
raise notice '%','my additional text '||smalltext;
RETURN QUERY EXECUTE smalltext;
END;
$$ LANGUAGE plpgsql;
Call function:
SELECT * from mytext('dkj_p_k27ac','enri');
But I'm getting the following error message, could you please advise what should I change in order for it to execute?:
ERROR: a column definition list is required for functions returning "record"
LINE 26: SELECT * from mytext('dkj_p_k27ac','enri');
********** Error **********
ERROR: a column definition list is required for functions returning "record"
SQL state: 42601
Character: 728
Your first problem was solved by using dynamic SQL with EXECUTE like Craig advised.
But the rabbit hole goes deeper:
CREATE OR REPLACE FUNCTION myresult(mytable text, myprefix text)
RETURNS SETOF RECORD AS
$func$
DECLARE
smalltext text;
myoneliner text;
BEGIN
SELECT INTO myoneliner
'SELECT '
|| string_agg(quote_ident(column_name::text), ',' ORDER BY column_name)
|| ' FROM ' || quote_ident(mytable)
FROM information_schema.columns
WHERE table_name = mytable
AND column_name LIKE myprefix||'%'
AND table_schema = 'public'; -- schema name; might be another param
smalltext := lower(myoneliner); -- nonsense
RAISE NOTICE 'My additional text: %', myoneliner;
RETURN QUERY EXECUTE myoneliner;
END
$func$ LANGUAGE plpgsql;
Major points
Don't cast the whole statement to lower case. Column names might be double-quoted with upper case letters, which are case-sensitive in this case (no pun intended).
You don't need DISTINCT in the query on information_schema.columns. Column names are unique per table.
You do need to specify the schema, though (or use another way to single out one schema), or you might be mixing column names from multiple tables of the same name in multiple schemas, resulting in nonsense.
You must sanitize all identifiers in dynamic code - including table names: quote_ident(mytable). Be aware that your text parameter to the function is case sensitive! The query on information_schema.columns requires that, too.
I untangled your whole construct to build the list of column names with string_agg() instead of the array constructor. Related answer:
Update multiple columns that start with a specific string
The assignment operator in plpgsql is :=.
Simplified syntax of RAISE NOTICE.
Core problem impossible to solve
All of this still doesn't solve your main problem: SQL demands a definition of the columns to be returned. You can circumvent this by returning anonymous records like you tried. But that's just postponing the inevitable. Now you have to provide a column definition list at call time, just like your error message tells you. But you just don't know which columns are going to be returned. Catch 22.
Your call would work like this:
SELECT *
FROM myresult('dkj_p_k27ac','enri') AS f (
enrich_d_dkj_p_k27ac text -- replace with actual column types
, enrich_lr_dkj_p_k27ac text
, enrich_r_dkj_p_k27ac text);
But you don't know number, names (optional) and data types of returned columns, not at creation time of the function and not even at call time. It's impossible to do exactly that in a single call. You need two separate queries to the database.
You could return all columns of any given table dynamically with a function using polymorphic types, because there is a well defined type for the whole table. Last chapter of this related answer:
Refactor a PL/pgSQL function to return the output of various SELECT queries

Stored function with temporary table in postgresql

Im new to writing stored functions in postgresql and in general . I'm trying to write onw with an input parameter and return a set of results stored in a temporary table.
I do the following in my function .
1) Get a list of all the consumers and store their id's stored in a temp table.
2) Iterate over a particular table and retrieve values corresponding to each value from the above list and store in a temp table.
3)Return the temp table.
Here's the function that I've tried to write by myself ,
create or replace function getPumps(status varchar) returns setof record as $$ (setof record?)
DECLARE
cons_id integer[];
i integer;
temp table tmp_table;--Point B
BEGIN
select consumer_id into cons_id from db_consumer_pump_details;
FOR i in select * from cons_id LOOP
select objectid,pump_id,pump_serial_id,repdate,pumpmake,db_consumer_pump_details.status,db_consumer.consumer_name,db_consumer.wenexa_id,db_consumer.rr_no into tmp_table from db_consumer_pump_details inner join db_consumer on db_consumer.consumer_id=db_consumer_pump_details.consumer_id
where db_consumer_pump_details.consumer_id=i and db_consumer_pump_details.status=$1--Point A
order by db_consumer_pump_details.consumer_id,pump_id,createddate desc limit 2
END LOOP;
return tmp_table
END;
$$
LANGUAGE plpgsql;
However Im not sure about my approach and whether im right at the points A and B as I've marked in the code above.And getting a load of errors while trying to create the temporary table.
EDIT: got the function to work ,but I get the following error when I try to run the function.
ERROR: array value must start with "{" or dimension information
Here's my revised function.
create temp table tmp_table(objectid integer,pump_id integer,pump_serial_id varchar(50),repdate timestamp with time zone,pumpmake varchar(50),status varchar(2),consumer_name varchar(50),wenexa_id varchar(50),rr_no varchar(25));
select consumer_id into cons_id from db_consumer_pump_details;
FOR i in select * from cons_id LOOP
insert into tmp_table
select objectid,pump_id,pump_serial_id,repdate,pumpmake,db_consumer_pump_details.status,db_consumer.consumer_name,db_consumer.wenexa_id,db_consumer.rr_no from db_consumer_pump_details inner join db_consumer on db_consumer.consumer_id=db_consumer_pump_details.consumer_id where db_consumer_pump_details.consumer_id=i and db_consumer_pump_details.status=$1
order by db_consumer_pump_details.consumer_id,pump_id,createddate desc limit 2;
END LOOP;
return query (select * from tmp_table);
drop table tmp_table;
END;
$$
LANGUAGE plpgsql;
AFAIK one can't declare tables as variables in postgres. What you can do is create one in your funcion body and use it thourough (or even outside of function). Beware though as temporary tables aren't dropped until the end of the session or commit.
The way to go is to use RETURN NEXT or RETURN QUERY
As for the function result type I always found RETURNS TABLE to be more readable.
edit:
Your cons_id array is innecessary, just iterate the values returned by select.
Also you can have multiple return query statements in a single function to append result of the query to the result returned by function.
In your case:
CREATE OR REPLACE FUNCTION getPumps(status varchar)
RETURNS TABLE (objectid INTEGER,pump_id INTEGER,pump_serial_id INTEGER....)
AS
$$
BEGIN
FOR i in SELECT consumer_id FROM db_consumer_pump_details LOOP
RETURN QUERY(
SELECT objectid,pump_id,pump_serial_id,repdate,pumpmake,db_consumer_pump_details.status,db_consumer.consumer_name,db_consumer.wenexa_id,db_consumer.rr_no FROM db_consumer_pump_details INNER JOIN db_consumer ON db_consumer.consumer_id=db_consumer_pump_details.consumer_id
WHERE db_consumer_pump_details.consumer_id=i AND db_consumer_pump_details.status=$1
ORDER BY db_consumer_pump_details.consumer_id,pump_id,createddate DESC LIMIT 2
);
END LOOP;
END;
$$
edit2:
You probably want to take a look at this solution for groupwise-k-maximum problem as that's exactly what you're dealing with here.
it might be easier to just return a table (or query)
CREATE FUNCTION extended_sales(p_itemno int)
RETURNS TABLE(quantity int, total numeric) AS $$
BEGIN
RETURN QUERY SELECT quantity, quantity * price FROM sales
WHERE itemno = p_itemno;
END;
$$ LANGUAGE plpgsql;
(copied from postgresql docs)