I need to create a macro or function that takes in two parameters, and concats them together after a basic data manipulation. Then this text string should be able to be used in any other query.
CREATE OR REPLACE FUNCTION getData (_table text, _days text)
RETURNS text AS
$func$
SELECT $1 || to_char(CURRENT_TIMESTAMP - ($2 || ' days')::INTERVAL, 'YYYYMMDD');
$func$ LANGUAGE sql;
select * from getData('opportunity', '4') limit 10;
So what I am expecting from this is to actually get the same result as if I executed
select * from opportunity20151030 limit 10;
Instead I am getting "opportunity20151030"
EDIT:
The reason we need this is because my employer is doing a nightly snapshot of our salesforce data, about 17 objects in all. Thats why I can't return a table. Returning a table needs to specify the columns. But we need to be able to query a variety of tables. This really needs to be just a small utility macro. This way we can have one query, to generate graphs that compare data from today and a week ago. This can even be something inside of pgAdmin itself, and is not restricted to being postgresql function. Is there a way I can execute the function and use the result inline in another query. I spent an hour playing around with
Execute 'select * from $1' using getData('opportunity', '4')
type queries, but apparently the LANGUAGE specified changes what can and can't be used in terms of compatible SQL statements.
Thank You!
Well, your function is returning exactly what you asked for... the name of the table you want to look in.
Now, while you can do a function that returns some table's data, the function must know the "structure" of that data. Which means that you can't use it for any table but those that share the same structure (same number of fields, same data types in the same order). Guessing about your table's name i will say that is an inherited table and a function like the one belowe can do the trick for any derived (inherited) table from opportunity
CREATE OR REPLACE FUNCTION getData (_table text, _days text, _limit int default -1)
RETURNS SETOF opportunity AS
$func$
DECLARE
table_name text;
limit_clause text = ' ';
BEGIN
SELECT _table || to_char(CURRENT_TIMESTAMP - (_days || ' days')::INTERVAL, 'YYYYMMDD') INTO table_name;
IF _limit > -1 THEN
limit_clause = ' LIMIT ' || _limit;
END IF;
RETURN QUERY EXECUTE 'SELECT * FROM ' || table_name || limit_clause;
RETURN;
END
$func$ LANGUAGE plpgsql;
And use it like:
select * from getData('opportunity', '4', 10);
PS1: i put the limit as one extra parameter in the function, but because that parameter has a default value you can ignore it when calling the function and that will return all values. i put it there because otherwise the function will return all rows in the table and limit after that.
PS2: i would avoid doing this unless you really think is needed, because this cannot be optimized if put as part of a larger query.
Related
May I know on how to call an array in stored procedure? I tried to enclosed it with a bracket to put the column_name that need to be insert in the new table.
CREATE OR REPLACE PROCEDURE data_versioning_nonull(new_table_name VARCHAR(100),column_name VARCHAR(100)[], current_table_name VARCHAR(100))
language plpgsql
as $$
BEGIN
EXECUTE ('CREATE TABLE ' || quote_ident(new_table_name) || ' AS SELECT ' || quote_ident(column_name) || ' FROM ' || quote_ident(current_table_name));
END $$;
CALL data_versioning_nonull('sales_2019_sample', ['orderid', 'product', 'address'], 'sales_2019');
Using execute format() lets you replace all the quote_ident() with %I placeholders in a single text instead of a series of concatenated snippets. %1$I lets you re-use the first argument.
It's best if you use ARRAY['a','b','c']::VARCHAR(100)[] to explicitly make it an array of your desired type. '{"a","b","c"}'::VARCHAR(100)[] works too.
You'll need to convert the array into a list of columns some other way, because when cast to text, it'll get curly braces which are not allowed in the column list syntax. Demo
It's not a good practice to introduce random limitations - PostgreSQL doesn't limit identifier lengths to 100 characters, so you don't have to either. The default limit is 63 bytes, so you can go way, way longer than 100 characters (demo). You can switch that data type to a regular text. Interestingly, exceeding specified varchar length would just convert it to unlimited varchar, making it just syntax noise.
DBFiddle online demo
CREATE TABLE sales_2019(orderid INT,product INT,address INT);
CREATE OR REPLACE PROCEDURE data_versioning_nonull(
new_table_name TEXT,
column_names TEXT[],
current_table_name TEXT)
LANGUAGE plpgsql AS $$
DECLARE
list_of_columns_as_quoted_identifiers TEXT;
BEGIN
SELECT string_agg(quote_ident(name),',')
INTO list_of_columns_as_quoted_identifiers
FROM unnest(column_names) name;
EXECUTE format('CREATE TABLE %1$I.%2$I AS SELECT %3$s FROM %1$I.%4$I',
current_schema(),
new_table_name,
list_of_columns_as_quoted_identifiers,
current_table_name);
END $$;
CALL data_versioning_nonull(
'sales_2019_sample',
ARRAY['orderid', 'product', 'address']::text[],
'sales_2019');
Schema awareness: currently the procedure creates the new table in the default schema, based on a table in that same default schema - above I made it explicit, but that's what it would do without the current_schema() calls anyway. You could add new_table_schema and current_table_schema parameters and if most of the time you don't expect them to be used, you can hide them behind procedure overloads for convenience, using current_schema() to keep the implicit behaviour. Demo
First, change your stored procedure to convert selected columns from array to csv like this.
CREATE OR REPLACE PROCEDURE data_versioning_nonull(new_table_name VARCHAR(100),column_name VARCHAR(100)[], current_table_name VARCHAR(100))
language plpgsql
as $$
BEGIN
EXECUTE ('CREATE TABLE ' || quote_ident(new_table_name) || ' AS SELECT ' || array_to_string(column_name, ',') || ' FROM ' || quote_ident(current_table_name));
END $$;
Then call it as:
CALL data_versioning_nonull('sales_2019_sample', '{"orderid", "product", "address"}', 'sales_2019');
I have a PL/pgSQL function that takes table name as dynamic parameter. As I am updating an existing query to take table name as dynamic parameter, this is what I have as my function:
DECLARE rec RECORD;
BEGIN
EXECUTE 'insert into stat_300_8_0(ts, target, data)
select distinct timestamp-(timestamp%3600) as wide_row_ts,
target, array[]::real[] as data
from ' || temp_table_name || ' as temp
where class_id=8
and subclass_id=0
and not exists (select ts from stat_300_8_0
where ts=temp.timestamp-(temp.timestamp%3600)
and target=temp.target)';
FOR rec IN EXECUTE 'SELECT DISTINCT timestamp AS ts
FROM ' || temp_table_name ||
' WHERE class_id=8'
LOOP
EXECUTE 'update stat_300_8_0 as disk_table
set data[new_data.start_idx:new_data.end_idx] = array[data_0,data_1]
from (select timestamp-(timestamp%3600) as wide_row_ts,
(timestamp%3600)/300 * 2 + 1 as start_idx,
((timestamp%3600 / 300) + 1) * 2 as end_idx,
target, data_0, data_1
from ' || temp_table_name ||
' where class_id=8 and subclass_id=0
and timestamp=rec.ts) as new_data
where disk_table.ts=new_data.wide_row_ts
and disk_table.target=new_data.target';
END LOOP;
END;
However, when this function is executed I get an error saying
ERROR: missing FROM-clause entry for table "rec"
However, rec is declared in the first line of the above code. I am not able to figure what is wrong with my queries. Any help is appreciated.
Supplemental to Eelke's answer:
Assuming temp_table_name is an argument, you really, really want to run it through quote_ident() because otherwise someone could create a table with a name that could inject sql into your function.
Instead of the change suggested there, you are better off using EXECUTE...USING since that gives you parameterization regarding values (and hence protection against SQL injection). You would change rec.ts to $1 and then add to the end USING ts.rec (outside the quoted execute string). This gives you a parameterized statement inside your execute which is safer. However parameters cannot include table names, so it doesn't spare you from the first point above.
I implemented this function in my Postgres database: http://www.cureffi.org/2013/03/19/automatically-creating-pivot-table-column-names-in-postgresql/
Here's the function:
create or replace function xtab (tablename varchar, rowc varchar, colc varchar, cellc varchar, celldatatype varchar) returns varchar language plpgsql as $$
declare
dynsql1 varchar;
dynsql2 varchar;
columnlist varchar;
begin
-- 1. retrieve list of column names.
dynsql1 = 'select string_agg(distinct '||colc||'||'' '||celldatatype||''','','' order by '||colc||'||'' '||celldatatype||''') from '||tablename||';';
execute dynsql1 into columnlist;
-- 2. set up the crosstab query
dynsql2 = 'select * from crosstab (
''select '||rowc||','||colc||','||cellc||' from '||tablename||' group by 1,2 order by 1,2'',
''select distinct '||colc||' from '||tablename||' order by 1''
)
as ct (
'||rowc||' varchar,'||columnlist||'
);';
return dynsql2;
end
$$;
So now I can call the function:
select xtab('globalpayments','month','currency','(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)','text');
Which returns (because the return type of the function is varchar):
select * from crosstab (
'select month,currency,(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)
from globalpayments
group by 1,2
order by 1,2'
, 'select distinct currency
from globalpayments
order by 1'
) as ct ( month varchar,CAD text,EUR text,GBP text,USD text );
How can I get this function to not only generate the code for the dynamic crosstab, but also execute the result? I.e., the result when I manually copy/paste/execute is this. But I want it to execute without that extra step: the function shall assemble the dynamic query and execute it:
Edit 1
This function comes close, but I need it to return more than just the first column of the first record
Taken from: Are there any way to execute a query inside the string value (like eval) in PostgreSQL?
create or replace function eval( sql text ) returns text as $$
declare
as_txt text;
begin
if sql is null then return null ; end if ;
execute sql into as_txt ;
return as_txt ;
end;
$$ language plpgsql
usage: select * from eval($$select * from analytics limit 1$$)
However it just returns the first column of the first record :
eval
----
2015
when the actual result looks like this:
Year, Month, Date, TPV_USD
---- ----- ------ --------
2016, 3, 2016-03-31, 100000
What you ask for is impossible. SQL is a strictly typed language. PostgreSQL functions need to declare a return type (RETURNS ..) at the time of creation.
A limited way around this is with polymorphic functions. If you can provide the return type at the time of the function call. But that's not evident from your question.
Refactor a PL/pgSQL function to return the output of various SELECT queries
You can return a completely dynamic result with anonymous records. But then you are required to provide a column definition list with every call. And how do you know about the returned columns? Catch 22.
There are various workarounds, depending on what you need or can work with. Since all your data columns seem to share the same data type, I suggest to return an array: text[]. Or you could return a document type like hstore or json. Related:
Dynamic alternative to pivot with CASE and GROUP BY
Dynamically convert hstore keys into columns for an unknown set of keys
But it might be simpler to just use two calls: 1: Let Postgres build the query. 2: Execute and retrieve returned rows.
Selecting multiple max() values using a single SQL statement
I would not use the function from Eric Minikel as presented in your question at all. It is not safe against SQL injection by way of maliciously malformed identifiers. Use format() to build query strings unless you are running an outdated version older than Postgres 9.1.
A shorter and cleaner implementation could look like this:
CREATE OR REPLACE FUNCTION xtab(_tbl regclass, _row text, _cat text
, _expr text -- still vulnerable to SQL injection!
, _type regtype)
RETURNS text
LANGUAGE plpgsql AS
$func$
DECLARE
_cat_list text;
_col_list text;
BEGIN
-- generate categories for xtab param and col definition list
EXECUTE format(
$$SELECT string_agg(quote_literal(x.cat), '), (')
, string_agg(quote_ident (x.cat), %L)
FROM (SELECT DISTINCT %I AS cat FROM %s ORDER BY 1) x$$
, ' ' || _type || ', ', _cat, _tbl)
INTO _cat_list, _col_list;
-- generate query string
RETURN format(
'SELECT * FROM crosstab(
$q$SELECT %I, %I, %s
FROM %I
GROUP BY 1, 2 -- only works if the 3rd column is an aggregate expression
ORDER BY 1, 2$q$
, $c$VALUES (%5$s)$c$
) ct(%1$I text, %6$s %7$s)'
, _row, _cat, _expr -- expr must be an aggregate expression!
, _tbl, _cat_list, _col_list, _type);
END
$func$;
Same function call as your original version. The function crosstab() is provided by the additional module tablefunc which has to be installed. Basics:
PostgreSQL Crosstab Query
This handles column and table names safely. Note the use of object identifier types regclass and regtype. Also works for schema-qualified names.
Table name as a PostgreSQL function parameter
However, it is not completely safe while you pass a string to be executed as expression (_expr - cellc in your original query). This kind of input is inherently unsafe against SQL injection and should never be exposed to the general public.
SQL injection in Postgres functions vs prepared queries
Scans the table only once for both lists of categories and should be a bit faster.
Still can't return completely dynamic row types since that's strictly not possible.
Not quite impossible, you can still execute it (from a query execute the string and return SETOF RECORD.
Then you have to specify the return record format. The reason in this case is that the planner needs to know the return format before it can make certain decisions (materialization comes to mind).
So in this case you would EXECUTE the query, return the rows and return SETOF RECORD.
For example, we could do something like this with a wrapper function but the same logic could be folded into your function:
CREATE OR REPLACE FUNCTION crosstab_wrapper
(tablename varchar, rowc varchar, colc varchar,
cellc varchar, celldatatype varchar)
returns setof record language plpgsql as $$
DECLARE outrow record;
BEGIN
FOR outrow IN EXECUTE xtab($1, $2, $3, $4, $5)
LOOP
RETURN NEXT outrow
END LOOP;
END;
$$;
Then you supply the record structure on calling the function just like you do with crosstab.
Then when you all the query you would have to supply a record structure (as (col1 type, col2 type, etc) like you do with connectby.
I have created the following stored procedure, which basically receives a name of table, and a prefix. The function then finds all columns that share this prefix and returns as an output a 'select' query command ('myoneliner').
as follows:
CREATE OR REPLACE FUNCTION mytext (mytable text, myprefix text)
RETURNS text AS $myoneliner$
declare
myoneliner text;
BEGIN
SELECT 'SELECT ' || substr(cols,2,length(cols)-2) ||' FROM '||mytable
INTO myoneliner
FROM (
SELECT array(
SELECT DISTINCT quote_ident(column_name::text)
FROM information_schema.columns
WHERE table_name = mytable
AND column_name LIKE myprefix||'%'
order by quote_ident
)::text cols
) sub;
RETURN myoneliner;
END;
$myoneliner$ LANGUAGE plpgsql;
Call:
select mytext('dkj_p_k27ac','enri');
As a result of running this stored procedure and the 'select' that is following it, I get the following output at the Data Output window (all within one cell, named "mytext text"):
'SELECT enrich_d_dkj_p_k27ac,enrich_lr_dkj_p_k27ac,enrich_r_dkj_p_k27ac
FROM dkj_p_k27ac'
I would like to basically be able to take the output command line that I received as an output and execute it. In other words, I would like to be able and execute the output of my stored procedure.
How can I do so?
I tried the following:
CREATE OR REPLACE FUNCTION mytext (mytable text, myprefix text)
RETURNS SETOF RECORD AS $$
declare
smalltext text;
myoneliner text;
BEGIN
SELECT 'SELECT ' || substr(cols,2,length(cols)-2) ||' FROM '||mytable
INTO myoneliner
FROM (
SELECT array(
SELECT DISTINCT quote_ident(column_name::text)
FROM information_schema.columns
WHERE table_name = mytable
AND column_name LIKE myprefix||'%'
order by quote_ident
)::text cols
) sub;
smalltext=lower(myoneliner);
raise notice '%','my additional text '||smalltext;
RETURN QUERY EXECUTE smalltext;
END;
$$ LANGUAGE plpgsql;
Call function:
SELECT * from mytext('dkj_p_k27ac','enri');
But I'm getting the following error message, could you please advise what should I change in order for it to execute?:
ERROR: a column definition list is required for functions returning "record"
LINE 26: SELECT * from mytext('dkj_p_k27ac','enri');
********** Error **********
ERROR: a column definition list is required for functions returning "record"
SQL state: 42601
Character: 728
Your first problem was solved by using dynamic SQL with EXECUTE like Craig advised.
But the rabbit hole goes deeper:
CREATE OR REPLACE FUNCTION myresult(mytable text, myprefix text)
RETURNS SETOF RECORD AS
$func$
DECLARE
smalltext text;
myoneliner text;
BEGIN
SELECT INTO myoneliner
'SELECT '
|| string_agg(quote_ident(column_name::text), ',' ORDER BY column_name)
|| ' FROM ' || quote_ident(mytable)
FROM information_schema.columns
WHERE table_name = mytable
AND column_name LIKE myprefix||'%'
AND table_schema = 'public'; -- schema name; might be another param
smalltext := lower(myoneliner); -- nonsense
RAISE NOTICE 'My additional text: %', myoneliner;
RETURN QUERY EXECUTE myoneliner;
END
$func$ LANGUAGE plpgsql;
Major points
Don't cast the whole statement to lower case. Column names might be double-quoted with upper case letters, which are case-sensitive in this case (no pun intended).
You don't need DISTINCT in the query on information_schema.columns. Column names are unique per table.
You do need to specify the schema, though (or use another way to single out one schema), or you might be mixing column names from multiple tables of the same name in multiple schemas, resulting in nonsense.
You must sanitize all identifiers in dynamic code - including table names: quote_ident(mytable). Be aware that your text parameter to the function is case sensitive! The query on information_schema.columns requires that, too.
I untangled your whole construct to build the list of column names with string_agg() instead of the array constructor. Related answer:
Update multiple columns that start with a specific string
The assignment operator in plpgsql is :=.
Simplified syntax of RAISE NOTICE.
Core problem impossible to solve
All of this still doesn't solve your main problem: SQL demands a definition of the columns to be returned. You can circumvent this by returning anonymous records like you tried. But that's just postponing the inevitable. Now you have to provide a column definition list at call time, just like your error message tells you. But you just don't know which columns are going to be returned. Catch 22.
Your call would work like this:
SELECT *
FROM myresult('dkj_p_k27ac','enri') AS f (
enrich_d_dkj_p_k27ac text -- replace with actual column types
, enrich_lr_dkj_p_k27ac text
, enrich_r_dkj_p_k27ac text);
But you don't know number, names (optional) and data types of returned columns, not at creation time of the function and not even at call time. It's impossible to do exactly that in a single call. You need two separate queries to the database.
You could return all columns of any given table dynamically with a function using polymorphic types, because there is a well defined type for the whole table. Last chapter of this related answer:
Refactor a PL/pgSQL function to return the output of various SELECT queries
FYI: I am completely new to using cursors...
So I have one function that is a cursor:
CREATE FUNCTION get_all_product_promos(refcursor, cursor_object_id integer) RETURNS refcursor AS '
BEGIN
OPEN $1 FOR SELECT *
FROM promos prom1
JOIN promo_objects ON (prom1.promo_id = promo_objects.promotion_id)
WHERE prom1.active = true AND now() BETWEEN prom1.start_date AND prom1.end_date
AND promo_objects.object_id = cursor_object_id
UNION
SELECT prom2.promo_id
FROM promos prom2
JOIN promo_buy_objects ON (prom2.promo_id =
promo_buy_objects.promo_id)
LEFT JOIN promo_get_objects ON prom2.promo_id = promo_get_objects.promo_id
WHERE (prom2.buy_quantity IS NOT NULL OR prom2.buy_quantity > 0) AND
prom2.active = true AND now() BETWEEN prom2.start_date AND
prom2.end_date AND promo_buy_objects.object_id = cursor_object_id;
RETURN $1;
END;
' LANGUAGE plpgsql;
SO then in another function I call it and need to process it:
...
--Get the promotions from the cursor
SELECT get_all_product_promos('promo_cursor', this_object_id)
updated := FALSE;
IF FOUND THEN
--Then loop through your results
LOOP
FETCH promo_cursor into this_promotion
--Preform comparison logic -this is necessary as this logic is used in other contexts from other functions
SELECT * INTO best_promo_results FROM get_best_product_promos(this_promotion, this_object_id, get_free_promotion, get_free_promotion_value, current_promotion_value, current_promotion);
...
SO the idea here is to select from the cursor, loop using fetch (next is assumed correct?) and put the record fetched into this_promotion. Then send the record in this_promotion to another function. I can't figure out what to declare the type of this_promotion in get_best_product_promos. Here is what I have:
CREATE OR REPLACE FUNCTION get_best_product_promos(this_promotion record, this_object_id integer, get_free_promotion integer, get_free_promotion_value numeric(10,2), current_promotion_value numeric(10,2), current_promotion integer)
RETURNS...
It tells me: ERROR: plpgsql functions cannot take type record
OK first I tried:
CREATE OR REPLACE FUNCTION get_best_product_promos(this_promotion get_all_product_promos, this_object_id integer, get_free_promotion integer, get_free_promotion_value numeric(10,2), current_promotion_value numeric(10,2), current_promotion integer)
RETURNS...
Because I saw some syntax in the Postgres docs showed a function being created w/ a input parameter that had a type 'tablename' this works, but it has to be a tablename not a function :( I know I am so close, I was told to use cursors to pass records around. So I studied up. Please help.
One possibility would be to define the query you have in get_all_product_promos as a all_product_promos view instead. Then you would automatically have the "all_product_promos%rowtype" type to pass between functions.
That is, something like:
CREATE VIEW all_product_promos AS
SELECT promo_objects.object_id, prom1.*
FROM promos prom1
JOIN promo_objects ON (prom1.promo_id = promo_objects.promotion_id)
WHERE prom1.active = true AND now() BETWEEN prom1.start_date AND prom1.end_date
UNION ALL
SELECT promo_buy_objects.object_id, prom2.*
FROM promos prom2
JOIN promo_buy_objects ON (prom2.promo_id = promo_buy_objects.promo_id)
LEFT JOIN promo_get_objects ON prom2.promo_id = promo_get_objects.promo_id
WHERE (prom2.buy_quantity IS NOT NULL OR prom2.buy_quantity > 0)
AND prom2.active = true
AND now() BETWEEN prom2.start_date AND prom2.end_date
You should be able to verify using EXPLAIN that querying SELECT * FROM all_product_promos WHERE object_id = ? takes the object_id parameter into the two subqueries rather than filtering afterwards. Then from another function you can write:
DECLARE
this_promotion all_product_promos%ROWTYPE;
BEGIN
FOR this_promotion IN
SELECT * FROM all_product_promos WHERE object_id = this_object_id
LOOP
-- deal with promotion in this_promotion
END LOOP;
END
TBH I would avoid using cursors to pass records around in PLPGSQL. In fact, I would avoid using cursors in PLPGSQL full stop- unless you need to pass a whole resultset to another function for some reason. This method of simply looping through the statement is much simpler, with the caveat that the entire resultset is materialised into memory first.
The other disadvantage with this approach is that if you need to add a column to all_product_promos, you will need to recreate all the functions that depend on it, since you can't add columns to a view with 'alter view'. AFAICT this affects named types create with CREATE TYPE too, since ALTER TYPE doesn't seem to allow you to add columns to a type either.
So, you can specify a record format using "CREATE TYPE" to pass between functions. Any relation automagically specifies a type called <relation>%ROWTYPE which you can also use.
The answer:
Select specific fields in the cursor function instead of *
then:
CREATE TYPE get_all_product_promos as (buy_quantity integer, discount_amount numeric(10,2), get_quantity integer, discount_type integer, promo_id integer);
Then I can say:
CREATE OR REPLACE FUNCTION get_best_product_promos(this_promotion get_all_product_promos, this_object_id integer, get_free_promotion integer, get_free_promotion_value numeric(10,2), current_promotion_value numeric(10,2), current_promotion integer)
RETURNS...