Access column using variable instead of explicit column name - postgresql

I would like to access a column by using a variable instead of a static column name.
Example:
variable := 'customer';
SELECT table.variable (this is what I would prefer) instead of table.customer
I need this functionality because records in my table vary in data length (e.g. some have data in 10 columns, some in 14 or 16 columns, etc.), so I need to address columns dynamically. As I understand it, I can't address columns by their index (e.g. select the 8th column of the table), right?
I can loop and put the desired column name in a variable for the given iteration. However, I get errors when I try to access a column using that variable (e.g. table_name.variable is not working).
For the sake of simplicity, I paste just some dummy code to illustrate the issue:
CREATE OR REPLACE FUNCTION dynamic_column_name() returns text
LANGUAGE PLPGSQL
AS $$
DECLARE
  col_name text;
  return_value text;
BEGIN
  create table customer (
    id bigint,
    name varchar
  );
  INSERT INTO customer VALUES(1, 'Adam');
  col_name := 'name';
  -- SELECT customer.name INTO return_value FROM customer WHERE id = 1; -- WORKING, returns 'Adam' but it is not DYNAMIC.
  -- SELECT customer.col_name INTO return_value FROM customer WHERE id = 1; -- ERROR: column customer.col_name does not exist
  -- SELECT 'customer.'||col_name INTO return_value FROM customer WHERE id = 1; -- NOT working, returns 'customer.name'
  -- SELECT customer||'.'||col_name INTO return_value FROM customer WHERE id = 1; -- NOT working, returns whole record + .name, i.e.: (1,Adam).name
  DROP TABLE customer;
  RETURN return_value;
END;
$$;
SELECT dynamic_column_name();
So how to obtain 'Adam' string with SQL query using col_name variable when addressing column of customer table?

SQL does not allow identifiers (including column names) or syntax elements to be parameterized. Only values can be parameters.
You need dynamic SQL for that: basically, build the SQL string and then execute it. Use EXECUTE in a plpgsql function. There are multiple syntax variants. For your simple example:
CREATE OR REPLACE FUNCTION dynamic_column_name(_col_name text, OUT return_value text)
RETURNS text
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE format('SELECT %I FROM customer WHERE id = 1', _col_name)
INTO return_value;
END
$func$;
Call:
SELECT dynamic_column_name('name');
db<>fiddle here
Data types have to be compatible, of course.
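If the column is not already text, a cast inside the dynamic statement keeps the INTO target happy. A minimal variant of the EXECUTE above (my addition, same table and names as the example):
-- cast whatever column was requested to text so it fits the text OUT parameter
EXECUTE format('SELECT %I::text FROM customer WHERE id = 1', _col_name)
INTO return_value;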
More examples:
How to use text input as column name(s) in a Postgres function?
https://stackoverflow.com/search?q=%5Bpostgres%5D+%5Bdynamic-sql%5D+parameter+column+code%3AEXECUTE

Related

Call postgresql record's field by name

I have a function that uses a RECORD to temporarily store the data. I can use it - it's fine. My problem is that I can't hardcode the columns I need to get from the RECORD. I must do it dynamically. Something like:
DECLARE
  r1 RECORD;
  r2 RECORD;
BEGIN
  for r1 in Select column_name
            from columns_to_process
            where process_now = True
  loop
    for r2 in Select *
              from my_data_table
              where whatever
    loop
      -----------------------------
      -- here I must call a column by its name, which is unknown at design time
      -----------------------------
      ... do something with
      r2.(r1.column_name)
    end loop;
  end loop;
END;
Does anyone know how to do it?
best regards
M
There is no need to select all the qualifying rows and compute the total in a loop. In fact, when working with SQL, try to drop the word loop from your vocabulary; just use sum(column_name) in the select instead. The issue here is that you do not know which column to sum when the query is written, and all structural components (table names, column names, operators, etc.) must be known before the statement is submitted. You cannot use a variable for a structural component - in this case a column name. To do that you must use dynamic SQL, i.e. a SQL statement built by the process at run time. The following accomplishes that (see example here):
create or replace function sum_something(
    the_something text          -- column name
  , for_id my_table.id%type     -- my_table.id
)
returns numeric
language plpgsql
as $$
declare
  k_query_base constant text :=
      $STMT$ Select sum(%I) from my_table where id = %s; $STMT$;
  l_query text;
  l_sum numeric;
begin
  l_query := format(k_query_base, the_something, for_id);
  raise notice E'Running statement:\n %', l_query;  -- for prod, raise LOG instead
  execute l_query into l_sum;
  return l_sum;
end;
$$;
Well, after some time I figured out that I could use a temporary table instead of a RECORD. Doing so gives me all the advantages of dynamic queries, so I can call any column by its name.
DECLARE
  _my_var bigint;
  _col_name text := '_any';   -- the column name, e.g. coming from r1.column_name above
BEGIN
  create temporary table _my_temp_table as
    Select _any, _column, _you, _need
    from _my_table
    where whatever = something;
  execute 'Select ' || _col_name || ' from _my_temp_table' into _my_var;
  -- ... do whatever
END;
However, I still believe there should be a way to call a record's field by its name.
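For what it's worth, on Postgres 9.5 or later you can read a record's field by a name held in a variable by going through jsonb (illustration only, not from the original answers; _value is a made-up variable):
DECLARE
  r1 RECORD;
  r2 RECORD;
  _value text;
BEGIN
  for r1 in Select column_name from columns_to_process where process_now = True
  loop
    for r2 in Select * from my_data_table
    loop
      _value := to_jsonb(r2) ->> r1.column_name;  -- NULL if the record has no such column
      -- ... do something with _value
    end loop;
  end loop;
END;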

Postgres stored procedure/function

I'm new to stored procedures. I have a requirement where I need to execute multiple queries inside a stored procedure and return the results. I would like to know whether this is possible or not.
Ex :
Query 1 returns a list of userids:
Select userid from user where username = ?
For each userid from the above query, I need to execute three different queries like
Query 2 select session_details from session where userid = ?
Query 3 select location from location where userid = ?
The return value should be a collection of userid, session_details and location.
Is this possible,can you provide some hints?
You can loop through query results like so:
FOR id IN Select userid from "user" where username = _username  -- _username: the function's parameter
LOOP
  ...
END LOOP;
As @Fahad Anjum says in his comment, it's better if you can do it in a query. But if that's not possible, you have three possibilities to achieve what you want.
SETOF
TABLE
refcursor
1. SETOF
You can return a set of values. The set can be an existing table, a temporary table, or a TYPE you define.
TYPE example:
-- In your case the type could be (userid integer, session integer, location text)
CREATE TYPE tester AS (id integer);
-- The pl returns a SETOF the created type.
CREATE OR REPLACE FUNCTION test() RETURNS SETOF tester
AS $$
BEGIN
RETURN QUERY SELECT generate_series(1, 3) as id;
END;
$$ LANGUAGE plpgsql;
-- Then, you get the set by selecting the PL as if it were a table.
SELECT * FROM test();
Table and Temp Table examples:
-- Create a temporary table or a regular table:
CREATE TEMP TABLE test_table(id integer);
-- or CREATE TABLE test_table(id integer);
-- or use an existing table in your schema(s);
-- The pl returns a SETOF the table you need
CREATE OR REPLACE FUNCTION test() RETURNS SETOF test_table
AS $$
BEGIN
RETURN QUERY SELECT generate_series(1, 3) as id;
END;
$$ LANGUAGE plpgsql;
-- Then, you get the set by selecting the PL as if it were a table.
SELECT * FROM test();
-- NOTE: Since you are only returning a SETOF the table,
-- you don't insert any data into the table.
-- So, if you select the 'temp' table you won't see any changes.
SELECT * FROM test_table;
2. TABLE
A PL can return a table; this is similar to creating a temporary table and then returning a SETOF, but in this case you declare the 'temp' table in the RETURNS clause of the PL.
-- Next to TABLE you define the columns of the table the PL will return
CREATE OR REPLACE FUNCTION test() RETURNS TABLE (id integer)
AS $$
BEGIN
RETURN QUERY SELECT generate_series(1, 3) as id;
END;
$$ LANGUAGE plpgsql;
-- As the other examples, you select the PL to get the data.
SELECT * FROM test();
3. refcursor
This one is the most complex solution. You return a cursor, not the actual data. If you need 'dynamic' values for your returned set, this is the solution.
But since you need static data, you won't need this option.
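For completeness, a minimal refcursor sketch could look like this (illustration only; test_cursor and my_cursor are made-up names, and the cursor has to be fetched inside the same transaction):
CREATE OR REPLACE FUNCTION test_cursor() RETURNS refcursor
AS $$
DECLARE
  c refcursor := 'my_cursor';  -- name the portal so the caller can FETCH from it
BEGIN
  OPEN c FOR SELECT generate_series(1, 3) AS id;
  RETURN c;
END;
$$ LANGUAGE plpgsql;
-- Usage, inside one transaction:
-- BEGIN;
-- SELECT test_cursor();
-- FETCH ALL IN my_cursor;
-- COMMIT;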
Which of these approaches to use depends on the specific case: if you regularly use userid, session, location in different ways and PLs, it would be better to use the SETOF with a type.
If you have a table that already has the userid, session, location columns, it's better to return a SETOF of that table.
If you just use userid, session, location for one case, then it would be better to use a 'RETURNS TABLE' approach.
If you need to return a dynamic set you would have to use cursors... but that solution is really more advanced.
Based solely on your example, here's probably the easiest way to do it:
CREATE FUNCTION my_func(user_id INTEGER)
RETURNS TABLE (userid INTEGER, session INTEGER, location TEXT) AS
$$
SELECT u.userid, s.session, l.location
FROM -- etc... your query here
$$
LANGUAGE SQL STABLE;
Addressing comment:
That's a bit of a different question. One question is how to return multiple records containing multiple fields in a stored procedure. One way is as above.
The other question is how to write a query that gets data from multiple tables. Again, there are many ways to do it. One way is (again, based on my interpretation of your requirements in the examples):
SELECT userid
     , ARRAY(SELECT session_details FROM session s WHERE s.userid = u.userid)
     , ARRAY(SELECT l.location FROM location l WHERE l.userid = u.userid)
FROM "user" u
WHERE username = user_name;
This will return one record containing the user_id, an array of session_details for that user, and an array of locations for that user.
Then the function can be changed to:
CREATE FUNCTION my_func(user_name TEXT, OUT userid INTEGER, OUT session_details TEXT[], OUT locations TEXT[])
AS $$
SELECT userid
, ARRAY(SELECT session_details FROM session s WHERE s.userid = u.userid)
, ARRAY(SELECT l.location FROM location l WHERE l.userid = u.userid)
FROM "user" u
WHERE username = user_name;
$$ LANGUAGE SQL STABLE;

Function to return dynamic set of columns for given table

I have a fields table to store column information for other tables:
CREATE TABLE public.fields (
schema_name varchar(100),
table_name varchar(100),
column_text varchar(100),
column_name varchar(100),
column_type varchar(100) default 'varchar(100)',
column_visible boolean
);
And I'd like to create a function to fetch data for a specific table.
I just tried something like this:
create or replace function public.get_table(schema_name text,
table_name text,
active boolean default true)
returns setof record as $$
declare
entity_name text default schema_name || '.' || table_name;
r record;
begin
for r in EXECUTE 'select * from ' || entity_name loop
return next r;
end loop;
return;
end
$$
language plpgsql;
With this function I have to specify columns when I call it!
select * from public.get_table('public', 'users') as dept(id int, uname text);
I want to pass schema_name and table_name as parameters to function and get record list, according to column_visible field in public.fields table.
Solution for the simple case
As explained in the referenced answers below, you can use registered (row) types, and thus implicitly declare the return type of a polymorphic function:
CREATE OR REPLACE FUNCTION public.get_table(_tbl_type anyelement)
RETURNS SETOF anyelement AS
$func$
BEGIN
RETURN QUERY EXECUTE format('TABLE %s', pg_typeof(_tbl_type));
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM public.get_table(NULL::public.users); -- note the syntax!
Returns the complete table (with all user columns).
Wait! How?
Detailed explanation in this related answer, chapter
"Various complete table types":
Refactor a PL/pgSQL function to return the output of various SELECT queries
TABLE foo is just short for SELECT * FROM foo:
Is there a shortcut for SELECT * FROM?
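For example, both of these return the same rows:
TABLE public.users;
SELECT * FROM public.users;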
2 steps for completely dynamic return type
But what you are trying to do is strictly impossible in a single SQL command.
I want to pass schema_name and table_name as parameters to function and get record list, according to column_visible field in
public.fields table.
There is no direct way to return an arbitrary selection of columns (return type not known at call time) from a function - or any SQL command. SQL demands to know number, names and types of resulting columns at call time. More in the 2nd chapter of this related answer:
How do I generate a pivoted CROSS JOIN where the resulting table definition is unknown?
There are various workarounds. You could wrap the result in one of the standard document types (json, jsonb, hstore, xml).
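A sketch of the document-type route (illustration only; needs Postgres 9.5+ for to_jsonb(); the function name is made up): execute any query and hand each row back as a single jsonb value, so the declared return type no longer depends on the selected columns.
CREATE OR REPLACE FUNCTION public.get_table_jsonb(_sql text)
  RETURNS SETOF jsonb
  LANGUAGE plpgsql AS
$func$
DECLARE
  _rec record;
BEGIN
  FOR _rec IN EXECUTE _sql LOOP
    RETURN NEXT to_jsonb(_rec);  -- column names become jsonb keys
  END LOOP;
END
$func$;
-- SELECT * FROM public.get_table_jsonb('SELECT * FROM public.users');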
Or you generate the query with one function call and execute the result with the next:
CREATE OR REPLACE FUNCTION public.generate_get_table(_schema_name text, _table_name text)
RETURNS text AS
$func$
SELECT format('SELECT %s FROM %I.%I'
, string_agg(quote_ident(column_name), ', ')
, schema_name
, table_name)
FROM fields
WHERE column_visible
AND schema_name = _schema_name
AND table_name = _table_name
GROUP BY schema_name, table_name
ORDER BY schema_name, table_name;
$func$ LANGUAGE sql;
Call:
SELECT public.generate_get_table('public', 'users');
This creates a query of the form:
SELECT usr_id, usr FROM public.users;
Execute it in the 2nd step. (You might want to add column numbers and order columns.)
Or append \gexec in psql to execute the return value immediately. See:
How to force evaluation of subquery before joining / pushing down to foreign server
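For example, with the generator function above, in psql:
SELECT public.generate_get_table('public', 'users')\gexec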
Be sure to defend against SQL injection:
INSERT with dynamic table name in trigger function
Define table and column names as arguments in a plpgsql function?
Asides
varchar(100) does not make much sense for identifiers, which are limited to 63 characters in standard Postgres:
Maximum characters in labels (table names, columns etc)
If you understand how the object identifier type regclass works, you might replace schema and table name with a single regclass column.
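A possible shape of the fields table along those lines (a sketch, not a full replacement):
CREATE TABLE public.fields (
  tbl            regclass,  -- replaces schema_name + table_name, e.g. 'public.users'::regclass
  column_text    text,
  column_name    text,
  column_type    text DEFAULT 'varchar(100)',
  column_visible boolean
);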
I think you just need another query to get the list of columns you want.
Maybe something like (this is untested):
create or replace function public.get_table(_schema_name text, _table_name text, active boolean default true) returns setof record as $$
declare
  entity_name text default _schema_name || '.' || _table_name;
  r record;
  columns varchar;
begin
  -- Get the list of columns
  SELECT string_agg(column_name, ', ')
  INTO columns
  FROM public.fields
  WHERE fields.schema_name = _schema_name
    AND fields.table_name = _table_name
    AND fields.column_visible = TRUE;
  -- Return rows from the specified table
  RETURN QUERY EXECUTE 'select ' || columns || ' from ' || entity_name;
  RETURN;
end
$$
language plpgsql;
Keep in mind that column/table references may need to be surrounded by double quotes if they have certain characters in them.
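A hedged variant of the dynamic part above that lets quote_ident() and format() do that quoting (same untested caveat as the function above):
-- aggregate the visible columns with quote_ident() ...
SELECT string_agg(quote_ident(column_name), ', ')
INTO columns
FROM public.fields
WHERE fields.schema_name = _schema_name
  AND fields.table_name = _table_name
  AND fields.column_visible = TRUE;
-- ... and let %I quote schema and table names
RETURN QUERY EXECUTE format('SELECT %s FROM %I.%I', columns, _schema_name, _table_name);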

Execute a dynamic crosstab query

I implemented this function in my Postgres database: http://www.cureffi.org/2013/03/19/automatically-creating-pivot-table-column-names-in-postgresql/
Here's the function:
create or replace function xtab (tablename varchar, rowc varchar, colc varchar, cellc varchar, celldatatype varchar) returns varchar language plpgsql as $$
declare
dynsql1 varchar;
dynsql2 varchar;
columnlist varchar;
begin
-- 1. retrieve list of column names.
dynsql1 = 'select string_agg(distinct '||colc||'||'' '||celldatatype||''','','' order by '||colc||'||'' '||celldatatype||''') from '||tablename||';';
execute dynsql1 into columnlist;
-- 2. set up the crosstab query
dynsql2 = 'select * from crosstab (
''select '||rowc||','||colc||','||cellc||' from '||tablename||' group by 1,2 order by 1,2'',
''select distinct '||colc||' from '||tablename||' order by 1''
)
as ct (
'||rowc||' varchar,'||columnlist||'
);';
return dynsql2;
end
$$;
So now I can call the function:
select xtab('globalpayments','month','currency','(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)','text');
Which returns (because the return type of the function is varchar):
select * from crosstab (
'select month,currency,(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)
from globalpayments
group by 1,2
order by 1,2'
, 'select distinct currency
from globalpayments
order by 1'
) as ct ( month varchar,CAD text,EUR text,GBP text,USD text );
How can I get this function to not only generate the code for the dynamic crosstab, but also execute it? I.e., produce the result I currently get by manually copying, pasting and executing the generated statement, but without that extra step: the function shall assemble the dynamic query and execute it.
Edit 1
This function comes close, but I need it to return more than just the first column of the first record.
Taken from: Are there any way to execute a query inside the string value (like eval) in PostgreSQL?
create or replace function eval( sql text ) returns text as $$
declare
  as_txt text;
begin
  if sql is null then return null; end if;
  execute sql into as_txt;
  return as_txt;
end;
$$ language plpgsql;
usage: select * from eval($$select * from analytics limit 1$$)
However it just returns the first column of the first record:
eval
----
2015
when the actual result looks like this:
 Year | Month |    Date    | TPV_USD
------+-------+------------+---------
 2016 |     3 | 2016-03-31 |  100000
What you ask for is impossible. SQL is a strictly typed language. PostgreSQL functions need to declare a return type (RETURNS ..) at the time of creation.
A limited way around this is with polymorphic functions, if you can provide the return type at the time of the function call. But that's not evident from your question.
Refactor a PL/pgSQL function to return the output of various SELECT queries
You can return a completely dynamic result with anonymous records. But then you are required to provide a column definition list with every call. And how do you know about the returned columns? Catch 22.
There are various workarounds, depending on what you need or can work with. Since all your data columns seem to share the same data type, I suggest returning an array: text[]. Or you could return a document type like hstore or json. Related:
Dynamic alternative to pivot with CASE and GROUP BY
Dynamically convert hstore keys into columns for an unknown set of keys
But it might be simpler to just use two calls: 1: Let Postgres build the query. 2: Execute and retrieve returned rows.
Selecting multiple max() values using a single SQL statement
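With the xtab() function from the question, those two calls could look like this; in psql, appending \gexec runs the generated statement right away (my suggestion, same call as above):
-- 1st call: build the crosstab statement
SELECT xtab('globalpayments', 'month', 'currency'
          , '(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)', 'text');
-- 2nd call: execute the statement it returned; in psql both steps can be chained:
SELECT xtab('globalpayments', 'month', 'currency'
          , '(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)', 'text')\gexec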
I would not use the function from Eric Minikel as presented in your question at all. It is not safe against SQL injection by way of maliciously malformed identifiers. Use format() to build query strings unless you are running an outdated version older than Postgres 9.1.
A shorter and cleaner implementation could look like this:
CREATE OR REPLACE FUNCTION xtab(_tbl regclass, _row text, _cat text
, _expr text -- still vulnerable to SQL injection!
, _type regtype)
RETURNS text
LANGUAGE plpgsql AS
$func$
DECLARE
_cat_list text;
_col_list text;
BEGIN
-- generate categories for xtab param and col definition list
EXECUTE format(
$$SELECT string_agg(quote_literal(x.cat), '), (')
, string_agg(quote_ident (x.cat), %L)
FROM (SELECT DISTINCT %I AS cat FROM %s ORDER BY 1) x$$
, ' ' || _type || ', ', _cat, _tbl)
INTO _cat_list, _col_list;
-- generate query string
RETURN format(
'SELECT * FROM crosstab(
$q$SELECT %I, %I, %s
FROM %I
GROUP BY 1, 2 -- only works if the 3rd column is an aggregate expression
ORDER BY 1, 2$q$
, $c$VALUES (%5$s)$c$
) ct(%1$I text, %6$s %7$s)'
, _row, _cat, _expr -- expr must be an aggregate expression!
, _tbl, _cat_list, _col_list, _type);
END
$func$;
Same function call as your original version. The function crosstab() is provided by the additional module tablefunc which has to be installed. Basics:
PostgreSQL Crosstab Query
This handles column and table names safely. Note the use of object identifier types regclass and regtype. Also works for schema-qualified names.
Table name as a PostgreSQL function parameter
However, it is not completely safe as long as you pass a string to be executed as an expression (_expr - cellc in your original query). This kind of input is inherently unsafe against SQL injection and should never be exposed to the general public.
SQL injection in Postgres functions vs prepared queries
It scans the table only once for both lists of categories and should be a bit faster.
It still can't return completely dynamic row types, since that's strictly not possible.
Not quite impossible: you can still execute it (from a query, execute the string and return SETOF RECORD).
Then you have to specify the return record format. The reason in this case is that the planner needs to know the return format before it can make certain decisions (materialization comes to mind).
So in this case you would EXECUTE the query, return the rows and return SETOF RECORD.
For example, we could do something like this with a wrapper function but the same logic could be folded into your function:
CREATE OR REPLACE FUNCTION crosstab_wrapper
(tablename varchar, rowc varchar, colc varchar,
cellc varchar, celldatatype varchar)
returns setof record language plpgsql as $$
DECLARE outrow record;
BEGIN
  FOR outrow IN EXECUTE xtab($1, $2, $3, $4, $5)
  LOOP
    RETURN NEXT outrow;
  END LOOP;
END;
$$;
Then you supply the record structure when calling the function, just like you do with crosstab.
So when you call the query you would have to supply a record structure (as (col1 type, col2 type, etc.)), like you do with connectby.
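For the example data above, such a call could look like this (a sketch; the column definition list has to match what the generated crosstab actually returns):
SELECT *
FROM crosstab_wrapper('globalpayments', 'month', 'currency'
        , '(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)', 'text')
  AS ct (month varchar, CAD text, EUR text, GBP text, USD text);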

How to access record type when column names are ambiguous?

I have this function where I store the result of a query in a RECORD type variable.
The problem is that the two tables in my query each have a column with the same name ("description"), and I don't know how I can distinguish these two using my RECORD variable.
CREATE OR REPLACE FUNCTION fn_x(_id BIGINT)
RETURNS TEXT AS $BODY$
DECLARE
  l_row RECORD;
  l_tableADescription TEXT;
  l_tableBDescription TEXT;
BEGIN
  SELECT *
  INTO l_row
  FROM tableA a
  JOIN tableB b ON (a.idA = b.idA)
  WHERE a.idA = _id;
  -- problem is here
  l_tableADescription = l_row.tableA.description;
  l_tableBDescription = l_row.tableB.description;
  -- do other stuff
  RETURN '';
END;
$BODY$
LANGUAGE plpgsql STABLE;
Using AS in the SELECT clause is not an option because these two tables have a large number of columns.
I am using PostgreSQL 9.4.4
If the column names are not unique, I don't know of any way around aliasing the columns when you write the query, i.e.:
SELECT a.description AS t1Desc, b.description AS t2Desc, *
INTO l_row
FROM tableA a
JOIN tableB b ON (a.idA = b.idA);
Then to access them individually from the RECORD variable:
l_tableADescription := l_row.t1Desc;
l_tableBDescription := l_row.t2Desc;
Use the names as specified in the query, not the original column names from the table.
That's what we used in a project where we had two tables in a join with a common name.