postgres lag and window to create cohort table [duplicate]

I am trying to create crosstab queries in PostgreSQL such that the crosstab columns are generated automatically instead of being hardcoded. I have written a function that dynamically generates the column list I need for my crosstab query. The idea is to substitute the result of this function into the crosstab query using dynamic SQL.
I know how to do this easily in SQL Server, but my limited knowledge of PostgreSQL is hindering my progress here. I was thinking of storing the result of the function that generates the dynamic list of columns in a variable, and using that to dynamically build the SQL query. It would be great if someone could guide me on this.
-- Table which is to be pivoted
CREATE TABLE test_db
(
kernel_id int,
key int,
value int
);
INSERT INTO test_db VALUES
(1,1,99),
(1,2,78),
(2,1,66),
(3,1,44),
(3,2,55),
(3,3,89);
-- This function dynamically returns the list of columns for crosstab
CREATE FUNCTION test() RETURNS TEXT AS '
DECLARE
key_id int;
text_op TEXT = '' kernel_id int, '';
BEGIN
FOR key_id IN SELECT DISTINCT key FROM test_db ORDER BY key LOOP
text_op := text_op || key_id || '' int , '' ;
END LOOP;
text_op := text_op || '' DUMMY text'';
RETURN text_op;
END;
' LANGUAGE 'plpgsql';
-- This query works. I just need to convert the static list
-- of crosstab columns to be generated dynamically.
SELECT * FROM
crosstab
(
'SELECT kernel_id, key, value FROM test_db ORDER BY 1,2',
'SELECT DISTINCT key FROM test_db ORDER BY 1'
)
AS x (kernel_id int, key1 int, key2 int, key3 int); -- How can I replace ..
-- .. this static list with a dynamically generated list of columns ?

You can use the provided C function crosstab_hash for this.
The manual is not very clear in this respect. It's mentioned at the end of the chapter on crosstab() with two parameters:
You can create predefined functions to avoid having to write out the
result column names and types in each query. See the examples in the
previous section. The underlying C function for this form of crosstab
is named crosstab_hash.
For your example:
CREATE OR REPLACE FUNCTION f_cross_test_db(text, text)
RETURNS TABLE (kernel_id int, key1 int, key2 int, key3 int)
AS '$libdir/tablefunc','crosstab_hash' LANGUAGE C STABLE STRICT;
Call:
SELECT * FROM f_cross_test_db(
'SELECT kernel_id, key, value FROM test_db ORDER BY 1,2'
,'SELECT DISTINCT key FROM test_db ORDER BY 1');
Note that you need to create a distinct crosstab_hash function for every crosstab function with a different return type.
Related:
PostgreSQL row to columns
Your function to generate the column list is rather convoluted; it can be replaced with this plain SQL query:
SELECT 'kernel_id int, '
|| string_agg(DISTINCT key::text, ' int, ' ORDER BY key::text)
|| ' int, DUMMY text'
FROM test_db;
And it cannot be used dynamically anyway.

@erwin-brandstetter: The return type of the function isn't an issue if you're always returning a JSON type with the converted results.
Here is the function I came up with:
CREATE OR REPLACE FUNCTION report.test(
i_start_date TIMESTAMPTZ,
i_end_date TIMESTAMPTZ,
i_interval INT
) RETURNS TABLE (
tab JSON
) AS $ab$
DECLARE
_key_id TEXT;
_text_op TEXT = '';
_ret JSON;
BEGIN
-- SELECT DISTINCT for query results
FOR _key_id IN
SELECT DISTINCT at_name
FROM report.company_data_date cd
JOIN report.company_data_amount cda ON cd.id = cda.company_data_date_id
JOIN report.amount_types at ON cda.amount_type_id = at.id
WHERE date_start BETWEEN i_start_date AND i_end_date
AND interval_type_id = i_interval
LOOP
-- build function_call with datatype of column
IF char_length(_text_op) > 1 THEN
_text_op := _text_op || ', ' || _key_id || ' NUMERIC(20,2)';
ELSE
_text_op := _text_op || _key_id || ' NUMERIC(20,2)';
END IF;
END LOOP;
-- build query with parameter filters
RETURN QUERY
EXECUTE '
SELECT array_to_json(array_agg(row_to_json(t)))
FROM (
SELECT * FROM crosstab(''SELECT date_start, at.at_name, cda.amount ct
FROM report.company_data_date cd
JOIN report.company_data_amount cda ON cd.id = cda.company_data_date_id
JOIN report.amount_types at ON cda.amount_type_id = at.id
WHERE date_start between $$' || i_start_date::TEXT || '$$ AND $$' || i_end_date::TEXT || '$$
AND interval_type_id = ' || i_interval::TEXT || ' ORDER BY date_start'')
AS ct (date_start timestamptz, ' || _text_op || ')
) t;';
END;
$ab$ LANGUAGE 'plpgsql';
So, when you run it, you get the dynamic results in JSON, and you don't need to know how many values were pivoted:
select * from report.test(now()- '1 week'::interval, now(), 1);
tab
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[{"date_start":"2015-07-27T08:40:01.277556-04:00","burn_rate":0.00,"monthly_revenue":5800.00,"cash_balance":0.00},{"date_start":"2015-07-27T08:50:02.458868-04:00","burn_rate":34000.00,"monthly_revenue":15800.00,"cash_balance":24000.00}]
(1 row)
Edit: If you have mixed datatypes in your crosstab, you can add logic to look it up for each column with something like this:
SELECT a.attname AS column_name, format_type(a.atttypid, a.atttypmod) AS data_type
FROM pg_attribute a
JOIN pg_class b ON a.attrelid = b.oid  -- join on oid, not relfilenode
JOIN pg_catalog.pg_namespace n ON n.oid = b.relnamespace
WHERE n.nspname = $$schema_name$$ AND b.relname = $$table_name$$ AND a.attnum > 0 AND NOT a.attisdropped;

I realise this is an older post, but I struggled for a little while on the same issue.
My problem statement:
I had a table with multiple values in a field and wanted to create a crosstab query with 40+ column headings per row.
My solution was to create a function which looped through the table column to grab the values that I wanted to use as column headings within the crosstab query.
Within this function I could then create the crosstab query. In my use case I added the crosstab result into a separate table.
E.g.
CREATE OR REPLACE FUNCTION field_values_ct ()
RETURNS VOID AS $$
DECLARE rec RECORD;
DECLARE str text;
BEGIN
str := '"Issue ID" text,';
-- looping to get column heading string
FOR rec IN SELECT DISTINCT field_name
FROM issue_fields
ORDER BY field_name
LOOP
str := str || '"' || rec.field_name || '" text' ||',';
END LOOP;
-- strip the trailing comma
str := substring(str, 0, length(str));
EXECUTE 'CREATE EXTENSION IF NOT EXISTS tablefunc;
DROP TABLE IF EXISTS temp_issue_fields;
CREATE TABLE temp_issue_fields AS
SELECT *
FROM crosstab(''select issue_id, field_name, field_value from issue_fields order by 1'',
''SELECT DISTINCT field_name FROM issue_fields ORDER BY 1'')
AS final_result ('|| str ||')';
END;
$$ LANGUAGE plpgsql;

The approach described here worked well for me: instead of having the function return the pivot table directly, let it generate a SQL query string, then execute the resulting query string on demand.

Related

How to get column values in one row as a comma-separated value in a Postgres database [duplicate]

I am looking for a way to concatenate the strings of a field within a group by query. So for example, I have a table:
ID  COMPANY_ID  EMPLOYEE
1   1           Anna
2   1           Bill
3   2           Carol
4   2           Dave
and I wanted to group by company_id to get something like:
COMPANY_ID  EMPLOYEE
1           Anna, Bill
2           Carol, Dave
There is a built-in function in MySQL to do this: group_concat.
PostgreSQL 9.0 or later:
Modern Postgres (since 2010) has the string_agg(expression, delimiter) function which will do exactly what the asker was looking for:
SELECT company_id, string_agg(employee, ', ')
FROM mytable
GROUP BY company_id;
Postgres 9 also added the ability to specify an ORDER BY clause in any aggregate expression; otherwise you have to order all your results or deal with an undefined order. So you can now write:
SELECT company_id, string_agg(employee, ', ' ORDER BY employee)
FROM mytable
GROUP BY company_id;
PostgreSQL 8.4.x:
PostgreSQL 8.4 (in 2009) introduced the aggregate function array_agg(expression) which collects the values in an array. Then array_to_string() can be used to give the desired result:
SELECT company_id, array_to_string(array_agg(employee), ', ')
FROM mytable
GROUP BY company_id;
PostgreSQL 8.3.x and older:
When this question was originally posed, there was no built-in aggregate function to concatenate strings. The simplest custom implementation (suggested by Vajda Gabo in this mailing list post, among many others) is to use the built-in textcat function (which lies behind the || operator):
CREATE AGGREGATE textcat_all(
basetype = text,
sfunc = textcat,
stype = text,
initcond = ''
);
Here is the CREATE AGGREGATE documentation.
This simply glues all the strings together, with no separator. In order to get a ", " inserted in between them without having it at the end, you might want to make your own concatenation function and substitute it for the "textcat" above. Here is one I put together and tested on 8.3.12:
CREATE FUNCTION commacat(acc text, instr text) RETURNS text AS $$
BEGIN
IF acc IS NULL OR acc = '' THEN
RETURN instr;
ELSE
RETURN acc || ', ' || instr;
END IF;
END;
$$ LANGUAGE plpgsql;
This version will output a comma even if the value in the row is null or empty, so you get output like this:
a, b, c, , e, , g
If you would prefer to remove extra commas to output this:
a, b, c, e, g
Then add an ELSIF check to the function like this:
CREATE FUNCTION commacat_ignore_nulls(acc text, instr text) RETURNS text AS $$
BEGIN
IF acc IS NULL OR acc = '' THEN
RETURN instr;
ELSIF instr IS NULL OR instr = '' THEN
RETURN acc;
ELSE
RETURN acc || ', ' || instr;
END IF;
END;
$$ LANGUAGE plpgsql;
How about using Postgres built-in array functions? At least on 8.4 this works out of the box:
SELECT company_id, array_to_string(array_agg(employee), ',')
FROM mytable
GROUP BY company_id;
As of PostgreSQL 9.0 you can use the aggregate function called string_agg. Your new SQL should look something like this:
SELECT company_id, string_agg(employee, ', ')
FROM mytable
GROUP BY company_id;
I claim no credit for the answer, because I found it after some searching. What I didn't know is that PostgreSQL allows you to define your own aggregate functions with CREATE AGGREGATE.
This post on the PostgreSQL list shows how trivial it is to create a function to do what's required:
CREATE AGGREGATE textcat_all(
basetype = text,
sfunc = textcat,
stype = text,
initcond = ''
);
SELECT company_id, textcat_all(employee || ', ')
FROM mytable
GROUP BY company_id;
As already mentioned, creating your own aggregate function is the right thing to do. Here is my concatenation aggregate function (you can find details in French):
CREATE OR REPLACE FUNCTION concat2(text, text) RETURNS text AS '
SELECT CASE WHEN $1 IS NULL OR $1 = \'\' THEN $2
WHEN $2 IS NULL OR $2 = \'\' THEN $1
ELSE $1 || \' / \' || $2
END;
'
LANGUAGE SQL;
CREATE AGGREGATE concatenate (
sfunc = concat2,
basetype = text,
stype = text,
initcond = ''
);
And then use it as:
SELECT company_id, concatenate(employee) AS employees FROM ...
This latest announcement list snippet might be of interest if you'll be upgrading to 8.4:
Until 8.4 comes out with a super-efficient native one, you can add the array_accum() function in the PostgreSQL documentation for rolling up any column into an array, which can then be used by application code, or combined with array_to_string() to format it as a list:
http://www.postgresql.org/docs/current/static/xaggr.html
I'd link to the 8.4 development docs but they don't seem to list this feature yet.
Following up on Kev's answer, using the Postgres docs:
First, create an array of the elements, then use the built-in array_to_string function.
CREATE AGGREGATE array_accum (anyelement)
(
sfunc = array_append,
stype = anyarray,
initcond = '{}'
);
select array_to_string(array_accum(name), '|') from mytable group by id;
Following up yet again on the use of a custom aggregate function for string concatenation: remember that the SELECT statement returns rows in an arbitrary order, so you need a subselect in the FROM clause with an ORDER BY, and then an outer SELECT with a GROUP BY to aggregate the strings:
SELECT custom_aggregate(MY.special_strings)
FROM (SELECT special_strings, grouping_column
FROM a_table
ORDER BY ordering_column) MY
GROUP BY MY.grouping_column
Use the STRING_AGG function in PostgreSQL and Google BigQuery SQL:
SELECT company_id, STRING_AGG(employee, ', ')
FROM employees
GROUP BY company_id;
I found this PostgreSQL documentation helpful: http://www.postgresql.org/docs/8.0/interactive/functions-conditional.html.
In my case, I sought plain SQL to concatenate a field with brackets around it, if the field is not empty.
select itemid,
       CASE itemdescription
            WHEN '' THEN itemname
            ELSE itemname || ' (' || itemdescription || ')'
       END
from items;
If you are on Amazon Redshift, where string_agg is not supported, try using listagg.
SELECT company_id, listagg(EMPLOYEE, ', ') as employees
FROM EMPLOYEE_table
GROUP BY company_id;
As of PostgreSQL 9.0 you can use the aggregate function called string_agg. Your new SQL should look something like this:
SELECT company_id, string_agg(employee, ', ')
FROM mytable
GROUP BY company_id;
You can also use the format() function, which can implicitly take care of type conversion of text, int, etc. by itself.
create or replace function concat_return_row_count(tbl_name text, column_name text, value int)
returns integer as $row_count$
declare
total integer;
begin
-- %I safely quotes identifiers, %L quotes the literal value
EXECUTE format('SELECT count(*) FROM %I WHERE %I = %L', tbl_name, column_name, value) INTO total;
return total;
end;
$row_count$ language plpgsql;
postgres=# select concat_return_row_count('tbl_name','column_name',2); --2 is the value
I'm using JetBrains Rider, and it was a hassle copying the results from the above examples to re-execute, because it seemed to wrap them all in JSON. This joins them into a single statement that is easier to run:
select string_agg('drop table if exists "' || tablename || '" cascade', ';')
from pg_tables where schemaname != $$pg_catalog$$ and tableName like $$rm_%$$

Dynamically add a column with multiple values to any table using a PL/pgSQL function

I would like to use a function/procedure to add an additional column (e.g. a period name) with multiple values to a 'template' table, and do a cartesian product with the rows, so my 'template' is duplicated with the different values provided for the new column.
E.g. Add a period column with 2 values to my template_country_channel table:
SELECT *
FROM unnest(ARRAY['P1', 'P2']) AS prd(period)
, template_country_channel
ORDER BY period DESC
, sort_cnty
, sort_chan;
/*
-- this is equivalent to:
(
SELECT 'P2'::text AS period
, *
FROM template_country_channel
) UNION ALL (
SELECT 'P1'::text AS period
, *
FROM template_country_channel
)
--
*/
This query is working fine, but I was wondering if I could turn it into a PL/pgSQL function/procedure, providing the values for the new column and the table to add the extra column to (and optionally the ORDER BY conditions).
What I would like to do is:
SELECT *
FROM template_with_periods(
'template_country_channel' -- table name
, ARRAY['P1', 'P2'] -- values for the new column to be added
, 'period DESC, sort_cnty, sort_chan' -- ORDER BY string (optional)
);
and have the same result as the 1st query.
So I created a function like:
CREATE OR REPLACE FUNCTION template_with_periods(template regclass, periods text[], order_by text)
RETURNS SETOF RECORD
AS $BODY$
BEGIN
RETURN QUERY EXECUTE 'SELECT * FROM unnest($2) AS prd(period), $1 ORDER BY $3' USING template, periods, order_by ;
END;
$BODY$
LANGUAGE 'plpgsql'
;
But when I run:
SELECT *
FROM template_with_periods('template_country_channel', ARRAY['P1', 'P2'], 'period DESC, sort_cnty, sort_chan');
I get the error: ERROR: 42601: a column definition list is required for functions returning "record"
After some googling, it seems that I need to define the list of columns and types to perform the RETURN QUERY (as the error message states).
Unfortunately, the whole idea is to use the function with many 'template' tables, so the list of column names and types is not fixed.
Is there any other approach to try?
Or is the only way to make it work to have, within the function, a way to get the list of column names and types of the template table?
I did this with a refcursor, if you want the output column list to be completely dynamic:
CREATE OR REPLACE FUNCTION is_record_exists(tablename character varying, columns character varying[], keepcolumns character varying[] DEFAULT NULL::character varying[])
RETURNS SETOF refcursor AS
$BODY$
DECLARE
ref refcursor;
keepColumnsList text;
columnsList text;
valuesList text;
existQuery text;
keepQuery text;
BEGIN
IF keepcolumns IS NOT NULL AND array_length(keepColumns, 1) > 0 THEN
keepColumnsList := array_to_string(keepColumns, ', ');
ELSE
keepColumnsList := 'COUNT(*)';
END IF;
columnsList := (SELECT array_to_string(array_agg(name || ' = ' || value), ' OR ') FROM
(SELECT unnest(columns[1:1]) AS name, unnest(columns[2:2]) AS value) pair);
existQuery := 'SELECT ' || keepColumnsList || ' FROM ' || tableName || ' WHERE ' || columnsList;
RAISE NOTICE 'Exist query: %', existQuery;
OPEN ref FOR EXECUTE
existQuery;
RETURN next ref;
END;$BODY$
LANGUAGE plpgsql;
Then you need to call FETCH ALL IN <cursor> to get the results. Detailed syntax here: https://stackoverflow.com/a/12483222/630169. It seems to be the only way for now. Hopefully something will change in PostgreSQL 11 with PROCEDURES.

Function to return dynamic set of columns for given table

I have a fields table to store column information for other tables:
CREATE TABLE public.fields (
schema_name varchar(100),
table_name varchar(100),
column_text varchar(100),
column_name varchar(100),
column_type varchar(100) default 'varchar(100)',
column_visible boolean
);
And I'd like to create a function to fetch data for a specific table.
I just tried something like this:
create or replace function public.get_table(schema_name text,
table_name text,
active boolean default true)
returns setof record as $$
declare
entity_name text default schema_name || '.' || table_name;
r record;
begin
for r in EXECUTE 'select * from ' || entity_name loop
return next r;
end loop;
return;
end
$$
language plpgsql;
With this function I have to specify columns when I call it!
select * from public.get_table('public', 'users') as dept(id int, uname text);
I want to pass schema_name and table_name as parameters to function and get record list, according to column_visible field in public.fields table.
Solution for the simple case
As explained in the referenced answers below, you can use registered (row) types, and thus implicitly declare the return type of a polymorphic function:
CREATE OR REPLACE FUNCTION public.get_table(_tbl_type anyelement)
RETURNS SETOF anyelement AS
$func$
BEGIN
RETURN QUERY EXECUTE format('TABLE %s', pg_typeof(_tbl_type));
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM public.get_table(NULL::public.users); -- note the syntax!
Returns the complete table (with all user columns).
Wait! How?
Detailed explanation in this related answer, chapter
"Various complete table types":
Refactor a PL/pgSQL function to return the output of various SELECT queries
TABLE foo is just short for SELECT * FROM foo:
Is there a shortcut for SELECT * FROM?
2 steps for completely dynamic return type
But what you are trying to do is strictly impossible in a single SQL command.
I want to pass schema_name and table_name as parameters to function and get record list, according to column_visible field in
public.fields table.
There is no direct way to return an arbitrary selection of columns (return type not known at call time) from a function - or any SQL command. SQL demands to know number, names and types of resulting columns at call time. More in the 2nd chapter of this related answer:
How do I generate a pivoted CROSS JOIN where the resulting table definition is unknown?
There are various workarounds. You could wrap the result in one of the standard document types (json, jsonb, hstore, xml).
Or you generate the query with one function call and execute the result with the next:
CREATE OR REPLACE FUNCTION public.generate_get_table(_schema_name text, _table_name text)
RETURNS text AS
$func$
SELECT format('SELECT %s FROM %I.%I'
, string_agg(quote_ident(column_name), ', ')
, schema_name
, table_name)
FROM fields
WHERE column_visible
AND schema_name = _schema_name
AND table_name = _table_name
GROUP BY schema_name, table_name
ORDER BY schema_name, table_name;
$func$ LANGUAGE sql;
Call:
SELECT public.generate_get_table('public', 'users');
This creates a query of the form:
SELECT usr_id, usr FROM public.users;
Execute it in the 2nd step. (You might want to add column numbers and order columns.)
Or append \gexec in psql to execute the return value immediately. See:
How to force evaluation of subquery before joining / pushing down to foreign server
Be sure to defend against SQL injection:
INSERT with dynamic table name in trigger function
Define table and column names as arguments in a plpgsql function?
Asides
varchar(100) does not make much sense for identifiers, which are limited to 63 characters in standard Postgres:
Maximum characters in labels (table names, columns etc)
If you understand how the object identifier type regclass works, you might replace schema and table name with a single regclass column.
I think you just need another query to get the list of columns you want.
Maybe something like this (untested):
create or replace function public.get_table(_schema_name text, _table_name text, active boolean default true) returns setof record as $$
declare
entity_name text default _schema_name || '.' || _table_name;
r record;
columns varchar;
begin
-- Get the list of columns
SELECT string_agg(column_name, ', ')
INTO columns
FROM public.fields
WHERE fields.schema_name = _schema_name
AND fields.table_name = _table_name
AND fields.column_visible = TRUE;
-- Return rows from the specified table
RETURN QUERY EXECUTE 'select ' || columns || ' from ' || entity_name;
RETURN;
end
$$
language plpgsql;
Keep in mind that column/table references may need to be surrounded by double quotes if they have certain characters in them.

Get IDs from multiple columns in multiple tables as one set or array

I have multiple tables, each with two columns of interest: connection_node_start_id and connection_node_end_id. My goal is to get a collection of all those IDs, either as a flat ARRAY or as a new TABLE consisting of one column.
Example output ARRAY:
result = {1,4,7,9,2,5}
Example output TABLE:
IDS
-------
1
4
7
9
2
5
My first attempt is somewhat clumsy and does not work properly, as the SELECT statement just returns one row. It seems there must be a simple way to do this; can someone point me in the right direction?
CREATE OR REPLACE FUNCTION get_connection_nodes(anyarray)
RETURNS anyarray AS
$$
DECLARE
table_name varchar;
result integer[];
sel integer[];
BEGIN
FOREACH table_name IN ARRAY $1
LOOP
RAISE NOTICE 'table_name(%)',table_name;
EXECUTE 'SELECT ARRAY[connection_node_end_id,
connection_node_start_id] FROM ' || table_name INTO sel;
RAISE NOTICE 'sel(%)',sel;
result := array_cat(result, sel);
END LOOP;
RETURN result;
END
$$
LANGUAGE 'plpgsql';
Test table:
connection_node_start_id | connection_node_end_id
--------------------------------------------------
1 | 4
7 | 9
Call:
SELECT get_connection_nodes(ARRAY['test_table']);
Result:
{1,4} -- only 1st row, rest is missing
For Postgres 9.3+
CREATE OR REPLACE FUNCTION get_connection_nodes(text[])
RETURNS TABLE (ids int) AS
$func$
DECLARE
_tbl text;
BEGIN
FOREACH _tbl IN ARRAY $1
LOOP
RETURN QUERY EXECUTE format('
SELECT t.id
FROM %I, LATERAL (VALUES (connection_node_start_id)
, (connection_node_end_id)) t(id)'
, _tbl);
END LOOP;
END
$func$ LANGUAGE plpgsql;
Related answer on dba.SE:
SELECT DISTINCT on multiple columns
Or drop the loop and concatenate a single query. Probably fastest:
CREATE OR REPLACE FUNCTION get_connection_nodes2(text[])
RETURNS TABLE (ids int) AS
$func$
BEGIN
RETURN QUERY EXECUTE (
SELECT string_agg(format(
'SELECT t.id FROM %I, LATERAL (VALUES (connection_node_start_id)
, (connection_node_end_id)) t(id)'
, tbl), ' UNION ALL ')
FROM unnest($1) tbl
);
END
$func$ LANGUAGE plpgsql;
Related:
Loop through like tables in a schema
LATERAL was introduced with Postgres 9.3.
For older Postgres
You can use the set-returning function unnest() in the SELECT list, too:
CREATE OR REPLACE FUNCTION get_connection_nodes2(text[])
RETURNS TABLE (ids int) AS
$func$
BEGIN
RETURN QUERY EXECUTE (
SELECT string_agg(
'SELECT unnest(ARRAY[connection_node_start_id
, connection_node_end_id]) FROM ' || tbl
, ' UNION ALL '
)
FROM (SELECT quote_ident(tbl) AS tbl FROM unnest($1) tbl) t
);
END
$func$ LANGUAGE plpgsql;
Should work with pg 8.4+ (or maybe even older). Works with current Postgres (9.4) as well, but LATERAL is much cleaner.
Or make it very simple:
CREATE OR REPLACE FUNCTION get_connection_nodes3(text[])
RETURNS TABLE (ids int) AS
$func$
BEGIN
RETURN QUERY EXECUTE (
SELECT string_agg(format(
'SELECT connection_node_start_id FROM %1$I
UNION ALL
SELECT connection_node_end_id FROM %1$I'
, tbl), ' UNION ALL ')
FROM unnest($1) tbl
);
END
$func$ LANGUAGE plpgsql;
format() was introduced with pg 9.1.
Might be a bit slower with big tables because each table is scanned once for every column (so twice here). The sort order in the result is different, too, but that does not seem to matter for you.
Be sure to escape identifiers to defend against SQL injection and other illegal syntax. Details:
Table name as a PostgreSQL function parameter
The EXECUTE ... INTO statement can only return data from a single row:
If multiple rows are returned, only the first will be assigned to the INTO variable.
In order to concatenate values from all rows you have to aggregate them first by column and then append the arrays:
EXECUTE 'SELECT array_agg(connection_node_end_id) ||
array_agg(connection_node_start_id) FROM ' || table_name INTO sel;
You're probably looking for something like this:
CREATE OR REPLACE FUNCTION d (tblname TEXT [])
RETURNS TABLE (c INTEGER) AS $$
DECLARE sql TEXT;
BEGIN
WITH x
AS (SELECT unnest(tblname) AS tbl),
y AS (
SELECT FORMAT('
SELECT connection_node_end_id
FROM %s
UNION ALL
SELECT connection_node_start_id
FROM %s
', tbl, tbl) AS s
FROM x)
SELECT string_agg(s, ' UNION ALL ')
INTO sql
FROM y;
RETURN QUERY EXECUTE sql;
END;$$
LANGUAGE plpgsql;
CREATE TABLE a (connection_node_end_id INTEGER, connection_node_start_id INTEGER);
INSERT INTO A VALUES (1,2);
CREATE TABLE b (connection_node_end_id INTEGER, connection_node_start_id INTEGER);
INSERT INTO B VALUES (100, 101);
SELECT * from d(array['a','b']);
c
-----
1
2
100
101
(4 rows)

EXECUTE...INTO...USING statement in PL/pgSQL can't execute into a record?

I'm attempting to write part of a function in PL/pgSQL that loops through an hstore and sets a record's column (the key of the hstore) to a specific value (the value of the hstore). I'm using Postgres 9.1.
The hstore will look like: ' "column1"=>"value1","column2"=>"value2" '
Generally, here is what I want from a function that takes in an hstore and has a record with values to modify:
FOR my_key, my_value IN
SELECT key,
value
FROM EACH( in_hstore )
LOOP
EXECUTE 'SELECT $1'
INTO my_row.my_key
USING my_value;
END LOOP;
The error I am getting with this code is:
"myrow" has no field "my_key". I've been searching for quite a while now for a solution, but everything else I've tried to achieve the same result hasn't worked.
Here is a simpler alternative to your posted answer. It should perform much better.
This function retrieves a row from a given table (in_table_name) with a given primary key value (in_row_pk), inserts it as a new row into the same table with some values replaced (in_override_values), and returns the new primary key value (pk_new), generated from the column default.
CREATE OR REPLACE FUNCTION f_clone_row(in_table_name regclass
, in_row_pk int
, in_override_values hstore
, OUT pk_new int)
LANGUAGE plpgsql AS
$func$
DECLARE
_pk text; -- name of PK column
_cols text; -- list of names of other columns
BEGIN
-- Get name of PK column
SELECT INTO _pk a.attname
FROM pg_catalog.pg_index i
JOIN pg_catalog.pg_attribute a ON a.attrelid = i.indrelid
AND a.attnum = i.indkey[0] -- single PK col!
WHERE i.indrelid = in_table_name
AND i.indisprimary;
-- Get list of columns excluding PK column
SELECT INTO _cols string_agg(quote_ident(attname), ',')
FROM pg_catalog.pg_attribute
WHERE attrelid = in_table_name -- regclass used as OID
AND attnum > 0 -- exclude system columns
AND attisdropped = FALSE -- exclude dropped columns
AND attname <> _pk; -- exclude PK column
-- INSERT cloned row with override values, returning new PK
EXECUTE format('
INSERT INTO %1$I (%2$s)
SELECT %2$s
FROM (SELECT (t #= $1).* FROM %1$I t WHERE %3$I = $2) x
RETURNING %3$I'
, in_table_name, _cols, _pk)
USING in_override_values, in_row_pk -- use override values directly
INTO pk_new; -- return new pk directly
END
$func$;
Call:
SELECT f_clone_row('tbl', 1, '"col1"=>"foo_new","col2"=>"bar_new"');
db<>fiddle here
Old sqlfiddle
Use regclass as input parameter type, so only valid table names can be used to begin with and SQL injection is ruled out. The function also fails earlier and more gracefully if you should provide an illegal table name.
Use an OUT parameter (pk_new) to simplify the syntax.
No need to figure out the next value for the primary key manually. It is inserted automatically and returned after the fact. That's not only simpler and faster, you also avoid wasted or out-of-order sequence numbers.
Use format() to simplify the assembly of the dynamic query string and make it less error-prone. Note how I use positional parameters for identifiers and unquoted strings respectively.
I build on your implicit assumption that allowed tables have a single primary key column of type integer with a column default. Typically serial columns.
Key element of the function is the final INSERT:
Merge override values with the existing row using the #= operator in a subselect and decompose the resulting row immediately.
Then you can select only relevant columns in the main SELECT.
Let Postgres assign the default value for the PK and get it back with the RETURNING clause.
Write the returned value into the OUT parameter directly.
All done in a single SQL command, that is generally fastest.
Since I didn't want to use any external functions, for speed purposes I created a solution using hstore to insert a record into a table:
CREATE OR REPLACE FUNCTION fn_clone_row(in_table_name character varying, in_row_pk integer, in_override_values hstore)
RETURNS integer
LANGUAGE plpgsql
AS $function$
DECLARE
my_table_pk_col_name varchar;
my_key text;
my_value text;
my_row record;
my_pk_default text;
my_pk_new integer;
my_pk_new_text text;
my_row_hstore hstore;
my_row_keys text[];
my_row_keys_list text;
my_row_values text[];
my_row_values_list text;
BEGIN
-- Get the next value of the pk column for the table.
SELECT ad.adsrc,
at.attname
INTO my_pk_default,
my_table_pk_col_name
FROM pg_attrdef ad
JOIN pg_attribute at
ON at.attnum = ad.adnum
AND at.attrelid = ad.adrelid
JOIN pg_class c
ON c.oid = at.attrelid
JOIN pg_constraint cn
ON cn.conrelid = c.oid
AND cn.contype = 'p'
AND cn.conkey[1] = at.attnum
JOIN pg_namespace n
ON n.oid = c.relnamespace
WHERE c.relname = in_table_name
AND n.nspname = 'public';
-- Get the next value of the pk in a local variable
EXECUTE ' SELECT ' || my_pk_default
INTO my_pk_new;
-- Set the integer value back to text for the hstore
my_pk_new_text := my_pk_new::text;
-- Add the next value statement to the hstore of changes to make.
in_override_values := in_override_values || hstore( my_table_pk_col_name, my_pk_new_text );
-- Copy over only the given row to the record.
EXECUTE ' SELECT * '
' FROM ' || quote_ident( in_table_name ) ||
' WHERE ' || quote_ident( my_table_pk_col_name ) ||
' = ' || quote_nullable( in_row_pk )
INTO my_row;
-- Replace the values that need to be changed in the column name array
my_row := my_row #= in_override_values;
-- Create an hstore of my record
my_row_hstore := hstore( my_row );
-- Create a string of comma-delimited, quote-enclosed column names
my_row_keys := akeys( my_row_hstore );
SELECT array_to_string( array_agg( quote_ident( x.colname ) ), ',' )
INTO my_row_keys_list
FROM ( SELECT unnest( my_row_keys ) AS colname ) x;
-- Create a string of comma-delimited, quote-enclosed column values
my_row_values := avals( my_row_hstore );
SELECT array_to_string( array_agg( quote_nullable( x.value ) ), ',' )
INTO my_row_values_list
FROM ( SELECT unnest( my_row_values ) AS value ) x;
-- Insert the values into the columns of a new row
EXECUTE 'INSERT INTO ' || in_table_name || '(' || my_row_keys_list || ')'
' VALUES (' || my_row_values_list || ')';
RETURN my_pk_new;
END
$function$;
It's quite a bit longer than what I had envisioned, but it works and is actually quite speedy.