Get IDs from multiple columns in multiple tables as one set or array - postgresql

I have multiple tables with each two rows of interest: connection_node_start_id and connection_node_end_id. My goal is to get a collection of all those IDs, either as a flat ARRAY or as a new TABLE consisting of one row.
Example output ARRAY:
result = {1,4,7,9,2,5}
Example output TABLE:
IDS
-------
1
4
7
9
2
5
My fist attempt is somewhat clumsy and does not work properly as the SELECT statement just returns one row. It seems there must be a simple way to do this, can someone point me into the right direction?
CREATE OR REPLACE FUNCTION get_connection_nodes(anyarray)
RETURNS anyarray AS
$$
DECLARE
table_name varchar;
result integer[];
sel integer[];
BEGIN
FOREACH table_name IN ARRAY $1
LOOP
RAISE NOTICE 'table_name(%)',table_name;
EXECUTE 'SELECT ARRAY[connection_node_end_id,
connection_node_start_id] FROM ' || table_name INTO sel;
RAISE NOTICE 'sel(%)',sel;
result := array_cat(result, sel);
END LOOP;
RETURN result;
END
$$
LANGUAGE 'plpgsql';
Test table:
connection_node_start_id | connection_node_end_id
--------------------------------------------------
1 | 4
7 | 9
Call:
SELECT get_connection_nodes(ARRAY['test_table']);
Result:
{1,4} -- only 1st row, rest is missing

For Postgres 9.3+
CREATE OR REPLACE FUNCTION get_connection_nodes(text[])
RETURNS TABLE (ids int) AS
$func$
DECLARE
_tbl text;
BEGIN
FOREACH _tbl IN ARRAY $1
LOOP
RETURN QUERY EXECUTE format('
SELECT t.id
FROM %I, LATERAL (VALUES (connection_node_start_id)
, (connection_node_end_id)) t(id)'
, _tbl);
END LOOP;
END
$func$ LANGUAGE plpgsql;
Related answer on dba.SE:
SELECT DISTINCT on multiple columns
Or drop the loop and concatenate a single query. Probably fastest:
CREATE OR REPLACE FUNCTION get_connection_nodes2(text[])
RETURNS TABLE (ids int) AS
$func$
BEGIN
RETURN QUERY EXECUTE (
SELECT string_agg(format(
'SELECT t.id FROM %I, LATERAL (VALUES (connection_node_start_id)
, (connection_node_end_id)) t(id)'
, tbl), ' UNION ALL ')
FROM unnest($1) tbl
);
END
$func$ LANGUAGE plpgsql;
Related:
Loop through like tables in a schema
LATERAL was introduced with Postgres 9.3.
For older Postgres
You can use the set-returning function unnest() in the SELECT list, too:
CREATE OR REPLACE FUNCTION get_connection_nodes2(text[])
RETURNS TABLE (ids int) AS
$func$
BEGIN
RETURN QUERY EXECUTE (
SELECT string_agg(
'SELECT unnest(ARRAY[connection_node_start_id
, connection_node_end_id]) FROM ' || tbl
, ' UNION ALL '
)
FROM (SELECT quote_ident(tbl) AS tbl FROM unnest($1) tbl) t
);
END
$func$ LANGUAGE plpgsql;
Should work with pg 8.4+ (or maybe even older). Works with current Postgres (9.4) as well, but LATERAL is much cleaner.
Or make it very simple:
CREATE OR REPLACE FUNCTION get_connection_nodes3(text[])
RETURNS TABLE (ids int) AS
$func$
BEGIN
RETURN QUERY EXECUTE (
SELECT string_agg(format(
'SELECT connection_node_start_id FROM %1$I
UNION ALL
SELECT connection_node_end_id FROM %1$I'
, tbl), ' UNION ALL ')
FROM unnest($1) tbl
);
END
$func$ LANGUAGE plpgsql;
format() was introduced with pg 9.1.
Might be a bit slower with big tables because each table is scanned once for every column (so 2 times here). Sort order in the result is different, too - but that does not seem to matter for you.
Be sure to sanitize escape identifiers to defend against SQL injection and other illegal syntax. Details:
Table name as a PostgreSQL function parameter

The EXECUTE ... INTO statement can only return data from a single row:
If multiple rows are returned, only the first will be assigned to the INTO variable.
In order to concatenate values from all rows you have to aggregate them first by column and then append the arrays:
EXECUTE 'SELECT array_agg(connection_node_end_id) ||
array_agg(connection_node_start_id) FROM ' || table_name INTO sel;

You're probably looking for something like this:
CREATE OR REPLACE FUNCTION d (tblname TEXT [])
RETURNS TABLE (c INTEGER) AS $$
DECLARE sql TEXT;
BEGIN
WITH x
AS (SELECT unnest(tblname) AS tbl),
y AS (
SELECT FORMAT('
SELECT connection_node_end_id
FROM %s
UNION ALL
SELECT connection_node_start_id
FROM %s
', tbl, tbl) AS s
FROM x)
SELECT string_agg(s, ' UNION ALL ')
INTO sql
FROM y;
RETURN QUERY EXECUTE sql;
END;$$
LANGUAGE plpgsql;
CREATE TABLE a (connection_node_end_id INTEGER, connection_node_start_id INTEGER);
INSERT INTO A VALUES (1,2);
CREATE TABLE b (connection_node_end_id INTEGER, connection_node_start_id INTEGER);
INSERT INTO B VALUES (100, 101);
SELECT * from d(array['a','b']);
c
-----
1
2
100
101
(4 rows)

Related

Print prepared dynamic query with result set in Postgresql function using DBeaver

I have the function in which I have prepared dynamic query, which I want print in output window before executing it.
Note: In the following example I have just add simple select statement to understand the requirement.
Sample tables:
create table t1
(
col1 int,
col2 text
);
insert into t1 values(1,'Table T1');
insert into t1 values(2,'Table T1');
create table t2
(
col1 int,
col2 text
);
insert into t2 values(1,'Table T2');
insert into t2 values(2,'Table T2');
Function:
create or replace function fn_testing(tbl_Name text)
returns table(col1 int,col2 text) as
$$
begin
return query execute 'select col1,col2 from '||tbl_name||'';
end;
$$
language plpgsql;
Function call:
select * from fn_testing('t2');
I want to print following in message window with result set too in result window:
select col1,col2 from t1;
You can use RAISE NOTICE for messages.
CREATE OR REPLACE FUNCTION fn_testing
(_tbl_name name)
RETURNS TABLE
(col1 integer,
col2 text)
AS
$$
DECLARE
_query text;
BEGIN
_query := format('SELECT col1, col2 FROM %I;', _tbl_name);
RAISE NOTICE '%', _query;
RETURN QUERY EXECUTE _query;
END;
$$
LANGUAGE plpgsql;
Note: There's a special type, name, for identifiers. And to prevent SQL injection or errors you should make sure the dynamic identifiers are properly quoted. You can use format() with %I for that.

Dynamically select column in PostgreSQL

I want to select a column from a table, with the column name being the result of a query like the following:
-- This query returns a single value
with x as (
select a from table1 where <condition>
)
-- my_function() yields a table
select x from my_function()
How do I do that?
Thank you very much.
You could write it in SQL with a temporary function:
CREATE FUNCTION pg_temp.tablefunc()
RETURNS SETOF my_function_result_type
LANGUAGE plpgsql AS
$$DECLARE
v_colname text;
BEGIN
SELECT a INTO v_colname
FROM table1
LIMIT 1;
RETURN QUERY EXECUTE
format(E'SELECT %I\n'
'FROM my_function()',
v_colname);
END;$$;
SELECT * FROM pg_temp.tablefunc();

How to PREPARE & EXECUTE a query from a string stored in a table

This stored function returns a query:
DROP FUNCTION IF EXISTS get_query (
ctl text, scm text, tbl text, seq text);
CREATE OR REPLACE FUNCTION get_query (
ctl text, scm text, tbl text, seq text)
RETURNS text
AS
$$
select concat('insert into ',$2,'.',$1, ' select nextval("',$4,'") as id, ',
string_agg(concat('NEW.', column_name), ', '), ', current_timestamp as audited_at;')
from information_schema.columns
where table_catalog = $1
and table_schema = $2
and table_name = $3
$$
LANGUAGE sql;
How do I PREPARE the query that this function returns.
I want insert a record in a table when a trigger is fired but I don't want to specify the list of columns to be inserted. The schema might keep changing. Hence, trying to use prepared statements.
This sample code illustrates how I mean the query string to be executed:
DROP FUNCTION IF EXISTS fn_name (store_temporary_query text);
CREATE OR REPLACE FUNCTION fn_name (store_temporary_query text)
RETURNS table (query text)
LANGUAGE plpgsql
AS
$$
begin
select 'select 1 as ID' into store_temporary_query;
return query (select store_temporary_query);
end;
$$
select fn_name('');
The above query gives the following output
fn_name
select 1 as ID
The desired result is the query
ID
1
EDIT #2
DROP FUNCTION IF EXISTS fn_name (store_temporary_query text);
CREATE OR REPLACE FUNCTION fn_name (store_temporary_query text)
RETURNS table (query text)
LANGUAGE plpgsql
AS
$$
begin
select 'select 1 as ID;' into store_temporary_query;
return query execute store_temporary_query;
end;
$$
select fn_name('');
This gets us here,
Error executing SQL statement. ERROR: syntax error at or near "select"
Position: 254 - Connection: Aurora Legacy: 794ms
You need EXECUTE to execute a query stored in a string:
RETURN QUERY EXECUTE store_temporary_query;

Merge multiple result tables and perform final query on result

I have a function returning table, which accumulates output of multiple calls to another function returning table. I would like to perform final query on built table before returning result. Currently I implemented this as two functions, one accumulating and one performing final query, which is ugly:
CREATE OR REPLACE FUNCTION func_accu(LOCATION_ID INTEGER, SCHEMA_CUSTOMER TEXT)
RETURNS TABLE("networkid" integer, "count" bigint) AS $$
DECLARE
GATEWAY_ID integer;
BEGIN
FOR GATEWAY_ID IN
execute format(
'SELECT id FROM %1$I.gateway WHERE location_id=%2$L'
, SCHEMA_CUSTOMER, LOCATION_ID)
LOOP
RETURN QUERY execute format(
'SELECT * FROM get_available_networks_gw(%1$L, %2$L)'
, GATEWAY_ID, SCHEMA_CUSTOMER);
END LOOP;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION func_query(LOCATION_ID INTEGER, SCHEMA_CUSTOMER TEXT)
RETURNS TABLE("networkid" integer, "count" bigint) AS $$
DECLARE
BEGIN
RETURN QUERY execute format('
SELECT networkid, max(count) FROM func_accu(%2$L, %1$L) GROUP BY networkid;'
, SCHEMA_CUSTOMER, LOCATION_ID);
END;
$$ LANGUAGE plpgsql;
How can this be done in single function, elegantly?
Both functions simplified and merged, also supplying value parameters in the USING clause:
CREATE OR REPLACE FUNCTION pg_temp.func_accu(_location_id integer, schema_customer text)
RETURNS TABLE(networkid integer, count bigint) AS
$func$
BEGIN
RETURN QUERY EXECUTE format('
SELECT f.networkid, max(f.ct)
FROM %I.gateway g
, get_available_networks_gw(g.id, $1) f(networkid, ct)
WHERE g.location_id = $2
GROUP BY 1'
, _schema_customer)
USING _schema_customer, _location_id;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM func_accu(123, 'my_schema');
Related:
Dynamically access column value in record
I am using alias names for the columns returned by the function (f(networkid, ct)) to be sure because you did not disclose the return type of get_available_networks_gw(). You can use the column names of the return type directly.
The comma (,) in the FROM clause is short syntax for CROSS JOIN LATERAL .... Requires Postgres 9.3 or later.
What is the difference between LATERAL and a subquery in PostgreSQL?
Or you could run this query instead of the function:
SELECT f.networkid, max(f.ct)
FROM myschema.gateway g, get_available_networks_gw(g.id, 'my_schema') f(networkid, ct)
WHERE g.location_id = $2
GROUP BY 1;

Multirow subselect as parameter to `execute using`

The multirow subselect will be used in the right hand side of the in operator in the where clause:
create table t (a integer);
insert into t (a) values (1), (9);
drop function if exists f();
create function f()
returns void as $$
begin
execute '
select a
from t
where a in $1
' using (select 1 union select 2);
end;$$
language plpgsql;
select f();
ERROR: more than one row returned by a subquery used as an expression
CONTEXT: SQL statement "SELECT (select 1 union select 2)"
PL/pgSQL function "f" line 3 at EXECUTE statement
How to achieve what the above function would if it worked?
I see nothing in your question that couldn't more easily be solved with:
SELECT a
FROM t
JOIN (VALUES (1), (2)) x(a) USING (a); -- any query returning multiple int
Can you clarify the necessity in your example?
As a proof of concept, this would be simpler / faster:
CREATE OR REPLACE FUNCTION x.f1()
RETURNS SETOF integer AS
$BODY$
BEGIN
RETURN QUERY EXECUTE '
SELECT a
FROM t
WHERE a = ANY($1)'
USING ARRAY(VALUES (1), (2)); -- any query here
END;
$BODY$
LANGUAGE plpgsql;
Performance of IN () and = ANY()
Your observation is spot on. And there is reason for it. Try:
EXPLAIN ANALYZE SELECT * FROM tbl WHERE id IN (1,2,3);
The query plan will reveal:
Index Cond: (id = ANY ('{1,2,3}'::integer[]))
PostgreSQL transforms the id IN (..) construct into id = ANY(..) internally. These two perform identically - except for a negligible overhead penalty.
My code is only faster in building the statement - exactly as you diagnosed.
create function f()
returns setof integer as $$
begin
return query execute format('
select a
from t
where a in %s
', replace(replace(array(select 1 union select 2)::text, '{', '('), '}', ')'));
end$$
language plpgsql;