Simplify complex union all in PostgreSQL - postgresql

I have very complex query in PostgreSQL that unions several tables, that all have common set of fields we want to union. currently we're pregenerating this query. I have seen solution to this using UNPIVOT and I'm wondering if it's possible to do this in PostgreSQL flavour of SQL.
What I have is something like
SELECT a,b,c FROM a UNION ALL
SELECT a,b,c FROM c UNION ALL
SELECT a,b,c FROM d UNION ALL
SELECT a,b,c FROM e UNION ALL
SELECT a,b,c FROM f
I'd like to have names of tables to union in separate table and use that for this query.
PS. Changing schema is not an option.

Use inheritance for the Postgres documentation for inheritance. You'll need to recreate the database, but that's easy if you dump the tables without schema, create a new schema with inheritance, and load the data back.
The schema would look something like this:
CREATE TABLE base (a, b, c);
CREATE TABLE a () INHERITS (base);
CREATE TABLE b () INHERITS (base);
....
With this design, you can do a simple select:
SELECT * FROM base;
This will return all rows in base and all the tables inheriting from base.
Read about PostgreSQL table partitioning from the docs if you haven't done so already.

If you really can't fix your design (or don't want to use the very good suggestion from jmz), your only choice is probably a set returning function that builds the necessary UNION "on the fly" and then returns the results from that.
create or replace function my_union()
returns table(a integer, b integer, c integer)
as
$body$
declare
union_cursor refcursor;
table_cursor cursor for SELECT table_name FROM union_source;
union_query text := '';
begin
-- build the query string for the union
for table_list_record in table_cursor loop
if union_query '' then
union_query := union_query||' UNION ALL';
end if;
union_query := union_query||' SELECT a,b,c FROM '||table_list_record.table_name;
end loop;
-- run the union and return the result
for a,b,c IN EXECUTE union_query LOOP
return next;
end loop;
end;
$body$
language plpgsql;
The function builds the necessary UNION based on the table names in union_source and then executes the union and returns the result.
You could extend the union_source table to also store the column list for each table, if the columns do not always have the same names.
To use this function, simply select from it:
select *
from my_union()

Related

Query has no result in destination data when calling colpivot inside pgsql stored procedure

I have created a procedure to generate temp table using colpivot https://github.com/hnsl/colpivot
and saving the result into a physical table as below in PGSQL
create or replace procedure create_report_table()
language plpgsql
as $$
begin
drop table if exists reports;
select colpivot('_report',
'select
u.username,
c.shortname as course_short_name,
to_timestamp(cp.timecompleted)::date as completed
FROM mdl_course_completions AS cp
JOIN mdl_course AS c ON cp.course = c.id
JOIN mdl_user AS u ON cp.userid = u.id
WHERE c.enablecompletion = 1
ORDER BY u.username' ,array['username'], array['course_short_name'], '#.completed', null);
create table reports as (SELECT * FROM _report);
commit;
end; $$
colpivot function , drop table , delete table works really fine in isolation. but when I create the procedure as above, and call the procedure to execute, this throws an error Query has no result in destination data
Is there any way I can use colpivot in collaboration with several queries as I am currently trying ?
Use PERFORM instead of SELECT. That will execute the statement, without the need to keep the result somewhere. This is what the manual says:
Sometimes it is useful to evaluate an expression or SELECT query but discard the result, for example when calling a function that has
side-effects but no useful result value. To do this in PL/pgSQL, use
the PERFORM statement

PostgreSQL dynamic column selection

I am struggling a bit on some dynamic postgresql :
I have a table named "list_columns" containing the columns names list with "column_name" as the variable name; those column names come from an input table called "input_table".
[list_columns]
column_name
col_name_a
col_name_b
col_name_c...
[input_table]
col_name_a
col_name_b
col_name_c
col_name_d
col_name_e
value_a_1
value_b_1
value_c_1
value_d_1
value_e_1
value_a_2
value_b_2
value_c_2
value_d_2
value_e_2
...
...
...
...
...
What I'd like to do is dynamically create a new table using that list, something like this:
create table output_table as
select (select distinct(column_name) seperated by "," from list_columns) from input_table;
The resulting table would be
[output_table]
col_name_a
col_name_b
col_name_c
value_a_1
value_b_1
value_c_1
value_a_2
value_b_2
value_c_2
...
...
...
I saw I should use some execute procedures but I can't figure out how to do so.
Note: I know i could directly select the 3 columns; I oversimplied the case.
If someone would be kind enough to help me on this,
Thank you,
Regards,
Jonathan
You need dynamic SQL for this, and for that you need PL/pgSQL.
You need to assemble the CREATE TABLE statement based on the input_table, then run that generated SQL.
do
$$
declare
l_columns text;
l_sql text;
begin
-- this generates the list of columns
select string_agg(distinct column_name, ',')
into l_columns
from list_table;
-- this generates the actual CREATE TABLE statement using the columns
-- from the previous step
l_sql := 'create table output_table as select '||l_columns||' from input_table';
-- this runs the generated SQL, thus creating the output table.
execute l_sql;
end;
$$;
If you need that a lot, you can put that into a stored function (your unsupported Postgres version doesn't support real procedures).

Execute a SELECT with dynamic ORDER BY expression inside a function

I'm trying to EXECUTE some SELECTs to use inside a function, my code is something like this:
DECLARE
result_one record;
BEGIN
EXECUTE 'WITH Q1 AS
(
SELECT id
FROM table_two
INNER JOINs, WHERE, etc, ORDER BY... DESC
)
SELECT Q1.id
FROM Q1
WHERE, ORDER BY...DESC';
RETURN final_result;
END;
I know how to do it in MySQL, but in PostgreSQL I'm failing. What should I change or how should I do it?
For a function to be able to return multiple rows it has to be declared as returns table() (or returns setof)
And to actually return a result from within a PL/pgSQL function you need to use return query (as documented in the manual)
To build dynamic SQL in Postgres it is highly recommended to use the format() function to properly deal with identifiers (and to make the source easier to read).
So you need something like:
create or replace function get_data(p_sort_column text)
returns table (id integer)
as
$$
begin
return query execute
format(
'with q1 as (
select id
from table_two
join table_three on ...
)
select q1.id
from q1
order by %I desc', p_sort_column);
end;
$$
language plpgsql;
Note that the order by inside the CTE is pretty much useless if you are sorting the final query unless you use a LIMIT or distinct on () inside the query.
You can make your life even easier if you use another level of dollar quoting for the dynamic SQL:
create or replace function get_data(p_sort_column text)
returns table (id integer)
as
$$
begin
return query execute
format(
$query$
with q1 as (
select id
from table_two
join table_three on ...
)
select q1.id
from q1
order by %I desc
$query$, p_sort_column);
end;
$$
language plpgsql;
What a_horse said. And:
How to return result of a SELECT inside a function in PostgreSQL?
Plus, to pick a column for ORDER BY dynamically, you have to add that column to the SELECT list of your CTE, which leads to complications if the column can be duplicated (like with passing 'id') ...
Better yet, remove the CTE entirely. There is nothing in your question to warrant its use anyway. (Only use CTEs when needed in Postgres, they are typically slower than equivalent subqueries or simple queries.)
CREATE OR REPLACE FUNCTION get_data(p_sort_column text)
RETURNS TABLE (id integer) AS
$func$
BEGIN
RETURN QUERY EXECUTE format(
$q$
SELECT t2.id -- assuming you meant t2?
FROM table_two t2
JOIN table_three t3 on ...
ORDER BY t2.%I DESC NULL LAST -- see below!
$q$, $1);
END
$func$ LANGUAGE plpgsql;
I appended NULLS LAST - you'll probably want that, too:
PostgreSQL sort by datetime asc, null first?
If p_sort_column is from the same table all the time, hard-code that table name / alias in the ORDER BY clause. Else, pass the table name / alias separately and auto-quote them separately to be safe:
Define table and column names as arguments in a plpgsql function?
I suggest to table-qualify all column names in a bigger query with multiple joins (t2.id not just id). Avoids various kinds of surprising results / confusion / abuse.
And you may want to schema-qualify your table names (myschema.table_two) to avoid similar troubles when calling the function with a different search_path:
How does the search_path influence identifier resolution and the "current schema"

Dynamic table name in postgreSQL 9.3

I am using postgreSQL. I want to select data from a table. Such table name contains the current year. such as abc2013. I have tried
select * from concat('abc',date_part('year',current_date))
select *from from concat('abc', extract (year from current_date))
So how to fetch data from such table dynamically?
Please don't do this - look hard at alternatives first, starting with partitioning and constraint exclusion.
If you must use dynamic table names, do it at application level during query generation.
If all else fails you can use a PL/PgSQL procedure like:
CREATE OR REPLACE pleasedont(int year) RETURNS TABLE basetable AS $$
BEGIN
RETURN QUERY EXECUTE format('SELECT col1, col2, col3 FROM %I', 'basetable_'||year);
END;
$$ LANGUAGE plpgsql;
This will only work if you have a base table that has the same structure as the sub-tables. It's also really painful to work with when you start adding qualifiers (where clause constraints, etc), and it prevents any kind of plan caching or effective prepared statement use.

SELECT .. INTO to create a table in PL/pgSQL

I want to use SELECT INTO to make a temporary table in one of my functions. SELECT INTO works in SQL but not PL/pgSQL.
This statement creates a table called mytable (If orig_table exists as a relation):
SELECT *
INTO TEMP TABLE mytable
FROM orig_table;
But put this function into PostgreSQL, and you get the error: ERROR: "temp" is not a known variable
CREATE OR REPLACE FUNCTION whatever()
RETURNS void AS $$
BEGIN
SELECT *
INTO TEMP TABLE mytable
FROM orig_table;
END; $$ LANGUAGE plpgsql;
I can SELECT INTO a variable of type record within PL/pgSQL, but then I have to define the structure when getting data out of that record. SELECT INTO is really simple - automatically creating a table of the same structure of the SELECT query. Does anyone have any explanation for why this doesn't work inside a function?
It seems like SELECT INTO works differently in PL/pgSQL, because you can select into the variables you've declared. I don't want to declare my temporary table structure, though. I wish it would just create the structure automatically like it does in SQL.
Try
CREATE TEMP TABLE mytable AS
SELECT *
FROM orig_table;
Per http://www.postgresql.org/docs/current/static/sql-selectinto.html
CREATE TABLE AS is functionally similar to SELECT INTO. CREATE TABLE AS is the recommended syntax, since this form of SELECT INTO is not available in ECPG or PL/pgSQL, because they interpret the INTO clause differently. Furthermore, CREATE TABLE AS offers a superset of the functionality provided by SELECT INTO.