PostgreSQL dynamic column selection - postgresql-9.4

I am struggling a bit on some dynamic postgresql :
I have a table named "list_columns" containing the columns names list with "column_name" as the variable name; those column names come from an input table called "input_table".
[list_columns]
column_name
col_name_a
col_name_b
col_name_c...
[input_table]
col_name_a
col_name_b
col_name_c
col_name_d
col_name_e
value_a_1
value_b_1
value_c_1
value_d_1
value_e_1
value_a_2
value_b_2
value_c_2
value_d_2
value_e_2
...
...
...
...
...
What I'd like to do is dynamically create a new table using that list, something like this:
create table output_table as
select (select distinct(column_name) seperated by "," from list_columns) from input_table;
The resulting table would be
[output_table]
col_name_a
col_name_b
col_name_c
value_a_1
value_b_1
value_c_1
value_a_2
value_b_2
value_c_2
...
...
...
I saw I should use some execute procedures but I can't figure out how to do so.
Note: I know i could directly select the 3 columns; I oversimplied the case.
If someone would be kind enough to help me on this,
Thank you,
Regards,
Jonathan

You need dynamic SQL for this, and for that you need PL/pgSQL.
You need to assemble the CREATE TABLE statement based on the input_table, then run that generated SQL.
do
$$
declare
l_columns text;
l_sql text;
begin
-- this generates the list of columns
select string_agg(distinct column_name, ',')
into l_columns
from list_table;
-- this generates the actual CREATE TABLE statement using the columns
-- from the previous step
l_sql := 'create table output_table as select '||l_columns||' from input_table';
-- this runs the generated SQL, thus creating the output table.
execute l_sql;
end;
$$;
If you need that a lot, you can put that into a stored function (your unsupported Postgres version doesn't support real procedures).

Related

Declare and return value for DELETE and INSERT

I am trying to remove duplicated data from some of our databases based upon unique id's. All deleted data should be stored in a separate table for auditing purposes. Since it concerns quite some databases and different schemas and tables I wanted to start using variables to reduce chance of errors and the amount of work it will take me.
This is the best example query I could think off, but it doesn't work:
do $$
declare #source_schema varchar := 'my_source_schema';
declare #source_table varchar := 'my_source_table';
declare #target_table varchar := 'my_target_schema' || source_table || '_duplicates'; --target schema and appendix are always the same, source_table is a variable input.
declare #unique_keys varchar := ('1', '2', '3')
begin
select into #target_table
from #source_schema.#source_table
where id in (#unique_keys);
delete from #source_schema.#source_table where export_id in (#unique_keys);
end ;
$$;
The query syntax works with hard-coded values.
Most of the times my variables are perceived as columns or not recognized at all. :(
You need to create and then call a plpgsql procedure with input parameters :
CREATE OR REPLACE PROCEDURE duplicates_suppress
(my_target_schema text, my_source_schema text, my_source_table text, unique_keys text[])
LANGUAGE plpgsql AS
$$
BEGIN
EXECUTE FORMAT(
'WITH list AS (INSERT INTO %1$I.%3$I_duplicates SELECT * FROM %2$I.%3$I WHERE array[id] <# %4$L :: integer[] RETURNING id)
DELETE FROM %2$I.%3$I AS t USING list AS l WHERE t.id = l.id', my_target_schema, my_source_schema, my_source_table, unique_keys :: text) ;
END ;
$$ ;
The procedure duplicates_suppress inserts into my_target_schema.my_source_table || '_duplicates' the rows from my_source_schema.my_source_table whose id is in the array unique_keys and then deletes these rows from the table my_source_schema.my_source_table .
See the test result in dbfiddle.
As has been commented, you need some kind of dynamic SQL. In a FUNCTION, PROCEDURE or a DO statement to do it on the server.
You should be comfortable with PL/pgSQL. Dynamic SQL is no beginners' toy.
Example with a PROCEDURE, like Edouard already suggested. You'll need a FUNCTION instead to wrap it in an outer transaction (like you very well might). See:
When to use stored procedure / user-defined function?
CREATE OR REPLACE PROCEDURE pg_temp.f_archive_dupes(_source_schema text, _source_table text, _unique_keys int[], OUT _row_count int)
LANGUAGE plpgsql AS
$proc$
-- target schema and appendix are always the same, source_table is a variable input
DECLARE
_target_schema CONSTANT text := 's2'; -- hardcoded
_target_table text := _source_table || '_duplicates';
_sql text := format(
'WITH del AS (
DELETE FROM %I.%I
WHERE id = ANY($1)
RETURNING *
)
INSERT INTO %I.%I TABLE del', _source_schema, _source_table
, _target_schema, _target_table);
BEGIN
RAISE NOTICE '%', _sql; -- debug
EXECUTE _sql USING _unique_keys; -- execute
GET DIAGNOSTICS _row_count = ROW_COUNT;
END
$proc$;
Call:
CALL pg_temp.f_archive_dupes('s1', 't1', '{1, 3}', 0);
db<>fiddle here
I made the procedure temporary, since I assume you don't need to keep it permanently. Create it once per database. See:
How to create a temporary function in PostgreSQL?
Passed schema and table names are case-sensitive strings! (Unlike unquoted identifiers in plain SQL.) Either way, be wary of SQL-injection when concatenating SQL dynamically. See:
Are PostgreSQL column names case-sensitive?
Table name as a PostgreSQL function parameter
Made _unique_keys type int[] (array of integer) since your sample values look like integers. Use a the actual data type of your id columns!
The variable _sql holds the query string, so it can easily be debugged before actually executing. Using RAISE NOTICE '%', _sql; for that purpose.
I suggest to comment the EXECUTE line until you are sure.
I made the PROCEDURE return the number of processed rows. You didn't ask for that, but it's typically convenient. At hardly any cost. See:
Dynamic SQL (EXECUTE) as condition for IF statement
Best way to get result count before LIMIT was applied
Last, but not least, use DELETE ... RETURNING * in a data-modifying CTE. Since that has to find rows only once it comes at about half the cost of separate SELECT and DELETE. And it's perfectly safe. If anything goes wrong, the whole transaction is rolled back anyway.
Two separate commands can also run into concurrency issues or race conditions which are ruled out this way, as DELETE implicitly locks the rows to delete. Example:
Replicating data between Postgres DBs
Or you can build the statements in a client program. Like psql, and use \gexec. Example:
Filter column names from existing table for SQL DDL statement
Based on Erwin's answer, minor optimization...
create or replace procedure pg_temp.p_archive_dump
(_source_schema text, _source_table text,
_unique_key int[],_target_schema text)
language plpgsql as
$$
declare
_row_count bigint;
_target_table text := '';
BEGIN
select quote_ident(_source_table) ||'_'|| array_to_string(_unique_key,'_') into _target_table from quote_ident(_source_table);
raise notice 'the deleted table records will store in %.%',_target_schema, _target_table;
execute format('create table %I.%I as select * from %I.%I limit 0',_target_schema, _target_table,_source_schema,_source_table );
execute format('with mm as ( delete from %I.%I where id = any (%L) returning * ) insert into %I.%I table mm'
,_source_schema,_source_table,_unique_key, _target_schema, _target_table);
GET DIAGNOSTICS _row_count = ROW_COUNT;
RAISE notice 'rows influenced, %',_row_count;
end
$$;
--
if your _unique_key is not that much, this solution also create a table for you. Obviously you need to create the target schema yourself.
If your unique_key is too much, you can customize to properly rename the dumped table.
Let's call it.
call pg_temp.p_archive_dump('s1','t1', '{1,2}','s2');
s1 is the source schema, t1 is source table, {1,2} is the unique key you want to extract to the new table. s2 is the target schema

PostgreSQL select INTO function

I am writing a function which will select and SUM the resulting output into a new table-therefore I attempted to use the INTO function. However, my standalone code works, yet once a place into a function I get an error stating that the new SELECT INTO table is not a defined variable (perhaps I am missing something). Please see code below:
CREATE OR REPLACE FUNCTION rev_1.calculate_costing_layer()
RETURNS trigger AS
$BODY$
BEGIN
-- This will create an intersection between pipelines and sum the cost to a new table for output
-- May need to create individual cost columns- Will also keep infrastructure costing seperated
--DROP table rev_1.costing_layer;
SELECT inyaninga_phases.geom, catchment_e_gravity_lines.name, SUM(catchment_e_gravity_lines.cost) AS gravity_sum
INTO rev_1.costing_layer
FROM rev_1.inyaninga_phases
ON ST_Intersects(catchment_e_gravity_lines.geom,inyaninga_phases.geom)
GROUP BY catchment_e_gravity_lines.name, inyaninga_phases.geom;
RETURN NEW;
END;
$BODY$
language plpgsql
Per the documentation:
CREATE TABLE AS is functionally similar to SELECT INTO. CREATE TABLE AS is the recommended syntax, since this form of SELECT INTO is not available in ECPG or PL/pgSQL, because they interpret the INTO clause differently. Furthermore, CREATE TABLE AS offers a superset of the functionality provided by SELECT INTO.
Use CREATE TABLE AS.
Although SELECT ... INTO new_table is valid PostgreSQL, its use has been deprecated (or, at least, "unrecommended"). It doesn't work at all in PL/PGSQL, because INSERT INTO is used to get results into variables.
If you want to create a new table, you should use instead:
CREATE TABLE rev_1.costing_layer AS
SELECT
inyaninga_phases.geom, catchment_e_gravity_lines.name, SUM(catchment_e_gravity_lines.cost) AS gravity_sum
FROM
rev_1.inyaninga_phases
ON ST_Intersects(catchment_e_gravity_lines.geom,inyaninga_phases.geom)
GROUP BY
catchment_e_gravity_lines.name, inyaninga_phases.geom;
If the table has already been created an you just want to insert a new row in it, you should use:
INSERT INTO
rev_1.costing_layer
(geom, name, gravity_sum)
-- Same select than before
SELECT
inyaninga_phases.geom, catchment_e_gravity_lines.name, SUM(catchment_e_gravity_lines.cost) AS gravity_sum
FROM
rev_1.inyaninga_phases
ON ST_Intersects(catchment_e_gravity_lines.geom,inyaninga_phases.geom)
GROUP BY
catchment_e_gravity_lines.name, inyaninga_phases.geom;
In a trigger function, you're not very likely to create a new table every time, so, my guess is that you want to do the INSERT and not the CREATE TABLE ... AS.

Get values from varying columns in a generic trigger

I am new to PostgreSQL and found a trigger which serves my purpose completely except for one little thing. The trigger is quite generic and runs across different tables and logs different field changes. I found here.
What I now need to do is test for a specific field which changes as the tables change on which the trigger fires. I thought of using substr as all the column will have the same name format e.g. XXX_cust_no but the XXX can change to 2 or 4 characters. I need to log the value in theXXX_cust_no field with every record that is written to the history_ / audit table. Using a bunch of IF / ELSE statements to accomplish this is not something I would like to do.
The trigger as it now works logs the table_name, column_name, old_value, new_value. I however need to log the XXX_cust_no of the record that was changed as well.
Basically you need dynamic SQL for dynamic column names. format helps to format the DML command. Pass values from NEW and OLD with the USING clause.
Given these tables:
CREATE TABLE tbl (
t_id serial PRIMARY KEY
,abc_cust_no text
);
CREATE TABLE log (
id int
,table_name text
,column_name text
,old_value text
,new_value text
);
It could work like this:
CREATE OR REPLACE FUNCTION trg_demo()
RETURNS TRIGGER AS
$func$
BEGIN
EXECUTE format('
INSERT INTO log(id, table_name, column_name, old_value, new_value)
SELECT ($2).t_id
, $3
, $4
,($1).%1$I
,($2).%1$I', TG_ARGV[0])
USING OLD, NEW, TG_RELNAME, TG_ARGV[0];
RETURN NEW;
END
$func$ LANGUAGE plpgsql;
CREATE TRIGGER demo
BEFORE UPDATE ON tbl
FOR EACH ROW EXECUTE PROCEDURE trg_demo('abc_cust_no'); -- col name here.
SQL Fiddle.
Related answer on dba.SE:
How to access NEW or OLD field given only the field's name?
List of special variables visible in plpgsql trigger functions in the manual.

Dynamic table name in postgreSQL 9.3

I am using postgreSQL. I want to select data from a table. Such table name contains the current year. such as abc2013. I have tried
select * from concat('abc',date_part('year',current_date))
select *from from concat('abc', extract (year from current_date))
So how to fetch data from such table dynamically?
Please don't do this - look hard at alternatives first, starting with partitioning and constraint exclusion.
If you must use dynamic table names, do it at application level during query generation.
If all else fails you can use a PL/PgSQL procedure like:
CREATE OR REPLACE pleasedont(int year) RETURNS TABLE basetable AS $$
BEGIN
RETURN QUERY EXECUTE format('SELECT col1, col2, col3 FROM %I', 'basetable_'||year);
END;
$$ LANGUAGE plpgsql;
This will only work if you have a base table that has the same structure as the sub-tables. It's also really painful to work with when you start adding qualifiers (where clause constraints, etc), and it prevents any kind of plan caching or effective prepared statement use.

SELECT .. INTO to create a table in PL/pgSQL

I want to use SELECT INTO to make a temporary table in one of my functions. SELECT INTO works in SQL but not PL/pgSQL.
This statement creates a table called mytable (If orig_table exists as a relation):
SELECT *
INTO TEMP TABLE mytable
FROM orig_table;
But put this function into PostgreSQL, and you get the error: ERROR: "temp" is not a known variable
CREATE OR REPLACE FUNCTION whatever()
RETURNS void AS $$
BEGIN
SELECT *
INTO TEMP TABLE mytable
FROM orig_table;
END; $$ LANGUAGE plpgsql;
I can SELECT INTO a variable of type record within PL/pgSQL, but then I have to define the structure when getting data out of that record. SELECT INTO is really simple - automatically creating a table of the same structure of the SELECT query. Does anyone have any explanation for why this doesn't work inside a function?
It seems like SELECT INTO works differently in PL/pgSQL, because you can select into the variables you've declared. I don't want to declare my temporary table structure, though. I wish it would just create the structure automatically like it does in SQL.
Try
CREATE TEMP TABLE mytable AS
SELECT *
FROM orig_table;
Per http://www.postgresql.org/docs/current/static/sql-selectinto.html
CREATE TABLE AS is functionally similar to SELECT INTO. CREATE TABLE AS is the recommended syntax, since this form of SELECT INTO is not available in ECPG or PL/pgSQL, because they interpret the INTO clause differently. Furthermore, CREATE TABLE AS offers a superset of the functionality provided by SELECT INTO.