variable reference for postgreSQL - postgresql

I have a query that I have to update every month and generate a new table. There are several references to this table, and I always seem to miss one. I was wondering if there is a way that I can set a local variable and reuse it through out the query. As an example:
DECLARE 'table'||to_char(curent_timestamp, 'MON') ||
to_char(current_timestanp,"YY") AS table_ref;
CREATE TABLE table_ref AS select * FROM base_table;
SELECT * FROM table_ref;
Thanks.

You can use FORMAT() and EXECUTE to execute your dynamic SQL, like so:
DO $$
DECLARE table_name TEXT;
BEGIN
SELECT FORMAT('table%I',TO_CHAR(CURRENT_TIMESTAMP,'MONYY')) INTO table_name; -- ex. tableJUN16
EXECUTE FORMAT('CREATE TABLE %I AS SELECT * FROM base_table;',table_name);
END; $$ LANGUAGE PLPGSQL;
This will create your new table from the base_table with your dynamic name.
https://www.postgresql.org/docs/current/static/functions-string.html

Related

Pass schema name as a variable in postgresql query

I have a table with schema names. I am writing a procedure to select schema name from the table and into a variable. The variable is then used to fetch records from the schema table.
Sample code below.
CREATE OR REPLACE PROCEDURE vasmol_master.sp_pushmt(
)
LANGUAGE 'plpgsql'
AS $BODY$
DECLARE
shortcodedatabase CHARACTER VARYING;
services CHARACTER VARYING;
BEGIN
FOR shortcodedatabase IN SELECT dbase FROM vasmol_master.shortcode_services ORDER BY shortcode ASC LOOP
services := shortcodedatabase ||'.smsservices';
SELECT * FROM services;
END LOOP;
END
$BODY$;
As the documentation linked to in a comment by Richard Huxton explains, to substitute a variable identifier into a query, use EXECUTE. See the documentation.
Depending on the return values you want, your procedure could look something like the following. This uses format to create a query string, substituting in the schema name as an identifier (%I).
CREATE OR REPLACE PROCEDURE vasmol_master.sp_pushmt()
LANGUAGE 'plpgsql'
AS $BODY$
DECLARE
shortcodedatabase CHARACTER VARYING;
BEGIN
FOR shortcodedatabase IN
SELECT dbase
FROM vasmol_master.shortcode_services
ORDER BY shortcode ASC
LOOP
EXECUTE format('SELECT * FROM %I.smsservices', shortcodedatabase);
END LOOP;
END
$BODY$;

Declare and return value for DELETE and INSERT

I am trying to remove duplicated data from some of our databases based upon unique id's. All deleted data should be stored in a separate table for auditing purposes. Since it concerns quite some databases and different schemas and tables I wanted to start using variables to reduce chance of errors and the amount of work it will take me.
This is the best example query I could think off, but it doesn't work:
do $$
declare #source_schema varchar := 'my_source_schema';
declare #source_table varchar := 'my_source_table';
declare #target_table varchar := 'my_target_schema' || source_table || '_duplicates'; --target schema and appendix are always the same, source_table is a variable input.
declare #unique_keys varchar := ('1', '2', '3')
begin
select into #target_table
from #source_schema.#source_table
where id in (#unique_keys);
delete from #source_schema.#source_table where export_id in (#unique_keys);
end ;
$$;
The query syntax works with hard-coded values.
Most of the times my variables are perceived as columns or not recognized at all. :(
You need to create and then call a plpgsql procedure with input parameters :
CREATE OR REPLACE PROCEDURE duplicates_suppress
(my_target_schema text, my_source_schema text, my_source_table text, unique_keys text[])
LANGUAGE plpgsql AS
$$
BEGIN
EXECUTE FORMAT(
'WITH list AS (INSERT INTO %1$I.%3$I_duplicates SELECT * FROM %2$I.%3$I WHERE array[id] <# %4$L :: integer[] RETURNING id)
DELETE FROM %2$I.%3$I AS t USING list AS l WHERE t.id = l.id', my_target_schema, my_source_schema, my_source_table, unique_keys :: text) ;
END ;
$$ ;
The procedure duplicates_suppress inserts into my_target_schema.my_source_table || '_duplicates' the rows from my_source_schema.my_source_table whose id is in the array unique_keys and then deletes these rows from the table my_source_schema.my_source_table .
See the test result in dbfiddle.
As has been commented, you need some kind of dynamic SQL. In a FUNCTION, PROCEDURE or a DO statement to do it on the server.
You should be comfortable with PL/pgSQL. Dynamic SQL is no beginners' toy.
Example with a PROCEDURE, like Edouard already suggested. You'll need a FUNCTION instead to wrap it in an outer transaction (like you very well might). See:
When to use stored procedure / user-defined function?
CREATE OR REPLACE PROCEDURE pg_temp.f_archive_dupes(_source_schema text, _source_table text, _unique_keys int[], OUT _row_count int)
LANGUAGE plpgsql AS
$proc$
-- target schema and appendix are always the same, source_table is a variable input
DECLARE
_target_schema CONSTANT text := 's2'; -- hardcoded
_target_table text := _source_table || '_duplicates';
_sql text := format(
'WITH del AS (
DELETE FROM %I.%I
WHERE id = ANY($1)
RETURNING *
)
INSERT INTO %I.%I TABLE del', _source_schema, _source_table
, _target_schema, _target_table);
BEGIN
RAISE NOTICE '%', _sql; -- debug
EXECUTE _sql USING _unique_keys; -- execute
GET DIAGNOSTICS _row_count = ROW_COUNT;
END
$proc$;
Call:
CALL pg_temp.f_archive_dupes('s1', 't1', '{1, 3}', 0);
db<>fiddle here
I made the procedure temporary, since I assume you don't need to keep it permanently. Create it once per database. See:
How to create a temporary function in PostgreSQL?
Passed schema and table names are case-sensitive strings! (Unlike unquoted identifiers in plain SQL.) Either way, be wary of SQL-injection when concatenating SQL dynamically. See:
Are PostgreSQL column names case-sensitive?
Table name as a PostgreSQL function parameter
Made _unique_keys type int[] (array of integer) since your sample values look like integers. Use a the actual data type of your id columns!
The variable _sql holds the query string, so it can easily be debugged before actually executing. Using RAISE NOTICE '%', _sql; for that purpose.
I suggest to comment the EXECUTE line until you are sure.
I made the PROCEDURE return the number of processed rows. You didn't ask for that, but it's typically convenient. At hardly any cost. See:
Dynamic SQL (EXECUTE) as condition for IF statement
Best way to get result count before LIMIT was applied
Last, but not least, use DELETE ... RETURNING * in a data-modifying CTE. Since that has to find rows only once it comes at about half the cost of separate SELECT and DELETE. And it's perfectly safe. If anything goes wrong, the whole transaction is rolled back anyway.
Two separate commands can also run into concurrency issues or race conditions which are ruled out this way, as DELETE implicitly locks the rows to delete. Example:
Replicating data between Postgres DBs
Or you can build the statements in a client program. Like psql, and use \gexec. Example:
Filter column names from existing table for SQL DDL statement
Based on Erwin's answer, minor optimization...
create or replace procedure pg_temp.p_archive_dump
(_source_schema text, _source_table text,
_unique_key int[],_target_schema text)
language plpgsql as
$$
declare
_row_count bigint;
_target_table text := '';
BEGIN
select quote_ident(_source_table) ||'_'|| array_to_string(_unique_key,'_') into _target_table from quote_ident(_source_table);
raise notice 'the deleted table records will store in %.%',_target_schema, _target_table;
execute format('create table %I.%I as select * from %I.%I limit 0',_target_schema, _target_table,_source_schema,_source_table );
execute format('with mm as ( delete from %I.%I where id = any (%L) returning * ) insert into %I.%I table mm'
,_source_schema,_source_table,_unique_key, _target_schema, _target_table);
GET DIAGNOSTICS _row_count = ROW_COUNT;
RAISE notice 'rows influenced, %',_row_count;
end
$$;
--
if your _unique_key is not that much, this solution also create a table for you. Obviously you need to create the target schema yourself.
If your unique_key is too much, you can customize to properly rename the dumped table.
Let's call it.
call pg_temp.p_archive_dump('s1','t1', '{1,2}','s2');
s1 is the source schema, t1 is source table, {1,2} is the unique key you want to extract to the new table. s2 is the target schema

Postgres DROP TABLE using DO DECLARE and EXECUTE

I'm trying to drop tables returned from a query using EXECUTE. Here's an example:
CREATE TABLE test_a (id BIGINT);
CREATE TABLE test_b (id BIGINT);
DO
$f$
DECLARE command TEXT;
BEGIN
SELECT INTO command 'SELECT ARRAY_TO_STRING(ARRAY_AGG($$DROP TABLE $$ || table_name), $$;$$) FROM information_schema.tables WHERE table_name ILIKE $$test%$$';
EXECUTE command;
END;
$f$;
The SELECT statement returns "DROP TABLE test_a; DROP TABLE test_b", which I'm passing into the declared variable and trying to run using EXECUTE, but to no effect. What am I doing wrong?
PostgreSQL 9.5.18, compiled by Visual C++ build 1800, 64-bit
You are storing the string SELECT ARRAY_TO_STRING ... in that variable, not the result of the SELECT statement.
You can also simplify ARRAY_TO_STRING(ARRAY_AGG(..)) to string_agg() and it's highly recommended to use format() to generate dynamic SQL, to properly deal with identifiers that need quoting.
Use the following:
DO
$f$
DECLARE
command TEXT;
BEGIN
SELECT string_agg(format('drop table %I', table_name), ';')
INTO command
FROM information_schema.tables
WHERE table_name ILIKE 'test%';
execute command;
END;
$f$;

Execute a SELECT with dynamic ORDER BY expression inside a function

I'm trying to EXECUTE some SELECTs to use inside a function, my code is something like this:
DECLARE
result_one record;
BEGIN
EXECUTE 'WITH Q1 AS
(
SELECT id
FROM table_two
INNER JOINs, WHERE, etc, ORDER BY... DESC
)
SELECT Q1.id
FROM Q1
WHERE, ORDER BY...DESC';
RETURN final_result;
END;
I know how to do it in MySQL, but in PostgreSQL I'm failing. What should I change or how should I do it?
For a function to be able to return multiple rows it has to be declared as returns table() (or returns setof)
And to actually return a result from within a PL/pgSQL function you need to use return query (as documented in the manual)
To build dynamic SQL in Postgres it is highly recommended to use the format() function to properly deal with identifiers (and to make the source easier to read).
So you need something like:
create or replace function get_data(p_sort_column text)
returns table (id integer)
as
$$
begin
return query execute
format(
'with q1 as (
select id
from table_two
join table_three on ...
)
select q1.id
from q1
order by %I desc', p_sort_column);
end;
$$
language plpgsql;
Note that the order by inside the CTE is pretty much useless if you are sorting the final query unless you use a LIMIT or distinct on () inside the query.
You can make your life even easier if you use another level of dollar quoting for the dynamic SQL:
create or replace function get_data(p_sort_column text)
returns table (id integer)
as
$$
begin
return query execute
format(
$query$
with q1 as (
select id
from table_two
join table_three on ...
)
select q1.id
from q1
order by %I desc
$query$, p_sort_column);
end;
$$
language plpgsql;
What a_horse said. And:
How to return result of a SELECT inside a function in PostgreSQL?
Plus, to pick a column for ORDER BY dynamically, you have to add that column to the SELECT list of your CTE, which leads to complications if the column can be duplicated (like with passing 'id') ...
Better yet, remove the CTE entirely. There is nothing in your question to warrant its use anyway. (Only use CTEs when needed in Postgres, they are typically slower than equivalent subqueries or simple queries.)
CREATE OR REPLACE FUNCTION get_data(p_sort_column text)
RETURNS TABLE (id integer) AS
$func$
BEGIN
RETURN QUERY EXECUTE format(
$q$
SELECT t2.id -- assuming you meant t2?
FROM table_two t2
JOIN table_three t3 on ...
ORDER BY t2.%I DESC NULL LAST -- see below!
$q$, $1);
END
$func$ LANGUAGE plpgsql;
I appended NULLS LAST - you'll probably want that, too:
PostgreSQL sort by datetime asc, null first?
If p_sort_column is from the same table all the time, hard-code that table name / alias in the ORDER BY clause. Else, pass the table name / alias separately and auto-quote them separately to be safe:
Define table and column names as arguments in a plpgsql function?
I suggest to table-qualify all column names in a bigger query with multiple joins (t2.id not just id). Avoids various kinds of surprising results / confusion / abuse.
And you may want to schema-qualify your table names (myschema.table_two) to avoid similar troubles when calling the function with a different search_path:
How does the search_path influence identifier resolution and the "current schema"

truncate function doesnt work in postgres

I have created the following function to truncate bunch of tables starting with "irm_gtresult". There are no syntax errors in my function, but the function doesn't truncate the tables when I run it. What could be wrong here?
My Postgres db version is 8.4.
create or replace function br()
RETURNS void
LANGUAGE plpgsql
AS
$$
DECLARE
row text;
BEGIN
FOR row IN
select table_name from information_schema.tables where table_name ILIKE 'irm_gtresult%'
LOOP
EXECUTE 'TRUNCATE TABLE ' || row;
END LOOP;
END;
$$;
Call:
select br();
Your code is valid. I tested and it works for me in Postgres 9.4.
Using the outdated and unsupported version 8.4 (like you added) may be the problem. The version is just too old, consider upgrading to a current version.
However, I have a couple of suggestions:
Don't use key word row as variable name.
You don't need to loop, you can TRUNCATE all tables in a single command. Faster, shorter.
You may need to add CASCADE if there are dependencies. Be aware of the effect.
CREATE OR REPLACE FUNCTION br()
RETURNS void AS
$func$
BEGIN
EXECUTE (
SELECT 'TRUNCATE TABLE '
|| string_agg(format('%I.%I', schemaname, tablename), ',')
|| ' CASCADE'
FROM pg_tables t
WHERE tablename ILIKE 'irm_gtresult%'
AND schemaname = 'public'
-- AND tableowner = 'postgres' -- optionaly restrict to one user
);
END
$func$ LANGUAGE plpgsql;
Call:
SELECT br();
I am using the view pg_tables from the system catalog. You can as well use information_schema.tables like you did. Note the subtle differences:
How to check if a table exists in a given schema
Related answers with more explanation:
Can I truncate tables dynamically?
Truncating all tables in a Postgres database
To truncate in postgres you just have to use the TRUNC() function.
Example:
SELECT TRUNC(price, 0) AS truncated_price
FROM product;