PostgreSQL (9.4) temporary table scope

After some aggravation, I found (IMO) odd behavior when a function calls another. If the outer function creates a temporary table, and the inner function creates a temporary table with the same name, the inner function "wins." Is this intended? FWIW, I am proficient at SQL Server, and temporary tables do not act this way there. Temporary tables (#temp or ##temp) are scoped to the function. So, an equivalent function (SQL Server stored procedure) would return "7890," not "1234."
drop function if exists inner_function();
drop function if exists outer_function();
create function inner_function()
returns integer
as
$$
begin
    drop table if exists tempTable;
    create temporary table tempTable (
        inner_id int
    );
    insert into tempTable (inner_id) values (1234);
    return 56;
end;
$$
language plpgsql;
create function outer_function()
returns table (
    return_id integer
)
as
$$
declare intReturn integer;
begin
    drop table if exists tempTable; -- note that inner_function() also declares tempTable
    create temporary table tempTable (
        outer_id integer
    );
    insert into tempTable (outer_id) values (7890);
    intReturn = inner_function(); -- the inner_function() function recreates tempTable
    return query
        select * from tempTable; -- returns "1234", not "7890" like I expected
end;
$$
language plpgsql;
select * from outer_function(); -- returns "1234", not "7890" like I expected

There is no problem with this behaviour; in PostgreSQL a temp table can have one of two scopes:
- session (the default)
- transaction
To use the transaction scope, add ON COMMIT DROP at the end of the CREATE TEMP TABLE statement, e.g.:
CREATE TEMP TABLE foo(bar INT) ON COMMIT DROP;
Either way, your two functions execute in one transaction, so when you call inner_function() from outer_function() you are still in the same session; PostgreSQL detects that "tempTable" already exists, drops it in inner_function(), and creates it again...
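A quick illustration of the transaction scope (my own example, not part of the original answer): with ON COMMIT DROP the table vanishes at commit, while a default temp table survives until the session ends.

begin;
create temp table foo(bar int) on commit drop;
insert into foo values (1);
commit;            -- foo is dropped here
select * from foo; -- ERROR: relation "foo" does not exist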

Is this intended?
Yes, these are tables in the database, similar to permanent tables.
They exist in a special schema, and are automatically dropped at the end of a session or transaction. If you create a temporary table with the same name as a permanent table, then you must prefix the permanent table with its schema name to reference it while the temporary table exists.
If you want to emulate the SQL Server behavior, then you might consider using a distinct prefix for each function's temporary tables, as sketched below.
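A minimal sketch of that prefix convention (the names tmp_inner and tmp_outer are made up here): each function works with a name only it touches, so nested calls cannot collide.

-- inside inner_function():
drop table if exists tmp_inner;
create temporary table tmp_inner (inner_id int);
insert into tmp_inner (inner_id) values (1234);

-- inside outer_function():
drop table if exists tmp_outer;
create temporary table tmp_outer (outer_id integer);
insert into tmp_outer (outer_id) values (7890);
-- select * from tmp_outer now returns 7890, regardless of what inner_function() does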

Related

Declare and return value for DELETE and INSERT

I am trying to remove duplicated data from some of our databases based upon unique id's. All deleted data should be stored in a separate table for auditing purposes. Since it concerns quite some databases and different schemas and tables I wanted to start using variables to reduce chance of errors and the amount of work it will take me.
This is the best example query I could think of, but it doesn't work:
do $$
declare #source_schema varchar := 'my_source_schema';
declare #source_table varchar := 'my_source_table';
declare #target_table varchar := 'my_target_schema' || source_table || '_duplicates'; --target schema and appendix are always the same, source_table is a variable input.
declare #unique_keys varchar := ('1', '2', '3')
begin
select into #target_table
from #source_schema.#source_table
where id in (#unique_keys);
delete from #source_schema.#source_table where export_id in (#unique_keys);
end ;
$$;
The query syntax works with hard-coded values.
Most of the time my variables are interpreted as column names, or not recognized at all. :(
You need to create and then call a plpgsql procedure with input parameters:
CREATE OR REPLACE PROCEDURE duplicates_suppress
(my_target_schema text, my_source_schema text, my_source_table text, unique_keys text[])
LANGUAGE plpgsql AS
$$
BEGIN
EXECUTE FORMAT(
'WITH list AS (INSERT INTO %1$I.%3$I_duplicates SELECT * FROM %2$I.%3$I WHERE array[id] <@ %4$L :: integer[] RETURNING id)
DELETE FROM %2$I.%3$I AS t USING list AS l WHERE t.id = l.id', my_target_schema, my_source_schema, my_source_table, unique_keys :: text) ;
END ;
$$ ;
The procedure duplicates_suppress inserts into my_target_schema.my_source_table || '_duplicates' the rows from my_source_schema.my_source_table whose id is in the array unique_keys, and then deletes these rows from my_source_schema.my_source_table.
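A call could then look like this (the names are the placeholders from the question):

CALL duplicates_suppress('my_target_schema', 'my_source_schema', 'my_source_table', ARRAY['1','2','3']);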
See the test result in dbfiddle.
As has been commented, you need some kind of dynamic SQL, in a FUNCTION, PROCEDURE, or DO statement, to do it on the server.
You should be comfortable with PL/pgSQL. Dynamic SQL is no beginners' toy.
Example with a PROCEDURE, like Edouard already suggested. You'll need a FUNCTION instead to wrap it in an outer transaction (like you very well might). See:
When to use stored procedure / user-defined function?
CREATE OR REPLACE PROCEDURE pg_temp.f_archive_dupes(_source_schema text, _source_table text, _unique_keys int[], OUT _row_count int)
LANGUAGE plpgsql AS
$proc$
-- target schema and appendix are always the same, source_table is a variable input
DECLARE
_target_schema CONSTANT text := 's2'; -- hardcoded
_target_table text := _source_table || '_duplicates';
_sql text := format(
'WITH del AS (
DELETE FROM %I.%I
WHERE id = ANY($1)
RETURNING *
)
INSERT INTO %I.%I TABLE del', _source_schema, _source_table
, _target_schema, _target_table);
BEGIN
RAISE NOTICE '%', _sql; -- debug
EXECUTE _sql USING _unique_keys; -- execute
GET DIAGNOSTICS _row_count = ROW_COUNT;
END
$proc$;
Call:
CALL pg_temp.f_archive_dupes('s1', 't1', '{1, 3}', 0);
db<>fiddle here
I made the procedure temporary, since I assume you don't need to keep it permanently. Create it once per session. See:
How to create a temporary function in PostgreSQL?
Passed schema and table names are case-sensitive strings! (Unlike unquoted identifiers in plain SQL.) Either way, be wary of SQL-injection when concatenating SQL dynamically. See:
Are PostgreSQL column names case-sensitive?
Table name as a PostgreSQL function parameter
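As a small illustration of both points (my own example, not from the answer): format() with %I double-quotes an identifier only when needed, which is also what protects against SQL injection.

SELECT format('SELECT * FROM %I.%I', 'MySchema', 'my_table');
-- result: SELECT * FROM "MySchema".my_table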
Made _unique_keys type int[] (array of integer) since your sample values look like integers. Use the actual data type of your id columns!
The variable _sql holds the query string, so it can easily be inspected before actually executing; RAISE NOTICE '%', _sql; serves that purpose.
I suggest commenting out the EXECUTE line until you are sure.
I made the PROCEDURE return the number of processed rows. You didn't ask for that, but it's typically convenient. At hardly any cost. See:
Dynamic SQL (EXECUTE) as condition for IF statement
Best way to get result count before LIMIT was applied
Last, but not least, use DELETE ... RETURNING * in a data-modifying CTE. Since that has to find the rows only once, it comes at about half the cost of a separate SELECT and DELETE. And it's perfectly safe: if anything goes wrong, the whole transaction is rolled back anyway.
Two separate commands can also run into concurrency issues or race conditions which are ruled out this way, as DELETE implicitly locks the rows to delete. Example:
Replicating data between Postgres DBs
Or you can build the statements in a client program like psql, and use \gexec. Example:
Filter column names from existing table for SQL DDL statement
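For illustration, a hypothetical psql session (reusing the names s1, t1, s2 and the keys {1,3} from above); the SELECT builds the statement as a string, and \gexec executes every string the query returns:

SELECT format(
    'WITH del AS (DELETE FROM %I.%I WHERE id = ANY (%L::int[]) RETURNING *)
     INSERT INTO %I.%I TABLE del'
  , 's1', 't1', '{1,3}', 's2', 't1_duplicates')
\gexec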
Based on Erwin's answer, with a minor optimization...
create or replace procedure pg_temp.p_archive_dump
(_source_schema text, _source_table text,
_unique_key int[],_target_schema text)
language plpgsql as
$$
declare
_row_count bigint;
_target_table text := '';
BEGIN
_target_table := _source_table || '_' || array_to_string(_unique_key, '_');
raise notice 'the deleted records will be stored in %.%', _target_schema, _target_table;
execute format('create table %I.%I as select * from %I.%I limit 0',_target_schema, _target_table,_source_schema,_source_table );
execute format('with mm as ( delete from %I.%I where id = any (%L) returning * ) insert into %I.%I table mm'
,_source_schema,_source_table,_unique_key, _target_schema, _target_table);
GET DIAGNOSTICS _row_count = ROW_COUNT;
RAISE NOTICE 'rows affected: %', _row_count;
end
$$;
If your _unique_key array is small, this solution also creates the target table for you (though you obviously need to create the target schema yourself).
If the array is large, customize the procedure to give the dumped table a more sensible name.
Let's call it.
call pg_temp.p_archive_dump('s1','t1', '{1,2}','s2');
s1 is the source schema, t1 is the source table, and {1,2} is the array of unique keys to extract into the new table; s2 is the target schema.

Automatically DROP FUNCTION when DROP TABLE on POSTGRESQL 11.7

I'm trying to, somehow, trigger an automatic function drop when a table is dropped, and I can't figure out how to do it.
TL;DR: Is there a way to trigger a function drop when a specific table is dropped? (POSTGRESQL 11.7)
Detailed explanation
I'll try to explain my problem using a simplified use case with dummy names.
I have three tables: sensor1, sensor2 and sumSensors;
A FUNCTION (sumdata) was created to INSERT data into the sumSensors table. Inside this function I fetch data from the sensor1 and sensor2 tables and insert their sum into sumSensors;
A trigger was created for each sensor table, which looks like this:
CREATE TRIGGER trig1
AFTER INSERT ON sensor1
FOR EACH ROW EXECUTE
FUNCTION sumdata();
Now, when a new row is inserted into sensor1 OR sensor2, the function sumdata is executed and inserts the sum of the last values from both into sumSensors.
If I wanted to DROP FUNCTION sumdata CASCADE;, the triggers would be automatically removed from tables sensor1 and sensor2. So far everything is fine! But that's not what I want.
My problem is:
Q: And if I just DROP TABLE sumSensors CASCADE;? What would happen to the function which was meant to insert on this table?
A: As expected, since there's no dependency between the sumSensors table and the sumdata function, the function won't be dropped (it still exists)! The same goes for the triggers which use it (they still exist). This means that when a new row is inserted into a sensor table, the function sumdata is executed and fails, so even the INSERT which fired the trigger won't actually be inserted.
Is there a way to trigger a function drop when a specific table is dropped?
Thank you in advance
There is no dependency tracking for functions in PostgreSQL (as of version 12).
You can use event triggers to maintain the dependencies yourself.
Full example follows.
More information: documentation of event triggers feature, support functions.
BEGIN;
CREATE TABLE _testtable ( id serial primary key, payload text );
INSERT INTO _testtable (payload) VALUES ('Test data');
CREATE FUNCTION _testfunc(integer) RETURNS integer
LANGUAGE SQL AS $$ SELECT $1 + count(*)::integer FROM _testtable; $$;
SELECT _testfunc(100);
CREATE FUNCTION trg_drop_dependent_functions()
RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
_dropped record;
BEGIN
FOR _dropped IN
SELECT schema_name, object_name
FROM pg_catalog.pg_event_trigger_dropped_objects()
WHERE object_type = 'table'
LOOP
IF _dropped.schema_name = 'public' AND _dropped.object_name = '_testtable' THEN
EXECUTE 'DROP FUNCTION IF EXISTS _testfunc(integer)';
END IF;
END LOOP;
END;
$$;
CREATE EVENT TRIGGER trg_drop_dependent_functions ON sql_drop
EXECUTE FUNCTION trg_drop_dependent_functions();
DROP TABLE _testtable;
ROLLBACK;

Is there an alternative to temp tables in PostgreSQL to hold and manipulate result sets?

Within a plpgsql function, I need to hold the result set of a select statement and perform many subsequent queries and manipulations over that set.
I read about temp tables in PostgreSQL, but these are visible at session/transaction scope, while I need my table (or whatever data structure holds the result set) to be visible locally and to exist only within the function, so that each function call gets its own copy.
I simply need a table-like structure to hold a SELECT result set within a function call, instead of temp tables.
Is there such thing?
Concurrent sessions may have their own (local) temp tables with the same names. Here is an example of a function which does nothing particularly useful, but creates a temp table, returns its data, and drops the table on exit:
create or replace function my_function()
returns table (id int, str text)
language plpgsql as $$
begin
create temp table my_temp_table as
select i as id, i::text as str
from generate_series(1, 3) i;
return query
select *
from my_temp_table;
drop table my_temp_table;
end $$;
The function may be safely run in concurrent sessions.
drop table if exists... at the beginning of the function is a reasonable alternative. Do not forget to use temp or temporary in create temp table...
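A minimal sketch of that variant, adapting the function above (my adaptation, not from the original answer): the cleanup moves to the start, so each call in a session clears whatever the previous call left behind.

create or replace function my_function2()
returns table (id int, str text)
language plpgsql as $$
begin
    drop table if exists my_temp_table; -- clear any leftover from an earlier call in this session
    create temp table my_temp_table as
        select i as id, i::text as str
        from generate_series(1, 3) i;
    return query
        select * from my_temp_table;
end $$;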

call psql function after table creation

I have .sql file in which I am creating a table and inserting rows.
CREATE TABLE vehicle (
name TEXT PRIMARY KEY,
id INTEGER DEFAULT 0
);
INSERT INTO vehicle values('car',1);
INSERT INTO vehicle values('bus',2);
Instead of executing the INSERT commands directly, I want a function that inserts the above two rows. How can I call the function?
CREATE OR REPLACE FUNCTION insert()
RETURNS VOID AS $$
BEGIN
INSERT INTO vehicle values('car',1);
INSERT INTO vehicle values('bus',2);
RETURN;
END;
$$ LANGUAGE plpgsql;
Why do you need such a function?
Probably, all you need is just to wrap everything into a single transaction block?
begin;
CREATE TABLE vehicle (
name TEXT PRIMARY KEY,
id INTEGER DEFAULT 0
);
INSERT INTO vehicle values('car',1);
INSERT INTO vehicle values('bus',2);
commit;
?
If not, consider using an "anonymous" plpgsql block, without defining a function explicitly -- it's useful when you need to run some code only once:
do $$
begin
INSERT INTO vehicle values('car',1);
INSERT INTO vehicle values('bus',2);
end;
$$ language plpgsql;
Finally, if you do need a function: OK, you defined it correctly, so let's run it. In Postgres, stored procedures are functions, which means they can easily be integrated into regular SQL statements, so let's just use SELECT (in Postgres, SELECT statements are allowed to have no FROM clause):
select insert();

sync two tables after insert

I am using postgresql. I have two schemas, main and sec, each containing one table named datastore with the same structure (this is only an extract).
I am trying, unsuccessfully, to create a trigger to keep both tables in sync when an insert occurs in one of them. The problem is some kind of circular or recursive reference.
Can you provide an example that solves this?
I am working on this; I'll post my solution later.
You can use this code as reference for creating schemas and tables
CREATE SCHEMA main;
CREATE SCHEMA sec;
SET search_path = main, pg_catalog;
CREATE TABLE datastore (
fullname character varying,
age integer
);
SET search_path = sec, pg_catalog;
CREATE TABLE datastore (
fullname character varying,
age integer
);
An updatable view is the best solution and is as simple as (Postgres 9.3+):
drop table sec.datastore;
create view sec.datastore
as select * from main.datastore;
However, if you cannot do that for some inscrutable reason, use the pg_trigger_depth() function (Postgres 9.2+) to ensure that the trigger function is not executed recursively while the rows are being copied across. The trigger on main.datastore may look like this:
create or replace function main.datastore_insert_trigger()
returns trigger language plpgsql as $$
begin
insert into sec.datastore
select new.fullname, new.age;
return new;
end $$;
create trigger datastore_insert_trigger
before insert on main.datastore
for each row when (pg_trigger_depth() = 0)
execute procedure main.datastore_insert_trigger();
The trigger on sec.datastore should be defined analogously.
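Spelled out, the analogous trigger could look like this (same pattern with the directions swapped); the pg_trigger_depth() = 0 guard keeps it from firing when the insert comes from the main.datastore trigger:

create or replace function sec.datastore_insert_trigger()
returns trigger language plpgsql as $$
begin
    insert into main.datastore
    select new.fullname, new.age;
    return new;
end $$;

create trigger datastore_insert_trigger
before insert on sec.datastore
for each row when (pg_trigger_depth() = 0)
execute procedure sec.datastore_insert_trigger();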
create OR REPLACE function copytosec() RETURNS TRIGGER AS $$
BEGIN
insert into sec.datastore(fullname,age) values (NEW.fullname,NEW.age);
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
create trigger copytosectrigger after insert on main.datastore
for each row
execute procedure copytosec();