PostgreSQL TEMP table alternating between exist and not exist - postgresql

I'm using PostgreSQL 9.6.2, with Toad client on Mac. Auto-commit is set to ON.
I first created a simple temp table like this:
CREATE TEMP TABLE demo_pairs
AS
WITH t (name, value) AS (VALUES ('a', 'b'), ('c', 'd'))
SELECT * FROM t;
Then something weird happened when I ran:
SELECT * FROM demo_pairs;
Every time I run the select (without re-running the create), it alternates between successfully returning the values and failing with a "table does not exist" error!
Can anyone help me understand what's going on?

https://www.postgresql.org/docs/current/static/sql-createtable.html
TEMPORARY or TEMP
If specified, the table is created as a temporary table. Temporary
tables are automatically dropped at the end of a session, or
optionally at the end of the current transaction (see ON COMMIT
below). Existing permanent tables with the same name are not visible
to the current session while the temporary table exists, unless they
are referenced with schema-qualified names. Any indexes created on a
temporary table are automatically temporary as well.
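If you use a session pooler that can close the session for you, or the session gets closed some other way (e.g. by a network problem), the temp table will be dropped. A pooler that hands your statements to different sessions would also explain the alternation: only the session that ran the CREATE can see the table.
You can also create the table so that it is dropped at the end of the transaction: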
ON COMMIT
The behavior of temporary tables at the end of a transaction block can
be controlled using ON COMMIT. The three options are:
PRESERVE ROWS
No special action is taken at the ends of transactions. This is the
default behavior.
DELETE ROWS
All rows in the temporary table will be deleted at the end of each
transaction block. Essentially, an automatic TRUNCATE is done at each
commit.
DROP
The temporary table will be dropped at the end of the current transaction block.
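To make the pooling explanation concrete, here is a minimal sketch in Python. It rests on stated assumptions: SQLite (from the standard library) stands in for PostgreSQL, because both scope a temp table to the connection that created it, and the two-connection round-robin "pool" and the function name alternate_selects are hypothetical stand-ins for whatever the client does behind the scenes. It is not a description of how Toad actually manages connections.

```python
import os
import sqlite3
import tempfile
from itertools import cycle


def alternate_selects(n_selects=4):
    """Run CREATE TEMP TABLE once, then SELECT through a round-robin pool."""
    outcomes = []
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        # Hypothetical pool: two sessions on the same database, used in turn.
        pool = [sqlite3.connect(db_path), sqlite3.connect(db_path)]
        sessions = cycle(pool)
        try:
            # The CREATE runs on whichever session the pool hands out first.
            next(sessions).execute(
                "CREATE TEMP TABLE demo_pairs AS "
                "SELECT 'a' AS name, 'b' AS value UNION ALL SELECT 'c', 'd'"
            )
            for _ in range(n_selects):
                session = next(sessions)
                try:
                    rows = session.execute("SELECT * FROM demo_pairs").fetchall()
                    outcomes.append(f"ok: {len(rows)} rows")
                except sqlite3.OperationalError as exc:
                    # The other session never saw the temp table.
                    outcomes.append(f"error: {exc}")
        finally:
            for session in pool:
                session.close()
    return outcomes


results = alternate_selects()
for line in results:
    print(line)
```

In this sketch the SELECTs alternate between failing and returning two rows, because every second statement lands on the session that did not create the table.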

Related

Cannot drop priorly modified new table in execute block

I'm not well acquainted with FB database and its subtleties.
When I execute the script, the following problem occurs:
EXECUTE ibeblock
AS
BEGIN
-- 1. Create temporary table
execute statement 'recreate GLOBAL TEMPORARY table TMPTBL (ID bigint) /*on commit delete rows*/;';
commit;
-- 2. dummy fill of temporary table
insert into tmptbl (ID)
values (0xFE);
commit; -- not necessary
-- 3. perform some actions...
-- 4. Delete temporary table
execute statement 'drop table TMPTBL;';
commit; -- FAILURE!
END
The idea of the script is simple: 1) create a temporary table; 2) fill it with records; 3) perform actions on other DB objects using the populated records; 4) drop the temp table.
For the simulation, step 3 is irrelevant (skipped). Step 4 leads to an error on commit: "This operation is not defined for system tables. unsuccessful metadata update. object TABLE "TMPTBL" is in use.".
Neither triggers nor constraints are defined on the table. As far as I can tell, nothing should be locking the temp table.
Please help me resolve this. Hopefully I just missed something.
P.S.: FB 2.5, IBExpert 2017.12.13.1 used as DB managing tool
There are a number of problems with your code:
A global temporary table is intended as a permanent object; only its content is temporary (for the duration of either the transaction or the connection). So normally you would create a global temporary table once, and not drop it, but instead reuse its definition.
Although you technically can execute DDL using execute statement, you are not supposed to, and it is not guaranteed to work. Your code is specifically an example of one of the things that will not work.
The problem here is that you are trying to drop the table in the same transaction that used it (though to be honest, I'm surprised the insert even worked, because normally you can't insert into a table that was created in the same transaction).
The insert you executed on TMPTBL will mark the table in use, and given the transaction isn't committed yet, you can't drop the table: it is in use.
You shouldn't call commit in PSQL code (to be honest, I thought this wasn't even possible).
In short, you need to rethink how you use global temporary tables: define it once, and do not use execute statement to create it, but create it separately.
If you do want to create and drop it and not retain the definition of the global temporary table, then create it before the execute block, commit, then the execute block (with only the inserts and the 'perform some actions'), commit, and then drop it (and commit).
Alternatively, you might get away with executing the create using execute statement ... with autonomous transaction, the inserts and the 'perform some actions' in another execute statement ... with autonomous transaction, and finally the drop in yet another execute statement ... with autonomous transaction. However, that makes your code very brittle, and it is not a recommended approach.
I have been forced again by the devops guys to find a robust solution for DB structure upgrades. Requirements: safely combine DDL and DML statements; be able to create temporary tables (for heavy selections); leave no garbage. Of course, the upgrade is handled within a single connection.
Following the clues given by Mark, I dug deeper and ran lots of experiments.
Here is a template script file that really worked (run with the native isql utility):
SET TERM #;
-- 1. Create temporary table
EXECUTE BLOCK
AS
BEGIN
execute statement 'recreate GLOBAL TEMPORARY table TMPTBL (ID bigint) /*on commit preserve rows*/;';
END#
commit#
-- Data manipulations
EXECUTE BLOCK
AS
declare xid bigint;
BEGIN
-- 2. dummy fill of temporary table
begin
insert into TMPTBL (ID) values (0xFE);
end
-- 3. perform some actions...
for
select tt.ID
from TMPTBL tt
into :xid
do
begin
-- use :xid var
end
END#
commit#
-- 4. Delete temporary table
EXECUTE BLOCK
AS
BEGIN
execute statement 'drop table TMPTBL;';
END#
commit#
SET TERM ;#
Might be useful for someone.
Damn, Firebird does drive me crazy!

Create non conflicting temporary tables in a Pl/pgSQL function

I want to create a TEMPORARY TABLE in a Pl/pgSQL function because I want to index it before doing some process. The fact that any concurrent call to the function will try to reuse the same table seems to be a problem.
e.g. A first call to the function creates and uses a temporary table named "test" with data depending on the function parameters. A second concurrent call tries also to create and use the temporary table with the same name but with different data...
The doc says
"Temporary tables are automatically dropped at the end of a session,
or optionally at the end of the current transaction"
I guess the problem would not exist if temporary tables created with the "ON COMMIT DROP" option would only be visible to the current transaction. Is this the case?
If not, how to automatically create independent tables from two different function calls?
I could probably try to create a temporary name and check if a table with this name already exists but that seems like a lot of management to me...
Temporary tables of distinct sessions cannot conflict because each session has a dedicated temporary schema, only visible to the current session.
In current Postgres only one transaction runs inside the same session at a time. So only two successive calls in the same session can see the same temporary objects. ON COMMIT DROP, like you found, limits the lifespan of temp tables to the current transaction, avoiding conflicts with other transactions.
If you (can) have temp tables that don't die with the transaction (like if you want to keep using some of those tables after the end of the current transaction), then an alternative approach would be to truncate instead of create if the temp table already exists - which is a bit cheaper, too.
Wrapped into a function:
CREATE OR REPLACE FUNCTION f_create_or_trunc_temp_table(_tbl text, OUT _result "char") AS
$func$
BEGIN
SELECT INTO _result relkind
FROM pg_catalog.pg_class
WHERE relnamespace = pg_my_temp_schema() -- only temp objects!
AND relname = _tbl;
IF NOT FOUND THEN -- not found
EXECUTE format('CREATE TEMP TABLE %I(id int)', _tbl);
ELSIF _result = 'r' THEN -- table exists
EXECUTE format('TRUNCATE TABLE %I', _tbl); -- assuming identical table definition
ELSE -- other temp object occupies name
RAISE EXCEPTION 'Other temp object of type >>%<< occupies name >>%<<', _result, _tbl;
-- or do nothing, return more info or raise a warning / notice instead of an exception
END IF;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT f_create_or_trunc_temp_table('my_tbl');
This assumes identical table definition if the table exists. You might do more and also return more informative messages, etc. This is just the basic concept.
Related:
How can I determine if a table exists in the current search_path with PLPGSQL?
How to check if a table exists in a given schema
Temporary tables are visible only in the current session. Concurrent processes do not see each other's temporary tables even when they share the same names. Per the documentation:
PostgreSQL requires each session to issue its own CREATE TEMPORARY TABLE command for each temporary table to be used. This allows different sessions to use the same temporary table name for different purposes (...)

Best practices for performing a table swap in Redshift

We're in the process of running a handful of hourly scripts on our Redshift cluster which build summary tables for data consumers. After assembling a staging table, the script then runs a transaction which deletes the existing table and replaces it with the staging table, as such:
BEGIN;
DROP TABLE IF EXISTS public.data_facts;
ALTER TABLE public.data_facts_stage RENAME TO data_facts;
COMMIT;
The problem with this operation is that long-running analysis queries will place an AccessShareLock on public.data_facts, preventing it from being dropped and thrashing our ETL cycle. I'm thinking a better solution would be one which renames the existing table, as such:
ALTER TABLE public.data_facts RENAME TO data_facts_old;
ALTER TABLE public.data_facts_stage RENAME TO data_facts;
DROP TABLE public.data_facts_old;
However, this approach presupposes that 1) public.data_facts exists, and 2) public.data_facts_old does not exist.
Do you know if there's a way to conduct this operation safely in SQL, without relying on application logic? (eg. something like ALTER TABLE IF EXISTS).
I haven't tried it but looking at the documentation of CREATE VIEW it seems that this can be done with late-binding views.
The main idea would be a view public.data_facts that users interact with. Behind the scenes, you can load new data and then swap the view to “point” to the new table.
Bootstrap
-- load data into public.data_facts_v0
CREATE VIEW public.data_facts AS
SELECT * from public.data_facts_v0 WITH NO SCHEMA BINDING;
Update
-- load data into public.data_facts_v1
CREATE OR REPLACE VIEW public.data_facts AS
SELECT * from public.data_facts_v1 WITH NO SCHEMA BINDING;
DROP TABLE public.data_facts_v0;
The WITH NO SCHEMA BINDING means the view will be late-binding. “A late-binding view doesn't check the underlying database objects, such as tables and other views, until the view is queried.” This means the update can even introduce a table with renamed columns or a completely new structure.
Notes:
It might be a good idea to wrap the swap operations into a transaction to make sure we don't drop the previous table if the VIEW swap failed.
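The transactional swap from the note above can be sketched in Python. Assumptions to keep in mind: SQLite (standard library) stands in for Redshift, and it supports neither CREATE OR REPLACE VIEW nor WITH NO SCHEMA BINDING, so this sketch drops and recreates the view instead. The helper name swap_view is hypothetical. The point is only the shape of the transaction: if any step fails, the view keeps pointing at the previous table.

```python
import sqlite3


def swap_view(conn, new_table, old_table):
    """Repoint the data_facts view at new_table and drop old_table atomically."""
    conn.execute("BEGIN")
    try:
        conn.execute("DROP VIEW IF EXISTS data_facts")
        # Table names are trusted identifiers here, not user input.
        conn.execute(f"CREATE VIEW data_facts AS SELECT * FROM {new_table}")
        conn.execute(f"DROP TABLE {old_table}")
        conn.execute("COMMIT")
    except sqlite3.Error:
        conn.execute("ROLLBACK")  # the view and the old table are restored
        raise


# isolation_level=None: transactions are controlled explicitly with BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)

# Bootstrap: version 0 of the data plus the view that consumers query.
conn.execute("CREATE TABLE data_facts_v0 (fact TEXT)")
conn.execute("INSERT INTO data_facts_v0 VALUES ('old')")
conn.execute("CREATE VIEW data_facts AS SELECT * FROM data_facts_v0")

# Update: load version 1, then swap.
conn.execute("CREATE TABLE data_facts_v1 (fact TEXT)")
conn.execute("INSERT INTO data_facts_v1 VALUES ('new')")
swap_view(conn, "data_facts_v1", "data_facts_v0")
after_swap = conn.execute("SELECT fact FROM data_facts").fetchall()

# A failing swap (the table to drop does not exist) must leave the view intact.
try:
    swap_view(conn, "data_facts_v1", "missing_table")
    failed = False
except sqlite3.OperationalError:
    failed = True
after_failed_swap = conn.execute("SELECT fact FROM data_facts").fetchall()
```

On Redshift the equivalent would be the CREATE OR REPLACE VIEW and DROP TABLE statements shown earlier, placed between BEGIN and COMMIT.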
You can add a new column, load_time timestamp encode runlength default getdate(), to your target table, and make your ETL do this:
INSERT INTO public.data_facts
SELECT * FROM public.data_facts_staging;
DELETE FROM public.data_facts
WHERE load_time<(select max(load_time) from public.data_facts);
DROP TABLE public.data_facts_staging;
note: public.data_facts_staging should have exactly the same structure as public.data_facts except that the last column of public.data_facts is load_time, so that on insert it will be populated with the current timestamp.
The only implication is that it requires extra disk space for the moment between inserting the new rows and deleting the old ones, and load_time always has to be the last column. You also have to vacuum the table every time you do this.
Another good thing about this is that if your ETL fails and staging table is empty or there is no staging table you won't lose your data. In the pure SQL scenario of swapping tables with DDL you're not protected from dropping the target table when staging table is missing. In the suggested scenario if no new rows are inserted the delete statement deletes nothing (there are no rows less than max load time), so worst case is just having the old version of data.
p.s. there is a command that instead of insert ... select ... just changes the pointer from staging to target table (alter table ... append from ...) but it requires the same type of lock as alter table I guess, so I don't suggest this
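The insert-then-delete idea can be tried out with a small Python sketch. Assumptions: SQLite (standard library) stands in for Redshift, load_time is passed in explicitly as an integer instead of defaulting to getdate(), and the function name refresh and the sample values are hypothetical. It illustrates the claim that an empty staging load leaves the previous data untouched.

```python
import sqlite3


def refresh(conn, staged_facts, load_time):
    """Insert staged rows under a new load_time, then delete all older rows."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.executemany(
            "INSERT INTO data_facts (fact, load_time) VALUES (?, ?)",
            [(fact, load_time) for fact in staged_facts],
        )
        # Rows from earlier loads are strictly older than the newest load_time.
        conn.execute(
            "DELETE FROM data_facts "
            "WHERE load_time < (SELECT max(load_time) FROM data_facts)"
        )
    cursor = conn.execute("SELECT fact FROM data_facts ORDER BY fact")
    return [row[0] for row in cursor]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_facts (fact TEXT, load_time INTEGER)")

first = refresh(conn, ["a", "b"], load_time=1)  # initial load
second = refresh(conn, ["c"], load_time=2)      # replaces the first load
third = refresh(conn, [], load_time=3)          # failed ETL: nothing staged
```

After the third call the table still holds the rows from the second load, since no newer load_time exists and the DELETE therefore matches nothing.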

How do I know which temp table to delete if multiple stored procedures are creating temp tables with the same name?

I have been trying to figure this out the ENTIRE DAY :( ...
I have several stored procedures (in the same database as well as different databases) that do the same thing.
Creates temp table with name X.
Does processing with X.
Drops X.
The problem is that these stored procedures are creating temp tables with the same name. How do I know which temp table to drop once I'm done with the processing, if they all have the same name and I can't really DROP using "LIKE" because a temp table might be in use by a different stored procedure?
Here's a scenario.
SP1 starts -
Create temp table.
...and before it goes on, this happens:
SP2 is about to finish -
Drop temp table.
If the above happens, SP1 runs into issues. Such as "temp table does not exist"
How do I bypass this issue?
When I go to drop a temp table, I need to make sure I'm dropping the table related to the stored procedure that created it. Is this even possible?
You are trying to solve a problem you don't have. Just drop the table. If you look in SSMS you will really have unique tables. The SP knows which one to drop.
If SP1 and SP2 were using the same table you would have more problems than just drop.
IF OBJECT_ID(N'tempdb..#Temp', N'U') IS NOT NULL DROP TABLE #Temp
CREATE TABLE #Temp (sID INT PRIMARY KEY CLUSTERED);
-- look in SSMS and you will see #Temp with a unique suffix appended
-- use #temp
IF OBJECT_ID(N'tempdb..#Temp', N'U') IS NOT NULL DROP TABLE #Temp
In an SP I'm not sure you even need the drop. I think the table will be dropped automatically.
But if you run the first two lines and look in SSMS you will see that you have your own #TEMP - not a shared #TEMP. Run the last line and you will see it go away.

Why doesn't this rule prevent duplicate key violations?

(postgresql) I was trying to COPY csv data into a table but I was getting duplicate key violation errors, and there's no way to tell COPY to ignore those, so following internet wisdom I tried adding this rule:
CREATE OR REPLACE RULE ignore_duplicate_inserts AS
ON INSERT TO mytable
WHERE (EXISTS ( SELECT mytable.id
FROM mytable
WHERE mytable.id = new.id)) DO NOTHING;
to circumvent the problem, but I still get those errors - any ideas why ?
Rules by default add things to the current action:
Roughly speaking, a rule causes additional commands to be executed when a given command on a given table is executed.
But an INSTEAD rule allows you to replace the action:
Alternatively, an INSTEAD rule can replace a given command by another, or cause a command not to be executed at all.
So, I think you want to specify INSTEAD:
CREATE OR REPLACE RULE ignore_duplicate_inserts AS
ON INSERT TO mytable
WHERE (EXISTS ( SELECT mytable.id
FROM mytable
WHERE mytable.id = new.id)) DO INSTEAD NOTHING;
Without the INSTEAD, your rule is essentially saying "do the INSERT and then do nothing" when you want to say "instead of the INSERT, do nothing" and, AFAIK, the DO INSTEAD NOTHING will do that.
I'm not an expert on PostgreSQL rules but I think adding the "INSTEAD" should work.
UPDATE: Thanks to araqnid we know that:
COPY FROM will invoke any triggers and check constraints on the destination table. However, it will not invoke rules
So a rule isn't going to work in this situation. However, triggers are fired during COPY FROM so you could write a BEFORE INSERT trigger that would return NULL when it detected duplicate rows:
It can return NULL to skip the operation for the current row. This instructs the executor to not perform the row-level operation that invoked the trigger (the insertion or modification of a particular table row).
That said, I think araqnid's "load it all into a temporary table, clean it up, and copy it to the final destination" approach would be a more sensible solution for a bulk loading operation like yours.
COPY FROM will not invoke rules (http://www.postgresql.org/docs/9.0/interactive/sql-copy.html#AEN58860)
My approach would be to load the CSV data into a temp table, then use an INSERT...SELECT statement to copy the data into the target table where it doesn't already exist. (If there are duplicates in the CSV data itself, remove those from the temp table first). Something like:
BEGIN;
CREATE TEMP TABLE stage_data(key_column, data_columns...) ON COMMIT DROP;
\copy stage_data from data.csv with csv header
-- prevent any other updates while we are merging input (omit this if you don't need it)
LOCK target_data IN SHARE ROW EXCLUSIVE MODE;
-- insert into target table
INSERT INTO target_data(key_column, data_columns...)
SELECT key_column, data_columns...
FROM stage_data
WHERE NOT EXISTS (SELECT 1 FROM target_data
WHERE target_data.key_column = stage_data.key_column);
END;
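For anyone who wants to try the stage-then-merge pattern without a PostgreSQL server, here is a rough Python sketch. Assumptions: SQLite (standard library) stands in for PostgreSQL, a plain list of tuples stands in for the CSV file loaded with \copy, there is a single hypothetical data_column, the staged rows contain no duplicates among themselves, and the LOCK step is omitted. The function name merge_new_rows is also hypothetical.

```python
import sqlite3


def merge_new_rows(conn, csv_rows):
    """Stage rows in a temp table, then insert only keys missing from the target."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "CREATE TEMP TABLE stage_data (key_column INTEGER, data_column TEXT)"
        )
        conn.executemany("INSERT INTO stage_data VALUES (?, ?)", csv_rows)
        conn.execute(
            "INSERT INTO target_data (key_column, data_column) "
            "SELECT key_column, data_column FROM stage_data "
            "WHERE NOT EXISTS (SELECT 1 FROM target_data "
            "WHERE target_data.key_column = stage_data.key_column)"
        )
        # SQLite has no ON COMMIT DROP, so the staging table is dropped by hand.
        conn.execute("DROP TABLE stage_data")
    cursor = conn.execute(
        "SELECT key_column, data_column FROM target_data ORDER BY key_column"
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE target_data (key_column INTEGER PRIMARY KEY, data_column TEXT)"
)
conn.execute("INSERT INTO target_data VALUES (1, 'existing')")

# Key 1 already exists in the target and is skipped instead of raising an error.
merged = merge_new_rows(conn, [(1, "duplicate"), (2, "new"), (3, "new")])
```

The existing row for key 1 is kept as it was, and only keys 2 and 3 are added.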