Dropping of temp tables in PostgreSQL?

Curious as to whether one should drop temp tables that are used strictly within the function they are declared in? I'm new to PostgreSQL and haven't found much information on the topic. I'm aware that in MS SQL this is taken care of automatically, but MS SQL and PostgreSQL certainly have their differences. What do you think is best practice for dropping temp tables declared in functions, if it is necessary at all?

They are somewhat different for MS SQL and Postgres. MS SQL treats local temp tables created in a stored procedure specially and drops them when the procedure completes. Postgres does not currently support GLOBAL temp tables (specifying it in the CREATE statement is ignored):
Optionally, GLOBAL or LOCAL can be written before TEMPORARY or TEMP.
This presently makes no difference in PostgreSQL and is deprecated;
"Best practice" is not really applicable here, I would say. Leaving temp tables around for the duration of the session is OK (they will be dropped at the end). But often you would prefer ON COMMIT DROP, so the table is dropped when the transaction ends rather than when the session ends. While a long-lived session is comparatively harmless for Postgres, a long-lived transaction can be problematic for MVCC, locking and so on, so you might want to look into ways to avoid it.
To summarise: it is common practice to let temp tables persist until the end of the session, and more "normal" to let them persist only until the end of the transaction. Postgres does not treat temp tables created inside a function specially, and it does not have GLOBAL temp tables. Depending on the code you write and the environment you run in, you may want to drop them explicitly or leave them to be dropped automatically. Mind session/transaction pooling particularities here as well.
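A minimal sketch of the ON COMMIT DROP approach inside a function (the function, table and column names here are invented for illustration):

CREATE OR REPLACE FUNCTION report_totals()
RETURNS TABLE (person text, total numeric)
LANGUAGE plpgsql AS
$$
BEGIN
    -- Recreate the temp table; it is dropped automatically when the
    -- surrounding transaction commits, not when the function returns.
    DROP TABLE IF EXISTS tmp_totals;
    CREATE TEMP TABLE tmp_totals ON COMMIT DROP AS
        SELECT s.person, sum(s.amount) AS total
        FROM   sales s
        GROUP  BY s.person;

    RETURN QUERY SELECT t.person, t.total FROM tmp_totals t;
END;
$$;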

Related

How does PostgreSQL lock tables when inserting and selecting?

I'm migrating data from one table to another in an environment where long locks or downtime are not acceptable, about 80,000 rows in total. Essentially the query boils down to this simple case:
INSERT INTO table_2
SELECT * FROM table_1
JOIN table_3 ON table_1.id = table_3.id;
All 3 tables are being read from and could have an insert at any time. I want to just run the query above, but I'm not sure how the locking works and whether the tables will be totally inaccessible during the operation. My understanding tells me that only the affected rows (newly inserted) will be locked. Table 1 is just being selected, so no harm, and concurrent inserts are safe so table 2 should be freely accessible.
Is this understanding correct, and can I run this query in a production environment without fear? If it's not safe, what is the standard way to accomplish this?
You're fine.
If you're interested in the details, you can read up on multiversion concurrency control, or on the details of the Postgres MVCC implementation, or how its various locking modes interact, but the implications for your case are nicely summarised in the docs:
reading never blocks writing and writing never blocks reading
In short, every record stored in the database has some version number attached to it, and every query knows which versions to consider and which to ignore.
This means that an INSERT can safely write to a table without locking it, as any concurrent queries will simply ignore the new rows until the inserting transaction decides to commit.
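A rough illustration of that visibility rule, using the tables from the question in two concurrent sessions:

-- Session A
BEGIN;
INSERT INTO table_2
SELECT table_1.* FROM table_1
JOIN table_3 ON table_1.id = table_3.id;
-- rows written, transaction still open

-- Session B, at the same time
SELECT count(*) FROM table_2;   -- does not block, and does not see A's new rows yet

-- Session A
COMMIT;   -- only now do the new rows become visible to new snapshots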

Online backup blocking truncate table

It's documented that in DB2 the TRUNCATE statement is not compatible with online backup because it gets a Z lock on the table and prevents an online backup from running concurrently.
The lock wait happens when a truncate tries to get a shared lock on an internal online backup object.
Since this is by design in the product, I will have to go for workarounds, so this thread is not about a solution but about why they can't work together. I didn't find a reasonable explanation for why there is such a limitation in DB2.
Any insights?
Thanks,
Luciano Moreira
from http://www.ibm.com/developerworks/data/library/techarticle/dm-0501melnyk/
When a table holds a Z lock, no concurrent application can read or
update data in that table.
So now we know that a Z lock means exclusive access to a table, denying both read and write access to it.
from http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.sql.ref.doc/doc/r0053474.html
Exclusive Access: No other session can have a cursor open on the table, or a lock held on the table (SQLSTATE 25001).
from https://sites.google.com/site/umeshdanderdbms/difference-between-truncate-and-delete
DELETE is a logged operation, whereas TRUNCATE empties the table at the container level.
(Logged operation: DML operations are written to the log (redo log in Oracle, transaction log in DB2, etc.) so that they can be committed or rolled back.)
This is the most interesting part. TRUNCATE just 'forgets' the content of the table, whereas DELETE removes it row by row, processing all triggers, bells and whistles. Consequently, when you truncate, all open reading cursors on the table become invalid. To prevent that kind of mess, you can only completely empty a table when nobody else is accessing it. An online backup obviously needs to read the table, so it is not possible to have both accessing the same table at the same time.
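For contrast, the two statements side by side in DB2 syntax (the table name is made up):

-- Fully logged, fires delete triggers, can be rolled back, row by row:
DELETE FROM some_table;

-- Empties the table at the container level; requires exclusive (Z) access,
-- which is what collides with a concurrent online backup:
TRUNCATE TABLE some_table IMMEDIATE;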

Report migration from SQL Server to Oracle

I have a report in SQL Server and I am migrating it to Oracle.
The approach I used in SQL Server is to load sum(sales) and person for a given month into temporary tables (# tables) and then join that table with other transaction tables to show the details. When it comes to Oracle I am not sure I can use the same method, because hash tables (temporary tables in SQL Server) are specific to a session and so don't cause any problems with the output. Please advise if there is anything in Oracle which is analogous to that.
I came to know that there are global temp tables in Oracle; do they work in the manner I mentioned above? Also,
if a user has no create/drop table privileges, can they still use global temp tables?
Please help me.
You'll have to show some code, or at least some pseudo-code of how your process runs, for anyone to help you. Having said that...
One thing that is different about temporary tables in Oracle compared to other databases is that you do not create them each time you need them. You create them once, and the data in the table is kept either until you commit/rollback (transaction-based) or until you end your session (session-based global temporary tables). Also, the data in a temporary table is visible only to the session that inserted it.
If you are generating the output files once and you don't need the data later, then global temporary tables would probably fit in cleanly, with some minor changes.
Since you do not create the temporary tables each time you use them, you don't need the create/drop privilege. All you'd need is the insert/read privilege. Just read will not help, because you cannot read another session's data anyway, so there is no use for it.
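A minimal sketch of how that might look in Oracle (table, column and user names are invented):

-- Created once, typically by the schema owner or DBA:
CREATE GLOBAL TEMPORARY TABLE monthly_sales_tmp (
    person VARCHAR2(100),
    sales  NUMBER
) ON COMMIT PRESERVE ROWS;   -- session-based; use ON COMMIT DELETE ROWS for transaction-based

GRANT SELECT, INSERT, DELETE ON monthly_sales_tmp TO report_user;

-- Each reporting session then fills and reads its own private copy:
INSERT INTO monthly_sales_tmp (person, sales)
SELECT person, SUM(sales)
FROM   sales_fact
WHERE  sales_month = :given_month
GROUP  BY person;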

Replicating data between Postgres DBs

I have a Postgres DB that is used by a chat application. The chat system often truncates its tables when they grow too big, but I need this data copied to another Postgres database. I will not be truncating the tables in that DB.
How can I configure a few tables in the chat system's database to replicate their data to another Postgres database? Is there a quick way to accomplish this?
Slony can replicate only selected tables, but I'm not sure how it handles truncates, and it can be a pain to configure.
You might also use something like pgpool to send copies of the insert statements to a second database.
You might modify the source of your chat application to do two writes (one to each db) when a new record is created.
You could just write a script in Perl/PHP/Python to read from one and write to another, then fire it by cron so that you're sure it gets run before truncation.
If you only copy a batch of rows every other day, you may be better off with a plain INSERT to a different schema in the same database or a different database in the same database cluster (you need something like dblink for that).
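A rough sketch of the cross-database variant with dblink, run in the target database (connection string, table and column names are placeholders):

-- Requires CREATE EXTENSION dblink; in the database where this runs.
INSERT INTO archive_messages
SELECT *
FROM   dblink('dbname=chat host=localhost',
              'SELECT id, author, body, created_at FROM messages')
       AS t(id int, author text, body text, created_at timestamptz);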
The safest / fastest solution in the same database would be a data-modifying CTE. Something along these lines:
WITH del AS (
   DELETE FROM tbl
   WHERE  <some condition>
   RETURNING *
   )
INSERT INTO backup.tbl
SELECT * FROM del;
For true replication consider these official sources:
https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling
https://www.postgresql.org/docs/current/runtime-config-replication.html

How to prevent Write Ahead Logging on just one table in PostgreSQL?

I am considering log-shipping of Write Ahead Logs (WAL) in PostgreSQL to create a warm-standby database. However I have one table in the database that receives a huge amount of INSERT/DELETEs each day, but which I don't care about protecting the data in it. To reduce the amount of WALs produced I was wondering, is there a way to prevent any activity on one table from being recorded in the WALs?
Ran across this old question, which now has a better answer. Postgres 9.1 introduced "Unlogged Tables", which are tables that don't log their DML changes to WAL. See the docs for more info, but at least now there is a solution for this problem.
See Waiting for 9.1 - UNLOGGED tables by depesz, and the 9.1 docs.
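A minimal sketch of that on 9.1+ (the table name is invented). Note that an unlogged table is truncated after a crash and is not replicated to standbys, so it only suits data you can afford to lose:

CREATE UNLOGGED TABLE high_churn_events (
    id         bigserial PRIMARY KEY,
    payload    text,
    created_at timestamptz DEFAULT now()
);
-- INSERT/DELETE on this table bypass WAL, so it adds almost nothing
-- to the WAL stream being shipped to the warm standby.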
Unfortunately, I don't believe there is. The WAL logging operates on the page level, which is much lower than the table level and doesn't even know which page holds data from which table. In fact, the WAL files don't even know which pages belong to which database.
You might consider moving your high activity table to a completely different instance of PostgreSQL. This seems drastic, but I can't think of another way off the top of my head to avoid having that activity show up in your WAL files.
To offer one option to my own question: there are temp tables - "temporary tables are automatically dropped at the end of a session, or optionally at the end of the current transaction (see ON COMMIT below)" - which I think don't generate WAL. Even so, this might not be ideal, as the table creation and design will have to live in the code.
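For completeness, the kind of thing that would then live in the application code (names invented); temp tables are private to the session and their rows are not WAL-logged:

CREATE TEMP TABLE hot_activity (
    id      bigint,
    payload text
) ON COMMIT DROP;   -- gone at the end of the transaction; its rows never hit the WAL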
I'd consider memcached for use-cases like this. You can even spread the load over a bunch of cheap machines too.