wrapping postgresql commands in a transaction: truncate vs delete or upsert/merge - postgresql

I am using the commands below in PostgreSQL 9.1.3 to move data from a temp staging table to a table used in a web app (GeoServer), all in the same database, and then dropping the temp table.
TRUNCATE table_foo;
INSERT INTO table_foo
SELECT * FROM table_temp;
DROP TABLE table_temp;
I want to wrap this in a transaction to allow for concurrency. The data set is small (fewer than 2000 rows) and truncating is faster than deleting.
What is the best way to run these commands in a transaction?
Is it advisable to create a function, or to write an UPSERT/MERGE etc. in a CTE?
Would it be better to DELETE all rows then bulk INSERT from temp table instead of TRUNCATE?
In Postgres, which allows for a rollback: TRUNCATE or DELETE?
The temp table is delivered daily via an ETL scripted in arcpy. How could I automate the truncate/delete/bulk-insert parts within Postgres?
I am open to using PL/pgSQL or PL/Python (or whichever Python variant is recommended for Postgres).
Currently I am manually executing the SQL commands after the temp staging table is imported into my DB.

Both truncate and delete can be rolled back (which is clearly documented in the manual).
truncate, due to its nature, has some oddities regarding visibility.
See the manual for details: http://www.postgresql.org/docs/current/static/sql-truncate.html (the warning at the bottom)
If your application can live with the fact that table_foo is "empty" during that process, truncate is probably better (again, see the big red box in the manual for an explanation). If you don't want the application to notice, you need to use delete.
To run these statements in a transaction simply put them into one:
begin transaction;
delete from table_foo;
insert into table_foo select * from table_temp;
drop table table_temp;
commit;
Whether you do that in a function or not is up to you.
truncate/insert will be faster (than delete/insert) as that minimizes the amount of WAL generated.
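If you do want to automate this inside the database, a minimal PL/pgSQL sketch (using the table names from the question; the function name refresh_table_foo is just an example) could look like this:
CREATE OR REPLACE FUNCTION refresh_table_foo()
RETURNS void AS
$$
BEGIN
    -- A function body runs inside the calling transaction,
    -- so either all of these statements succeed or none do.
    DELETE FROM table_foo;              -- or TRUNCATE table_foo;
    INSERT INTO table_foo
    SELECT * FROM table_temp;
    DROP TABLE table_temp;
END;
$$ LANGUAGE plpgsql;

-- After the ETL has loaded table_temp:
SELECT refresh_table_foo();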

Related

Auto generate script for CREATE TABLE including all indices, constraints, etc (not via SSMS)

I have a data anonymization process that takes a production copy of a database and turns it into an anonymized copy by UPDATE-ing some columns.
Some of the tables contain several million rows, so instead of UPDATE-ing the columns, which is very log-intensive, I went down the route of
SELECT
    Id,
    CAST('Redacted' AS NVARCHAR(255)) AS [ColumnRequiringAnonymization]
INTO MyTable_new
FROM MyTable

EXEC sp_rename 'MyTable', 'MyTable_old'
EXEC sp_rename 'MyTable_new', 'MyTable'
DROP TABLE MyTable_old
The problem with this approach is that the "new" table no longer has any of the keys, indices and other dependent objects. I have figured out the keys and indices using SPs to generate the DROP and CREATE scripts. The SPs are based on manually written SQL as can be seen e.g. in this answer.
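(For illustration, a catalog query of that kind, here just listing index definitions for a hypothetical table dbo.MyTable, might look like the following; the real scripts obviously also need to cover constraints, defaults, and so on.)
SELECT i.name        AS index_name,
       i.type_desc,
       i.is_unique,
       c.name        AS column_name,
       ic.key_ordinal
FROM sys.indexes i
JOIN sys.index_columns ic
    ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns c
    ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID('dbo.MyTable')
ORDER BY i.name, ic.key_ordinal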
The next problem is that we have a schemabound view on top of this table, which has indices and a full-text index on its own. The number of SPs to generate scripts is growing and I am sure there will be mistakes.
Is there a way to completely script a table/view by using SQL commands only? I.e. just like SSMS does when you click "Script table as - CREATE to", but within a stored procedure?
Right-click on the database and select Tasks; there is a Generate Scripts option there. Just follow the prompts or Google for additional information.

Move truncated records to another table in Postgresql 9.5

The problem is the following: remove all records from one table and insert them into another.
I have a table that is partitioned by date criteria. To avoid partitioning each record one by one, I'm collecting the data in one table and periodically moving it to another table. The copied records have to be removed from the first table. I'm using a DELETE query with RETURNING, but the side effect is that autovacuum has a lot of work to do to clean up the mess in the original table.
I'm trying to achieve the same effect (copy and remove records), but without creating additional work for the vacuum mechanism.
As I'm removing all rows (with a DELETE without a WHERE condition), I was thinking about TRUNCATE, but it does not support a RETURNING clause. Another idea was to somehow configure the table to automatically remove tuples from the page on delete, without waiting for vacuum, but I could not find whether that is possible.
Can you suggest something that I could use to solve my problem?
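(For reference, the copy-and-delete described above is typically written as a single data-modifying CTE; a minimal sketch, assuming a source table table_a and a target table_b as in the answer below, is:)
-- Move all rows in one statement; the deleted rows are fed to the INSERT
-- through the CTE, so no separate SELECT over table_a is needed.
WITH moved AS (
    DELETE FROM table_a
    RETURNING *
)
INSERT INTO table_b
SELECT * FROM moved;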
You need to use something like:
--Open your transaction
BEGIN;
--Prevent concurrent writes, but allow concurrent data access
LOCK TABLE table_a IN SHARE MODE;
--Copy the data from table_a to table_b, you can also use CREATE TABLE AS to do this
INSERT INTO table_b SELECT * FROM table_a;
--Empty table_a
TRUNCATE TABLE table_a;
--Commit and release the lock
COMMIT;

db2 reorganize a table

When I alter a table in DB2, I have to reorganize it, so I execute the following query:
Call Sysproc.admin_cmd ('reorg Table myTable');
I'm searching for an appropriate solution to reorganize a table when it is altered, or to reorganize the whole schema after making various modifications.
You can determine when tables will require a REORG by looking at SYSIBMADM.ADMINTABINFO:
select tabschema, tabname
from sysibmadm.admintabinfo
where reorg_pending = 'Y'
You may also want to look at the NUM_REORG_REC_ALTERS column, as this may show you additional tables that may need to be reorganized due to various ALTER TABLE statements.
The reorg operation is similar to defragmenting a hard disk. It frees empty space in pages, and it can also reorganize the data according to an index. Depending on the options, it creates the compression dictionary and compresses the data.
As you can see, the reorg operation is an administrative task, and it is not necessary each time data is modified. A database can run without reorgs.
In order to ease this, DB2 includes autonomic features (such as automatic backup and automatic reorg); however, this doesn't fully answer your question, as it will only trigger a reorg on tables that need it.
To reorg a table explicitly you need to execute the reorg command: http://publib.boulder.ibm.com/infocenter/db2luw/v10r1/topic/com.ibm.db2.luw.admin.cmd.doc/doc/r0001966.html
or run it via admin_cmd: http://publib.boulder.ibm.com/infocenter/db2luw/v10r1/topic/com.ibm.db2.luw.sql.rtn.doc/doc/r0023582.html
In the DB2 database configuration we have:
Automatic reorganization (AUTO_REORG) = OFF
We can set AUTO_REORG to ON.
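For example, via ADMIN_CMD (a sketch; it assumes you are connected to the target database, and AUTO_REORG only takes effect if its parent parameters AUTO_MAINT and AUTO_TBL_MAINT are also ON):
--Enable the automatic table maintenance hierarchy, including automatic reorg
CALL SYSPROC.ADMIN_CMD('UPDATE DB CFG USING AUTO_MAINT ON AUTO_TBL_MAINT ON AUTO_REORG ON');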

DB2 SQL Error: SQLCODE=-911, SQLSTATE=40001, SQLERRMC=68

I am getting this error when I run:
alter table tablename add column columnname varchar(1) default 'N';
DB2 SQL Error: SQLCODE=-911, SQLSTATE=40001, SQLERRMC=68
How to solve it?
The alter statement wants to get an X lock on this row in SYSIBM.SYSTABLES. There is an open transaction that has this row/index value in an incompatible lock state. This lock that caused the timeout could even be from an open cursor that reads this row with an RS or RR isolation level.
Terminate any other SQL currently trying to query SYSTABLES, and any utilities that may be trying to update SYSTABLES (like reorg and runstats), then try the alter again.
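To locate and terminate the blocking connections, something along these lines can help (a sketch, assuming DB2 LUW 9.7 or later; the application handle 12345 is just a placeholder):
--List current connections and their application handles
SELECT application_handle, application_name
FROM TABLE(MON_GET_CONNECTION(CAST(NULL AS BIGINT), -2)) AS t;

--Force off the offending connection (replace 12345 with the real handle)
CALL SYSPROC.ADMIN_CMD('FORCE APPLICATION (12345)');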
See the DB2 Information Center (I picked the one for DB2 10; most likely this error code is the same in other versions, but double-check!).
It seems there is a transaction open on your table that prevents your alter command from executing.
After you have altered a table you need to reorg it; read up on that in the DB2 documentation.
Run the runstats script, which is a DB2 script, at regular intervals and set the script to gather RUNSTATS WITH DISTRIBUTION AND DETAILED INDEXES ALL.
In addition to running the runstats scripts regularly, you can perform the following tasks to avoid the problem:
Use REOPT ONCE or REOPT ALWAYS with the call level interface (CLI) packages to change the query optimization behavior.
In the DB2 database, change the table to make it volatile. Volatile tables indicate to the DB2 optimizer that the table cardinality can change significantly at run time (from empty to large and vice versa). Therefore, DB2 uses an index to access a table rather than a table scan.
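Both steps can also be run from SQL; a sketch, assuming a hypothetical table named myschema.mytable:
--Collect distribution and detailed index statistics via ADMIN_CMD
CALL SYSPROC.ADMIN_CMD('RUNSTATS ON TABLE myschema.mytable WITH DISTRIBUTION AND DETAILED INDEXES ALL');

--Mark the table as volatile so the optimizer prefers index access
ALTER TABLE myschema.mytable VOLATILE CARDINALITY;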

MS-SQL 2000: Turn off logging during stored procedure

Here's my scenario:
I have a simple stored procedure that removes a specific set of rows from a table (we'll say about 30k rows), and then inserts about the same amount of rows. This generally should only take a few seconds; however, the table has a trigger on it that watches for inserts/deletes, and tries to mimic what happened to a linked table on another server.
This process in turn is unbearably slow due to the trigger, and the table is also locked during the process. So here are my two questions:
I'm guessing a decent part of the slowdown is due to the transaction log. Is there a way for me to specify in my stored procedure that I do not want what's in the procedure to be logged?
Is there a way for me to do my 'DELETE FROM' and 'INSERT INTO' commands without me locking the table during the entire process?
Thanks!
edit - Thanks for the answers; I figured that was the case (not being able to do either of the above), but wanted to make sure. The trigger was created a long time ago and doesn't look very efficient, so it looks like my next step will be to go into that and find out what's needed and how it can be improved. Thanks!
1) No; also, you are not doing a minimally logged operation like TRUNCATE or BULK INSERT, so your deletes and inserts will be fully logged regardless.
2) No, how would you prevent corruption otherwise?
I wouldn't automatically assume that the performance problem is due to logging. In fact, it's likely that the trigger is written in such a way that is causing your performance problems. I encourage you to modify your original question and show the code for the trigger.
You can't turn off transactional integrity when modifying the data. You could ignore locks when you select data using select * from table (nolock); however, you need to be very careful and ensure your application can handle doing dirty reads.
It doesn't help with your trigger, but the solution to the locking issue is to perform the transactions in smaller batches.
Instead of
DELETE FROM Table WHERE <Condition>
Do something like
WHILE EXISTS (SELECT * FROM Table WHERE <Condition>)
BEGIN
    SET ROWCOUNT 1000
    DELETE FROM Table WHERE <Condition>
    SET ROWCOUNT 0
END
You can temporarily disable the trigger, run your proc, then do whatever the trigger was doing in a more efficient manner.
-- disable trigger
ALTER TABLE [Table] DISABLE TRIGGER [Trigger]
GO
-- execute your proc
EXEC spProc
GO
-- do more stuff to clean up / sync with other server
GO
-- enable trigger
ALTER TABLE [Table] ENABLE TRIGGER [Trigger]
GO