Cannot truncate the tables in the landing area after transferring data - PostgreSQL

I have 2 schemas in Postgres with exactly the same 15 tables. Data is inserted into the first schema every 3 hours; then the data needs to be transferred into the second schema.
After that, the tables of the first schema need to be truncated (the landing area needs to be empty). I wrote a trigger that transfers data into the second schema after it is inserted into the first schema.
But how should the tables of the first schema be truncated?
I have already searched and tried two ways, but neither of them works.
1. I put the TRUNCATE command after all of the INSERT INTO ... ON CONFLICT ... statements in the same trigger function that transfers data from the first schema into the second schema. This doesn't work.
2. I made another trigger that fires after INSERT (or UPDATE) on the tables of the second schema. This also doesn't work. Why doesn't it? If the data has already been inserted into second_schema, it should be possible to truncate first_schema; it is not in use by an active query.
Error->cannot TRUNCATE "table1" because it is being used by active queries in this session
What should I do? I need to transfer data into the second schema after every insert into the first schema, and then truncate all 15 tables of the first schema.
The tables of the first schema need to be empty for the new inserts.
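For context, here is a minimal sketch of the kind of trigger function described in attempt 1. All table and column names (first_schema.table1, second_schema.table1, id, payload) are placeholders, not the real schema. The TRUNCATE at the end is the statement that raises the error quoted above, because the INSERT that fired the trigger is still using the source table in the same session:

-- Sketch only: table and column names are placeholders.
CREATE OR REPLACE FUNCTION first_schema.transfer_table1()
RETURNS trigger AS $$
BEGIN
    INSERT INTO second_schema.table1 (id, payload)
    VALUES (NEW.id, NEW.payload)
    ON CONFLICT (id) DO UPDATE SET payload = EXCLUDED.payload;

    -- This is what fails: the INSERT that fired the trigger still holds
    -- first_schema.table1 open in this session.
    TRUNCATE first_schema.table1;

    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER transfer_after_insert
AFTER INSERT ON first_schema.table1
FOR EACH ROW EXECUTE FUNCTION first_schema.transfer_table1();  -- EXECUTE FUNCTION needs PostgreSQL 11+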

Related

Cloudant/Db2 - How to determine if a database table row was read from?

I have two databases - Cloudant and IBM Db2. I have a table in each of these databases that holds static data that is only read from and never updated. These were created a long time ago, and I'm not sure if they are used today, so I wish to do a clean-up.
I want to determine whether these tables, or rows from these tables, are still being read from.
Is there a way to record the read timestamp (or at least know if a row is simply accessed, like a dirty bit) on a row of the table when it is read from?
OR
Record the read timestamp of the entire table (if any record from it is accessed)?
There is the SYSCAT.TABLES.LASTUSED system catalog column in Db2, which records when the table as a whole was last used by a DML statement.
There is no way to track read access to individual table rows.
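For illustration, a query along these lines reads that catalog column on Db2 for LUW (the schema and table names are placeholders):

SELECT TABSCHEMA, TABNAME, LASTUSED
FROM SYSCAT.TABLES
WHERE TABSCHEMA = 'MYSCHEMA'   -- placeholder schema
  AND TABNAME   = 'MYTABLE';   -- placeholder table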

Cloning a Postgres table, including indexes and data

I am trying to create a clone of a Postgres table using plpgsql.
To date I have been simply truncating table 2 and re-inserting data from table 1.
TRUNCATE TABLE "dbPlan"."tb_plan_next";
INSERT INTO "dbPlan"."tb_plan_next" SELECT * FROM "dbPlan"."tb_plan";
As code this works as expected; however, "dbPlan"."tb_plan" contains around 3 million records, so it takes around 20 minutes to complete. This is too long and has knock-on effects on other processes.
It's important that all constraints, indexes and data are copied exactly to table 2.
I had tried dropping the table and re-creating it; however, this did not improve the speed.
DROP TABLE IF EXISTS "dbPlan"."tb_plan_next";
CREATE TABLE "dbPlan"."tb_plan_next" (LIKE "dbPlan"."tb_plan" INCLUDING ALL);
INSERT INTO "dbPlan"."tb_plan_next" SELECT * FROM "dbPlan"."tb_plan";
Is there a better method for achieving this?
I am considering creating the table and then creating indexes as a second step.
PostgreSQL doesn't provide a very elegant way of doing this. You could use pg_dump with -t and --section= to dump the pre-data and post-data for the table. Then you would replay the pre-data to create the table structure and the check constraints, load the data from wherever you get it, and finally replay the post-data to add the indexes and FK constraints.
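As a rough sketch of that flow (the database name mydb is a placeholder, and the quoting of the mixed-case identifiers may need adjusting for your shell):

# Capture the clone's definition once, assuming "tb_plan_next" already exists
# with the desired structure: table + check constraints vs. indexes + FK constraints.
pg_dump --table='"dbPlan"."tb_plan_next"' --section=pre-data  -f tb_plan_next_pre.sql mydb
pg_dump --table='"dbPlan"."tb_plan_next"' --section=post-data -f tb_plan_next_post.sql mydb

# On each refresh: drop, recreate the bare table, bulk-load, then add indexes/FKs last.
psql -d mydb -c 'DROP TABLE IF EXISTS "dbPlan"."tb_plan_next";'
psql -d mydb -f tb_plan_next_pre.sql
psql -d mydb -c 'INSERT INTO "dbPlan"."tb_plan_next" SELECT * FROM "dbPlan"."tb_plan";'
psql -d mydb -f tb_plan_next_post.sql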

How to nuke a postgres schema?

I'm coming up on the limits of my Postgres SQL knowledge, and I'm quite unsure how to diagnose this issue. Please pardon the noob-ness in my questions; I'm open to updating the question as the (expected) follow-up questions come.
I have a fairly complex database structure, in which under a schema, a number of tables are connected to one another by foreign keys. I unfortunately cannot reveal the schema itself.
One of the tables, let's call it "A", used to store close to 100K records. It's got foreign key relationships to two other tables, one called "B" with also approx. 100K records, and the other called "C" with approx. 100 records. There are 5 more tables as well.
I wanted to drop all of the tables. However, using:
truncate table schema.A cascade
takes a very long time (over 10 minutes without finishing), even though I have already removed all rows from the table (yes, I understand truncate is designed to do that exact operation). This is the first point that I don't understand: why would it take a long time to perform this operation?
Secondly, I tried:
drop table schema.A;
(using Postico, a GUI, rather than by entering SQL commands directly)
That also runs for over 10 minutes without finishing.
Are the foreign key relations the key blocker here?
If I wanted to "just quickly nuke" the schema, and start over from scratch (all of my table schemas are defined in a SQLAlchemy file, so recreating is trivial), would I have to drop the entire schema using admin privileges, or is it possible to do it as a user without admin privileges?
If you want to drop the schema:
DROP SCHEMA schema_name CASCADE;
For the default schema:
DROP SCHEMA public CASCADE;
To quickly reset a single schema database:
DROP SCHEMA public CASCADE;
CREATE SCHEMA public;
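One follow-up worth noting (a common gotcha rather than part of the answer above): recreating public also drops the default grants on it, so other roles may need them restored. Assuming the stock role names and the pre-PostgreSQL-15 defaults, the usual snippet is:

-- Restore the customary privileges on a freshly recreated public schema.
GRANT ALL ON SCHEMA public TO postgres;  -- assumes the default superuser role name
GRANT ALL ON SCHEMA public TO public;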

Implications of using ADD COLUMN on large dataset

Docs for Redshift say:
ALTER TABLE locks the table for reads and writes until the operation completes.
My question is:
Say I have a table with 500 million rows and I want to add a column. This sounds like a heavy operation that could lock the table for a long time - yes? Or is it actually a quick operation since Redshift is a columnar db? Or does it depend on whether the column is nullable / has a default value?
I find that adding (and dropping) columns is a very fast operation even on tables with many billions of rows, regardless of whether there is a default value or it's just NULL.
As you suggest, I believe this is a feature of it being a columnar database, so the rest of the table is undisturbed. It simply creates empty (or nearly empty) column blocks for the new column on each node.
I added an integer column with a default to a table of around 65M rows in Redshift recently and it took about a second to process. This was on a dw2.large (SSD type) single node cluster.
Just remember that you can only add a column to the end (right) of the table; you have to use temporary tables etc. if you want to insert a column somewhere in the middle.
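For reference, the operation in question is just a plain ALTER TABLE; the table and column names below are made up:

-- Adds a (nearly) empty column block per node; completes quickly even on large tables.
ALTER TABLE sales ADD COLUMN discount_pct DECIMAL(5,2) DEFAULT 0;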
Personally, I have seen that rebuilding the table works best.
I do it in the following way:
Create a new table N_OLD_TABLE
Define the data types / compression encodings in the new table
Insert data into N_OLD_TABLE (old_columns) SELECT (old_columns) FROM OLD_TABLE
Rename OLD_TABLE to OLD_TABLE_BKP
Rename N_OLD_TABLE to OLD_TABLE
This is a much faster process (a sketch follows below). It doesn't block any table, and you always have a backup of the old table in case anything goes wrong.
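A rough SQL sketch of those steps; the column names, types, and encodings are placeholders:

-- 1-2. Create the replacement table with the desired types and compression encodings.
CREATE TABLE n_old_table (
    id      BIGINT       ENCODE az64,
    name    VARCHAR(100) ENCODE lzo,
    new_col INTEGER      DEFAULT 0 ENCODE az64
);

-- 3. Copy the existing columns across.
INSERT INTO n_old_table (id, name)
SELECT id, name FROM old_table;

-- 4-5. Swap the tables, keeping the original as a backup.
ALTER TABLE old_table RENAME TO old_table_bkp;
ALTER TABLE n_old_table RENAME TO old_table;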

Insert data from staging table into multiple, related tables?

I'm working on an application that imports data from Access to SQL Server 2008. Currently, I'm using a stored procedure to import the data individually by record. I can't go with a bulk insert or anything like that because the data is inserted into two related tables...I have a bunch of fields that go into the Account table (first name, last name, etc.) and three fields that will each have a record in an Insurance table, linked back to the Account table by the auto-incrementing AccountID that's selected with SCOPE_IDENTITY in the stored procedure.
Performance isn't very good due to the number of round trips to the database from the application. For this and some other reasons I'm planning to instead use a staging table and import the data from there. Reading up on my options for approaching this, a cursor that executes the same insert stored procedure on the data in the staging table would make sense. However it appears that cursors are evil incarnate and should be avoided.
Is there any way to insert data into one table, retrieve the auto-generated IDs, then insert data for the same records into another table using the corresponding ID, in a set-based operation? Or is a cursor my only option here?
Look at the OUTPUT clause. You should be able to add it to your INSERT statement to do what you want.
BTW, if you need to output columns into the second table that weren't inserted into the first one, then use MERGE instead of INSERT (as suggested in the comment to the original question) as its OUTPUT clause supports referencing other columns from the source table(s). Otherwise, keeping it with an INSERT is more straightforward, and it does give you access to the inserted identity column.
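To make that concrete, here is a sketch of the MERGE ... OUTPUT pattern; every table and column name (Staging, Account, Insurance, StagingID, the three insurance fields) is invented for illustration and will differ in the real schema:

-- Sketch only: table and column names are placeholders.
DECLARE @AccountMap TABLE (AccountID INT, StagingID INT);

-- An always-false join condition makes every staging row take the
-- WHEN NOT MATCHED branch, and MERGE's OUTPUT clause can capture the
-- new identity value alongside a column from the source table.
MERGE INTO dbo.Account AS tgt
USING dbo.Staging AS src
   ON 1 = 0
WHEN NOT MATCHED THEN
    INSERT (FirstName, LastName)
    VALUES (src.FirstName, src.LastName)
OUTPUT inserted.AccountID, src.StagingID
  INTO @AccountMap (AccountID, StagingID);

-- Fan the three insurance fields out into one row each, keyed by the new AccountID.
INSERT INTO dbo.Insurance (AccountID, InsuranceName)
SELECT m.AccountID, i.InsuranceName
FROM @AccountMap AS m
JOIN dbo.Staging AS s
  ON s.StagingID = m.StagingID
CROSS APPLY (VALUES (s.Insurance1), (s.Insurance2), (s.Insurance3)) AS i(InsuranceName)
WHERE i.InsuranceName IS NOT NULL;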
I have experimented with inserting multiple records into related tables using data binding, so try this!
Hopefully this is very helpful. Follow this link, How to insert record into related tables, for more information.