How to nuke a postgres schema? - postgresql

I'm coming up against the limits of my PostgreSQL knowledge, and I'm quite unsure how to diagnose this issue. Please pardon the noob-ness of my questions; I'm open to updating the question as the (expected) follow-up questions come.
I have a fairly complex database structure in which, under a single schema, a number of tables are connected to one another by foreign keys. I unfortunately cannot reveal the schema itself.
One of the tables, let's call it "A", used to store close to 100K records. It has foreign key relationships to two other tables: one called "B", also with approx. 100K records, and the other called "C" with approx. 100 records. There are 5 more tables as well.
I wanted to drop all of the tables. However, using:
truncate table schema.A cascade
takes a very long time (over 10 minutes without finishing), even though I have already removed all rows from the table (yes, I understand truncate is designed to do that exact operation). This is the first point that I don't understand: why would it take a long time to perform this operation?
Secondly, I tried:
drop table schema.A;
(using Postico, a GUI, rather than by entering SQL commands directly)
That also runs for over 10 minutes without finishing.
Are the foreign key relations the key blocker here?
If I wanted to "just quickly nuke" the schema, and start over from scratch (all of my table schemas are defined in a SQLAlchemy file, so recreating is trivial), would I have to drop the entire schema using admin privileges, or is it possible to do it as a user without admin privileges?

If you want to drop the schema:
DROP SCHEMA schema_name CASCADE
For the default schema:
DROP SCHEMA public CASCADE
To quickly reset a single-schema database:
DROP SCHEMA public CASCADE;
CREATE SCHEMA public;
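Note that DROP SCHEMA only requires ownership of the schema (or superuser rights), so a user who owns the schema can do this without any special admin role. Below is a minimal sketch of the reset, assuming the default public schema and a hypothetical application role app_user; the GRANT statements restore the pre-PostgreSQL-15 default privileges on public, which are lost when the schema is recreated:
DROP SCHEMA public CASCADE;
CREATE SCHEMA public;
GRANT ALL ON SCHEMA public TO public;    -- USAGE + CREATE, the pre-v15 default
GRANT ALL ON SCHEMA public TO app_user;  -- hypothetical application role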

Related

cannot truncate the tables in landing area after transferring data

I have 2 schemas with exactly the same 15 tables in Postgres. Data will be inserted into the first schema every 3 hours; then the data needs to be transferred into the second schema.
After that, the tables of the first schema need to be truncated (the landing area needs to be empty). I wrote a trigger that transfers data into the second schema after it is inserted into the first schema.
But how should the tables of the first schema be truncated?
I have already searched and tried two ways, but neither of them works.
1. I put the TRUNCATE command after all the (INSERT INTO ... ON CONFLICT ...) statements in the same trigger function that transfers data from the first schema into the second schema. That doesn't work.
2. I made another trigger that fires after insert into (or update of) the tables of the second schema. That doesn't work either. Why not? If the data has already been inserted into the second schema, it should be possible to truncate the first schema; it is not in use by an active query.
Error -> cannot TRUNCATE "table1" because it is being used by active queries in this session
What should I do? I need to transfer the data after every insert into the first schema to the second schema, and then truncate all 15 tables of the first schema.
The tables of the first schema should be empty for new inserts.
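For reference, here is a minimal sketch of the setup described above, with hypothetical names (first_schema.table1 and second_schema.table1). The TRUNCATE inside the trigger function raises exactly the quoted error, because the INSERT that fired the trigger is still using first_schema.table1 in the same session:
CREATE OR REPLACE FUNCTION first_schema.transfer_table1() RETURNS trigger AS $$
BEGIN
    INSERT INTO second_schema.table1 VALUES (NEW.*)
    ON CONFLICT DO NOTHING;
    TRUNCATE first_schema.table1;   -- fails: table is in use by the INSERT that fired the trigger
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER transfer_table1
    AFTER INSERT ON first_schema.table1
    FOR EACH ROW EXECUTE FUNCTION first_schema.transfer_table1();  -- EXECUTE FUNCTION needs PostgreSQL 11+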

Huge delete on a PostgreSQL table: deleting 99.9% of the rows of the table

I have a table in my PostgreSQL database that became huge, filled with a lot of useless rows.
As these useless rows represent 99.9% of my table data (about 3.3M rows), I was wondering if deleting them could have a bad impact on my DB :
I know that this operation could take some time and I will be able to block writes on the table during the maintenance operation
But I was wondering if this huge change in the data could also impact performance after the operation itself.
I found solutions like creating a new table or using TRUNCATE to drop all rows, but as this operation will be specific and one-shot, I would like to choose the most suitable solution.
I know that PostgreSQL has a VACUUM mechanism, but I'm not a DBA expert: could anyone please confirm that this delete will not impact my table integrity / data structure, and that the freed space will be reclaimed if needed for new data?
PostgreSQL 11.12, with default settings, on AWS RDS. I don't have any index on my table, and the criterion for row deletion will not be based on the PK.
Deleting rows typically does not shrink a PostgreSQL table, so you would then have to run VACUUM (FULL) to compact it, during which the table is inaccessible.
If you are deleting many rows, both the DELETE and the VACUUM (FULL) will take a long time, and you would be much better off like this:
create a new table that is defined like the old one
INSERT INTO new_tab SELECT * FROM old_tab WHERE ... to copy over the rows you want to keep
drop foreign key constraints that point to the old table
create all indexes and constraints on the new table
drop the old table and rename the new one
By planning that carefully, you can get away with a short downtime.
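A rough sketch of those steps in SQL, assuming a hypothetical table big_tab with primary key id and a placeholder keep_me condition for the rows to keep (writes to big_tab must be blocked while this runs):
CREATE TABLE big_tab_new (LIKE big_tab INCLUDING DEFAULTS);
INSERT INTO big_tab_new SELECT * FROM big_tab WHERE keep_me;   -- the ~0.1% of rows to keep
-- drop foreign key constraints that point to big_tab here, then:
ALTER TABLE big_tab_new ADD PRIMARY KEY (id);                  -- plus any other indexes and constraints
DROP TABLE big_tab;
ALTER TABLE big_tab_new RENAME TO big_tab;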

Migrating `int` to `bigint` in PostgreSQL without any downtime?

I have a database that is going to experience the integer exhaustion problem that Basecamp famously faced back in November. I have several months to figure out what to do.
Is there a no-downtime-required, proactive solution to migrating this column type? If so what is it? If not, is it just a matter of eating the downtime and migrating the column when I can?
Is this article sufficient, assuming I have several days/weeks to perform the migration now before I'm forced to do it when I run out of ids?
Use logical replication.
With logical replication you can have different data types on the primary and on the standby.
Copy the schema with pg_dump -s, change the data types on the copy and then start logical replication.
Once all data is copied over, switch the application to use the standby.
For zero downtime, the application has to be able to reconnect and retry, but that's always a requirement in such a case.
You need PostgreSQL v10 or better for that, and your database
shouldn't modify the schema, as DDL is not replicated.
should not use sequences (SERIAL or IDENTITY), as the last used value would not be replicated.
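A minimal sketch of that logical-replication setup, with hypothetical database, publication, subscription and connection names (note that logical replication requires wal_level = logical on the primary):
-- 1. Copy the schema and change int to bigint in the dump before restoring it
--    on the standby:  pg_dump -s -d mydb -f schema.sql
-- 2. On the primary:
CREATE PUBLICATION pub_bigint FOR ALL TABLES;
-- 3. On the standby, after restoring the edited schema:
CREATE SUBSCRIPTION sub_bigint
    CONNECTION 'host=primary.example.com dbname=mydb user=repl_user'
    PUBLICATION pub_bigint;
-- 4. Once the tables are in sync, switch the application over to the standby.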
Another solution for pre-v10 databases where all transactions are short:
Add a bigint column to the table.
Create a BEFORE trigger that sets the new column whenever a row is added or updated.
Run a series of updates that set the new column from the old one where it IS NULL. Keep those batches short so you don't hold locks for long and don't deadlock much. Make sure these transactions run with session_replication_role = replica so they don't fire triggers.
Once all rows are updated, create a unique index CONCURRENTLY on the new column.
Add a unique constraint USING the index you just created. That will be fast.
Perform the switch:
BEGIN;
ALTER TABLE ... DROP COLUMN oldcol;
ALTER TABLE ... RENAME COLUMN newcol TO oldcol;
COMMIT;
That will be fast.
The new column has no NOT NULL constraint; adding one directly cannot be done without a long, invasive lock. But you can add a CHECK (... IS NOT NULL) constraint and create it NOT VALID. That is good enough, and you can later validate it without disruption.
If there are foreign key constraints, things get a little more complicated. You have to drop these and create NOT VALID foreign keys to the new column.
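A sketch of those steps for a hypothetical table tab with an int column id and a new column id_new (all names here are placeholders):
ALTER TABLE tab ADD COLUMN id_new bigint;

CREATE FUNCTION copy_id() RETURNS trigger AS $$
BEGIN
    NEW.id_new := NEW.id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER copy_id BEFORE INSERT OR UPDATE ON tab
    FOR EACH ROW EXECUTE PROCEDURE copy_id();

-- Backfill in short batches (repeat until no more rows are updated);
-- as mentioned above, run these with session_replication_role = replica:
SET session_replication_role = replica;
UPDATE tab SET id_new = id
WHERE id IN (SELECT id FROM tab WHERE id_new IS NULL LIMIT 10000);

-- Once everything is backfilled:
CREATE UNIQUE INDEX CONCURRENTLY tab_id_new_idx ON tab (id_new);
ALTER TABLE tab ADD CONSTRAINT tab_id_new_key UNIQUE USING INDEX tab_id_new_idx;

BEGIN;
ALTER TABLE tab DROP COLUMN id;
ALTER TABLE tab RENAME COLUMN id_new TO id;
COMMIT;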
Create a copy of the old table, but with a modified ID column. Next, create a trigger on the old table that inserts new data into both tables. Finally, copy the data from the old table to the new one (it would be a good idea to distinguish pre-trigger data from post-trigger data, for example by id if it is sequential). Once you are done, switch the tables and drop the old one.
This obviously requires twice as much space (and time for the copy), but it will work without any downtime.
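A sketch of this variant, again with placeholder names (tab, tab_new) and a hypothetical id cutoff to separate pre-trigger rows from post-trigger rows:
CREATE TABLE tab_new (LIKE tab INCLUDING ALL);
ALTER TABLE tab_new ALTER COLUMN id TYPE bigint;

CREATE FUNCTION mirror_insert() RETURNS trigger AS $$
BEGIN
    INSERT INTO tab_new VALUES (NEW.*);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mirror_insert AFTER INSERT ON tab
    FOR EACH ROW EXECUTE PROCEDURE mirror_insert();

-- Copy the pre-trigger rows (the cutoff id is hypothetical):
INSERT INTO tab_new SELECT * FROM tab WHERE id < 1000000;

-- Switch: rename the tables in a short transaction, then drop the old one.
BEGIN;
ALTER TABLE tab RENAME TO tab_old;
ALTER TABLE tab_new RENAME TO tab;
COMMIT;
DROP TABLE tab_old;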

Is it possible to merge two Postgres databases

We have two copies of a simple application that is based on SQLite. The application has 10 tables with a variety of relations between them. We would like to merge the databases into a single Postgres database with the same schema. We can use Talend to facilitate this; however, the issue is that there would be duplicate keys (as the two source databases are independent). Is there a systematic method by which we can insert the data into Postgres with the original key plus an offset resulting from loading the first database?
Step 1. Restore the first database.
Step 2. Change foreign keys of all tables by adding the option on update cascade.
For example, if the column table_b.a_id refers to the column table_a.id:
alter table table_b
drop constraint table_b_a_id_fkey,
add constraint table_b_a_id_fkey
foreign key (a_id) references table_a(id)
on update cascade;
Step 3. Update primary keys of the tables by adding the desired offset, e.g.:
update table_a
set id = 10000+ id;
Step 4. Restore the second database.
If you have the possibility to edit the script containing the database schema (or to do the transfer manually with your own script), you can merge steps 1 and 2 and edit the script before the restore, adding the ON UPDATE CASCADE option to the foreign keys in the table declarations.
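For illustration, the edited table declarations in the schema script could look like this (hypothetical columns, matching the table_a / table_b example above):
CREATE TABLE table_a (
    id   integer PRIMARY KEY,
    name text
);

CREATE TABLE table_b (
    id   integer PRIMARY KEY,
    a_id integer REFERENCES table_a (id) ON UPDATE CASCADE
);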

Create TABLE in many Postgres schemas with a single command

I have two schemas in my Postgres BOOK database, MSPRESS and ORELLY.
I want to create the same table in two schemas:
CREATE TABLE MSPRESS.BOOK(title TEXT, author TEXT);
CREATE TABLE ORELLY.BOOK(title TEXT, author TEXT);
Now, I want to create the same table in all my schemas with a single command.
To accomplish this, I thought about event triggers, available in Postgres 9.3 (http://www.postgresql.org/docs/9.3/static/event-triggers.html). By intercepting the CREATE TABLE command with my event trigger, I thought I could determine the name of the table and the schema in which it is created, and just repeat the same command for all available schemas. Sadly, the event trigger procedure does not get the name of the table being created.
Is there a way to do 'real-time' synchronization of Postgres schemas?
Currently only TG_EVENT and TG_TAG are available from an event trigger, but this feature will likely be expanded. In the meantime, you can query information_schema for differences and try to add every table where it is missing; but don't forget that this synchronization will also fire several event triggers, so you should do it carefully.
But if you just want to build several schemas with the same structure (without further synchronization), you could just write schema-less queries to build them and run them on every schema one by one, after changing the search path with SET search_path / SET SCHEMA.
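A sketch of that approach using a DO block, assuming the schemas were created with unquoted (hence lower-cased) names; each iteration changes the search path and then runs the same schema-less CREATE TABLE:
DO $$
DECLARE
    s text;
BEGIN
    FOREACH s IN ARRAY ARRAY['mspress', 'orelly'] LOOP
        EXECUTE format('SET search_path TO %I', s);
        EXECUTE 'CREATE TABLE book (title text, author text)';
    END LOOP;
END;
$$;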