I have two schemas in my Postgres BOOK database, MSPRESS and ORELLY.
I want to create the same table in two schemas:
CREATE TABLE MSPRESS.BOOK(title TEXT, author TEXT);
CREATE TABLE ORELLY.BOOK(title TEXT, author TEXT);
Now, I want to create the same table in all my schemas with a single command.
To accomplish this, I thought about event triggers, available since Postgres 9.3 (http://www.postgresql.org/docs/9.3/static/event-triggers.html). My idea was to intercept the CREATE TABLE command with an event trigger, determine the name of the table and of the schema it is created in, and repeat the same command for all other schemas. Sadly, the event trigger procedure does not receive the name of the table being created.
Is there a way to synchronize Postgres schemas in 'real time'?
Currently only TG_EVENT and TG_TAG are available from an event trigger, but this feature will likely be expanded. In the meantime, you can query information_schema for differences and add every table where it's missing; but don't forget that this synchronization will itself fire event triggers, so you should do it carefully.
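For example, a query along these lines (a sketch, using the mspress and orelly schemas from the question) lists the tables that exist in one schema but are missing from the other:
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'mspress'
  AND table_name NOT IN (SELECT table_name
                         FROM information_schema.tables
                         WHERE table_schema = 'orelly');
Each table it returns could then be re-created in the other schema.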
But if you just want to build several schemas with the same structure (without further synchronization), you could write schema-less queries to build them and run them on every schema one by one, after changing the search path with SET search_path / SET SCHEMA.
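For instance (a sketch, using the schema names from the question):
SET search_path = mspress;
CREATE TABLE book(title TEXT, author TEXT);
SET search_path = orelly;
CREATE TABLE book(title TEXT, author TEXT);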
Related
When collaborating with colleagues I need to change the schema name every time I receive a SQL script (Postgres).
I am only an ordinary user of a corporate database (no permissions to change anything). Also, we are not allowed to create tables in the PUBLIC schema. However, we can use (read-only) all the tables from the BASE schema.
It is cumbersome for a team of users where everybody creates SQL scripts (mostly just for creating tables) that need to be shared with the others. Every user has their own schema.
Is it possible to change the script below so that I can share it with another user without that user having to find and replace the schema name (in this case, user1)?
DROP TABLE IF EXISTS user1.table1;
CREATE TABLE user1.table1 AS
SELECT * FROM base.table1;
You can set the default schema at the start of the script (similar to what pg_dump generates):
set search_path = user1;
DROP TABLE IF EXISTS table1;
CREATE TABLE table1 AS
SELECT * FROM base.table1;
Because the search path was changed to contain user1 as the first schema, tables will be looked up in that schema when dropping, and created in that schema. And because the search path does not include any other schema, no other schema will be consulted.
However, the default search_path is "$user", public, which means that any unqualified table will be searched for or created in a schema with the same name as the current user.
Caution
Note that in that case a DROP TABLE will drop the table in the first schema in which it is found. So if table1 doesn't exist in the user's schema but does exist in the public schema, it would be dropped from the public schema. For your use case, setting the path to exactly one schema is therefore safer.
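A minimal sketch of that pitfall (assuming table1 exists only in public):
set search_path = "$user", public;
DROP TABLE IF EXISTS table1; -- drops public.table1, because no table1 exists in the user's schema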
I'm coming up on the limits of my PostgreSQL knowledge, and I'm quite unsure how to diagnose this issue. Please pardon the noob-ness of my questions; I'm open to updating the question as the (expected) follow-up questions come.
I have a fairly complex database structure, in which under a schema, a number of tables are connected to one another by foreign keys. I unfortunately cannot reveal the schema itself.
One of the tables, let's call it "A", used to store close to 100K records. It has foreign key relationships to two other tables: one called "B", also with approx. 100K records, and the other called "C", with approx. 100 records. There are 5 more tables as well.
I wanted to drop all of the tables. However, using:
truncate table schema.A cascade
takes a very long time (over 10 minutes without finishing), even though I have already removed all rows from the table (yes, I understand truncate is designed to do that exact operation). This is the first point that I don't understand: why would it take a long time to perform this operation?
Secondly, I tried:
drop table schema.A;
(using Postico, a GUI, rather than by entering SQL commands directly)
That also runs for over 10 minutes without finishing.
Are the foreign key relations the key blocker here?
If I wanted to "just quickly nuke" the schema, and start over from scratch (all of my table schemas are defined in a SQLAlchemy file, so recreating is trivial), would I have to drop the entire schema using admin privileges, or is it possible to do it as a user without admin privileges?
If you want to drop the schema:
DROP SCHEMA schema_name CASCADE;
For the default schema:
DROP SCHEMA public CASCADE;
To quickly reset a single schema database:
DROP SCHEMA public CASCADE;
CREATE SCHEMA public;
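Two notes, based on standard Postgres behavior: to drop a schema you must be its owner or a superuser, so whether this works without admin privileges depends on who owns the schema. Also, dropping and re-creating public loses the default privileges on it, so you may want to restore them afterwards (assuming the usual postgres superuser):
GRANT ALL ON SCHEMA public TO postgres;
GRANT ALL ON SCHEMA public TO public;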
I am about to model a PostgreSQL database, based on an Oracle database. The latter is old, and its tables have been named after a 3-letter scheme.
E.g. a table that holds parameters for tasks would be named TSK_PAR.
As I model the new database, I'd like to rename those tables to more descriptive names using actual words. My problem is that some parts of the software might rely on the old names until they're rewritten and adapted to the new scheme.
Is it possible to create something like an alias that's being used for the whole database?
E.g. I create a new task_parameters table, but add a TSK_PAR alias to it, so that if a SELECT * FROM TSK_PAR is issued, it automatically refers to the new name?
Postgres has no synonyms like Oracle.
But for your intended use case, views should do just fine. A view that simply does select * from task_parameters is automatically updatable.
If you don't want to clutter your default schema (usually public) with all those views, you can create them in a different schema, and then adjust the user's search path to include that "synonym schema".
For example:
create schema synonyms;

create table public.task_parameters
(
  id integer primary key,
  ...
);

create view synonyms.tsk_par
as
select *
from public.task_parameters;
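To let legacy code use the old name unqualified, include the synonym schema in the user's search path; a sketch (the role name legacy_app is just an example):
alter role legacy_app set search_path = public, synonyms;
-- unqualified references now resolve through the view:
select * from tsk_par;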
However, that approach has one annoying drawback: as long as a table is used by a view, the DDL statements allowed on it are limited; e.g. you can't drop a column that the view references.
As we manage our schema migrations using Liquibase, we always drop all views before applying "normal" migrations; once everything is done, we simply re-create all views (by running the SQL scripts stored in Git). With that approach, ALTER TABLE statements never fail, because no views are using the tables. And as creating a view is really quick, it doesn't add noticeable overhead when deploying a migration.
We're in the process of running a handful of hourly scripts on our Redshift cluster which build summary tables for data consumers. After assembling a staging table, the script then runs a transaction which deletes the existing table and replaces it with the staging table, as such:
BEGIN;
DROP TABLE IF EXISTS public.data_facts;
ALTER TABLE public.data_facts_stage RENAME TO data_facts;
COMMIT;
The problem with this operation is that long-running analysis queries will place an AccessShareLock on public.data_facts, preventing it from being dropped and thrashing our ETL cycle. I'm thinking a better solution would be one which renames the existing table, as such:
ALTER TABLE public.data_facts RENAME TO data_facts_old;
ALTER TABLE public.data_facts_stage RENAME TO data_facts;
DROP TABLE public.data_facts_old;
However, this approach presupposes that 1) public.data_facts exists, and 2) public.data_facts_old does not exist.
Do you know if there's a way to conduct this operation safely in SQL, without relying on application logic? (e.g. something like ALTER TABLE IF EXISTS)
I haven't tried it, but looking at the documentation of CREATE VIEW it seems that this can be done with late-binding views.
The main idea would be a view public.data_facts that users interact with. Behind the scenes, you can load new data and then swap the view to “point” to the new table.
Bootstrap
-- load data into public.data_facts_v0
CREATE VIEW public.data_facts AS
SELECT * FROM public.data_facts_v0
WITH NO SCHEMA BINDING;
Update
-- load data into public.data_facts_v1
CREATE OR REPLACE VIEW public.data_facts AS
SELECT * FROM public.data_facts_v1
WITH NO SCHEMA BINDING;
DROP TABLE public.data_facts_v0;
The WITH NO SCHEMA BINDING means the view will be late-binding. “A late-binding view doesn't check the underlying database objects, such as tables and other views, until the view is queried.” This means the update can even introduce a table with renamed columns or a completely new structure.
Notes:
It might be a good idea to wrap the swap operations in a transaction, to make sure we don't drop the previous table if the view swap fails.
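A sketch of the update step wrapped in a transaction:
BEGIN;
CREATE OR REPLACE VIEW public.data_facts AS
SELECT * FROM public.data_facts_v1
WITH NO SCHEMA BINDING;
DROP TABLE public.data_facts_v0;
COMMIT;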
You can add a new load_time column (timestamp DEFAULT getdate() ENCODE runlength) to your target table, and make your ETL do this:
INSERT INTO public.data_facts
SELECT * FROM public.data_facts_staging;
DELETE FROM public.data_facts
WHERE load_time < (SELECT MAX(load_time) FROM public.data_facts);
DROP TABLE public.data_facts_staging;
Note: public.data_facts_staging should have exactly the same structure as public.data_facts, except that the last column of public.data_facts is load_time, so that on insert it is populated with the current timestamp.
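For reference, the target table's definition might look like this (a sketch; the fact columns are placeholders):
CREATE TABLE public.data_facts (
    -- ... the actual fact columns go here ...
    load_time timestamp DEFAULT getdate() ENCODE runlength
);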
The only implication is that it requires extra disk space for a moment, between inserting the new rows and deleting the old ones, and load_time always has to be the last column. Also, you have to VACUUM the table every time you do this.
Another good thing about this is that if your ETL fails and the staging table is empty, or there is no staging table at all, you won't lose your data. In the pure-SQL scenario of swapping tables with DDL, you're not protected from dropping the target table when the staging table is missing. In the suggested scenario, if no new rows are inserted, the DELETE statement deletes nothing (there are no rows older than the max load time), so the worst case is just having the old version of the data.
P.S. There is a command that, instead of INSERT ... SELECT ..., just changes the pointer from the staging to the target table (ALTER TABLE ... APPEND FROM ...), but it requires the same type of lock as ALTER TABLE, I guess, so I don't suggest it.
I am developing a Windows application and using Postgres as the backend database. At some point my application dynamically creates tables, e.g. Table1, then Table2, and so on. In this way I have many dynamic tables in my database. Now I provide a "Clean Database" button, so I need to remove all those dynamic tables with a SQL query. Could someone guide me on how to write a SQL query that automatically deletes all such tables?
You should just be able to say
DROP TABLE {tablename}
for each dynamically created table. Try that and see if it works.
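If the dynamic tables follow a naming pattern, a DO block can generate those DROP statements for you; a sketch (assuming the tables live in the public schema and were created unquoted as Table1, Table2, ..., so their names fold to lowercase):
DO $$
DECLARE
    rec record;
BEGIN
    FOR rec IN SELECT tablename
               FROM pg_tables
               WHERE schemaname = 'public'
                 AND tablename ~ '^table[0-9]+$'
    LOOP
        EXECUTE format('DROP TABLE public.%I', rec.tablename);
    END LOOP;
END $$;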