How to migrate dependent views when deep copying the table - amazon-redshift

I am adding distkey/sortkey to all my tables in redshift and would like to automate this. I am doing the following:
ALTER TABLE table RENAME TO tmp_table;
CREATE TABLE table
distkey(id)
sortkey(id)
AS
select * from tmp_table;
DROP TABLE tmp_table;
This works great, except the views don't get migrated. When you ALTER TABLE, the existing views would point to the tmp_table. Ideally I want to restore the views to the way before, possibly in the same query transaction or as part of a script.

In your migration procedure:
For each view execute:
select definition from pg_views where viewname = 'my_view';
And store all the results.
Execute for all views:
drop view 'my_view'; // or rename only to have a rollback option
Alter your tables
Execute the create view commands obtained in step 2

Just drop and recreate the views. That could be part of the script.
Views might not form part of a transaction, so some testing might be required.

Related

How to practically rename tables and columns in PostgreSQL on a production system?

It often happens that the original name something is given is not the best name. Maybe requirements shifted slightly, or maybe as time went on, a better understanding of the concept being represented developed. Sometimes one name is used in developing a feature, but testing with real users reveals that a better name is needed, and it'd be nice to have the names for things in the DB match the names used in the UI.
PostgreSQL lets you rename tables and columns via alter table, but often this isn't feasible in a production without significant downtime. If existing clients are using the old name, you can't just yank it out from under them.
I was hoping there was some way to "add a name" to a table or column, so that old_name and new_name will both work, and then at a later time remove the old name. Then a name migration could work like this:
add new_name
modify all clients to use new_name instead of old_name
once all clients have been updated, remove old_name
Is there a way to do this? If not, is there a recommended procedure for renaming a column while minimizing downtime? How about a table?
Some recipes for renaming tables and/or columns in a production system that seem to work. However, I've only tested these on a small test database, not on a large production system. Renaming and view creation are both supposed to be very fast, though.
Renaming a table
Rename the table, and temporarily add an updatable view with the old name:
begin;
alter table old_table rename to new_table;
create view old_table as select * from new_table;
commit;
Migrate clients to use the new table name.
Once all clients are migrated, drop the view:
drop view old_table;
Renaming columns without renaming the table
Renaming a column without renaming the table is a bit more complicated,
because we can't have a view shadow a table (apparently).
Rename the column(s), temporarily rename the table, and add an updatable
view that adds the old name for the column with the table's correct name:
begin;
alter table my_table rename column old_name to new_name;
alter table my_table rename to my_table_tmp;
create view my_table as
select *, new_name as old_name from my_table_tmp;
commit;
Migrate clients to use the new column name.
Once all clients are migrated, drop the view, rename the table back:
begin;
drop view my_table;
alter table my_table_tmp rename to my_table;
commit;
Renaming a table and some of its columns simultaneously
Rename the table and columns, and temporarily add an updatable view with the old name and old columns:
begin;
alter table old_table rename to new_table;
alter table new_table rename column to old_name to new_name;
create view old_table as
select *, old_name as new_name from new_table;
commit;
Instead of select *, old_name as new_name, it might be better to only have
the original set of columns, so clients have to migrate to the new table
name to get the new column names:
create view old_table as
select unchanged_name, old_name as new_name from new_table;
Migrate clients to use the new table and column names.
Once all clients are migrated, drop the view:
drop view old_table;

How to create materialized view of an external table

CREATE VIEW materialized_view WITH SCHEMABINDING AS
SELECT ...
FROM ext.external_table
Fails with
The option 'SCHEMABINDING' is not supported with external tables.
If I understand correctly SCHEMABINDING is necessary to make a materialized view.
How can I correct this query?
You cannot create an indexed view based on tables that are in a different database.
I think your options are:
a) create the indexed view in the other database and create a regular view in this database to query that indexed view
b) create a copy of the table in this database and a mechanism to update this table whenever the data is changed in the table which is in the other database; this could be done with triggers, replication, a stored procedure called on a schedule, etc.

Best practices for performing a table swap in Redshift

We're in the process of running a handful of hourly scripts on our Redshift cluster which build summary tables for data consumers. After assembling a staging table, the script then runs a transaction which deletes the existing table and replaces it with the staging table, as such:
BEGIN;
DROP TABLE IF EXISTS public.data_facts;
ALTER TABLE public.data_facts_stage RENAME TO data_facts;
COMMIT;
The problem with this operation is that long-running analysis queries will place an AccessShareLock on public.data_facts, preventing it from being dropped and thrashing our ETL cycle. I'm thinking a better solution would be one which renames the existing table, as such:
ALTER TABLE public.data_facts RENAME TO data_facts_old;
ALTER TABLE public.data_facts_stage RENAME TO data_facts;
DROP TABLE public.data_facts_old;
However, this approach presupposes that 1) public.data_facts exists, and 2) public.data_facts_old does not exist.
Do you know if there's a way to conduct this operation safely in SQL, without relying on application logic? (eg. something like ALTER TABLE IF EXISTS).
I haven't tried it but looking at the documentation of CREATE VIEW it seems that this can be done with late-binding views.
The main idea would be a view public.data_facts that users interact with. Behind the scenes, you can load new data and then swap the view to “point” to the new table.
Bootstrap
-- load data into public.data_facts_v0
CREATE VIEW public.data_facts AS
SELECT * from public.data_facts_v0 WITH NO SCHEMA BINDING;
Update
-- load data into public.data_facts_v1
CREATE OR REPLACE VIEW public.data_facts AS
SELECT * from public.data_facts_v1 WITH NO SCHEMA BINDING;
DROP TABLE public.data_facts_v0;
The WITH NO SCHEMA BINDING means the view will be late-binding. “A late-binding view doesn't check the underlying database objects, such as tables and other views, until the view is queried.” This means the update can even introduce a table with renamed columns or a completely new structure.
Notes:
It might be a good idea to wrap the swap operations into a transaction to make sure we don't drop the previous table if the VIEW swap failed.
You can add a new load time timestamp encode runlength default getdate() column to your target table, and make your ETL do this:
INSERT INTO public.data_facts
SELECT * FROM public.data_facts_staging;
DELETE FROM public.data_facts
WHERE load_time<(select max(load_time) from public.data_facts);
DROP TABLE public.data_facts_staging;
note: public.data_facts_staging should have exactly the same structure as public.data_facts except that the last column of public.data_facts is load_time, so that on insert it will be populated with the current timestamp.
The only implication is that it would require extra disk space for a moment between you insert new rows and delete the old rows, and load_time has to be always the last column. Also you have to vaccum table every time you do this.
Another good thing about this is that if your ETL fails and staging table is empty or there is no staging table you won't lose your data. In the pure SQL scenario of swapping tables with DDL you're not protected from dropping the target table when staging table is missing. In the suggested scenario if no new rows are inserted the delete statement deletes nothing (there are no rows less than max load time), so worst case is just having the old version of data.
p.s. there is a command that instead of insert ... select ... just changes the pointer from staging to target table (alter table ... append from ...) but it requires the same type of lock as alter table I guess, so I don't suggest this

Convert a view to a table in PostgreSQL

I have a view view_a in my database on which several other views depend (view_b, view_c, etc.)
I need to convert view_a into a table, because I no longer want the information in this relation to be dynamic and I need the capability to edit rows manually.
Can I replace view_a with a table without doing a DROP CASCADE and redefining all views that reference view_a?
Clarification: I want view_b and view_c to continue to reference view_a (now a table). I want to replace a view with a table, not have a table in addition to a view.
I was able to resolve this without tracking down and redefining all objects that depend on view_a, at the expense of adding one level of useless redirection.
-- create a copy of the result of view_a
CREATE TABLE table_a AS SELECT * FROM view_a;
-- redefine view_a to its own result
CREATE OR REPLACE VIEW view_a AS SELECT * FROM table_a;

TOAD-Modify Materialized View Select Query without dropping

Is there any way I can edit the MVIEW query without dropping it in TOAD.I am not sure if we can do it?If not how Can I do it in a way that does not affect the table contents?
Thanks in advance
You can't alter a materialized view query, you have to drop and recreate it.
See: http://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_2001.htm
Use the ALTER MATERIALIZED VIEW statement to modify an existing
materialized view in one or more of the following ways:
To change its storage characteristics
To change its refresh method, mode, or time
To alter its structure so that it is a different type of materialized
view
To enable or disable query rewrite
As pointed out by Jeffre Kemp, you cannot change the query of an existing Materialized View.
However, if you expect that changes will be necessary in the future, you can choose one of these approaches:
use an intermediate VIEW that does all the heavy lifting; your MATERIALIZED VIEW then does a simple SELECT * FROM myView. This allows you to change the query - however, it does not allow you to add / remove columns to/from the MATERIALIZED VIEW
use a pre-built table for your materialized view; this logically separates the TABLE that holds the content from the query that is used to update the content. Now, you can drop the MATERIALIZED VIEW, add/remove columns from your TABLE as needed, and re-create the MATERIALIZED VIEW afterwards
use a synonym for all code that queries the MATERIALIZED VIEW; if necessary, create a backup table using CTAS, point the synonym to this backup table, drop and re-create your MATERIALIZED VIEW, point the synonym back to the MATERIALIZED VIEW, and drop your backup table (that's how we do it)