amazon redshift utility - view v_generate_tbl_ddl not returning primary key constraint in ddl after table has been populated

amazon redshift utility - view v_generate_tbl_ddl not returning primary key constraint in ddl after table has been populated - postgresql

I'm using the amazon redshift utility v_generate_tbl_ddl to generate a table ddl, but have found that after my table has been populated the primary key constraint doesn't show (example below)
Structure table creation:
DROP TABLE IF EXISTS temp_table;
CREATE TABLE temp_table
(
id INTEGER NOT NULL,
text VARCHAR,
PRIMARY KEY (id)
)
DISTSTYLE ALL SORTKEY (id);
Checking the ddl after the table has been created shows the primary key constraint is present:
SELECT *
FROM admin.v_generate_tbl_ddl
WHERE tablename = 'temp_table'
AND schemaname = 'myschema';
Inserting some data:
COPY temp_table
FROM 's3://s3_bucket_location/manifest_file_temp_table.json' CREDENTIALS 'aws_access_key_id=...;aws_secret_access_key=...' DELIMITER AS '|' gzip manifest
Checking again the ddl and now the primary key constraint isn't there
SELECT *
FROM admin.v_generate_tbl_ddl
WHERE tablename = 'temp_table'
AND schemaname = 'myschema';
I'm sure the primary key is still there as I've doubled checked using:
SELECT pg_table_def.tablename,
pg_table_def.column
FROM pg_table_def
WHERE schemaname = 'myschema'
AND tablename = 'temp_table_pkey';
Digging into the sql of v_generate_tbl_ddl it is this section that isn't returning rows after the table has been populated, but I can't figure out why this stops returning rows? Any help here would be appreciated
SELECT n.nspname AS schemaname,
c.relname AS tablename,
200000000 + CAST(con.oid AS INT) AS seq,
('\t,' + pg_get_constraintdef(con.oid))::CHARACTER VARYING (500) AS ddl
FROM pg_constraint AS con
INNER JOIN pg_class AS c
ON c.relnamespace = con.connamespace
AND c.relfilenode = con.conrelid
INNER JOIN pg_namespace AS n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
AND pg_get_constraintdef (con.oid) NOT LIKE 'FOREIGN KEY%'
AND c.relname = 'temp_table'
ORDER BY seq

Related

Select columns that are primary key or foreign key only

Assuming that I have the following table
CREATE TABLE my_table (
a TEXT NULL,
id TEXT PRIMARY KEY NOT NULL,
c TEXT NULL,
d TEXT REFERENCES other_table_1(id) NOT NULL,
e TEXT REFERENCES other_table_2(id) NOT NULL
);
I want to perform a select statement that only select important column that is primary key and foreign key only.
SELECT (...?) FROM my_table
Expected output columns only id, d, e
What is the best non-hacky way I can achieve this?

You can create a function that build the sql statement for you and after you can execute the result.
CREATE OR REPLACE FUNCTION build_select(_tbl regclass)
RETURNS text AS
$func$
SELECT format('SELECT %s FROM %s'
, string_agg(quote_ident(important_column), ', ')
, $1)
FROM (SELECT a.attname as important_column
FROM pg_index i
JOIN pg_attribute a ON a.attrelid = i.indrelid
AND a.attnum = ANY(i.indkey)
WHERE i.indrelid = $1
AND i.indisprimary
UNION
SELECT ta.attname AS important_column
FROM (
SELECT conname, conrelid, confrelid,
unnest(conkey) AS conkey, unnest(confkey) AS confkey
FROM pg_constraint
WHERE conrelid = $1
) sub
JOIN pg_attribute AS ta ON ta.attrelid = conrelid AND ta.attnum = conkey
JOIN pg_attribute AS fa ON fa.attrelid = confrelid AND fa.attnum = confkey) my_sub_query
$func$ LANGUAGE sql;
With this function, if you do:
SELECT build_select('my-schema.my-table')
Will return you:
SELECT id, d, e FROM my-schema.my-table
And you can execute this one

Redshift - Cannot load query results into table - leader node issue

My goal is it to load query results to a table to store an ordered list of column names for a given table. (Then I will stuff all these column names, into a single column, using the listagg function which I will pass to dynamic sql.) The reason I cannot load this into a table is because system table queries compute on leader nodes, yet this query does not execute on a leader node, and there is no way to force it to execute on a leader node. Any ideas how to get this to execute successfully?
create table temp_columns
as
select
cast(t1.columnname as varchar) as columname
--,
--cast(t2.ordinal_position as integer) as ordinal_position
FROM
(
SELECT
cast(schemaname as varchar) as schemaname,
cast(tablename as varchar) as tablename,
cast("column" as varchar) as columnname
FROM PG_TABLE_DEF
WHERE
schemaname = 'schema1'
and tablename = 'table1'
)
t1
join information_schema.columns t2
on t1.schemaname = t2.table_schema
and t1.tablename = t2.table_name
and t1.columnname = t2.column_name
WHERE
t1.schemaname = 'schema1'
and t1.tablename = 'table1'
order by t2.ordinal_position;

We got around this by creating the table first, then doing a INSERT INTO.
so something like:
create table public.temp_columns
(
columname varchar(255),
ordinal_position int
);
insert into public.temp_columns
select
cast(t1.columnname as varchar) as columname,
cast(t2.ordinal_position as integer) as ordinal_position
FROM
(
SELECT
cast(schemaname as varchar) as schemaname,
cast(tablename as varchar) as tablename,
cast("column" as varchar) as columnname
FROM PG_TABLE_DEF
WHERE
schemaname = 'schema1'
and tablename = 'table1'
)
t1
join information_schema.columns t2
on t1.schemaname = t2.table_schema
and t1.tablename = t2.table_name
and t1.columnname = t2.column_name
WHERE
t1.schemaname = 'schema1'
and t1.tablename = 'table1'
order by t2.ordinal_position;

How to list tables affected by cascading delete

I'm trying to perform a cascading delete on 15+ tables but I'm not certain that all of the requisite foreign keys have been configured properly. I would like to check for missing constraints without manually reviewing each constraint.
Is there a way to obtain a list of tables that will be affected by a cascading delete query?

Use pg_depend. Example:
create table master (id int primary key);
create table detail_1 (id int, master_id int references master(id) on delete restrict);
create table detail_2 (id int, master_id int references master(id) on delete cascade);
select pg_describe_object(classid, objid, objsubid)
from pg_depend
where refobjid = 'master'::regclass and deptype = 'n';
pg_describe_object
------------------------------------------------------
constraint detail_1_master_id_fkey on table detail_1
constraint detail_2_master_id_fkey on table detail_2
(2 rows)
deptype = 'n' means:
DEPENDENCY NORMAL - A normal relationship between separately-created
objects. The dependent object can be dropped without affecting the
referenced object. The referenced object can only be dropped by
specifying CASCADE, in which case the dependent object is dropped,
too.
Use pg_get_constraintdef() to get constraint definitions:
select
pg_describe_object(classid, objid, objsubid),
pg_get_constraintdef(objid)
from pg_depend
where refobjid = 'master'::regclass and deptype = 'n';
pg_describe_object | pg_get_constraintdef
------------------------------------------------------+------------------------------------------------------------------
constraint detail_1_master_id_fkey on table detail_1 | FOREIGN KEY (master_id) REFERENCES master(id) ON DELETE RESTRICT
constraint detail_2_master_id_fkey on table detail_2 | FOREIGN KEY (master_id) REFERENCES master(id) ON DELETE CASCADE
(2 rows)
To find the full chain of cascading dependencies we should use recursion and look into the catalog pg_constraint to get id of a dependent table.
with recursive chain as (
select classid, objid, objsubid, conrelid
from pg_depend d
join pg_constraint c on c.oid = objid
where refobjid = 'the_table'::regclass and deptype = 'n'
union all
select d.classid, d.objid, d.objsubid, c.conrelid
from pg_depend d
join pg_constraint c on c.oid = objid
join chain on d.refobjid = chain.conrelid and d.deptype = 'n'
)
select pg_describe_object(classid, objid, objsubid), pg_get_constraintdef(objid)
from chain;

Using transitive closure, one can determine the referencing and referenced tables. A caveat is that this query/view depends on the existence of Foreign Keys to determine the dependencies, and will not find tables if the FK's are missing (and the latter seems to be what the OP is asking for).
Table dependencies via Foreign Keys
CREATE OR REPLACE VIEW table_dependencies AS (
WITH RECURSIVE t AS (
SELECT
c.oid AS origin_id,
c.oid::regclass::text AS origin_table,
c.oid AS referencing_id,
c.oid::regclass::text AS referencing_table,
c2.oid AS referenced_id,
c2.oid::regclass::text AS referenced_table,
ARRAY[c.oid::regclass,c2.oid::regclass] AS chain
FROM pg_catalog.pg_constraint AS co
INNER JOIN pg_catalog.pg_class AS c ON c.oid = co.conrelid
INNER JOIN pg_catalog.pg_class AS c2 ON c2.oid = co.confrelid
-- Add this line as an input parameter if you want to make a one-off query
-- WHERE c.oid::regclass::text = 'YOUR TABLE'
UNION ALL
SELECT
t.origin_id,
t.origin_table,
t.referenced_id AS referencing_id,
t.referenced_table AS referencing_table,
c3.oid AS referenced_id,
c3.oid::regclass::text AS referenced_table,
t.chain || c3.oid::regclass AS chain
FROM pg_catalog.pg_constraint AS co
INNER JOIN pg_catalog.pg_class AS c3 ON c3.oid = co.confrelid
INNER JOIN t ON t.referenced_id = co.conrelid
WHERE
-- prevent infinite recursion by pruning paths where the last entry in
-- the path already appears somewhere else in the path
NOT (
ARRAY[ t.chain[array_upper(t.chain, 1)] ] -- an array containing the last element
<# -- "is contained by"
t.chain[1:array_upper(t.chain, 1) - 1] -- a slice of the chain,
-- from element 1 to n-1
)
)
SELECT origin_table,
referenced_table,
array_upper(chain,1) AS "depth",
array_to_string(chain,',') as chain
FROM t
);
Tables referencing a specific table
SELECT * FROM table_dependencies WHERE origin_table = 'clients';
Tables directly related to the "clients" table
SELECT *
FROM table_dependencies
WHERE referenced_table = 'clients'
AND depth = 2
ORDER BY origin_table;

Yes. you can truncate cascade in transaction and rollback. Note ROLLBACK is a key to save the data.
postgres will NOTICE you what other referencing tables will be affected.
postgres=# begin;
BEGIN
postgres=# truncate table a cascade;
NOTICE: truncate cascades to table "b"
TRUNCATE TABLE
postgres=# rollback;
ROLLBACK

Get table schema in Redshift

Hello I am trying to retrieve the schema of an existing table. I am mysql developer and am trying to work with amazon redshift. How can I export the schema of an existing table. In mysql we can use the show create table command.
SHOW CREATE TABLE tblName;

Recently I wrote a python script to clone table schemas between redshift clusters. If you only want the columns and column types of a table, you can do it via:
select column_name,
case
when data_type = 'integer' then 'integer'
when data_type = 'bigint' then 'bigint'
when data_type = 'smallint' then 'smallint'
when data_type = 'text' then 'text'
when data_type = 'date' then 'date'
when data_type = 'real' then 'real'
when data_type = 'boolean' then 'boolean'
when data_type = 'double precision' then 'float8'
when data_type = 'timestamp without time zone' then 'timestamp'
when data_type = 'character' then 'char('||character_maximum_length||')'
when data_type = 'character varying' then 'varchar('||character_maximum_length||')'
when data_type = 'numeric' then 'numeric('||numeric_precision||','||numeric_scale||')'
else 'unknown'
end as data_type,
is_nullable,
column_default
from information_schema.columns
where table_schema = 'xxx' and table_name = 'xxx' order by ordinal_position
;
But if you need the compression types and distkey/sortkeys, you need to query another table:
select * from pg_table_def where tablename = 'xxx' and schemaname='xxx';

This query will give you the complete schema definition including the Redshift specific attributes distribution type/key, sort key, primary key, and column encodings in the form of a create statement as well as providing an alter table statement that sets the owner to the current owner. The only thing it can't tell you are foreign keys. I'm working on the latter, but there's a current privilege issue in RS that prevents us from querying the right tables. This query could use some tuning, but I haven't had time or the need to work it further.
select pk.pkey, tm.schemaname||'.'||tm.tablename, 'create table '||tm.schemaname||'.'||tm.tablename
||' ('
||cp.coldef
-- primary key
||decode(pk.pkey,null,'',pk.pkey)
-- diststyle and dist key
||decode(d.distkey,null,') diststyle '||dist_style||' ',d.distkey)
--sort key
|| (select decode(skey,null,'',skey) from (select
' sortkey(' ||substr(array_to_string(
array( select ','||cast(column_name as varchar(100)) as str from
(select column_name from information_schema.columns col where col.table_schema= tm.schemaname and col.table_name=tm.tablename) c2
join
(-- gives sort cols
select attrelid as tableid, attname as colname, attsortkeyord as sort_col_order from pg_attribute pa where
pa.attnum > 0 AND NOT pa.attisdropped AND pa.attsortkeyord > 0
) st on tm.tableid=st.tableid and c2.column_name=st.colname order by sort_col_order
)
,'')
,2,10000) || ')' as skey
))
||';'
-- additional alter table queries here to set owner
|| 'alter table '||tm.schemaname||'.'||tm.tablename||' owner to "'||tm.owner||'";'
from
-- t master table list
(
SELECT substring(n.nspname,1,100) as schemaname, substring(c.relname,1,100) as tablename, c.oid as tableid ,use2.usename as owner, decode(c.reldiststyle,0,'EVEN',1,'KEY',8,'ALL') as dist_style
FROM pg_namespace n, pg_class c, pg_user use2
WHERE n.oid = c.relnamespace
AND nspname NOT IN ('pg_catalog', 'pg_toast', 'information_schema')
AND c.relname <> 'temp_staging_tables_1'
and c.relowner = use2.usesysid
) tm
-- cp creates the col params for the create string
join
(select
substr(str,(charindex('QQQ',str)+3),(charindex('ZZZ',str))-(charindex('QQQ',str)+3)) as tableid
,substr(replace(replace(str,'ZZZ',''),'QQQ'||substr(str,(charindex('QQQ',str)+3),(charindex('ZZZ',str))-(charindex('QQQ',str)+3)),''),2,10000) as coldef
from
( select array_to_string(array(
SELECT 'QQQ'||cast(t.tableid as varchar(10))||'ZZZ'|| ','||column_name||' '|| decode(udt_name,'bpchar','char',udt_name) || decode(character_maximum_length,null,'', '('||cast(character_maximum_length as varchar(9))||')' )
-- default
|| decode(substr(column_default,2,8),'identity','',null,'',' default '||column_default||' ')
-- nullable
|| decode(is_nullable,'YES',' NULL ','NO',' NOT NULL ')
-- identity
|| decode(substr(column_default,2,8),'identity',' identity('||substr(column_default,(charindex('''',column_default)+1), (length(column_default)-charindex('''',reverse(column_default))-charindex('''',column_default) ) ) ||') ', '')
-- encoding
|| decode(enc,'none','',' encode '||enc)
as str
from
-- ci all the col info
(
select cast(t.tableid as int), cast(table_schema as varchar(100)), cast(table_name as varchar(100)), cast(column_name as varchar(100)),
cast(ordinal_position as int), cast(column_default as varchar(100)), cast(is_nullable as varchar(20)) , cast(udt_name as varchar(50)) ,cast(character_maximum_length as int),
sort_col_order , decode(d.colname,null,0,1) dist_key , e.enc
from
(select * from information_schema.columns c where c.table_schema= t.schemaname and c.table_name=t.tablename) c
left join
(-- gives sort cols
select attrelid as tableid, attname as colname, attsortkeyord as sort_col_order from pg_attribute a where
a.attnum > 0 AND NOT a.attisdropped AND a.attsortkeyord > 0
) s on t.tableid=s.tableid and c.column_name=s.colname
left join
(-- gives encoding
select attrelid as tableid, attname as colname, format_encoding(a.attencodingtype::integer) AS enc from pg_attribute a where
a.attnum > 0 AND NOT a.attisdropped
) e on t.tableid=e.tableid and c.column_name=e.colname
left join
-- gives dist col
(select attrelid as tableid, attname as colname from pg_attribute a where
a.attnum > 0 AND NOT a.attisdropped AND a.attisdistkey = 't'
) d on t.tableid=d.tableid and c.column_name=d.colname
order by ordinal_position
) ci
-- for the working array funct
), '') as str
from
(-- need tableid
SELECT substring(n.nspname,1,100) as schemaname, substring(c.relname,1,100) as tablename, c.oid as tableid
FROM pg_namespace n, pg_class c
WHERE n.oid = c.relnamespace
AND nspname NOT IN ('pg_catalog', 'pg_toast', 'information_schema')
) t
)) cp on tm.tableid=cp.tableid
-- primary key query here
left join
(select c.oid as tableid, ', primary key '|| substring(pg_get_indexdef(indexrelid),charindex('(',pg_get_indexdef(indexrelid))-1 ,60) as pkey
from pg_index i , pg_namespace n, pg_class c
where i.indisprimary=true
and i.indrelid =c.oid
and n.oid = c.relnamespace
) pk on tm.tableid=pk.tableid
-- dist key
left join
( select
-- close off the col defs after the primary key
')' ||
' distkey('|| cast(column_name as varchar(100)) ||')' as distkey, t.tableid
from information_schema.columns c
join
(-- need tableid
SELECT substring(n.nspname,1,100) as schemaname, substring(c.relname,1,100) as tablename, c.oid as tableid
FROM pg_namespace n, pg_class c
WHERE n.oid = c.relnamespace
AND nspname NOT IN ('pg_catalog', 'pg_toast', 'information_schema')
) t on c.table_schema= t.schemaname and c.table_name=t.tablename
join
-- gives dist col
(select attrelid as tableid, attname as colname from pg_attribute a where
a.attnum > 0 AND NOT a.attisdropped AND a.attisdistkey = 't'
) d on t.tableid=d.tableid and c.column_name=d.colname
) d on tm.tableid=d.tableid
where tm.schemaname||'.'||tm.tablename='myschema.mytable'

If you want to get the table structure with create statement, constraints and triggers, you can use pg_dump utility
pg_dump -U user_name -s -t table_name -d db_name
Note: -s used for schema only dump
if you want to take the data only dump , you can use -a switch.
This will output the create syntax with all the constraints. Hope this will help you.

I did not find any complete solutions out there.
And wrote a python script:
https://github.com/cxmcc/redshift_show_create_table
It will work like pg_dump, plus dealing with basic redshift features, SORTKEY/DISTKEY/DISTSTYLES etc.

As show table doesn't work on Redshift:
show table <YOUR_TABLE>;
ERROR: syntax error at or near "<YOUR_TABLE>"
We can use pg_table_def table to get the schema out:
select "column", type, encoding, distkey, sortkey, "notnull"
from pg_table_def
where tablename = '<YOUR_TABLE>';
NOTE: If the schema is not on the search path, add it to search path using:
set search_path to '$user', 'public', '<YOUR_SCHEMA>';

For redshift please try
show table <**tablename**> ;

In Postgres, you'd query the catalog.
From with psql use the shorthands to a variety of commands whose list you'll get by using \? (for help). Therefor, either of:
\d yourtable
\d+ yourtable
For use in an app, you'll need to learn the relevant queries involved. It's relatively straightforward by running psql -E (for echo hidden queries) instead of plain psql.
If you need the precise create table statement, see #Anant answer.

One easy way to do this is to use the utility provided by AWS. All you need to do is to create the view in your database and then query that view to get any table ddl. The advantage to use this view is that it will give you the sortkey and distkey as well which was used in original create table command.
https://github.com/awslabs/amazon-redshift-utils/blob/master/src/AdminViews/v_generate_tbl_ddl.sql
Once the view is created, to get the the ddl of any table. You need to query like this -
select ddl from table where tablename='table_name' and schemaname='schemaname';
Note: Admin schema might not be already there in your cluster. So you can create this view in public schema.

Below query will generate the DDL of the table for you:
SELECT ddl
FROM admin.v_generate_tbl_ddl
WHERE schemaname = '<schemaname>'
AND tablename in (
'<tablename>');

Are you needing to retrieve it programatically or from the psql prompt?
In psql use : \d+ tablename
Programatically, you can query the ANSI standard INFORMATION_SCHEMA views documented here:
http://www.postgresql.org/docs/9.1/static/information-schema.html
The INFORMATION_SCHEMA.TABLES and INFORMATION_SCHEMA.COLUMNS views should have what you need.

You can use admin view provided by AWS Redshift - https://github.com/awslabs/amazon-redshift-utils/blob/master/src/AdminViews/v_generate_tbl_ddl.sql
once you have created the view you can get schema creation script by running:
select * from <db_schema>.v_generate_tbl_ddl where tablename = '<table_name>'

To get the column data and schema of a particular table:
select * from information_schema.columns where tablename='<<table_name>>'
To get the information of a table metadata fire the below query
select * from information_schema.tables where schema='<<schema_name>>'

In the new "query editor 2", you can right click on a table and select "show definition", this will place the DDL for the table in a query window.

The below command will work:
mysql > show create table test.users_info;
Redshift/postgress >pg_dump -U root-w --no-password -h 62.36.11.547 -p 5439 -s -t test.users_info ;

Postgres dynamically update constraint foreign key

I have lots of tables with lots of foreign keys and about all of them are UPDATE NO ACTION and DELETE NO ACTION.
Is it possible to dynamically update all this foreign keys to CASCADE instead of NO ACTION or RESTRICT?
For example:
ALTER TABLE * ALTER FOREIGN KEY * SET ON UPDATE CASCADE ON DELETE CASCADE;
Yours,
Diogo

No, this is not possible.
You will need to drop and re-create all constraints as a foreign key constraint cannot be altered like that.
The following statement will generate the necessary alter table statements to drop and re-create the foreign keys:
select 'alter table '||pgn.nspname||'.'||tbl.relname||' drop constraint '||cons.conname||';'
from pg_constraint cons
join pg_class tbl on cons.confrelid = tbl.oid
join pg_namespace pgn on pgn.oid = tbl.relnamespace
where contype = 'f'
union all
select 'alter table '||pgn.nspname||'.'||tbl.relname||' add constraint '||cons.conname||' '||pg_get_constraintdef(cons.oid, true)||' ON UPDATE CASCADE ON DELETE CASCADE;'
from pg_constraint cons
join pg_class tbl on cons.confrelid = tbl.oid
join pg_namespace pgn on pgn.oid = tbl.relnamespace
where contype = 'f'
Save the output of this statement to a file and run it.
Make sure you validate the generated statements before running them!

I would use the following code to generate the necessary alter table SQL statements to drop and re-create the foreign keys (in order):
select 'ALTER TABLE '||pgn.nspname||'.'||tbl.relname||' DROP CONSTRAINT '||cons.conname||';' as sqlstr
from pg_constraint cons
join pg_class tbl on cons.conrelid = tbl.oid
join pg_namespace pgn on pgn.oid = tbl.relnamespace
where contype = 'f'
union
select 'ALTER TABLE '||pgn.nspname||'.'||tbl.relname||' ADD CONSTRAINT '||cons.conname||' FOREIGN KEY ('||(select attname from pg_attribute where attrelid=cons.conrelid and attnum = ANY(cons.conkey))||') REFERENCES '||tblf.relname||' ('||(select attname from pg_attribute where attrelid=cons.confrelid and attnum = ANY(cons.confkey))||') ON UPDATE CASCADE ON DELETE CASCADE;' as sqlstr
from pg_constraint cons
join pg_class tbl on cons.conrelid = tbl.oid
join pg_namespace pgn on pgn.oid = tbl.relnamespace
join pg_class tblf on cons.confrelid = tblf.oid
where contype = 'f'
ORDER BY sqlstr desc