How do we get all columns that are part of the sortkey in Redshift - amazon-redshift

I need to get all the columns that are part of the sortkey for a table in Redshift.
I tried getting the information using "select * from svv_table_info", but it only has information about one column. Can you let me know how I can get all the columns that are part of the sortkey for a table?
Thanks,
Sanjeev

Thanks all for your help. I had already tried the "pg_table_def" table to get the sortkey and distkey information, but I could see only the pg_catalog and public schemas. I then went through the Amazon developer guide and found that we need to add the schema to the search path using the commands below:
show search_path;
set search_path to '$user', 'public', 'NewSchema';
After adding "NewSchema" to the search path, I can see the sortkey and distkey information for this schema in pg_table_def.
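For example, once the schema is on the search path, a query along these lines (the table name here is just a placeholder) lists only the sortkey columns of a table, in sort key order:
select "column", sortkey
from pg_table_def
where tablename = 'your_table'
  and sortkey <> 0
order by sortkey;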
Thanks,
Sanjeev

Sanjeev,
A table called pg_table_def has information about the columns.
In the example below, I created a simple table with four columns and used two of these columns as my sort key.
As you can see in my query results, the "sortkey" field shows a number other than 0 if the column is part of a sort key.
dev=# drop table tb1;
DROP TABLE
dev=# create table tb1 (col1 integer, col2 integer, col3 integer, col4 integer) distkey(col1) sortkey(col2, col4);
CREATE TABLE
dev=# select * from pg_table_def where tablename = 'tb1';
 schemaname | tablename | column |  type   | encoding | distkey | sortkey | notnull
------------+-----------+--------+---------+----------+---------+---------+---------
 public     | tb1       | col1   | integer | none     | t       |       0 | f
 public     | tb1       | col2   | integer | none     | f       |       1 | f
 public     | tb1       | col3   | integer | none     | f       |       0 | f
 public     | tb1       | col4   | integer | none     | f       |       2 | f
(4 rows)

What about:
select "column", type, encoding, distkey, sortkey, "notnull"
from pg_table_def
where tablename = 'YOURTABLE'
and sortkey <> 0;

Related

Postgres: does updating a column value to the same value mark the page as dirty?

Consider the following scenario in PostgreSQL (any version from 10+):
CREATE TABLE users(
id serial primary key,
name text not null unique,
last_seen timestamp
);
INSERT INTO users(name, last_seen)
VALUES ('Alice', '2019-05-01'),
('Bob', '2019-04-29'),
('Dorian', '2019-05-11');
CREATE TABLE inactive_users(
user_id int primary key references users(id),
last_seen timestamp not null);
INSERT INTO inactive_users(user_id, last_seen)
SELECT id as user_id, last_seen FROM users
WHERE users.last_seen < '2019-05-04'
ON CONFLICT (user_id) DO UPDATE SET last_seen = excluded.last_seen;
Now let's say that I want to insert the same values (execute the last statement) multiple times, every now and then. In practice, from the database's point of view, 90% of the time the conflicting rows' last_seen column will be updated to the same value it already had. The values of the rows stay the same, so there's no reason to do I/O writes, right? But is this really the case, or will Postgres perform the corresponding updates even though the actual values didn't change?
In my case the destination table has dozens of millions of rows, but only a few hundred or thousand of them will actually change on each insert.
Any UPDATE to a row will actually create a new row (marking the old row deleted/dirty), regardless of the before/after values:
[root@497ba0eaf137 /]# psql
psql (12.1)
Type "help" for help.
postgres=# create table foo (id int, name text);
CREATE TABLE
postgres=# insert into foo values (1,'a');
INSERT 0 1
postgres=# select ctid,* from foo;
ctid | id | name
-------+----+------
(0,1) | 1 | a
(1 row)
postgres=# update foo set name = 'a' where id = 1;
UPDATE 1
postgres=# select ctid,* from foo;
ctid | id | name
-------+----+------
(0,2) | 1 | a
(1 row)
postgres=# update foo set id = 1 where id = 1;
UPDATE 1
postgres=# select ctid,* from foo;
ctid | id | name
-------+----+------
(0,3) | 1 | a
(1 row)
postgres=# select * from pg_stat_user_tables where relname = 'foo';
-[ RECORD 1 ]-------+-------
relid | 16384
schemaname | public
relname | foo
seq_scan | 5
seq_tup_read | 5
idx_scan |
idx_tup_fetch |
n_tup_ins | 1
n_tup_upd | 2
n_tup_del | 0
n_tup_hot_upd | 2
n_live_tup | 1
n_dead_tup | 2
<...>
And according to your example:
postgres=# select ctid,* FROM inactive_users ;
ctid | user_id | last_seen
-------+---------+---------------------
(0,1) | 1 | 2019-05-01 00:00:00
(0,2) | 2 | 2019-04-29 00:00:00
(2 rows)
postgres=# INSERT INTO inactive_users(user_id, last_seen)
postgres-# SELECT id as user_id, last_seen FROM users
postgres-# WHERE users.last_seen < '2019-05-04'
postgres-# ON CONFLICT (user_id) DO UPDATE SET last_seen = excluded.last_seen;
INSERT 0 2
postgres=# select ctid,* FROM inactive_users ;
ctid | user_id | last_seen
-------+---------+---------------------
(0,3) | 1 | 2019-05-01 00:00:00
(0,4) | 2 | 2019-04-29 00:00:00
(2 rows)
Postgres does not do any data validation against the column values -- if you are looking to prevent unnecessary write activity, you will need to surgically craft your WHERE clauses.
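For example (a sketch, not part of the original statement), ON CONFLICT ... DO UPDATE accepts its own WHERE clause, so the redundant updates can be skipped by only updating rows whose value actually differs:
INSERT INTO inactive_users(user_id, last_seen)
SELECT id as user_id, last_seen FROM users
WHERE users.last_seen < '2019-05-04'
ON CONFLICT (user_id) DO UPDATE SET last_seen = excluded.last_seen
WHERE inactive_users.last_seen IS DISTINCT FROM excluded.last_seen;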
Disclosure: I work for EnterpriseDB (EDB)

Postgres: translate a column value into a schema prefix in a query

I have a database that uses postgresql schemas for multi-tenancy purposes. It has a table in the public schema called customers with an id and tenant column. The value for tenant is a string, and there's a corresponding postgresql schema with tables in it that match.
It looks like this:
# public.customers
| id | tenant |
|----|--------|
| 1  | first  |
| 2  | second |

# first.users
| id | name |
|----|------|
| 1  | bob  |
| 2  | jess |

# second.users
| id | name |
|----|------|
| 1  | jen  |
| 2  | mike |
I'm wondering how I could make a single query to fetch values from a table in the schema, just given a customer id.
So if I have a customer_id of 1, how can I select * from first.users in a single query?
I'm guessing this might have to be a function written in plpgsql, but I don't have a lot of experience with that. Something like:
select * from tenant_table(1, 'users');
?
create or replace function f(_id int)
returns table (id int, name text) as $f$
declare _tenant text;
begin
  -- look up the tenant (schema) name for this customer
  select tenant into _tenant
  from public.customers
  where customers.id = _id;
  -- run a dynamic query against that schema; %I quotes the identifier safely
  return query execute format($e$
    select *
    from %I.users
  $e$, _tenant);
end;
$f$ language plpgsql;
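Assuming each tenant's users table really has matching (id, name) columns, the function is then called like:
select * from f(1);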
You cannot do that with a single query.
You'll have to use one query that selects the schema name, then construct a second query and run that.
Of course you can define a PL/pgSQL function that does both for you and executes the dynamic query with EXECUTE.

Checking fillfactor setting for tables and indexes

Is there maybe some function to check the fillfactor for indexes and tables? I've already tried \d+, but it gives only the basic definition, without the fillfactor value:
Index "public.tab1_pkey"
Column | Type | Definition | Storage
--------+--------+------------+---------
id | bigint | id | plain
primary key, btree, for table "public.tab1"
For tables I haven't found anything. If a table was created with a fillfactor other than the default:
CREATE TABLE distributors (
did integer,
name varchar(40),
UNIQUE(name) WITH (fillfactor=70)
)
WITH (fillfactor=70);
Then \d+ distributors shows the non-standard fillfactor.
Table "public.distributors"
Column | Type | Modifiers | Storage | Stats target | Description
--------+-----------------------+-----------+----------+--------------+-------------
did | integer | | plain | |
name | character varying(40) | | extended | |
Indexes:
"distributors_name_key" UNIQUE CONSTRAINT, btree (name) WITH (fillfactor=70)
Has OIDs: no
Options: fillfactor=70
But maybe there is a way to get this value without parsing output?
You need to query the pg_class system table:
select t.relname as table_name,
t.reloptions
from pg_class t
join pg_namespace n on n.oid = t.relnamespace
where t.relname in ('tab1_pkey', 'tab1')
and n.nspname = 'public';
reloptions is an array, with each element containing one option=value definition. But it will be null for relations that have the default options.
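If you want just the fillfactor value rather than the whole options array, a query along these lines (a sketch, not part of the original answer) unpacks reloptions:
select c.relname,
       split_part(o.option, '=', 2) as fillfactor
from pg_class c
cross join lateral unnest(c.reloptions) as o(option)
where o.option like 'fillfactor=%';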

See all indexes and their columns for a table

How can I see all existing indexes for a table? For example, given the table mytable, how can I see each of its indexes along with the columns it covers?
Try this SQL
SELECT * FROM pg_indexes WHERE tablename = 'mytable';
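If you need the index columns as separate rows rather than parsing them out of the indexdef text, a catalog query roughly like this (a sketch, not from the original answer; expression indexes are not covered) also works:
select i.relname as index_name,
       a.attname as column_name
from pg_class t
join pg_index ix on ix.indrelid = t.oid
join pg_class i on i.oid = ix.indexrelid
join pg_attribute a on a.attrelid = t.oid
                   and a.attnum = any(ix.indkey)
where t.relname = 'mytable'
order by i.relname, a.attnum;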
In psql use the \d command:
postgres=> create table foo (id integer not null primary key, some_data varchar(20));
CREATE TABLE
postgres=> create index foo_data_idx on foo (some_data);
CREATE INDEX
postgres=> \d+ foo
Table "public.foo"
Column | Type | Modifiers | Storage | Stats target | Description
-----------+-----------------------+-----------+----------+--------------+------------
id | integer | not null | plain | |
some_data | character varying(20) | | extended | |
Indexes:
"foo_pkey" PRIMARY KEY, btree (id)
"foo_data_idx" btree (some_data)
Has OIDs: no
postgres=>
Other SQL tools have other means of displaying this information.

reference to a sequence column (postgresql)

I encountered a problem when creating a foreign key referencing a sequence; see the code example below.
But on creating the tables I receive the following error:
"Detail: Key columns "product" and "id" are of incompatible types: integer and ownseq"
I've already tried different data types for the product column (like smallint, bigint), but none of them is accepted.
CREATE SEQUENCE ownseq INCREMENT BY 1 MINVALUE 100 MAXVALUE 99999;
CREATE TABLE products (
id ownseq PRIMARY KEY,
...);
CREATE TABLE basket (
basket_id SERIAL PRIMARY KEY,
product INTEGER FOREIGN KEY REFERENCES products(id));
CREATE SEQUENCE ownseq INCREMENT BY 1 MINVALUE 100 MAXVALUE 99999;
CREATE TABLE products (
id integer PRIMARY KEY default nextval('ownseq'),
...
);
alter sequence ownseq owned by products.id;
The key change is that id is defined as an integer, rather than as ownseq. This is what would happen if you used the SERIAL pseudo-type to create the sequence.
Try
CREATE TABLE products (
id INTEGER DEFAULT nextval(('ownseq'::text)::regclass) NOT NULL PRIMARY KEY,
...);
or don't create the sequence ownseq and let postgres do it for you:
CREATE TABLE products (
id SERIAL NOT NULL PRIMARY KEY
...);
In the above case the name of the sequence Postgres has created should be products_id_seq.
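If you want to double-check the generated name, pg_get_serial_sequence reports it:
select pg_get_serial_sequence('products', 'id');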
Hope this helps.
PostgreSQL is powerful and you have just been bitten by an advanced feature.
Your DDL is quite valid but not at all what you think it is.
A sequence can be thought of as an extra-transactional simple table used for generating next values for some columns.
What you meant to do
You meant to have the id field defined thus, as per the other answer:
id integer PRIMARY KEY default nextval('ownseq'),
What you did
What you did was actually define a nested data structure for your table. Suppose I create a test sequence:
CREATE SEQUENCE testseq;
Then suppose I run \d testseq on Pg 9.1; I get:
Sequence "public.testseq"
Column | Type | Value
---------------+---------+---------------------
sequence_name | name | testseq
last_value | bigint | 1
start_value | bigint | 1
increment_by | bigint | 1
max_value | bigint | 9223372036854775807
min_value | bigint | 1
cache_value | bigint | 1
log_cnt | bigint | 0
is_cycled | boolean | f
is_called | boolean | f
This is the definition of the type the sequence used.
Now suppose I:
create table seqtest (test testseq, id serial);
I can insert into it:
INSERT INTO seqtest (id, test) values (default, '("testseq",3,4,1,133445,1,1,0,f,f)');
I can then select from it:
select * from seqtest;
test | id
----------------------------------+----
(testseq,3,4,1,133445,1,1,0,f,f) | 2
Moreover I can expand test:
select (test).* from seqtest;
 sequence_name | last_value | start_value | increment_by | max_value | min_value | cache_value | log_cnt | is_cycled | is_called
---------------+------------+-------------+--------------+-----------+-----------+-------------+---------+-----------+-----------
               |            |             |              |           |           |             |         |           |
 testseq       |          3 |           4 |            1 |    133445 |         1 |           1 |       0 | f         | f
(2 rows)
This sort of thing is actually very powerful in PostgreSQL but full of unexpected corners (for example not null and check constraints don't work as expected with nested data types). I don't generally recommend nested data types, but it is worth knowing that PostgreSQL can do this and will be happy to accept SQL commands to do it without warning.