If I create a PostgreSQL 10 partition for a table like this:
CREATE TABLE measurement_y2006m01 PARTITION OF measurement
FOR VALUES FROM ('2006-01-01') TO ('2006-02-01');
How can I recreate the DDL from the pg_catalog tables and views? The pg_class table has a relpartbound column, but its content is in an internal unreadable format.
You can use pg_get_expr() to get a readable version of the partition definition:
select pg_get_expr(c.relpartbound, c.oid, true) as partition_expression
from pg_class c
where relname = 'measurement_y2006m01';
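If you need the complete statement, here is a sketch that also pulls in the parent table via pg_inherits (using the names from the example above; adapt as needed):
select format('CREATE TABLE %s PARTITION OF %s %s;',
              c.oid::regclass,
              p.oid::regclass,
              pg_get_expr(c.relpartbound, c.oid, true)) as ddl
from pg_class c
join pg_inherits i on i.inhrelid = c.oid
join pg_class p on p.oid = i.inhparent
where c.relname = 'measurement_y2006m01';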
Are there multiple ways to set n_distinct in PostgreSQL? Both of these seem to be doing the same thing but end up changing a different value within pg_attribute. What is the difference between these two commands?
alter table my_table alter column my_column set (n_distinct = 500);
alter table my_table alter column my_column set statistics 1000;
select
c.relname,
a.attname,
a.attoptions,
a.attstattarget
from
pg_class c
inner join
pg_attribute a
on c.oid = a.attrelid
where
c.relname = 'my_table'
and
a.attname = 'my_column'
order by
c.relname,
a.attname;
Name |Value
-------------|----------------
relname |my_table
attname |my_column
attoptions |{n_distinct=500}
attstattarget|1000
Both of these seem to be doing the same thing
Why would you say that? The two commands are clearly distinct. Both relate to column statistics and query planning, but they do very different things.
The statistics target ...
controls the level of detail of statistics accumulated for this column by ANALYZE. See:
Check statistics targets in PostgreSQL
Basics in the manual.
Setting n_distinct is something completely different. It means hard-coding the number (or ratio) of distinct values to expect for the given column. (But only effective after the next ANALYZE.)
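For illustration, a small sketch applying both settings to the example column (the numbers are arbitrary):
-- hard-code an absolute number of distinct values
alter table my_table alter column my_column set (n_distinct = 500);
-- or a ratio: -0.1 means "10 % of the rows are distinct"
alter table my_table alter column my_column set (n_distinct = -0.1);
-- raise the sampling detail for this column only
alter table my_table alter column my_column set statistics 1000;
-- neither shows up in the planner's statistics until the next ANALYZE
analyze my_table;
-- inspect the result
select attname, n_distinct from pg_stats
where tablename = 'my_table' and attname = 'my_column';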
Related answer on dba.SE with more on n_distinct:
Very bad query plan in PostgreSQL 9.6
I have the following table in my PostgreSQL database:
CREATE TABLE values
(
dt timestamp,
series_id integer,
value real
);
CREATE INDEX idx_values_date ON public."values" USING btree (dt);
ALTER TABLE ONLY public."values" ADD CONSTRAINT values_series_id_fkey FOREIGN KEY (series_id) REFERENCES public.series(id) ON DELETE CASCADE;
I'm parsing some CSV files and extracting floats which I add to this table together with a timestamp and series_id which is a foreign key to another table.
The directory containing my raw data files amounts to about 28MB on my drive.
After feeding the data into my table, I run:
SELECT pg_size_pretty( pg_total_relation_size('values') );
And find that the table now has ~871503 rows and is 98 MB in size. Is this normal? I was expecting my table to be much smaller than the raw text files containing the data.
I'd like to mention that the PostgreSQL instance also has PostGIS installed, but I'm not using it in this particular schema. Furthermore, I'm running PostgreSQL from a Docker container.
Later edit ...
After doing some more research and running the following query:
SELECT *, pg_size_pretty(total_bytes) AS total
, pg_size_pretty(index_bytes) AS INDEX
, pg_size_pretty(toast_bytes) AS toast
, pg_size_pretty(table_bytes) AS TABLE
FROM (
SELECT *, total_bytes-index_bytes-COALESCE(toast_bytes,0) AS table_bytes FROM (
SELECT c.oid,nspname AS table_schema, relname AS TABLE_NAME
, c.reltuples AS row_estimate
, pg_total_relation_size(c.oid) AS total_bytes
, pg_indexes_size(c.oid) AS index_bytes
, pg_total_relation_size(reltoastrelid) AS toast_bytes
FROM pg_class c
LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE relkind = 'r'
) a
) a WHERE a.table_name = 'values';
I came up with the following results:
Index: 61MB
Table: 38MB
Can I somehow optimize the index? Maybe it's using some defaults that make it take up so much space?
When I populate a table with your structure with that number of rows, I get:
table: 37 MB
index: 24 MB
So either your index is bloated (you can drop and recreate it, or use REINDEX) or you have more indexes than you are admitting to.
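For instance (using the index name from the question):
REINDEX INDEX idx_values_date;
-- or drop and recreate it:
-- DROP INDEX idx_values_date;
-- CREATE INDEX idx_values_date ON public."values" USING btree (dt);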
But perhaps the better answer is "Yes, relational databases have a lot of overhead, get used to it." If you try to investigate every difference between a database and a flat file, you will drive yourself crazy and accomplish very little from it.
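As a rough sanity check on the table portion (assuming no dead tuples and the default fill factor): each row holds only 16 bytes of data (8-byte timestamp, 4-byte integer, 4-byte real), but it also carries a tuple header of 23 bytes padded to 24, plus a 4-byte line pointer in the page, so roughly 44 bytes per row. 871503 rows × 44 bytes ≈ 38 MB, which matches the table size you measured; the remainder is the index.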
I check the partition information from pg_partition:
select
relname,
parttype,
parentid,
rangenum,
interval,
boundaries
from pg_partition where parttype='p';
The problem is: how do I know which tables these partitions come from?
If you're using Greenplum, pg_partitions has a tablename column. See this answer.
For Postgres, the catalog table that stores partitioning information is pg_partitioned_table.
For details of the tables that contain partitions, you can simply query pg_class, as in this answer:
select c.relnamespace::regnamespace::text as schema,
c.relname as table_name,
pg_get_partkeydef(c.oid) as partition_key
from pg_class c
where c.relkind = 'p';
Here's a demo
If you want all the partition information together with their parent tables, you can combine the two catalogs, as in this answer.
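For stock Postgres, a sketch that lists every partition together with its parent table and its bounds:
select p.oid::regclass as parent_table,
       c.oid::regclass as partition_name,
       pg_get_expr(c.relpartbound, c.oid, true) as bounds
from pg_inherits i
join pg_class p on p.oid = i.inhparent
join pg_class c on c.oid = i.inhrelid
where p.relkind = 'p'
order by 1, 2;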
I just created an new database and it already takes up 7MB. Do you know what is taking up this much space? Is there a way to get the "real" size of the database used as in how much data is stored?
0f41ba72-a1ea-4516-a9f0-de8a3609bc4a=> select pg_size_pretty(pg_database_size(current_database()));
pg_size_pretty
----------------
7055 kB
(1 row)
0f41ba72-a1ea-4516-a9f0-de8a3609bc4a=> \dt
No relations found.
Well, even though you haven't created any relations yet, the new database is not empty. When a CREATE DATABASE is issued, Postgres copies a template database, which comes with the catalog tables, to the new database. In fact, "Nothing is created, everything is transformed". You can use the commands below to inspect this:
--Size per table
SELECT pg_size_pretty(pg_total_relation_size(oid)), relname FROM pg_class WHERE relkind = 'r' AND NOT relisshared;
--Total size
SELECT pg_size_pretty(sum(pg_total_relation_size(oid))) FROM pg_class WHERE relkind = 'r' AND NOT relisshared;
--Total size of databases
SELECT pg_size_pretty(pg_database_size(oid)), datname FROM pg_database;
A quote from the docs:
By default, the new database will be created by cloning the standard
system database template1.
An empty database contains the system catalogs and the Information Schema.
Execute this query to see them:
select nspname as schema, relname as table, pg_total_relation_size(c.oid)
from pg_class c
join pg_namespace n on n.oid = relnamespace
order by 3 desc;
schema | table | pg_total_relation_size
--------------------+-----------------------------+------------------------
pg_catalog | pg_depend | 1146880
pg_catalog | pg_proc | 950272
pg_catalog | pg_rewrite | 589824
pg_catalog | pg_attribute | 581632
... etc
You can get the total size of non-system relations with the query:
select sum(pg_total_relation_size(c.oid))
from pg_class c
join pg_namespace n on n.oid = relnamespace
where nspname not in ('information_schema', 'pg_catalog', 'pg_toast');
The query returns NULL on an empty database.
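If you'd rather get a readable zero than NULL, you can wrap it, for example:
select pg_size_pretty(coalesce(sum(pg_total_relation_size(c.oid)), 0))
from pg_class c
join pg_namespace n on n.oid = c.relnamespace
where nspname not in ('information_schema', 'pg_catalog', 'pg_toast');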
Every PostgreSQL database has its own system catalog, which takes about 7 MB. So your numbers are correct. PostgreSQL is designed for client-server architecture and for databases of 1 GB and more, so this overhead is not significant.
If you need a smaller footprint, you can try an embedded database like SQLite or Firebird.
Does anyone know how to find the OID of a table in Postgres 9.1?
I am writing an update script that needs to test for the existence of a column in a table before it tries to add the column. This is to prevent errors when running the script repeatedly.
To get a table OID, cast to the object identifier type regclass (while connected to the same DB):
SELECT 'mytbl'::regclass::oid;
This finds the first table (or view, etc.) with the given name along the search_path or raises an exception if not found.
Schema-qualify the table name to remove the dependency on the search path:
SELECT 'myschema.mytbl'::regclass::oid;
In Postgres 9.4 or later you can also use to_regclass('myschema.mytbl'), which doesn't raise an exception if the table is not found:
How to check if a table exists in a given schema
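For example, this returns NULL instead of raising an error when the table is missing:
SELECT to_regclass('myschema.mytbl') IS NOT NULL AS table_exists;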
Then you only need to query the catalog table pg_attribute for the existence of the column:
SELECT TRUE AS col_exists
FROM pg_attribute
WHERE attrelid = 'myschema.mytbl'::regclass
AND attname = 'mycol'
AND NOT attisdropped -- no dropped (dead) columns
-- AND attnum > 0 -- no system columns (you may or may not want this)
;
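Putting the two together, a minimal sketch of a re-runnable script (the column type integer is just an assumption for illustration):
DO
$$
BEGIN
   IF NOT EXISTS (
      SELECT 1 FROM pg_attribute
      WHERE  attrelid = 'myschema.mytbl'::regclass
      AND    attname  = 'mycol'
      AND    NOT attisdropped
   ) THEN
      ALTER TABLE myschema.mytbl ADD COLUMN mycol integer;  -- assumed type
   END IF;
END
$$;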
The Postgres catalog table pg_class is what you should look at. There is one row per table, with the table name in the column relname and the OID in the hidden system column oid.
You may also be interested in the pg_attribute catalog table, which includes one row per table column.
See: http://www.postgresql.org/docs/current/static/catalog-pg-class.html and http://www.postgresql.org/docs/current/static/catalog-pg-attribute.html
SELECT oid FROM pg_class WHERE relname = 'tbl_name' AND relkind = 'r';
Just to complete the possibilities, I'd like to add that there is a syntax for dropping columns that does not error out:
ALTER TABLE mytbl
DROP COLUMN IF EXISTS mycol
See http://www.postgresql.org/docs/9.0/static/sql-altertable.html
Then you can safely add your column.
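A minimal sketch of that sequence (the column type is assumed; note that this discards any existing data in the column):
ALTER TABLE mytbl DROP COLUMN IF EXISTS mycol;  -- drops the data too, if the column existed
ALTER TABLE mytbl ADD COLUMN mycol integer;     -- assumed data type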