Why is my empty postgres database 7 MB? - postgresql

I just created a new database and it already takes up 7 MB. Do you know what is taking up this much space? Is there a way to get the "real" size of the database, as in how much data is actually stored?
0f41ba72-a1ea-4516-a9f0-de8a3609bc4a=> select pg_size_pretty(pg_database_size(current_database()));
pg_size_pretty
----------------
7055 kB
(1 row)
0f41ba72-a1ea-4516-a9f0-de8a3609bc4a=> \dt
No relations found.

Well, even though you haven't created any relations yet, the new database is not empty. When CREATE DATABASE is issued, Postgres copies a template database - which comes with the catalog tables - to the new database. In fact, "Nothing is created, everything is transformed". You can use the commands below to inspect this:
--Size per table
SELECT pg_size_pretty(pg_total_relation_size(oid)), relname FROM pg_class WHERE relkind = 'r' AND NOT relisshared;
--Total size
SELECT pg_size_pretty(sum(pg_total_relation_size(oid))) FROM pg_class WHERE relkind = 'r' AND NOT relisshared;
--Total size of databases
SELECT pg_size_pretty(pg_database_size(oid)), datname FROM pg_database;
A quote from the docs:
By default, the new database will be created by cloning the standard
system database template1.
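If you want to see this cloning for yourself, you can name the template explicitly in CREATE DATABASE (newdb and newdb_utf8 below are just placeholder names):
-- Equivalent to the default behaviour: clone template1 explicitly
CREATE DATABASE newdb TEMPLATE template1;
-- template0 is the pristine template, useful when you need a different
-- encoding or are restoring a dump
CREATE DATABASE newdb_utf8 TEMPLATE template0 ENCODING 'UTF8';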

An empty database contains the system catalogs and the information schema.
Execute this query to see them:
select nspname as schema, relname as table, pg_total_relation_size(c.oid)
from pg_class c
join pg_namespace n on n.oid = relnamespace
order by 3 desc;
schema | table | pg_total_relation_size
--------------------+-----------------------------+------------------------
pg_catalog | pg_depend | 1146880
pg_catalog | pg_proc | 950272
pg_catalog | pg_rewrite | 589824
pg_catalog | pg_attribute | 581632
... etc
You can get the total size of non-system relations with the query:
select sum(pg_total_relation_size(c.oid))
from pg_class c
join pg_namespace n on n.oid = relnamespace
where nspname not in ('information_schema', 'pg_catalog', 'pg_toast');
The query returns NULL on an empty database.
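If you would rather get 0 bytes than NULL for an empty database, a small variation on the query above wraps the sum in coalesce:
select pg_size_pretty(coalesce(sum(pg_total_relation_size(c.oid)), 0))
from pg_class c
join pg_namespace n on n.oid = relnamespace
where nspname not in ('information_schema', 'pg_catalog', 'pg_toast');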

Every PostgreSQL database has its own system catalog, which accounts for roughly 7 MB. So your numbers are correct. PostgreSQL is designed for a client-server architecture and for databases of 1 GB and more, so this overhead is not significant.
If you need a smaller footprint, you can try embedded databases like SQLite or Firebird.

Related

Change Schema Name and Then Change It Back Again

In a clean-up effort, I changed some schema names in Redshift. Then I nearly immediately switched the schema names back. All but a few of the tables disappeared.
Is this a known issue?
Should I be more careful about renaming schemas back to their previous names?
sql> ALTER SCHEMA common_schema RENAME TO common_schema_v1
[2019-05-01 14:39:25] completed in 432 ms
sql> ALTER SCHEMA common_schema_v1 RENAME TO common_schema
[2019-05-01 14:48:41] completed in 371 ms
The tables would not normally be dropped by a rename operation.
It could be the rename changed your search path and you're just not seeing the tables now. Try re-adding the schema name to your search path.
SHOW search_path;
SET search_path TO public, common_schema;
You can also look for the tables in the catalog to confirm they're still there.
SELECT *
FROM information_schema.tables
WHERE table_schema = 'common_schema'
;
Or
SELECT nspname AS schema_name
, relname AS table_name
FROM pg_class c
, pg_namespace n
WHERE n.oid = c.relnamespace
AND c.reltype > 0
AND n.nspname = 'common_schema'
ORDER BY 1, 2
;
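If the tables really did move, you can also search every schema for them by name (the table name below is a placeholder):
SELECT nspname AS schema_name
, relname AS table_name
FROM pg_class c
, pg_namespace n
WHERE n.oid = c.relnamespace
AND c.relname = 'my_table' -- placeholder: use one of the missing table names
ORDER BY 1, 2
;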

How to access information_schema foreign key constraints with read-only user in Postgres?

Introduction
I've been developing a wizard to create complex database Postgres queries for users without any programming/SQL background. Thanks to foreign key constraints stored in a view in information_schema, the user may select any number of tables and the tool will find the correct join settings (thus, the user does not have to add ON table_a.field_1 = table_b.field_2).
While developing, I have been using an administration database user and now wanted to change that to a read-only user to make it more secure. However, this read-only user seems not to be able to access the foreign key constraints.
Current situation
When more than one table has been selected, the tool tries to get the connections between the various tables in order to know how to join them. During that process, the following query is executed:
SELECT
tc.constraint_name,
tc.table_name,
kcu.column_name,
ccu.table_name AS foreign_table_name,
ccu.column_name AS foreign_column_name
FROM information_schema.table_constraints AS tc
JOIN information_schema.key_column_usage AS kcu
ON tc.constraint_name = kcu.constraint_name
JOIN information_schema.constraint_column_usage AS ccu
ON ccu.constraint_name = tc.constraint_name
WHERE constraint_type = 'FOREIGN KEY'
AND ccu.table_name = 'TableB'
AND tc.table_name IN ('TableA');
(Note: the last WHERE clause uses IN because there can be more than one base table available. TableA is the base table and each successfully connected/joined table will be available for additional joins, e.g. a third table could use AND ccu.table_name = 'TableC' AND tc.table_name IN ('TableA', 'TableB'); and so on.)
When the admin db user (with the most common privileges like GRANT, SELECT, INSERT, UPDATE, DELETE, TRUNCATE, ...) executes the query, the result looks something like this:
constraint_name | table_name | column_name | foreign_table_name | foreign_column_name
----------------+------------+-------------+--------------------+---------------------
constraint1 | TableA | field_1 | TableB | field_2
(1 row)
But when the read-only db user runs that query, it returns:
constraint_name | table_name | column_name | foreign_table_name | foreign_column_name
----------------+------------+-------------+--------------------+---------------------
(0 rows)
Because the foreign key constraint exists but is not returned, the joins cannot be written properly as SQL, and the query the user generates with the wizard fails.
What I tried
First, of course, I thought the read-only user (ro_user) might not have the permissions to access the tables and views in the information_schema schema. So I ran
GRANT SELECT ON ALL TABLES IN SCHEMA information_schema TO ro_user;
as admin, but to no avail. Digging deeper into the documentation, I found that all tables and views in information_schema are accessible to any user by default in Postgres anyway. So granting the SELECT privilege shouldn't change anything.
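A quick sanity check with the built-in privilege functions should confirm this (ro_user is the role from above):
SELECT has_schema_privilege('ro_user', 'information_schema', 'USAGE');
SELECT has_table_privilege('ro_user', 'information_schema.constraint_column_usage', 'SELECT');
Both should return true even without any explicit GRANT.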
Just to make sure, I also ran
GRANT REFERENCES ON ALL TABLES IN SCHEMA actual_database TO ro_user;
but of course, this didn't change anything either, since REFERENCES is only needed for creating new foreign keys; I just need to read them.
Next, I thought maybe the SQL from the tool was failing because some information was not available, so I queried the three views separately by running:
SELECT * FROM information_schema.table_constraints AS tc WHERE constraint_type = 'FOREIGN KEY';
SELECT * FROM information_schema.key_column_usage AS kcu;
SELECT * FROM information_schema.constraint_column_usage AS ccu;
And sure enough, the last one wouldn't return a single row for the ro_user:
psql=> SELECT * FROM information_schema.constraint_column_usage AS ccu;
table_catalog | table_schema | table_name | column_name | constraint_catalog | constraint_schema | constraint_name
---------------+--------------+------------+-------------+--------------------+-------------------+-----------------
(0 rows)
whereas the admin user got lots of results. So it came down to that one view: information_schema.constraint_column_usage.
As I was typing out this question over the course of an hour, recollecting and boiling down all the ideas I had tried over the last few days, I finally found the cause.
The view constraint_column_usage identifies all columns in the current database that are used by some constraint. Only those columns are shown that are contained in a table owned by a currently enabled role.
From documentation via this SO answer
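To confirm that ownership is the cause, you can check who owns the tables (a quick check; relowner and pg_get_userbyid are standard catalog features):
SELECT c.relname, pg_get_userbyid(c.relowner) AS owner
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
AND c.relkind = 'r';
If none of them are owned by ro_user or a role it is a member of, constraint_column_usage will appear empty for that user.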
And through that documentation note, I found a solution:
SELECT
conrelid::regclass AS table_from,
conname,
pg_get_constraintdef(c.oid) AS cdef
FROM pg_constraint c
JOIN pg_namespace n
ON n.oid = c.connamespace
WHERE contype IN ('f')
AND n.nspname = 'public'
AND pg_get_constraintdef(c.oid) LIKE '%"TableB"%'
AND conrelid::regclass::text IN ('"TableA"')
ORDER BY conrelid::regclass::text, contype DESC;
It doesn't output the same format as the old query, but it contains the same information and is - most importantly - available to the ro_user.
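If you do want the same columns as the original information_schema query, a sketch along these lines (table names taken from the question; adjust the quoting in the last two predicates to your actual names) reads the key columns directly from pg_constraint, which is not restricted to tables you own:
SELECT c.conname AS constraint_name,
       c.conrelid::regclass AS table_name,
       a.attname AS column_name,
       c.confrelid::regclass AS foreign_table_name,
       af.attname AS foreign_column_name
FROM pg_constraint c
JOIN LATERAL unnest(c.conkey) WITH ORDINALITY AS k(attnum, ord) ON true
JOIN LATERAL unnest(c.confkey) WITH ORDINALITY AS fk(attnum, ord) ON fk.ord = k.ord
JOIN pg_attribute a ON a.attrelid = c.conrelid AND a.attnum = k.attnum
JOIN pg_attribute af ON af.attrelid = c.confrelid AND af.attnum = fk.attnum
WHERE c.contype = 'f'
  AND c.confrelid::regclass::text = '"TableB"'
  AND c.conrelid::regclass::text IN ('"TableA"');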

Is it possible to know which database takes up the most storage on my disk

I found that the largest folder under my PostgreSQL storage directory is /usr/local/var/postgres/base/209510.
How can I tell which database or table this data belongs to?
Put another way, is it possible to know which database or table takes up the most storage?
There is almost no free space left on my SSD.
To find the largest databases in PostgreSQL:
SELECT datname, pg_size_pretty(pg_database_size(datname)) db_size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;
In psql, you can use the \l+ command to get a nice summary of databases with sizes.
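As for the base/209510 directory from the question: the directories under base/ are named after database OIDs, so you can map the number back to a database:
SELECT datname
FROM pg_database
WHERE oid = 209510; -- the directory name from the question
The contrib utility oid2name does the same mapping from the command line.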
The best way to do this is from within an SQL prompt - there are several examples of queries you can run listed at https://wiki.postgresql.org/wiki/Disk_Usage and one for the largest tables in current database is copied below for posterity.
SELECT nspname || '.' || relname AS "relation",
pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_relation_size(C.oid) DESC
LIMIT 20;

How to retrieve the comment of a PostgreSQL database?

I recently discovered you can attach a comment to all sort of objects in PostgreSQL. In particular, I'm interested on playing with the comment of a database. For example, to set the comment of a database:
COMMENT ON DATABASE mydatabase IS 'DB Comment';
However, what is the opposite statement, to get the comment of mydatabase?
From the psql command line, I can see the comment along with other information as a result of the \l+ command; which I could use with the aid of awk in order to achieve my goal. But I'd rather use an SQL statement, if possible.
First off, your query for table comments can be simplified using a cast to the appropriate object identifier type:
SELECT description
FROM pg_description
WHERE objoid = 'myschema.mytbl'::regclass;
The schema part is optional. If you omit it, your current search_path decides visibility of any table named mytbl.
Better yet, there are dedicated functions in PostgreSQL to simplify and canonize these queries. The manual:
obj_description(object_oid, catalog_name) ... get comment for a
database object
shobj_description(object_oid, catalog_name) ... get comment for a shared database object
Description for table:
SELECT obj_description('myschema.mytbl'::regclass, 'pg_class');
Description for database:
SELECT pg_catalog.shobj_description(d.oid, 'pg_database') AS "Description"
FROM pg_catalog.pg_database d
WHERE datname = 'mydb';
How do you find out about that?
Well, reading the excellent manual is enlightening. :)
But there is a more direct route in this case: most psql meta commands are implemented with plain SQL. Start a session with psql -E, to see the magic behind the curtains. The manual:
-E
--echo-hidden
Echo the actual queries generated by \d and other backslash commands. You can use this to study psql's internal operations. This
is equivalent to setting the variable ECHO_HIDDEN to on.
To get the comment on the database, use the following query:
select description from pg_shdescription
join pg_database on objoid = pg_database.oid
where datname = '<database name>'
This query will get you the table comment for the given table name:
select description from pg_description
join pg_class on pg_description.objoid = pg_class.oid
where relname = '<your table name>'
If you use the same table name in different schemas, you need to modify it a bit:
select description from pg_description
join pg_class on pg_description.objoid = pg_class.oid
join pg_namespace on pg_class.relnamespace = pg_namespace.oid
where relname = '<table name>' and nspname='<schema name>'
For tables, try
\dd TABLENAME
This shows the comment I added to a table
This query will get only the table comments, for all tables:
SELECT RelName,Description
FROM pg_Description
JOIN pg_Class
ON pg_Description.ObjOID = pg_Class.OID
WHERE ObjSubID = 0
This query will return the comment of a table:
SELECT obj_description('public.myTable'::regclass);
To get the comments on all the databases (not on their objects like tables etc.):
SELECT datname, shobj_description( oid, 'pg_database' ) AS comment
FROM pg_database
ORDER BY datname
An example showing databases, sizes and descriptions from a shell script:
psql -U postgres -c "SELECT datname,
format('%8s MB.', pg_database_size(datname)/1000000) AS size,
shobj_description( oid, 'pg_database' ) as comment
FROM pg_database ORDER BY datname"
Sample output:
datname | size | comment
----------------------+--------------+-----------------------------------------------------
last_wikidb | 18 MB. | Wiki backup from yesterday
postgres | 7 MB. | default administrative connection database
previous_wikidb | 18 MB. | Wiki backup from the day before yesterday
some_db | 82 MB. |
template0 | 7 MB. | unmodifiable empty database
template1 | 7 MB. | default template for new databases

Dumping tables and schemas that are accessible to me only?

Is it possible to dump tables and schemas that are accessible to me only in PostgreSQL?
First, find the relations you have access to (this can be tweaked to pull schemas too):
with relnames as (SELECT relname FROM pg_class
WHERE relkind='r' and relnamespace = (select oid from pg_namespace where nspname = 'public'))
select array_agg(relname) from relnames WHERE has_table_privilege(SESSION_USER, relname, 'SELECT');
We aren't quite done yet, because that just creates an array of the tables we have access to. We now need to use array_to_string to get something we can feed into pg_dump:
with relnames as (SELECT relname FROM pg_class
WHERE relkind='r' and relnamespace = (select oid from pg_namespace where nspname = 'public'))
select array_to_string(array_agg(relname), ' -t ') from relnames WHERE has_table_privilege(SESSION_USER, relname, 'SELECT');
The above queries can be tweaked (by changing the pg_namespace subquery) to pull the namespaces you have access to, and you could change the subquery to a join to pull fully qualified table names.
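A sketch of that last tweak, joining pg_namespace to emit schema-qualified names (the schema filter here is an assumption; adjust it to your needs):
SELECT array_to_string(array_agg(quote_ident(n.nspname) || '.' || quote_ident(c.relname)), ' -t ')
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  AND has_table_privilege(SESSION_USER, c.oid, 'SELECT');
As with the earlier query, prepend a single -t to the first name when you splice the result into the pg_dump command line.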