I imported data into 30 tables (each table also has indexes and constraints) in my PostgreSQL database. After some checks I truncated all the tables and ran VACUUM FULL ANALYZE on all of them. Some space was indeed freed, but I'm sure more should have been. Before the import my PostgreSQL directory was about 20G. After the import it grew to 270G. Currently, the size of the data directory is 215G.
I ran this query:
SELECT relname AS "Table",
       pg_size_pretty(pg_total_relation_size(relid)) AS "Size",
       pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) AS "External Size"
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC;
The result shows that the biggest table is 660 MB (and right now only four tables are bigger than 100 MB):
Table | Size | External Size
-------------------------------+------------+---------------
my_table_1 | 660 MB | 263 MB
my_Table_2 | 609 MB | 277 MB
my_table_3 | 370 MB | 134 MB
my_table_4 | 137 MB | 37 MB
my_table_5 | 83 MB | 31 MB
my_table_6 | 5056 kB | 24 kB
mariel_test_table | 4912 kB | 8192 bytes
..........
The data/base directory size is 213G.
I also ran this query:
SELECT nspname || '.' || relname AS "relation",
pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_relation_size(C.oid) DESC
LIMIT 20;
Output:
relation | size
-----------------------------------+--------
my_table_1 | 397 MB
my_Table_2 | 332 MB
my_Table_3 | 235 MB
my_table_7 | 178 MB
my_table_4 | 100 MB
my_table_8 | 99 MB
The outputs of the two queries aren't identical: the first uses pg_total_relation_size, which includes indexes and TOAST, while the second uses pg_relation_size, which counts only the main data fork.
Temporary file sizes:
SELECT temp_files AS "Temporary files",
       temp_bytes AS "Size of temporary files"
FROM pg_stat_database;
Temporary files | Size of temporary files
-----------------+-------------------------
0 | 0
0 | 0
0 | 0
100 | 47929425920
0 | 0
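Adding datname to that query would show which database accumulated the 100 temporary files (about 45 GB); a small variation on the query above:
SELECT datname,
       temp_files AS "Temporary files",
       pg_size_pretty(temp_bytes) AS "Size of temporary files"
FROM pg_stat_database;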
I also tried to restart the PostgreSQL instance and the Linux server. What can I try next?
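One sanity check worth running at this point is to compare the size PostgreSQL reports for the database with the sum of all catalog-visible relations; if pg_database_size is far larger than the relations total, the space is not held by live tables and indexes at all. A minimal sketch using standard functions, run inside the affected database:
SELECT pg_size_pretty(pg_database_size(current_database())) AS database_size,
       pg_size_pretty(sum(pg_total_relation_size(oid))::bigint) AS relations_total
FROM pg_class
WHERE relkind IN ('r', 'm');  -- tables and matviews; their indexes and TOAST are already counted by pg_total_relation_size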
Related
The queried table size is not the same as the size on the physical layout, and the table's FSM file is the same size as the table file itself; ideally it should be very small by comparison.
PostgreSQL version: 13
OS: CentOS 7
Table file size:
# du -sh 16385_vm
8.0K 16385_vm
# du -sh 16385
24K 16385
# du -sh 16385_fsm
24K 16385_fsm
but when I query, the table sizes are as follows:
testing=# select pg_size_pretty(pg_relation_size('test1'));
pg_size_pretty
----------------
24 kB
(1 row)
testing=# select pg_size_pretty(pg_total_relation_size('test1'));
pg_size_pretty
----------------
64 kB
(1 row)
testing=# select pg_size_pretty(pg_table_size('test1'));
pg_size_pretty
----------------
64 kB
(1 row)
testing=# \d+ test1
Table "public.test1"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+-------------------+-----------+----------+---------+----------+--------------+-------------
id | integer | | | | plain | |
name | character varying | | | | extended | |
Access method: heap
testing=#
Free space map (pg_freespace is provided by the pg_freespacemap extension):
testing=# SELECT * FROM pg_freespace('test1');
blkno | avail
-------+-------
0 | 1088
1 | 1120
2 | 3456
(3 rows)
The TOAST table size is also zero:
testing=# select * from pg_class where oid=16385;
-[ RECORD 1 ]-------+------
oid | 16385
relname | test1
relnamespace | 2200
reltype | 16387
reloftype | 0
relowner | 10
relam | 2
relfilenode | 16385
reltablespace | 0
relpages | 3
reltuples | 17
relallvisible | 3
reltoastrelid | 16388
relhasindex | f
relisshared | f
relpersistence | p
relkind | r
relnatts | 2
relchecks | 0
relhasrules | f
relhastriggers | f
relhassubclass | f
relrowsecurity | f
relforcerowsecurity | f
relispopulated | t
relreplident | d
relispartition | f
relrewrite | 0
relfrozenxid | 487
relminmxid | 1
relacl |
reloptions |
relpartbound |
testing=#
[root@ip-10-15-11-219 16384]# du -sh 16388
0 16388
[root@ip-10-15-11-219 16384]#
So how come the SQL query returns 64 kB for the table size and total relation size instead of 24 kB?
And why is the table's FSM file 24 kB, equal to the actual 24 kB table size?
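For reference, the per-fork sizes can be measured directly, since pg_relation_size accepts a fork name ('main', 'fsm', 'vm') as a second argument; a minimal sketch against the test1 table above:
SELECT pg_size_pretty(pg_relation_size('test1', 'main')) AS main_fork,
       pg_size_pretty(pg_relation_size('test1', 'fsm'))  AS free_space_map,
       pg_size_pretty(pg_relation_size('test1', 'vm'))   AS visibility_map;
pg_table_size is the sum of these forks plus the TOAST table and its index, so here 24 kB (main) + 24 kB (FSM) + 8 kB (VM) + 8 kB (the empty TOAST table's index metapage) = 64 kB. As for the FSM: as I understand it, it is organized as a fixed three-level page tree, so even a tiny table carries three 8 kB FSM pages (24 kB); that this equals the 24 kB heap size here is a coincidence of the table being small.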
In my PostgreSQL 9.6 instance I have one production database. When I query the size of all databases:
combit=> Select pg_database.datname,pg_size_pretty(pg_database_size(pg_database.datname)) as size from pg_database;
datname | size
-----------+---------
template0 | 7265 kB
combit | 285 GB
postgres | 7959 kB
template1 | 7983 kB
repmgr | 8135 kB
(5 rows)
When I check which tables in my database are the biggest (including indexes):
combit=> SELECT nspname || '.' || relname AS "relation",
combit-> pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
combit-> FROM pg_class C
combit-> LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
combit-> WHERE nspname NOT IN ('pg_catalog', 'information_schema')
combit-> AND C.relkind <> 'i'
combit-> AND nspname !~ '^pg_toast'
combit-> ORDER BY pg_total_relation_size(C.oid) DESC
combit-> LIMIT 20;
relation | total_size
-----------------------------+------------
rep.ps_rf_inst_prod | 48 GB
rep.nap_inter_x5 | 46 GB
rep.man_x5 | 16 GB
rep.tc_fint_x5 | 9695 MB
rep.nap_ip_debit_x5 | 7645 MB
rep.ip__billing | 5458 MB
rep.ps_rd | 3417 MB
rep.nap_ip_discount | 3147 MB
rep.custo_x5 | 2154 MB
rep.ip_service_discou_x5 | 1836 MB
rep.tc_sub_rate__x5 | 294 MB
The total sum is no more than 120G.
When I check the filesystem directly:
[/data/base] : du -sk * | sort -n
7284 13322
7868 13323
7892 1
8156 166694
298713364 16400
[/data/base] :
16400 is the OID of the combit database. As you can see, the size of combit on the filesystem is about 298G.
I checked for dead tuples in the biggest tables:
combit=> select relname, n_dead_tup, last_autoanalyze, last_analyze, last_autovacuum, last_vacuum from pg_stat_user_tables order by n_live_tup desc limit 4;
-[ RECORD 1 ]----+------------------------------
relname | ps_rf_inst_prod
n_dead_tup | 0
last_autoanalyze | 2017-12-04 09:00:16.585295+02
last_analyze | 2017-12-05 16:08:31.218621+02
last_autovacuum |
last_vacuum |
-[ RECORD 2 ]----+------------------------------
relname | man_x5
n_dead_tup | 0
last_autoanalyze | 2017-12-05 06:02:07.189184+02
last_analyze | 2017-12-05 16:12:58.130519+02
last_autovacuum |
last_vacuum |
-[ RECORD 3 ]----+------------------------------
relname | tc_fint_x5
n_dead_tup | 0
last_autoanalyze | 2017-12-05 06:04:06.698422+02
last_analyze |
last_autovacuum |
last_vacuum |
-[ RECORD 4 ]----+------------------------------
relname | nap_inter_x5
n_dead_tup | 0
last_autoanalyze | 2017-12-04 08:54:16.764392+02
last_analyze | 2017-12-05 16:10:23.411266+02
last_autovacuum |
last_vacuum |
I ran VACUUM FULL on the top 5 tables two hours ago and it didn't free a lot of space.
The only operations that run on this database are TRUNCATE, INSERT, and SELECT. So how can there have been dead tuples in some of my tables? If I only run TRUNCATE, SELECT, and INSERT, dead tuples shouldn't be created.
And the bigger question: where are the missing 180G?
Just wanted to mention that the solution was dumping the database with pg_dump into a file, dropping the database, and then restoring it. The database's directory contained files that represented objects that no longer existed.
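For anyone hitting the same symptom: orphaned files can be spotted by comparing the data files the catalog knows about with what actually sits in the database's directory, and the rebuild is a standard dump/drop/restore cycle. A sketch, assuming the database is named combit and can be taken offline briefly:
-- inside the database: the data files the catalog expects
SELECT pg_relation_filepath(oid) FROM pg_class WHERE relfilenode <> 0;
-- any file in data/base/16400 not in this list (allowing for .1/.2 segment
-- files and the _fsm/_vm forks) belongs to no existing object
Then, from the shell:
pg_dump -Fc -f combit.dump combit
dropdb combit
createdb combit
pg_restore -d combit combit.dump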
I want to check something regarding PostgreSQL performance while my app is running.
My app does the following on 20 tables in a loop (a sketch of one iteration follows below):
truncate the table
drop the constraints on the table
drop the indexes on the table
insert into local_table select * from remote_oracle_table
Recently I'm getting an error in this part:
SQLERRM = could not extend file "base/16400/124810.23": wrote only 4096 of 8192 bytes at block 3092001
create the constraints on the table
create the indexes on the table
This operation runs every night. Most of the tables are small (500 MB-2 GB), but a few are pretty big (24 GB-45 GB).
My WAL files and my data directory are on different filesystems. The data directory filesystem is 400G. During this operation the data directory filesystem becomes full; afterwards, 100G are freed, which means 300G of the 400G stay in use. Something about those sizes doesn't seem right.
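For context, one iteration of that loop looks roughly like this (a sketch; the table, constraint, index, and column names are hypothetical placeholders, and remote_oracle_table stands for whatever foreign table is in use):
TRUNCATE TABLE local_table;
ALTER TABLE local_table DROP CONSTRAINT local_table_pk;     -- hypothetical constraint name
DROP INDEX local_table_idx;                                 -- hypothetical index name
INSERT INTO local_table SELECT * FROM remote_oracle_table;  -- the step that raises "could not extend file"
ALTER TABLE local_table ADD CONSTRAINT local_table_pk PRIMARY KEY (id);
CREATE INDEX local_table_idx ON local_table (some_col);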
When I check my database size:
mydb=# SELECT
mydb-# pg_database.datname,
mydb-# pg_size_pretty(pg_database_size(pg_database.datname)) AS size
mydb-# FROM pg_database;
datname | size
-----------+---------
template0 | 7265 kB
mydb | 246 GB
postgres | 568 MB
template1 | 7865 kB
(4 rows)
When I check all the tables in the mydb database:
mydb=# SELECT
mydb-#   relname AS "Table",
mydb-#   pg_size_pretty(pg_total_relation_size(relid)) AS "Size",
mydb-#   pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) AS "External Size"
mydb-# FROM pg_catalog.pg_statio_user_tables
mydb-# ORDER BY pg_total_relation_size(relid) DESC;
Table | Size | External Size
-------------------+------------+---------------
table 1| 45 GB | 13 GB
table 2| 15 GB | 6330 MB
table 3| 9506 MB | 3800 MB
table 4| 7473 MB | 1838 MB
table 5| 7267 MB | 2652 MB
table 6| 5347 MB | 1701 MB
table 7| 3402 MB | 1377 MB
table 8| 3092 MB | 1318 MB
table 9| 2145 MB | 724 MB
table 10| 1804 MB | 381 MB
table 11| 293 MB | 83 MB
table 12| 268 MB | 103 MB
table 13| 225 MB | 108 MB
table 14| 217 MB | 40 MB
table 15| 172 MB | 47 MB
table 16| 134 MB | 36 MB
table 17| 102 MB | 27 MB
table 18| 86 MB | 22 MB
.....
In the data directory, the base directory's size is 240G. I have 16G of RAM on my machine.
We are running PostgreSQL version 9.1. We previously had over 1 billion rows in one table, and they have since been deleted. However, it looks like the \l+ command still reports the actual database size inaccurately (it reports 568 GB, but in reality it's much, much less).
The proof that 568 GB is wrong is that the individual table sizes don't add up to that number: as you can see, the top 20 relations total 4292 MB, and the remaining 985 relations are all well below 10 MB each. In fact, all of them together add up to less than about 6 GB.
Any idea why PostgreSQL has so much bloat? If confirmed, how can I remove it? I am not very familiar with VACUUM; is that what I need to do? If so, how?
Much appreciated.
pmlex=# \l+
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges | Size | Tablespace | Description
-----------------+----------+----------+-------------+-------------+-----------------------+---------+------------+--------------------------------------------
pmlex | pmlex | UTF8 | en_US.UTF-8 | en_US.UTF-8 | | 568 GB | pg_default |
pmlex_analytics | pmlex | UTF8 | en_US.UTF-8 | en_US.UTF-8 | | 433 MB | pg_default |
postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | | 5945 kB | pg_default | default administrative connection database
template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +| 5841 kB | pg_default | unmodifiable empty database
| | | | | postgres=CTc/postgres | | |
template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +| 5841 kB | pg_default | default template for new databases
| | | | | postgres=CTc/postgres | | |
(5 rows)
pmlex=# SELECT nspname || '.' || relname AS "relation",
pmlex-# pg_size_pretty(pg_relation_size(C.oid)) AS "size"
pmlex-# FROM pg_class C
pmlex-# LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
pmlex-# WHERE nspname NOT IN ('pg_catalog', 'information_schema')
pmlex-# ORDER BY pg_relation_size(C.oid) DESC;
relation | size
-------------------------------------+---------
public.page_page | 1289 MB
public.page_pageimagehistory | 570 MB
pg_toast.pg_toast_158103 | 273 MB
public.celery_taskmeta_task_id_key | 233 MB
public.page_page_unique_hash_uniq | 140 MB
public.page_page_ad_text_id | 136 MB
public.page_page_kn_result_id | 125 MB
public.page_page_seo_term_id | 124 MB
public.page_page_kn_search_id | 124 MB
public.page_page_direct_network_tag | 124 MB
public.page_page_traffic_source_id | 123 MB
public.page_page_active | 123 MB
public.page_page_is_referrer | 123 MB
public.page_page_category_id | 123 MB
public.page_page_host_id | 123 MB
public.page_page_serp_id | 121 MB
public.page_page_domain_id | 120 MB
public.celery_taskmeta_pkey | 106 MB
public.page_pagerenderhistory | 102 MB
public.page_page_campaign_id | 89 MB
...
...
...
pg_toast.pg_toast_4354379 | 0 bytes
(1005 rows)
Your options include the following (a command sketch for the main ones follows the list):
1). Ensuring autovacuum is enabled and set aggressively.
2). Recreating the table as I mentioned in an earlier comment (create-table-as-select + truncate + reload the original table).
3). Running CLUSTER on the table if you can afford to be locked out of that table (exclusive lock).
4). VACUUM FULL, though CLUSTER is more efficient and recommended.
5). Running a plain VACUUM ANALYZE a few times and leaving the table as-is, to eventually fill the space back up as new data comes in.
6). Dump and reload the table via pg_dump
7). pg_repack (though I haven't used it in production)
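As a sketch of what options 3-7 look like in practice (the table and index names below are hypothetical placeholders, not taken from the output above):
CLUSTER page_page USING page_page_pkey;  -- option 3: rewrites the table in index order; exclusive lock
VACUUM FULL page_page;                   -- option 4: rewrites the table compactly; exclusive lock
VACUUM ANALYZE page_page;                -- option 5: marks dead space reusable without shrinking the file
-- option 6, from the shell: pg_dump -Fc -t page_page pmlex -f page_page.dump, then reload
-- option 7, from the shell: pg_repack -d pmlex -t page_page (rewrites online with only brief locks)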
It will likely look different if you use pg_total_relation_size instead of pg_relation_size.
pg_relation_size doesn't give the total size of the table; see
https://www.postgresql.org/docs/9.5/static/functions-admin.html#FUNCTIONS-ADMIN-DBSIZE
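For example, a variant of the question's query using pg_total_relation_size, so that indexes and TOAST are counted together with their parent table:
SELECT nspname || '.' || relname AS "relation",
       pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
  AND C.relkind <> 'i'        -- skip indexes; they are counted with their table
  AND nspname !~ '^pg_toast'  -- skip TOAST; it is counted with its parent table
ORDER BY pg_total_relation_size(C.oid) DESC
LIMIT 20;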
I want to estimate the anticipated size of a table from its column types and length values, and I'm trying to use pg_column_size for this.
While testing the function, I noticed something that seems wrong: the value returned by pg_column_size(...) is sometimes even smaller than the value returned by octet_length(...) on the same string.
There is nothing but numeric characters in the column.
postgres=# \d+ t5
Table "public.t5"
Column | Type | Modifiers | Storage | Stats target | Description
--------+-------------------+-----------+----------+--------------+-------------
c1 | character varying | | extended | |
Has OIDs: no
postgres=# select pg_column_size(c1), octet_length(c1) as octet from t5;
pg_column_size | octet
----------------+-------
2 | 1
704 | 700
101 | 7000
903 | 77000
(4 rows)
Is this a bug or something? Does anyone have a formula to calculate the anticipated table size from column types and length values?
I'd say pg_column_size is reporting the compressed size of TOASTed values, while octet_length is reporting the uncompressed sizes. I haven't verified this by checking the function source or definitions, but it'd make sense, especially as strings of numbers will compress quite well. You're using EXTENDED storage so the values are eligible for TOAST compression. See the TOAST documentation.
As for calculating the expected database size, that's a whole new question. As you can see from the following demo, it depends on things like how compressible your strings are.
Here's a demonstration showing how octet_length can be bigger than pg_column_size, demonstrating where TOAST kicks in. First, let's get the results on query output where no TOAST comes into play:
regress=> SELECT octet_length(repeat('1234567890',(2^n)::integer)), pg_column_size(repeat('1234567890',(2^n)::integer)) FROM generate_series(0,12) n;
octet_length | pg_column_size
--------------+----------------
10 | 14
20 | 24
40 | 44
80 | 84
160 | 164
320 | 324
640 | 644
1280 | 1284
2560 | 2564
5120 | 5124
10240 | 10244
20480 | 20484
40960 | 40964
(13 rows)
Now let's store that same query output into a table and get the size of the stored rows:
regress=> CREATE TABLE blah AS SELECT repeat('1234567890',(2^n)::integer) AS data FROM generate_series(0,12) n;
SELECT 13
regress=> SELECT octet_length(data), pg_column_size(data) FROM blah;
octet_length | pg_column_size
--------------+----------------
10 | 11
20 | 21
40 | 41
80 | 81
160 | 164
320 | 324
640 | 644
1280 | 1284
2560 | 51
5120 | 79
10240 | 138
20480 | 254
40960 | 488
(13 rows)
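As a rough starting point for estimating row size (a sketch, not a precise formula): pg_column_size on a constructed row gives the size of one row's data as a composite datum, to which you would add the 23-byte (aligned to 24) heap tuple header, the 4-byte line pointer, alignment padding, and about 24 bytes of page header per 8 kB page:
regress=> -- assumes a simple heap row of (integer, varchar); adjust to your real columns
regress=> SELECT pg_column_size(ROW(1::integer, repeat('1234567890', 10))) AS row_data_bytes;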