PostgreSQL autovacuum_vacuum

The disk size keeps increasing periodically, even though the dead tuple count is kept down by improving autovacuum's performance.
The number of inserts is smaller than the number of dead tuples.
Test Environment:
CentOS 7, PostgreSQL 10.7, 128 GB memory, 600 GB SSD, 16-core CPU.
More than 40 million inserts per day.
There are about 120 million dead tuples due to periodic updates.
Data is kept for a month and deleted once a week.
The retained data is about 1.2 billion rows.
I checked disk size and dead tuple status periodically.
Disk size:
SELECT nspname || '.' || relname AS "relation",
pg_total_relation_size(C.oid) AS "total_size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
AND C.relkind <> 'i'
AND nspname !~ '^pg_toast'
ORDER BY pg_total_relation_size(C.oid) DESC
LIMIT 3;
Dead tuples:
SELECT relname, n_live_tup, n_dead_tup,
n_dead_tup / (n_live_tup::float) as ratio
FROM pg_stat_user_tables
WHERE n_live_tup > 0
AND n_dead_tup > 1000
ORDER BY ratio DESC;
Autovacuum takes more than 3 days to complete.
I changed these settings so that it now finishes within 30 minutes:
ALTER SYSTEM SET maintenance_work_mem ='1GB';
select pg_reload_conf();
alter table pm_reporthour set (autovacuum_vacuum_cost_limit = 1000);
ALTER TABLE PM_REPORTHOUR SET (autovacuum_vacuum_cost_delay =0);
Please advise what I should review.

If you have large deletes, you will always get bloat. The royal road in this case is partitioning: partition by range so that you can run DROP TABLE on a partition instead of a massive DELETE. Apart from better performance, that avoids all bloat.
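A minimal sketch of what that looks like with declarative partitioning (available from PostgreSQL 10); the table and partition names are illustrative, not taken from the question:

-- Illustrative range-partitioned table; in practice the partition key and
-- the weekly ranges would match the retention scheme described above.
CREATE TABLE report_data (
    recorded_at timestamptz NOT NULL,
    payload     text
) PARTITION BY RANGE (recorded_at);

CREATE TABLE report_data_2024_w01 PARTITION OF report_data
    FOR VALUES FROM ('2024-01-01') TO ('2024-01-08');
CREATE TABLE report_data_2024_w02 PARTITION OF report_data
    FOR VALUES FROM ('2024-01-08') TO ('2024-01-15');

-- Weekly cleanup: instead of a massive DELETE (which leaves dead tuples behind),
-- drop the expired partition; its disk space is returned immediately.
DROP TABLE report_data_2024_w01;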

Related

Postgres: Autovacuum and autovacuum wraparound. When does each start?

I have autovacuum_freeze_max_age, which is 200,000,000 by default.
In theory I found the rule that an anti-wraparound autovacuum starts when:
age(relfrozenxid) > autovacuum_freeze_max_age
But when is the usual autovacuum started? How can I determine the moment:
when the usual autovacuum on a table is started?
when autovacuum becomes an anti-wraparound autovacuum? Is it really once age(relfrozenxid) > autovacuum_freeze_max_age?
As the documentation states, normal autovacuum is triggered
if the number of tuples obsoleted since the last VACUUM exceeds the “vacuum threshold”, the table is vacuumed. The vacuum threshold is defined as:
vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples
where the vacuum base threshold is autovacuum_vacuum_threshold, the vacuum scale factor is autovacuum_vacuum_scale_factor, and the number of tuples is pg_class.reltuples.
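For example, with the default settings (autovacuum_vacuum_threshold = 50, autovacuum_vacuum_scale_factor = 0.2), a table with 1,000,000 rows gets a regular autovacuum once about 50 + 0.2 * 1,000,000 = 200,050 tuples have become dead since the last vacuum.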
The table is also vacuumed if the number of tuples inserted since the last vacuum has exceeded the defined insert threshold, which is defined as:
vacuum insert threshold = vacuum base insert threshold + vacuum insert scale factor * number of tuples
where the vacuum insert base threshold is autovacuum_vacuum_insert_threshold, and vacuum insert scale factor is autovacuum_vacuum_insert_scale_factor.
The second part applies only to PostgreSQL v13 and later.
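Again with the defaults (autovacuum_vacuum_insert_threshold = 1000, autovacuum_vacuum_insert_scale_factor = 0.2), the same 1,000,000-row table would also be vacuumed after about 1000 + 0.2 * 1,000,000 = 201,000 rows have been inserted since the last vacuum.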
Furthermore,
If the relfrozenxid value of the table is more than vacuum_freeze_table_age transactions old, an aggressive vacuum is performed to freeze old tuples and advance relfrozenxid; otherwise, only pages that have been modified since the last vacuum are scanned.
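(With the default vacuum_freeze_table_age of 150,000,000, that means any vacuum of a table whose relfrozenxid is more than 150 million transactions old is escalated to an aggressive one.)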
So an autovacuum worker run that was triggered by the normal mechanism can run as an anti-wraparound VACUUM if there are old enough rows in the table.
Finally,
Tables whose relfrozenxid value is more than autovacuum_freeze_max_age transactions old are always vacuumed
So if a table with old live tuples is never autovacuumed during normal processing, a special anti-wraparound autovacuum run is triggered for it, even if autovacuum is disabled. Such an autovacuum run is also forced if there are multixacts that are older than vacuum_multixact_freeze_table_age, see here. From PostgreSQL v14 on, if an unfrozen row in a table is older than vacuum_failsafe_age, an anti-wraparound autovacuum will skip index cleanup for faster processing.
Yes, this is pretty complicated.
I made a query which shows the dead tuples (when a simple vacuum is started) and when an anti-wraparound vacuum will start:
with dead_tup as (
    SELECT st.schemaname || '.' || st.relname tablename,
           st.n_dead_tup dead_tup,
           current_setting('autovacuum_vacuum_threshold')::int8 +
           current_setting('autovacuum_vacuum_scale_factor')::float * c.reltuples
               max_dead_tup,
           (current_setting('autovacuum_vacuum_threshold')::int8 +
            current_setting('autovacuum_vacuum_scale_factor')::float * c.reltuples - st.n_dead_tup) as left_for_tr_vacuum,
           st.last_autovacuum,
           c.relnamespace,
           c.oid
    FROM pg_stat_all_tables st,
         pg_class c
    WHERE c.oid = st.relid
      AND c.relkind IN ('r','m','t')
      AND st.schemaname not like ('pg_temp%'))
SELECT c.oid::regclass as table,
       current_setting('autovacuum_freeze_max_age')::int8 -
       age(c.relfrozenxid) as xid_left,
       pg_relation_size(c.oid) as relsize,
       dt.dead_tup,
       dt.max_dead_tup,
       dt.left_for_tr_vacuum,
       dt.last_autovacuum
from (pg_class c
      join pg_namespace n on (c.relnamespace = n.oid)
      left join dead_tup dt on (c.relnamespace = dt.relnamespace and c.oid = dt.oid))
where c.relkind IN ('r','m','t') --and (age(c.relfrozenxid)::int8 > (current_setting('autovacuum_freeze_max_age')::int8 * 0.8))
  AND n.nspname not like ('pg_temp%')
order by 2;
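Reading the output: a table whose left_for_tr_vacuum has reached zero or gone negative has crossed the dead-tuple threshold and is due for a regular autovacuum, while xid_left counts down the transactions remaining before a forced anti-wraparound autovacuum starts.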

Failed Postgres insert results in doubled database size - VACUUM FULL does not reclaim space

So this has been baffling me for days.
I have a Postgres database with the Timescale extension active. I have a table, table_name, which is partitioned by week on the field created_at (timestamp with time zone) using Postgres's built-in partitioning feature. It had about 244M rows (~300 GB). I wanted to take advantage of the TimescaleDB extension and move all the data from the partitions into a hypertable, so I followed the same-db migration steps from https://docs.timescale.com/latest/getting-started/migrating-data#same-db :
I created a table, table_name_timescale, like table_name but without the primary key on id, since the hypertable creation would not work otherwise. I ran SELECT create_hypertable('table_name_timescale', 'created_at'); and created the hypertable successfully.
Then I ran INSERT INTO table_name_timescale SELECT * FROM table_name;, which failed after a few hours because the server ran out of disk space. The initial database was ~350 GB; the server had about 1 TB in total, about half of which was free, and table_name occupied roughly 300 GB.
After that the database size went up to ~750 GB and I cannot recover that space. table_name_timescale did not contain any rows, and neither did any of the Timescale-specific tables.
Among the things I've tried:
A VACUUM FULL VERBOSE on the table
VACUUM FULL VERBOSE on each partition
Example output of VACUUM FULL ANALYZE
VACUUM FULL ANALYZE VERBOSE table_name;
INFO: vacuuming "public.table_name"
INFO: "request_sets": found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
CPU: user: 0.09 s, system: 0.00 s, elapsed: 0.10 s.
INFO: analyzing "public.table_name"
INFO: "table_name": scanned 0 of 0 pages, containing 0 live rows and 0 dead rows; 0 rows in sample, 0 estimated total rows
INFO: analyzing "public.table_name" inheritance tree
INFO: "table_name_y2020_w23": scanned 19639 of 2653099 pages, containing 163525 live rows and 0 dead rows; 19639 rows in sample, 22091146 estimated total rows
INFO: "table_name_y2020_w24": scanned 24264 of 3277907 pages, containing 201591 live rows and 0 dead rows; 24264 rows in sample, 27233620 estimated total rows
INFO: "table_name_y2020_w25": scanned 24970 of 3373276 pages, containing 205052 live rows and 0 dead rows; 24970 rows in sample, 27701121 estimated total rows
INFO: "table_name_y2020_w26": scanned 21279 of 2874745 pages, containing 170232 live rows and 0 dead rows; 21279 rows in sample, 22997960 estimated total rows
INFO: "table_name_y2020_w27": scanned 21227 of 2867687 pages, containing 169816 live rows and 0 dead rows; 21227 rows in sample, 22941496 estimated total rows
INFO: "table_name_y2020_w28": scanned 19487 of 2632630 pages, containing 155896 live rows and 0 dead rows; 19487 rows in sample, 21061040 estimated total rows
INFO: "table_name_y2020_w29": scanned 19134 of 2584880 pages, containing 153072 live rows and 0 dead rows; 19134 rows in sample, 20679040 estimated total rows
VACUUM
In general, VACUUM FULL does not find dead tuples or release any space, which is expected since table_name is only used for inserts, with no deletes or updates at all.
Ran a manual CHECKPOINT to release space from the WAL - I understand that this is not the correct way, but...
Retried the insert, but per partition this time (INSERT INTO table_name_timescale SELECT * FROM table_name_y2020_w1;), in case this would force the server to rewrite/release space. Some of the partitions were inserted successfully, but when we reached e.g. week 28, it once more failed with a panic and out of space. I didn't want to delete the data in partitions that were successfully inserted because I want to be sure Timescale works as it is supposed to before deleting anything.
Discarded some old data that wasn't in use, table size went down to 155GB, 159GB with index and toast.
Ran SELECT drop_chunks(interval '1 days', 'table_name_timescale'); to drop anything that could have been inserted into timescale, which of course did nothing.
Dropped table_name_timescale, which again did nothing as it did not contain any data, but it did take a lot of time to complete.
Checked all timescale related tables to see if there's anything hanging, checked also with VACUUM ANALYZE, nothing.
Checked for locks
Checked for long running transactions
Stopped - restarted the server
I will eventually back up, drop the db, and restore (I've tried restoring on a different server and the db size was back to normal), but before I do that I would really like to understand what happened here. What I cannot explain or understand is that the table size is 159 GB* in total (checking with https://wiki.postgresql.org/wiki/Disk_Usage), while the database size is 541 GB (SELECT pg_size_pretty( pg_database_size('database_name') );). And of course I can see that under /var/lib/postgresql/data/base/24834 the files occupy the 541 GB*.
(* after I deleted some old rows as mentioned above)
The output of:
SELECT nspname || '.' || relname AS "relation",
pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
AND C.relkind <> 'i'
AND nspname !~ '^pg_toast'
ORDER BY pg_total_relation_size(C.oid) DESC
LIMIT 10;
"public.table_name_y2020_w25" "26 GB"
"public.table_name_y2020_w24" "26 GB"
"public.table_name_y2020_w26" "22 GB"
"public.table_name_y2020_w27" "22 GB"
"public.table_name_y2020_w23" "21 GB"
"public.table_name_y2020_w28" "21 GB"
"public.table_name_y2020_w29" "20 GB"
"public.models" "1564 MB"
"public.table_name_y2020_w20" "567 MB"
"public.table_name_y2020_w18" "97 MB"
Digging into the directory as suggested:
select oid, * from pg_database where oid = 24834;
-- output:
oid           | 24834
datname       | database_name
datdba        | 10
encoding      | 6
datcollate    | en_US.utf8
datctype      | en_US.utf8
datistemplate | FALSE
datallowconn  | TRUE
datconnlimit  | -1
datlastsysoid | 13043
datfrozenxid  | 30663846
datminmxid    | 293
dattablespace | 1663
datacl        |
And
select relname, relfilenode from pg_class where oid='24834';
outputs nothing.
SELECT pg_relation_filepath('table_name');
pg_relation_filepath
base/24834/343621
select * from pg_class where relfilenode='343621';
relname             | table_name
relnamespace        | 2200
reltype             | 24940
reloftype           | 0
relowner            | 10
relam               | 0
relfilenode         | 343621
reltablespace       | 0
relpages            | 0
reltuples           | 0
relallvisible       | 0
reltoastrelid       | 24943
relhasindex         | TRUE
relisshared         | FALSE
relpersistence      | p
relkind             | r
relnatts            | 26
relchecks           | 0
relhasoids          | FALSE
relhaspkey          | TRUE
relhasrules         | FALSE
relhastriggers      | TRUE
relhassubclass      | TRUE
relrowsecurity      | FALSE
relforcerowsecurity | FALSE
relispopulated      | TRUE
relreplident        | d
relispartition      | FALSE
relfrozenxid        | 30664717
relminmxid          | 293
relacl              |
reloptions          |
relpartbound        |
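One more thing worth checking along the same lines; a hedged sketch (it needs superuser, the OID 24834 is the one from above, and mapped system catalogs can appear as false positives) that lists files under base/24834 whose filenode belongs to no relation at all, i.e. candidates for orphaned leftovers:

-- List data files in this database's directory that no relation claims.
-- split_part strips segment suffixes such as "343621.1"; the regex skips
-- fork files (_fsm, _vm) and non-numeric entries like PG_VERSION.
SELECT f AS file_on_disk
FROM pg_ls_dir('base/24834') AS f
WHERE f ~ '^[0-9]+(\.[0-9]+)?$'
  AND pg_filenode_relation(0, split_part(f, '.', 1)::oid) IS NULL;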
Disclaimer: Postgres internals noob here, still reading up on how things work. Also, yes, space is tight on the server and ~400 GB of free space to do this transition is barely enough, but it is what it is unfortunately.. :)

Delete unused indexes

I ran this query to check whether there are any unused indexes in my database.
select t.tablename AS "relation",
       indexname,
       c.reltuples AS num_rows,
       pg_relation_size(quote_ident(t.tablename)::text) AS table_size,
       pg_relation_size(quote_ident(indexrelname)::text) AS index_size,
       idx_scan AS number_of_scans,
       idx_tup_read AS tuples_read,
       idx_tup_fetch AS tuples_fetched
FROM pg_tables t
LEFT OUTER JOIN pg_class c ON t.tablename = c.relname
LEFT OUTER JOIN
    (SELECT c.relname AS ctablename,
            ipg.relname AS indexname,
            x.indnatts AS number_of_columns,
            psai.idx_scan,
            idx_tup_read,
            idx_tup_fetch,
            indexrelname,
            indisunique
     FROM pg_index x
     JOIN pg_class c ON c.oid = x.indrelid
     JOIN pg_class ipg ON ipg.oid = x.indexrelid
     JOIN pg_stat_all_indexes psai ON x.indexrelid = psai.indexrelid) AS foo
ON t.tablename = foo.ctablename
WHERE t.schemaname = 'public'
  AND idx_scan = 0
ORDER BY
--1,2
--6
5 desc
;
And I got a lot of rows where those fields are all zero:
number_of_scans,
tuples_read,
tuples_fetched
Does that mean that I can drop them? Is there a chance that the metadata is out of date? How can I check that?
I'm using Postgres version 9.6.
Your query misses some uses of indexes that do not require them to be scanned:
they enforce primary key, unique and exclusion constraints
they influence statistics collection (for “expression indexes”)
Here is my gold standard query from my blog post:
SELECT s.schemaname,
s.relname AS tablename,
s.indexrelname AS indexname,
pg_relation_size(s.indexrelid) AS index_size
FROM pg_catalog.pg_stat_user_indexes s
JOIN pg_catalog.pg_index i ON s.indexrelid = i.indexrelid
WHERE s.idx_scan = 0 -- has never been scanned
AND 0 <>ALL (i.indkey) -- no index column is an expression
AND NOT EXISTS -- does not enforce a constraint
(SELECT 1 FROM pg_catalog.pg_constraint c
WHERE c.conindid = s.indexrelid)
ORDER BY pg_relation_size(s.indexrelid) DESC;
Anything that shows up there has not been used since the statistics have been reset and can be safely dropped.
There are a few caveats:
statistics collection must run (look for the “statistics collector” process and see if you have warnings about “stale statistics” in the log)
run the query against your production database
if your program is running at many sites, try it on all of them (different users have different usage patterns)
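Once an index has passed those checks, it can be removed without blocking writers; a small sketch (the index name is hypothetical), including a query to see how long the usage counters have been accumulating:

-- How far back do the idx_scan counters go?
SELECT datname, stats_reset
FROM pg_stat_database
WHERE datname = current_database();

-- Drop a confirmed-unused index without blocking concurrent writes
-- (cannot run inside a transaction block; the name is illustrative).
DROP INDEX CONCURRENTLY IF EXISTS some_unused_index;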
It is possible you can delete them; however, you should make sure your query runs after a typical workload. That is, are there some indexes that show no usage in this query only because they are used during certain times when specialized queries run? Month-end reporting, weekly runs, etc.? We ran into this a couple of times - several large indexes didn't get used during the day but supported month-end summaries.
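One way to guard against that, sketched here with a hypothetical snapshot table name, is to capture the counters before a full business cycle (month-end, weekly runs) and only trust indexes whose idx_scan did not move afterwards:

-- Snapshot the current counters (run the comparison after the month-end jobs).
CREATE TABLE idx_usage_snapshot AS
SELECT now() AS taken_at, schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes;

-- Indexes still unused since the snapshot was taken.
SELECT cur.schemaname, cur.relname, cur.indexrelname
FROM pg_stat_user_indexes cur
JOIN idx_usage_snapshot snap USING (schemaname, relname, indexrelname)
WHERE cur.idx_scan = snap.idx_scan
  AND cur.idx_scan = 0;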

Postgresql partitioned table is using space despite no entries

We have recently partitioned a master table with a couple of million rows. This table is partitioned by id range. There are 7 child tables created under this master table, and entries go into the child tables on insert. All of this is managed by the pg_partman extension.
When I run this query, the master table is shown to occupy about 300 GB of disk space. This is strange because the table has no entries, which I could confirm by running the check_parent() function.
SELECT nspname || '.' || relname AS "relation",
pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_relation_size(C.oid) DESC
LIMIT 30;
We never had this problem while partitioning this table in other environments, where the data is not that much. Could this be due to unreleased disk space during partitioning?
Yes, that could definitely be due to unreleased disk space. You should do a VACUUM FULL after moving data to different structures like partitioned tables.
PostgreSQL generally does not release table space automatically. Normal (automatic) VACUUM ANALYZE maintains the database but does not shrink tables on disk. VACUUM FULL locks the table, though, so be careful not to run it during normal operation hours.
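A minimal sketch of that reclaim step, with a hypothetical table name; it takes an ACCESS EXCLUSIVE lock while it rewrites the table:

-- Rewrite the (now empty) former master table and hand its disk space back to the OS.
VACUUM (FULL, VERBOSE) master_table;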

How do I delete old rows based on table size and available free space?

I have continuous inserts into my table and want to automatically delete the oldest rows to maintain a constant table size (meaning the size of the table on the hard drive). I was able to do this in MySQL using:
SELECT DATA_LENGTH,DATA_FREE FROM information_schema.tables WHERE table_name = 'table';
Using these values and the free space on the drive, I can issue delete statements, and the table size will not grow.
How can I do this in PostgreSQL? I could only find the table size:
SELECT pg_total_relation_size('table_name');
Is there a way to find the free space in the table, or is there a better solution to keep the table at a fixed size?
This may be a start for table sizes:
SELECT relname, relpages * 8192 AS "Size_In_Bytes"
FROM pg_class c
JOIN information_schema.tables t ON c.relname = t.table_name
WHERE table_schema = 'public'
ORDER BY relpages * 8192 DESC;
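If you also need the free space inside the table (the closest thing to MySQL's DATA_FREE), the pgstattuple extension can report it; a sketch, assuming the extension can be installed:

CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- free_space / free_percent show reusable space within the table's existing pages.
SELECT table_len, dead_tuple_len, free_space, free_percent
FROM pgstattuple('table_name');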
I am somewhat curious about the requirement to keep the DB at a certain size. Do you archive the data elsewhere? If not, why keep it at all? Why not use a time-based retention policy rather than a size-based one?