PostgreSQL partitioned table is using space despite having no entries

We recently partitioned a master table with a couple of million rows. The table is partitioned by id range. There are 7 child tables under this master table, and inserts are routed into the child tables. All of this is managed by the pg_partman extension.
When I run the query below, the master table is shown to occupy about 300 GB of disk space. This is strange, because the table has no entries, which I confirmed by running pg_partman's check_parent() function.
SELECT nspname || '.' || relname AS "relation",
       pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_relation_size(C.oid) DESC
LIMIT 30;
We never had this problem when partitioning this table in other environments, where there is much less data. Could this be disk space that was not released during partitioning?

Yes, that could well be unreleased disk space. After moving data into a different structure such as a partitioned table, run VACUUM FULL on the emptied table.
PostgreSQL generally does not return table space to the operating system on its own: a normal (automatic) VACUUM marks the space of dead rows as reusable but does not shrink the files on disk. VACUUM FULL does, but it takes an exclusive lock on the table, so be careful not to run it during normal operating hours.
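A minimal sketch (the table name here is a placeholder for your parent table):

```sql
-- Run this during a maintenance window: VACUUM FULL takes an
-- ACCESS EXCLUSIVE lock, blocking all reads and writes on the table.
VACUUM FULL master_table;

-- Afterwards, verify that the space was returned to the operating system:
SELECT pg_size_pretty(pg_relation_size('master_table'));
```

Because the parent table is empty, the VACUUM FULL rewrite should be quick even though the old files are 300 GB.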

Related

Postgresql autovacuum_vacuum

Disk usage keeps growing periodically, even though I have reduced the dead tuple count by making autovacuum run faster.
The number of inserts is smaller than the number of dead tuples.
Test Environment:
CentOS 7, PostgreSQL 10.7, 128 GB memory, 600 GB SSD, 16 CPU cores.
More than 40 million rows are inserted per day.
There are about 120 million dead tuples due to periodic updates.
I keep data for a month and delete old rows once a week; the retained data is about 1.2 billion rows.
I checked disk size and dead tuple status periodically.
Disk size:
SELECT nspname || '.' || relname AS "relation",
       pg_total_relation_size(C.oid) AS "total_size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
  AND C.relkind <> 'i'
  AND nspname !~ '^pg_toast'
ORDER BY pg_total_relation_size(C.oid) DESC
LIMIT 3;
Dead tuples:
SELECT relname, n_live_tup, n_dead_tup,
n_dead_tup / (n_live_tup::float) as ratio
FROM pg_stat_user_tables
WHERE n_live_tup > 0
AND n_dead_tup > 1000
ORDER BY ratio DESC;
Autovacuum of the table takes more than 3 days.
I got it to finish within 30 minutes with these settings:
ALTER SYSTEM SET maintenance_work_mem = '1GB';
SELECT pg_reload_conf();
ALTER TABLE pm_reporthour SET (autovacuum_vacuum_cost_limit = 1000);
ALTER TABLE pm_reporthour SET (autovacuum_vacuum_cost_delay = 0);
Please advise which settings I should review.
If you have large deletes, you will always get bloat. The royal road in that case is partitioning: partition by range so that you can run DROP TABLE on a partition instead of a massive DELETE. Apart from better performance, that avoids the bloat entirely.
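A sketch of that approach, using PostgreSQL 10+ declarative partitioning (table and column names are hypothetical):

```sql
-- A log-style table partitioned by week, so old data can be removed
-- with DROP TABLE instead of a massive DELETE.
CREATE TABLE report (
    created_at timestamptz NOT NULL,
    payload    text
) PARTITION BY RANGE (created_at);

CREATE TABLE report_2024_w01 PARTITION OF report
    FOR VALUES FROM ('2024-01-01') TO ('2024-01-08');

-- Weekly retention job: dropping a partition is nearly instant,
-- removes the files from disk, and leaves no dead tuples behind.
DROP TABLE report_2024_w01;
```

An extension like pg_partman can automate the creation and retention of such weekly partitions.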

Why is PostgreSQL waiting while executing VACUUM FULL on a 4 TB table?

I have a badly bloated table named "role_info".
There are about 20K inserts and a lot of updates per day; there are no deletes.
The table is now about 4063 GB.
We migrated the table to another database using dump/restore, and the restored table is only about 62 GB, so the table in the old database is very seriously bloated.
PostgreSQL version: 9.5.4
The table schema is below:
CREATE TABLE "role_info" (
"roleId" bigint NOT NULL,
"playerId" bigint NOT NULL,
"serverId" int NOT NULL,
"status" int NOT NULL,
"baseData" bytea NOT NULL,
"detailData" bytea NOT NULL,
PRIMARY KEY ("roleId")
);
CREATE INDEX "idx_role_info_serverId_playerId_roleId" ON "role_info" ("serverId", "playerId", "roleId");
The average size of the 'detailData' column is about 13 KB per row.
There are some SQL execution results below:
1)
SELECT
relname AS name,
pg_stat_get_live_tuples(c.oid) AS lives,
pg_stat_get_dead_tuples(c.oid) AS deads
FROM pg_class c
ORDER BY deads DESC;
Execution Result:
2)
SELECT *,
       pg_size_pretty(total_bytes) AS total,
       pg_size_pretty(index_bytes) AS index,
       pg_size_pretty(toast_bytes) AS toast,
       pg_size_pretty(table_bytes) AS table
FROM (SELECT *,
             total_bytes - index_bytes - coalesce(toast_bytes, 0) AS table_bytes
      FROM (SELECT c.oid,
                   nspname AS table_schema,
                   relname AS table_name,
                   c.reltuples AS row_estimate,
                   pg_total_relation_size(c.oid) AS total_bytes,
                   pg_indexes_size(c.oid) AS index_bytes,
                   pg_total_relation_size(reltoastrelid) AS toast_bytes
            FROM pg_class c
            LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
            WHERE relkind = 'r') a
      WHERE table_schema = 'public'
      ORDER BY total_bytes DESC) a;
Execution Result:
3)
I have tried to VACUUM FULL the table "role_info", but it seemed to be blocked by some other process and didn't run at all.
select * from pg_stat_activity where query like '%VACUUM%' and query not like '%pg_stat_activity%';
Execution Result:
select * from pg_locks;
Execution Result:
These are the vacuum parameters:
I have two questions:
How do I deal with the table bloat? Autovacuum does not seem to be working.
Why is the VACUUM FULL blocked?
With your autovacuum settings, it will sleep for 20 ms once for every 10 pages it dirties (200 cost_limit / 20 cost_page_dirty). Even more, because there are cost_page_hit and cost_page_miss charges as well. At that rate it would take over 12 days to autovacuum a 4063 GB table that mostly needs its pages dirtied. That is just the throttling time, not counting the actual work time or the repeated scans of the indexes, so the actual run time could be months. The chance of autovacuum running to completion in one sitting, without being interrupted by something, is pretty low. Does your database get restarted often? Do you build and drop indexes on this table a lot, or add and drop partitions, or run ALTER TABLE?
Note that in v12, the default setting of autovacuum_vacuum_cost_delay was lowered by a factor of 10. This is not because of some change to the code in v12; it is because we realized the default setting was just not sensible on modern hardware. So it would probably make sense to backport this change to your existing database, if not go even further. Before v12 you can't lower it to less than 1 ms, but you could lower it to 1 ms and also either increase autovacuum_vacuum_cost_limit or lower the vacuum_cost_page_* settings.
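A sketch of such a change on a pre-v12 server (the values are a starting point under the assumption of reasonably fast storage, not a recommendation; superuser access is required):

```sql
-- Approximate the v12 behavior: minimum delay, cheaper page charges.
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '1ms';  -- pre-v12 default: 20ms
ALTER SYSTEM SET vacuum_cost_page_miss = 5;             -- default: 10
ALTER SYSTEM SET vacuum_cost_page_dirty = 10;           -- default: 20
SELECT pg_reload_conf();
```

These are reload-only parameters, so no restart is needed; running autovacuum workers pick up the new cost settings.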
Now this analysis is based on the table already being extremely bloated. Why didn't autovacuum prevent it from getting this bloated in the first place, back when the table was small enough to be autovacuumed in a reasonable time? That is hard to say. We really have no evidence as to what happened back then. Maybe your settings were even more throttled than they are now (although unlikely as it looks like you just accepted the defaults), maybe it was constantly interrupted by something. What is the "autovacuum_count" from pg_stat_all_tables for the table and its toast table?
Why is the VACUUM FULL blocked?
Because that is how it works, as documented. That is why it is important to avoid getting into this situation in the first place. VACUUM FULL needs to swap around filenodes at the end, and needs an AccessExclusive lock to do that. It could take a weaker lock at first and then try to upgrade to AccessExclusive later, but lock upgrades have a strong deadlock risk, so it takes the strongest lock it needs up front.
You need a maintenance window where no one else is using the table. If you think you are already in such window, then you should look at the query text for the process doing the blocking. Because the lock already held is ShareUpdateExclusive, the thing holding it is not a normal query/DML, but some kind of DDL or maintenance operation.
If you can't take a maintenance window now, then you can at least do a manual VACUUM without the FULL. This takes a much weaker lock. It probably won't shrink the table dramatically, but should at least free up space for internal reuse so it stops getting even bigger while you figure out when you can schedule a maintenance window or what your other next steps are.
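To find the query text of the blocker, something like this can be used (pg_blocking_pids() exists from 9.6 on; on 9.5 you would join pg_locks against pg_stat_activity instead):

```sql
-- Show every session that is blocked, together with the PIDs blocking it.
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       state,
       query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```

The query text of the PIDs listed in blocked_by tells you which DDL or maintenance operation is holding the ShareUpdateExclusive lock.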

Temporary tables bloating pg_attribute

I'm using COPY to insert large batches of data into our database from CSVs. The insert looks something like this:
-- This tmp table will contain all the items that we want to try to insert
CREATE TEMP TABLE tmp_items
(
field1 INTEGER NULL,
field2 INTEGER NULL,
...
) ON COMMIT DROP;
COPY tmp_items(
field1,
field2,
...
) FROM 'path\to\data.csv' WITH (FORMAT csv);
-- Start inserting some items
WITH newitems AS (
INSERT INTO items (field1, field2)
SELECT tmpi.field1, tmpi.field2
FROM tmp_items tmpi
WHERE some condition
-- Return the new id and other fields to the next step
RETURNING id AS newid, field1 AS field1
)
-- Insert the result into another temp table
INSERT INTO tmp_newitems SELECT * FROM newitems;
-- Use tmp_newitems to update other tables
etc....
We then use the data in tmp_items to do multiple inserts into multiple tables. We check for duplicates and manipulate the data in a few ways before inserting, so not everything in tmp_items is used or inserted as is. We do this with a combination of CTEs and more temporary tables.
This works very well and is fast enough for our needs. But we run loads of these, and the problem is that pg_attribute becomes very bloated quite fast, and autovacuum doesn't seem able to keep up (and consumes a lot of CPU).
My questions are:
Is it possible to perform this kind of insert without using temp tables?
If not, should we just make autovacuum of pg_attribute more aggressive? Won't that take up as much or more CPU?
The best solution is to create your temporary tables at session start with
CREATE TEMPORARY TABLE ... (
...
) ON COMMIT DELETE ROWS;
Then the temporary tables would be kept for the duration of the session but emptied at every commit.
This reduces the bloat of pg_attribute considerably, and bloating shouldn't be a problem any more.
You could also join the dark side (be warned, this is unsupported):
Start PostgreSQL with
pg_ctl start -o -O
so that you can modify system catalogs.
Connect as superuser and run
UPDATE pg_catalog.pg_class
SET reloptions = ARRAY['autovacuum_vacuum_cost_delay=0']
WHERE oid = 'pg_catalog.pg_attribute'::regclass;
Now autovacuum will run much more aggressively on pg_attribute, and that will probably take care of your problem.
Mind that the setting will be gone after a major upgrade.
I know this is an old question, but somebody might find this useful in the future.
We are very heavy on temporary tables (>500 requests per second with async I/O via Node.js) and experienced very heavy bloating of pg_attribute because of it. All you are left with is very aggressive vacuuming, which halts performance.
The other answers given here do not solve this, because dropping and recreating temp tables bloats pg_attribute heavily, and one sunny morning you will find database performance dead and pg_attribute at 200+ GB while your database is around 10 GB.
The solution is, elegantly, this:
create temp table if not exists my_temp_table (description) on commit delete rows;
You can go on using temp tables, spare your pg_attribute, avoid the dark-side heavy vacuuming, and get the desired performance.
Don't forget to run:
vacuum full pg_depend;
vacuum full pg_attribute;
Cheers :)

PostgreSQL - restored database smaller than original

I made a backup of my PostgreSQL database using pg_dump to a .sql file.
When I restored the database, its size was 2.8 GB, compared with 3.7 GB for the source (original) database. The application that accesses the database appears to work fine.
What is the reason for the smaller size of the restored database?
The short answer is that database storage is more optimised for speed than space.
For instance, if you inserted 100 rows into a table, then deleted every row with an odd numbered ID, the DBMS could write out a new table with only 50 rows, but it's more efficient for it to simply mark the deleted rows as free space and reuse them when you next insert a row. Therefore the table takes up twice as much space as is currently needed.
Postgres's use of "MVCC", rather than locking, for transaction management makes this even more likely, since an UPDATE usually involves writing a new row to storage, then marking the old row deleted once no transactions are looking at it.
By dumping and restoring the database, you are recreating a DB without all this free space. This is essentially what the VACUUM FULL command does - it rewrites the current data into a new file, then deletes the old file.
There is an extension distributed with Postgres called pg_freespacemap which lets you examine some of this via its pg_freespace() function. For example, you can list the main table size (not including indexes and columns stored in separate "TOAST" tables) and the free space tracked for each table with the below:
Select oid::regclass::varchar as table,
pg_size_pretty(pg_relation_size(oid)/1024 * 1024) As size,
pg_size_pretty(sum(free)) As free
From (
Select c.oid,
(pg_freespace(c.oid)).avail As free
From pg_class c
Join pg_namespace n on n.oid = c.relnamespace
Where c.relkind = 'r'
And n.nspname Not In ('information_schema', 'pg_catalog')
) tbl
Group By oid
Order By pg_relation_size(oid) Desc, sum(free) Desc;
The reason is simple: during its normal operation, when rows are updated, PostgreSQL adds a new copy of the row and marks the old copy of the row as deleted. This is multi-version concurrency control (MVCC) in action. Then VACUUM reclaims the space taken by the old row for data that can be inserted in the future, but doesn't return this space to the operating system as it's in the middle of a file. Note that VACUUM isn't executed immediately, only after enough data has been modified in the table or deleted from the table.
What you're seeing is entirely normal. It just shows that a PostgreSQL database will be larger than the sum of the sizes of its rows. Your restored database will most likely grow back toward 3.7 GB once you start actively using it.
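The MVCC behavior described above can be observed directly (the table here is a throwaway example; the statistics are updated asynchronously, so allow a moment before checking):

```sql
-- An UPDATE writes a new row version and leaves the old one behind as dead.
CREATE TABLE mvcc_demo (id int PRIMARY KEY, val text);
INSERT INTO mvcc_demo VALUES (1, 'a');
UPDATE mvcc_demo SET val = 'b' WHERE id = 1;

-- Once the stats collector catches up, n_dead_tup reflects the old version:
SELECT n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'mvcc_demo';
```

A plain VACUUM then marks that dead version's space as reusable without shrinking the file, which is exactly why the dumped-and-restored copy is smaller.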

How do I delete old rows based on table size and available free space?

I have continuous inserts into my table and want to automatically delete the oldest rows so that the table stays at a roughly constant size on disk. In MySQL I was able to do this using:
SELECT DATA_LENGTH,DATA_FREE FROM information_schema.tables WHERE table_name = 'table';
Using these values and the free space on the drive, I can issue delete statements, and the table size will not grow.
How can I do this in PostgreSQL? I could only find the total table size:
SELECT pg_total_relation_size('table_name');
Is there a way to find the free space in the table, or is there a better solution to keep the table at a fixed size?
This may be a start for table sizes (note that it assumes the default 8 kB block size):
SELECT relname, relpages * 8192 AS "Size_In_Bytes"
FROM pg_class c
JOIN information_schema.tables t ON c.relname = t.table_name
WHERE table_schema = 'public'
ORDER BY relpages * 8192 DESC;
I am somewhat curious about the requirement to keep the DB at a certain size. Do you archive the data elsewhere? If not, why keep it at all? Why not use a time-based retention policy rather than a size-based one?
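A time-based retention job can be sketched like this (the table and column names are hypothetical):

```sql
-- Delete everything older than 30 days; run this from cron or pg_cron.
DELETE FROM event_log
WHERE created_at < now() - interval '30 days';

-- Follow with a plain VACUUM so the freed space is reused by new inserts.
-- The space is not returned to the OS, so the on-disk size stays roughly
-- constant once the table reaches its steady-state size.
VACUUM event_log;
```

If the deletes are large, range partitioning by created_at and dropping old partitions (as discussed earlier in this thread) avoids both the DELETE cost and the bloat.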