postgresql consumed 100% of storage

I'm running a Red Hat 4.8.3-9 server in AWS with PostgreSQL 9.4. The database has consumed 100% of my disk. I went into the database and truncated the table with the most data. After viewing table sizes with \d+, no table was over a couple of MB. I ran du -h * --max-depth=1 and found that /var/lib/pgsql94/data/base/16384 held 472G of the 500G total storage. Even after truncating the tables, my disk usage is still 99%. I'm wondering if there is a way to release the space, because I believe deleting all the OID files in data/base/16384 by hand would be bad. I have tried stopping and restarting the postgres service. Unfortunately I am not allowed to reboot the machine.
df -ih shows inode usage is 1%
sudo lsof +L1 does not show any large files at all
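Table sizes as the database sees them can also be listed with the standard catalog functions, which makes it easier to compare against what du reports (a sketch; run it in the affected database):
SELECT relname, pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind IN ('r', 'm', 't')
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 20;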
Thank you
The log files contain 8 kB of the same messages repeating:
LOG: could not write temporary statistics file "pg_stat_tmp/db_16384.tmp": No space left on device
LOG: could not close temporary statistics file "pg_stat_tmp/db_0.tmp": No space left on device
LOG: could not close temporary statistics file "pg_stat_tmp/global.tmp": No space left on device
LOG: using stale statistics instead of current ones because stats collector is not responding

Related

FATAL: could not write lock file "postmaster.pid": No space left on device

I'm running a PostgreSQL database in a Docker container. I allocated 300 GB for Docker and everything was going great, with about 90% free space. One day I got this error: PostgreSQL Database directory appears to contain a database; Skipping initialization: FATAL: could not write lock file "postmaster.pid": No space left on device. I don't know where it came from or how to actually fix it. As a temporary fix I increased the size to 400 GB, which worked, but why? Now I have 95% of the space free, which is consistent with the previous 90%. I tried to trace it with df -h and some other commands to check disk usage and didn't find anything. Has anyone faced something similar?
I tried df -h to check disk usage, and allocated more Docker resources.
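It may also be worth checking how Docker itself accounts for the space, separately from df inside the container (a sketch; the container name is illustrative):
docker system df -v
docker exec my_postgres du -sh /var/lib/postgresql/data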

delete postgres logs when database is running - Get back disk space

Postgres log files are saturating the disk, and I intend to delete them all after backing them up. Should I restart the postgres service, or can postgres see the new free space after deletion without a restart? If not, is there a command that forces postgres to see the new free space while it is running?
You can delete PostgreSQL log files at any time. Note, however, that deleting (unlinking) a file does not actually free its disk space as long as a process still holds it open. So you have to notify PostgreSQL with
pg_ctl logrotate
just like the documentation describes.
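A sketch of the whole sequence, assuming the logging collector is enabled and writes to the default log directory under the data directory (paths are illustrative):
tar czf /backup/pglogs.tar.gz "$PGDATA"/log/*.log   # back up the logs first
rm "$PGDATA"/log/*.log                              # unlink; space is only freed once no process holds the files open
pg_ctl logrotate -D "$PGDATA"                       # tell the collector to switch to a new log file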

PostgreSQL error: could not extend file no space left on device

I'm running a query that duplicates a very large table (92 million rows) in PostgreSQL. After 3 iterations I got the error from the title. The query was:
CREATE TABLE table_name
AS SELECT * FROM big_table;
The issue isn't lack of space in the database cluster: the cluster was at 0.3% of the maximum possible storage when the query ran, and the table is about 0.01% of maximum storage including all replicas. I also checked temporary files, and it's not that either.
You are definitely running out of file system resources.
Make sure you got the size right:
SELECT pg_table_size('big_table');
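Wrapping it in pg_size_pretty() gives a readable figure:
SELECT pg_size_pretty(pg_table_size('big_table'));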
Don't forget that the files backing the new table are deleted after the error, so it is no surprise that you have lots of free space after the statement has failed.
One possibility is that you are not running out of disk space, but of free i-nodes. How to examine the free resources differs from file system to file system; for ext4 on Linux it would be
sudo dumpe2fs -h /dev/... | grep Free
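A file-system-agnostic alternative is df -i, which reports i-node usage for any mounted file system (mount point is illustrative):
df -i /var/lib/pgsql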

Postgres 11 Standby never catches up

Since upgrading to Postgres 11 I cannot get my production standby server to catch up. In the logs things look fine eventually:
2019-02-06 19:23:53.659 UTC [14021] LOG: consistent recovery state reached at 3C772/8912C508
2019-02-06 19:23:53.660 UTC [13820] LOG: database system is ready to accept read only connections
2019-02-06 19:23:53.680 UTC [24261] LOG: started streaming WAL from primary at 3C772/8A000000 on timeline 1
But the following queries show everything is not fine:
warehouse=# SELECT coalesce(abs(pg_wal_lsn_diff(pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn())), -1) / 1024 / 1024 / 1024 AS replication_delay_gbytes;
replication_delay_gbytes
-------------------------
208.2317776754498486
(1 row)
warehouse=# select now() - pg_last_xact_replay_timestamp() AS replication_delay;
replication_delay
-------------------
01:54:19.150381
(1 row)
After a while (a couple of hours) replication_delay stays about the same, but replication_delay_gbytes keeps growing; note, though, that replication_delay is behind from the beginning, while replication_delay_gbytes starts near 0. During startup there were a number of these messages:
2019-02-06 18:24:36.867 UTC [14036] WARNING: xlog min recovery request 3C734/FA802AA8 is past current point 3C700/371ED080
2019-02-06 18:24:36.867 UTC [14036] CONTEXT: writing block 0 of relation base/16436/2106308310_vm
but Googling suggests these are fine.
The replica was created using repmgr, which runs pg_basebackup to perform the clone; the replica then starts up and catches up. This previously worked with Postgres 10.
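For reference, the clone step looks roughly like this (a sketch; host name and config path are illustrative):
repmgr -h primary.example.com -U repmgr -d repmgr -f /etc/repmgr.conf standby clone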
Any thoughts on why this replica comes up but is perpetually lagging?
I'm still not sure what the issue is/was, but I was able to get the standby caught up with these two changes:
set use_replication_slots=true in the repmgr config
set wal_compression=on in the postgres config
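Concretely, the two settings live in repmgr.conf and postgresql.conf respectively (a sketch; file locations vary by installation):
# repmgr.conf
use_replication_slots=true
# postgresql.conf
wal_compression = on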
Using replication slots didn't seem to change anything other than causing replication_delay_gbytes to stay roughly flat. Turning on WAL compression did help, somehow, although I'm not entirely sure how. Yes, in theory it made it possible to ship WAL to the standby faster, but reviewing network logs I see a drop in sent/received bytes that matches the effect of compression, so it seems to be shipping WAL at the same speed, just using less network.
It still seems like there is some underlying issue at play here, though. For example, when I run pg_basebackup to create the standby it generates roughly 500 MB/s of network traffic, but once the standby finishes recovery and starts streaming WAL it drops to ~250 MB/s without WAL compression and ~100 MB/s with it. Yet there was no decrease in network traffic after it caught up with WAL compression enabled, so I'm not sure what was going on there that allowed it to catch up.

unable to recover disk space postgresql

The system is FC21 with PostgreSQL 9.3.9. The cluster has 6 databases and uses 38 GB of storage in the pgsql directory. Recently over 20 GB of redundant data was removed. Each database has been vacuumed twice with a plain VACUUM, and the entire cluster has additionally been vacuumed twice with vacuumdb -a. All ran successfully. PostgreSQL has been stopped and restarted.
For verification, pg_dumpall creates a 12 GB file.
All the tables from one db were removed, yet
select pg_size_pretty(pg_database_size('db'));
shows over 6 GB remaining.
How can the space be recovered? It seems unreasonable to have to do a pg_restore to recover the space. I have read and re-read the 'recovering disk space' document.
A VACUUM command will only reclaim space that is at the end of the table files. You will want VACUUM FULL or vacuumdb -f.
You might also want to consider reindexdb, since all this row rewriting might leave your indexes a little bloated.
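A sketch of a full pass over the cluster from the shell (note that VACUUM FULL takes an exclusive lock on each table while it rewrites it):
vacuumdb -a -f   # VACUUM FULL every database in the cluster
reindexdb -a     # mainly useful after plain VACUUM; VACUUM FULL already rebuilds indexes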