I have a medium-sized database cluster running on PostgreSQL 8.3.
The database stores digital files (images) as LOBs.
There is a fair bit of activity in the database cluster: a lot of content is created and deleted on an ongoing basis.
Even though the application table that holds the OIDs is maintained properly by the application (the row is removed when an image file is deleted), the size of the database cluster grows continuously.
Auto-vacuuming is active so this shouldn't happen.
LOBs are NOT deleted from the database when the rows of your application table (containing the OIDs) get deleted.
This also means that space will NOT be reclaimed by the VACUUM process.
In order to get rid of unused LOBs, you have to run vacuumlo on the databases. vacuumlo deletes all unreferenced LOBs from a database.
Example call:
vacuumlo -U postgres -W -v <database_name>
(I only included -v to make vacuumlo more verbose, so that you can see how many LOBs it removes.)
After vacuumlo has deleted the LOBs, you can run VACUUM FULL to return the space to the operating system, or let the auto-vacuum process run, which only makes the space reusable inside the database.
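A cautious way to apply the steps above is to do a dry run before deleting anything. The following is only a sketch: it assumes that vacuumlo and psql are on the PATH, that the role postgres is a superuser, and that mydb is a placeholder for your database name. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: "mydb" in the usage line is a placeholder database name.

# Report the on-disk size of pg_largeobject, where all LOB data is stored.
lo_size() {
    psql -X -At -U postgres -d "$1" \
        -c "SELECT pg_size_pretty(pg_total_relation_size('pg_largeobject'))"
}

# Dry run: -n only reports what would be removed, nothing is deleted.
lo_dry_run() {
    vacuumlo -n -v -U postgres "$1"
}

# Real cleanup, then rewrite pg_largeobject so the space goes back to the OS.
# VACUUM FULL holds an exclusive lock on the table while it runs.
lo_cleanup() {
    vacuumlo -v -U postgres "$1" &&
        psql -X -U postgres -d "$1" -c "VACUUM FULL pg_largeobject"
}

# Usage: lo_size mydb; lo_dry_run mydb; lo_cleanup mydb; lo_size mydb
```

Comparing the output of lo_size before and after the cleanup shows how much space the orphaned LOBs were holding.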
I'm running a very long pg_restore of a database with 70 tables and 800 GB. The process has been running for 5 days now. I'm monitoring some aspects of it to estimate how long it will take, but some things are unclear to me, which is why I'm asking.
I ran pg_dump with the parameters -F d -j 10; the dump took about 12 hours. I noticed that each of the 10 jobs handled a single table from start to end. After finishing a table, the same process (pid) started on another table not yet taken by another process.
Running pg_restore is taking much longer (5 days and still working). The main reason is that I'm restoring to an external NAS drive mounted over NFS, and that drive is very slow compared to a local hard drive. This is NOT a problem; I'll migrate the data back from the NAS to the original hard drive once I have formatted it and installed the new operating system.
I'm doing two things to monitor progress:
In a separate terminal I launch du -sh /var/lib/pgsql and evaluate the disk space consumed in the new installation. It has to reach, more or less, the same space the original database was using.
In a separate terminal I launch ps -fu postgres and I see several pg_restore processes running. Each one of them is linked to another process of the form postgres: postgres {dbname} [local] {command}, where {dbname} is the database name and {command} varies. Initially it was COPY, which I think was used to restore the table contents. I also saw some CREATE INDEX commands re-creating the indexes of that table, and now I see ALTER TABLE commands, and I don't know exactly what they are for.
At this point all processes are just doing ALTER TABLE and the overall used space almost matches the original size, but the process does not end (and it has been running for 5 days now).
So I'm asking whether someone with more experience can tell me what pg_restore is doing with the ALTER TABLE command, and whether there is any other way to estimate how long it will take.
Thanks!
Ignacio
The ALTER TABLE statements at the end of a pg_restore create primary and unique keys as well as foreign key constraints. They could also be attaching partitions, but that is normally very fast.
Look into pg_stat_progress_create_index if you have a recent enough PostgreSQL version (you didn't say; the view exists from PostgreSQL 12 on). With it you can monitor the progress of primary and unique key indexes being created.
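As a rough sketch of that monitoring, the two helper functions below wrap psql queries. The function names and the database name in the usage line are placeholders; the first query assumes PostgreSQL 12 or later, and both assume you can connect as a role allowed to see other sessions' activity.

```shell
#!/bin/sh
# Sketch only: assumes PostgreSQL 12+ for pg_stat_progress_create_index.

# Show phase and block/tuple counters for index builds in this database.
index_progress() {
    psql -X -d "$1" -c "
        SELECT pid, relid::regclass AS table_name, phase,
               blocks_done, blocks_total, tuples_done, tuples_total
        FROM pg_stat_progress_create_index
        WHERE datname = current_database()"
}

# Show the full statement of each active backend, so the exact
# ALTER TABLE is visible instead of the short title shown by ps.
active_statements() {
    psql -X -d "$1" -c "
        SELECT pid, now() - query_start AS running_for, query
        FROM pg_stat_activity
        WHERE state = 'active' AND pid <> pg_backend_pid()"
}

# Usage: index_progress mydb; active_statements mydb
```

Running index_progress every few minutes and watching blocks_done approach blocks_total gives a per-index estimate. Foreign key validation is not an index build, so it will only show up in active_statements.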
PostgreSQL log files are filling up the disk, and I intend to delete them all after backing them up. Should I restart the postgres service, or can postgres see the newly freed space after the deletion without a restart? If not, is there a command that forces postgres to see the new free space while it is running?
You can delete PostgreSQL log files at any time. Note, however, that deleting (unlinking) a file does not actually delete it as long as a process still holds it open. So you have to notify PostgreSQL with
pg_ctl logrotate
just like the documentation describes.
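The unlink behaviour mentioned above can be reproduced without PostgreSQL. This small sketch uses a temporary file and file descriptor 3 as a stand-in for the server's open log file; nothing in it is specific to PostgreSQL.

```shell
#!/bin/sh
# Sketch only: a temp file plays the role of an open PostgreSQL log file.

tmp=$(mktemp)
echo "log line" > "$tmp"

# Hold the file open on descriptor 3, as a logging process would.
exec 3< "$tmp"

# Unlink it: the directory entry disappears immediately...
rm "$tmp"
[ -e "$tmp" ] || echo "path is gone"

# ...but the data is still there and readable through the open descriptor,
# so the filesystem cannot release its blocks yet.
still=$(cat <&3)
echo "still readable: $still"

# Only closing the descriptor lets the space be freed.
exec 3<&-
```

Telling PostgreSQL to rotate its log file has the same effect as the final close here: the server lets go of the old file and the space becomes available.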
I run df -h and it says 36Gb used, 9Gb available on my server. I can see that my PostgreSQL db file is 26Gb.
I delete millions of rows from my database, roughly half of the data in there.
I run df -h and it says the exact same numbers: 36Gb used, 9Gb available on disk.
I googled and found something about the VACUUM command, which said that deleted rows still take up disk space until you run VACUUM. I ran it in verbose mode and the output looks good, but still no disk space was released.
The system is FC21 with PostgreSQL 9.3.9. The cluster has 6 databases and uses 38 GB of storage in the pgsql directory. Recently over 20 GB of redundant data has been removed. Each db has been vacuumed with a 'vacuum all' command twice; additionally, the entire cluster has been vacuumed twice with a vacuumdb -a command. All ran successfully. PostgreSQL has been stopped and restarted.
For verification, a pg_dumpall command creates a 12 GB file.
All the tables from one db were removed:
select pg_size_pretty(pg_database_size('db'));
Shows over 6GB remaining.
How can the space be recovered? It seems unreasonable to have to do a pg_restore to recover the space. I have read and re-read the 'recovering disk space' document.
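A plain VACUUM only returns space to the operating system when the free space is at the end of the table files; everything else is merely marked as reusable for new rows. You will want VACUUM FULL or vacuumdb -f, which rewrite each table into a new file. Note that this takes an exclusive lock on the table and temporarily needs enough free disk space for the new copy.
You might also want to consider reindexdb if indexes still look bloated afterwards. Since PostgreSQL 9.0, VACUUM FULL rebuilds a table's indexes as part of the rewrite, so this mainly matters for tables that were only vacuumed normally.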
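To see where the space actually sits before rewriting anything, something along these lines may help. It is only a sketch: the function names and the database name in the usage line are placeholders, and it assumes psql and vacuumdb are on the PATH.

```shell
#!/bin/sh
# Sketch only: "mydb" in the usage line is a placeholder database name.

# List the ten largest tables, counting their indexes and TOAST data.
largest_tables() {
    psql -X -d "$1" -c "
        SELECT c.oid::regclass AS table_name,
               pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
        FROM pg_class c
        WHERE c.relkind = 'r'
        ORDER BY pg_total_relation_size(c.oid) DESC
        LIMIT 10"
}

# Rewrite every table in one database. Each table is locked exclusively
# while it is rewritten and needs free disk space for its new copy.
full_vacuum() {
    vacuumdb -f -v -d "$1"
}

# Usage: largest_tables mydb; full_vacuum mydb; largest_tables mydb
```

On a disk that is nearly full, it can be safer to run VACUUM FULL on one table at a time, starting with smaller ones, so that each rewrite frees room for the next.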
What will happen if we restore an older pg_dump file on a running db?
I have restored an older SQL file over an existing database. Does it harm the DB and its functionality?
In general, yes, it'll screw up the database. Rows that were deleted in the past will be back. Sequences may be reset. Dropped tables can be re-created. All sorts of things.
Without more details, particularly the command used when restoring the dump and the nature of the dump, it's hard to be sure in this specific case.
If you restored with:
psql -1 -v ON_ERROR_STOP=1 -f the_dump.sql
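then it's possible that you don't have any damage, or only have to re-set some sequences. With -1 the whole restore runs in a single transaction, and ON_ERROR_STOP aborts at the first error (for example, creating a table that already exists), so everything is rolled back. Sequence values changed by setval are not rolled back, though.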
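For illustration, the restore command and a possible sequence fix could be wrapped like this. This is a sketch under assumptions: the function names are invented, and mydb, orders and id in the usage line are placeholders for your own database, table and serial column. The names are pasted into the SQL as-is, so only pass trusted values.

```shell
#!/bin/sh
# Sketch only: database, table and column names are placeholders.

# Restore a plain-SQL dump in one transaction, stopping at the first error.
restore_atomically() {
    psql -X -1 -v ON_ERROR_STOP=1 -d "$1" -f "$2"
}

# Move a serial column's sequence up to the highest value in the table.
# Arguments: database, table, column (trusted input only, no quoting is done).
reset_sequence() {
    psql -X -d "$1" -c "
        SELECT setval(pg_get_serial_sequence('$2', '$3'),
                      COALESCE(max($3), 1))
        FROM $2"
}

# Usage: restore_atomically mydb the_dump.sql
#        reset_sequence mydb orders id
```

After reset_sequence, the next value handed out by the sequence is above every id already in the table, which avoids duplicate key errors on new inserts.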