How to enable auto clean of pg_xlog - postgresql

I'm trying to configure a PostgreSQL 9.6 database to limit the size of the pg_xlog folder. I've read a lot of threads about this issue or similar ones but nothing I've tried helped.
I wrote an setup script for my Postgresql 9.6 service instance. It executes initdb, registers a windows service, starts it, creates an empty database and restores a dump into the database. After the script is done, the database structure is fine, the data is there but the xlog folder already contains 55 files (880 mb).
To reduce the size of the folder, I tried setting wal_keep_segments to 0 or 1, setting the max_wal_size to 200mb, reducing the checkpoint_timeout, setting archive_mode off and archive_command to an empty string. I can see the properties have been set correctly when I query pg_settings.
I then forced checkpoints through SQL, vacuumed the database, restarted the windows service and tried pg_archivecleanup, nothing really worked. My xlog folder downsized to 50 files (800 mb), not anywhere near the 200 mb limit I set in the config.
I have no clue what else to try. If anyone can tell me what I'm doing wrong, I would be very grateful. If more information is required, I'll be glad to provide it.
Many thanks

PostgreSQL won't aggressively remove WAL segments that have already been allocated when max_wal_size was at the default value of 1GB.
The reduction will happen gradually, whenever a WAL segment is full and needs to be recycled. Then PostgreSQL will decide whether to delete the file (if max_wal_size is exceeded) or rename it to a new WAL segment for future use.
If you don't want to wait that long, you could force a number of WAL switches by calling the pg_switch_xlog() function, that should reduce the number of files in your pg_xlog.

Related

PostgreSQL how to safely remove files inside pg_wal directory

i have a legacy postgreSQL DB and the pg_wal size is very huge,
how to safely remove the early files inside pg_wal directory to reduce the pg_wal size without interrupting the current database?
Thank you
There is no safe way to manually remove files in pg_wal. Don't do it.
You have to figure out the reason that keeps PostgreSQL from deleting the files. A stale replication slot? Is the archiver stuck? Is wal_keep_size (wal_keep_segments in older releases) large?
Once you have fixed the problem, the situation will gradually improve. WAL segments are automatically deleted during checkpoints.

DB2 SQL3706N A disk full error was encountered

I have nearly 600+ files to load in DB2 database version 10.5.9. Each file size is nearly 200 MB. I have a batch script to upload each files in a loop.
My Disk "/mnt/blumeta0/db2/copy"size is 16 GB
If i run this upload with nonrecoverable mode it works. But i cant do that in my prod database.
I tried to db3 connect refresh and db3 terminate after each file uploaded but does not worked.
Manually cleaned up disk /mnt/blumeta0/db2/copy but total size of all files is more than 16 GB so got same error.
I cannot clean folder in script as clean up can be done with super user.
db2 "LOAD FROM $i OF DEL INSERT INTO <table_name>"
SQL3706N A disk full error was encountered on "/mnt/blumeta0/db2/copy".
How DB2 server cleans copy folder? Is there any other alternative i can try?
You suggested that the Load succeeds when using NONRECOVERABLE mode, however fails otherwise with error "SQL3706N A disk full error was encountered on "/mnt/blumeta0/db2/copy".
I'm guessing that the Load is being performed using the COPY YES option. Since the Load command that you pasted does not show the COPY YES option, I'm guessing that you have a special configuration setting enabled that forces Load operations to use COPY YES in order to prevent the table from becoming inaccessible in a rollforward recovery event or HADR standby takeover event. The name of this configuration setting (registry variable) is "DB2_LOAD_COPY_NO_OVERRIDE".
When the Load is performed with COPY YES, a copy of the table pages/extents that were generated during the Load operation is written into a copy image file.
I suspect that you have the registry variable "DB2_LOAD_COPY_NO_OVERRIDE=COPY YES /mnt/blumeta0/db2/copy" configured (you can use db2set -all on the database server to display all configured registry variables). If so, the copy image files are being stored in this path, which at 16GB appears to be too small to contain them all.
You can consider changing the location of this path to somewhere with more disk space, however the path should always be accessible in the event of a database rollforward recovery or hadr standby takeover, otherwise the table will not be accessible after such an event.

Backup postgresql WAL logs

I try to configure backuping database in postgresql with pg_basebackup and WAL logs.
For now I created full backup once a week and want to backup wal logs too. But, as I understand, posgresql writes them all the time. So, how can I copy them and be shure that they are not corrupted?
Thanks
You set archive_command to a shell command that copies the WAL file to a safe archive location, so that burden is mostly on you.
When PostgreSQL runs archive_command, it assumes that the WAL file is not corrupted. Only a PostgreSQL bug or a bug in the storage system could cause a corrupted WAL segment.
There is no better protection against PostgreSQL bugs than always running the latest bugfix release, and you can invest in storage hardware that will at least detect failure.
You can also write your archive_command with a certain amount of paranoia, e.g. by comparing the md5sum of the WAL segment and its archive copy.
Another idea is to write two copies of the WAL file to different storage systems.

Is it safe to delete archive_status log file in postgresql 10

I am not a DBA but i am using postgresql for production server and i am using postgresql 10 database. I am using Bigsql and i started replication of my production server to other server and on replication server everything is working but on my production server their is no space left. And after du command on my production server i am getting that pg_wal folder have 17 gb file and each file is of 16 mb size.
After some google search i change my postgresql.conf file as:
wal_level = logical
archive_mode = on
archive_command = 'cp -i %p /etc/bigsql/data/pg10/pg_wal/archive_status/%f'
i install postgresql 10 from Bigsql and did above changes.
After changes the dir /pg_wal/archive_status had 16 gb of log. So my question is that should i delete them manually or i have to wait for system delete them automatically.
And is that if i write archive_mode to on should that wal file getting removed automatically??
Thanks for your precious time.
This depends on how you do your backups and whether you'd ever need to restore the database to some point in time.
Only a full offline filesystem backup (offline meaning with database turned off) or an on-line logical backup with pg_dumpall will not need those files for a restore.
You'd need those files to restore a filesystem backup created while the database is running. Without them the backup will fail to restore. Though there exist backup solutions that copy needed WAL files automatically (like Barman).
You'd also need those files if your replica database will ever fall behind the master for some reason. Or you'd need to restore the database to some past point-in-time.
But these files compress pretty well - should be less than 10% size after compression - you can write your archive_command to compress them automatically instead of just copying.
And you should delete them eventually from the archive. I'd recommend to not delete them until they're at least a month old and also at least 2 full successful backups are done after creating them.

Which Postgresql WAL files can I safely remove from the WAL archive folder

Current situation
So I have WAL archiving set up to an independent internal harddrive on a data logging computer running Postgres. The harddrive containing the WAL archives is filling up and I'd like to remove and archive all the WAL archive files, including the initial base backup, to external backup drives.
The directory structure is like:
D:/WALBACKUP/ which is the parent folder for all the WAL files (00000110000.CA00000004 etc)
D:/WALBACKUP/BASEBACKUP/ which holds the .tar of the initial base backup
The question I have then is:
Can I safely move literally every single WAL file except the current WAL archive file, (000000000001.CA0000.. and so on), including the base backup, and move them to another hdd. (Note that the database is live and receiving data)
cheers!
WAL archives
You can use the pg_archivecleanup command to remove WAL from an archive (not pg_xlog) that's not required by a given base backup.
In general I suggest using PgBarman or a similar tool to automate your base backups and WAL retention though. It's easier and less error prone.
pg_xlog
Never remove WAL from pg_xlog manually. If you have too much WAL then:
your wal_keep_segments setting is keeping WAL around;
you have archive_mode on and archive_command set but it isn't working correctly (check the logs);
your checkpoint_segments is ridiculously high so you're just generating too much WAL; or
you have a replication slot (see the pg_replication_slots view) that's preventing the removal of WAL.
You should fix the problem that's causing WAL to be retained. If nothing seems to have happened after changing a setting run a manual CHECKPOINT command.
If you have an offline server and need to remove WAL to start it you can use pg_archivecleanup if you must. It knows how to remove only WAL that isn't needed by the server its self ... but it might break your archive-based backups, streaming replicas, etc. So don't use it unless you must.
WAL files are incremental, so the simple answer is: You cannot throw any files out. The solution is to make a new base backup and then all previous WALs can be deleted.
The WAL files contain individual statements that modify tables so if you throw some older WALs out, then the recovery process will fail (it will not silently skip missing WAL files) because the state of the database cannot be restored reliably. You can move the WAL files to some other location without upsetting the WAL process but then you'd have to make all WAL files available again from a single location if you ever need to recover your database from some point in the past; if you are running out of disk space then that may mean recovering from some location where you have enough space to store the base backup and all WAL files. The main issue here is if you can do that fast enough to restore a full database after an incident.
Another issue is that if you cannot identify where/when a problem occurred that needs to be corrected your only option is to start with the base backup and then replay all the WAL files. This procedure is not difficult, but if you have an old base backup and many WAL files to process, this simply takes a lot of time.
The best approach for your case, in general, is to make a new base backup every x months and collect WALs with that base backup. After every new base backup you can delete the old base backup and its subsequent WALs or move them to cheap offline storage (DVD, tape, etc). In the case of a major incident you can quickly restore the database to a known correct state from the recent base backup and the relatively few WAL files collected since then.
A solution that we went for, is executing pg_basebackup every night. This would create a base backup and later on we can use pg_archivecleanup to clean up all the "old" WAL files before that base using something like
"%POSTGRES_INSTALLDIR%\bin\pg_archivecleanup" -d %WAL_backup_dir% %newestBaseFile%
Fortunately, we never had to recover yet, but it should work in theory.
In case someone found this by searching how to safely cleanup the WAL directory under a replication architecture, consider the scenario where there might be left overs from offline replicas, in this case, unused replica slots waiting for the replica to come back online and thus keeping a lot of WAL archives on the Master DB.
In our case we had an issue with a replica going down due to hardware failure, we had to recreate it along with its replica_slot on the Master DB but forgot to get rid of the previous used one. Once we cleared that out PSQL got rid of unused WALs and all was good.
You can add the script to automatically clean or remove pg_wal files. This will work in pg-11 version. If you want to use other psql version the you can simply replace the command "/usr/pgsql-11/bin/pg_archivecleanup" to /usr/pgsql-12/bin/pg_archivecleanup or 13 as per your wish.
#!/bin/bash
/usr/pgsql-11/bin/pg_controldata -D /var/lib/pgsql/11/data/ > pgwalfile.txt
/usr/pgsql-11/bin/pg_archivecleanup -d /var/lib/pgsql/11/data/pg_wal $(cat pgwalfile.txt | grep "Latest checkpoint's REDO WAL file" | awk '{print $6}')