I'm new to a Postgres setup implemented by someone else and need help figuring out an issue.
We have the following archive command configured. If I understand correctly, it copies WAL files to mounted storage at /mnt/database:
archive_command = 'if { egrep -q " /mnt/database .* rw," /proc/mounts ;} && { ! pgrep test -u postgres ;} ; then test ! -f /mnt/database/%f && cp %p /mnt/database/%f ; else exit 1; fi'
We then have a cron job to move corrupted WALs out of the way:
find /mnt/database -type f -regextype posix-extended -regex ".*[A-Z0-9]{24}$" -mmin +60 -size -16777216c -exec logger "Trimming Postgres WAL Logs" \; -exec find /var/lib/pgsql/9.6/data/pg_xlog/{} -type f \; -exec mv {} {}.incomplete \;
The issue we are having is that /mnt/database keeps filling up, and we have to extend the disk every few days. Is that because we have excessive WAL writing or too many corrupted WAL files?
The live WAL in 'pg_wal' cleans itself up automatically. Your WAL archive, '/mnt/database/' here, does not. It is up to you to arrange for that to get cleaned up based on your organization's retention policy.
If your policy is to keep WAL forever, then you need to get enough storage space to do that. If you have some other policy, you would need to understand what it is (and describe it to us, if you want our help implementing it).
Neither of the commands you show seems to be related to retention.
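For illustration, here is a minimal sketch of what a retention job could look like, assuming a hypothetical policy of keeping 30 days of archives; the paths and the window are assumptions, not your actual policy, and deleting WAL is only safe if every base backup you still need is newer than what you delete:
#!/bin/bash
# Hypothetical cron job: drop archived WAL segments older than 30 days.
find /mnt/database -type f -mtime +30 -delete
# Alternatively, pg_archivecleanup removes everything older than a given
# segment; take the segment name from the .backup file of the oldest
# base backup you want to keep (the name below is made up):
# pg_archivecleanup /mnt/database 000000010000003700000010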
I didn't check that I was in the root directory (/), and I ran:
chown www-data:www-data -R *
find . -type d -exec chmod 755 {} \;
find . -type f -exec chmod 644 {} \;
I don't have a backup, that's why I can't restore.
All my websites return 503 :(
The last thing I saw in the terminal was http://prntscr.com/oags7v before I screamed.
How can I recover the original ownership of the files and their permissions with chown and chmod?
Thank you in advance & best regards...
At this point... first read The Tao of Backup. Better late than never.
Then, back up anything of yours: any apps you made, any data you have, anything that is not the OS.
Wipe the machine clean, reinstall OS. Reinstall any necessary software.
Put your data back, and chown them appropriately.
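For the web content specifically, something along these lines is a plausible starting point once the files are back in place; this is a sketch that assumes your web root is /var/www and is served as www-data, so verify both against your actual setup first:
# Assumed web root and web user; adjust before running.
chown -R www-data:www-data /var/www
find /var/www -type d -exec chmod 755 {} \;   # directories need the execute bit to be traversable
find /var/www -type f -exec chmod 644 {} \;   # files readable but not executable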
Make sure everything works.
Then see if you learned anything from The Tao of Backup. Reread as needed.
I am in no way a DB admin, so please don't shoot me if I'm doing it completely wrong...
I have to add archiving to a production Postgres database (newest version, in a Docker container) and am trying to build some scripts to use with WAL.
The idea is to have a weekly script that does a full backup to a new directory and then creates a symlink to that directory, which the WAL archive script uses to write its logs. The weekly script also deletes backups older than 30 days.
I would be very happy for any comments on this...
db settings
wal_level = replica
archive_mode = on
archive_command = '/archive/archive_wal.sh "%p" "%f"'
archive_timeout = 300
weekly script:
#!/bin/bash
#create base archive dir
#base_arch_dir=/tmp/archive/
base_arch_dir=/archive/
if [ ! -d "$base_arch_dir" ]; then
mkdir "$base_arch_dir"
chown -R postgres:postgres "$base_arch_dir"
fi
#create dir for week
dir="$base_arch_dir"$(date '+%Y_%m_%d__%H_%M_%S')
if [ ! -d "$dir" ]; then
mkdir "$dir"
chown -R postgres:postgres "$dir"
fi
#change/create the symlink
newdir="$base_arch_dir"wals
ln -fsn "$dir" "$newdir"
chown -R postgres:postgres "$newdir"
#do the base backup to the wals dir
if pg_basebackup -D "$newdir" -F tar -R -X fetch -z -Z 9 -U postgres; then
find "$base_arch_dir"* -type d -mtime +31|xargs rm -rf
fi
archive script:
#!/bin/bash
set -e
arch_dir=/archive/wals
arch_log="$arch_dir/arch.log"
if [ ! -d "$arch_dir" ]; then
# can't write to a log file inside a directory that doesn't exist, so use stderr
echo "arch_dir $arch_dir does not exist" >&2
exit 1
fi
#get the variables from postgres
p=$1
f=$2
if [ -f "$arch_dir/$f.xz" ]; then
echo "wal file $arch_dir/$f.xz already exists"
exit 1
fi
pxz -2 -z --keep -c "$p" > "$arch_dir/$f.xz"
Thank you in advance
It's not terribly difficult to put together your own archiving scripts, but there are a few things you need to keep track of, because when you need your backups you really need them. There are some packaged backup systems for PostgreSQL. You may find these two a good place to start, but others are available.
https://www.pgbarman.org/
https://pgbackrest.org/
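For instance, with pgBackRest the archiving side shrinks to a one-line archive_command once a stanza is configured; this is a sketch based on its documentation, where the stanza name "main" is an assumption:
# postgresql.conf, when pgBackRest manages the WAL archive:
archive_command = 'pgbackrest --stanza=main archive-push %p'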
I use CentOS 6.5.
In the / path I have a lot of files whose names start with tmp_.
I work with a user, franco, who has limited permissions (and I can't add permissions to this user),
and when I try to delete these files with FileZilla, I get a permission denied message.
So the solution is to delete these files with a command in PuTTY,
because in PuTTY I can use a command like sudo rm ...
but I did not find the exact command.
I found this kind of command:
rm ./-tmp_
I want to delete only the files that are directly in the / path, not in its subdirectories, and whose names start with tmp_.
I work on a critical system, so I want to be sure before executing any command.
To find the target files, use:
find / -maxdepth 1 -type f -name 'tmp_*'
This will just print the files to the console.
To remove the files (not directories):
find / -maxdepth 1 -type f -name 'tmp_*' -exec rm -f {} \;
Use the -maxdepth option when you want to restrict find to a specific depth.
The command "sudo rm -rf /tmp_" is worked if the /tmp_ directory not used for you.
I take a backup of a PostgreSQL 9.4 database following the WAL archiving section of the Postgres docs.
After the backup I create 2 records in the DB.
Now when I try to restore the DB, the last 2 records which I created above do not come up.
WAL archive steps:
cd /etc/postgresql/9.4/
mkdir archives
mkdir backups
chown postgres:postgres archives
chown postgres:postgres backups
cd /etc/postgresql/9.4/main/
echo 'max_wal_senders=1' >> postgresql.conf
echo 'wal_level=hot_standby' >> postgresql.conf
echo 'archive_mode=on' >> postgresql.conf
echo "archive_command='test ! -f /etc/postgresql/9.4/archives/%f && cp %p /etc/postgresql/9.4/archives/%f'" >> postgresql.conf
echo 'local replication postgres trust' >> pg_hba.conf
service postgresql restart
Backup steps:
cd /etc/postgresql/9.4/backups
rm -rf *
pg_basebackup --xlog -U postgres --format=t -D /etc/postgresql/9.4/backups/
Restore steps:
service postgresql stop
cd /var/lib/postgresql/9.4/
if [ ! -d "/var/lib/postgresql/9.4/tmp/" ]
then
mkdir tmp
else
rm -rf tmp
fi
mkdir tmp
mv /var/lib/postgresql/9.4/main/* /var/lib/postgresql/9.4/tmp/
cd /var/lib/postgresql/9.4/main/
rm -rf *
cd /etc/postgresql/9.4/backups
tar -xf base.tar -C /var/lib/postgresql/9.4/main/
cd /var/lib/postgresql/9.4/main/
FROMDIR="/etc/postgresql/9.4/archives/"
TODIR="/var/lib/postgresql/9.4/tmp/pg_xlog/"
if [ ! -d "$FROMDIR" ]
then
echo "Directory $FROMDIR does not exist!!"
exit
fi
if [ ! -d "$TODIR" ]
then
echo "Directory $TODIR does not exist!!"
exit
fi
cd $FROMDIR
for i in `find . -type f`
do
if [ ! -f $TODIR/$i ]
then
echo "copying file $i"
cp $i /var/lib/postgresql/9.4/main/pg_xlog/$i
fi
done
cd /var/lib/postgresql/9.4/main/pg_xlog/
chown -R postgres:postgres *
cd /var/lib/postgresql/9.4/main/
FILE="recovery.done"
if [ -f $FILE ]
then
mv $FILE recovery.conf
else
echo "restore_command = 'cp /etc/postgresql/9.4/archives/%f %p'" >> recovery.conf
fi
su postgres -c 'service postgresql start'
exit
Changes appear in the archive (/etc/postgresql/9.4/archives/ in your case) when the current WAL segment (usually 16 MB) is filled up. Let me quote the documentation:
The archive_command is only invoked for completed WAL segments. Hence,
if your server generates little WAL traffic (or has slack periods
where it does so), there could be a long delay between the completion
of a transaction and its safe recording in archive storage. To limit
how old unarchived data can be, you can set archive_timeout to force
the server to switch to a new WAL segment file periodically. When this
parameter is greater than zero, the server will switch to a new
segment file whenever this many seconds have elapsed since the last
segment file switch, and there has been any database activity,
including a single checkpoint. (Increasing checkpoint_timeout will
reduce unnecessary checkpoints on an idle system.) Note that archived
files that are closed early due to a forced switch are still the same
length as completely full files. Therefore, it is unwise to use a very
short archive_timeout — it will bloat your archive storage.
If you just want to test the restore process, you can simply run select pg_switch_xlog(); after creating some records, to force a switch to a new WAL segment. Then verify that a new file has appeared in the archive directory.
Also, you need not copy files from the archive directory to pg_xlog/; restore_command will do it for you.
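A quick way to check this, as a sketch (it assumes you can run psql as the postgres user and that the archive directory is the one above):
# Force the current WAL segment to be archived (9.4 syntax; from
# version 10 onward the function is called pg_switch_wal()):
psql -U postgres -c "SELECT pg_switch_xlog();"
# A freshly archived 16 MB segment should now show up here:
ls -lt /etc/postgresql/9.4/archives/ | head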
I am not sure if this belongs on Super User. Please excuse me if not.
Here is what I am trying to do: I need to create a ksh script which will establish an SSH connection to a remote machine, find all ".tar" files in a particular path for a particular date, and list them. Next, I will need to run an scp command to copy all those .tar files to the server I am executing the ksh script on.
Here is what I have so far, and it is far from complete... (please bear with me, I am very new to ksh scripting).
Can someone please advise whether I am going in the right direction and provide some pointers as to how I can improve it and achieve what I am trying to do?
Many thanks in advance.
SSERVER=server1
SOURCEPATH=/tmp/test
sudo ssh $SSERVER \
find $SOURCEPATH -name "*.tar" -mtime +7 -exec ls {} \;
#will the above two statements work?
#I then need to output the ls results to a temp variable (i believe) and issue an scp on each of the files
#Copy files from SOURCEPATH to PATH
sudo scp "$SSERVER:$SOURCEPATH/$file1" /tftpboot
sudo scp "$SSERVER:$SOURCEPATH/$file2" /tftpboot
SSERVER=server1
SOURCEPATH=/tmp/test
# List the matching files on the remote host; the pattern is single-quoted
# so that the remote shell expands it rather than the local one.
sudo ssh "$SSERVER" "find $SOURCEPATH -name '*.tar' -mtime +7" |
while IFS= read -r; do
# $REPLY holds one remote path per iteration; redirect stdin so that
# scp cannot swallow the rest of the file list.
sudo scp "$SSERVER:'$REPLY'" /tftpboot < /dev/null
done