postgres backup with WAL - postgresql

I am in no way a DB admin, so please don't shoot me if I'm doing it completely wrong...
I have to add archiving to a production Postgres database (newest version, running in a Docker container) and I'm trying to build some scripts around WAL archiving.
The idea is a weekly script that writes a full base backup into a new directory and then points a symlink at that directory; the WAL archive script writes its compressed segments through that symlink. The weekly script also deletes backups older than 30 days.
I would be very happy for any comments on this...
db settings
wal_level = replica
archive_mode = on
archive_command = '/archive/archive_wal.sh "%p" "%f"'
archive_timeout = 300
weekly script:
#!/bin/bash

# create base archive dir
#base_arch_dir=/tmp/archive/
base_arch_dir=/archive/
if [ ! -d "$base_arch_dir" ]; then
    mkdir "$base_arch_dir"
    chown -R postgres:postgres "$base_arch_dir"
fi

# create dir for this week
dir="$base_arch_dir"$(date '+%Y_%m_%d__%H_%M_%S')
if [ ! -d "$dir" ]; then
    mkdir "$dir"
    chown -R postgres:postgres "$dir"
fi

# change/create the symlink
newdir="$base_arch_dir"wals
ln -fsn "$dir" "$newdir"
chown -R postgres:postgres "$newdir"

# do the base backup to the wals dir, then prune old backup directories
if pg_basebackup -D "$newdir" -F tar -R -X fetch -z -Z 9 -U postgres; then
    find "$base_arch_dir"* -type d -mtime +31 | xargs rm -rf
fi
archive script:
#!/bin/bash
set -e

arch_dir=/archive/wals
arch_log="$arch_dir/arch.log"

if [ ! -d "$arch_dir" ]; then
    echo "arch_dir $arch_dir does not exist" >> "$arch_log"
    exit 1
fi

# get the arguments passed by postgres (%p = path of the WAL file, %f = file name)
p=$1
f=$2

if [ -f "$arch_dir/$f.xz" ]; then
    echo "wal file $arch_dir/$f.xz already exists"
    exit 1
fi

pxz -2 -z --keep -c "$p" > "$arch_dir/$f.xz"
Thank you in advance

It's not terribly difficult to put together your own archiving scripts, but there are a few things you need to keep track of, because when you need your backups you really need them. There are some packaged backup systems for PostgreSQL. You may find these two a good place to start, but others are available.
https://www.pgbarman.org/
https://pgbackrest.org/
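For example, pgBackRest handles both the archive_command and the scheduled base backups for you. A rough sketch (stanza name and paths here are assumptions, not taken from your setup; pgBackRest 2.x syntax):
# /etc/pgbackrest.conf
#   [global]
#   repo1-path=/var/lib/pgbackrest
#   repo1-retention-full=4
#
#   [main]
#   pg1-path=/var/lib/postgresql/data

# in postgresql.conf, point archiving at pgBackRest instead of a custom script:
#   archive_mode = on
#   archive_command = 'pgbackrest --stanza=main archive-push %p'

pgbackrest --stanza=main stanza-create   # one-time initialisation of the repository
pgbackrest --stanza=main backup          # run weekly from cron; retention pruning is handled for you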

Related

PostgreSQL - HyperLogLog extension not found

Can someone explain, in a way a beginner can understand, how to install the HyperLogLog (hll) extension for PostgreSQL on my Mac M1 machine?
When running CREATE EXTENSION hll;
I get:
Query 1 ERROR: ERROR: could not open extension control file "/opt/homebrew/share/postgresql/extension/hll.control": No such file or directory
I am new at this, so this documentation https://github.com/citusdata/postgresql-hll did not help me much.
I installed all the other extensions that I need except this one.
When typing which postgres I get:
/opt/homebrew/bin/postgres
And version: postgres (PostgreSQL) 14.3
I read about configuring PG_CONFIG, but I do not understand what exactly I should be doing there.
I would appreciate the help, and I hope this post will be of use to other dummies like me. :)
We can simplify the script below and execute it inline by copying and pasting all of the following into your terminal:
yes |
#!/bin/bash
# download latest release
curl -s https://api.github.com/repos/citusdata/postgresql-hll/releases/latest \
| grep '"tarball_url":' \
| sed -E 's/.*"([^"]+)".*/\1/' \
| xargs curl -o package.tar.gz -L
# extract to new hll directory
mkdir hll && tar xf package.tar.gz -C hll --strip-components 1
# build and install extension to postgres extensions folder
cd hll
make
make install
# remove hll directory
cd ../
rm -r ./hll
# connect to PostgreSQL and install extension
psql -U postgres -c "CREATE EXTENSION hll;"
I wrote this script for myself to fetch the latest release and install it.
It builds the extension with make.
# check that make is installed
make -v
# download latest release
curl -s https://api.github.com/repos/citusdata/postgresql-hll/releases/latest \
| grep '"tarball_url":' \
| sed -E 's/.*"([^"]+)".*/\1/' \
| xargs curl -o package.tar.gz -L
# extract to hll directory
mkdir hll && tar xf package.tar.gz -C hll --strip-components 1
cd hll
# build and install the extension into the postgres extensions folder
make
make install
# remove hll directory
cd ../
rm -r ./hll
# connect to PostgreSQL
psql -U postgres
# install extension in your DB
CREATE EXTENSION hll;
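If the build still installs into the wrong PostgreSQL (for example an Apple-provided or second Homebrew install), the usual fix for a PGXS-based extension like postgresql-hll is to point PG_CONFIG at the pg_config that belongs to the server you actually run. A sketch, assuming the Homebrew path reported by which postgres above:
# tell the extension's Makefile which PostgreSQL to build against
export PG_CONFIG=/opt/homebrew/bin/pg_config
make PG_CONFIG="$PG_CONFIG"
make install PG_CONFIG="$PG_CONFIG"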

Postgres 9.4 restore not working

I back up a PostgreSQL 9.4 database following the WAL-archiving section of the Postgres documentation.
After the backup I create 2 records in the DB.
Now when I try to restore the DB, those last 2 records do not come back.
WAL archive steps:
cd /etc/postgresql/9.4/
mkdir archives
mkdir backups
chown postgres:postgres archives
chown postgres:postgres backups
cd /etc/postgresql/9.4/main/
echo 'max_wal_senders=1' >> postgresql.conf
echo 'wal_level=hot_standby' >> postgresql.conf
echo 'archive_mode=on' >> postgresql.conf
echo "archive_command='test ! -f /etc/postgresql/9.4/archives/%f && cp %p /etc/postgresql/9.4/archives/%f'" >> postgresql.conf
echo 'local replication postgres trust' >> pg_hba.conf
service postgresql restart
Backup steps:
cd /etc/postgresql/9.4/backups
rm -rf *
pg_basebackup --xlog -U postgres --format=t -D /etc/postgresql/9.4/backups/
Restore steps:
service postgresql stop
cd /var/lib/postgresql/9.4/
if [ ! -d "/var/lib/postgresql/9.4/tmp/" ]
then
mkdir tmp
else
rm -rf tmp
fi
mkdir tmp
mv /var/lib/postgresql/9.4/main/* /var/lib/postgresql/9.4/tmp/
cd /var/lib/postgresql/9.4/main/
rm -rf *
cd /etc/postgresql/9.4/backups
tar -xf base.tar -C /var/lib/postgresql/9.4/main/
cd /var/lib/postgresql/9.4/main/
FROMDIR="/etc/postgresql/9.4/archives/"
TODIR="/var/lib/postgresql/9.4/tmp/pg_xlog/"
if [ ! -d "$FROMDIR" ]
then
echo "Directory $FROMDIR does not exist!!"
exit
fi
if [ ! -d "$TODIR" ]
then
echo "Directory $TODIR does not exist!!"
exit
fi
cd $FROMDIR
for i in `find . -type f`
do
if [ ! -f $TODIR/$i ]
then
echo "copying file $i"
cp $i /var/lib/postgresql/9.4/main/pg_xlog/$i
fi
done
cd /var/lib/postgresql/9.4/main/pg_xlog/
chown -R postgres:postgres *
cd /var/lib/postgresql/9.4/main/
FILE="recovery.done"
if [ -f $FILE ]
then
mv $FILE recovery.conf
else
echo "restore_command = 'cp /etc/postgresql/9.4/archives/%f %p'" >> recovery.conf
fi
su postgres service postgresql start
exit
Changes appear in the archive (/etc/postgresql/9.4/archives/ in your case) when the current WAL segment (usually 16 MB) is filled up. Let me quote the documentation:
The archive_command is only invoked for completed WAL segments. Hence,
if your server generates little WAL traffic (or has slack periods
where it does so), there could be a long delay between the completion
of a transaction and its safe recording in archive storage. To limit
how old unarchived data can be, you can set archive_timeout to force
the server to switch to a new WAL segment file periodically. When this
parameter is greater than zero, the server will switch to a new
segment file whenever this many seconds have elapsed since the last
segment file switch, and there has been any database activity,
including a single checkpoint. (Increasing checkpoint_timeout will
reduce unnecessary checkpoints on an idle system.) Note that archived
files that are closed early due to a forced switch are still the same
length as completely full files. Therefore, it is unwise to use a very
short archive_timeout — it will bloat your archive storage.
If you just want to test the restore process, you can simply do select pg_switch_xlog(); after creating some records, to force a switch to a new WAL segment. Then verify that a new file has appeared in the archive directory.
Also, you need not copy files from the archive directory to pg_xlog/ yourself; restore_command will do that for you during recovery.
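A quick way to check this, assuming the paths from the question (run as the postgres OS user or via sudo):
# force 9.4 to finish the current WAL segment (the function is renamed pg_switch_wal() in PostgreSQL 10+)
sudo -u postgres psql -c "SELECT pg_switch_xlog();"
# a new ~16 MB segment file should now show up in the archive directory
ls -lt /etc/postgresql/9.4/archives/ | head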

How do I replace the -D flag for pg_dumpall in Postgres?

I'm trying to create a PostgreSQL backup script using this answer as the basis of my script. The script is:
#! /bin/bash
# backup-postgresql.sh
# by Craig Sanders
# this script is public domain. feel free to use or modify as you like.
DUMPALL="/usr/bin/pg_dumpall"
PGDUMP="/usr/bin/pg_dump"
PSQL="/usr/bin/psql"
# directory to save backups in, must be rwx by postgres user
BASE_DIR="/var/backups/postgres"
YMD=$(date "+%Y-%m-%d")
DIR="$BASE_DIR/$YMD"
mkdir -p $DIR
cd $DIR
# get list of databases in the system, excluding the template dbs
DBS=$($PSQL -l -t | egrep -v 'template[01]' | awk '{print $1}')
# first dump entire postgres database, including pg_shadow etc.
$DUMPALL -D | gzip -9 > "$DIR/db.out.gz"
# next dump globals (roles and tablespaces) only
$DUMPALL -g | gzip -9 > "$DIR/globals.gz"
# now loop through each individual database and backup the schema and data separately
for database in $DBS; do
    SCHEMA=$DIR/$database.schema.gz
    DATA=$DIR/$database.data.gz
    # dump the schema of each database as plain text
    $PGDUMP -C -c -s $database | gzip -9 > $SCHEMA
    # dump the data
    $PGDUMP -a $database | gzip -9 > $DATA
done
The line:
$DUMPALL -D | gzip -9 > "$DIR/db.out.gz"
is returning this error:
psql: FATAL: role "root" does not exist
/usr/lib/postgresql/9.3/bin/pg_dumpall: invalid option -- 'D'
When I look at the PostgreSQL docs, there doesn't seem to be a -D option anymore. What should the updated command look like?
This is the modified script I ended up using to periodically backup my PostgreSQL database:
#! /bin/bash
# backup-postgresql.sh
# by Craig Sanders
# this script is public domain. feel free to use or modify as you like.
DUMPALL="/usr/bin/pg_dumpall"
PGDUMP="/usr/bin/pg_dump"
PSQL="/usr/bin/psql"
# directory to save backups in, must be rwx by postgres user
BASE_DIR="/var/backups/postgres"
YMD=$(date "+%Y-%m-%d")
DIR="$BASE_DIR/$YMD"
mkdir -p $DIR
cd $DIR
# get list of databases in the system, excluding the template dbs
DBS=$($PSQL -l -t | egrep -v 'template[01]' | awk '{print $1}' | egrep -v '^\|' | egrep -v '^$')
# first dump entire postgres database, including pg_shadow etc.
$DUMPALL -c -f "$DIR/db.out"
# next dump globals (roles and tablespaces) only
$DUMPALL -g -f "$DIR/globals"
# now loop through each individual database and backup the schema and data separately
for database in $DBS; do
    SCHEMA=$DIR/$database.schema
    DATA=$DIR/$database.data
    # dump the schema of each database as plain text
    $PGDUMP -C -c -s $database -f $SCHEMA
    # dump the data
    $PGDUMP -a $database -f $DATA
done
# delete backup files older than 30 days
OLD=$(find $BASE_DIR -type d -mtime +30)
if [ -n "$OLD" ] ; then
echo deleting old backup files: $OLD
echo $OLD | xargs rm -rfv
fi
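For reference, the removed -D short option on pg_dumpall corresponded to dumping data as INSERT commands with explicit column names. If that behaviour is still wanted, the long option should be the equivalent (a sketch; check pg_dumpall --help for your version):
# possible replacement for the old "$DUMPALL -D" line
$DUMPALL --column-inserts | gzip -9 > "$DIR/db.out.gz"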

issues backing up postgres databases in cron

I am trying to back up Postgres databases from a cron job. The issue is that Postgres runs under the postgres user, and I don't think I can run the cron job as the ubuntu user. I tried to create the cron job under the postgres user and that also did not work. My script, if I log in as the postgres user, works just fine.
Here is my script
#!/bin/bash
# Location to place backups.
backup_dir="/home/postgres-backup/"
# String to append to the names of the backup files
backup_date=`date +%d-%m-%Y`
# Number of days you want to keep copies of your databases
number_of_days=30
databases=`psql -l -t | cut -d'|' -f1 | sed -e 's/ //g' -e '/^$/d'`
for i in $databases; do
    if [ "$i" != "template0" ] && [ "$i" != "template1" ]; then
        echo Dumping $i to $backup_dir$i\_$backup_date
        pg_dump -Fc $i > $backup_dir$i\_$backup_date
    fi
done
find $backup_dir -type f -prune -mtime +$number_of_days -exec rm -f {} \;
if I do
sudo su - postgres
I see
-rwx--x--x 1 postgres postgres 570 Jan 12 20:48 backup_all_db.sh
and when I do
./backup_all_db.sh
it gets backed up in /home/postgres-backup/
However, from cron it is not working, regardless of whether I add the cron job under postgres or under ubuntu.
Here is my cron job:
0,30 * * * * /var/lib/pgsql/backup_all_db.sh 1> /dev/null 2> /home/cron.err
I will appreciate any help.
Enable user to run cron jobs
If the /etc/cron.allow file exists, then users must be listed in it in order to be allowed to run the crontab command. If the /etc/cron.allow file does not exist but the /etc/cron.deny file does, then users must not be listed in the /etc/cron.deny file in order to run crontab.
In the case where neither file exists, the default on current Ubuntu (and Debian, but not some other Linux and UNIX systems) is to allow all users to run jobs with crontab.
Add cron jobs
Use this command to add a cron job for the current user:
crontab -e
Use this command to add a cron job for a specified user (permissions are required):
crontab -u <user> -e
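For the script in the question, one option is therefore to install the job in the postgres user's own crontab (the log path below is an assumption; make sure it is writable by postgres):
# edit the postgres user's crontab (requires root/sudo)
sudo crontab -u postgres -e
# then add a line such as:
# 0,30 * * * * /var/lib/pgsql/backup_all_db.sh 1> /dev/null 2> /tmp/backup_cron.err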
Additional reading
man 5 crontab
Crontab in Ubuntu: https://help.ubuntu.com/community/CronHowto

Delete non git directory in git bash, windows

xx#xx-PC ~/xampp/htdocs/sites
$ rmdir /s "yo-2"
rmdir: `/s': No such file or directory
rmdir: `yo-2': Directory not empty
xx#xx-PC ~/xampp/htdocs/sites
$ rmdir "yo-2"
rmdir: `yo-2': Directory not empty
I can't seem to get rmdir to work in Git Bash. It's not in a git repo and I've tried the above. mkdir works as expected, so why doesn't this?
rmdir only works if the directory is empty.
Try
rm -rf yo-2
Git Bash is a Linux-like shell.
If you are trying to remove an entire directory regardless of contents, you could use:
rm <dirname> -rf
just use the command below:
rm -rfv mydirectory
After trying out a couple of other commands, this worked for me:
rm dirname -rf
A bit late, but I believe this can still help someone with performance problems on Windows. Deleting this way from Git Bash is much faster on Windows than a plain rm -rf. The trick is to move the file or directory to a random name in a temporary directory on the same drive (on Windows) or the same partition (on *nix systems) and then invoke rm -rf on it in background mode. You don't have to wait for a blocking IO task, and the OS performs the deletion as soon as it is idle.
Depending on the system you are using, you may need to install the realpath program (e.g. on macOS). Another alternative is to write a portable bash function as in this post: bash/fish command to print absolute path to a file.
fast_rm() {
    path=$(realpath "$1")                 # get the absolute path
    echo "$path"
    if [ -e "$path" ]; then
        export TMPDIR="$(dirname $(mktemp -u))"
        kernel=$(uname | awk '{print tolower($0)}')
        # if windows, make sure to use the same drive
        if [[ "${kernel}" == "mingw"* ]]; then # git bash
            export TMPDIR=$(echo "${path}" | awk '{ print substr($0, 1, 2)"/temp"}')
            if [ ! -e "$TMPDIR" ]; then mkdir -p "$TMPDIR"; fi
        fi
        if [ "${kernel}" == "darwin" ]; then MD5=md5; else MD5=md5sum; fi
        rnd=$(echo $RANDOM | $MD5 | awk '{print $1}')   # random suffix for the temporary name
        to_remove="${TMPDIR}/$(basename ${path})-${rnd}"
        mv "${path}" "${to_remove}"
        # delete in the background so the shell is not blocked
        nohup rm -rf "${to_remove}" > /dev/null 2>&1 &
    fi
}
# invoking the function
directory_or_file=./yo-2
fast_rm "$directory_or_file"
I faced the same issue; this is what worked for me.
rimraf is a Node.js package that provides the UNIX rm -rf for Node, so you will need to install Node.js, which includes npm. Then you can run:
npm install -g rimraf
Then you can run rimraf from the command line.
rimraf directoryname
visit https://superuser.com/questions/78434/how-to-delete-directories-with-path-names-too-long-for-normal-delete
I found this solution because npm itself was causing this problem due to the way it nests dependencies.
A late reply, but for those searching for a solution: for me
rm <dirname> -rf
wasn't good; I always got "directory not empty" or "path too long" errors on node directories.
A really simple solution:
Move the directory you want to delete to the root of your disk (to shorten its path) and then delete it normally.