PostgreSQL database size is lower after restore

I just wrote a simple bash script to back up a PostgreSQL database from a remote server and restore it to the same server with a _bak suffix. The script looks like this:
export FILENAME="/var/pg-backups/$4-$(date +%s).bak"
echo "$(date) Creating dump with args - $1 $2 $3 $4. Filename = $FILENAME"
pg_dump -h $1 -p $2 -U $3 -b -F d -j 4 -v -f "$FILENAME" $4
echo "$(date) Re-create _bak database"
psql -h $1 -p $2 -U $3 -d $4 -c "drop database if exists $4_bak;"
psql -h $1 -p $2 -U $3 -d $4 -c "create database $4_bak;"
echo "$(date) Restoring hot copy with _bak postfix"
pg_restore -h $1 -p $2 -U $3 -d "$4_bak" -w -v "$FILENAME"
echo "$(date) Done"
It works well, but there is something strange about the restored _bak database: it is smaller than the original. Here's the output of pg_database_size for the original (backed-up) database and its restored copy:
postgres=# select pg_database_size('somedb');
pg_database_size
------------------
548152175
postgres=# select pg_database_size('somedb_bak');
pg_database_size
------------------
511648623
(1 row)
I also took one table as an example to check the difference in relation size, using this query:
select pg_relation_size('resources.attachment', 'main') as main,
pg_relation_size('resources.attachment', 'fsm') as fsm,
pg_relation_size('resources.attachment', 'vm') as vm,
pg_relation_size('resources.attachment', 'init') as init,
pg_table_size('resources.attachment'), pg_indexes_size('resources.attachment') as indexes,
pg_total_relation_size('resources.attachment') as total;
Here's what I got for the original database:
main | fsm | vm | init | pg_table_size | indexes | total
-------+-------+------+------+---------------+---------+-----------
65536 | 24576 | 8192 | 0 | 109158400 | 32768 | 109191168
and for the restored copy:
main | fsm | vm | init | pg_table_size | indexes | total
-------+-------+----+------+---------------+---------+-----------
65536 | 24576 | 0 | 0 | 100302848 | 32768 | 100335616
So only pg_table_size and the VM fork differ. The record count, by the way, is the same in the original and the restored table.
Can someone please explain the size difference between the original and the restored database?
PostgreSQL version:
PostgreSQL 12.7 (Debian 12.7-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
P.S.
I've also tried -F c without parallelism, with the same result.

Psql output csv file formatting

I'm trying to write a batch file that reads a query from a script file and writes the results to a CSV file.
Batch file code:
psql -h host -d test -U test -p 5432 -a -q -f C:\Users\test\Documents\my_query.sql TO STDOUT WITH CSV HEADER DELIMITER ';' > C:\Users\test\Documents\res.csv
In the result file I get output like this:
select *
from public.test
limit 3
id | name | count_01
----------+------------+---------------+
11021555 | a | 1 |
39534568 | b | 2 |
11695210 | c | 3 |
(3 rows)
How can I get only the query results, without the row count and symbols like '|' or '+', and with ';' as the delimiter as in a usual CSV file?
Working script:
psql -h host -d test -U test -p 5432 -q --quiet --no-align --field-separator=';' --file=C:\Users\test\Documents\my_query.sql --output=C:\Users\test\Documents\res.csv
From PostgreSQL v12 on, you can use the CSV output format of psql:
psql --quiet --csv --file=my_query.sql --output=res.csv
--quiet suppresses the psql welcome message.
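The question asked for ';' as the delimiter, while --csv uses a comma by default. A minimal sketch, assuming psql 12 or later (where the csv_fieldsep print option is available); the function name export_csv and the file names are placeholders, not part of the original posts:

```shell
# Hypothetical helper: run a query file and write semicolon-separated CSV.
# Assumes psql 12+ (--csv and the csv_fieldsep print option). Connection
# settings come from the PG* environment variables or from extra arguments.
export_csv() {
    query_file=$1
    out_file=$2
    shift 2
    psql --quiet --csv -P csv_fieldsep=';' \
         --file="$query_file" --output="$out_file" "$@"
}

# Usage (placeholder values):
#   export_csv my_query.sql res.csv -h host -d test -U test -p 5432
```

Keeping the connection options as trailing arguments means the same helper can be reused for different hosts or databases.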
Should work with
psql -h host -d dbname -U user -p port -a -q -f my_query.sql -o res.csv --record-separator=',' --csv

Restore SQL dump in PostgreSQL 11 Docker at image build time

I want to build a custom Postgres 11 image in which some users are created and some extensions are installed. Since I want these to be created at build time, I don't want to use docker-entrypoint-initdb.d. The next step would be to restore an SQL dump as well.
FROM postgres:11
ENV PG_MAJOR 11
ENV POSTGISV 2.5
ENV TZ Europe/Brussels
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
postgresql-$PG_MAJOR-postgis-$POSTGISV \
postgresql-$PG_MAJOR-postgis-$POSTGISV-scripts
USER postgres
RUN initdb && pg_ctl -o "-c listen_addresses='*'" start &&\
psql -h 0.0.0.0 --command "CREATE USER docker WITH SUPERUSER PASSWORD 'docker';" &&\
psql -h 0.0.0.0 --command "CREATE USER akela_test WITH PASSWORD 'akela';" &&\
createdb -E UTF8 -U postgres -h 0.0.0.0 -O akela_test akela_test --template template0 &&\
psql -U postgres -d akela_test -h 0.0.0.0 -c 'CREATE EXTENSION IF NOT EXISTS "hstore";' &&\
psql -U postgres -d akela_test -h 0.0.0.0 -c 'CREATE EXTENSION IF NOT EXISTS "postgis";' &&\
psql -U postgres -d akela_test -h 0.0.0.0 -c 'CREATE EXTENSION IF NOT EXISTS "uuid-ossp";' &&\
psql -U postgres -d akela_test -h 0.0.0.0 -c "CREATE ROLE akela_db WITH LOGIN PASSWORD 'akela'" &&\
psql -U postgres -d akela_test -h 0.0.0.0 -c "GRANT ALL PRIVILEGES ON DATABASE akela_test to akela_db" &&\
psql -U postgres -d akela_test -h 0.0.0.0 -c "CREATE schema db" &&\
pg_ctl stop
# gunzip -c /tmp/dump.sql.gz | psql -U akela -h 0.0.0.0 akela
USER root
The build seems to work:
...
CREATE SCHEMA
ALTER SCHEMA
CREATE ROLE
GRANT
CREATE SCHEMA
ALTER SCHEMA
waiting for server to shut down....2019-07-08 12:58:06.962 CEST [22] LOG: received fast shutdown request
2019-07-08 12:58:06.964 CEST [22] LOG: aborting any active transactions
2019-07-08 12:58:06.965 CEST [22] LOG: background worker "logical replication launcher" (PID 29) exited with exit code 1
2019-07-08 12:58:06.965 CEST [24] LOG: shutting down
2019-07-08 12:58:07.006 CEST [22] LOG: database system is shut down
done
server stopped
...
Running the image, however, shows neither the users nor the database:
postgres=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+------------+------------+-----------------------
postgres | postgres | UTF8 | en_US.utf8 | en_US.utf8 |
template0 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres +
| | | | | postgres=CTc/postgres
(3 rows)
postgres=# \du
List of roles
Role name | Attributes | Member of
-----------+------------------------------------------------------------+-----------
postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
What could be the issue?
The Dockerfile for postgres defines a volume, which means any changes made to this directory by a RUN step will be discarded. To make changes to this directory, you need to do one of the following:
Make the changes at run time rather than during the build, and save the resulting volume.
Make changes during the build, but in a different directory. This would require changing the postgres configuration to use the different directory.
Save your changes to a different directory and then restore those changes when you start the container (see the save and load volume scripts for an example of this).
Build your own postgres image without the volume definition.

postgres -c <parameter>=<value> not working for Postgres 11 in docker

I'm running a Postgres inside a docker container. I want to change the default config of Postgres so I'm running:
docker container run -d postgres -c max_connections=200 -c shared_buffers=1GB -c effective_cache_size=3GB -c maintenance_work_mem=256MB -c checkpoint_completion_target=0.7 -c wal_buffers=16MB
But when I'm connecting to Postgres running:
docker exec -it container_name psql
And then the result of :
SHOW max_connections;
is
max_connections
-----------------
100
(1 row)
And it's not just max_connections: none of the parameters are changed. What am I doing wrong?
Update: the result of
root=# SELECT *
root-# FROM pg_settings
root-# WHERE name = 'max_connections';
is
name | setting | unit | category | short_desc | extra_desc | context | vartype | source | min_val | max_val | enumvals | boot_val | reset_val | sourcefile | sourceline | pending_restart
-----------------+---------+------+------------------------------------------------------+----------------------------------------------------+------------+------------+---------+--------------------+---------+---------+----------+----------+-----------+------------------------------------------+------------+-----------------
max_connections | 100 | | Connections and Authentication / Connection Settings | Sets the maximum number of concurrent connections. | | postmaster | integer | configuration file | 1 | 262143 | | 100 | 100 | /var/lib/postgresql/data/postgresql.conf | 64 | f
(1 row)
If you cannot get it to work while starting the server, try ALTER SYSTEM:
psql -c "ALTER SYSTEM SET max_connections=200" -c "SELECT pg_reload_conf()"
That changes the setting in postgresql.auto.conf.
Changing shared_buffers, wal_buffers and max_connections requires a restart of the PostgreSQL server; the other parameters can be changed on the fly.
Is it possible that you are connecting to the wrong container? When I try to run psql as you have, I get:
$ docker exec -it boring_hermann psql
psql: FATAL: role "root" does not exist
...because the standard user is root and has no access to the container's DB. When I run as user 999, group 999 (the one listed as postgres in that image), it works correctly:
$ docker exec -u 999:999 -it boring_hermann psql
psql (11.1 (Debian 11.1-1.pgdg90+1))
Type "help" for help.
postgres=# show max_connections;
max_connections
-----------------
200
(1 row)

job stalls with (END)

I'm trying to set up two databases on Travis, but the build just stops halfway through before_install, stating:
(END)
No output has been received in the last 10 minutes, this potentially indicates a stalled build or something wrong with the build itself.
The build has been terminated
e.g. https://travis-ci.org/B3Partners/brmo/builds/85746119
My .travis.yml is the following:
language: java
sudo: false
branches:
  only:
    - travis-integration
addons:
  postgresql: "9.3"
jdk:
  # - openjdk6
  # - openjdk7
  - oraclejdk7
  # - oraclejdk8
matrix:
  fast_finish: true
cache:
  directories:
    - $HOME/.m2
before_install:
  # STAGING
  - psql --version
  - psql -U postgres -c 'create database staging'
  - psql -U postgres -c 'create database rsgb'
  - psql -U postgres --list
  # set up RSGB
  - psql -U postgres -d rsgb -c 'create extension postgis'
  - psql -U postgres -d rsgb -f ./datamodel/generated_scripts/datamodel_postgresql.sql --single-transaction --echo-all
  # - psql -f ./datamodel/utility_scripts/111a_update_gemeente_geom.sql -U postgres -d rsgb --single-transaction
  # - psql -f ./datamodel/utility_scripts/113a_update_wijk_geom.sql -U postgres -d rsgb --single-transaction
install:
  # install all dependencies + artifacts without any testing
  - mvn install -Dmaven.test.skip=true -B -V -fae -q
before_script:
  # this must be done after 'install' has finished, because the staging DB SQL is generated by Hibernate
  - psql -U postgres -d staging -f ./brmo-persistence/target/ddlscripts/create-brmo-persistence-postgresql.sql --single-transaction
  - psql -U postgres -d staging -f ./brmo-persistence/db/01_create_indexes.sql
  - psql -U postgres -d staging -f ./brmo-persistence/db/02_insert_default_user.sql
  - psql -U postgres -d staging -f ./brmo-persistence/db/03_update_status_enum_value.sql
# run tests
script:
  # run unit tests
  - mvn -e test -B
  # run integration tests
  - mvn -e verify -B
after_success:
after_failure:
after_script:
notifications:
  email: false
  # on_success: [always|never|change] # default: change
  # on_failure: [always|never|change] # default: always
As you can see in the log, it just stalls after a few psql calls:
0.01s$ psql --version
psql (PostgreSQL) 9.3.5
before_install.2
0.02s$ psql -U postgres -c 'create database staging'
CREATE DATABASE
before_install.3
0.22s$ psql -U postgres -c 'create database rsgb'
CREATE DATABASE
before_install.4
1.04s$ psql -U postgres -d rsgb -c 'create extension postgis'
CREATE EXTENSION
$ psql -U postgres --list
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
rsgb | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
staging | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
travis | travis | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
(6 rows)
(END)
No output has been received in the last 10 minutes, this potentially indicates a stalled build or something wrong with the build itself.
The build has been terminated
I just spent about 3 hours troubleshooting this same issue and the problem is pretty simple once you understand why. psql is simply trying to page the output. There are multiple ways to disable the pager, but the solution I went with was to set the PAGER=cat environment variable in .travis.yml like so:
env:
  - PGUSER=postgres
    PAGER=cat
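As one of the other ways to disable the pager mentioned above, psql's own print option can be set on the command line. This is only a sketch; psql_nopager is a made-up helper name, not something from the original answer:

```shell
# Hypothetical wrapper: call psql with its pager switched off.
# -P pager=off is the command-line form of "\pset pager off".
psql_nopager() {
    psql -P pager=off "$@"
}

# Usage in a CI step (placeholder arguments):
#   psql_nopager -U postgres --list
```

Unlike the PAGER=cat environment variable, this affects only the psql calls made through the wrapper and leaves other tools in the build untouched.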

Check if database exists in PostgreSQL using shell

Is it possible to check from the shell whether a PostgreSQL database exists?
I am writing a shell script and I only want it to create the database if it doesn't already exist, but so far I haven't found a way to implement this.
Note/Update (2021): While this answer works, philosophically I agree with other comments that the right way to do this is to ask Postgres.
Check whether the other answers that have psql -c or --command in them are a better fit for your use case (e.g. Nicolas Grilly's, Nathan Osman's, bruce's or Pedro's variant).
I use the following modification of Arturo's solution:
psql -lqt | cut -d \| -f 1 | grep -qw <db_name>
What it does
psql -l outputs something like the following:
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+-----------+----------+------------+------------+-----------------------
my_db | my_user | UTF8 | en_US.UTF8 | en_US.UTF8 |
postgres | postgres | LATIN1 | en_US | en_US |
template0 | postgres | LATIN1 | en_US | en_US | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | LATIN1 | en_US | en_US | =c/postgres +
| | | | | postgres=CTc/postgres
(4 rows)
Using the naive approach means that searching for a database called "List", "Access" or "rows" will succeed. So we pipe this output through a few standard command-line tools to search only in the first column.
The -t flag removes headers and footers:
my_db | my_user | UTF8 | en_US.UTF8 | en_US.UTF8 |
postgres | postgres | LATIN1 | en_US | en_US |
template0 | postgres | LATIN1 | en_US | en_US | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | LATIN1 | en_US | en_US | =c/postgres +
| | | | | postgres=CTc/postgres
The next bit, cut -d \| -f 1 splits the output by the vertical pipe | character (escaped from the shell with a backslash), and selects field 1. This leaves:
my_db
postgres
template0
template1
grep -w matches whole words, and so won't match if you are searching for temp in this scenario. The -q option suppresses any output written to the screen, so if you want to run this interactively at a command prompt you may wish to leave out the -q so that something gets displayed immediately.
Note that grep -w treats letters, digits and the underscore as word characters, which is exactly the set of characters allowed in unquoted database names in PostgreSQL (hyphens are not legal in unquoted identifiers). If you are using other characters, grep -w won't work for you.
The exit status of this whole pipeline will be 0 (success) if the database exists or 1 (failure) if it doesn't. Your shell will set the special variable $? to the exit status of the last command. You can also test the status directly in a conditional:
if psql -lqt | cut -d \| -f 1 | grep -qw <db_name>; then
# database exists
# $? is 0
else
# ruh-roh
# $? is 1
fi
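To try the pipeline without a running server, it can be wrapped in a function that reads the listing on stdin. This is only an illustrative sketch: the function name db_in_listing and the sample rows are assumptions, shaped like the psql -lqt output shown above.

```shell
# Sketch: same cut + grep -w pipeline, reading "psql -lqt" style output
# from stdin. The function name and the sample rows are illustrative only.
db_in_listing() {
    cut -d '|' -f 1 | grep -qw -- "$1"
}

sample_listing=' my_db     | my_user  | UTF8   | en_US.UTF8 | en_US.UTF8 |
 postgres  | postgres | LATIN1 | en_US      | en_US      |
 template0 | postgres | LATIN1 | en_US      | en_US      | =c/postgres          +
           |          |        |            |            | postgres=CTc/postgres'

# "my_db" is a whole word in the first column, so this branch runs.
if printf '%s\n' "$sample_listing" | db_in_listing my_db; then
    echo "my_db exists"
fi

# "temp" is only part of "template0", so grep -w does not match it.
if ! printf '%s\n' "$sample_listing" | db_in_listing temp; then
    echo "temp does not exist"
fi

# Against a real server: psql -lqt | db_in_listing "$DB_NAME"
```

Because cut keeps only the first column, a name such as my_user that appears only in the Owner column is not reported as a database.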
The following shell code seems to work for me:
if [ "$( psql -XtAc "SELECT 1 FROM pg_database WHERE datname='DB_NAME'" )" = '1' ]
then
echo "Database already exists"
else
echo "Database does not exist"
fi
Quick help about the psql flags given above:
General options:
-c, --command=COMMAND run only single command (SQL or internal) and exit
-X, --no-psqlrc do not read startup file (~/.psqlrc)
Output format options:
-A, --no-align unaligned table output mode
-t, --tuples-only print rows only
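Interpolating the database name straight into the SQL string breaks for names that contain a single quote. A minimal sketch of one way around that, using a psql variable; the function name db_exists is hypothetical, and connecting to the postgres maintenance database is an assumption:

```shell
# Hypothetical helper: succeed if a database named $1 exists.
# psql does not expand :variables inside -c strings, so the query is fed
# on stdin; :'dbname' expands to a correctly quoted SQL literal.
# Assumes the "postgres" maintenance database is reachable.
db_exists() {
    result=$(printf '%s\n' \
        "SELECT 1 FROM pg_database WHERE datname = :'dbname';" |
        psql -XtA -v dbname="$1" -d postgres)
    [ "$result" = "1" ]
}

# Usage:
#   db_exists mydatabase || createdb mydatabase
```

The function returns a plain exit status, so it can be used directly in if statements or with || and &&.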
I'm new to postgresql, but the following command is what I used to check if a database exists
if psql ${DB_NAME} -c '\q' 2>&1; then
echo "database ${DB_NAME} exists"
fi
postgres@desktop:~$ psql -l | grep <exact_dbname> | wc -l
This will return 1 if the database specified exists or 0 otherwise.
Also, if you try to create a database that already exists, postgresql will return an error message like this:
postgres@desktop:~$ createdb template1
createdb: database creation failed: ERROR: database "template1" already exists
You can create a database, if it doesn't already exist, using this method:
if [[ -z `psql -Atqc '\list mydatabase' postgres` ]]; then createdb mydatabase; fi
I'm combining the other answers to a succinct and POSIX compatible form:
psql -lqtA | grep -q "^$DB_NAME|"
A return of true (0) means it exists.
If you suspect your database name might have a non-standard character such as $, you need a slightly longer approach:
psql -lqtA | cut -d\| -f1 | grep -qxF "$DB_NAME"
The -t and -A options make sure the output is raw and not "tabular" or whitespace-padded output. Columns are separated by the pipe character |, so either the cut or the grep has to recognize this. The first column contains the database name.
EDIT: grep with -x to prevent partial name matches.
#!/bin/sh
DB_NAME=hahahahahahaha
psql -U postgres ${DB_NAME} --command="SELECT version();" >/dev/null 2>&1
RESULT=$?
echo DATABASE=${DB_NAME} RESULT=${RESULT}
#
For completeness, another version using regex rather than string cutting:
psql -l | grep '^ exact_dbname\b'
So for instance:
if psql -l | grep '^ mydatabase\b' > /dev/null ; then
echo "Database exists already."
exit
fi
The other solutions (which are fantastic) miss the fact that psql can wait a minute or more before timing out if it can't connect to a host. So, I like this solution, which sets the timeout to 3 seconds:
PGCONNECT_TIMEOUT=3 psql development -h db -U postgres -c ""
This is for connecting to a development database on the official postgres Alpine Docker image.
Separately, if you're using Rails and want to set up a database if it doesn't already exist (as when launching a Docker container), this works well, as migrations are idempotent:
bundle exec rake db:migrate 2>/dev/null || bundle exec rake db:setup
kibibu's accepted answer is flawed in that grep -w will match any name containing the specified pattern as a word component.
i.e. If you look for "foo" then "foo-backup" is a match.
Otheus's answer provides some good improvements, and the short version will work correctly for most cases, but the longer of the two variants offered exhibits a similar problem with matching substrings.
To resolve this issue, we can use the POSIX -x argument to match only entire lines of the text.
Building on Otheus's answer, the new version looks like this:
psql -U "$USER" -lqtA | cut -d\| -f1 | grep -qFx "$DBNAME"
That all said, I'm inclined to say that Nicolas Grilly's answer -- where you actually ask postgres about the specific database -- is the best approach of all.
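The difference between -w and -Fx described above can be seen on a canned list of names, shaped like what psql -lqtA | cut -d\| -f1 would print. The names here are made up for illustration:

```shell
# Sketch on canned database names; the names are illustrative only.
names='foo-backup
postgres
template0'

# -w treats "-" as a word boundary, so "foo" matches inside "foo-backup".
if printf '%s\n' "$names" | grep -qw foo; then
    echo "grep -w: foo found (false positive)"
fi

# -F disables regex interpretation and -x requires a whole-line match.
if ! printf '%s\n' "$names" | grep -qFx foo; then
    echo "grep -Fx: foo not found"
fi
```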
psql -l|awk '{print $1}'|grep -w <database>
shorter version
I'm still pretty inexperienced with shell programming, so if this is really wrong for some reason, vote me down, but don't be too alarmed.
Building from kibibu's answer:
# If resulting string is not zero-length (not empty) then...
if [[ ! -z `psql -lqt | cut -d \| -f 1 | grep -w $DB_NAME` ]]; then
echo "Database $DB_NAME exists."
else
echo "No existing databases are named $DB_NAME."
fi
This command will return the number of databases that are called DATABASE_NAME: psql -At -U postgres -c "select count(*) from pg_database where datname = 'DATABASE_NAME';"
So
if [ "$(psql -At -U postgres -c "select count(*) from pg_database where datname = 'DATABASE_NAME';")" -eq 0 ] ; then
# This runs if the DB doesn't exist.
fi
In one line:
PGPASSWORD=mypassword psql -U postgres@hostname -h postgres.hostname.com -tAc 'select 1' -d dbname || echo 0
This returns 1 if the database exists and 0 if it does not.
Or, more readable:
if [ "$(PGPASSWORD=mypassword psql -U postgres@hostname -h postgres.hostname.com -tAc 'select 1' -d dbname || echo 0 )" = '1' ]
then
echo "Database already exists"
else
echo "Database does not exist"
fi
Trigger a division by zero if the database doesn't exist, then check psql's exit status like this:
sql="SELECT 1/count(*) FROM pg_database WHERE datname='db_name'";
# psql -c exits non-zero when the query raises an error
if psql -h host -U user -c "$sql" postgres >/dev/null 2>&1
then
echo "exists";
else
echo "doesn't exist";
fi