A container is a database server. How to ask it's Dockerfile to complete its construction after that container has started? - postgresql

I am using a postgis/postgis Docker image to set a database server for my application.
The database server must have a tablespace created, then a database.
Then each time another application will start from another container, it will run a Liquibase script that will update the database schema (create tables, index...) when needed.
On a terminal, to prepare the database container, I'm running these commands :
# Run a naked Postgis container
sudo docker run --name ecoemploi-postgis
-e POSTGRES_PASSWORD=postgres
-d -v /data/comptes-france:/data/comptes-france postgis/postgis
# Send 'bash level' commands to create the directory for the tablespace
sudo docker exec -it ecoemploi-postgis
bin/sh -c 'mkdir /tablespace && chown postgres:postgres /tablespace'
Then to complete my step 1, I have to run SQL statements to create the tablespace in a PostGIS point of view, and create the database by a CREATE DATABASE.
I connect myself, manually, under the psql of my container :
sudo docker exec -it ecoemploi-postgis bin/sh
-c 'exec psql -h "$POSTGRES_PORT_5432_TCP_ADDR"
-p "$POSTGRES_PORT_5432_TCP_PORT" -U postgres'
And I run manally these commands :
CREATE TABLESPACE data LOCATION '/tablespace';
CREATE DATABASE comptesfrance TABLESPACE data;
exit
But I would like to have a container created from a single Dockerfile having done all the needed work. The difficulty is that it has to be done in two parts :
One before the container is started. (creating directories, granting them user:group).
One after it is started for the first time : declaring the tablespace and creating the base. If I understand well the base image I took, it should be done after an entrypoint docker-entrypoint.sh has been run ?
What is the good way to write a Dockerfile creating a container having done all these steps ?

The PostGIS image "is based on the official postgres image", so it should be able to use the /docker-entrypoint-initdb.d mechanism. Any files you put in that directory will be run the first time the database container is started. The postgis Dockerfile already uses this directory to install the PostGIS extensions into the default database.
That means you can put your build-time setup directly into the Dockerfile, and copy the startup-time script into that directory.
FROM postgis/postgis:12-3.0
RUN mkdir /tablespace && chown postgres:postgres /tablespace
COPY createdb.sql /docker-entrypoint-initdb.d/20-createdb.sql
# Use default ENTRYPOINT/CMD from base image
For the particular setup you describe, this may not be necessary. Each database runs in an isolated filesystem space and starts with an empty data directory, so there's not a specific need to create an alternate data directory; Docker style is to just run multiple databases if you need isolated storage. Similarly, the base postgres image will create a database for you at first start (named by the POSTGRES_DB environment variable).

In order to run a container, your Dockerfile must be functional and completed.
you must enter the queries in a bash file and in the last line you have to enter an ENTRYPOINT with this bash script

Related

Backup PostgreSQL database running in a container on Digital Ocean

I'm a newbie to Docker, I'm working on a project that written by another developer, the project is running on Digital Ocean Ubuntu 18.04. It consists of 2 containers (1 container for Django App, 1 container for PostgreSQL database).
It is required from me now to get a backup of database, I found a bash file written by previous programmer:
### Create a database backup.
###
### Usage:
### $ docker-compose -f <environment>.yml (exec |run --rm) postgres backup
set -o errexit
set -o pipefail
set -o nounset
working_dir="$(dirname ${0})"
source "${working_dir}/_sourced/constants.sh"
source "${working_dir}/_sourced/messages.sh"
message_welcome "Backing up the '${POSTGRES_DB}' database..."
if [[ "${POSTGRES_USER}" == "postgres" ]]; then
message_error "Backing up as 'postgres' user is not supported. Assign 'POSTGRES_USER' env with another one and try again."
exit 1
fi
export PGHOST="${POSTGRES_HOST}"
export PGPORT="${POSTGRES_PORT}"
export PGUSER="${POSTGRES_USER}"
export PGPASSWORD="${POSTGRES_PASSWORD}"
export PGDATABASE="${POSTGRES_DB}"
backup_filename="${BACKUP_FILE_PREFIX}_$(date +'%Y_%m_%dT%H_%M_%S').sql.gz"
pg_dump | gzip > "${BACKUP_DIR_PATH}/${backup_filename}"
message_success "'${POSTGRES_DB}' database backup '${backup_filename}' has been created and placed in '${BACKUP_DIR_PATH}'."
my first question is:
is that command right? i mean if i ran:
docker-compose -f production.yml (exec |run --rm) postgres backup
Would that create a backup for my database at the written location?
Second question: can I run this command while the database container is running? or should I run docker-compose down then run the command for backup then run docker-compose up again.
Sure you can run that script to backup, one way to do it is executing a shell in container with docker-compose exec db /bin/bash and then run that script.
Other way is running a new postgres container attached to the postgres container network:
docker run -it --name pgback -v /path/backup/host:/var/lib/postgresql/data --network composeNetwork postgres /bin/bash
this will create a new postgres container attached to the network created with compose, with a binding volume attached, then you can create this script in the container and back up the database to the volume to save it to other place out of container.
Then when you want to backup simple start docker container and backup:
docker start -a -i pgback
You dont need to create other compose file, just copy the script to he container and run it, also you could create a new postgres image with the script and run it from CMD, just to run the container, there are plenty ways to do it.

how to restore postgres database in docker when docker container not start?

I want to create a database in PostgreSQL and restore a backup in a docker container. I am able to create the database and run the docker container, and then run the pg_restore to restore the backup.
My Dockerfile is :
FROM postgres:latest
ENV POSTGRES_USER postgres
ENV POSTGRES_PASSWORD 123qwe
ENV POSTGRES_DB docker_pg
COPY createTable.sql /docker-entrypoint-initdb.d/
VOLUME /var/lib/postgresql/data
Then I run the command for restore the backup :
docker exec -i 0d96d6b59d74 pg_restore -U postgres -d docker_pg< backup_latest.sql
It is working fine.
But my requirement is when I run the command for create the docker container database creation and restore the backup both work done in same time, mean at the time of container creation.
How can I do this?
But my requirement is when I run the command for create the docker
container database creation and restore the backup both work done in
same time, mean at the time of container creation.
Both tasks can be performed by the Docker container all you need to place the restore script in the docker-entrypoint-initdb.d folder.
COPY createTable.sql /docker-entrypoint-initdb.d/a_createTable.sql
COPY backup_latest.sql /docker-entrypoint-initdb.d/
As I changed createTable.sql to a_createTable.sql, so it will first create Table and then it will restore the backup.
These initialization files will be executed in sorted name order as
defined by the current locale
Initialization scripts
Or the other option is to create single SQL file and the order will be
ALL DDL
# then
ALL DML
so something like
COPY db.sql /docker-entrypoint-initdb.d/

Wait service postgres start dockerfile

I'm using docker file to build ubuntu image have install postgresql. But I can't wait for service postgres status OK
FROM ubuntu:18.04
....
RUN apt-get update && apt-get install -y postgresql-11
RUN service postgresql start
RUN su postgres
RUN psql
RUN CREATE USER kong; CREATE DATABASE kong OWNER kong;
RUN \q
RUN exit
Everything seem okay, but RUN su postgres will throw error because service postgresql not yet started after RUN service postgresql start. How can I do that?
First thing, each RUN command in Dockerfile run in a separate shell and RUN command should be used for installation or configuration not for starting the process. The process should be started at CMD or entrypoint.
RUN vs CMD
Better to use offical postgress docker image.
docker run --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword -d postgres
The default postgres user and database are created in the entrypoint with initdb.
or you can build your image based on postgress.
FROM postgres:11
ENV POSTGRES_USER=kong
ENV POSTGRES_PASSWORD=example
COPY seed.sql /docker-entrypoint-initdb.d/seed.sql
This will create an image with user, password and also the entrypoint will insert seed data as well when container started.
POSTGRES_USER
This optional environment variable is used in conjunction with
POSTGRES_PASSWORD to set a user and its password. This variable will
create the specified user with superuser power and a database with the
same name. If it is not specified, then the default user of postgres
will be used.
Some advantage with offical docker image
Create DB from ENV
Create DB user from ENV
Start container with seed data
Will wait for postgres to be up and running
All you need
# Use postgres/example user/password credentials
version: '3.1'
services:
db:
image: postgres
restart: always
environment:
POSTGRES_PASSWORD: example
Initialization scripts
If you would like to do additional initialization in an image derived
from this one, add one or more *.sql, *.sql.gz, or *.sh scripts under
/docker-entrypoint-initdb.d (creating the directory if necessary).
After the entrypoint calls initdb to create the default postgres user
and database, it will run any *.sql files, run any executable *.sh
scripts, and source any non-executable *.sh scripts found in that
directory to do further initialization before starting the service.

How to copy and use existing postgres data folder into docker postgres container

I want to build postgres docker container for testing some issue.
I have:
Archived folder of postgres files(/var/lib/postgres/data/)
Dockerfile that place folder into doccker postgres:latest.
I want:
Docker image that reset self-state after recreate image.
Container that have database state based on passed into the container postgres files
I don't want to wait for a long time operation of backup and restore existing database in /docker-entrypoint-initdb.d initialization script.
I DON'T WANT TO USE VOLUMES because I don't need to store new data between restart (That's why this post is different from How to use a PostgreSQL container with existing data?. In that post volumes are used)
My suggestion is to copy postgres files(/var/lib/postgres/data/) from host machine into docker's /var/lib/postgres/data/ in build phase.
But postgres docker replace this files when initdb phase is executing.
How to ask Postgres docker not overriding database files?
e.g.
Dockerfile
FROM postgres:latest
COPY ./postgres-data.tar.gz /opt/pg-data/
WORKDIR /opt/pg-data
RUN tar -xzf postgres-data.tar.gz
RUN mv ./data/ /var/lib/postgresql/data/pg-data/
Run command
docker run -p 5432:5432 -e PGDATA=/var/lib/postgresql/data/pg-data --name database-immage1 database-docker
If you don't really need to create a custom image with the database snapshot you could use volumes. Un-tar the database files somewhere on the host say ~/pgdata then run the image. Example:
docker run -v ~/pgdata:/var/lib/postgresql/data/ -p 5432:5432 postgres:9.5
The files must be compatible with the postgres version of the image so use the same image version as the archived database.
If, instead, you must recreate the image you don't need to uncompress the database archive. The ADD instruction will do
that for you. Make sure the tar does not contain any leading directory.
The Dockerfile:
FROM postgres:latest
ADD ./postgres-data.tar.gz /var/lib/postgresql/data/
Build it:
docker build . -t database-docker
Run without overriding the environment variable PGDATA. Note that you copy the files in /var/lib/postgresql/data but the PGDATA points to /var/lib/postgresql/data/pg-data.
Run the container:
docker run -p 5432:5432 --name database-image1 database-docker

Build postgres docker container with initial schema

I'm looking to build dockerfiles that represent company databases that already exist. Similarly, I'd like create a docker file that starts by restoring a psql dump.
I have my psql_dump.sql in the . directory.
FROM postgres
ADD . /init_data
run "createdb" "--template=template0" "my_database"
run "psql" "-d" "my_database" --command="create role my_admin superuser"
run "psql" "my_database" "<" "init_data/psql_dump.sql"
I thought this would be good enough to do it. I'd like to avoid solutions that use a .sh script. Like this solution.
I use template0 since the psql documentation says you need the same users created that were in the original database, and you need to create the database with template0 before you restore.
However, it gives me an error:
createdb: could not connect to database template1: could not connect to server: No such file or directory
Is the server running locally and accepting
I'm also using docker compose for the overall application, if solving this problem in docker-compose is better, I'd be happy to use the base psql image and use docker compose to do this.
According to the usage guide for the official PostreSQL Docker image, all you need is:
Dockerfile
FROM postgres
ENV POSTGRES_DB my_database
COPY psql_dump.sql /docker-entrypoint-initdb.d/
The POSTGRES_DB environment variable will instruct the container to create a my_database schema on first run.
And any .sql file found in the /docker-entrypoint-initdb.d/ of the container will be executed.
If you want to execute .sh scripts, you can also provide them in the /docker-entrypoint-initdb.d/ directory.
As said in the comments, #Thomasleveil answer is great and simple if your schema recreation is fast.
But in my case it's slow, and I wanted to use docker volumes, so here is what I did
First use docker image as in #Thomasleveil answer to create a container with postgres with all the schema initialization
Dockerfile:
FROM postgres
WORKDIR /docker-entrypoint-initdb.d
ADD psql_dump.sql /docker-entrypoint-initdb.d
EXPOSE 5432
then run it and create new local dir which contains the postgres data after its populated from the “psql_dump.sql” file: docker cp mypg:/var/lib/postgresql/data ./postgres-data
Copy the data to a temp data folder, and start a new postgres docker-compose container whose volume is at the new temp data folder:
startPostgres.sh:
rm -r ./temp-postgres-data/data
mkdir -p ./temp-postgres-data/data
cp -r ./postgres-data/data ./temp-postgres-data/
docker-compose -p mini-postgres-project up
and the docker-compose.yml file is:
version: '3'
services:
postgres:
container_name: mini-postgres
image: postgres:9.5
ports:
- "5432:5432"
volumes:
- ./temp-postgres-data/data:/var/lib/postgresql/data
Now you can run steps #1 and #2 on a new machine or if your psql_dump.sql changes. And each time you want a new clean (but already initialized) db, you can only run startPostgres.sh from step #3.
And it still uses docker volumes.
#Thomasleveil's answer will re-create the database schema at runtime, which is fine for most cases.
If you want to recreate the database schema at buildtime (i.e. if your schema initialization is really slow) you can invoke the stock docker_entrypoint.sh from within your Dockerfile.
However, since the docker_entrypoint.sh is designed to start a long-running database server, you have to add an extra script to exit the process after database initialization but before booting the long-running server.
Dockerfile (with build time database initialization)
# STAGE 1 - Equivalent to #Thomasleveil
FROM postgres AS runtime_init
ENV POSTGRES_DB my_database
COPY 1-psql_dump.sql /docker-entrypoint-initdb.d/
# STAGE 2 - Initialize the database during the build
FROM runtime_init AS buildtime_init_builder
RUN echo "exit 0" > /docker-entrypoint-initdb.d/100-exit_before_boot.sh
ENV PGDATA=/pgdata
RUN docker-entrypoint.sh postgres
# STAGE 3 - Copy the initialized db to a new image to reduce size.
FROM postgres AS buildtime_init
ENV PGDATA=/pgdata
COPY --chown=postgres:postgres --from=buildtime_init_builder /pgdata /pgdata
Important Notes
The stock postgres image will run initialization scripts in alphabetical order, so ensure that your database restoration scripts appear earlier than the exit_before_boot.sh script created in the Dockerfile.
This is demonstrated by the 1 and 100 prefixes shown above. Modify them to your liking.
Database updates to a running instance of this image will not be persisted across reboots since the PGDATA path where the database files are stored no longer maps to a volume mounted from the host machine.
Further Reading
Instructions from the authors of the official postgres image about writing your own custom_entrypoint.sh. This is arguably the more "official" way to solve this problem, but I personally find my approach easier to understand and implement.
A demo of this concept for PostgreSQL 9, which uses the --help flag to exit the docker-entrypoint.sh before the long-running server boots. Unfortunately, this no longer works as of December 3, 2019
Two discussions (1) (2) of this same question from the official docker postgres repository.