docker compose: postgresql create db, user pass and grant permission - postgresql

I have the following docker-compose file:
version: '3'
services:
  web:
    build:
      context: ./django_httpd_mod_wsgi
    ports:
      - "8000:80"
  db:
    build:
      context: ./postgresql
    volumes:
      - db-data:/var/lib/postgres/data
volumes:
  db-data:
I am building the postgresql image from archlinux.
The following is my postgresql Dockerfile:
FROM archlinux/base
RUN yes | pacman -S postgresql
RUN mkdir /run/postgresql/
RUN chown -R postgres:postgres /run/postgresql/
USER postgres
RUN initdb -D /var/lib/postgres/data
RUN psql -c 'CREATE DATABASE btgapp;'
RUN psql -c "CREATE USER simha WITH PASSWORD 'krishna';"
RUN psql -c 'GRANT ALL PRIVILEGES ON DATABASE btgapp TO simha;'
CMD ["/usr/bin/postgres","-D","/var/lib/postgres/data"]
When I try to run:
docker-compose up
I get the error:
psql: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/run/postgresql/.s.PGSQL.5432"?
ERROR: Service 'db' failed to build: The command '/bin/sh -c psql -c 'CREATE DATABASE dbname;'' returned a non-zero code: 2
I understand that I have to run psql -c 'CREATE DATABASE dbname;' after starting the postgresql server with /usr/bin/postgres -D /var/lib/postgres/data
But I cannot start multiple commands in a Dockerfile. So how can I do this?
One option is to start a script. But then it will be difficult to see postgres running as a single process.

Based on the comments, I will try to answer here.
I believe that you should go with the postgres 11-alpine image. And I will try to explain why here.
Official docker images come with a number of benefits that you should always consider before starting your own.
Upgrade path is easy - when a new revision of the application wrapped in the image is released, the official docker image will in most cases be updated along with it. Usually the changes respect the configuration conventions that the image has established, such as environment variables and startup specifics, so that users can simply change the tag in their stacks and upgrade. There may of course be breaking changes - always check this.
Large user base - when images like postgres have been downloaded more than 10 million times (2019), this does not only mean that the image is popular; it also means that it has been exercised in a great many real setups. Most elementary bugs have been weeded out already, and you will likely have an easy time with the image.
Optimized for size and performance - attention has usually been paid to a lot of details, minimizing the size of the image and maximizing performance. Many projects publish their applications on a few different linux distros. Take postgres - they publish Debian-based and Alpine-based images. The Alpine image is the smaller one, while the Debian one is slightly larger but gives you access to the vast Debian package repositories if you need extra packages installed.
Easy configuration - maintainers of the official images usually understand the use cases of their user base very well. And they try to make our lives as developers and admins easier (god bless them). Official images usually have some pretty good documentation sitting right on their docker hub landing page, or a link to a github repo where the README.md covers common use cases. I find that these instructions are worth a good read from top to bottom.
I understand that you want to keep the image small, but what do you know - the postgres project has got your usecase covered.
The latest alpine postgres image tagged 11-alpine has a compressed footprint of 28 MB and decompressed of 70MB. While the archlinux/base image that you want to start off with has compressed base footprint of 153MB and a decompressed size of 445MB. And that's before you introduce postgres itself.
On top of that, the database and user that you want created on startup can be handled with environment variables alone in the official postgres image. Like this:
docker run -d --name some-postgres \
-e POSTGRES_PASSWORD=mysecretpassword \
-e POSTGRES_USER=simha \
-e POSTGRES_DB=btgapp \
postgres:11-alpine
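Since the question uses docker-compose, the same settings can also live in the compose file instead of a docker run command. The following is a minimal sketch under a few assumptions, not a tested setup: it writes a docker-compose.yml that replaces the custom Arch build with postgres:11-alpine, reuses the user, password and database name from the question's Dockerfile, and mounts the db-data volume at /var/lib/postgresql/data, which is where the official image keeps its data.

```shell
#!/bin/sh
set -eu

# Write a compose file that uses the official image (sketch; adjust to your project).
cat > docker-compose.yml <<'EOF'
version: '3'
services:
  web:
    build:
      context: ./django_httpd_mod_wsgi
    ports:
      - "8000:80"
  db:
    image: postgres:11-alpine
    environment:
      POSTGRES_USER: simha
      POSTGRES_PASSWORD: krishna
      POSTGRES_DB: btgapp
    volumes:
      # The official image stores its data here, not in /var/lib/postgres/data.
      - db-data:/var/lib/postgresql/data
volumes:
  db-data:
EOF

echo "Wrote docker-compose.yml; start the stack with: docker-compose up"
```

The environment variables are only applied on the first start, when the data directory is still empty, so an existing db-data volume would keep whatever it was initialized with.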
If that does not cover the initialization that you need for your database, then you can copy .sql scripts (and .sh scripts) into a special location in the image - and they will be executed on startup. For this you can extend their image like this:
init-user-db.sh
#!/bin/bash
set -e
psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-EOSQL
CREATE USER simha;
CREATE DATABASE btgapp;
GRANT ALL PRIVILEGES ON DATABASE btgapp TO simha;
EOSQL
And then with a Dockerfile like this:
Dockerfile
FROM postgres:11-alpine
COPY ./init-user-db.sh /docker-entrypoint-initdb.d/init-user-db.sh
(This is taken from the postgres description on docker hub)
In closing - I would recommend that you do not prioritize the distro that an image is based on over the usability and maintainability. Docker enables us to run applications in containers without really caring too much about what distro is inside the container. It's all linux anyway. At the end of the day, I expect that you want a stable postgres database container like me. This is what I get with the official postgres image.
I hope I helped you evaluate your options on this.

Related

A container is a database server. How to ask its Dockerfile to complete its construction after that container has started?

I am using a postgis/postgis Docker image to set a database server for my application.
The database server must have a tablespace created, then a database.
Then each time another application will start from another container, it will run a Liquibase script that will update the database schema (create tables, index...) when needed.
On a terminal, to prepare the database container, I'm running these commands :
# Run a naked Postgis container
sudo docker run --name ecoemploi-postgis \
-e POSTGRES_PASSWORD=postgres \
-d -v /data/comptes-france:/data/comptes-france postgis/postgis
# Send 'bash level' commands to create the directory for the tablespace
sudo docker exec -it ecoemploi-postgis \
/bin/sh -c 'mkdir /tablespace && chown postgres:postgres /tablespace'
Then, to complete my step 1, I have to run SQL statements to create the tablespace on the PostGIS side, and create the database with CREATE DATABASE.
I connect manually to psql inside my container:
sudo docker exec -it ecoemploi-postgis /bin/sh \
-c 'exec psql -h "$POSTGRES_PORT_5432_TCP_ADDR" \
-p "$POSTGRES_PORT_5432_TCP_PORT" -U postgres'
And I run these commands manually:
CREATE TABLESPACE data LOCATION '/tablespace';
CREATE DATABASE comptesfrance TABLESPACE data;
exit
But I would like to have a container created from a single Dockerfile that has done all the needed work. The difficulty is that it has to be done in two parts:
One before the container is started (creating directories and assigning them to user:group).
One after it is started for the first time: declaring the tablespace and creating the database. If I understand the base image correctly, this should be done after its entrypoint docker-entrypoint.sh has run?
What is a good way to write a Dockerfile that creates a container with all these steps done?
The PostGIS image "is based on the official postgres image", so it should be able to use the /docker-entrypoint-initdb.d mechanism. Any files you put in that directory will be run the first time the database container is started. The postgis Dockerfile already uses this directory to install the PostGIS extensions into the default database.
That means you can put your build-time setup directly into the Dockerfile, and copy the startup-time script into that directory.
FROM postgis/postgis:12-3.0
RUN mkdir /tablespace && chown postgres:postgres /tablespace
COPY createdb.sql /docker-entrypoint-initdb.d/20-createdb.sql
# Use default ENTRYPOINT/CMD from base image
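The answer does not show createdb.sql itself. A minimal sketch, assuming it simply holds the two statements that the question ran by hand in psql, could be generated like this (the image tag in the final message is hypothetical):

```shell
#!/bin/sh
set -eu

# Create the init script that the Dockerfile copies into /docker-entrypoint-initdb.d.
# Assumption: it contains exactly the two statements from the question.
cat > createdb.sql <<'EOF'
CREATE TABLESPACE data LOCATION '/tablespace';
CREATE DATABASE comptesfrance TABLESPACE data;
EOF

echo "Wrote createdb.sql; build with: docker build -t ecoemploi-postgis ."
```

The /tablespace directory already exists at that point because the RUN line in the Dockerfile creates it at build time; the SQL file only runs on the first start of a container with an empty data directory.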
For the particular setup you describe, this may not be necessary. Each database runs in an isolated filesystem space and starts with an empty data directory, so there's not a specific need to create an alternate data directory; Docker style is to just run multiple databases if you need isolated storage. Similarly, the base postgres image will create a database for you at first start (named by the POSTGRES_DB environment variable).
In order to run a container, your Dockerfile must be functional and complete.
You must put the queries in a bash script, and on the last line of the Dockerfile set an ENTRYPOINT that runs this script.

How can I persist my data in a docker/postgres container?

I know there are probably many ways to do this. What I am looking for is a way to do it using (preferably) only my Dockerfile and one container.
Here is my current Dockerfile:
FROM postgres:latest
ENV POSTGRES_USER=myuser
ENV POSTGRES_PASSWORD=mypassword
Here is the command I used to build this container:
docker build -t my_db .
And here is the command that I use to run the container:
docker run -p 5432:5432 my_db
What I would like to do is have the data stored in the container if possible, but I don't seem to understand how or where postgres stores its data. I saw in another Stack Overflow post that postgres stores it by default in /var/lib/postgresql/data; however, when I look in that folder I see nothing. I can, however, verify that postgres is running, because I am using a client called teamSQL and from that client I can create tables and insert/read data.
I can also verify that when I stop the container and restart it, the data is definitely not persisted.
Note: this is running on OS X, but I don't think that is relevant.
You should use Docker volumes, so that when you stop your container the data persists on the host machine, and when you start the container again the data is mounted into it:
docker volume create pgdata
docker run -p 5432:5432 -v pgdata:/var/lib/postgresql/data my_db

Creating a running Postgres service inside a docker container

I'm a bit new to Docker.
I have two containers running using docker-compose.
One is the API and the other is the actual application.
I want to add a new DB container using the Postgres official image.
It's a bit hard to find a simple tutorial on how to create the container and populate it with a predefined sql file (of schemas and data).
When I start with "CMD /etc/init.d/postgresql start" in the Dockerfile I get an error saying: "No PostgreSQL clusters exist; see "man pg_createcluster" ... (warning)."
Since it takes me too much time to get things going I was wondering if it might be better to get an Ubuntu image and install Postgres on my own since there is only one source on how to use the image - docker hub, and I don't seem to understand it that well.
Any ideas or simple steps on how to compose and 'configure' this image?
If you want to populate your database from a file, a simple way to do this is described in the image documentation:
How to extend this image
If you would like to do additional initialization in an image derived
from this one, add one or more *.sql, *.sql.gz, or *.sh scripts under
/docker-entrypoint-initdb.d (creating the directory if necessary).
After the entrypoint calls initdb to create the default postgres user
and database, it will run any *.sql files and source any *.sh scripts
found in that directory to do further initialization before starting
the service.
Dockerfile
FROM postgres:alpine
COPY init.sql /docker-entrypoint-initdb.d/init.sql
docker-compose.yml
version: '3'
services:
  app:
    # your app definition
  postgres:
    build: .
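The answer leaves the contents of init.sql open. As a rough sketch only, with a hypothetical items table standing in for the real schema and data, the file could be created next to the Dockerfile like this:

```shell
#!/bin/sh
set -eu

# Write a small init.sql next to the Dockerfile (sketch; the table is hypothetical).
cat > init.sql <<'EOF'
-- Runs once, on the first start with an empty data directory.
CREATE TABLE items (
    id   SERIAL PRIMARY KEY,
    name TEXT NOT NULL
);
INSERT INTO items (name) VALUES ('first'), ('second');
EOF

echo "Wrote init.sql; build and start with: docker-compose up --build"
```

Note that recent versions of the official image refuse to start unless POSTGRES_PASSWORD (or an explicit auth method) is set, so the postgres service would most likely also need an environment entry for it.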
Pull the postgres image
docker pull postgres:14.2
Create the service with the below command
docker service create --name postgres --network my_overlay --env "POSTGRES_PASSWORD=password" --publish 5432:5432 postgres:14.2
Try to connect using userName as postgres and password as password to the default postgres db.
jdbc:postgresql://127.0.0.1:5432/postgres // JDBC connection

Build postgres docker container with initial schema

I'm looking to build Dockerfiles that represent company databases that already exist. Similarly, I'd like to create a Dockerfile that starts by restoring a psql dump.
I have my psql_dump.sql in the . directory.
FROM postgres
ADD . /init_data
run "createdb" "--template=template0" "my_database"
run "psql" "-d" "my_database" --command="create role my_admin superuser"
run "psql" "my_database" "<" "init_data/psql_dump.sql"
I thought this would be good enough to do it. I'd like to avoid solutions that use a .sh script. Like this solution.
I use template0 since the psql documentation says you need the same users created that were in the original database, and you need to create the database with template0 before you restore.
However, it gives me an error:
createdb: could not connect to database template1: could not connect to server: No such file or directory
Is the server running locally and accepting
I'm also using docker compose for the overall application, if solving this problem in docker-compose is better, I'd be happy to use the base psql image and use docker compose to do this.
According to the usage guide for the official PostgreSQL Docker image, all you need is:
Dockerfile
FROM postgres
ENV POSTGRES_DB my_database
COPY psql_dump.sql /docker-entrypoint-initdb.d/
The POSTGRES_DB environment variable will instruct the container to create a database named my_database on first run.
And any .sql file found in the /docker-entrypoint-initdb.d/ directory of the container will be executed during that first-run initialization.
If you want to execute .sh scripts, you can also provide them in the /docker-entrypoint-initdb.d/ directory.
As said in the comments, @Thomasleveil's answer is great and simple if your schema recreation is fast.
But in my case it's slow, and I wanted to use docker volumes, so here is what I did:
1. First, use a docker image as in @Thomasleveil's answer to create a postgres container with all the schema initialization.
Dockerfile:
FROM postgres
WORKDIR /docker-entrypoint-initdb.d
ADD psql_dump.sql /docker-entrypoint-initdb.d
EXPOSE 5432
2. Then run it, and create a new local directory that contains the postgres data after it has been populated from the “psql_dump.sql” file: docker cp mypg:/var/lib/postgresql/data ./postgres-data
3. Copy the data to a temp data folder, and start a new postgres docker-compose container whose volume is at the new temp data folder:
startPostgres.sh:
rm -rf ./temp-postgres-data/data
mkdir -p ./temp-postgres-data/data
cp -r ./postgres-data/data ./temp-postgres-data/
docker-compose -p mini-postgres-project up
and the docker-compose.yml file is:
version: '3'
services:
  postgres:
    container_name: mini-postgres
    image: postgres:9.5
    ports:
      - "5432:5432"
    volumes:
      - ./temp-postgres-data/data:/var/lib/postgresql/data
Now you can run steps #1 and #2 on a new machine or whenever your psql_dump.sql changes. And each time you want a new clean (but already initialized) db, you only need to run startPostgres.sh from step #3.
And it still uses docker volumes.
@Thomasleveil's answer will re-create the database schema at runtime, which is fine for most cases.
If you want to recreate the database schema at build time (i.e. if your schema initialization is really slow), you can invoke the stock docker-entrypoint.sh from within your Dockerfile.
However, since docker-entrypoint.sh is designed to start a long-running database server, you have to add an extra script to exit the process after database initialization but before booting the long-running server.
Dockerfile (with build time database initialization)
# STAGE 1 - Equivalent to @Thomasleveil's answer
FROM postgres AS runtime_init
ENV POSTGRES_DB my_database
COPY 1-psql_dump.sql /docker-entrypoint-initdb.d/
# STAGE 2 - Initialize the database during the build
FROM runtime_init AS buildtime_init_builder
RUN echo "exit 0" > /docker-entrypoint-initdb.d/100-exit_before_boot.sh
ENV PGDATA=/pgdata
RUN docker-entrypoint.sh postgres
# STAGE 3 - Copy the initialized db to a new image to reduce size.
FROM postgres AS buildtime_init
ENV PGDATA=/pgdata
COPY --chown=postgres:postgres --from=buildtime_init_builder /pgdata /pgdata
Important Notes
The stock postgres image will run initialization scripts in alphabetical order, so ensure that your database restoration scripts appear earlier than the exit_before_boot.sh script created in the Dockerfile.
This is demonstrated by the 1 and 100 prefixes shown above. Modify them to your liking.
Database updates to a running instance of this image will not be persisted across reboots since the PGDATA path where the database files are stored no longer maps to a volume mounted from the host machine.
Further Reading
Instructions from the authors of the official postgres image about writing your own custom_entrypoint.sh. This is arguably the more "official" way to solve this problem, but I personally find my approach easier to understand and implement.
A demo of this concept for PostgreSQL 9, which uses the --help flag to exit the docker-entrypoint.sh before the long-running server boots. Unfortunately, this no longer works as of December 3, 2019
Two discussions (1) (2) of this same question from the official docker postgres repository.

Why doesn't postgres official docker repo start db service at build time?

For context, see https://github.com/docker-library/postgres (github repo) and https://registry.hub.docker.com/_/postgres/ (docker hub).
It can be seen that the database is started by ENTRYPOINT and CMD, using the bash script
/docker-entrypoint.sh
with
ENTRYPOINT ["/docker-entrypoint.sh"]
EXPOSE 5432
CMD ["postgres"]
Another script hook provided to change the database is
/docker-entrypoint-initdb.d
which means the database starts (and psql can connect to it) only at runtime, when the docker run command is typed in.
This causes a problem: we cannot customize the database at build time, before it runs, for example to add extensions and populate the db with data.
Of course, it could be done at run time. But that has the disadvantage of repeating the operation every time the image is run.
So, what is the logic behind this design from the docker or postgres perspective? How could I add extensions and populate data at build time?
If you were to customize (create, populate with data) a database at build time, the database data would be written into the docker image filesystem itself (as one cannot mount a volume at build time).
The issue with that is that the docker image filesystem is a special layered one (AUFS, btrfs, etc.) which does not deliver good I/O performance for data-intensive applications such as a database server.
As a consequence, you want your data written on a volume instead of on the docker container filesystem. As you don't know at build time which volume will be used at run time, and as there is no means to mount volumes at build time anyway, no one should create a database at build time.
Furthermore, if you take a close look at the Dockerfile of the official PostgreSQL image, you will see that there is a VOLUME instruction that makes the path at which the data is written a volume. That means that the image is designed so that the data will never hit the docker container filesystem.
If you take a look at the Dockerfiles for other databases or data-intensive applications, you will notice that they generally operate in this manner. Another reason for that is that it is accepted as good practice to make your docker containers immutable.
If you want to install additional modules in your image, that is fine as long as those do not depend on data that would be written on a volume, and as long as you make sure to declare a volume for any path they would write data to.
tl;dr
Application code/binary → docker image filesystem
Application data → docker volume
This is right from the docker page for the postgres image (library/postgres):
If you would like to do additional initialization in an image derived from this one, add a *.sql or *.sh script under /docker-entrypoint-initdb.d (creating the directory if necessary). After the entrypoint calls initdb to create the default postgres user and database, it will run any *.sql files and source any *.sh script found in that directory to do further initialization before starting the service.
You can also extend the image with a simple Dockerfile to set the locale. The following example will set the default locale to de_DE.utf8:
FROM postgres:9.4
RUN localedef -i de_DE -c -f UTF-8 -A /usr/share/locale/locale.alias de_DE.UTF-8
ENV LANG de_DE.utf8
Since database initialization only happens on container startup, this allows us to set the language before it is created.
You can extend an image just as the example from the docs that I pasted above shows. You can also use the exec command to execute virtually anything within the container right from your host machine. It took me a little while to get used to it, and I continue to discover things as I play with it more and more.
UPDATE:
sudo docker run --name some-postgres -v ~/PATH/TO/some-postgres/data:/var/lib/postgresql/data -p 127.0.0.1:5432:5432 -e POSTGRES_PASSWORD=test -d postgres