Dockerfile: Postgres initdb does not create default files in data dir - postgresql

Still getting my Docker sea legs: I am trying to create a container for Postgres 14 on Alpine Linux.
This is my Dockerfile so far:
FROM alpine:3.15.5
EXPOSE 5432
# update repo, install postgres 14
RUN apk update
RUN apk add gcc make
RUN apk add postgresql14 postgis
# data dir
RUN mkdir /var/lib/postgresql/data
RUN chmod 0700 /var/lib/postgresql/data
RUN chown postgres:postgres /var/lib/postgresql/data
VOLUME /var/lib/postgresql/data
# create db cluster as postgres user
USER postgres:postgres
RUN initdb -D /var/lib/postgresql/data
# temp
ENTRYPOINT [ "top" ]
The issue I am running into: when I build and run (docker-compose up --build), the initdb command runs without errors and its output makes sense, but afterwards the /var/lib/postgresql/data directory is empty. It should contain all the default Postgres configs and database files.
The weird thing is that if I attach a shell to the running container and run initdb -D /var/lib/postgresql/data, it works... 🤯
Can you please tell me what I am doing wrong or missing?
Here is the docker-compose.yml as well for total coverage:
version: '3.9'
services:
  postgres:
    build: .
    ports:
      - "5432:5432"
    volumes:
      - ./postgres:/var/lib/postgresql/data

As @β.εηοιτ.βε has pointed out:
This happens because you are overriding its content with your
mount: - ./postgres:/var/lib/postgresql/data. I have explained the
docker-compose up process and its implication with mounts in this answer.
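One way to avoid depending on build-time contents of a directory that later gets mounted over is to create the cluster when the container starts. A minimal sketch, assuming a POSIX shell; the function name init_cluster_if_needed and the INITDB_BIN override are hypothetical and exist only for illustration:

```shell
#!/bin/sh
# Sketch: create the cluster at container start, when any bind mount is
# already in place, instead of during `docker build`.

# INITDB_BIN is a hypothetical override so the function can be tried
# without PostgreSQL installed; it defaults to the real initdb.
INITDB_BIN="${INITDB_BIN:-initdb}"

init_cluster_if_needed() {
  datadir="$1"
  # PG_VERSION exists and is non-empty once a cluster has been created.
  if [ -s "$datadir/PG_VERSION" ]; then
    echo "cluster already present in $datadir"
  else
    "$INITDB_BIN" -D "$datadir" && echo "cluster initialized in $datadir"
  fi
}
```

In the Dockerfile from the question, a script like this could take the place of the RUN initdb step: call init_cluster_if_needed /var/lib/postgresql/data from an entrypoint script and then exec postgres -D /var/lib/postgresql/data.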

Related

Create Postgres Docker Image with Database

How to Create Postgres Docker Image with Data?
I have this folder/file structure:
- initdb
  - 01-createSchema.sql
  - 02-createData.sql
- Dockerfile
The Dockerfile:
FROM postgres:13.5-bullseye
ENV POSTGRES_USER postgres
ENV POSTGRES_PASSWORD PASSWORD
ENV POSTGRES_DB mydatabase
COPY ./initdb/*.sql /docker-entrypoint-initdb.d/
ENTRYPOINT ["docker-entrypoint.sh"]
EXPOSE 5432
CMD ["postgres"]
I can build my-database image:
docker build . -t me/my-database
Then start a container built on the image:
docker run --name my-db -p 5432:5432 -d me/my-database
When I connect to the database, I can find my tables with my data.
So far so good.
But this is not exactly what I want, because my database is built the first time I start my container (with the docker run command).
What I want is an image in which the database has already been built, so that starting the container requires no further database creation (which takes a few minutes in my case).
Something like this 'Dockerfile':
FROM postgres:13.5-bullseye
ENV POSTGRES_USER postgres
ENV POSTGRES_PASSWORD PASSWORD
ENV POSTGRES_DB mydatabase
COPY ./initdb/*.sql /docker-entrypoint-initdb.d/
## The tricky part, I could not figure out how to do:
BUILD DATABASE
REMOVE /docker-entrypoint-initdb.d/*.sql
##
ENTRYPOINT ["docker-entrypoint.sh"]
EXPOSE 5432
CMD ["postgres"]
How can I build such a pre-built database image?
The proper way to persist data in docker is to create an empty volume that will store the database files created by the container.
docker volume create pgdata
When running your image, you need to mount the volume at the path in the container that holds your database data. By default, the official Postgres image uses /var/lib/postgresql/data.
docker run -d \
  --name db \
  -v pgdata:/var/lib/postgresql/data \
  me/my-database
When the pgdata volume is empty, the container's entrypoint script will initialize the database. The next time you run the image, the entrypoint script will determine that the directory already contains data provided from the volume, and it won't attempt to re-initialize the database or run any of the scripts located in /docker-entrypoint-initdb.d, so you do not need to remove this directory.

Receiving an error from a docker-compose that the user must own the data directory

Every time I try to build my image, I get the following error:
The server must be started by the user that owns the data directory.
The following is my docker-compose file:
version: "3.7"
services:
  db:
    image: postgres
    container_name: xxxxxxxxxxxx
    volumes:
      - ./postgres-data:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: $POSTGRES_DB
      POSTGRES_USER: $POSTGRES_USER
      POSTGRES_PASSWORD: $POSTGRES_PASSWORD
  nginx:
    image: nginx:latest
    restart: always
    container_name: xxxxxxxxxxxx-nginx
    volumes:
      - ./deployment/nginx:/etc/nginx
    logging:
      driver: none
    depends_on: ["radio"]
    ports:
      - 8080:80
      - 8081:443
  radio:
    build:
      context: .
      dockerfile: "./deployment/Dockerfile"
    image: test-radio
    command: './manage.py runserver 0:3000'
    container_name: xxxxxxxxxxxxxxx
    restart: always
    depends_on: ["db"]
    volumes:
      - type: bind
        source: ./api
        target: /app/api
      - type: bind
        source: ./xxxxxx
        target: /app/xxxxx
    environment:
      POSTGRES_DB: $POSTGRES_DB
      POSTGRES_USER: $POSTGRES_USER
      POSTGRES_PASSWORD: $POSTGRES_PASSWORD
      POSTGRES_HOST: $POSTGRES_HOST
      AWS_KEY_ID: $AWS_KEY_ID
      AWS_ACCESS_KEY: $AWS_ACCESS_KEY
      AWS_S3_BUCKET_NAME: $AWS_S3_BUCKET_NAME
networks:
  default:
The image is built with the following run.sh file:
#!/usr/bin/env sh
if [ ! -f .pass ]; then
    openssl rand -base64 32 > .pass
fi
#export POSTGRES_DB="xxxxxxxxxxxxxxxxx"
#export POSTGRES_USER="xxxxxxxxxxxxxx"
#export POSTGRES_PASSWORD="xxxxxxxxxxxxxxxxxxxx"
#export POSTGRES_HOST="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export POSTGRES_DB="xxxxxxxxxxxxxxxxxx"
export POSTGRES_USER="xxxxxxxxxxxxxxxxxxxx"
export POSTGRES_PASSWORD="`cat .pass`"
export POSTGRES_HOST="db"
export AWS_KEY_ID="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export AWS_ACCESS_KEY="xxxxxxxxxxxxxxxxxxxxxxxxx"
export AWS_S3_BUCKET_NAME=""
echo "Your psql password is in .pass do not commit this file."
echo "The app will be available on localhost:8080 shortly"
if [ -z "$1" ]; then
    docker-compose up
else
    docker-compose up $1
fi
I'm wondering if my error is being caused by attempting to use a bash script to deploy the service on a Windows machine?
Details on the issue
The behavior observed by the OP definitely comes from a UID/GID mismatch, given that the specification
volumes:
  - ./postgres-data:/var/lib/postgresql/data
(which can be viewed as the docker-compose equivalent of docker run -v "$PWD/postgres-data:/var/lib/postgresql/data" …) bind-mounts the $PWD/postgres-data folder inside the container, giving access to its files as is (including owner/group metadata).
Also, note that the handling of owner/group metadata between host and containers only relies on the numeric UID and GID, not on the owner and group names.
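To look at the numeric ids that actually matter here, rather than the names, the owner of the bind-mounted folder can be printed directly. A small sketch, assuming GNU or BusyBox stat (the -c option does not exist in BSD/macOS stat); owner_ids is a hypothetical helper name:

```shell
# Print the numeric owner and group of a path as UID:GID.
owner_ids() {
  stat -c '%u:%g' "$1"
}
```

Running owner_ids ./postgres-data on the host and comparing the result with the ids of the postgres user inside the container (for example via docker run --rm postgres id postgres) shows whether the two sides agree.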
For more information about UIDs and GIDs in a Docker context, see also that article on Medium.
Workarounds if the bind-mount is necessary
For completeness, several possible solutions to work around the bind-mount UID-mismatch issue (including the most straightforward one, which consists in changing the files' UID) are described in this answer on StackOverflow:
How to have host and container read/write the same files with Docker?
Other solutions
Following @ParanoidPenguin's comment, you may want to use a named volume, which mainly consists in using:
the docker volume command
and/or the docker run option -v …:….
Remarks:
docker run -v PATH1:PATH2 … triggers a bind-mount of PATH1 (host) to PATH2 (container) if and only if PATH1 is absolute (i.e., starts with a /) (e.g., -v "$PWD:$PWD" is a common idiom)
docker run -v NAME:PATH2 … mounts volume NAME to PATH2 (container) if and only if NAME does not contain any / (i.e., matches regexp [a-zA-Z0-9][a-zA-Z0-9_.-]).
even if we don't run docker volume create foo beforehand by hand, docker run -v foo:/data --rm -it debian will create the named volume foo if need be.
in order to populate the files of a named volume (or, respectively, back them up) you can use an ephemeral container of image debian, ubuntu or so, combining a bind-mount and a volume mount at the same time:
Add a file /home/user/bar.txt in a new volume foo
file1=/home/user/bar.txt # initial file
uid=2000 # target User-ID in the volume
gid=2000 # target Group-ID in the volume
docker pull debian
docker run -v "$file1:$file1:ro" -v foo:/data \
  -e file1="$file1" -e uid="$uid" -e gid="$gid" \
  --rm -it debian bash -exc \
  'cp -v -- "$file1" /data/bar.txt && chown -v $uid:$gid /data/bar.txt'
docker volume ls
Backup the foo volume in a tarball
date=$(date +'%Y%m%d_%H%M%S')
back="backup_$date.tar.gz"
destdir=/home/user/backup
mkdir -p "$destdir"
docker run -v foo:/data -v "$destdir:/backup" -e back="$back" \
  --rm -it debian bash -exc 'tar cvzf "/backup/$back" /data'
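The two rules from the remarks above, about when the source part of -v is treated as a host path and when as a volume name, can be sketched as a small classifier. This is a simplification for illustration only; mount_kind is a hypothetical helper, not something Docker provides:

```shell
# Classify the SOURCE part of `docker run -v SOURCE:TARGET`,
# following the two rules stated in the remarks (simplified).
mount_kind() {
  case "$1" in
    /*)  echo "bind-mount" ;;         # absolute host path
    */*) echo "not-a-volume-name" ;;  # contains a slash but is not absolute
    *)   echo "named-volume" ;;       # no slash at all
  esac
}
```

For example, mount_kind foo falls in the named-volume case and mount_kind /home/user/backup in the bind-mount case. A source such as ./postgres-data lands in the middle case for docker run; in a docker-compose file the same string is resolved relative to the compose file and becomes a bind-mount, which is what happens in the question.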

Add custom config location to Docker Postgres image preserving its access parameters

I have written a Dockerfile like this:
FROM postgres:11.2-alpine
ADD ./db/postgresql.conf /etc/postgresql/postgresql.conf
CMD ["-c", "config_file=/etc/postgresql/postgresql.conf"]
It just adds custom config location to a generic Postgres image.
Now I have the following docker-compose service description
db:
  build:
    context: .
    dockerfile: ./db/Dockerfile
  environment:
    POSTGRES_PASSWORD: passwordhere
    POSTGRES_USER: user
    POSTGRES_DB: db_name
  ports:
    - 5432:5432
  volumes:
    - ./run/db-data:/var/lib/db/data
The problem is I can no longer remotely connect to DB using these credentials if I add this Config option. Without that CMD line it works just fine.
If I prepend "postgres" in CMD it has the same effect due to the underlying script prepending it itself.
Provided all the files are where they need to be, I believe the only problem with your setup is that you've omitted an actual executable from the CMD -- specifying just the option. You need to actually run postgres:
CMD ["postgres", "-c", "config_file=/etc/postgresql/postgresql.conf"]
That should work!
EDIT in response to OP's first comment below
First, I did confirm that behavior doesn't change whether "postgres" is in the CMD or not. It's exactly as you said. Onward!
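The reason "postgres" can be left out of CMD is that the image's entrypoint script prepends it when the first argument looks like an option. A simplified sketch of that kind of check (this is not the actual script; normalize_cmd is a hypothetical name used for illustration):

```shell
# If the first argument starts with "-", treat the arguments as options
# for the server and put "postgres" in front of them.
normalize_cmd() {
  case "${1:-}" in
    -*) set -- postgres "$@" ;;
  esac
  # Print the command line that would end up being executed.
  printf '%s\n' "$*"
}
```

With this logic, normalize_cmd -c config_file=/etc/postgresql/postgresql.conf and normalize_cmd postgres -c config_file=/etc/postgresql/postgresql.conf produce the same command line, which matches the observation that both CMD variants behave identically.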
Then I thought there must be a problem with the particular postgresql.conf in use. If we could just figure out what the default file is.. turns out we can!
How to get the existing postgresql.conf out of the postgres image
1. Create docker-compose.yml with the following contents:
version: "3"
services:
  db:
    image: postgres:11.2-alpine
    environment:
      - POSTGRES_PASSWORD=passwordhere
      - POSTGRES_USER=user
      - POSTGRES_DB=db_name
    ports:
      - 5432:5432
    volumes:
      - ./run/db-data:/var/lib/db/data
2. Spin up the service using
$ docker-compose run --rm --name=postgres db
3. In another terminal get the location of the file used in this release:
$ docker exec -it postgres psql --dbname=db_name --username=user --command="SHOW config_file"
               config_file
------------------------------------------
 /var/lib/postgresql/data/postgresql.conf
(1 row)
4. View the contents of default postgresql.conf
$ docker exec -it postgres cat /var/lib/postgresql/data/postgresql.conf
5. Replace local config file
Now all we have to do is replace the local config file ./db/postgresql.conf with the contents of the known-working-state config and modify it as necessary.
Database objects are only created once!
Database objects are only created once by the postgres container (source). So when developing the database parameters we have to remove them to make sure we're in a clean state.
Here's a nuclear (be careful!) option to
(1) force-remove all Docker containers, running or stopped, and then
(2) remove all Docker volumes not attached to containers:
$ docker rm $(docker ps -a -q) -f && docker volume prune -f
So now we can be sure to start from a clean state!
Final setup
Let's bring our Dockerfile back into the picture (just like you have in the question).
docker-compose.yml
version: "3"
services:
  db:
    build:
      context: .
      dockerfile: ./db/Dockerfile
    environment:
      - POSTGRES_PASSWORD=passwordhere
      - POSTGRES_USER=user
      - POSTGRES_DB=db_name
    ports:
      - 5432:5432
    volumes:
      - ./run/db-data:/var/lib/db/data
Connect to the db
Now all we have to do is build from a clean state.
# ensure all volumes are deleted (see above)
$ docker-compose build
$ docker-compose run --rm --name=postgres db
We can now (still) connect to the database:
$ docker exec -it postgres psql --dbname=db_name --username=user --command="SELECT COUNT(1) FROM pg_database WHERE datname='db_name'"
Finally, we can edit the postgresql.conf from a known working state.
As per this other discussion, your CMD command only has arguments and is missing a command. Try:
CMD ["postgres", "-c", "config_file=/etc/postgresql/postgresql.conf"]

Permission issue with PostgreSQL in docker container

I'm trying to run a docker image with PostgreSQL that has a volume configured for persisting data.
docker-compose.yml
version: '3.1'
services:
  db:
    image: postgres
    restart: always
    volumes:
      - ./data:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: example
When I start the container I see the output
fixing permissions on existing directory /var/lib/postgresql/data ... ok
and the data folder is no longer readable for me.
If I elevate myself and access the data directory I can see that the files are there. Furthermore, the command ls -ld data gives me
drwx------ 19 systemd-coredump root 4096 May 17 16:22 data
I can manually set the directory permission with sudo chmod 755 data, but that only works until I restart the container.
Why does this happen, and how can I fix it?
The other answer indeed points to the root cause of the problem, however the help page it points to does not contain a solution. Here is what I came up with to make this work for me:
1. Start the container using your normal docker-compose file; this creates the directory with the hardcoded uid:gid (999:999):
version: '3.7'
services:
  db:
    image: postgres
    container_name: postgres
    volumes:
      - ./data:/var/lib/postgresql/data
    environment:
      POSTGRES_USER: fake_database_user
      POSTGRES_PASSWORD: fake_database_PASSWORD
2. Stop the container and manually change the ownership to the uid:gid you want (I'll use 1000:1000 for this example):
$ docker stop postgres
$ sudo chown -R 1000:1000 ./data
3. Edit your docker-compose file to add your desired uid:gid and start it up again using docker-compose (notice the user: key):
version: '3.7'
services:
  db:
    image: postgres
    container_name: postgres
    volumes:
      - ./data:/var/lib/postgresql/data
    user: 1000:1000
    environment:
      POSTGRES_USER: fake_database_user
      POSTGRES_PASSWORD: fake_database_password
The reason you can't just use user: from the start is that if the image runs as a different user it fails to create the data files.
On the image documentation page, it does mention a solution to add a volume to expose the /etc/passwd file as read-only in the image when providing --user option, however, that did not work for me with the latest image, as I was getting the following error. In fact none of the three proposed solutions worked for me.
initdb: error: could not change permissions of directory "/var/lib/postgresql/data": Operation not permitted
This is because of what is written in the dockerfile of the postgres image.
From line 15 to 18, you'll see that the group 999 and the user 999 are used. I'm guessing that in your host, they map respectively to systemd-coredump and root.
You need to know that file ownership is stored as a numeric uid/gid: whenever an image uses a user/group whose uid/gid also exists on your host, host tools such as ls will show the files under the host's names for those ids.
You can read the documentation on the docker hub from the postgres image here. There is a section Arbitrary --user Notes that explain how it works in the context of this image.
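To check which account a given numeric uid corresponds to on your host, you can query the user database directly. A small sketch, assuming a Linux host where getent is available; name_for_uid is a hypothetical helper name:

```shell
# Print the login name that the host associates with a numeric uid.
# Prints nothing if the host has no account with that uid.
name_for_uid() {
  getent passwd "$1" | cut -d: -f1
}
```

On the asker's host, name_for_uid 999 would presumably print systemd-coredump, in line with the ls -ld output shown in the question.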
An easier and permanent solution would be as follows:
Add these lines to ~/.bashrc:
# bash already defines UID as a read-only variable, so it only needs exporting
export UID
export GID=$(id -g)
Reload your shell:
$ source ~/.bashrc
Modify your docker-compose.yml as follows:
version: "3.7"
services:
  db:
    image: postgres
    volumes:
      - ./tmp/db:/var/lib/postgresql/data
    user: "${UID}:${GID}"
    ...
Source
Here's what I did:
services:
  postgres:
    image: postgres:15.1
    restart: always
    environment:
      - POSTGRES_USER=my_user
      - POSTGRES_PASSWORD=my_user
      - POSTGRES_DB=my_user
    user: root
    ports:
      - "5432:5432"
    volumes:
      - /home/my_user/volumes/postgres/data:/var/lib/postgresql/data
      - /home/my_user/volumes/postgres/config:/etc/postgresql
  postgres_setup:
    image: postgres:15.1
    user: root
    volumes:
      - /home/my_user/volumes/postgres/data:/var/lib/postgresql/data
      - /home/my_user/volumes/postgres/config:/etc/postgresql
    entrypoint: [ "bash", "-c", "chmod 750 -R /var/lib/postgresql/data && chmod 750 -R /etc/postgresql"]
    depends_on:
      - postgres
  pgadmin4:
    image: dpage/pgadmin4
    restart: always
    environment:
      - PGADMIN_DEFAULT_EMAIL=my_user@admin.com
      - PGADMIN_DEFAULT_PASSWORD=my_user
      - PGADMIN_LISTEN_ADDRESS=0.0.0.0
    user: root
    ports:
      - "5050:80"
    volumes:
      - /home/my_user/volumes/pgadmin/data:/var/lib/pgadmin
    depends_on:
      - postgres_setup
The postgres_setup container just changes permissions and then shuts down.
I have been struggling with a similar issue and the answer hit me when trying to work around postgres (static uid per container, configured or 70 by default on alpine, 999 on standard image), and docker limitations (no uid translation of volumes).
The answer is to utilize Linux ACL without any changes to docker-compose.yml user - just keep the default internal container user id.
mkdir -p ./data
sudo setfacl -m u:$(id -u):rwx -R ./data/
docker-compose up -d
or
docker-compose up -d
sudo setfacl -m u:$(id -u):rwx -R ./data/
The order in which the data directory is created does not matter: as long as the ACL is set after the directory exists, you as a user will be able to access it recursively. You can of course add additional permissions.
To check who has access to data folder simply run:
getfacl ./data

Howto pass POSTGRES_USER env variable when using docker-compose .yml for docker swarm

I'm starting a docker swarm with a PostgreSQL image.
I want to create a user named 'numbers' on that database.
This is my docker-compose file. The .env file contains POSTGRES_USER and POSTGRES_PASSWORD. If I ssh into the container hosting the postgres image, I can see the variables when executing env.
But psql --user numbers tells me that role "numbers" does not exist.
How should I pass the POSTGRES_* vars so that the correct user is created?
version: '3'
services:
  postgres:
    image: 'postgres:9.5'
    env_file:
      - ./.env
    ports:
      - '5432:5432'
    volumes:
      - 'postgres:/var/lib/postgresql/data'
    deploy:
      replicas: 1
    networks:
      - default
    restart: always
By contrast, this creates the PostgreSQL user as expected:
$ docker run --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword -e POSTGRES_USER=numbers -d postgres
When the image's entrypoint script finds the data directory already initialized, it does not run the initialization step again. This is the check:
if [ ! -s "$PGDATA/PG_VERSION" ]; then
    ....
So I recommend that you either create that user manually or start from scratch (removing your volume, if you can afford losing the data). From the command line:
docker volume ls
docker volume rm <id>
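If you would rather keep the data and create the user by hand, the statement to run can be generated with a small helper. This is a sketch under assumptions: create_role_sql is a hypothetical function, it performs no escaping (so it only suits simple names and passwords without quote characters), and it creates an ordinary login role, whereas the user created through POSTGRES_USER is a superuser:

```shell
# Print a CREATE ROLE statement for the given role name and password.
# $1 = role name, $2 = password (no escaping is performed).
create_role_sql() {
  printf "CREATE ROLE \"%s\" WITH LOGIN PASSWORD '%s';\n" "$1" "$2"
}
```

The output can then be piped into psql inside the running container, for example create_role_sql numbers "$POSTGRES_PASSWORD" | docker exec -i <container> psql -U <existing superuser>, where both placeholders depend on your setup.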