How to Create Postgres Docker Image with Data?
I have this folder/file structure:
- initdb
- 01-createSchema.sql
- 02-createData.sql
- Dockerfile
The Dockerfile:
FROM postgres:13.5-bullseye
ENV POSTGRES_USER postgres
ENV POSTGRES_PASSWORD PASSWORD
ENV POSTGRES_DB mydatabase
COPY ./initdb/*.sql /docker-entrypoint-initdb.d/
ENTRYPOINT ["docker-entrypoint.sh"]
EXPOSE 5432
CMD ["postgres"]
I can build the my-database image:
docker build . -t me/my-database
Then start a container based on the image:
docker run --name my-db -p 5432:5432 -d me/my-database
When I connect to the database, I can find my tables with my data.
So far so good.
But this is not exactly what I want, because my database is built the first time I start my container (with the docker run command).
What I want is an image that already contains the built database, so that when I start the container, no further database creation (which takes a few minutes in my case) is needed.
Anything like this 'Dockerfile':
FROM postgres:13.5-bullseye
ENV POSTGRES_USER postgres
ENV POSTGRES_PASSWORD PASSWORD
ENV POSTGRES_DB mydatabase
COPY ./initdb/*.sql /docker-entrypoint-initdb.d/
## The tricky part, I could not figure out how to do:
BUILD DATABASE
REMOVE /docker-entrypoint-initdb.d/*.sql
##
ENTRYPOINT ["docker-entrypoint.sh"]
EXPOSE 5432
CMD ["postgres"]
How can I build my pre-built-database image?
The proper way to persist data in docker is to create an empty volume that will store the database files created by the container.
docker volume create pgdata
When running your image, you need to mount the volume at the path in the container that holds the database data. By default, the official Postgres image uses /var/lib/postgresql/data.
docker run -d \
--name db \
-v pgdata:/var/lib/postgresql/data \
me/my-database
When the pgdata volume is empty, the container's entrypoint script will initialize the database. The next time you run the image, the entrypoint script will see that the directory already contains data from the volume, so it won't attempt to re-initialize the database or run any of the scripts in /docker-entrypoint-initdb.d. You therefore do not need to remove that directory.
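For example, a minimal sketch of a second run against the same volume (names as above):
docker stop db && docker rm db
docker run -d \
  --name db \
  -v pgdata:/var/lib/postgresql/data \
  me/my-database
# starts quickly: the entrypoint sees PG_VERSION in the volume
# and skips initdb and the /docker-entrypoint-initdb.d scripts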
Related
Still getting my Docker sea legs, I am trying to create a container for Postgres 14 on Alpine Linux.
This is my Dockerfile so far:
FROM alpine:3.15.5
EXPOSE 5432
# update repo, install postgres 14
RUN apk update
RUN apk add gcc make
RUN apk add postgresql14 postgis
# data dir
RUN mkdir /var/lib/postgresql/data
RUN chmod 0700 /var/lib/postgresql/data
RUN chown postgres:postgres /var/lib/postgresql/data
VOLUME /var/lib/postgresql/data
# create db cluster as postgres user
USER postgres:postgres
RUN initdb -D /var/lib/postgresql/data
# temp
ENTRYPOINT [ "top" ]
The issue I am running into is that when I build and run (docker-compose up --build), the initdb command completes without errors and its output makes sense, yet there is no data in the /var/lib/postgresql/data dir, which should contain all the default Postgres configs and db files.
The weird thing is, if I attach a shell to the running container and run initdb -D /var/lib/postgresql/data, it works... 🤯
Can you please tell me what I am doing wrong or missing?
Here is the docker-compose.yml as well for total coverage:
version: '3.9'
services:
  postgres:
    build: .
    ports:
      - "5432:5432"
    volumes:
      - ./postgres:/var/lib/postgresql/data
As @β.εηοιτ.βε pointed out:
This happens because you are overriding the content of it with your mount: - ./postgres:/var/lib/postgresql/data. I have explained the docker-compose up process and its implications with mounts in this answer.
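A minimal sketch of the compose file with the overriding bind mount removed, so that the image's own /var/lib/postgresql/data is no longer hidden at run time:
version: '3.9'
services:
  postgres:
    build: .
    ports:
      - "5432:5432"
    # no mount on /var/lib/postgresql/data, so the path inside
    # the container is not shadowed by a host directory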
I have written a Dockerfile like this:
FROM postgres:11.2-alpine
ADD ./db/postgresql.conf /etc/postgresql/postgresql.conf
CMD ["-c", "config_file=/etc/postgresql/postgresql.conf"]
It just adds custom config location to a generic Postgres image.
Now I have the following docker-compose service description
db:
  build:
    context: .
    dockerfile: ./db/Dockerfile
  environment:
    POSTGRES_PASSWORD: passwordhere
    POSTGRES_USER: user
    POSTGRES_DB: db_name
  ports:
    - 5432:5432
  volumes:
    - ./run/db-data:/var/lib/db/data
The problem is that I can no longer remotely connect to the DB using these credentials after adding this config option. Without that CMD line it works just fine.
If I prepend "postgres" in the CMD it has the same effect, since the underlying script prepends it itself.
Provided all the files are where they need to be, I believe the only problem with your setup is that you've omitted the actual executable from the CMD, specifying just its options. You need to actually run postgres:
CMD ["postgres", "-c", "config_file=/etc/postgresql/postgresql.conf"]
That should work!
EDIT in response to OP's first comment below
First, I did confirm that behavior doesn't change whether "postgres" is in the CMD or not. It's exactly as you said. Onward!
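For context, the official image's docker-entrypoint.sh contains logic roughly like the following, which is why prepending "postgres" yourself changes nothing:
# if the first argument looks like a flag, assume the user
# wants to run the postgres server and prepend the command
if [ "${1:0:1}" = '-' ]; then
  set -- postgres "$@"
fi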
Then I thought there must be a problem with the particular postgresql.conf in use. If we could just figure out what the default file is... it turns out we can!
How to get the existing postgresql.conf out of the postgres image
1. Create docker-compose.yml with the following contents:
version: "3"
services:
  db:
    image: postgres:11.2-alpine
    environment:
      - POSTGRES_PASSWORD=passwordhere
      - POSTGRES_USER=user
      - POSTGRES_DB=db_name
    ports:
      - 5432:5432
    volumes:
      - ./run/db-data:/var/lib/db/data
2. Spin up the service using
$ docker-compose run --rm --name=postgres db
3. In another terminal get the location of the file used in this release:
$ docker exec -it postgres psql --dbname=db_name --username=user --command="SHOW config_file"
config_file
------------------------------------------
/var/lib/postgresql/data/postgresql.conf
(1 row)
4. View the contents of default postgresql.conf
$ docker exec -it postgres cat /var/lib/postgresql/data/postgresql.conf
5. Replace local config file
Now all we have to do is replace the local config file ./db/postgresql.conf with the contents of the known-working-state config and modify it as necessary.
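One way to grab the file in a single step might be docker cp (container name and paths as discovered above):
$ docker cp postgres:/var/lib/postgresql/data/postgresql.conf ./db/postgresql.conf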
Database objects are only created once!
Database objects are only created once by the postgres container (source). So when iterating on the database parameters, we have to remove them to make sure we start from a clean state.
Here's a nuclear (be careful!) option to
(1) force-remove all Docker containers, and then
(2) remove all Docker volumes not attached to containers:
$ docker rm $(docker ps -a -q) -f && docker volume prune -f
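A narrower alternative, scoped to just this Compose project, is:
$ docker-compose down --volumes
# stops the project's containers and removes its named and
# anonymous volumes, leaving unrelated volumes alone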
So now we can be sure to start from a clean state!
Final setup
Let's bring our Dockerfile back into the picture (just like you have in the question).
docker-compose.yml
version: "3"
services:
  db:
    build:
      context: .
      dockerfile: ./db/Dockerfile
    environment:
      - POSTGRES_PASSWORD=passwordhere
      - POSTGRES_USER=user
      - POSTGRES_DB=db_name
    ports:
      - 5432:5432
    volumes:
      - ./run/db-data:/var/lib/db/data
Connect to the db
Now all we have to do is build from a clean state.
# ensure all volumes are deleted (see above)
$ docker-compose build
$ docker-compose run --rm --name=postgres db
We can now (still) connect to the database:
$ docker exec -it postgres psql --dbname=db_name --username=user --command="SELECT COUNT(1) FROM pg_database WHERE datname='db_name'"
Finally, we can edit the postgresql.conf from a known working state.
As per this other discussion, your CMD only has arguments and is missing the command itself. Try:
CMD ["postgres", "-c", "config_file=/etc/postgresql/postgresql.conf"]
I have a docker-compose file for postgres that works as expected, and I'm able to access it from R; see the relevant content below. However, I also need an equivalent docker run command, but for some reason cannot get this to work. As far as I can tell, the commands and setup are equivalent. Any suggestions?
postgres:
  image: postgres
  environment:
    POSTGRES_USER: postgres
    POSTGRES_PASSWORD: postgres
    PGDATA: /var/lib/postgresql/data
  ports:
    - 5432:5432
  restart: always
  volumes:
    - ~/postgresql/data:/var/lib/postgresql/data
The docker run command I'm using is:
docker run -p 5432:5432 \
--name postgres \
-e POSTGRES_USER=postgres \
-e POSTGRES_PASSWORD=postgres \
-e PGDATA=/var/lib/postgresql/data \
-v ~/postgresql/data:/var/lib/postgresql/data \
-d postgres
EDIT 1: In both settings I'm trying to connect from another docker container/service. In the docker-compose setting the different services are described in one and the same yml file
EDIT 2: David's answer provided all the information I needed: create a docker network and reference that network in each docker run call. For those interested in a shell script that uses this setup to connect postgres, pgadmin4, and a data science container with R and Python, see the link below:
https://github.com/radiant-rstats/docker/blob/master/launch-rsm-msba-pg.sh
Docker Compose automatically creates a Docker network for you (one per Compose file). For inter-container DNS to work, you can't use the default Docker network, but any named network will work. So you need to add that bit of setup:
docker network create some-name # default options are fine
docker run --net some-name --name postgres ...
# will be accessible as "postgres" from other containers on
# the "some-name" network
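Putting it together with the run command from the question (my-r-image stands in for whatever R container you use):
docker network create r-net
docker run --net r-net --name postgres \
  -e POSTGRES_USER=postgres \
  -e POSTGRES_PASSWORD=postgres \
  -v ~/postgresql/data:/var/lib/postgresql/data \
  -d postgres
docker run --net r-net -d my-r-image
# from my-r-image, reach the database at host "postgres", port 5432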
I'm starting a docker swarm with a PostgreSQL image.
I want to create a user named 'numbers' on that database.
This is my docker-compose file. The .env file contains POSTGRES_USER and POSTGRES_PASSWORD. If I ssh into the container hosting the postgres image, I can see the variables when executing env.
But psql --user numbers tells me that role "numbers" does not exist.
How should I pass the POSTGRES_* vars so that the correct user is created?
version: '3'
services:
  postgres:
    image: 'postgres:9.5'
    env_file:
      - ./.env
    ports:
      - '5432:5432'
    volumes:
      - 'postgres:/var/lib/postgresql/data'
    deploy:
      replicas: 1
    networks:
      - default
    restart: always
This creates the postgresql user as expected.
$ docker run --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword -e POSTGRES_USER=numbers -d postgres
When Postgres finds its data directory already initialized, it does not run the initialization scripts. This is the check:
if [ ! -s "$PGDATA/PG_VERSION" ]; then
....
So I recommend you either create that user manually or start from scratch (removing your volume if you can afford to lose the data). From the command line:
docker volume ls
docker volume rm <id>
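If you keep the existing volume, a sketch of creating the role by hand (the container name, superuser name, and password below are placeholders for your own values):
docker exec -it <container> \
  psql -U postgres -c "CREATE USER numbers WITH PASSWORD 'secret';"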
Here is the image I am using.
I named it posgres_test
If I run the image individually
docker run -i -t -v="test_volume:/var/lib/postgresql" -p 5432:5432 posgres_test
I can access it with
psql -h 192.168.99.100 -p 5432 -U pguser -W pgdb
Or I can access it with my golang app
// host is set to postgres
db, err := sql.Open("postgres", "postgres://pguser:pguser@postgres:5432/pgdb")
// table test_db was created manually
rows, err := db.Query("SELECT name FROM test_db")
However if I use docker compose
docker-compose.yml
version: "2"
services:
  postgres:
    image: my_image/postgresql:9.3
    volumes:
      - test_volume:/var/lib/postgresql
    ports:
      - "5432:5432"
  web:
    image: my-golang-app4
    ports:
      - "8080:8080"
volumes:
  test_volume: {}
I get the following
pguser#pgdb ERROR: relation "test_db" does not exist at character 15
I know for sure test_db exists in test_volume, since
docker run -i -t -v="test_volume:/var/lib/postgresql" -p 5432:5432 posgres_test
psql -h 192.168.99.100 -p 5432 -U pguser -W pgdb
\dt
will show the table I created
But it seems like my app in docker compose cannot find it
Can someone help me out?
About your docker-compose file
First, I thought it was because you don't use the 'links' option to link your postgres container to the web container. Using links is good practice when you don't expose ports, but you do expose the postgres port.
If you want to use inheritance from the image you posted
Instead of this line:
image: my_image/postgresql:9.3
use:
build: docker/postgres
then create the docker/postgres path and place a Dockerfile there that inherits from the image you want.
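For example, docker/postgres/Dockerfile could be as small as this sketch (pin whatever tag you need):
FROM postgres:9.3
# add custom config files, init scripts, etc. here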
I always use shared volumes in docker-compose.yml like this:
.:/var/www/html
where . is my project path containing the code files.
Image I created to test this case
I don't have your full Docker file structure to reproduce the error and fix it, so I created a docker-compose file which should match your needs or help fix your issue:
version: '2'
services:
  web:
    build: docker/web
    ports:
      - "8080:8080"
    links:
      - dbpostgres
    volumes:
      - .:/var/www/html # I share my code, so I map this path
  dbpostgres:
    image: postgres
    volumes:
      - /private/var/lib/postgresql:/var/lib/postgresql
    ports:
      - "5432:5432"
    environment:
      POSTGRES_USER: pguser
      POSTGRES_PASSWORD: pguser
      POSTGRES_DB: pgdb
Notes:
- I recommend using the official postgres image.
- I left comments next to the relevant lines.
How I made the connection:
host=dbpostgres port=5432 dbname=pgdb user=pguser password=pguser
Because my web container now knows the host dbpostgres (the service name doubles as its DNS name), I connect them using links.
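For the Go app from the question, the connection string would then use the service name as the host; a sketch (sslmode=disable is an assumption for a non-TLS local setup):
db, err := sql.Open("postgres", "postgres://pguser:pguser@dbpostgres:5432/pgdb?sslmode=disable")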
If you need the database from an existing container
Just use docker cp to copy the database files to your local machine:
docker cp posgres_test:/var/lib/postgresql /private/var/lib/postgresql
where /private/var/lib/postgresql is a path on your local host.
You also need to change the db credentials in docker-compose back to your old credentials.
Do this before running docker-compose, because if the db doesn't exist, it will be created.
Any questions, let me know.
If the volume is external and already existed before the use of docker-compose, you should declare it external; otherwise docker compose will create a new volume with the project name as a prefix.
volumes:
  test_volume:
    external: true
Docs for external using compose v3 (mostly similar to v2): https://docs.docker.com/compose/compose-file/compose-file-v3/#external
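Applied to the compose file above, that would look roughly like:
version: "2"
services:
  postgres:
    image: my_image/postgresql:9.3
    volumes:
      - test_volume:/var/lib/postgresql
    ports:
      - "5432:5432"
volumes:
  test_volume:
    external: true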
I think it should be something like this for you:
docker run -itd -p 5432:5432 --name postgres_test -v /path/in/your/host:/path/in/your/container posgres_test
psql -h 192.168.99.100 -p 5432 -U pguser -W pgdb
Read the Docker docs (https://docs.docker.com/engine/tutorials/dockervolumes/) and watch tutorials (there is a Docker YouTube channel with great tutorials).