docker postgres, fail to map volume in windows - postgresql

I wish to store my persists data in my local D:\dockerData\postgres9.6. Below is my docker command
docker pull postgres
docker run -d -v /d/dockerData/postgres9.6:/var/lib/postgresql/data -p 5432:5432 postgres
It successful create a container and I can use pgAdmin to access and create database.
But I found out that there is no file in my D:\dockerData\postgres9.6. I exec bash into the container, there is at least 20+ files inside /var/lib/postgresql/data.
Anyone can point out which part goes wrong?

It depends what kind of Docker you are using on Windows:
Docker Toolbox with VirtualBox: only C:\Users\mylogin is shared by default. D:\ is not mounted.
Docker for Windows with HyperV: only C:\ is mounted by default. Make sure D:\ is a shared drive: see image

Related

Docker Postgres data host volume mapping

I'm trying to docker-containerize PostgreSQL server and this container will have many other applications as well. The need is that, PostgreSQL server data should be mapped to the host volume so that when container is stopped, we won't lose the data. Also that, the next time when we start the container, the same directory can be mapped again and postgres can use the old data. Below is the DOCKERFILE. Note that I'm using ubuntu 22.04 on the host.
FROM ubuntu:22.04
ENV DEBIAN_FRONTEND noninteractive
RUN apt install -y postgresql
ENTRYPOINT ["tail", "-f", "/dev/null"]
Docker image is built using the command
docker build -t pg_test .
and the container is run using the command
docker run --name test -v /home/me/data:/var/lib/postgresql/14/main pg_test
'/home/me/data' is the host directory which is empty where I want to map the postgres server data. '/var/lib/postgresql/14/main' is the directory inside the docker container where the postgres is supposed to store the data.
Once the docker container starts, I enter the docker container using the command
docker exec -it test bash
and once I'm inside, I'm trying to start the PostgreSQL service. But PostgreSQL fails to start as there is no data in '/var/lib/postgresql/14/main' directory. I understand that since I have mapped an empty host directory to '/var/lib/postgresql/14/main' directory, postgres doesn't have the files required to start.
I understand that I'm doing it the wrong way, but I couldn't find a way around it. Can anyone please help me to do this the right way, if there is one?
Any help would be appreciable.
You should use the postgres docker image, it will set up the db for you when you start the container, you can find instructions on https://hub.docker.com/_/postgres
If you must use a custom image, you will need to initialize the db yourself, usually by running initdb or whatever your system provides.
But really you should use the appropriate docker image, and if you need more services you start them in their own container and connect them to the postgres one

How to reconnect to same postgres database on Docker

I'm very new to using docker and I've created a postgres container using
docker run --name mytrainingdb -e POSTGRES_PASSWORD=mysecretpassword -d postgres. Then I connected to it with docker exec -it <container-id> bash and then psql.
Then I stop the container.
My query is, what do I do reconnect to the same database? I tried to run same docker run command, but it says the name 'mytrainingdb' is used, which means it is trying to create it afresh, which is not what I want. Hope my expectation is right, as in when I restart my laptop or resume work I can just restart the same container and my data/config would be preserved?
The documentation also mentions that we can link a host directory to volume of pg container to have the stored data accessible to us, but I'm ok with docker managing my storage for that database.
You will have error when you try to re-run the same command, because docker is trying to create a new container with same name as the previous one "mytrainingdb". If you close docker and reopen it you will still find your container , but its not running , you can start it again with docker start mytrainingdb or you can remove it with docker rm mytrainingdb .
However , dont restart docker because you want to create a new container with the same name! If you want to start a new container with the same name and your container is still running you can first stop it with docker stop mytrainingdb and docker rm mytrainingdb or you can just do docker rm -f mytrainingdb (this will remove you running container with force ) and then create a new container..
As for the volumes ,you just created one by default which is named is kind of hash , and its found at volumes/var/lib/docker/volumes/ .Because generally containers such PostgreSQL, or databases in general persists volumes. The volume gets created when running the container and is handy to save persistent data, whether you start the container with -v or not.
The volume you talked about in your question , is called mounted volume , is when you basically just bind a certain directory or file from the host (outside) to inside the container
docker run -v /hostdir:/containerdir in your case docker run -v /hostdir:/var/lib/postgresql/data
If you restart docker or your computer running containers won't be automatically restarted. You can start your container again with docker start mytrainingdb (related question), then connect with your docker exec command.
(one tip: instead of running bash, then psql, you can directly run psql, e.g. docker exec -it mytrainingdb psql --user postgres)
Your understanding of data persistence is correct, docker will manage the data and it will still be around.
From the postgres image documentation
There are several ways to store data used by applications that run in Docker containers. We encourage users of the postgres images to familiarize themselves with the options available, including:
Let Docker manage the storage of your database data by writing the database files to disk on the host system using its own internal volume management. This is the default and is easy and fairly transparent to the user. The downside is that the files may be hard to locate for tools and applications that run directly on the host system, i.e. outside containers.
You can add --rm argument so that whenever you stop the container manually, or container stops for any reasons (his task is done or it fails), it will remove that container.
In your case, you can use this:
docker run --name mytrainingdb --rm -e POSTGRES_PASSWORD=mysecretpassword -d postgres

Postgres volume mounting on WSL2 and Docker desktop: Permission Denied on PGDATA folder

There are some similar posts but this is specifically related to running Postgres with WSL2 backend on Docker desktop. WSL2 brings full Linux experience on Windows. Volumes can be mounted to both Windows and Linux file systems. But the best practice is to use Linux file system for performance reasons see docker documentation.
Performance is much higher when files are bind-mounted from the Linux filesystem, rather than remoted from the Windows host. Therefore avoid docker run -v /mnt/c/users:/users (where /mnt/c is mounted from Windows).
Instead, from a Linux shell use a command like docker run -v ~/my-project:/sources where ~ is expanded by the Linux shell to $HOME.
My WSL distro is Ubuntu 20.04 LTS. I'm bind mounting Postgres data directory to a directory on Linux filesystem and I'm also configuring the Postgres PGDATA to use a sub-directory because this is instructed on the official Docker image docs:
PGDATA
This optional variable can be used to define another location - like a subdirectory - for the database files. The default is /var/lib/postgresql/data. If the data volume you're using is a filesystem mountpoint (like with GCE persistent disks) or remote folder that cannot be chowned to the postgres user (like some NFS mounts), Postgres initdb recommends a subdirectory be created to contain the data.
So this is how I start Postgres with the volume mounting to WSL2 Ubuntu file system:
docker run -d \
--name some-postgres -e POSTGRES_PASSWORD=root \
-e PGDATA=/var/lib/postgresql/data/pgdata \
-v ~/custom/mount:/var/lib/postgresql/data \
postgres
I can exec into the running container and verify that the data folder exists and it's configured correctly:
Now from the host machine (WSL2 Linux) if I try to access that folder I get the permission denied:
I would appreciate if anyone can provide a solution. None of the existing posts worked to resolve the issue.
This has got nothing to do with PostgreSQL. Docker containers run as root and so any directory created by Docker will also belong to root.
When you attach to the container and list the directory under /var/lib/postgresql/data it shows postgres as the owner.
Check "Arbitrary --user Notes" section in the official documentation here
The second option "bind-mount /etc/passwd read-only from the host" worked for me.
Two things that were blocking us working with WSL2 on Windows were:
Folder c:\Program files\WindowsApps didn't have admin account listed as owner
McAfee was blocking the WSL. In order to disable blocking we had to remove following rule: Open McAfee -> Threat Prevention -> Show Advanced (button in Right upper corner) -> scroll down to Rules -> name of the rule is "Executing Subsystem for Linux"

How to copy and use existing postgres data folder into docker postgres container

I want to build postgres docker container for testing some issue.
I have:
Archived folder of postgres files(/var/lib/postgres/data/)
Dockerfile that place folder into doccker postgres:latest.
I want:
Docker image that reset self-state after recreate image.
Container that have database state based on passed into the container postgres files
I don't want to wait for a long time operation of backup and restore existing database in /docker-entrypoint-initdb.d initialization script.
I DON'T WANT TO USE VOLUMES because I don't need to store new data between restart (That's why this post is different from How to use a PostgreSQL container with existing data?. In that post volumes are used)
My suggestion is to copy postgres files(/var/lib/postgres/data/) from host machine into docker's /var/lib/postgres/data/ in build phase.
But postgres docker replace this files when initdb phase is executing.
How to ask Postgres docker not overriding database files?
e.g.
Dockerfile
FROM postgres:latest
COPY ./postgres-data.tar.gz /opt/pg-data/
WORKDIR /opt/pg-data
RUN tar -xzf postgres-data.tar.gz
RUN mv ./data/ /var/lib/postgresql/data/pg-data/
Run command
docker run -p 5432:5432 -e PGDATA=/var/lib/postgresql/data/pg-data --name database-immage1 database-docker
If you don't really need to create a custom image with the database snapshot you could use volumes. Un-tar the database files somewhere on the host say ~/pgdata then run the image. Example:
docker run -v ~/pgdata:/var/lib/postgresql/data/ -p 5432:5432 postgres:9.5
The files must be compatible with the postgres version of the image so use the same image version as the archived database.
If, instead, you must recreate the image you don't need to uncompress the database archive. The ADD instruction will do
that for you. Make sure the tar does not contain any leading directory.
The Dockerfile:
FROM postgres:latest
ADD ./postgres-data.tar.gz /var/lib/postgresql/data/
Build it:
docker build . -t database-docker
Run without overriding the environment variable PGDATA. Note that you copy the files in /var/lib/postgresql/data but the PGDATA points to /var/lib/postgresql/data/pg-data.
Run the container:
docker run -p 5432:5432 --name database-image1 database-docker

Migrate data from a data only postgresql docker volume

I have a data only postgresql container
docker create -v /var/lib/postgresql/data --name bevdata mdillon/postgis /bin/true
I have a running Postgis container
docker run --name bevaddress -e POSTGRES_USER=bevsu -e POSTGRES_DB=bevaddress -P -d --volumes-from bevdata mdillon/postgis
I have made a backup of that database into the bavaddress container into directory /var/lib/postgresql/backup
I think this means that the backup data is in container bevaddress (the running process) and NOT the data only container bevdata which I think is good.
Now if I docker pull mdillon/postgis to a new version, how can I attach the folder /var/lib/postgresql/backup of container bevaddress so that a new instance and version of mdillon/postgis can access that folder to restore the database?
To the best of my knowledge, you cannot. The file system in your running container only exists for the duration of the run. Without mounting a volume, you have no way to allow a second container access to the backup.
For future backups, you could create a second volume only container that mounts /var/lib/postgresql/backup.