PostgreSQL persist data: which is better, named volume or bind mount?

Option 1: (named volume. The volume is identified by its name. It stores its data in /var/lib/docker/volumes/nameofthevolume)
# create the volume in advance
$ docker volume create test_vol
Option 2: (here the name of the volume, bind-test, does not matter; what matters is the local path it mounts to, /home/user/test, which is persistent. A path like /home/user/somedatafolder is more readable than /var/lib/docker/volumes/somevolumename. Cons: we have to ensure that /home/user/somedatafolder exists.)
# inside a docker-compose file
...
volumes:
  bind-test:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /home/user/test
or:
version: '3'
services:
  myservice:
    volumes:
      - ./path:/volume/path
The downside of bind mounts is that they place files managed by containers, with the uid/gid from the container, inside a path likely used by other users on the host, often with a different uid/gid on the host. The result is permission issues either on the host or inside the container. You need to align the uid/gid values between the two to avoid this.

At the end of the day, there isn't a big difference between bind mount and Docker named volumes.
I tend to prefer keeping persistent data from Docker services in Docker volumes. You can then use tools like docker system df -v to inspect what your application uses.
As for exporting the data, you can use docker cp
docker cp someContainer:/somedir/ .

Related

PostgreSQL Docker image without a volume mount

For automated testing we can't use a DB Docker container with a defined volume. I'm wondering whether there is an "official" Postgres image with no mounted volume or volume definitions.
Or, if someone has a Dockerfile that would create a container without any volume definitions, it would be very helpful to see it or try it.
Or is there any way to override a defined volume mount and just use a data file inside the Docker container that runs the DB?

I think you are mixing up volumes and bind mounts.
https://docs.docker.com/storage/
VOLUME Dockerfile command: A volume declared with the VOLUME command in a Dockerfile is created in Docker's own area on the host, which is /var/lib/docker/volumes/.
I don't think it is possible to run Docker without it having access to this directory, and it would not be advisable to restrict Docker's permissions on it; these are Docker's own directories, after all.
The postgres Dockerfile has this command, for example: https://github.com/docker-library/postgres/blob/master/15/bullseye/Dockerfile
line 186: VOLUME /var/lib/postgresql/data
This means that the /var/lib/postgresql/data directory that is inside the postgres container will be a VOLUME that will be stored on the host somewhere in /var/lib/docker/volumes/somerandomhashorguid..... in a directory with a random name.
You can also create a volume like this with docker run:
docker run --name mypostgres -e POSTGRES_PASSWORD=password -v /etc postgres:15.1
This way the /etc directory that is inside the container will be stored on the host in the /var/lib/docker/volumes/somerandomhashorguid.....
This volume solution is needed for containers that need extra IO, because the files of the containers (that are not in volumes) are stored in the writeable layer as per the docs: "Writing into a container’s writable layer requires a storage driver to manage the filesystem. The storage driver provides a union filesystem, using the Linux kernel. This extra abstraction reduces performance as compared to using data volumes, which write directly to the host filesystem."
So you could technically remove the VOLUME command from the postgres Dockerfile, rebuild the image yourself, and use that image to create your postgres container, but it would perform worse.
Bind mounts are the type of data storage solution that can be mounted to anywhere on the host filesystem. For example if you would run:
docker run --name mypostgres -e POSTGRES_PASSWORD=password -v /tmp/mypostgresdata:/var/lib/postgresql/data postgres:15.1
(Take note of the -v flag here: there is a colon between the host directory and the container directory, while in the volume version of this flag above there was no host directory and no colon either.)
then you would have a directory created on your docker host machine /tmp/mypostgresdata and the directory of the container of /var/lib/postgresql/data would be mapped here instead of the docker volumes internal directory /var/lib/docker/volumes/somerandomhashorguid.....
My general rule of thumb would be to use volumes - as in /var/lib/docker/volumes/ - whenever you can and deviate only if really necessary. Bind mounts are not flexible enough to make an image/container portable and the writable container layer has less performance than docker volumes.
You can list Docker volumes with docker volume ls, but you will not see bind-mounted directories there. For those you will need to run docker inspect containername.
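To make the docker inspect step above more concrete, here is a minimal Python sketch that reads the "Mounts" section of docker inspect output and tells volumes and bind mounts apart. The SAMPLE text is a hypothetical, trimmed-down stand-in for real output, and summarize_mounts is a helper name invented for this illustration; in practice you would feed it the text printed by docker inspect containername.

```python
import json
from typing import List, Tuple


def summarize_mounts(inspect_output: str) -> List[Tuple[str, str, str]]:
    """Return (type, source, destination) for every mount in `docker inspect` JSON."""
    summary = []
    for container in json.loads(inspect_output):
        for mount in container.get("Mounts", []):
            # Volumes carry a "Name"; bind mounts only have a host path in "Source".
            source = mount.get("Name") or mount.get("Source", "")
            summary.append((mount.get("Type", ""), source, mount.get("Destination", "")))
    return summary


# Hypothetical sample, shortened to the fields this sketch reads.
SAMPLE = """[{"Mounts": [
  {"Type": "volume", "Name": "test_vol",
   "Source": "/var/lib/docker/volumes/test_vol/_data",
   "Destination": "/var/lib/postgresql/data"},
  {"Type": "bind", "Source": "/tmp/mypostgresdata", "Destination": "/backup"}
]}]"""

for mount_type, source, destination in summarize_mounts(SAMPLE):
    print(f"{mount_type:6} {source} -> {destination}")
```

Entries of type "volume" are the ones that also show up in docker volume ls; entries of type "bind" only appear here.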

"You could just copy one of the dockerfiles used by the postgres project, and remove the VOLUME statement. github.com/docker-library/postgres/blob/… –
Nick ODell
Nov 26, 2022 at 18:05"
Nick answered this above.
And that edited Dockerfile would build an "almost" Docker Official Image.

Docker: postgres volume location and permissions

I have the following docker-compose file
version: '3.7'
volumes:
  postgres-data:
services:
  postgres:
    environment:
      - POSTGRES_PASSWORD=mypwd
      - POSTGRES_USER=randomuser
    image: 'postgres:14'
    restart: always
    volumes:
      - './postgres-data:/var/lib/postgresql/data'
I seem to have multiple issues regarding the volume:
A folder named postgres-data is created in the docker-compose file location when I run up, though it seems that for other images, they get placed in the /var/lib/docker/volumes folder instead (without creating such a folder). Is this expected ? Is it a good practice to have the volume folder created in the same location as the docker-compose file, instead of the /var/lib/docker/volumes folder ?
This folder has weird ownership, I can't get into it as my current user (though I am in the docker group).
I tried reading the image documentation, especially the "Arbitrary --user Notes", but didn't understand what to do with it. I also tried not setting the POSTGRES_USER (which then defaults to postgres), but the result is the same.
What's the correct way to create a volume using this image ?

Your volume mount is explicitly to a subdirectory of the current directory
volumes:
  - './postgres-data:/var/lib/postgresql/data'
#    ^^ (a slash before the colon always means a bind mount)
If you want to use a named volume you need to declare that at the top level of the Compose file, and refer to the volume name (without a slash) when you use it
volumes:
  postgres-data:
services:
  ...
    volumes:
      - 'postgres-data:/var/lib/postgresql/data'
#        ^^ (no slash)
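The slash rule described above can be written down as a small Python sketch. It is a simplified heuristic that follows the wording of this answer, not Compose's actual parser, and classify_volume_spec is a name made up for this illustration. It assumes POSIX-style paths; Windows drive paths such as C:\... contain a colon themselves and are not handled.

```python
def classify_volume_spec(spec: str) -> str:
    """Classify a short-syntax volume entry by the part before the first colon."""
    source, separator, _target = spec.partition(":")
    if not separator:
        # Only a container path was given, e.g. "/var/lib/postgresql/data".
        return "anonymous volume"
    if "/" in source:
        # A slash before the colon: the source is a host path.
        return "bind mount"
    # No slash before the colon: the source is a volume name.
    return "named volume"


for entry in ("./postgres-data:/var/lib/postgresql/data",
              "postgres-data:/var/lib/postgresql/data",
              "/var/lib/postgresql/data"):
    print(f"{entry!r} is a {classify_volume_spec(entry)}")
```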
One isn't really "better" than the other for this case. A bind-mounted host directory is much easier to back up; a named volume will be noticeably faster on MacOS or Windows; you can directly see and edit the files with a bind mount; you can use the Docker ecosystem to clean up named volumes. For a database in particular, seeing the data files isn't very useful and I might prefer a named volume, but that's not at all a strong preference.
File ownership for bind mounts is a frequent question. On native Linux, the numeric user ID is the only thing that matters for permission checking. This is resolved by the /etc/passwd file into a username, but the host and container have different copies of this file (and that's okay). The unusual owner you're seeing with ls -l from the host matches the numeric uid of the default user in the postgres image.
That image is well-designed, though, and the upshot of the section in the Docker Hub documentation is that you can specify any Compose user: you want, probably matching the host uid owning the directory.
sudo rm -rf ./postgres-data # with the wrong owner
id -u # what's my current numeric uid?
version: '3.8'
services:
  postgres:
    volumes: # using a host directory
      - './postgres-data:/var/lib/postgresql/data'
    user: "1000" # matches the `id -u` output; quoted because Compose expects a string
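The point about numeric ownership can be checked from the host with a short Python sketch (Linux or macOS only, since it uses the pwd module). describe_owner is a hypothetical helper name. It prints the numeric uid that owns a path and the name the local /etc/passwd maps it to, if any; a uid that only exists inside the container's passwd file simply has no name on the host.

```python
import os
import pwd
from typing import Optional, Tuple


def describe_owner(path: str) -> Tuple[int, Optional[str]]:
    """Return the numeric owner uid of `path` and its name on this machine, if any."""
    uid = os.stat(path).st_uid
    try:
        name = pwd.getpwuid(uid).pw_name
    except KeyError:
        # The uid has no entry in this machine's user database.
        name = None
    return uid, name


owner_uid, owner_name = describe_owner(".")
print(f"owner uid={owner_uid}, name on this machine={owner_name!r}")
print(f"my uid={os.getuid()}")
```

Pointing it at ./postgres-data instead of "." and comparing the result with your own uid shows whether the user: setting and the directory owner line up.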

Postgres Dockerfile exploration - VOLUME statement usage

I am looking at a sample Dockerfile to see how VOLUME is used, and I came across the following lines from https://github.com/docker-library/postgres/blob/master/Dockerfile-alpine.template
ENV PGDATA /var/lib/postgresql/data
# this 777 will be replaced by 700 at runtime (allows semi-arbitrary "--user" values)
RUN mkdir -p "$PGDATA" && chown -R postgres:postgres "$PGDATA" && chmod 777 "$PGDATA"
VOLUME /var/lib/postgresql/data
What is the purpose of using a volume here? Here is my understanding; please confirm:
Create the directory pointed to by $PGDATA in the image file system.
Map it with VOLUME so that any content created later, when docker-entrypoint.sh populates the database, lands in a predefined directory that the container can use.
What if the VOLUME instruction is not defined? It might be more laborious for someone to figure out where to keep custom changes if VOLUME is not defined.

A volume is defined here, so when you start a container (from this image) a new anonymous volume is created.
The volume holds the data worth keeping, so this is all you need to "persist" during the normal/soft Docker image lifecycle.
Usually, when the maintainers of a Docker image already know where the data worth keeping is located (like here), they mark that folder using VOLUME in the Dockerfile. This will, as mentioned, create an anonymous volume at runtime, but it also makes you aware (using docker inspect or by reading the Dockerfile) of where the volumes for persistence are located.
In production you will usually use a named volume or a path mount in your docker-compose file, mounted to this very folder
docker-compose.yml as named volume
volumes:
  - mydbdata:/var/lib/postgresql/data
docker-compose.yml as path
volumes:
  - ./local/path/data:/var/lib/postgresql/data
There are actually cons to putting such VOLUME definitions in the Dockerfile, which I will not elaborate on here, but the main reason is "lifetime".
Having no VOLUME in the Dockerfile and running
docker-compose up -d
# do something, manipulate the data
docker-compose down
# all your data would be lost when starting again
docker-compose up -d
This would remove not only the running container but also all your DB data, which might not be what you intended (you just wanted to recreate the container).
With VOLUME in the Dockerfile, the anonymous volume would persist even across docker-compose down (unless you run it with -v).

How to mount dotnet user-secrets in a devcontainer?

I use a devcontainer for building and debugging my .NET Core apps. I'd like to share user-secrets between my host machine and the container.
How can I do this if the location of the user secrets depends on the host machine?
Windows: %APPDATA%/Microsoft/UserSecrets
Mac/Linux: $HOME/.microsoft/usersecrets
I tried mounting both locations, but that throws an error.
.devcontainer/devcontainer.json
{
  "dockerComposeFile": "docker-compose.yml",
  "service": "devcontainer",
  "runServices": [],
  "workspaceFolder": "/workspace",
  "forwardPorts": [
    5000,
    5001
  ],
  "remoteEnv": {
    "ASPNETCORE_ENVIRONMENT": "Development",
    "ASPNETCORE_URLS": "https://+:5001;http://+:5000"
  }
}
.devcontainer/docker-compose.yml
version: "3.7"
services:
  devcontainer:
    image: mydevcontainerimage:12345
    volumes:
      - ..:/workspace:cached
      - ${APPDATA}/Microsoft/UserSecrets/:/root/.microsoft/usersecrets
      - ${HOME}/.microsoft/usersecrets:/root/.microsoft/usersecrets
      # Forwards the local Docker socket to the container.
      - /var/run/docker.sock:/var/run/docker.sock
    command: sleep infinity
Docker-compose crashes with an error.
ERROR: Duplicate mount points: [/.microsoft/usersecrets:/root/.microsoft/usersecrets:rw, C:\Users\steven\AppData\Roaming\Microsoft\UserSecrets:/root/.microsoft/usersecrets:rw]

The solution might be to use a named volume between the host and the container.
Hence, the docker-compose will only reference that named volume.
The named volume creation will be specific to the host though.
For named volume creation based on host path, see here
But as stated here
The built-in local driver on Windows does not support any options.
And for example device=c:\a\path\to\my\folder will not work under Windows.
But, given that the windows path %APPDATA% expands to something like c:\a\path\to\my\folder you can rephrase it as /host_mnt/c/a/path/to/my/folder and use that for device:
docker volume create --name my_test_volume --opt type=none --opt device=/host_mnt/c/a/path/to/my/folder --opt o=bind
For others, this supposes that c: is made accessible in docker settings (Resources / File sharing).
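The path rewrite described above (drive letter plus backslashes turned into a /host_mnt/... path) can be sketched in Python as follows. to_host_mnt is a hypothetical helper name, and the /host_mnt prefix is taken from this answer; whether it applies depends on the Docker Desktop setup, so treat it as an assumption rather than a guaranteed layout.

```python
from pathlib import PureWindowsPath


def to_host_mnt(windows_path: str) -> str:
    """Rewrite an absolute Windows path such as c:\\a\\b into /host_mnt/c/a/b."""
    path = PureWindowsPath(windows_path)
    # Accept only plain drive-letter paths like "c:\..."; reject relative and UNC paths.
    if len(path.drive) != 2 or not path.root:
        raise ValueError(f"expected an absolute drive path, got {windows_path!r}")
    drive_letter = path.drive[0].lower()
    # parts[0] is the drive and root ("c:\\"); the rest are the folder names.
    return "/".join(["/host_mnt", drive_letter, *path.parts[1:]])


print(to_host_mnt(r"c:\a\path\to\my\folder"))
```

The result can then be passed as the device option when creating the volume on a Windows host, while Mac/Linux hosts would use $HOME/.microsoft/usersecrets directly.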

Docker-compose named mounted volume

In order to keep track of the volumes used by docker-compose, I'd like to use named volumes. This works great for 'normal' volumes like
version: '2'
services:
  example-app:
    volumes:
      - named_vol:/dir/in/container/volume
volumes:
  named_vol:
But I can't figure out how to make it work when mounting the local host.
I'm looking for something like:
version: '2'
services:
  example-app:
    volumes:
      - named_homedir:/dir/in/container/volume
volumes:
  named_homedir: /c/Users/
or
version: '2'
services:
  example-app:
    volumes:
      - /c/Users/:/home/dir/in/container/ --name named_homedir
Is this in any way possible, or am I stuck with anonymous volumes for mounted ones?

As you can read in this GitHub issue, mounting named volumes is now a thing … (since 1.11 or 1.12). Driver-specific options are documented. Some notes from the GitHub thread:
docker volume create --opt type=none --opt device=<host path> --opt o=bind
If the host path does not exist, it will not be created.
Options are passed in literally to the mount syscall. We may add special cases for certain "types" because they are awkward to use... like the nfs example [referenced above].
– @cpuguy83
To address your specific question about how to use that in compose, you write under your volumes section:
my-named-volume:
  driver_opts:
    type: none
    device: /home/full/path #NOTE needs full path (~ doesn't work)
    o: bind
This is because as cpuguy83 wrote in the github thread linked, the options are (under the hood) passed directly to the mount command.
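Because the options go to mount unchanged, the device value has to be an absolute path to a directory that already exists. A minimal Python sketch of preparing those driver_opts from a relative or ~ path could look like this. bind_volume_opts is a hypothetical helper, not part of Docker or Compose; it only builds the three option values shown above.

```python
import os
from typing import Dict


def bind_volume_opts(host_path: str) -> Dict[str, str]:
    """Build driver_opts for a bind-backed named volume from any host path."""
    # Expand "~" and relative paths ourselves, since mount will not do it.
    device = os.path.abspath(os.path.expanduser(host_path))
    # The host path is not created automatically, so fail early if it is missing.
    if not os.path.isdir(device):
        raise FileNotFoundError(f"host path does not exist: {device}")
    return {"type": "none", "o": "bind", "device": device}


print(bind_volume_opts("."))
```

The returned dictionary maps one-to-one onto the type, o, and device keys under driver_opts.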
EDIT: As commented by…
…@villasv, you can use ${PWD} for relative paths.
…@mikeyjk, you might need to delete preexisting volumes:
docker volume rm $(docker volume ls -q)
OR
docker volume prune
…@Camron Hudson, in case you have "no such file or directory" errors showing up, you might want to read this SO question/answer, as Docker does not follow symlinks and there might be permission issues with your local file system.

OP appears to be using full paths already, but if like most people you're interested in mounting a project folder inside the container this might help.
This is how to do it with driver_opts, as @kaiser said and @linuxbandit exemplified. But you can try the usually available environment variable $PWD to avoid specifying full paths for directories in the docker-compose context:
logs-directory:
  driver_opts:
    type: none
    device: ${PWD}/logs
    o: bind

I've been trying the (almost) same thing and it seems to work with something like:
version: '2'
services:
  example-app:
    volumes:
      - named_vol:/dir/in/container/volume
      - /c/Users/:/dir/in/container/volume
volumes:
  named_vol:
Seems to work for me (I didn't dig into it, just tested it).

I was looking for an answer to the same question recently and stumbled on this plugin: https://github.com/CWSpear/local-persist
It looks like it allows just what the topic starter wants to do.
Haven't tried it myself yet, but thought it might be useful for somebody.

Host volumes are different from named volumes or anonymous volumes. Their "name" is the path on the host.
There is no way to use the volumes section for host volumes.
