The postgres image, for example, has a volume baked in at /var/lib/postgresql/data, but it isn't bound to a particular host path. I'm wondering if the database work done in this container is wholly encapsulated by committing the container to an image, or if I need to separately pass along the contents of the unbound volume.
Example in commands
Create container vtest based on postgres image:
$ docker run -d --name vtest postgres
The container has a volume at /var/lib/postgresql/data that is not bound to a host path:
$ docker inspect -f '{{ .Volumes }}' vtest
map[/var/lib/postgresql/data:/var/lib/docker/vfs/dir/bc39da05ff1cd044d7a17bba61381e854a948fb70cf39f897247f5ada66ad906]
$ sudo docker inspect -f '{{ .HostConfig.Binds }}' vtest
<no value>
Create a database and add some records in the vtest container. Then, commit the changes to an image to be able to share with others:
$ docker commit -p vtest postgres:vtest
Will the changes made in the vtest container's /var/lib/postgresql/data persist in this new postgres:vtest image?
The volumes mounted in the container are not committed to the image, no matter whether they are bound to a particular folder on your host or mounted to a folder under /var/lib/docker. In fact, as you show in your message, the volume is mounted at /var/lib/docker/vfs/dir/bc39da05ff1cd044d7a17bba61381e854a948fb70cf39f897247f5ada66ad906 on the host machine. You can browse that folder as the root user.
If you want to save the data in a volume, you need an approach other than committing. One of the most common is to use a data container (which also creates a folder under /var/lib/docker/... with your data), and then back that volume up from a new container with a packing tool such as tar, as sketched below. Check the Docker documentation on this topic for further details.
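A minimal sketch of that backup step, assuming the vtest container from the question and the current directory as the place to drop the archive (busybox is just a convenient throwaway image):
$ docker run --rm --volumes-from vtest -v $(pwd):/backup busybox tar cvf /backup/pgdata.tar /var/lib/postgresql/data
The resulting pgdata.tar can then be shipped alongside the committed image and unpacked into a volume on the target machine.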
Related
For automated testing we can't use a DB Docker container with a defined volume. Just wondering whether an official Postgres image is available with no mounted volume or volume definitions.
Or, if someone has a Dockerfile that builds such an image without any volume definitions, it would be very helpful to see or try one.
Or is there any way to override a defined volume mount and just keep the data files inside the Docker container that will be created to run the DB?
I think you are mixing up volumes and bind mounts.
https://docs.docker.com/storage/
VOLUME Dockerfile instruction: a volume declared with the VOLUME instruction in a Dockerfile is created in the Docker area on the host, which is /var/lib/docker/volumes/.
I don't think it is possible to run Docker without access to this directory, and it would not be advisable to restrict Docker's permissions on these directories; they are Docker's own directories, after all.
So the postgres Dockerfile has this instruction, for example: https://github.com/docker-library/postgres/blob/master/15/bullseye/Dockerfile
line 186: VOLUME /var/lib/postgresql/data
This means that the /var/lib/postgresql/data directory that is inside the postgres container will be a VOLUME that will be stored on the host somewhere in /var/lib/docker/volumes/somerandomhashorguid..... in a directory with a random name.
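A quick way to see this in practice (a sketch; the container name pgtest is just an example):
$ docker run -d --name pgtest -e POSTGRES_PASSWORD=password postgres:15.1
$ docker volume ls                   # an anonymous volume with a long hash name appears
$ docker volume inspect <that-hash>  # its "Mountpoint" sits under /var/lib/docker/volumes/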
You can also create a volume like this with docker run:
docker run --name mypostgres -e POSTGRES_PASSWORD=password -v /etc postgres:15.1
This way the /etc directory that is inside the container will be stored on the host in the /var/lib/docker/volumes/somerandomhashorguid.....
This volume solution is needed for containers that need extra IO, because the files of the containers (that are not in volumes) are stored in the writeable layer as per the docs: "Writing into a container’s writable layer requires a storage driver to manage the filesystem. The storage driver provides a union filesystem, using the Linux kernel. This extra abstraction reduces performance as compared to using data volumes, which write directly to the host filesystem."
So you could technically remove the VOLUME instruction from the postgres Dockerfile, rebuild the image for yourself, and use that image to create your postgres container, but it would have lower performance.
Bind mounts are the type of data storage solution that can be mounted to anywhere on the host filesystem. For example if you would run:
docker run --name mypostgres -e POSTGRES_PASSWORD=password -v /tmp/mypostgresdata:/var/lib/postgresql/data postgres:15.1
(Take note of the -v flag here: there is a colon between the host and the container directory, while in the volume version of this flag earlier there was no host directory and no colon.)
then you would have a directory /tmp/mypostgresdata created on your Docker host machine, and the container's /var/lib/postgresql/data directory would be mapped there instead of to Docker's internal volumes directory /var/lib/docker/volumes/somerandomhashorguid.....
My general rule of thumb would be to use volumes - as in /var/lib/docker/volumes/ - whenever you can and deviate only if really necessary. Bind mounts are not flexible enough to make an image/container portable and the writable container layer has less performance than docker volumes.
You can list Docker volumes with docker volume ls, but you will not see bind-mounted directories there. For that you will need to run docker inspect containername.
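For example, to see both volumes and bind mounts of a container in one place (containername stands for whatever your container is called):
$ docker inspect -f '{{ json .Mounts }}' containername
Each entry shows its Type (volume or bind), its Source on the host, and its Destination inside the container.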
"You could just copy one of the dockerfiles used by the postgres project, and remove the VOLUME statement. github.com/docker-library/postgres/blob/… –
Nick ODell
Nov 26, 2022 at 18:05"
As Nick answered above, that edited Dockerfile would build an "almost" Docker Official Image.
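A rough sketch of that approach, assuming the 15/bullseye variant linked earlier (the tag postgres-novolume is just a name picked for illustration):
$ git clone https://github.com/docker-library/postgres.git
$ cd postgres/15/bullseye
$ sed -i '/^VOLUME /d' Dockerfile    # drop the VOLUME instruction
$ docker build -t postgres-novolume .
Containers created from such an image keep their data in the writable layer unless you mount something explicitly, which is usually acceptable for throwaway test databases.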
I am creating a PostgreSQL container using the following command:
sudo docker run -d --name=pg -p 5432:5432 -e POSTGRES_PASSWORD=secret -e PGDATA=/pgdata -v pg:/pgdata postgres
After running this container, when I check volumes by running the following command:
sudo docker volume ls
DRIVER VOLUME NAME
local 6d283475c6fe923155018c847f2c607c464244cb6767dd37a579824cf8c7e612
local pg
I get two volumes. The pg volume is created by the command, but what is the second volume?
If you look at the Docker Hub breakdown of the postgres image, you will notice it has the declaration
VOLUME ["/var/lib/postgresql/data"]
If you don't explicitly mount something else on that directory, Docker will create an anonymous volume and mount it there for you. This behaves identically to a named volume except that it doesn't have a specific name.
docker inspect mostly dumps out low-level diagnostic information, but it should include the mount information, and you should see two volume mounts, one with the anonymous volume on the default PostgreSQL data directory and a second matching the explicit mount on /pgdata.
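As a concrete check (pg is the container name from the question), a Go template over .Mounts lists each mount's volume name and destination:
$ sudo docker inspect -f '{{ range .Mounts }}{{ println .Name .Destination }}{{ end }}' pg
You should see the hash-named anonymous volume on /var/lib/postgresql/data and the pg volume on /pgdata.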
I have a running postgreSQL docker container and need to add a volume mount.
I followed the steps from How can I add a volume to an existing Docker container? (I ran docker commit on the container to save it as an image, and spun up another container based on that image with a named volume mounted in). All the data files from the first container are present in /var/lib/postgresql/data of the second container.
However, when I try to query this second postgres database, I cannot see any of the tables that exist in the first container. I've been trying to fix this for a few days with no luck; am I missing something here (does mounting a volume obscure the existing data in /var/lib/postgresql/data)?
Commit will not work, as there is a volume defined in the Dockerfile.
Volumes are useful in many cases, for example, for running
database-storage. However, since volumes are not 'part' of a
container, it makes containers no longer portable - which seems in
direct conflict with the slogan "Build once... Run anywhere.."
docker commit data container with VOLUME
One option you can try is copying the data folder from the existing container to the host, and then launching a new container with that path mounted.
docker cp my_db_cotainer:/var/lib/postgresql/data db_data
Then start a new container with this path so that it contains the same data as the previous one:
docker run -d --name some-postgres -v $PWD/db_data/:/var/lib/postgresql/data postgres
The same works for MySQL:
docker cp some-mysql-old:/var/lib/mysql db_backup
docker run --rm --name some-mysql-new -v $PWD/db_backup:/var/lib/mysql -it mysql
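To sanity-check the Postgres case (a sketch; some-postgres is the container name from the commands above), list the tables of the default database from inside the new container:
$ docker exec -it some-postgres psql -U postgres -c '\dt'
If the copied data directory was picked up, the tables from the original container should show up.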
I am looking at a sample Dockerfile to see how VOLUME is used, and I came across the following lines from https://github.com/docker-library/postgres/blob/master/Dockerfile-alpine.template
ENV PGDATA /var/lib/postgresql/data
# this 777 will be replaced by 700 at runtime (allows semi-arbitrary "--user" values)
RUN mkdir -p "$PGDATA" && chown -R postgres:postgres "$PGDATA" && chmod 777 "$PGDATA"
VOLUME /var/lib/postgresql/data
What is the purpose of using a volume here? Here is my understanding - please confirm:
Create the directory pointed to by $PGDATA in the image filesystem.
Map it with VOLUME so that any content created later, as part of populating it through docker-entrypoint.sh, ends up in a predefined directory that the container can use.
What if the VOLUME instruction is not defined? It might be more laborious for someone to figure out where to keep custom changes if VOLUME is not defined.
A volume is defined here, so when you start a container (out of this image) a new anonymous volume is created.
The volume will hold the data worth keeping in this regard, so this is all you need to "persist" across normal/soft Docker image lifecycles.
Usually, when the maintainers of Docker images already know where the data worth keeping is located (like here), they will mark that folder with VOLUME in the Dockerfile. This will, as mentioned, create an anonymous volume at runtime, but it also makes you aware (via docker inspect or by reading the Dockerfile) of where the volumes for persistence are located.
In production you will usually use a named volume / path mount in your docker-compose file, mounted to this very folder:
docker-compose.yml as named volume
volumes:
  - mydbdata:/var/lib/postgresql/data
docker-compose.yml as path
volumes:
  - ./local/path/data:/var/lib/postgresql/data
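Putting the named-volume variant together, a minimal docker-compose.yml might look like this (a sketch; the service name db and the volume name mydbdata are just examples):
services:
  db:
    image: postgres:15.1
    environment:
      POSTGRES_PASSWORD: password
    volumes:
      - mydbdata:/var/lib/postgresql/data
volumes:
  mydbdata:
The top-level volumes: entry declares the named volume, so Compose creates it once and reuses it across up/down cycles.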
There are actually cons to such VOLUME definitions in the Dockerfile, which I will not elaborate on here, but the main one concerns "lifetime".
Having no VOLUME in the Dockerfile and running
docker-compose up -d
# do something, manipulate the data
docker-compose down
# all your data would be lost when starting again
docker-compose up -d
would remove not only the running container but also all your DB data, which might not be what you intended (you just wanted to recreate the container).
With VOLUME in the Dockerfile, the anonymous volume is persisted even across docker-compose down.
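You can verify this with any compose project whose image declares a VOLUME (a sketch):
$ docker-compose up -d
$ docker volume ls     # note the hash-named anonymous volume
$ docker-compose down
$ docker volume ls     # without -v, the anonymous volume is still listed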
I'm using docker-compose in one of my projects. During development I mount my source directory as a volume into one of my Docker services for easy development. At the same time, I have a db service (psql) that mounts a named volume for persistent data storage.
I start my solution and everything works fine:
$ docker-compose up -d
When I check my volumes, I see the named one and an "unnamed" one (the source volume).
$ docker volume ls
DRIVER VOLUME NAME
local 226ba7af9689c511cb5e6c06ceb36e6c26a75dd9d619360882a1012cdcd25b72
local myproject_data
The problem I experience is that, when I do
$ docker-compose down
...
$ docker volume ls
DRIVER VOLUME NAME
local 226ba7af9689c511cb5e6c06ceb36e6c26a75dd9d619360882a1012cdcd25b72
local myproject_data
both volumes remain. Every time I run
$ docker-compose down
$ docker-compose up -d
a new volume is created for my source mount
$ docker volume ls
DRIVER VOLUME NAME
local 19181286b19c0c3f5b67d7d1f0e3f237c83317816acbdf4223328fdf46046518
local 226ba7af9689c511cb5e6c06ceb36e6c26a75dd9d619360882a1012cdcd25b72
local myproject_data
I know that this will not happen on my deployment server, since it will not mount the source, but is there a way to not make the mounted source persistent?
You can use the --rm option with docker run. To use it with docker-compose, you can run
docker-compose rm -v after stopping your containers with docker-compose stop.
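Put together, that looks like this (a sketch; -f just skips the confirmation prompt):
$ docker-compose stop
$ docker-compose rm -v -f    # removes the stopped containers and their anonymous volumes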
If you go through the docs about Data volumes, it's mentioned that
Data volumes persist even if the container itself is deleted.
So that means, stopping a container will not remove the volumes it created, whether named or anonymous.
Now if you read further down to Removing volumes
A Docker data volume persists after a container is deleted. You can
create named or anonymous volumes. Named volumes have a specific
source from outside the container, for example awesome:/bar. Anonymous
volumes have no specific source. When the container is deleted, you
should instruct the Docker Engine daemon to clean up anonymous
volumes. To do this, use the --rm option, for example:
$ docker run --rm -v /foo -v awesome:/bar busybox top
This command creates an anonymous /foo volume. When the container is
removed, the Docker Engine removes the /foo volume but not the awesome
volume.
Just remove volumes with the down command:
docker-compose down -v
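For reference (paraphrasing the compose docs), -v removes both the named volumes declared in the compose file and the anonymous volumes attached to the containers, so in the setup above it would likely also delete myproject_data:
$ docker-compose down -v    # containers, networks, anonymous volumes and named compose-file volumes are removed
$ docker-compose up -d      # everything is recreated fresh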