I have a running postgreSQL docker container and need to add a volume mount.
I followed the steps from "How can I add a volume to an existing Docker container?" (ran docker commit on the container to save it as an image, and spun up another container based on that image with a named volume mounted in). All the data files from the first container are present in /var/lib/postgresql/data of the second container.
However, when I try to query this second postgres database, I cannot see any of the tables that are in the first container. I have been trying to fix this for a few days with no luck. Am I missing something here (does mounting a volume obscure the existing data in /var/lib/postgresql/data)?
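docker commit will not work here, because the image's Dockerfile declares a volume for the data directory and the contents of a volume are not included in the committed image.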
Volumes are useful in many cases, for example, for running
database-storage. However, since volumes are not 'part' of a
container, it makes containers no longer portable - which seems in
direct conflict with the slogan "Build once... Run anywhere.."
docker commit data container with VOLUME
One option that you can try is copying the data folder from the existing container to the host and then launching a new container with that path mounted.
docker cp my_db_container:/var/lib/postgresql/data db_data
Then start a new container with this path, so that it contains the same data as the previous one:
docker run -d --name some-postgres -v $PWD/db_data/:/var/lib/postgresql/data postgres
same for mysql
docker cp some-mysql-old:/var/lib/mysql db_backup
docker run --rm --name some-mysql-new -v $PWD/db_backup:/var/lib/mysql -it mysql
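If the goal is a named volume rather than a host directory (as in the original question), the copied folder can be loaded into one. This is only a minimal sketch, assuming the data was already copied out with docker cp as above. The function name copy_into_named_volume, the volume name pgdata and the busybox helper image are my own choices, not something from the question.

```shell
#!/bin/sh
# copy_into_named_volume SRC_DIR VOLUME
# Creates VOLUME (if needed) and copies the contents of SRC_DIR into it
# using a throwaway helper container.
copy_into_named_volume() {
    src_dir=$1   # absolute host path, e.g. "$PWD/db_data"
    volume=$2    # hypothetical volume name, e.g. pgdata
    docker volume create "$volume" &&
    docker run --rm -v "$src_dir":/from:ro -v "$volume":/to busybox \
        cp -a /from/. /to/
}
```

Usage would be copy_into_named_volume "$PWD/db_data" pgdata, followed by docker run -d --name some-postgres -v pgdata:/var/lib/postgresql/data postgres. It is probably safest to stop the old container before copying, since a file-level copy of a running database may be inconsistent.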
For automated testing we can't use a DB Docker container with a defined volume. I am wondering whether there is an "official" Postgres image available with no mounted volume or volume definitions.
Or, if someone has a Dockerfile that creates a container without any volume definitions, it would be very helpful to see it or try it.
Or is there any way to override a defined volume mount and just keep the data files inside the container that runs the DB?
I think you are mixing up volumes and bind mounts.
https://docs.docker.com/storage/
VOLUME Dockerfile command: A volume declared with the VOLUME command in a Dockerfile is created in docker's own storage area on the host, /var/lib/docker/volumes/.
I don't think it is possible to run docker without it having access to this directory, and restricting docker's permissions on it would not be advisable; these are docker's own directories after all.
So postgres dockerfile has this command in dockerfile, for example: https://github.com/docker-library/postgres/blob/master/15/bullseye/Dockerfile
line 186: VOLUME /var/lib/postgresql/data
This means that the /var/lib/postgresql/data directory that is inside the postgres container will be a VOLUME that will be stored on the host somewhere in /var/lib/docker/volumes/somerandomhashorguid..... in a directory with a random name.
You can also create a volume like this with docker run:
docker run --name mypostgres -e POSTGRES_PASSWORD=password -v /etc postgres:15.1
This way the /etc directory that is inside the container will be stored on the host in the /var/lib/docker/volumes/somerandomhashorguid.....
This volume solution is needed for containers that need extra IO, because the files of the containers (that are not in volumes) are stored in the writeable layer as per the docs: "Writing into a container’s writable layer requires a storage driver to manage the filesystem. The storage driver provides a union filesystem, using the Linux kernel. This extra abstraction reduces performance as compared to using data volumes, which write directly to the host filesystem."
So you could technically remove the VOLUME command from the postgres Dockerfile, rebuild the image for yourself and use that image to create your postgres container, but it would have lower performance.
Bind mounts are the type of data storage solution that can be mounted to anywhere on the host filesystem. For example if you would run:
docker run --name mypostgres -e POSTGRES_PASSWORD=password -v /tmp/mypostgresdata:/var/lib/postgresql/data postgres:15.1
(Take note of the -v flag here: there is a colon between the host directory and the container directory, while in the volume version of this flag shown earlier there was no host directory and no colon either.)
then a directory /tmp/mypostgresdata would be created on your docker host machine, and the container's /var/lib/postgresql/data directory would be mapped there instead of to docker's internal volumes directory /var/lib/docker/volumes/somerandomhashorguid.....
My general rule of thumb would be to use volumes - as in /var/lib/docker/volumes/ - whenever you can and deviate only if really necessary. Bind mounts are not flexible enough to make an image/container portable and the writable container layer has less performance than docker volumes.
You can list docker volumes with docker volume ls but you will not see bind mounted directories here. For that you will need to do docker inspect containername
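To make the difference between the -v forms concrete, here is a small sketch of how the argument is interpreted. It is a simplification written for illustration (classify_mount is a made-up helper, not a docker command), and it assumes a Linux host and ignores the newer --mount syntax.

```shell
#!/bin/sh
# classify_mount SPEC
# Reports how `docker run -v SPEC` would treat SPEC on a Linux host.
# Simplified: does not handle Windows paths or the --mount syntax.
classify_mount() {
    spec=$1
    case "$spec" in
        *:*) source=${spec%%:*} ;;             # text before the first colon
        *)   echo "anonymous volume"; return ;; # container path only, no colon
    esac
    case "$source" in
        /*) echo "bind mount" ;;    # host path: mapped directly to the host directory
        *)  echo "named volume" ;;  # a name: stored under docker's volumes area
    esac
}
```

For example, classify_mount /etc reports an anonymous volume, classify_mount /tmp/mypostgresdata:/var/lib/postgresql/data a bind mount, and classify_mount mydata:/var/lib/postgresql/data a named volume.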
"You could just copy one of the dockerfiles used by the postgres project, and remove the VOLUME statement. github.com/docker-library/postgres/blob/… –
Nick ODell, Nov 26, 2022 at 18:05"
as Nick answered above.
That edited Dockerfile would build an "almost" Docker Official Image.
I'm very new to using docker and I've created a postgres container using
docker run --name mytrainingdb -e POSTGRES_PASSWORD=mysecretpassword -d postgres. Then I connected to it with docker exec -it <container-id> bash and then psql.
Then I stop the container.
My question is: what do I do to reconnect to the same database? I tried to run the same docker run command, but it says the name 'mytrainingdb' is already in use, which means it is trying to create the container afresh, which is not what I want. I hope my expectation is right, i.e. that when I restart my laptop or resume work I can just restart the same container and my data/config will be preserved?
The documentation also mentions that we can link a host directory to volume of pg container to have the stored data accessible to us, but I'm ok with docker managing my storage for that database.
You get an error when you re-run the same command because docker tries to create a new container with the same name as the existing one, "mytrainingdb". If you close docker and reopen it, the container is still there but not running; you can start it again with docker start mytrainingdb or remove it with docker rm mytrainingdb.
However, don't restart docker just because you want to create a new container with the same name! If the old container is still running, first stop it with docker stop mytrainingdb and then remove it with docker rm mytrainingdb, or do both at once with docker rm -f mytrainingdb (this force-removes your running container), and then create the new container.
As for the volumes, you already created one by default. Its name is a kind of hash, and it is found under /var/lib/docker/volumes/. This is because the images for PostgreSQL, and for databases in general, declare a volume for their data. The volume is created when the container is first run and keeps the persistent data, whether or not you start the container with -v.
The kind of volume you talked about in your question is a bind mount: you bind a certain directory or file from the host (outside) to a path inside the container:
docker run -v /hostdir:/containerdir in your case docker run -v /hostdir:/var/lib/postgresql/data
If you restart docker or your computer running containers won't be automatically restarted. You can start your container again with docker start mytrainingdb (related question), then connect with your docker exec command.
(one tip: instead of running bash, then psql, you can directly run psql, e.g. docker exec -it mytrainingdb psql --user postgres)
Your understanding of data persistence is correct, docker will manage the data and it will still be around.
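If you want a single command that works both the first time and on later days, something like the following sketch could do it. start_or_create is a name I made up; the docker run arguments are the ones from the question.

```shell
#!/bin/sh
# start_or_create NAME
# Starts the container called NAME if it already exists (running or stopped),
# otherwise creates it with the docker run command from the question.
start_or_create() {
    name=$1
    if docker container inspect "$name" >/dev/null 2>&1; then
        # Container already exists, so its data is still there: just start it.
        docker start "$name"
    else
        docker run --name "$name" -e POSTGRES_PASSWORD=mysecretpassword -d postgres
    fi
}
```

Calling start_or_create mytrainingdb then avoids the "name is already in use" error, because docker run is only reached when no container with that name exists.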
From the postgres image documentation
There are several ways to store data used by applications that run in Docker containers. We encourage users of the postgres images to familiarize themselves with the options available, including:
Let Docker manage the storage of your database data by writing the database files to disk on the host system using its own internal volume management. This is the default and is easy and fairly transparent to the user. The downside is that the files may be hard to locate for tools and applications that run directly on the host system, i.e. outside containers.
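You can add the --rm argument so that whenever you stop the container manually, or the container stops for any other reason (its task is done or it fails), docker removes that container. Note that --rm also removes the container's anonymous volumes, so the database data is discarded along with the container.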
In your case, you can use this:
docker run --name mytrainingdb --rm -e POSTGRES_PASSWORD=mysecretpassword -d postgres
I'm using docker-compose in one of my projects. During development I mount my source directory to a volume in one of my docker services for easy development. At the same time, I have a db service (psql) that mounts a named volume for persistent data storage.
I start my solution and everything is working fine
$ docker-compose up -d
When I check my volumes I see the named and "unnamed" (source volume).
$ docker volume ls
DRIVER VOLUME NAME
local 226ba7af9689c511cb5e6c06ceb36e6c26a75dd9d619360882a1012cdcd25b72
local myproject_data
The problem I experience is that, when I do
$ docker-compose down
...
$ docker volume ls
DRIVER VOLUME NAME
local 226ba7af9689c511cb5e6c06ceb36e6c26a75dd9d619360882a1012cdcd25b72
local myproject_data
both volumes remain. Every time I run
$ docker-compose down
$ docker-compose up -d
a new volume is created for my source mount
$ docker volume ls
DRIVER VOLUME NAME
local 19181286b19c0c3f5b67d7d1f0e3f237c83317816acbdf4223328fdf46046518
local 226ba7af9689c511cb5e6c06ceb36e6c26a75dd9d619360882a1012cdcd25b72
local myproject_data
I know that this will not happen on my deployment server, since it will not mount the source, but is there a way to not make the mounted source persistent?
You can use the --rm option in docker run. To use it with docker-compose you can use
docker-compose rm -v after stopping your containers with docker-compose stop
If you go through the docs about Data volumes, it's mentioned that
Data volumes persist even if the container itself is deleted.
So that means, stopping a container will not remove the volumes it created, whether named or anonymous.
Now if you read further down to Removing volumes
A Docker data volume persists after a container is deleted. You can
create named or anonymous volumes. Named volumes have a specific
source form outside the container, for example awesome:/bar. Anonymous
volumes have no specific source. When the container is deleted, you
should instruct the Docker Engine daemon to clean up anonymous
volumes. To do this, use the --rm option, for example:
$ docker run --rm -v /foo -v awesome:/bar busybox top
This command creates an anonymous /foo volume. When the container is
removed, the Docker Engine removes the /foo volume but not the awesome
volume.
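Just remove volumes with the down command (be aware that this also removes named volumes declared in the compose file, such as myproject_data):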
docker-compose down -v
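If the named volume has to survive, one alternative is to remove only the leftover anonymous volumes after a plain docker-compose down. The sketch below is a heuristic of my own, not a docker feature: it assumes anonymous volumes can be recognised by their 64-character hexadecimal names, as in the docker volume ls output in the question.

```shell
#!/bin/sh
# list_anonymous_dangling
# Prints volumes that no container references and whose name looks like
# an auto-generated 64-character hex id (i.e. probably anonymous).
list_anonymous_dangling() {
    docker volume ls -q --filter dangling=true | grep -E '^[0-9a-f]{64}$'
}
```

Review the printed list first, then remove the entries with docker volume rm. Named volumes such as myproject_data do not match the pattern and are left alone.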
I have a data only postgresql container
docker create -v /var/lib/postgresql/data --name bevdata mdillon/postgis /bin/true
I have a running Postgis container
docker run --name bevaddress -e POSTGRES_USER=bevsu -e POSTGRES_DB=bevaddress -P -d --volumes-from bevdata mdillon/postgis
I have made a backup of that database inside the bevaddress container, in the directory /var/lib/postgresql/backup
I think this means that the backup data is in container bevaddress (the running process) and NOT the data only container bevdata which I think is good.
Now if I docker pull mdillon/postgis to a new version, how can I attach the folder /var/lib/postgresql/backup of container bevaddress so that a new instance and version of mdillon/postgis can access that folder to restore the database?
To the best of my knowledge, you cannot. The file system in your running container only exists for the duration of the run. Without mounting a volume, you have no way to allow a second container access to the backup.
For future backups, you could create a second volume only container that mounts /var/lib/postgresql/backup.
The postgres image, for example, has a volume baked in at /var/lib/postgresql/data, but it isn't bound to a particular host path. I'm wondering if the database work done in this container is wholly encapsulated by committing the container to an image, or if I need to separately pass along the contents of the unbound volume.
Example in commands
Create container vtest based on postgres image:
$ docker run -d --name vtest postgres
The container has a volume at /var/lib/postgresql/data that is not bound to a host path:
$ docker inspect -f '{{ .Volumes }}' vtest
map[/var/lib/postgresql/data:/var/lib/docker/vfs/dir/bc39da05ff1cd044d7a17bba61381e854a948fb70cf39f897247f5ada66ad906]
$ sudo docker inspect -f '{{ .HostConfig.Binds }}' vtest
<no value>
Create a database and add some records in the vtest container. Then, commit the changes to an image to be able to share with others:
$ docker commit -p vtest postgres:vtest
Will the changes made in the vtest container's /var/lib/postgresql/data persist in this new postgres:vtest image?
The volumes mounted in the container are not committed to the image, no matter whether they are mounted to a particular folder of your host or to a folder in /var/lib/docker. In fact, as you show in your message, the volume is mounted to /var/lib/docker/vfs/dir/bc39da05ff1cd044d7a17bba61381e854a948fb70cf39f897247f5ada66ad906 on the host machine. You can browse that folder as the root user.
If you want to save the data in a volume, you need an approach other than committing. One of the more commonly used ones is data containers (which also create a folder in /var/lib/docker/... with your data), and then saving that volume using a new container and a packing tool like tar. Check the Docker documentation related to this topic for further details.
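The tar approach could look roughly like the sketch below. It follows the pattern from the Docker documentation on backing up volumes; the function name backup_volume, the busybox helper image and the archive name are my own assumptions.

```shell
#!/bin/sh
# backup_volume CONTAINER OUT_DIR
# Archives CONTAINER's /var/lib/postgresql/data volume into
# OUT_DIR/CONTAINER-data.tar using a throwaway helper container.
backup_volume() {
    container=$1   # e.g. vtest
    out_dir=$2     # absolute host directory that receives the archive
    docker run --rm --volumes-from "$container" -v "$out_dir":/backup busybox \
        tar cf "/backup/$container-data.tar" -C /var/lib/postgresql/data .
}
```

For example, backup_volume vtest "$PWD" would leave vtest-data.tar in the current directory. It is probably best to stop the database container first so the files are in a consistent state.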