How to copy docker volume from one machine to another? - postgresql

I have created a docker volume for postgres on my local machine.
docker volume create postgres-data
Then I ran a container using this volume:
docker run -it -v postgres-data:/var/lib/postgresql/9.6/main postgres
After that I performed some database operations, and the resulting data was stored automatically in postgres-data. Now I want to copy that volume from my local machine to a remote machine. How can I do that?
Note: the database is very large.

If the second machine has SSH enabled you can use an Alpine container on the first machine to map the volume, bundle it up and send it to the second machine.
That would look like this:
docker run --rm -v <SOURCE_DATA_VOLUME_NAME>:/from alpine ash -c \
"cd /from ; tar -cf - . " | \
ssh <TARGET_HOST> \
'docker run --rm -i -v <TARGET_DATA_VOLUME_NAME>:/to alpine ash -c "cd /to ; tar -xpvf - "'
You will need to change:
SOURCE_DATA_VOLUME_NAME
TARGET_HOST
TARGET_DATA_VOLUME_NAME
Or, you could try using this helper script https://github.com/gdiepen/docker-convenience-scripts
Hope this helps.
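Because the database is large and a running PostgreSQL server can change its data directory mid-copy, a variant of the pipeline above might stop the container first and compress the stream in transit. This is only a minimal sketch under those assumptions: the function name copy_volume and the container and host names in the example are hypothetical, and compression relies on ssh's -C flag rather than anything Docker-specific.

```shell
#!/bin/sh
# Hypothetical helper: stream a named volume to a remote Docker host over SSH.
# Usage: copy_volume <source_volume> <target_host> <target_volume>
copy_volume() {
    src_volume=$1
    target_host=$2
    dst_volume=$3

    # Left side: archive the volume contents (mounted read-only) to stdout.
    # Right side: ssh -C compresses the stream in transit, and the remote
    # container unpacks it into the target volume, preserving permissions (-p).
    docker run --rm -v "${src_volume}:/from:ro" alpine \
        ash -c "cd /from && tar -cf - ." |
    ssh -C "${target_host}" \
        "docker run --rm -i -v ${dst_volume}:/to alpine ash -c 'cd /to && tar -xpf -'"
}

# Example (all names are placeholders):
#   docker stop my-postgres-container
#   copy_volume postgres-data user@remote-host postgres-data
```

Stopping the container before the copy avoids archiving a data directory that is being written to; whether the extra compression helps depends on the network and on how compressible the data is.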

I had exactly the same problem, but in my case the two volumes were in separate VPCs and I couldn't expose SSH to the outside world. I ended up creating dvsync, which uses ngrok to create a tunnel between them and then uses rsync over SSH to copy the data. In your case you could start the dvsync-server on your machine:
docker run --rm -e NGROK_AUTHTOKEN="$NGROK_AUTHTOKEN" \
--mount source=postgres-data,target=/data,readonly \
quay.io/suda/dvsync-server
and then start the dvsync-client on the target machine:
docker run -e DVSYNC_TOKEN="$DVSYNC_TOKEN" \
--mount source=MY_TARGET_VOLUME,target=/data \
quay.io/suda/dvsync-client
The NGROK_AUTHTOKEN can be found in the ngrok dashboard, and the DVSYNC_TOKEN is printed by the dvsync-server to its stdout.
Once the synchronization is done, the dvsync-client container will stop.

Related

Restore backed up volume to another docker container (postgresql)

I backed up a postgres volume using this command:
sudo docker run --rm --volumes-from base -v $(pwd):/docker-volumes ubuntu tar cvf /docker-volumes/base_data.tar /var/lib/postgresql/data
It worked and then I transferred the .tar file to another server using scp.
Now I need the postgres container in the new server to restore this data.
The new container is running and called base.
~/docker-volumes/base_data.tar is where the backup file is on the new server.
/var/lib/postgresql/data is where this .tar must be unpacked inside the container.
I tried this:
sudo docker run --rm --volumes-from base -v $(pwd):/docker-volumes ubuntu bash -c "cd /var/lib/postgresql/data && tar xvf /docker-volumes/base_data.tar --strip 1"
I tried to tweak it in multiple ways, but it didn't work.
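One likely cause, offered as an assumption rather than a confirmed diagnosis: tar drops the leading "/" when creating the archive, so the members are stored as var/lib/postgresql/data/..., which is four leading path components, not one. The sketch below reproduces that layout in a temporary directory, without Docker, and shows that stripping four components puts the files directly into the target directory. The file names are made up for the demonstration.

```shell
#!/bin/sh
set -eu

# Scratch area standing in for the old and new servers.
work=$(mktemp -d)
mkdir -p "$work/src/var/lib/postgresql/data/base" "$work/restore"
echo "9.6" > "$work/src/var/lib/postgresql/data/PG_VERSION"

# The backup stores members as var/lib/postgresql/data/... (no leading slash).
tar -cf "$work/base_data.tar" -C "$work/src" var/lib/postgresql/data

# Dropping the four leading components (var, lib, postgresql, data)
# extracts the contents straight into the restore directory.
tar -xf "$work/base_data.tar" -C "$work/restore" --strip-components=4

ls "$work/restore"
```

Applied to the command in the question, that would mean replacing --strip 1 with --strip-components=4, or dropping the cd and extracting with -C / so the stored paths land in their original place. It is probably safest to do this while the postgres container is stopped.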

Docker & PostgreSQL- modified PostgreSQL image doesn't start with Docker run command

I want to build a PostgreSQL image that only adds some extra .sql files to be executed at startup.
Dockerfile:
FROM postgres:11.9-alpine
USER postgres
WORKDIR /
COPY ddl/*.sql /docker-entrypoint-initdb.d/
Then I build the image:
docker build -t my-postgres:1.0.0 -f Dockerfile .
And run the container:
docker run -d --name my-database \
-e POSTGRES_PASSWORD=abc123 \
-p 5432:5432 \
my-postgres:1.0.0
The output of it is the container id
33ed596792a80fc08f37c7c0ab16f8827191726b8e07d68ce03b2b5736a6fa4e
Checking the running containers returns nothing:
docker container ls
But if I explicitly start it, it works:
docker start my-database
With the original PostgreSQL image, docker run already starts the database. Why doesn't it after building my own image?
It turned out that one of the copied .sql files was failing to execute and, according to the image documentation, that causes the entrypoint script to exit. Fixing the SQL solved the issue, and the container then started normally with docker run.
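A failure like this can usually be spotted from the stopped container itself. The helper below is a small sketch using standard docker subcommands; the function name show_exit_reason is hypothetical, and the exact log text depends on the image.

```shell
#!/bin/sh
# Hypothetical helper: show why a detached container is no longer running.
show_exit_reason() {
    container=$1

    # Plain "docker container ls" hides exited containers; --all includes them.
    docker container ls --all --filter "name=${container}"

    # Exit code recorded by Docker for the container's main process.
    docker inspect --format '{{.State.ExitCode}}' "${container}"

    # Tail of the entrypoint output, where a failing .sql file would be reported.
    docker logs --tail 50 "${container}"
}

# Example: show_exit_reason my-database
```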

Don't find /var/lib/postgresql/data/ directory on ubuntu when created docker image

I found the following mentioned in many places:
docker run -d \
--name some-postgres \
-e POSTGRES_PASSWORD=mysecretpassword \
-e PGDATA=/var/lib/postgresql/data/pgdata \
-v /custom/mount:/var/lib/postgresql/data \
postgres
My only question is that I am unable to find the /var/lib/postgresql/data/pgdata directory itself. I don't see any postgresql directory under /var/lib. Why is that, and how does it work if the directory doesn't exist?
The -v in your command mounts /custom/mount on your host (the machine where you run the docker command) to the container's /var/lib/postgresql/data. So the pgdata you are looking for is at /custom/mount/pgdata on the host.
Of course, /custom/mount is only an example name; you have to replace it with your real directory.
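To confirm which host path backs which container path, the mounts can be read back from the container. This is a minimal sketch; the function name show_mounts is hypothetical, and the --format string is an ordinary Go template over the Mounts field that docker inspect reports.

```shell
#!/bin/sh
# Hypothetical helper: print "host source -> container destination" per mount.
show_mounts() {
    docker inspect \
        --format '{{range .Mounts}}{{println .Source "->" .Destination}}{{end}}' \
        "$1"
}

# Example: show_mounts some-postgres
```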

Docker volume does not persist data

Here is my Dockerfile:
FROM ubuntu:14.04
RUN apt-key adv --keyserver keyserver.ubuntu.com --recv-keys B97B0AFCAA1A47F044F244A07FCC7D46ACCC4CF8
RUN echo "deb http://apt.postgresql.org/pub/repos/apt/ precise-pgdg main" > /etc/apt/sources.list.d/pgdg.list
RUN apt-get update && apt-get -y -q install python-software-properties software-properties-common \
&& apt-get -y -q install postgresql-9.3 postgresql-client-9.3 postgresql-contrib-9.3
USER postgres
RUN /etc/init.d/postgresql start \
&& psql --command "CREATE USER pguser WITH SUPERUSER PASSWORD 'pguser';" \
&& createdb -O pguser pgdb
USER root
RUN echo "host all all 0.0.0.0/0 md5" >> /etc/postgresql/9.3/main/pg_hba.conf
RUN echo "listen_addresses='*'" >> /etc/postgresql/9.3/main/postgresql.conf
EXPOSE 5432
RUN mkdir -p /var/run/postgresql && chown -R postgres /var/run/postgresql
VOLUME ["/etc/postgresql", "/var/log/postgresql", "/var/lib/postgresql"]
USER postgres
CMD ["/usr/lib/postgresql/9.3/bin/postgres", "-D", "/var/lib/postgresql/9.3/main", "-c", "config_file=/etc/postgresql/9.3/main/postgresql.conf"]
Here is what I did...
I build the docker image:
docker build --rm=true -t my_image/postgresql:9.3 .
Then, I create a new directory called data in my current directory and run the following command:
docker run -i -t -v="data:/data" -p 5432:5432 my_image/postgresql:9.3
I open another terminal and enter the postgres shell by running:
psql -h my_docker_ip -p 5432 -U pguser -W pgdb
and I create a table:
pgdb=# create table test (test_id bigserial primary key);
I verify the table exists using \dt and exit the postgres shell.
I terminate the docker process and rerun the following:
docker run -i -t -v="data:/data" -p 5432:5432 my_image/postgresql:9.3
I enter the postgres shell again and run \dt.
I notice that:
there are no tables.
in the data directory there are no files.
I must be doing something wrong since I am assuming that the table I created will persist. Can someone point out my mistake?
Something confused me here, and I did not find it very clear in the official documentation.
To my knowledge, persistent volumes can be created in three ways.
At container invocation time, including a full path ( -v ~/database:/data ): makes a folder from the host available inside the docker container. Both can modify it.
At container invocation time, using a volume name ( -v datamysql:/data ): makes a persistent volume available inside the container. It is created if it does not exist. You can list such volumes by name with docker volume ls. Internally, it will be stored in a place such as /var/lib/docker/volumes/ae4445f7c9317a22fe84726fb894c47754f38a7fd150c00fd877024889968750/_data.
At container build time ( VOLUME ["/database/data"] in the Dockerfile ): every invocation of docker run will create a new volume that will persist even if you delete the container. This can be confusing because subsequent invocations create different volumes that are not reused.
You can list both named (second case) and unnamed (third case) volumes with
$ docker volume ls
DRIVER VOLUME NAME
local 064593b3e65977097d4d0c8402a6c633f1af69be2937bf118678ab8f97ee9a7e
local 4753ad0437d13e54c76d9c34a30a1843396a1866a0cf9237d500fdcca0d78c5f
local 8d7a35354f666b2e8a26866a35bbae36bb9601701d4c6b505ab8ce6629f69415
local db48eefe8f189b36107ca9c4eebb792690590ab0ba055e7e4e2c9adfd1765b7e
local datamysql
You can see the exact location of a container's volume by using docker inspect mycontainer:
{
"Type": "volume",
"Name": "8d7a35354f666b2e8a26866a35bbae36bb9601701d4c6b505ab8ce6629f69415",
"Source": "/media/USBdrive/docker/volumes/8d7a35354f666b2e8a26866a35bbae36bb9601701d4c6b505ab8ce6629f69415/_data",
"Destination": "/var/lib/mysql",
"Driver": "local",
"Mode": "",
"RW": true,
"Propagation": ""
},
It might be handy to remove unused volumes (especially for the third case):
$ docker volume prune
WARNING! This will remove all volumes not used by at least one container.
Are you sure you want to continue? [y/N] y
Deleted Volumes:
4753ad0437d13e54c76d9c34a30a1843396a1866a0cf9237d500fdcca0d78c5f
Total reclaimed space: 205MB
Because you used the VOLUME directive in your Dockerfile, you are in the third case. Inspect your container to look for the file, and specify the volume from the command line if you want repeated sessions to persist data.
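Specifying the volume on the command line could look like the sketch below. It assumes the image from the question and mounts a named volume on /var/lib/postgresql, one of the VOLUME paths in the Dockerfile above, instead of on /data, which PostgreSQL in this image does not write to. The volume name pgdata and the function name run_postgres are hypothetical.

```shell
#!/bin/sh
# Hypothetical volume name; a name without a slash is treated as a named volume.
PG_VOLUME=pgdata

run_postgres() {
    # Mount the named volume where the image actually keeps its data, so the
    # same volume (and the same tables) is reused on every run.
    docker run -i -t \
        -v "${PG_VOLUME}:/var/lib/postgresql" \
        -p 5432:5432 \
        my_image/postgresql:9.3
}

# Example: run_postgres   (stop it, run it again, and \dt should still list the table)
```

On first use, an empty named volume is initialized with the image's content at that path, so the cluster created during the build is carried over.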
Based on your comment:
the data persisted, but I still can't find the persist data in my host ./data directory
and running this command:
docker run -i -t -v="data:/data" -p 5432:5432 my_image/postgresql:9.3
You appear to be confusing a named volume and a host volume. The named volume is used when you give the volume a name without a path, like data. The named volume stores the data using the docker driver (typically local) under a given name that you can reuse. It has the advantage of being listed in docker volume ls, and being initialized to the content of the image at the mounted location.
If you include a full path, like /home/username/data, that mounts the directory from the docker host instead of using a named volume. The biggest disadvantage is that the directory is not initialized with the contents from the image, and you will likely encounter permission issues where the uid of the container process doesn't match the uid you use on your host.
For more details, see https://docs.docker.com/engine/tutorials/dockervolumes/

Why can't you start postgres in docker using "service postgres start"?

All the tutorials show running postgres like this:
docker run -d -p 5432 \
-t <your username>/postgresql \
/bin/su postgres -c '/usr/lib/postgresql/9.2/bin/postgres \
-D /var/lib/postgresql/9.2/main \
-c config_file=/etc/postgresql/9.2/main/postgresql.conf'
Why can't we have this in our Dockerfile:
ENTRYPOINT ["/etc/init.d/postgresql-9.2", "start"]
And simply start the container by
docker run -d psql
Is that not the purpose of Entrypoint or am I missing something?
The difference is that the init script provided in /etc/init.d is not an entry point. Its purpose is quite different: to get the actual server process started in the background, and then report success or failure to the caller. That script causes a postgres process to be started, usually indirectly via pg_ctl, detached from the controlling terminal.
For docker to work best, it needs to run the application directly, attached to the docker process. That way it can usefully and generically terminate the application when the user asks for it, or quickly discover and respond to the process crashing.
To illustrate what IfLoop said:
Using CMD in the Dockerfile:
USER postgres
CMD ["/usr/lib/postgresql/9.2/bin/postgres", "-D", "/var/lib/postgresql/9.2/main", "-c", "config_file=/etc/postgresql/9.2/main/postgresql.conf"]
To run:
$ docker run -d -p 5432:5432 psql
Watching the PostgreSQL logs:
$ docker logs -f POSTGRES_CONTAINER_ID