Docker Volume Data is not Persistent - docker-compose

I want to create two Docker volumes and have their data be persistent. I run sudo docker compose up -d, post some data to my website (text that is stored in a sqlite database and an image stored in the filesystem), then run sudo docker compose down. When I run sudo docker compose up -d again, all the data I posted is gone. With the following configs, I expect the data to still be present.
Dockerfile:
FROM python:3.9.16-buster
RUN pip install --upgrade pip
# The Debian/Buster default is to disable the password.
RUN adduser nonroot
RUN mkdir /home/site/ && chown -R nonroot:nonroot /home/site
RUN chown -R nonroot:nonroot /var/log/site
# two volumes created
VOLUME /home/site/db /home/site/static
WORKDIR /home/site
USER nonroot
# folders ./site/static and ./site/db exist in my host directory
COPY --chown=nonroot:nonroot . .
CMD ["python", "./site/main.py"]
compose.yaml:
services:
  site:
    build: flask
    restart: always
    ports:
      - '8081:8081'
    volumes:
      - site_db:/home/site/db # same path as the volumes created in the Dockerfile
      - site_static:/home/site/static
    command: gunicorn -w 1 -t 3 -b 0.0.0.0:8081 --chdir ./site main:app
volumes:
  site_db: # I find it odd these volume keys don't have values, but that's what I have seen other people do
  site_static:
docker compose up and docker compose down delete my volumes.
docker compose start and docker compose stop do NOT delete my volumes.

In the Flask app, check where you are uploading the files to, as well as where the sqlite3 db file is created. If those paths do not align with the volume mount paths, the data will not persist.
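For illustration, here is a minimal sketch of what that alignment could look like inside the Flask app; the UPLOAD_FOLDER/DB_PATH names and the site.db filename are hypothetical examples, the point being that both paths sit inside the directories mounted as the site_static and site_db volumes:
import os
import sqlite3
from flask import Flask

app = Flask(__name__)

# Both locations live inside the volume mount points, so writes survive
# `docker compose down` followed by `docker compose up`.
app.config["UPLOAD_FOLDER"] = "/home/site/static"  # images land on the site_static volume
DB_PATH = "/home/site/db/site.db"                  # sqlite file lands on the site_db volume

def get_db():
    os.makedirs(os.path.dirname(DB_PATH), exist_ok=True)
    return sqlite3.connect(DB_PATH)
If, for example, the app instead writes to a relative path such as ./uploads, those files end up in the container's writable layer and disappear as soon as the container is removed.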

Related

Postgres running in docker-compose unable to write data to mounted volume

Description
I am running a Postgres container in docker-compose. I am mounting the /data directory needed by Postgres into the container using a volume in the docker-compose.yml below.
Qualifications
The Postgres user must be called graph-node and create a database called graph-node
I delete the data/postgres/ folder before each docker-compose up using the boot.sh script below for application-specific reasons. Just know that /data/postgres is re-created on each run of docker-compose up.
Expected Behavior
Postgres boots and writes all files it needs to the mounted /data/postgres volume.
Actual Behavior
Postgres boots fine, but writes nothing to the volume.
Possible Reasons
This feels like a read/write permissions problem. I've added :rw as the third field of the volume as suggested, still no cigar. I also run chmod -R a+rwx ./data on the data dir to grant access to all files recursively.
The oddest thing is that if I manually run chmod -R a+rwx ./data after booting, Postgres suddenly IS able to write all the files it needs to the directory. But if I run this before the directory is created, as seen below (recursively for everything in ./data), it does not work.
Files
boot.sh
# Check for data/ dir. If found, make it recursively rwx for all users. Otherwise, create it and make it recursively rwx for all users.
if [ -d "./data" ]
then
chmod -R a+rwx ./data
else
mkdir data
chmod -R a+rwx ./data
fi
if [ -d "./data/postgres" ]
then
rm -rf data/postgres
else
echo "No data/postgres dir found. Proceeding"
fi
docker-compose -f docker-compose.yml up
docker-compose.yml
version: "3"
services:
postgres:
image: postgres
ports:
- '5432:5432'
command:
[
"postgres",
"-cshared_preload_libraries=pg_stat_statements"
]
environment:
POSTGRES_USER: graph-node
POSTGRES_PASSWORD: let-me-in
POSTGRES_DB: graph-node
volumes:
- ./data/postgres:/var/lib/postgresql/data:rw
Machine + Software Specs
Operating System: Windows 10, WSL2, Ubuntu
Docker Version: 20.10.7 (running directly on the machine since it's Ubuntu, NOT in Docker Desktop like on a Mac)
Well, not exactly an answer, but because I only needed one-run ephemeral storage for Postgres (I was deleting the data/ dir between runs anyway), I solved the problem by simply removing the external volume and letting Postgres write data inside the container itself, where it certainly has the privileges it needs.
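As a rough sketch of that ephemeral setup (assuming the same service definition as above, only with the volumes entry removed), the compose file could look like this:
version: "3"
services:
  postgres:
    image: postgres
    ports:
      - '5432:5432'
    command:
      [
        "postgres",
        "-cshared_preload_libraries=pg_stat_statements"
      ]
    environment:
      POSTGRES_USER: graph-node
      POSTGRES_PASSWORD: let-me-in
      POSTGRES_DB: graph-node
    # no volumes entry: Postgres writes to the image's default /var/lib/postgresql/data,
    # which is a fresh anonymous volume per container, matching the one-run, throwaway use case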

How to persist default data and running app data of mongo db with docker images and docker-compose

I have to create a mongo image with some default collections and data. I am able to create a mongo image with this data by following this link:
How to create a Mongo Docker Image with default collections and data?
So when I run the container I get the default data.
Now when I use the app, some more data is generated (by calling APIs), which gets saved in mongodb alongside the default data.
Now if for some reason the docker container is re-started, unfortunately all the run-time created data is gone and only the default data is left, even though I am saving data using volumes.
So how can I persist both the run-time data and the default data each time the container is restarted?
I am using following docker file and docker-compose file
Dockerfile :
FROM mongo
####### working: inserting data ##########
# Modify child mongo to use /data/db2 as dbpath (because /data/db wont persist the build)
RUN mkdir -p /data/db2 \
&& echo "dbpath = /data/db2" > /etc/mongodb.conf \
&& chown -R mongodb:mongodb /data/db2
COPY . /data/db2
RUN mongod --fork --logpath /var/log/mongodb.log --dbpath /data/db2 --smallfiles \
&& mongo 127.0.0.1:27017/usaa /data/db2/config-mongo.js \
&& mongod --dbpath /data/db2 --shutdown \
&& chown -R mongodb /data/db2
# Make the new dir a VOLUME to persists it
VOLUME /data/db2
CMD ["mongod", "--config", "/etc/mongodb.conf", "--smallfiles"]
and a part of docker-compose.yml
services:
  mongo:
    build: ./mongodb
    image: "mongo:1.2"
    container_name: "mongo"
    ports:
      - "27017:27017"
    volumes:
      - ${LOCAL_DIRECTORY}:/data/db2
    networks:
      - some-network
The reason may be that rebuilding the docker image creates the /data/db2 directory with only the default data defined in the .js file, but I'm not sure.
Please tell me what I am doing wrong, or suggest a new workflow for this problem.
Thanks much!
This is because docker containers are stateless by default. Each time you call docker run it creates a fresh container from the image. If you want some data to persist, you have 2 general approaches:
Do not remove the container after it exits. Just give a lovely name to your container when first starting it, like docker run --name jessica mongo, and then, on subsequent calls, use docker start jessica.
Use volumes to store data and share it between containers. In this case you will start your container with volume arguments, like docker run -v /home/data:/data mongo. Also, you will have to reconfigure your mongodb to save data in the path /data inside the container. This approach is easier and can be used to share data between different containers, as well as to provide default data for the first run.
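As a concrete sketch of the second approach with the official mongo image, whose default dbPath is already /data/db, it is usually simpler to mount the host directory straight onto that default path instead of reconfiguring mongod (the /home/data host path and the mongo-persist name are just examples):
# mount the host directory over mongo's default data directory
docker run -d --name mongo-persist -v /home/data:/data/db mongo

# remove the container; the database files stay on the host
docker stop mongo-persist && docker rm mongo-persist

# a brand new container picks the existing data right back up
docker run -d --name mongo-persist -v /home/data:/data/db mongo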
UPD
When using docker-compose to start the containers, if you need your data to persist between sessions, you can simply use external volumes, which you create in advance.
First create a volume, let's say lovely:
docker volume create lovely
Then use it in docker-compose.yml:
version: '3'
services:
  db1:
    image: whatever
    volumes:
      - lovely:/data
  db2:
    image: whatever
    volumes:
      - lovely:/data
volumes:
  lovely:
    external: true
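For completeness, a typical lifecycle with that compose file might look like the following; because the volume is declared external, docker-compose never removes it, so the data survives a full down/up cycle:
docker volume create lovely
docker-compose up -d
docker-compose down      # containers and networks are removed, the external volume is untouched
docker-compose up -d     # db1 and db2 see the same data again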

How can I keep changes I made to Postgresql Docker container?

I'm using the official postgresql docker image to start a container.
Afterwards, I install some software and use psql to create some tables etc. I am doing this by first starting the postgres container as follows:
docker run -it --name="pgtestcontainer" -e POSTGRES_PASSWORD=postgres -p 5432:5432 postgres:9.6.6
Then I attach to this container with
docker exec -it pgtestcontainer bash
and I install software, create db tables etc.
Afterwards, I first quit from the second terminal session (that I used to install software) and do a ctrl + c in the first one to stop the postgres container.
At this point my expectation is that if I commit this postgres container with
docker commit xyz...zxy pg-commit-test
and then run a new container based on the committed image with:
docker run -it --name="modifiedcontainer" -e POSTGRES_PASSWORD=postgres -p 5432:5432 pg-commit-test
then I should have all the software and tables in place.
The outcome of the process above is that the software I've installed is in the modifiedcontainer but the sql tables etc are gone. So my guess is my approach is more or less correct but there is something specific to postgres docker image I'm missing.
I know that it creates the db from scratch if no external directory or docker volume is bound to
/var/lib/postgresql/data
but I'm not doing that and after the commit I'd expect the contents of the db to stay as they are.
How do I follow the procedure above (or the right one) and keep the changes to database(s)?
The postgres Dockerfile creates a mount point at /var/lib/postgresql/data which you must mount an external volume onto if you want persistent data.
ENV PGDATA /var/lib/postgresql/data
RUN mkdir -p "$PGDATA" && chown -R postgres:postgres "$PGDATA" && chmod 777 "$PGDATA" # this 777 will be replaced by 700 at runtime (allows semi-arbitrary "--user" values)
VOLUME /var/lib/postgresql/data
https://docs.docker.com/engine/reference/builder/#notes-about-specifying-volumes
You can create a volume using
docker volume create mydb
Then you can use it in your container
docker run -it --name="pgtestcontainer" -v mydb:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 postgres:9.6.6
https://docs.docker.com/engine/admin/volumes/volumes/#create-and-manage-volumes
In my opinion, the best way is to create your own image with a /docker-entrypoint-initdb.d folder and your scripts inside.
Look at "How to extend this image" in the official image documentation.
But without a volume you can't (I think) save your data.
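A minimal sketch of that approach, assuming a hypothetical init.sql that contains your CREATE TABLE statements; everything in /docker-entrypoint-initdb.d/ runs only the first time the data directory is initialized:
FROM postgres:9.6.6
# Executed automatically on first startup against an empty data directory
COPY init.sql /docker-entrypoint-initdb.d/
Build and run it as usual; combined with a named volume on /var/lib/postgresql/data, the tables are created once and then persist across container restarts.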
I solved this by passing the PGDATA parameter with a value that is different from the path declared as a volume in the image, as suggested in one of the responses to this question.
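A rough sketch of what that could look like; /home/postgres/data is an arbitrary example path whose only important property is that it lies outside the VOLUME declared by the image, so the database files stay in the container filesystem and are captured by docker commit:
docker run -it --name="pgtestcontainer" \
  -e PGDATA=/home/postgres/data \
  -e POSTGRES_PASSWORD=postgres \
  -p 5432:5432 postgres:9.6.6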

Initialize data on dockerized mongo

I'm running a dockerized mongo container.
I'd like to create a mongo image with some initialized data.
Any ideas?
A more self-contained approach:
create javascript files that initialize your database
create a derived MongoDB docker image that contains these files
There are many answers that use disposable containers or create volumes and link them, but this seems overly complicated. If you take a look at the mongo docker image's docker-entrypoint.sh, you see that line 206 executes /docker-entrypoint-initdb.d/*.js files on initialization using a syntax: mongo <db> <js-file>. If you create a derived MongoDB docker image that contains your seed data, you can:
have a single docker run command that stands up a mongo with seed data
have data persisted through container stops and starts
reset that data with docker stop, rm, and run commands
easily deploy with runtime schedulers like k8s, mesos, swarm, rancher
This approach is especially well suited to:
POCs that just need some realistic data for display
CI/CD pipelines that need consistent data for black box testing
example deployments for product demos (sales engineers, product owners)
How to:
Create and test your initialization scripts (grooming data as appropriate)
Create a Dockerfile for your derived image that copies your init scripts
FROM mongo:3.4
COPY seed-data.js /docker-entrypoint-initdb.d/
Build your docker image
docker build -t mongo-sample-data:3.4 .
Optionally, push your image to a docker registry for others to use
Run your docker image
docker run \
--name mongo-sample-data \
-p 27017:27017 \
--restart=always \
-e MONGO_INITDB_DATABASE=application \
-d mongo-sample-data:3.4
By default, docker-entrypoint.sh will apply your scripts to the test db; the above run command env var MONGO_INITDB_DATABASE=application will apply these scripts to the application db instead. Alternatively, you could create and switch to different dbs in the js file.
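For example, a hypothetical seed-data.js could select its target database explicitly instead of relying on MONGO_INITDB_DATABASE:
// seed-data.js (hypothetical): switch to the target database, then insert seed documents
db = db.getSiblingDB("application");
db.clients.insertMany([
  { name: "acme", active: true },
  { name: "globex", active: false }
]);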
I have a github repo that does just this - here are the relevant files.
With the latest release of the mongo docker image, something like this works for me.
FROM mongo
COPY dump /home/dump
COPY mongo_restore.sh /docker-entrypoint-initdb.d/
The mongo restore script looks like this:
#!/bin/bash
# Restore from dump
mongorestore --drop --gzip --db "<RESTORE_DB_NAME>" /home/dump
and you could build the image normally.
docker build -t <TAG> .
First create a docker volume
docker volume create --name mongostore
then create your mongo container
docker run -d --name mongo -v mongostore:/data/db mongo:latest
The -v switch here is responsible for mounting the volume mongostore at the /data/db location, which is where mongo saves its data. The volume is persistent (on the host). Even with no containers running you will see your mongostore volume listed by
docker volume ls
You can kill the container and create a new one (same line as above) and the new mongo container will pick up the state of the previous container.
Initializing the volume
Mongo initializes a new database if none is present. This is responsible for creating the initial data in the mongostore. Let's say that you want to create a brand new environment using a pre-seeded database. The problem becomes how to transfer data from your local environment (for instance) to the volume before creating the mongo container. I'll list two cases.
Local environment
You're using either Docker for Mac/Windows or Docker Toolbox. In this case you can easily mount a local drive to a temporary container to initialize the volume. Eg:
docker run --rm -v /Users/myname/work/mongodb:/incoming \
-v mongostore:/data alpine:3.4 cp -rp /incoming/* /data
This doesn't work for cloud storage. In that case you need to copy the files.
Remote environment (AWS, GCP, Azure, ...)
It's a good idea to tar/compress things up to speed the upload.
tar czf mongodata.tar.gz /Users/myname/work/mongodb
Then create a temporary container to untar and copy the files to the mongostore. The tail -f /dev/null just makes sure that the container doesn't exit.
docker run -d --name temp -v mongostore:/data alpine:3.4 tail -f /dev/null
Copy files to it
docker cp mongodata.tar.gz temp:.
Untar and move to the volume
docker exec temp sh -c "tar xzf mongodata.tar.gz && cp -rp mongodb/* /data"
Cleanup
docker rm temp
You could also copy the files to the remote host and mount them from there, but I tend to avoid interacting with the remote host at all.
Disclaimer. I'm writing this from memory (no testing).
Here is how it's done with docker-compose. I use an older mongo image, but docker-entrypoint.sh accepts *.js and *.sh files for all versions of the image.
docker-compose.yaml
version: '3'
services:
  mongo:
    container_name: mongo
    image: mongo:3.2.12
    ports:
      - "27017:27017"
    volumes:
      - mongo-data:/data/db:cached
      - ./deploy/local/mongo_fixtures:/fixtures
      - ./deploy/local/mongo_import.sh:/docker-entrypoint-initdb.d/mongo_import.sh
volumes:
  mongo-data:
    driver: local
mongo_import.sh:
#!/bin/bash
# Import from fixtures
mongoimport --db wcm-local --collection clients --file /fixtures/properties.json && \
mongoimport --db wcm-local --collection configs --file /fixtures/configs.json
And my mongo_fixtures JSON files are the product of mongoexport, and have the following format:
{"_id":"some_id","field":"value"}
{"_id":"another_id","field":"value"}
This should help those using the image straight away, without a custom Dockerfile, with the right entrypoint setup directly in the docker-compose file. Cheers!
I've found a way that is somewhat easier for me.
Say you have a database in a docker container on your server, and you want to back it up, here’s what you could do.
What might differ between your setup and mine is the name of your mongo docker container [mongodb] (the default when using elastic_spence). So make sure you start your container first with --name mongodb to match the following steps:
$ docker run \
--rm \
--link mongodb:mongo \
-v /root:/backup \
mongo \
bash -c 'mongodump --out /backup --host $MONGO_PORT_27017_TCP_ADDR'
And to restore the database from a dump.
$ docker run \
--rm \
--link mongodb:mongo \
-v /root:/backup \
mongo \
bash -c 'mongorestore /backup --host $MONGO_PORT_27017_TCP_ADDR'
If you need to download the dump from your server you can use scp:
$ scp -r root@IP:/root/backup ./backup
Or upload it:
$ scp -r ./backup root@IP:/root/backup
P.S: Original source by Tim Brandin available at https://blog.studiointeract.com/mongodump-and-mongorestore-for-mongodb-in-a-docker-container-8ad0eb747c62
Thank you!

Postgresql raises 'data directory has wrong ownership' when trying to use volume

I'm trying to run postgresql in a docker container, and of course I need my database data to be persistent, so I'm using a data-only container which exposes a volume to store the database in.
So, my data container has such Dockerfile:
FROM ubuntu
# Create data directory
RUN mkdir -p /data/postgresql
# Create /data volume
VOLUME /data/postgresql
Which I run:
docker run --name postgresql_data lyapun/postgresql_data true
In my postgresql.conf I set:
data_directory = '/data/postgresql'
Then I run my postgresql container in such way:
docker run -d --name postgre --volumes-from postgresql_data lyapun/postgresql
And I got:
2014-07-04 07:45:57 GMT FATAL: data directory "/data/postgresql" has wrong ownership
2014-07-04 07:45:57 GMT HINT: The server must be started by the user that owns the data directory.
How do I deal with this issue? I googled a lot to find some information about using postgresql with docker volumes, but I didn't find anything.
Thanks!
Ok, it seems like I found a workaround for this issue.
Instead of running postgres in such way:
CMD ["/usr/lib/postgresql/9.1/bin/postgres", "-D", "/var/lib/postgresql/9.1/main", "-c", "config_file=/etc/postgresql/9.1/main/postgresql.conf"]
I wrote bash script:
#!/bin/bash
# Give the postgres user ownership of the mounted data directory, then start postgres as that user
chown -Rf postgres:postgres /data/postgresql
chmod -R 700 /data/postgresql
sudo -u postgres /usr/lib/postgresql/9.1/bin/postgres -D /var/lib/postgresql/9.1/main -c config_file=/etc/postgresql/9.1/main/postgresql.conf
And replaced CMD in postgresql image to:
CMD ["bash", "/run.sh"]
It works!
You have to set ownership of the /data/postgresql directory to the same user that runs your postgresql binary. For example, in Ubuntu it is usually the postgres user.
Then you have to use this command:
chown postgres:postgres /data/postgresql
A better way to solve that issue, assuming your postgres image is named "postgres" and that your backup is ./backup.tar:
First, add this to your postgres Dockerfile:
VOLUME ["/etc/postgresql", "/var/log/postgresql", "/var/lib/postgresql"]
Then run:
docker run -it --name postgres -v $(pwd):/db postgres sh -c "tar xvf /db/backup.tar --no-overwrite-dir" && \
docker run -it --name data --volumes-from postgres busybox true && \
docker rm postgres && \
docker run -it --name postgres --volumes-from=data postgres
You don't have permission issues since the archive is extracted by the postgres user of your postgres image, so it is the owner of the extracted files.
You can then backup your data using the data container. The advantage of this solution is that you don't chmod/chown every time you run the image.
This type of error is quite common when you bind-mount an NTFS directory into your docker container. NTFS directories don't support ext3 file & directory access control.
The only way to make it work is to mount a directory from an ext3 drive into your container.
I got a bit desperate when I was playing around with Apache/PHP containers, linking the www folder. After I switched to files residing on an ext3 filesystem, the problem disappeared.
I published a short Docker tutorial on youtube, may it helps to understand this problem: https://www.youtube.com/watch?v=eS9O05TTFjM