When are docker-compose on-the-fly volumes reused vs. recreated? - docker-compose

I have a docker-compose.yml like this:
version: '2'
services:
  app:
    build: .
    volumes:
      - /usr/src/app
If I do docker-compose up, any changes I make to /usr/src/app are persisted across runs. I can press Ctrl+C and then run docker-compose up again, and the contents are still there.
But if I do docker-compose run app ls -la /usr/src/app, then the path is always empty.
My goal is that I'd like to have that volume 1) automatically created on the fly for me, 2) specific to this docker-compose project (since I'll have many others), and 3) persist across docker-compose up/run/etc.
I think one way around this is to use named volumes, which will automatically pull the name of my docker-compose project.
But with on-the-fly volumes, is this the expected behavior? Do they persist automatically for docker-compose up, and get recreated from scratch for each docker-compose run?
Also, is there any documentation that makes clear the lifetime of on-the-fly volumes?
Thanks!

Related

My docker postgres is persisting data even between stopping the container and deleting all volumes

This is a bit of an odd question: There are a ton of articles on how to persist docker data between sessions but I DO NOT want the docker data to be persisted between sessions, yet postgres tables and rows seem to be preserved between taking down the docker image and starting it.
Here's my dockerfile:
FROM postgres
ENV POSTGRES_PASSWORD notasecret
ENV POSTGRES_USER my_app
ENV POSTGRES_DB my_app_test
WORKDIR /usr/src/app
RUN apt-get update
RUN apt-get install -y gzip
COPY . .
ADD init.sql.gz /docker-entrypoint-initdb.d/
The init.sql.gz added at the bottom of the file contains my integration test fixtures, which I want the database to be loaded with at the start of the test run.
Here's my docker-compose file:
version: "3.8"
services:
  my_app_test_postgres:
    build: .
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
    ports:
      - "7432:5432"
OK, here's what I see. This docker image is intended to be used as a postgres instance for integration testing. At the start of the test, the database should contain exactly and only the initialization data contained in init.sql.gz.
When I run docker-compose up, the database starts as expected. But when I stop the instance and restart it, I see that changes from the test are persisted. In fact, the changes continue to persist even if I delete the docker image and remove the volume!
Here's how I'm resetting the database between test runs:
docker kill $(docker ps -q)
docker rm -f $(docker ps -a -q)
docker volume rm $(docker volume ls -q)
docker-compose up
If I understand properly, these commands should delete both all docker containers and all attached volumes. Yet when I restart the container, the changes are still there.
Another oddity - I've tried replacing init.sql.gz with a different init.sql.gz to see if schema changes are picked up when I rebuild the image. As it turns out, replacing the init.sql.gz has no effect whatsoever! Even if I completely exclude it from the image, I keep getting the database tables from the previous test run.
I've discovered that if I completely nuke the docker cache, and delete ALL cached images, then rebuild from scratch, that seems to solve the issue - but I don't want to do that every time.
What's the right way to tell docker, "Start this container from scratch, and delete all persistent data"? I am mystified as to how the data is being persisted, as I am deleting the volumes between test runs!

Docker container does not receive new files (updates) when rebuilding container that has a named volume

I have a Docker container that I build from a Dockerfile via docker-compose. It has a named volume; on the first build, a file is copied into /state/config.
All is well: while the container is running, /state/config receives more data from a process I have running.
The volume is set up like so:
volumes:
  - config_data:/state/config
In the Dockerfile I use COPY like so:
COPY --from=builder /src/runner /state/config/runner
So, as I say, on the first run, when no container or volume exists, /state/config receives the "runner" file, and more data is added to the same directory while the container is running.
Now, I don't wish to destroy the volume, but if I rebuild the container using docker build or docker-compose build --no-cache, the volume stays (which is what I want) but the runner is NOT updated.
I even tried to exec into the container, remove runner, and rebuild the container again; now the file is not copied at all.
Why is this happening?
I think I may have a workaround: place the file in the container using a temporary (anonymous) volume rather than a named volume, so that the next time the container is re-created the file is recopied.
But I am confused about why it's happening.
Can anybody help?

How to safely stop/start my postgres server when using docker-compose

I sometimes stop and start Docker very often when I am releasing new features in my application.
docker-compose up -d
docker-compose stop
I am using pretty much the bare bones postgres docker setup (see below).
I am mapping the /data folder to my host.
Is there anything I should be worried about if I stop/start docker many times in a day in terms of data getting corrupted?
Is calling docker-compose stop the best way to be stopping my postgres instance?
My postgres service in my docker-compose looks like this:
db:
  image: postgres:9.4
  volumes:
    - "/home/deploy/data/pgdata:/var/lib/postgresql/data"
  restart: always
This setup currently is running smoothly in development, but once it goes to production I want to make sure I am following best practices etc.
Use:
docker-compose down -v
This stops and removes the project's containers, and the -v flag also removes the named volumes declared in the volumes section of the Compose file and any anonymous volumes attached to the containers. If you don't remove them, those volumes hang around and eat up disk space. Only Docker-managed volumes are removed; a host directory that is bind-mounted into the container (such as /home/deploy/data/pgdata in your setup) stays on the host and survives container removal, which is what you want if that data should outlive the container.
Whenever you create a container with docker run, Docker creates a directory to keep that container's metadata and logs. After you execute docker run, if you look into /var/lib/docker/containers (on a default Linux installation), you will see one directory for each container you created; volumes are kept separately under /var/lib/docker/volumes. The container directories are named after the container IDs, which are long strings of random-looking letters and digits. If you never remove stopped containers and their volumes, these directories accumulate and stay there forever. docker-compose down removes the project's containers, and the -v option mentioned above removes their volumes as well.
Keep in mind that you can view the contents of /var/lib/docker only as the root user. To switch to the root user, run sudo -i before you attempt to view the contents of the directory.
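As a rough sketch of how one might check for leftover volumes before and after the command above: the helper below only wraps docker volume ls with its dangling filter. The function name list_dangling_volumes is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the names of volumes that no container references.
# Uses only `docker volume ls` with the `dangling=true` filter; -q prints names only.
list_dangling_volumes() {
    docker volume ls -q -f dangling=true
}

# Example usage (commented out so that sourcing this file has no side effects):
#   list_dangling_volumes      # volumes left behind by removed containers
#   docker-compose down -v     # remove this project's containers and volumes
#   list_dangling_volumes      # the project's volumes should no longer be listed
```

Bind-mounted host directories never show up in this list, since Docker does not manage them as volumes.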
Databases in particular are usually designed so that it's very hard to lose data, even if the machine loses power in the middle of writing something to disk. (This comes at some performance cost.) So long as you don't have more than one PostgreSQL instance at a time using the same backing data store, I'd expect it to not lose data or otherwise corrupt itself; the worst you should expect to see is a message at startup that it's recovering from a write-ahead log or something along those lines.
docker stop will send a signal to a container that prompts it to shut down cleanly, and PostgreSQL will take this as a cue to shut down. It looks like docker-compose stop, docker-compose down, and sending ^C to docker-compose up all use the same mechanism. So the way you're doing it now should result in a clean shutdown (provided PostgreSQL finishes its cleanup within 10 seconds).
I believe you can docker-compose restart specific services, or docker-compose up --force-recreate them. This would help if you rebuilt your application container and needed to restart that, but not its database.
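If the 10-second grace period mentioned above is a concern, docker-compose stop accepts a -t/--timeout option. The following is a minimal sketch, assuming the service is called db as in the question; the helper name stop_service and the COMPOSE variable are inventions of this example.

```shell
#!/bin/sh
# Sketch: stop one Compose service with a longer grace period than the default.
# COMPOSE is an assumption of this example and can be overridden by the caller.
COMPOSE="${COMPOSE:-docker-compose}"

stop_service() {
    service="$1"
    timeout="${2:-60}"   # seconds to wait before the container is killed
    # -t/--timeout sets how long `stop` waits after sending the stop signal.
    $COMPOSE stop -t "$timeout" "$service"
}

# Example usage (commented out so that sourcing this file has no side effects):
#   stop_service db 120
```

A longer timeout only matters if PostgreSQL actually needs more time to finish its shutdown; for a small development database the default is usually enough.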

Postgres AWS EBS volume doesn't persist when updating service

I deploy a service on a standard Docker for AWS stack (using this template).
I deploy using docker stack deploy -c docker-compose.yml pos with this compose file:
version: "3.2"
services:
  postgres_vanilla:
    image: postgres
    volumes:
      - db-data:/var/lib/postgresql
volumes:
  db-data:
    driver: "cloudstor:aws"
    driver_opts:
      size: "6"
      ebstype: "gp2"
      backing: "relocatable"
I then change some data in the db and force an update of the service with docker service update --force pos_postgres_vanilla
Problem is that the data I change doesn't persist after the update.
I've noticed that postgres initdb script runs every time I update, so I assume it's related.
Is there something I'm doing wrong?
The issue was that cloudstor:aws creates the volume with a lost+found directory in it, so when postgres starts it finds that the data directory isn't empty and complains about it. To fix that I changed the volume to be mounted one directory above the data directory, at /var/lib/postgresql, but that caused postgres to not find the PG_VERSION file, which in turn caused it to run initdb every time the container starts (https://github.com/docker-library/postgres/blob/master/11/docker-entrypoint.sh#L57).
So to work around it, instead of changing the volume to be mounted one directory above the data directory, I changed the data directory to be one level below the volume mount by overriding environment variable PGDATA (to something like /var/lib/postgresql/data/db/).
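To illustrate why the location of the data directory matters, here is a small sketch of the kind of test described above: a directory counts as initialized only if it holds a non-empty PG_VERSION file. The function name pgdata_is_initialized is hypothetical and not taken from the image, and a temporary directory stands in for the mounted volume.

```shell
#!/bin/sh
# Hypothetical check: a data directory is treated as initialized only if it
# contains a non-empty PG_VERSION file.
pgdata_is_initialized() {
    [ -s "$1/PG_VERSION" ]
}

# A throwaway directory stands in for the volume, with the lost+found
# directory the volume driver creates and a subdirectory used as PGDATA.
volume="$(mktemp -d)"
mkdir -p "$volume/lost+found" "$volume/db"

# Fresh volume: the subdirectory is empty, so initdb would run once.
pgdata_is_initialized "$volume/db" || echo "db: initdb would run"

# Simulate a finished initdb inside the subdirectory.
echo "11" > "$volume/db/PG_VERSION"

# On the next start the marker file is found and the existing data is kept.
pgdata_is_initialized "$volume/db" && echo "db: existing data reused"

rm -rf "$volume"
```

In the compose file this corresponds to setting PGDATA under the service's environment key so that it points at a subdirectory of the mounted volume.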

How to run a command once in Docker compose

So I'm working on a docker compose file to deploy my Go web server. My server uses mongo, so I added a data volume container and the mongo service in docker compose.
Then I wrote a Dockerfile in order to build my Go project, and finally run it.
However, there is another step that must be done. Once my project has been compiled, I have to run the following command:
./my-project -setup
This will add some necessary information to the database, and the information only needs to be added once.
I can't however add this step on the Dockerfile (in the build process) because mongo must already be started.
So, how can I achieve this? Even if I restart the server and then run again docker-compose up I don't want this command to be executed again.
I think I'm missing some Docker understanding, because I don't actually understand everything about data volume containers (are they just stopped containers that mount a volume?).
Also, if I restart the server, and then run docker-compose up, which commands will be run? Will it just start the same container that was now stopped with the given CMD?
In any case, here is my docker-compose.yml:
version: '2'
services:
  mongodata:
    image: mongo:latest
    volumes:
      - /data/db
    command: --break-mongo
  mongo:
    image: mongo:latest
    volumes_from:
      - mongodata
    ports:
      - "28001:27017"
    command: --smallfiles --rest --auth
  my_project:
    build: .
    ports:
      - "6060:8080"
    depends_on:
      - mongo
      - mongodata
    links:
      - mongo
And here is my Dockerfile to build my project image:
FROM golang
ADD . /go/src/my_project
RUN cd /go/src/my_project && go get
RUN go install my_project
RUN my_project -setup
ENTRYPOINT /go/bin/my_project
EXPOSE 8080
I suggest adding an entrypoint script to your container; in this entrypoint script, you can check whether the database has been initialized and, if it hasn't, perform the required steps.
As you noticed in your question, the order in which services/containers are started should not be taken for granted: your application container may be started before the database container is ready, so the script should take that into account.
As an example, have a look at the official WordPress image, which performs a one-time initialization of the database in its entrypoint script. The script attempts to connect to the database (retrying if the database cannot be contacted yet) and checks whether initialization is needed: https://github.com/docker-library/wordpress/blob/df190dc9c5752fd09317d836bd2bdcd09ee379a5/apache/docker-entrypoint.sh#L146-L171
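The approach described above could be sketched roughly as follows. This is not the WordPress script, only an illustration of the same idea: retry until the database answers, then run the setup only if it is still needed. All names here (retry, init_once, DB_CHECK_CMD, NEEDS_SETUP_CMD, SETUP_CMD, DB_RETRIES, DB_RETRY_DELAY) are placeholders that you would replace with commands suited to your project.

```shell
#!/bin/sh
# Sketch of a one-time-setup entrypoint. Every name below is a placeholder.

# retry ATTEMPTS DELAY COMMAND...
# Run COMMAND until it succeeds, giving up after ATTEMPTS tries.
retry() {
    attempts="$1"; delay="$2"; shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# init_once: wait for the database, then run setup only if it is still needed.
# DB_CHECK_CMD    command that succeeds once the database accepts connections
# NEEDS_SETUP_CMD command that succeeds when the setup has NOT been done yet
# SETUP_CMD       the one-time setup, e.g. "/go/bin/my_project -setup"
init_once() {
    retry "${DB_RETRIES:-30}" "${DB_RETRY_DELAY:-2}" $DB_CHECK_CMD || {
        echo "database not reachable" >&2
        return 1
    }
    if $NEEDS_SETUP_CMD; then
        $SETUP_CMD
    fi
}

# In a real entrypoint the script would end with something like:
#   init_once && exec /go/bin/my_project
```

Because the check asks the database itself whether setup is needed, the result stays correct even when the application container is recreated.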
NOTE
I notice you created a "data-only container" to attach your volume to. Since docker 1.9, docker has volume management, including naming volumes. Because of this, you no longer need to use "data-only" containers.
You can remove the data-only container from your compose file, and change your mongo service to look something like this;
mongo:
  image: mongo:latest
  volumes:
    - mongodata:/data/db
  ports:
    - "28001:27017"
  command: --smallfiles --rest --auth
This should create a new volume named mongodata if it doesn't exist, or re-use the existing volume with that name. You can list all volumes using docker volume ls and remove a volume with docker volume rm <some-volume> if you no longer need it.
You could try to use the ONBUILD instruction:
The ONBUILD instruction adds to the image a trigger instruction to be executed at a later time, when the image is used as the base for another build. The trigger will be executed in the context of the downstream build, as if it had been inserted immediately after the FROM instruction in the downstream Dockerfile.
Any build instruction can be registered as a trigger.
This is useful if you are building an image which will be used as a base to build other images, for example an application build environment or a daemon which may be customized with user-specific configuration.
For example, if your image is a reusable Python application builder, it will require application source code to be added in a particular directory, and it might require a build script to be called after that. You can’t just call ADD and RUN now, because you don’t yet have access to the application source code, and it will be different for each application build. You could simply provide application developers with a boilerplate Dockerfile to copy-paste into their application, but that is inefficient, error-prone and difficult to update because it mixes with application-specific code.
The solution is to use ONBUILD to register advance instructions to run later, during the next build stage.
Here’s how it works:
When it encounters an ONBUILD instruction, the builder adds a trigger to the metadata of the image being built. The instruction does not otherwise affect the current build.
At the end of the build, a list of all triggers is stored in the image manifest, under the key OnBuild. They can be inspected with the docker inspect command.
Later the image may be used as a base for a new build, using the FROM instruction. As part of processing the FROM instruction, the downstream builder looks for ONBUILD triggers, and executes them in the same order they were registered. If any of the triggers fail, the FROM instruction is aborted which in turn causes the build to fail. If all triggers succeed, the FROM instruction completes and the build continues as usual.
Triggers are cleared from the final image after being executed. In other words they are not inherited by “grand-children” builds.
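In docker-compose you can define:
restart: "no"
so that the container is not restarted after it exits (the quotes keep YAML from reading no as a boolean). This is useful, for example, for db-migration containers that should run once and then stop.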
Your application needs some initial state to work. This means that you should:
Check whether the required state already exists
Depending on the result of the first step, initialize the state or skip the initialization
You can write a program that checks the current database state (here I use a shell script, but it can be a program in any other language):
RUN if ./check.sh; then my_project -setup; fi
In my case, if the script returns 0 (the success exit status), the setup command is called.