How to run a command once in Docker compose - mongodb

So I'm working on a docker compose file to deploy my Go web server. My server uses mongo, so I added a data volume container and the mongo service in docker compose.
Then I wrote a Dockerfile in order to build my Go project, and finally run it.
However, there is another step that must be done. Once my project has been compiled, I have to run the following command:
./my-project -setup
This will add some necessary information to the database, and the information only needs to be added once.
I can't, however, add this step to the Dockerfile (in the build process) because mongo must already be started.
So, how can I achieve this? Even if I restart the server and then run docker-compose up again, I don't want this command to be executed a second time.
I think I'm missing some Docker understanding, because I don't actually understand everything about data volume containers (are they just stopped containers that mount a volume?).
Also, if I restart the server, and then run docker-compose up, which commands will be run? Will it just start the same container that was now stopped with the given CMD?
In any case, here is my docker-compose.yml:
version: '2'
services:
  mongodata:
    image: mongo:latest
    volumes:
      - /data/db
    command: --break-mongo
  mongo:
    image: mongo:latest
    volumes_from:
      - mongodata
    ports:
      - "28001:27017"
    command: --smallfiles --rest --auth
  my_project:
    build: .
    ports:
      - "6060:8080"
    depends_on:
      - mongo
      - mongodata
    links:
      - mongo
And here is my Dockerfile to build my project image:
FROM golang
ADD . /go/src/my_project
RUN cd /go/src/my_project && go get
RUN go install my_project
RUN my_project -setup
ENTRYPOINT /go/bin/my_project
EXPOSE 8080

I suggest adding an entrypoint script to your container; in that script, you can check whether the database has been initialized and, if it hasn't, perform the required steps.
As you noticed in your question, the order in which services / containers are started cannot be taken for granted, so it's possible your application container starts before the database container; the script should take that into account.
As an example, have a look at the official WordPress image, which performs a one-time initialization of the database in its entrypoint script. The script attempts to connect to the database (and retries if the database cannot be contacted yet), and checks whether initialization is needed: https://github.com/docker-library/wordpress/blob/df190dc9c5752fd09317d836bd2bdcd09ee379a5/apache/docker-entrypoint.sh#L146-L171
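As a rough sketch of that pattern for the project above (the marker file, the /data mount, and the use of nc are assumptions, not part of your current setup):

#!/bin/bash
# docker-entrypoint.sh (sketch): one-time setup, then start the server
set -e

# Wait until mongo accepts connections; "mongo" is the compose service name.
# (Assumes nc is available in the image; a retrying client connection works too.)
until nc -z mongo 27017; do
  echo "waiting for mongo..."
  sleep 1
done

# Run the one-time setup, guarded by a marker file on a mounted volume.
if [ ! -f /data/.setup_done ]; then
  /go/bin/my_project -setup
  touch /data/.setup_done
fi

exec /go/bin/my_project

The Dockerfile would then COPY this script in and use it as the ENTRYPOINT instead of the binary directly. Checking the database itself, as the WordPress script does, is more robust than a marker file.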
NOTE
I notice you created a "data-only container" to attach your volume to. Since Docker 1.9, Docker has built-in volume management, including named volumes. Because of this, you no longer need "data-only" containers.
You can remove the data-only container from your compose file, and change your mongo service to look something like this;
mongo:
  image: mongo:latest
  volumes:
    - mongodata:/data/db
  ports:
    - "28001:27017"
  command: --smallfiles --rest --auth
This should create a new volume named mongodata if it doesn't exist, or re-use the existing volume with that name. You can list all volumes with docker volume ls and remove a volume with docker volume rm <some-volume> if you no longer need it.
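Note that with the version: '2' file format, the named volume also needs to be declared in a top-level volumes section of the compose file:

volumes:
  mongodata: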

You could try to use ONBUILD instruction:
The ONBUILD instruction adds to the image a trigger instruction to be executed at a later time, when the image is used as the base for another build. The trigger will be executed in the context of the downstream build, as if it had been inserted immediately after the FROM instruction in the downstream Dockerfile.
Any build instruction can be registered as a trigger.
This is useful if you are building an image which will be used as a base to build other images, for example an application build environment or a daemon which may be customized with user-specific configuration.
For example, if your image is a reusable Python application builder, it will require application source code to be added in a particular directory, and it might require a build script to be called after that. You can’t just call ADD and RUN now, because you don’t yet have access to the application source code, and it will be different for each application build. You could simply provide application developers with a boilerplate Dockerfile to copy-paste into their application, but that is inefficient, error-prone and difficult to update because it mixes with application-specific code.
The solution is to use ONBUILD to register advance instructions to run later, during the next build stage.
Here’s how it works:
When it encounters an ONBUILD instruction, the builder adds a trigger to the metadata of the image being built. The instruction does not otherwise affect the current build.
At the end of the build, a list of all triggers is stored in the image manifest, under the key OnBuild. They can be inspected with the docker inspect command.
Later the image may be used as a base for a new build, using the FROM instruction. As part of processing the FROM instruction, the downstream builder looks for ONBUILD triggers, and executes them in the same order they were registered. If any of the triggers fail, the FROM instruction is aborted which in turn causes the build to fail. If all triggers succeed, the FROM instruction completes and the build continues as usual.
Triggers are cleared from the final image after being executed. In other words they are not inherited by “grand-children” builds.
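As a rough illustration of the mechanics (the image and paths are hypothetical), a base image would register triggers like this, and they would fire during the build of any downstream image that uses it in FROM:

# Dockerfile for a hypothetical builder base image
FROM golang
# These do nothing now; they run during the downstream image's build:
ONBUILD ADD . /go/src/app
ONBUILD RUN cd /go/src/app && go get && go install app

Keep in mind the triggers still execute at build time of the downstream image, so on their own they don't help with a step that needs a running database.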

In docker-compose you can define:
restart: "no"
to run the container only once, which is useful, for example, for db-migration containers. (The no needs to be quoted so YAML doesn't parse it as a boolean.)
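A sketch of what that could look like for the setup step from the question above (the service name and command are assumptions):

setup:
  build: .
  command: /go/bin/my_project -setup
  restart: "no"
  depends_on:
    - mongo

Keep in mind that docker-compose up will still start this service again on subsequent runs, so the setup command itself should be safe to re-run, or guarded by a check like the ones discussed above.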

Your application needs some initial state to work. That means you should:
Check whether the required state already exists
Depending on the result of that check, initialize the state or skip it
You can write a program that checks the current database state (here I use a bash script, but it could be written in any other language):
if ./check.sh; then my_project -setup; fi
If the script returns 0 (a success exit status), the setup command is called. Since the check needs a running database, this line belongs in the container's entrypoint or CMD rather than in a build-time RUN instruction.
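A minimal sketch of such a check script against the Mongo service from this thread (the host name, database, and settings collection are assumptions, and authentication is ignored):

#!/bin/bash
# check.sh -- exit 0 when setup has NOT been done yet, non-zero otherwise
COUNT=$(mongo --host mongo --quiet --eval 'db.getSiblingDB("my_project").settings.count()')
if [ "$COUNT" -eq 0 ]; then
  exit 0   # nothing initialized yet: run "my_project -setup"
else
  exit 1   # already initialized: skip setup
fi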

Related

Docker - Waiting for Mongo to load before running create indexes script

I have a script that creates text indexes in MongoDB. However, if I use it with my current Dockerfile (which in turn is called from my docker-compose file), it results in an error because Mongo hasn't fully loaded.
I've seen depends_on, however that appears to apply to an entire service. I've also come across https://github.com/ufoscout/docker-compose-wait/, but I'd like to avoid using it if there's something much simpler.
Has anyone solved this: waiting for Mongo to load before RUN?
My dockerfile
FROM mongo:4
COPY ./scripts /scripts
RUN mongo<scripts/create-text-indexes
And docker-compose snippet:
dashboard-mongo:
  build:
    context: ./infrastructure
    dockerfile: Dockerfile
  image: a-custom-mongo:SNAPSHOT
  ports:
    - "27017:27017"
  networks:
    default:
    cms:
      aliases:
        - dashboard-mongo
And the script I'm running looks like:
use customboard
print("---> Deleting existing indexes...");
db.test.dropIndexes();
print("---> Creating indexes...");
db.test.createIndex({
  "_id": "text",
  "code": "text"
});
Thanks.
RUN mongo<scripts/create-text-indexes will not work: remember that the RUN instruction is for installation and configuration at build time, not for interacting with the database or starting the process. You should start the process in CMD or ENTRYPOINT.
One way is to use the official image (or clone the official repo) and take advantage of the official Docker image's entrypoint. Why reinvent the wheel?
With the official image, all you need is:
FROM mongo:4
COPY ./scripts /docker-entrypoint-initdb.d
Once you build and start the container, the official image will start the DB process, wait until it is able to handle connections, and then run the scripts.
Initializing a fresh instance
When a container is started for the first time it will execute files with extensions .sh and .js that are found in /docker-entrypoint-initdb.d. Files will be executed in alphabetical order. .js files will be executed by mongo using the database specified by the MONGO_INITDB_DATABASE variable, if it is present, or test otherwise. You may also switch databases within the .js script.
docker mongo image
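Applied to the script in the question, that means dropping the interactive use statement (it only works in the interactive shell) and saving the file with a .js extension, e.g. scripts/create-text-indexes.js, so the entrypoint picks it up. A sketch:

// create-text-indexes.js -- executed from /docker-entrypoint-initdb.d on first start
db = db.getSiblingDB("customboard");   // instead of "use customboard"
print("---> Deleting existing indexes...");
db.test.dropIndexes();
print("---> Creating indexes...");
db.test.createIndex({
  "_id": "text",
  "code": "text"
});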

docker-compose - issue restarting single service

I am using docker-compose for a development project. I have 6 services defined in my docker compose file. I have been using the below script to rebuild the images whenever I make a change.
#!/bin/bash
# file: rebuild.sh
docker-compose down
docker-compose build
docker-compose up
I am looking for a way to reduce the build time, as building and restarting all the services seems unnecessary when I am usually only changing one module. I see in the docker-compose docs you can run commands for individual services by appending the service name, e.g. docker-compose build myservice.
In another terminal window I tried docker-compose build myservice && docker-compose restart myservice while leaving the original ./rebuild.sh running in the first terminal. In the ./rebuild.sh terminal window I see all the initialization messages being reprinted to stdout, so I know that service is restarting, but the code changes aren't there. What am I doing wrong? I just want to rebuild and restart a single service.
Try:
docker-compose up -d --force-recreate --build myservice
Note that:
-d is for detached mode,
--force-recreate recreates containers even if your code did not change,
--build builds your images before starting containers,
and finally, myservice is the name of your service.
Take a look here.
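For the rebuild.sh workflow from the question, that can be wrapped in a small helper, for example:

#!/bin/bash
# file: rebuild-one.sh -- rebuild and recreate a single service
# usage: ./rebuild-one.sh myservice
docker-compose up -d --force-recreate --build "$1"
docker-compose logs -f "$1"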

How to safely stop/start my postgres server when using docker-compose

I sometimes stop/start Docker very often when I am releasing new features in my application.
docker-compose up -d
docker-compose stop
I am using pretty much the bare bones postgres docker setup (see below).
I am mapping the /data folder to my host.
Is there anything I should be worried about if I stop/start docker many times in a day in terms of data getting corrupted?
Is calling docker-compose stop the best way to be stopping my postgres instance?
My postgres service in my docker-compose looks like this:
db:
  image: postgres:9.4
  volumes:
    - "/home/deploy/data/pgdata:/var/lib/postgresql/data"
  restart: always
This setup currently is running smoothly in development, but once it goes to production I want to make sure I am following best practices etc.
Use:
docker-compose down -v
What it does is basically remove the volumes you added. If you don't, those volumes will hang around and eat up your space. It only removes volumes managed by Docker for the container; a directory mapped from your host stays and survives container removal, in case you want that data to outlive the container.
Whenever you create a container with docker run, Docker creates a directory to keep the details about that container. After you execute docker run, if you look into /var/lib/docker/containers, you will see one directory for each container you started. If you have not cleaned up after previous containers, you will see many directories under the containers directory, with very long random names of letters and numbers. If you don't tell Docker to remove them when you stop the container, they stay there forever. The -v option mentioned above deletes the associated volumes when you take the containers down.
Keep in mind that you can view the contents of /var/lib/docker only as the root user. To change to the root user, run sudo -i before you attempt to view the directory's contents.
Databases in particular are usually designed so that it's very hard to lose data, even if the machine loses power in the middle of writing something to disk. (This comes at some performance cost.) So long as you don't have more than one PostgreSQL instance at a time using the same backing data store, I'd expect it to not lose data or otherwise corrupt itself; the worst you should expect to see is a message at startup that it's recovering from a write-ahead log or something along those lines.
docker stop will send a signal to a container that prompts it to shut down cleanly, and PostgreSQL will take this as a cue to shut down. It looks like docker-compose stop, docker-compose down, and sending ^C to docker-compose up all use the same mechanism. So the way you're doing it now should result in a clean shutdown (provided PostgreSQL finishes its cleanup within 10 seconds).
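If the default 10-second grace period is a concern, docker-compose stop accepts a longer timeout (and newer Compose file formats also have a per-service stop_grace_period option), for example:

# give PostgreSQL up to 60 seconds to shut down cleanly
docker-compose stop -t 60 db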
I believe you can docker-compose restart specific services, or docker-compose up --force-recreate them. This would help if you rebuilt your application container and needed to restart that, but not its database.
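For example, to rebuild and recreate only the application container without touching the database, something like this should work (the service name app is an assumption):

docker-compose build app
# --no-deps recreates only this service, leaving its dependencies (the db) running
docker-compose up -d --no-deps app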

When are docker-compose on-the-fly volumes reused vs. recreated?

I have a docker-compose.yml like this:
version: '2'
services:
  app:
    build: .
    volumes:
      - /usr/src/app
If I do docker-compose up, then any changes I make to the /usr/src/app are persisted across runs. I can control+C and then docker-compose up, and the contents are still there.
But if I do docker-compose run app ls -la /usr/src/app, then the path is always empty.
My goal is that I'd like to have that volume 1) automatically created on the fly for me, 2) specific to this docker-compose project (since I'll have many others), and 3) persist across docker-compose up/run/etc.
I think one way around this is to use named volumes, which will automatically pull the name of my docker-compose project.
But with on-the-fly containers, is this the expected behavior? They persist automatically for docker-compose up, and are recreated from scratch for each docker-compose run?
Also, is there any documentation that makes clear the lifetime of on-the-fly containers?
Thanks!
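For reference, the named-volume approach mentioned above would look something like this (the volume name appdata is made up); Compose prefixes the volume with the project name, so it stays specific to this project and is reused by both up and run containers:

version: '2'
services:
  app:
    build: .
    volumes:
      - appdata:/usr/src/app
volumes:
  appdata: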

Why doesn't postgres official docker repo start db service at build time?

In the context of https://github.com/docker-library/postgres (GitHub repo) and https://registry.hub.docker.com/_/postgres/ (Docker Hub):
It can be seen that the database is started by the ENTRYPOINT and CMD via the bash script
/docker-entrypoint.sh
with
ENTRYPOINT ["/docker-entrypoint.sh"]
EXPOSE 5432
CMD ["postgres"]
Another script hook provided to customize the database is
/docker-entrypoint-initdb.d
which means the database starts (and psql becomes usable) only at run time, when the docker run command is issued.
This causes a problem: we cannot customize the database before it runs, i.e. at build time, for example to add extensions and populate the db with data.
Of course, it can be done at run time. But that has the drawback of repeating the operation every time the image is run.
So, what is the logic behind this design from the Docker or Postgres perspective? How could I add extensions and populate data at build time?
If you were to customize (create, populate data) a database at build time, that would imply that the database data is written into the docker image filesystem itself (as one cannot mount a volume at build time).
The issue with that is that the docker image filesystem is a special one (AUFS or btrfs, etc.) which doesn't deliver good I/O performance for data-intensive applications such as a database server.
As a consequence, you want to have your data written on a volume instead of on the docker container filesystem. As you don't know at build time which volume would be used at run time, and as there is no means anyway to mount volumes at build time, no one should create a database at build time.
Furthermore, if you take a close look at the Dockerfile of the official PostgreSQL image, you will see that there is a VOLUME instruction that makes the path at which the data is written a volume. That means the image is designed so that the data will never hit the docker container filesystem.
If you take a look at other Dockerfiles for other databases or data-intensive applications, you will notice that they all operate in this manner. Another reason for that is that it is accepted as a good practice to make your docker containers immutable.
If you want to install additional modules to your image, it is fine as long as those do not depend on data that would be written on a volume, and as long as you make sure to declare a volume for any path they would write data on.
tl;dr
Application code/binary → docker image filesystem
Application data → docker volume
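As a sketch of that split, a derived image could install an extension's software at build time and leave any data-touching work to the runtime init hook (the package and script names below are only placeholders):

FROM postgres:9.4
# Build time: install software only (a hypothetical extension package).
RUN apt-get update \
    && apt-get install -y --no-install-recommends postgresql-9.4-some-extension \
    && rm -rf /var/lib/apt/lists/*
# Run time: anything in this directory runs once, after initdb, on first start
# (e.g. an init.sql containing CREATE EXTENSION and seed data).
COPY init.sql /docker-entrypoint-initdb.d/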
This is right from the docker page for the postgres image (library/postgres):
If you would like to do additional initialization in an image derived from this one, add a *.sql or *.sh script under /docker-entrypoint-initdb.d (creating the directory if necessary). After the entrypoint calls initdb to create the default postgres user and database, it will run any *.sql files and source any *.sh script found in that directory to do further initialization before starting the service.
You can also extend the image with a simple Dockerfile to set the locale. The following example will set the default locale to de_DE.utf8:
FROM postgres:9.4
RUN localedef -i de_DE -c -f UTF-8 -A /usr/share/locale/locale.alias de_DE.UTF-8
ENV LANG de_DE.utf8
Since database initialization only happens on container startup, this allows us to set the language before it is created.
You have the ability to extend an image just as the example from the docs pasted above shows. You can also use the exec command to execute virtually anything within the container right from your host machine. It took me a little while to get used to it; I keep discovering things as I play with it more and more.
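For instance, with the container from the docker run example in the update below, you can drop into psql or run a one-off statement straight from the host:

# interactive psql session inside the running container
docker exec -it some-postgres psql -U postgres
# or run a single statement non-interactively
docker exec some-postgres psql -U postgres -c "SELECT version();"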
UPDATE:
sudo docker run --name some-postgres -v ~/PATH/TO/some-postgres/data:/var/lib/postgresql/data -p 127.0.0.1:5432:5432 -e POSTGRES_PASSWORD=test -d postgres