I deployed a stack to Docker Cloud (cloud.docker.com, not in swarm mode).
Everything is running fine, but I have a Postgres database and a separate container that contains scripts to initialize the database structure (I need certain tables). I only need to run this once, so I thought about executing this container as part of the stack.
However, it doesn't seem to be possible to run a single container (the docker-cloud container commands don't have a run subcommand).
Is there a way to execute one-off scripts in the stack?
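For context, the init container runs something along these lines (a sketch only; the service name db, the script path, and the credentials are placeholders, and the password is expected via PGPASSWORD from the stack environment):
#!/bin/sh
# One-shot init job: wait for the stack's postgres service, apply the DDL, then exit.
set -e
until pg_isready -h db -U postgres; do
  sleep 1
done
psql -h db -U postgres -v ON_ERROR_STOP=1 -f /scripts/create_tables.sql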
Related
I have a script that creates my database (a script with all the required DDL and inserts). My goal is to test that the script is correct and that the database is created successfully, without errors.
I decided to use the Docker image "postgres:latest" for this.
My question is: can I run the Docker image so that my script is applied (I know I can run my script by copying it to /docker-entrypoint-initdb.d/), and immediately after that the database shuts down and the container exits with code 0? I want to shut the database down so I can automate this process and check the exit code in a test script.
I'd also be glad to hear other suggestions for automating this process.
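Roughly, this is the kind of test wrapper I have in mind (a sketch only: as far as I understand, the official image's entrypoint runs the /docker-entrypoint-initdb.d scripts against a temporary server that only listens on a Unix socket, so the first successful TCP connection means initialization has finished; file names and the password are placeholders):
#!/bin/sh
# Sketch: apply create_db.sql via the official postgres image and turn the
# result into an exit code for the test suite.
set -e
CID=$(docker run -d \
  -e POSTGRES_PASSWORD=test \
  -v "$PWD/create_db.sql:/docker-entrypoint-initdb.d/create_db.sql:ro" \
  postgres:latest)
# Wait for the "real" server on TCP; if the container died in the meantime,
# the init script failed and the entrypoint exited with an error.
while ! docker exec "$CID" pg_isready -h 127.0.0.1 -U postgres >/dev/null 2>&1; do
  if [ "$(docker inspect -f '{{.State.Running}}' "$CID")" != "true" ]; then
    docker logs "$CID"
    docker rm "$CID" >/dev/null
    exit 1
  fi
  sleep 1
done
docker rm -f "$CID" >/dev/null   # the script applied cleanly; discard the container
echo "DDL script applied successfully"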
This question already has an answer here: Docker: Can a container A call an executable located on an other container B? (1 answer). Closed 10 months ago.
I have an Elixir/Phoenix server. Based on user requirements, the server generates Markdown and LaTeX files and then runs System.cmd("pandoc", [relevant commands]) to generate PDF files. Everything works fine on my local system, where the dependencies (pandoc and LaTeX) are installed locally.
Now I'm trying to dockerize the project.
I tried installing pandoc and LaTeX in the phoenix_server container and it worked fine, but the final Docker image size increased to 8.5 GB because TeX Live itself takes 7.5 GB, so that's not an option.
I found the pandoc/latex:2.18 image,
so my idea is to create three Docker containers and run docker-compose up:
container_1: Phoenix_server
container_2: postgres:14
container_3: pandoc/latex:2.18
But it didn't work.
Challenges:
1. Sharing the server-generated files with the pandoc/latex container; for this I'm thinking of using a Docker volume (see the sketch below).
2. I could not figure out how to run CLI commands from the Phoenix container in the pandoc/latex container.
Any help is greatly appreciated.
Thank you.
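For challenge 1, something like the following is what I imagine (a sketch only; the volume name, image name, and file names are placeholders, and as far as I can tell the pandoc/latex image uses pandoc itself as its entrypoint, so only the arguments are passed):
# Both containers mount the same named volume; the Phoenix server writes the
# generated .md/.tex files into it, and pandoc runs as a one-off container
# against those files. All names here are placeholders.
docker volume create render_data
docker run -d --name phoenix_server \
  -v render_data:/app/generated \
  my_phoenix_image
docker run --rm \
  -v render_data:/data \
  pandoc/latex:2.18 \
  /data/report.md -o /data/report.pdf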
To be fair, if you're not using distributed Erlang in this case, I would just install Elixir into the pandoc image and leave Postgres as an external container / managed DB, as you already do.
If you want to distribute work across containers the way you're trying to, I would probably add a queue (Postgres + Oban looks like a good option here) and start an Elixir process in the pandoc image that does the rendering and saves the output to the shared volume.
I'm using Google Cloud Run to host some solutions. When the containers start, programs can write to disk, and the data persists until the container stops. However, from a system point of view, all partitions of the container always report zero free space. I confirmed this in a few ways:
Running df from start.sh shows zero free space when the container starts
Deleting a large file and then running df from start.sh still shows zero free space
It is possible to write to disk via start.sh, PHP scripts, etc., so the system DOES have space to write to (in memory), yet df still reports zero free space.
(All of the above apply once the container is deployed to Cloud Run. Manually running the same container via docker from Cloud Shell and executing df reports free space.)
The problem is that certain applications perform disk space checks when they start, and they fail to load in Google Cloud Run. For example, MariaDB uses df in its init script, so commenting out these lines makes it possible to add a static yet functional MariaDB instance to a Cloud Run container.
MariaDB made it easy. Now, I'm trying to do the same thing with PostgreSQL and RabbitMQ, but I'm having trouble figuring out how to override their disk space checks. Here are the two options I am considering:
Keep digging through the source of PostgreSQL and RabbitMQ until I find the disk space check and override it. I don't speak Erlang, so this is a pain, and I would have to do it for every application with this issue
Programs are probably using coreutils (df) to determine disk size. I could edit its source and rebuild it as part of my Dockerfile routine so that the system always reports free space available (which could have unintended side effects)
Is anyone either familiar with the source of Postgres or RabbitMQ or have a system-wide solution that I could implement that would "spoof" the free space available?
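As a cruder variant of option 2, I'm also considering simply shadowing df with a wrapper placed earlier in the PATH instead of rebuilding coreutils. A sketch (the numbers are made up, and this only helps programs that shell out to the df binary, not ones calling statvfs() directly):
#!/bin/sh
# Hypothetical /usr/local/bin/df shim that always reports plenty of free space.
# It ignores any arguments, which may confuse callers that pass a specific path.
echo "Filesystem     1K-blocks    Used Available Use% Mounted on"
echo "overlay         10485760  524288   9961472   5% /"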
EDIT: Here are the error messages given by RabbitMQ and PostgreSQL
RabbitMQ:
{error,{cannot_log_to_file,"/var/log/rabbitmq/rabbit#localhost.log",{error,einval}}}
Postgres:
Error: /usr/lib/postgresql/10/bin/pg_ctl /usr/lib/postgresql/10/bin/pg_ctl start -D /var/lib/postgresql/10/main -l /var/log/postgresql/postgresql-10-main.log -s -o -c config_file="/etc/postgresql/10/main/postgresql.conf" exited with status 1:
I have an application which, for local development, has multiple Docker containers (organized under Docker Compose). One of those containers is a Postgres 10 instance, based on the official postgres:10 image. That instance has its data directory mounted as a Docker volume, which persists data across container runs. All fine so far.
As part of testing the creation and initialization of the postgres cluster, it is frequently the case that I need to remove the Docker volume that holds the data. (The official postgres image runs cluster init if-and-only-if the data directory is found to be empty at container start.) This is also fine.
However! I now have a situation where in order to test and use a third party Postgres extension, I need to load around 6GB of (entirely static) geocoding lookup data into a database on the cluster, from Postgres backup dump files. It's certainly possible to load the data from a local mount point at container start, and the resulting (very large) tables would persist across container restarts in the volume that holds the entire cluster.
Unfortunately, they won't survive the removal of the docker volume which, again, needs to happen with some frequency. I am looking for a way to speed up or avoid the rebuilding of the single database which holds the geocoding data.
Approaches I have been or currently am considering:
Using a separate Docker volume on the same container to create persistent storage for a separate Postgres tablespace that holds only the geocoder database. This appears to be unworkable because while I can definitely set it up, the official PG docs say that tablespaces and clusters are inextricably linked such that the loss of the rest of the cluster would render the additional tablespace unusable. I would love to be wrong about this, since it seems like the simplest solution.
Creating an entirely separate container running Postgres, which mounts a volume to hold a separate cluster containing only the geocoding data. Presumably I would then need to do something kludgy with foreign data wrappers (or some more arcane postgres admin trickery that I don't know of at this point) to make the data seamlessly accessible from the application code.
So, my question: Does anyone know of a way to persist a single database from a dockerized Postgres cluster, without resorting to a dump and reload strategy?
If you want to speed things up, you could convert your database dump into a data directory (import your dump into a clean Postgres container, stop it and create a tarball of the data directory, then upload it somewhere). When you need to create a new Postgres container, use an init script to stop the database, download and unpack your tarball into the data directory, and start the database again; this way you skip the whole DB restore process.
Note: the data tarball has to match the Postgres major version so the container has no problem starting from it.
If you want to speed things up even more, create a custom Postgres image with the tarball and init script bundled in, so that every time it starts it wipes the empty cluster and copies in your own.
You could even change the entrypoint to use your custom script to load the database data and then call docker-entrypoint.sh, so there is no need to delete a possibly empty cluster first.
This will only work if you are OK with replacing the whole cluster every time you want to run your tests; otherwise you are stuck with importing the database dump.
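A rough sketch of that tarball workflow (volume names, file names, and the alpine helper image are my own placeholders, and the restore command has to match whatever format the geocoding dump actually uses):
# 1) One-time: restore the dump into a throwaway postgres:10 container whose
#    data directory lives on a named volume, then archive that volume.
docker run -d --name seed -e POSTGRES_PASSWORD=pw \
  -v seed_data:/var/lib/postgresql/data postgres:10
# ...wait until it accepts connections, then:
docker exec -i seed pg_restore -U postgres -d postgres --create < geocoder.dump
docker stop seed && docker rm seed
docker run --rm -v seed_data:/var/lib/postgresql/data -v "$PWD:/backup" alpine \
  tar czf /backup/pgdata-10.tar.gz -C /var/lib/postgresql/data .
# 2) Whenever a fresh cluster is needed: unpack the tarball into a new volume
#    instead of re-running the multi-gigabyte restore, then start postgres on it.
docker run --rm -v fresh_data:/var/lib/postgresql/data -v "$PWD:/backup" alpine \
  tar xzf /backup/pgdata-10.tar.gz -C /var/lib/postgresql/data
docker run -d --name db -v fresh_data:/var/lib/postgresql/data postgres:10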
I've created a Docker image with PostgreSQL running inside and exposing port 5432.
This image doesn't contain any database; the container is just an empty PostgreSQL database server.
In (or during) the "docker run" command I'd like to:
attach db file
create db via sql query execution
restore db from dump
I don't want to keep the data after the container is stopped. It's just a temporary development server.
I suspect it's possible to keep my "docker run" command string quite short/simple.
It is probably possible to mount some external folder with the db/sql/dump in the run command and then create the DB during container initialization.
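Something along these lines is what I imagine, assuming the image follows (or can be made to follow) the official postgres image's /docker-entrypoint-initdb.d convention; the image name, folder, and password are placeholders:
# Mount a folder of .sql/.sh/dump-loading scripts into the init directory;
# with --rm and no named volume the data is discarded when the container stops.
docker run --rm -p 5432:5432 \
  -e POSTGRES_PASSWORD=dev \
  -v "$PWD/initdb:/docker-entrypoint-initdb.d:ro" \
  my-postgres-image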
What is the best/recommended way, and what are the best practices, to accomplish this task? Perhaps somebody can point me to corresponding Docker examples.
This is a good question and probably something other folks have asked themselves more than once.
According to the Docker guide you would not do this in a RUN command. Instead you would create an ENTRYPOINT or CMD in your Dockerfile that calls a custom shell script instead of calling the postgres process directly. In this scenario the DB would be created on a "real" filesystem, but then cleaned up during shutdown of the container.
How would this work? The container starts and calls the ENTRYPOINT or CMD as usual, which runs the init script to fill the DB. Then, at the moment the container is stopped, the same script is notified with a signal and drops the database content.
CMD ["cleanAndRun.sh"]
Here is a sketch of "cleanAndRun.sh", adapted from the Docker documentation and modified for your needs. Please remember it is a sketch only and needs modifications:
#!/bin/sh
# The command run by the trap must also stop the DB; dropdb alone is not enough,
# it just demonstrates how to call anything in the stop-container scenario!
cleanup() {
    dropdb <params>
    pg_ctl stop -m fast   # assumes PGDATA is set; stops the server so the container exits
}
trap cleanup HUP INT QUIT TERM

# init your DB -every- time the container starts
<init script to clean the DB and import the dump>

# start postgres in the background and wait, so the trap can actually fire on signals
postgres &
wait $!

echo "exited $0"