I'm relatively new to Docker, and I'm wondering whether it is possible to create two Postgres containers in a master-slave setup. I can do it on virtual machines, but I'm not sure how to do it in Docker.
If it's possible, can someone please point me in the right direction?
I have tried docker exec -it, but the files all seem to be missing and I cannot edit the files inside the container.
Since you are new to Docker, and you wish to get up and running quickly, you can try using Bitnami's images, which allow you to specify a POSTGRESQL_REPLICATION_MODE environment variable, which will allow you to designate a container as a standby/slave.
Just save their docker-compose-replication.yml as docker-compose.yml in the directory of your choice and run docker-compose up -d. It will pull the necessary image and set everything up for you quickly.
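For orientation, a replication compose file along those lines might look roughly like the sketch below. This is not a copy of Bitnami's file: the service names, credentials, database name, and volume name are placeholders, and the environment variable names are the ones I recall from the Bitnami PostgreSQL image README, so check them against the README for the image tag you actually pull.

```yaml
# Hypothetical sketch: one primary and one streaming replica using the
# bitnami/postgresql image. All names and passwords are placeholders.
services:
  pg-primary:
    image: bitnami/postgresql:latest
    environment:
      POSTGRESQL_REPLICATION_MODE: master
      POSTGRESQL_REPLICATION_USER: repl_user
      POSTGRESQL_REPLICATION_PASSWORD: repl_password
      POSTGRESQL_USERNAME: app_user
      POSTGRESQL_PASSWORD: app_password
      POSTGRESQL_DATABASE: app_db
    volumes:
      - pg_primary_data:/bitnami/postgresql

  pg-replica:
    image: bitnami/postgresql:latest
    depends_on:
      - pg-primary
    environment:
      POSTGRESQL_REPLICATION_MODE: slave
      POSTGRESQL_REPLICATION_USER: repl_user
      POSTGRESQL_REPLICATION_PASSWORD: repl_password
      # The replica finds the primary by its compose service name.
      POSTGRESQL_MASTER_HOST: pg-primary
      POSTGRESQL_MASTER_PORT_NUMBER: "5432"
      POSTGRESQL_PASSWORD: app_password

volumes:
  pg_primary_data:
```

The replica keeps no named volume here because it can be re-created from the primary at any time; add one if you want it to survive container removal.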
However, I would highly encourage you to tinker on your own to learn how Docker works. Specifically, you could just use the community Postgres image, and then write your own entrypoint.sh file (along with any additional helper files as necessary), and customize the setup to your requirements.
Disclosure: I work for EnterpriseDB (EDB)
I have an Elixir/Phoenix server. Based on user requirements, the server generates Markdown and LaTeX files, then runs System.cmd("pandoc", [relevant commands]) to generate PDF files. Everything works fine on my local system, where the dependencies (pandoc and LaTeX) are installed locally.
Now I'm trying to dockerize the project.
I tried installing pandoc and LaTeX in the phoenix_server container and it worked, but the final Docker image grew to 8.5GB, because texlive alone takes 7.5GB, so that's not an option.
I found the pandoc/latex:2.18 image,
so my idea is to create 3 Docker containers and run docker-compose up:
container_1: Phoenix_server
container_2: postgres:14
container_3: pandoc/latex:2.18
But it didn't work.
Challenges:
1 Sharing the server-generated files with the pandoc/latex container. For this I'm thinking of using a Docker volume.
2 I could not figure out how to run CLI commands from the Phoenix container in the pandoc/latex container.
any help is greatly appreciated
Thank you
To be fair, if you're not using distributed Erlang in this case, I would just install Elixir in the pandoc image and leave Postgres as an external container / managed DB, as you did.
If you want to distribute the work across containers as you are trying to, I would probably put a queue in between (Postgres + Oban looks like a good option here) and start an Elixir process in the pandoc image that does the rendering and saves the output to the shared volume.
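The queue-plus-shared-volume layout could be sketched in a compose file like the one below. Everything here is an assumption for illustration: the service names, the ./renderer build directory, the /data/documents path, and the idea that ./renderer/Dockerfile starts FROM pandoc/latex:2.18 and adds Elixir plus the worker code. As far as I know the pandoc images set pandoc as their entrypoint, so that Dockerfile would also need to override it.

```yaml
# Hypothetical layout: the Phoenix app enqueues render jobs in Postgres
# (for example via Oban) and a separate worker renders them with pandoc.
services:
  db:
    image: postgres:14
    environment:
      POSTGRES_PASSWORD: postgres      # placeholder

  web:
    build: .                           # Phoenix server image, without TeX Live
    depends_on:
      - db
    volumes:
      - documents:/data/documents      # web writes Markdown/LaTeX sources here

  renderer:
    build: ./renderer                  # assumed: FROM pandoc/latex:2.18 + Elixir worker
    depends_on:
      - db
    volumes:
      - documents:/data/documents      # worker reads sources, writes PDFs here

volumes:
  documents:
```

With this shape the Phoenix image stays small, and neither container has to execute commands inside the other: they only share the job queue and the documents volume.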
I'm trying to connect the Tarantool Docker image to a local PostgreSQL instance to replicate some test data, and ran into the following problems:
It seems there is no command line (except the Tarantool console) to check which files are in place (exec bin/bash fails).
pg = require('pg') leads to an error: "init.lua:4: module 'pg.driver' not found", despite the presence of the pg module in the Docker image description.
I have doubts about how to efficiently replicate 4 tables, and the relations between them, into the container from an outside Postgres.
Does anyone know sources to dig in and find solutions to those problems? Any direction would be greatly appreciated.
Use docker exec -ti tnt_container sh to get a shell in the container.
the issue. You should find an older base image or build it yourself.
This is a PostgreSQL-related question. You can pass batches of data to pg functions, or use an intermediate application to transfer the data via COPY. It looks like Tarantool's pg driver does not support COPY.
I have a dockerfile for frontend, one for backend, and one for the database.
In the backend portion of the project, I have a dockerfile and a docker-compose.yml file.
The Dockerfile works well for the backend because it configures the backend, copies files in, sets things up, and so on. I like it a lot.
The issue I have run into is that I can easily create a Dockerfile for the DBMS, but that requires putting it in a different directory, whereas I was hoping to define it in the same directory as the backend. Because the backend and the DBMS are so tightly coupled, I figured this is where docker-compose would come in.
The problem is that in a compose file I can't do a COPY into the DBMS container. I would have to create another Dockerfile to set that up. I was thinking that would work.
Looking on GitHub, there was a big enhancement thread about it, but the closest people got was creating a volume relationship, which doesn't do what I want.
Ideally, all I want is to stand up a Postgres DBMS in such a way that I could load-balance it later down the line (1 writer, 5 readers, or something similar), and have its initial DB defined in my one SQL file.
Am I missing something? I thought I was going about it correctly, but maybe I need to create a whole new directory with a Dockerfile for the DBMS.
Thoughts on how I should accomplish this?
Right now I am doing something like:
version: '2.0'
services:
  backend:
    build: .
    ports:
      - "8080:8080"
  database:
    image: "postgres:10"
    environment:
      POSTGRES_USER: "test"
      POSTGRES_PASSWORD: "password"
      POSTGRES_DB: "foo"
    # I shouldnt have volumes as it would copy the entire folder and its contents to db.
    volumes:
      - ./:/var/lib/postgresql/data
There are countless ways to copy things with Docker.
At image build time:
use COPY or ADD instructions
use shell commands including cp,ssh,wget and many others.
From the docker command line:
use docker cp to copy from/to hosts and containers
use docker exec to run arbitrary shell commands including cp, ssh and many others...
In docker-compose / kubernetes (or through command line):
use volume to share data between containers
volumes can be local or remote file systems (a network disk, for example)
potentially combine that with shell commands for example to perform backups
Still, how you should do it depends heavily on the use case.
If the data you copy is tied to the code and versioned (in the git repo...), then treat it as if it were code and build it into the image via the Dockerfile. For me this is a best practice.
If the data is configuration that depends on the environment (like test vs prod, farm 1 vs farm 2), then go for docker config/secret + ENV variables.
If the data is dynamic and generated at production time (like a DB that fills with user data as the app is used), use persistent volumes and be sure you understand the impact of a container failure on your data.
For a database in a test system it can make sense to relaunch the DB from a backup dump or a read-only persistent volume, or, much simpler, to snapshot the whole container at a known state (with docker commit). Note that docker commit does not capture data stored in volumes, and the official postgres image declares its data directory as a volume.
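For the single SQL file in the question, one option that avoids a second Dockerfile is to bind-mount just that file into the directory the official postgres image scans on first start, /docker-entrypoint-initdb.d. The sketch below assumes the file is called init.sql and sits next to docker-compose.yml; the db_data volume name is also made up.

```yaml
version: '2.0'
services:
  backend:
    build: .
    ports:
      - "8080:8080"
    depends_on:
      - database
  database:
    image: "postgres:10"
    environment:
      POSTGRES_USER: "test"
      POSTGRES_PASSWORD: "password"
      POSTGRES_DB: "foo"
    volumes:
      # Executed once, only when the data directory is empty at first start.
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql:ro
      # Named volume for the cluster itself, instead of the project folder.
      - db_data:/var/lib/postgresql/data
volumes:
  db_data:
```

If the SQL file should be versioned and shipped with the image instead, the same file can be added with COPY in a small Dockerfile based on postgres:10, which is the build-time approach described above.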
I have an application which, for local development, has multiple Docker containers (organized under Docker Compose). One of those containers is a Postgres 10 instance, based on the official postgres:10 image. That instance has its data directory mounted as a Docker volume, which persists data across container runs. All fine so far.
As part of testing the creation and initialization of the postgres cluster, it is frequently the case that I need to remove the Docker volume that holds the data. (The official postgres image runs cluster init if-and-only-if the data directory is found to be empty at container start.) This is also fine.
However! I now have a situation where in order to test and use a third party Postgres extension, I need to load around 6GB of (entirely static) geocoding lookup data into a database on the cluster, from Postgres backup dump files. It's certainly possible to load the data from a local mount point at container start, and the resulting (very large) tables would persist across container restarts in the volume that holds the entire cluster.
Unfortunately, they won't survive the removal of the docker volume which, again, needs to happen with some frequency. I am looking for a way to speed up or avoid the rebuilding of the single database which holds the geocoding data.
Approaches I have been or currently am considering:
Using a separate Docker volume on the same container to create persistent storage for a separate Postgres tablespace that holds only the geocoder database. This appears to be unworkable because while I can definitely set it up, the official PG docs say that tablespaces and clusters are inextricably linked such that the loss of the rest of the cluster would render the additional tablespace unusable. I would love to be wrong about this, since it seems like the simplest solution.
Creating an entirely separate container running Postgres, which mounts a volume to hold a separate cluster containing only the geocoding data. Presumably I would then need to do something kludgy with foreign data wrappers (or some more arcane postgres admin trickery that I don't know of at this point) to make the data seamlessly accessible from the application code.
So, my question: Does anyone know of a way to persist a single database from a dockerized Postgres cluster, without resorting to a dump and reload strategy?
If you want to speed things up, you could convert your database dump into a data directory: import the dump into a clean postgres container, stop it, create a tarball of the data directory, and upload it somewhere. Then, when you need to create a new postgres container, use an init script to stop the database, download and unpack your tarball into the data directory, and start the database again. This way you skip the whole restore process.
Note: The data tarball has to match the postgres major version so the container has no problem starting from it.
If you want to speed things up even more, create a custom postgres image with the tarball and init script bundled, so that every time it starts it wipes the empty cluster and copies in your own.
You could even change the entrypoint to use your custom script and load the database data, then call docker-entrypoint.sh so there is no need to delete a possible empty cluster.
This will only work if you are OK with replacing the whole cluster every time you want to run your tests; otherwise you are stuck with importing the database dump.
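The entrypoint variant could be sketched directly in a compose file, as below. This is only an illustration under several assumptions: the tarball is named pgdata.tar.gz, lives in a local ./seed directory, contains the contents of a data directory created by the same major version, and the pgdata volume name is arbitrary. The doubled $$ is compose's escape for a literal $, so the shell inside the container sees $PGDATA.

```yaml
services:
  db:
    image: postgres:10
    environment:
      POSTGRES_PASSWORD: password          # placeholder
    volumes:
      - ./seed:/seed:ro                    # assumed to contain pgdata.tar.gz
      - pgdata:/var/lib/postgresql/data
    entrypoint:
      - bash
      - "-c"
      - |
        set -e
        # Seed only when the data directory has not been initialised yet.
        if [ ! -s "$$PGDATA/PG_VERSION" ]; then
          tar -xzf /seed/pgdata.tar.gz -C "$$PGDATA"
          chown -R postgres:postgres "$$PGDATA"
        fi
        # Hand over to the image's normal entrypoint, which then finds an
        # existing cluster and skips initialisation.
        exec docker-entrypoint.sh postgres
volumes:
  pgdata:
```

Removing the pgdata volume and starting the service again re-seeds the cluster from the tarball, which should be much faster than restoring from dump files.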
I just want to ask a clarifying question before I pursue Docker further here. I'm trying to understand the life cycle of indices with Sphinx in a container.
Provided I set up a container with Sphinx from some build, so it has some shared indices, how can I reindex from the host? Will I have to determine the container IP (presumably through $CID) and then send the reindex command to the container over SSH, or in some other fancy manner?
I'm using Rails with Thinking Sphinx and have some nice Capistrano hooks to reindex from my dev box. I'm guessing I'm going to lose those by putting Sphinx in a Docker container, since Sphinx would no longer be on the host itself.
A container is just like a virtual machine, with the added advantage that it is much lighter. So you can do the re-indexing in whatever manner you like: either over SSH or directly through the bash shell you get when you run the container from the provided image.