Loading OSM data into PostgreSQL during docker build

This is almost the same as Import osm data in Docker postgresql, BUT I want to load the OSM data into Postgres via osm2pgsql during the docker build phase.
The reasons for this are:
I only want to load a fixed OSM file into my Postgres, meaning this data will not change.
I want to reuse this docker image as many times as possible.
It is not possible to mount any volume with my current environment.
I know that this will make the docker image big but that is something I already took into consideration.
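Roughly, what I have in mind is a Dockerfile along these lines (just a sketch; the base image tag, file names and database name are placeholders, and I have not verified it end to end):

    FROM postgis/postgis:15-3.4

    # osm2pgsql is needed to load the .osm/.pbf file into the database.
    RUN apt-get update \
        && apt-get install -y --no-install-recommends osm2pgsql \
        && rm -rf /var/lib/apt/lists/*

    # The official postgres image declares a VOLUME on /var/lib/postgresql/data,
    # so anything written there during build is thrown away. Point PGDATA at a
    # path outside that volume so the loaded data stays in the image layers.
    ENV PGDATA=/var/lib/postgresql/baked-data

    COPY data.osm.pbf /tmp/data.osm.pbf

    # Initialize a cluster, start it, load the OSM file, and shut it down again,
    # all within a single build step.
    RUN set -e \
        && gosu postgres initdb -D "$PGDATA" \
        && gosu postgres pg_ctl -D "$PGDATA" -w start \
        && gosu postgres createdb gis \
        && gosu postgres psql -d gis -c 'CREATE EXTENSION postgis;' \
        && gosu postgres osm2pgsql -d gis /tmp/data.osm.pbf \
        && gosu postgres pg_ctl -D "$PGDATA" -w stop \
        && rm /tmp/data.osm.pbf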

Related

Connect Tarantool Docker image to Postgres

I'm trying to connect Tarantool Docker Image to local PostgreSQL, to replicate some test data, and ran into the following problems:
It seems there is no CLI (except the Tarantool console) to check which files are in place (exec bin/bash fails).
pg = require('pg') leads to an error: "init.lua:4: module 'pg.driver' not found", despite the presence of the pg module in the Docker description.
I have doubts about how to efficiently replicate 4 tables, and the relations between them, to the container from outside Postgres.
Does anyone know sources to dig in and find solutions to those problems? Any direction would be greatly appreciated.
Use docker exec -ti tnt_container sh to get a shell inside the container.
As for the missing pg.driver, that appears to be the issue with the current image. You should find an older base image or build it yourself.
The replication question is a PostgreSQL matter. You may pass batches of data to pg functions, or use an intermediate application to transfer the data via COPY; it looks like Tarantool's pg driver does not support COPY.

postgresql docker replications

I'm relatively new to Docker, but I'm wondering whether it is possible to create two Postgres containers in a master-slave setup. I can do it on virtual machines, but I'm a bit confused about how to do the same in Docker.
If it's possible, can someone please point me in the right direction?
I have tried docker exec -it, but the files I expected are missing and I cannot edit the files inside the container.
Since you are new to Docker and wish to get up and running quickly, you can try Bitnami's images, which let you set a POSTGRESQL_REPLICATION_MODE environment variable to designate a container as the master or as a standby/slave.
Just save their docker-compose-replication.yml as docker-compose.yml in the directory of your choice, run docker-compose up -d, and it will pull the necessary image and set everything up for you quickly.
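Their replication compose file looks roughly like this (quoted from memory, so check the current docker-compose-replication.yml in the Bitnami repository; the service names and credentials below are placeholders):

    version: '2'
    services:
      postgresql-master:
        image: bitnami/postgresql:latest
        environment:
          - POSTGRESQL_REPLICATION_MODE=master
          - POSTGRESQL_REPLICATION_USER=repl_user
          - POSTGRESQL_REPLICATION_PASSWORD=repl_password
          - POSTGRESQL_USERNAME=my_user
          - POSTGRESQL_PASSWORD=my_password
          - POSTGRESQL_DATABASE=my_database
      postgresql-slave:
        image: bitnami/postgresql:latest
        depends_on:
          - postgresql-master
        environment:
          - POSTGRESQL_REPLICATION_MODE=slave
          - POSTGRESQL_REPLICATION_USER=repl_user
          - POSTGRESQL_REPLICATION_PASSWORD=repl_password
          - POSTGRESQL_MASTER_HOST=postgresql-master
          - POSTGRESQL_MASTER_PORT_NUMBER=5432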
However, I would highly encourage you to tinker on your own to learn how Docker works. Specifically, you could just use the community Postgres image, and then write your own entrypoint.sh file (along with any additional helper files as necessary), and customize the setup to your requirements.
Disclosure: I work for EnterpriseDB (EDB)

Persisting a single, static, large Postgres database beyond removal of the db cluster?

I have an application which, for local development, has multiple Docker containers (organized under Docker Compose). One of those containers is a Postgres 10 instance, based on the official postgres:10 image. That instance has its data directory mounted as a Docker volume, which persists data across container runs. All fine so far.
As part of testing the creation and initialization of the postgres cluster, it is frequently the case that I need to remove the Docker volume that holds the data. (The official postgres image runs cluster init if-and-only-if the data directory is found to be empty at container start.) This is also fine.
However! I now have a situation where in order to test and use a third party Postgres extension, I need to load around 6GB of (entirely static) geocoding lookup data into a database on the cluster, from Postgres backup dump files. It's certainly possible to load the data from a local mount point at container start, and the resulting (very large) tables would persist across container restarts in the volume that holds the entire cluster.
Unfortunately, they won't survive the removal of the docker volume which, again, needs to happen with some frequency. I am looking for a way to speed up or avoid the rebuilding of the single database which holds the geocoding data.
Approaches I have considered or am currently considering:
Using a separate Docker volume on the same container to create persistent storage for a separate Postgres tablespace that holds only the geocoder database. This appears to be unworkable because while I can definitely set it up, the official PG docs say that tablespaces and clusters are inextricably linked such that the loss of the rest of the cluster would render the additional tablespace unusable. I would love to be wrong about this, since it seems like the simplest solution.
Creating an entirely separate container running Postgres, which mounts a volume to hold a separate cluster containing only the geocoding data. Presumably I would then need to do something kludgy with foreign data wrappers (or some more arcane postgres admin trickery that I don't know of at this point) to make the data seamlessly accessible from the application code.
So, my question: Does anyone know of a way to persist a single database from a dockerized Postgres cluster, without resorting to a dump and reload strategy?
If you want to speed things up, you could convert your database dump into a ready-made data directory: import the dump into a clean postgres container, stop it, create a tarball of the data directory, and upload it somewhere. Then, whenever you need a new postgres container, use an init script to stop the database, download and unpack the tarball into the data directory, and start the database again. This way you skip the whole restore process.
Note: the data tarball has to match the Postgres major version so the container has no problem starting from it.
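Concretely, building that tarball could look something like this (image tag, file names and password are placeholders; use psql instead of pg_restore if your dump is plain SQL):

    # Start a throwaway container and restore the dump into it.
    docker run -d --name pg_seed -e POSTGRES_PASSWORD=secret postgres:10
    # ...wait until "docker exec pg_seed pg_isready" reports the server accepts connections...
    docker exec -i pg_seed pg_restore -U postgres -d postgres < geocoding.dump

    # Stop it and pack the initialized data directory.
    docker stop pg_seed
    docker run --rm --volumes-from pg_seed -v "$PWD":/backup debian \
        tar czf /backup/pgdata-10.tar.gz -C /var/lib/postgresql/data .
    docker rm pg_seed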
If you want to speed things up even more, create a custom postgres image with the tarball and init script bundled, so that every time it starts it wipes the empty cluster and copies in your own.
You could even change the entrypoint to use your custom script to load the database data and then call docker-entrypoint.sh, so there is no need to delete a possibly empty cluster at all.
This will only work if you are OK with replacing the whole cluster every time you want to run your tests; otherwise you are stuck with importing the database dump.
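A sketch of that bundled image (untested; file names are placeholders):

    FROM postgres:10
    COPY pgdata-10.tar.gz /seed/pgdata-10.tar.gz
    COPY load-seed.sh /usr/local/bin/load-seed.sh
    RUN chmod +x /usr/local/bin/load-seed.sh
    # Setting ENTRYPOINT resets the CMD inherited from the base image, so restore it.
    ENTRYPOINT ["/usr/local/bin/load-seed.sh"]
    CMD ["postgres"]

with load-seed.sh along the lines of:

    #!/bin/bash
    set -e
    # If there is no usable cluster in the data directory, replace whatever is
    # there with the pre-loaded one from the tarball.
    if [ ! -s "$PGDATA/PG_VERSION" ]; then
        rm -rf "${PGDATA:?}"/*
        tar xzf /seed/pgdata-10.tar.gz -C "$PGDATA"
        chown -R postgres:postgres "$PGDATA"
    fi
    # Hand off to the stock entrypoint so normal startup behaviour is preserved.
    exec docker-entrypoint.sh "$@"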

Docker postgres how to share database

I'm running a PostgreSQL db via docker postgres.
I have populated the db with lots of data and would like to share it with others.
Is there a way to 'save' this database with all the data as a new image and publish it to a Docker registry so it can be easily pulled and used?
You can use docker container commit https://docs.docker.com/engine/reference/commandline/commit/ to create an image from a container.
Then you can publish that image to a docker registry for use by others.
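For example (container and image names are placeholders; also keep in mind that docker commit does not capture data stored in volumes, which is where the official postgres image keeps its data directory by default):

    docker container commit my_postgres_container myregistryuser/postgres-with-data:1.0
    docker push myregistryuser/postgres-with-data:1.0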

Understanding how to import a map.osm file into a postgres database.

I was just reading a tutorial HERE.
I am assigned with the following task:
The task is to create a Docker image that has PostgreSQL installed. The PostgreSQL database should be PostGIS-enabled. The Docker image should also allow quick and easy import of an .osm map data file into the database. The database should contain a routable osm-based road network which can be used to run simple default pgRouting queries (shortest path, A star).
I know how to do the initial and final parts of it, but I am a bit confused about the following part:
The Docker image should also allow quick and easy import of an .osm map data file into the database.
How do I make this possible? A Dockerfile? But still, how?
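Is something like the following the intended approach? (a rough sketch; the base image tag, package name and file names are guesses on my part):

    FROM postgis/postgis:15-3.4

    # osm2pgsql for the import, pgRouting for the routing queries
    # (osm2pgrouting could be used instead to build a routable topology).
    RUN apt-get update \
        && apt-get install -y --no-install-recommends osm2pgsql postgresql-15-pgrouting \
        && rm -rf /var/lib/apt/lists/*

    COPY map.osm /data/map.osm
    COPY 10_import_osm.sh /docker-entrypoint-initdb.d/10_import_osm.sh

where 10_import_osm.sh is run by the official entrypoint the first time the container starts with an empty data directory:

    #!/bin/bash
    set -e
    # Enable the extensions, then load the .osm file into the default database.
    psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" \
         -c 'CREATE EXTENSION IF NOT EXISTS postgis; CREATE EXTENSION IF NOT EXISTS pgrouting;'
    osm2pgsql -U "$POSTGRES_USER" -d "$POSTGRES_DB" /data/map.osm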