Interacting with PostgreSQL server in Docker container - postgresql

I'm creating a Docker image based on the postgres image and I'm trying to interact with it like this:
FROM postgres:9.6
USER postgres
RUN createuser foo
However, this results in the following error while building:
createuser: could not connect to database postgres: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
How do I properly connect to the PostgreSQL server from within this container?

The postgres server isn't running during the docker build process, so trying to connect to it with a RUN statement in your Dockerfile isn't going to work.
If you want to create users or databases or extensions, etc, you need to do that at runtime. There are a few options available, and which one you choose depends on exactly what you're trying to do.
If you just need to create a user and/or database that differs from the default, you can do that via environment variables as described in the documentation.
To create a user other than postgres:
docker run -e POSTGRES_USER=foo -e POSTGRES_PASSWORD=secret [...] postgres
To create a database other than the default (which will match the name of POSTGRES_USER):
docker run -e POSTGRES_DB=mydbname [...] postgres
If you need to do anything more complicated, take a look at the "How to extend this image" section of the documentation. You can place shell scripts or sql scripts into /docker-entrypoint-initdb.d and they will be executed during container startup. There is an example there that demonstrates how to create an additional database using this mechanism.
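For example, a minimal sketch of that mechanism (the script, user, and database names here are illustrative, not from the question):
Dockerfile
FROM postgres:9.6
COPY init-user-db.sh /docker-entrypoint-initdb.d/
init-user-db.sh
#!/bin/bash
set -e
# Runs once, at first container startup, after initdb has created the default database.
psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" <<-EOSQL
    CREATE USER foo;
    CREATE DATABASE foodb OWNER foo;
EOSQL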

Related

Docker Postgres data host volume mapping

I'm trying to containerize a PostgreSQL server, and this container will run many other applications as well. The PostgreSQL data directory should be mapped to a host volume so that the data isn't lost when the container is stopped, and so that the next time the container starts, the same directory can be mapped again and postgres can reuse the old data. Below is the Dockerfile. Note that I'm using Ubuntu 22.04 on the host.
FROM ubuntu:22.04
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && apt-get install -y postgresql
ENTRYPOINT ["tail", "-f", "/dev/null"]
Docker image is built using the command
docker build -t pg_test .
and the container is run using the command
docker run --name test -v /home/me/data:/var/lib/postgresql/14/main pg_test
'/home/me/data' is the host directory which is empty where I want to map the postgres server data. '/var/lib/postgresql/14/main' is the directory inside the docker container where the postgres is supposed to store the data.
Once the docker container starts, I enter the docker container using the command
docker exec -it test bash
and once I'm inside, I'm trying to start the PostgreSQL service. But PostgreSQL fails to start as there is no data in '/var/lib/postgresql/14/main' directory. I understand that since I have mapped an empty host directory to '/var/lib/postgresql/14/main' directory, postgres doesn't have the files required to start.
I understand that I'm doing it the wrong way, but I couldn't find a way around it. Can anyone please help me to do this the right way, if there is one?
Any help would be appreciated.
You should use the postgres docker image; it will set up the db for you when you start the container. You can find instructions at https://hub.docker.com/_/postgres
If you must use a custom image, you will need to initialize the db yourself, usually by running initdb or whatever your system provides.
But really you should use the appropriate docker image, and if you need more services, start them in their own containers and connect them to the postgres one.
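For example, a minimal sketch of that approach with the official image (the password is a placeholder): on the first start it initializes the empty host directory, and on later starts it reuses the existing data.
docker run --name test -e POSTGRES_PASSWORD=secret -v /home/me/data:/var/lib/postgresql/data postgres:14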

Setup a PostgreSQL connection to an already existing project in Docker

I had never used PostgreSQL nor Docker before. I set up an already developed project that uses these two technologies in order to modify it.
To get the project running on my Linux (Pop!_OS 20.04) machine I was given these instructions (sorry if this is irrelevant but I don't know what is important and what is not to state my problem):
Installed Docker CE and Docker Compose.
Cloned the project with git and ran the commands git submodule init and git submodule update.
Initialized the container with: docker-compose up -d
Generated the application configuration file: ./init.sh
After all of that the app was available at http://localhost:8080/app/, and the project's directory contained several subdirectories, including dbdata.
Now I need to modify the DB, and that's where the difficulty arose, since I don't know how to set up the connection with PostgreSQL inside Docker.
In a project without Docker which uses MySQL I would
Create the local project's database "dbname".
Import the project's DB: mysql -u username -ppassword dbname < /path/to/dbdata.sql
Connect a DB client (DBeaver in my case) to the local DB and perform the necessary modifications.
In an endeavour to do something like that with PostgreSQL, I have read that I need to
Install and configure an Ubuntu 20.04 server.
Install PostgreSQL.
Configure Postgres “roles” to handle authentication and authorization.
Create a new Database.
And then what?
How can I set up the connection in order to be able to modify the DB from DBeaver and see the changes reflected on http://localhost:8080/app/ when Docker is involved?
Do I really need an Ubuntu server?
Do I need other program than psql to connect to Postgres from the command line?
I have found many articles related to the local setup of PostgreSQL with Docker, but all of them address the topic from scratch; none of them talk about how to connect to the DB of an "old" project inside Docker. I hope someone here can give directions for a newbie on what to do, or recommend an article explaining from scratch how to configure PostgreSQL and then connect to a DB in Docker. Thanks in advance.
Edit:
Here's the output of docker ps, which shows the postgres container published on host port 5433.
You have 2 options to get into known waters pretty fast:
Publish the postgres port on the docker host machine, install any postgres client you like on the host, and connect to the database hosted in the container as you would traditionally. You will use localhost:5433 to reach the DB. (Update: 5433 is the port where the postgres container is published on your host, according to your docker ps output.)
Another option is to add another service in your docker-compose file to host the client itself in a container.
Here's a minimal example in which I am launching two containers: the postgres and an adminer that is exposed on the host machine on port 9999.
version: '3'
services:
  db:
    image: postgres
    restart: always
    environment:
      POSTGRES_PASSWORD: example
  adminer:
    image: adminer
    restart: always
    ports:
      - 9999:8080
then I can access the adminer at localhost:9999 (password is example). Once I'm connected to my postgres through adminer, I can import and execute any SQL query I need.
A kind piece of advice is to read thoroughly about how data is persisted in a Docker context. Performance and security are also topics that you, as a novice in the field, will want to get under your belt sooner rather than later.
If you're running your PostgreSQL container inside your own machine you don't need anything else to connect using a database client. That's because to the host machine, all the containers are accessible using their own subnet.
That means that if you do this:
docker inspect --format='{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' 341164c5050f
it will output the container's IP address(es), which you can configure in your DBeaver to access the container instance directly.
If you're not fond of doing that (or you prefer to use the CLI), you can always use the psql binary inside the PostgreSQL container to achieve something like what you described in MySQL step 2:
docker exec -i 341164c5050f bash -c 'psql -U $POSTGRES_USER' < /path/to/your/schema.sql
It's important to pass -i, otherwise psql won't read the schema from stdin. If you want psql in interactive mode, use -it instead.
Last but not least, you can always edit the docker-compose.yml file to publish the port and connect to the instance using the public IP/loopback device.
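For example, the relevant part of such a docker-compose change might look like this (the service name db is an assumption about the project's compose file); a client such as DBeaver can then connect to localhost:5433:
services:
  db:
    ports:
      - "5433:5432"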

How do I SSH from a Docker container to a remote server

I am building a docker image off postgres image, and I would like to seed it with some data.
I am following the initialization-scripts section of the documentation.
But the problem I am facing now is that my initialisation script needs to ssh to a remote database server and dump data from there. Basically something like this:
ssh remote.host "pg_dump -U user -d somedb" > some.sql
but this fails with the error: ssh: command not found
Question now is, in general, how do I ssh from a docker container to a remote server. In this case, specifically how do I ssh from a docker container to a remote database server as part of the initialisation step of seeding a postgres database?
As a general rule you don't do things this way. Typical Docker images contain only the server they're running and some core tools, but network clients like ssh or curl generally aren't part of this. In the particular case of ssh, securely managing the credentials required is also tricky (not impossible, but not obvious).
In your particular case, I might rearrange things so that your scripts didn't have the hard assumption the database was running locally. Provision an empty database container, then run your script from the host targeting that empty database. It may even work to set the PGHOST and PGPORT environment variables to point to your host machine's host name and the port you publish the database interface on, and then run that script unmodified.
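For example, a sketch of that idea (the seed script name is hypothetical):
# Start an empty database container, publishing its port on the host.
docker run -d --name db -p 5432:5432 -e POSTGRES_PASSWORD=secret postgres
# Run the seed script from the host, pointing libpq at the container.
PGHOST=localhost PGPORT=5432 PGUSER=postgres PGPASSWORD=secret ./seed.sh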
Looking closer at that specific command, you also may find it better to set up a cron job to run that specific database dump and put the contents somewhere. Then a developer can get a snapshot of the data without having to make a connection to the live database server, and you can limit the number of people who will have access. Once you have this dump file, you can use the /docker-entrypoint-initdb.d mechanism to cause it to be loaded at first startup time.
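A sketch of that arrangement (host and file names are illustrative): the dump is taken on a machine that can reach the live database, and the resulting file is baked into the image so the entrypoint loads it at first startup.
# Run periodically, e.g. from cron, on a machine with ssh access to the live server.
ssh remote.host "pg_dump -U user -d somedb" > some.sql
Dockerfile
FROM postgres
COPY some.sql /docker-entrypoint-initdb.d/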

Creating a running Postgres service inside a docker container

I'm a bit new to Docker.
I have two containers running using docker-compose.
One is the API and the other is the actual application.
I want to add a new DB container using the Postgres official image.
It's a bit hard to find a simple tutorial on how to create the container and populate it with a predefined sql file (of schemas and data).
When I start with "CMD /etc/init.d/postgresql start" in the Dockerfile I get an error saying: "No PostgreSQL clusters exist; see "man pg_createcluster" ... (warning)."
Since it's taking me too much time to get things going, I was wondering if it might be better to get an Ubuntu image and install Postgres on my own, since there is only one source on how to use the image (Docker Hub), and I don't seem to understand it that well.
Any ideas or simple steps on how to compose and 'configure' this image?
If you want to populate your database from a file, a simple way to do this is described in the image's documentation:
How to extend this image
If you would like to do additional initialization in an image derived from this one, add one or more *.sql, *.sql.gz, or *.sh scripts under /docker-entrypoint-initdb.d (creating the directory if necessary). After the entrypoint calls initdb to create the default postgres user and database, it will run any *.sql files and source any *.sh scripts found in that directory to do further initialization before starting the service.
Dockerfile
FROM postgres:alpine
COPY init.sql /docker-entrypoint-initdb.d/init.sql
docker-compose.yml
version: '3'
services:
  app:
    # your app definition
  postgres:
    build: .
Pull the postgres image
docker pull postgres:14.2
Create the service with the below command
docker service create --name postgres --network my_overlay --env "POSTGRES_PASSWORD=password" --publish 5432:5432 postgres:14.2
Try to connect to the default postgres db, using postgres as the username and password as the password.
jdbc:postgresql://127.0.0.1:5432/postgres // JDBC connection
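Alternatively, a quick check from the command line, assuming a psql client is installed on the host:
psql -h 127.0.0.1 -p 5432 -U postgres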

Build postgres docker container with initial schema

I'm looking to build Dockerfiles that represent company databases that already exist. Similarly, I'd like to create a Dockerfile that starts by restoring a psql dump.
I have my psql_dump.sql in the . directory.
FROM postgres
ADD . /init_data
run "createdb" "--template=template0" "my_database"
run "psql" "-d" "my_database" --command="create role my_admin superuser"
run "psql" "my_database" "<" "init_data/psql_dump.sql"
I thought this would be good enough to do it. I'd like to avoid solutions that use a .sh script. Like this solution.
I use template0 since the psql documentation says you need the same users created that were in the original database, and you need to create the database with template0 before you restore.
However, it gives me an error:
createdb: could not connect to database template1: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
I'm also using docker compose for the overall application, if solving this problem in docker-compose is better, I'd be happy to use the base psql image and use docker compose to do this.
According to the usage guide for the official PostgreSQL Docker image, all you need is:
Dockerfile
FROM postgres
ENV POSTGRES_DB my_database
COPY psql_dump.sql /docker-entrypoint-initdb.d/
The POSTGRES_DB environment variable will instruct the container to create a database named my_database on first run.
And any .sql file found in the /docker-entrypoint-initdb.d/ of the container will be executed.
If you want to execute .sh scripts, you can also provide them in the /docker-entrypoint-initdb.d/ directory.
As said in the comments, @Thomasleveil's answer is great and simple if your schema recreation is fast.
But in my case it's slow, and I wanted to use docker volumes, so here is what I did:
1. First, use a docker image as in @Thomasleveil's answer to create a postgres container with all the schema initialization.
Dockerfile:
FROM postgres
WORKDIR /docker-entrypoint-initdb.d
ADD psql_dump.sql /docker-entrypoint-initdb.d
EXPOSE 5432
2. Then run it, and once the data has been populated from the psql_dump.sql file, copy the postgres data out into a new local directory: docker cp mypg:/var/lib/postgresql/data ./postgres-data
3. Copy the data to a temp data folder, and start a new postgres docker-compose container whose volume is at the new temp data folder:
startPostgres.sh:
#!/bin/bash
rm -r ./temp-postgres-data/data
mkdir -p ./temp-postgres-data/data
cp -r ./postgres-data/data ./temp-postgres-data/
docker-compose -p mini-postgres-project up
and the docker-compose.yml file is:
version: '3'
services:
  postgres:
    container_name: mini-postgres
    image: postgres:9.5
    ports:
      - "5432:5432"
    volumes:
      - ./temp-postgres-data/data:/var/lib/postgresql/data
Now you can run steps #1 and #2 on a new machine, or whenever your psql_dump.sql changes. And each time you want a new clean (but already initialized) db, you only need to run startPostgres.sh from step #3.
And it still uses docker volumes.
@Thomasleveil's answer will re-create the database schema at runtime, which is fine for most cases.
If you want to recreate the database schema at build time (i.e. if your schema initialization is really slow), you can invoke the stock docker-entrypoint.sh from within your Dockerfile.
However, since docker-entrypoint.sh is designed to start a long-running database server, you have to add an extra script to exit the process after database initialization but before booting the long-running server.
Dockerfile (with build time database initialization)
# STAGE 1 - Equivalent to @Thomasleveil's answer
FROM postgres AS runtime_init
ENV POSTGRES_DB my_database
COPY 1-psql_dump.sql /docker-entrypoint-initdb.d/
# STAGE 2 - Initialize the database during the build
FROM runtime_init AS buildtime_init_builder
RUN echo "exit 0" > /docker-entrypoint-initdb.d/100-exit_before_boot.sh
ENV PGDATA=/pgdata
RUN docker-entrypoint.sh postgres
# STAGE 3 - Copy the initialized db to a new image to reduce size.
FROM postgres AS buildtime_init
ENV PGDATA=/pgdata
COPY --chown=postgres:postgres --from=buildtime_init_builder /pgdata /pgdata
Important Notes
The stock postgres image will run initialization scripts in alphabetical order, so ensure that your database restoration scripts appear earlier than the exit_before_boot.sh script created in the Dockerfile.
This is demonstrated by the 1 and 100 prefixes shown above. Modify them to your liking.
Database updates to a running instance of this image will not be persisted across reboots since the PGDATA path where the database files are stored no longer maps to a volume mounted from the host machine.
Further Reading
Instructions from the authors of the official postgres image about writing your own custom_entrypoint.sh. This is arguably the more "official" way to solve this problem, but I personally find my approach easier to understand and implement.
A demo of this concept for PostgreSQL 9, which uses the --help flag to exit the docker-entrypoint.sh before the long-running server boots. Unfortunately, this no longer works as of December 3, 2019.
Two discussions (1) (2) of this same question from the official docker postgres repository.