How to move PostgreSQL to a RAM disk in Docker? - postgresql

I want to run the Docker image postgres:9, stop Postgres, move it to /dev/shm, and restart it, so I can run my application tests really fast.
But when I try to stop Postgres in the container using postgres or pg_ctl, I am told it cannot be run as root.
Since Docker containers log you in as the root user by default, what can I do to run the Postgres commands I need?
And which folders do I need to move to /dev/shm before restarting it?
Command to start the container if you want to try this:
docker run -it postgres:9 bash
cd /usr/lib/postgresql/9.6/bin
./pg_ctl stop

Mount a tmpfs in the container and point the PostgreSQL data directory at it:
docker run --tmpfs=/pgtmpfs -e PGDATA=/pgtmpfs postgres:15
Use size=Nk to set a size limit (rather than all free memory).
--tmpfs /pgtmpfs:size=131072k
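Putting both pieces together, a full invocation might look like this (a sketch; the size value is only an example):
docker run --tmpfs /pgtmpfs:size=131072k -e PGDATA=/pgtmpfs postgres:15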
The same can be done for MySQL
docker run --tmpfs=/var/lib/mysql -e MYSQL_ALLOW_EMPTY_PASSWORD=yes mysql:8
Kubernetes
An emptyDir volume can set the medium property to Memory
apiVersion: v1
kind: Pod
metadata:
  name: tmpfs-pd
spec:
  containers:
  - image: docker.io/postgres:15
    name: tmpdb
    env:
    - name: PGDATA
      value: /pgtmpfs
    volumeMounts:
    - mountPath: /pgtmpfs
      name: tmpdata-volume
  volumes:
  - name: tmpdata-volume
    emptyDir:
      medium: Memory
      sizeLimit: 131072k
Docker Compose
And in a Docker Compose 3.6+ definition (tmpfs is not supported by docker stack deploy):
version: "3.6"
services:
db:
image: docker.io/postgres:15
environment:
- PGDATA=/pgtmpfs
tmpfs:
- /pgtmpfs
Compose can define shared tmpfs-backed volumes as well.
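For example, a top-level named volume can be backed by tmpfs via the local driver and shared between services (a sketch; the size option is optional and purely illustrative):
volumes:
  pgtmpfs:
    driver: local
    driver_opts:
      type: tmpfs
      device: tmpfs
      o: size=128m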

By adding --user=postgres
to your docker run parameters, you'll be the postgres user directly:
docker run --user=postgres -it postgres:9 bash
cd /usr/lib/postgresql/9.6/bin
./pg_ctl stop
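Alternatively, if you are already inside a root shell in the container, you can switch to the postgres user for individual commands (a sketch; /var/lib/postgresql/data is the image's default data directory, and gosu ships in the official image for its entrypoint):
su postgres -c '/usr/lib/postgresql/9.6/bin/pg_ctl -D /var/lib/postgresql/data stop'
# or equivalently
gosu postgres /usr/lib/postgresql/9.6/bin/pg_ctl -D /var/lib/postgresql/data stop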


How to increase the maximum connection limit of a Postgres docker container?

Problem
I have too many connections open using the default Docker postgresql configuration:
https://hub.docker.com/_/postgres/
Goal
I want to extend max_connections without using a volume to mount the configuration (I need this to be available by default in my CI environment).
I have tried to use sed to edit the configuration, but this has no effect.
What is the recommended way of overriding the default configuration of the official postgresql Docker image?
Run this docker-compose.yml:
version: '2'
services:
  postgres:
    image: postgres:10.3-alpine
    command: postgres -c 'max_connections=200'
    environment:
      POSTGRES_DB: pgdb
      POSTGRES_PASSWORD: postgres
      POSTGRES_USER: postgres
    stdin_open: true
    tty: true
    ports:
      - 5432:5432/tcp
It is as simple as (you just override the default CMD with postgres -N 500):
docker run -d --name db postgres:10 postgres -N 500
You can check it using:
docker run -it --rm --link db:db postgres psql -h db -U postgres
show max_connections;
max_connections
-----------------
500
(1 row)
The official image provides a way to run arbitrary SQL and shell scripts after the DB is initialized by putting them into the /docker-entrypoint-initdb.d/ directory. This script:
ALTER SYSTEM SET max_connections = 500;
will let us change the maximum connection limit. Note that the postgres server will be restarted after the initializing scripts are run, so even settings like max_connections that require a restart will go into effect when your container starts for the first time.
How you attach this script to the docker container depends on how you are starting it:
Docker
Save the SQL script to a file max_conns.sql, then use it as a volume:
docker run -it -v $PWD/max_conns.sql:/docker-entrypoint-initdb.d/max_conns.sql postgres
Docker Compose
With Docker Compose, save the SQL script to a file max_conns.sql next to your docker-compose.yaml, and then reference it:
version: '3'
services:
  db:
    image: postgres:latest
    volumes:
      - ./max_conns.sql:/docker-entrypoint-initdb.d/max_conns.sql
Kubernetes
With Kubernetes, you will need to create a ConfigMap for the script:
kind: ConfigMap
apiVersion: v1
metadata:
  name: max-conns
data:
  max_conns.sql: "ALTER SYSTEM SET max_connections = 500;"
And then use it with a container:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-example
spec:
  selector:
    matchLabels:
      app: postgres-example
  template:
    metadata:
      labels:
        app: postgres-example
    spec:
      containers:
      - name: postgres
        image: postgres:latest
        volumeMounts:
        - name: max-conns
          mountPath: /docker-entrypoint-initdb.d
      volumes:
      - name: max-conns
        configMap:
          name: max-conns
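As an alternative to writing the ConfigMap manifest above by hand, you could generate it from the SQL file (equivalent, assuming max_conns.sql is in the current directory):
kubectl create configmap max-conns --from-file=max_conns.sql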
If you are using Testcontainers, do this before starting the container:
postgreSQLContainer.setCommand("postgres", "-c", "max_connections=20000");
postgreSQLContainer.start();
From a quick search: the maximum value allowed for max_connections is 262143, the minimum is 1, and the default is 100.
I spent a lot of time on this issue, and the simplest way to resolve it is to add max_connections to your values.yaml file straight away. You can specify extended configuration parameters; this option will override the conf file.
For instance:
extendedConfiguration: "max_connections = 500"
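For example, with a Bitnami-style PostgreSQL chart this might sit under the primary key (an assumption; the exact key path varies by chart and version, so check your chart's values reference):
# values.yaml (hypothetical placement)
primary:
  extendedConfiguration: |
    max_connections = 500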
You can also write an init script that reads the max connections value from an environment variable, applies it during startup, and then lets the PostgreSQL service start as usual.
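A minimal sketch of such a script, assuming it is mounted into /docker-entrypoint-initdb.d/ and that a hypothetical MAX_CONNECTIONS environment variable is passed to the container:
#!/bin/bash
# /docker-entrypoint-initdb.d/set-max-conns.sh (hypothetical file name)
# Applies MAX_CONNECTIONS via ALTER SYSTEM; the official entrypoint restarts
# the server after init scripts run, so the setting takes effect on first boot.
set -e
if [ -n "${MAX_CONNECTIONS:-}" ]; then
  psql -v ON_ERROR_STOP=1 -U "$POSTGRES_USER" -d "$POSTGRES_DB" \
    -c "ALTER SYSTEM SET max_connections = ${MAX_CONNECTIONS};"
fi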

How to persist data on a volume when using docker swarm mode?

I'm new to Docker and I'm trying to set up Postgres and pgadmin4 to run as a single service on Docker for Mac inside a virtual machine. Everything works, but as soon as I stop the service my data is gone. I'm using a named volume to persist data but I'm probably doing something wrong. What is it?
Here's my setup:
# create my VM
docker-machine create dbvm
# set the right environment
eval $(docker-machine env dbvm)
Here's my docker-compose.yaml file:
version: '3'
services:
  db:
    image: postgres
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=my_db
    volumes:
      - pgdata:/pgdata
    ports:
      - 5432:5432
  pgadmin:
    image: fenglc/pgadmin4
    ports:
      - 5050:5050
    volumes:
      - pgadmindata:/pgadmindata
volumes:
  pgdata:
  pgadmindata:
With docker-compose.yaml, I run:
docker stack deploy -c docker-compose.yaml dbstack
I can do everything with this setup, but if I run docker stack rm dbstack the data is gone afterwards, even though the volumes still exist.
$ docker volume ls
DRIVER VOLUME NAME
local 0c15b0b22c6b850e8768c14045da166253424dda4df8d2e13df75fd54d833412
local 22bab81d9d1de0e07de97363596b096f944752eba617ff574a0ab525239227f5
local 6da6e29fb98ad0f66d7da6a75dc76066ce014b26ea43567c55ed318fda707105
local dbstack_pgadmindata
local dbstack_pgdata
What am I missing?
Unless you have it in some config not shown, I believe you need to map to the default data location inside the container, e.g. pgdata:/var/lib/postgresql/data
#Idg is partially correct. postgres data lives at /var/lib/postgresql/data per the Docker Hub readme.
But for it to work with your named volume, you can't use a path on the left side, so the correct value would be:
volumes:
  - pgdata:/var/lib/postgresql/data
Then the postgres data will stay in that named volume, on the node it was created on.
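Putting it together, the db service from the question would look like this (a sketch based on the compose file above, with only the volume target changed):
services:
  db:
    image: postgres
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=my_db
    volumes:
      - pgdata:/var/lib/postgresql/data
    ports:
      - 5432:5432
volumes:
  pgdata: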

OpenShift's YAML execution precedence regarding volume mounting and commands

As a beginner in container administration, I can't find a clear description of OpenShift's deployment stages and related YAML statements, specifically when persistent volume mounting and shell command execution are involved. For example, the Red Hat documentation has a lot of examples. A simple one is 16.4. Pod Object Definition:
apiVersion: v1
kind: Pod
metadata:
  name: busybox-nfs-pod
  labels:
    name: busybox-nfs-pod
spec:
  containers:
  - name: busybox-nfs-pod
    image: busybox
    command: ["sleep", "60000"]
    volumeMounts:
    - name: nfsvol-2
      mountPath: /usr/share/busybox
      readOnly: false
  securityContext:
    supplementalGroups: [100003]
    privileged: false
  volumes:
  - name: nfsvol-2
    persistentVolumeClaim:
      claimName: nfs-pvc
Now the question is: does the command sleep (or any other) execute before or after the mount of nfsvol-2 is finished? In other words, is it possible to use the volume's resources in such commands? And if that's not possible in this config, which event handlers should be used instead? I don't see any mention of an event like "volume mounted".
does the command sleep (or any other) execute before or after the mount of nfsvol-2 is finished?
To understand this, let's dig into the underlying concepts of OpenShift.
OpenShift is a container application platform that brings Docker and Kubernetes to the enterprise. So OpenShift is essentially an abstraction layer on top of Docker and Kubernetes, with additional features.
Regarding volumes and commands, let's consider the following example.
Run a Docker container that mounts a volume, mapping the host machine's /home directory to /root in the container (-v is the option to attach a volume):
$ docker run -it -v /home:/root ubuntu /bin/bash
Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
50aff78429b1: Pull complete
f6d82e297bce: Pull complete
275abb2c8a6f: Pull complete
9f15a39356d6: Pull complete
fc0342a94c89: Pull complete
Digest: sha256:f871d0805ee3ce1c52b0608108dbdf1b447a34d22d5c7278a3a9dd78fc12c663
Status: Downloaded newer image for ubuntu:latest
root@1f07f083ba79:/# cd /root/
root@1f07f083ba79:~# ls
lost+found raghavendralokineni raghu user1
root@1f07f083ba79:~# cd raghavendralokineni
root@1f07f083ba79:~/raghavendralokineni# pwd
/root/raghavendralokineni
Now execute the sleep command in the container and exit.
root@1f07f083ba79:~/raghavendralokineni# sleep 10
root@1f07f083ba79:~/raghavendralokineni#
root@1f07f083ba79:~/raghavendralokineni# exit
Check the files available in the /home path, which we mounted into the container. The contents are the same as those of the /root path in the container.
raghavendralokineni@iconic-glider-186709:/home$ ls
lost+found raghavendralokineni raghu user1
So when a volume is mounted into the container, any changes to the volume are reflected on the host machine as well.
Hence the volume is mounted when the container starts, and commands are executed after the container is started (and its mounts are in place).
Coming back to your YAML file:
volumeMounts:
- name: nfsvol-2
  mountPath: /usr/share/busybox
It says: mount the volume nfsvol-2 into the container; the information about the volume itself is given under volumes:
volumes:
- name: nfsvol-2
  persistentVolumeClaim:
    claimName: nfs-pvc
So the volume is mounted into the container, and then the specified command is executed:
containers:
- name: busybox-nfs-pod
  image: busybox
  command: ["sleep", "60000"]
Hope this helps.

How to backup a Postgres database in Kubernetes on Google Cloud?

What is the best practice for backing up a Postgres database running on Google Cloud Container Engine?
My thought is working towards storing the backups in Google Cloud Storage, but I am unsure of how to connect the Disk/Pod to a Storage Bucket.
I am running Postgres in a Kubernetes cluster using the following configuration:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: postgres-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - image: postgres:9.6.2-alpine
        imagePullPolicy: IfNotPresent
        env:
        - name: PGDATA
          value: /var/lib/postgresql/data
        - name: POSTGRES_DB
          value: my-database-name
        - name: POSTGRES_PASSWORD
          value: my-password
        - name: POSTGRES_USER
          value: my-database-user
        name: postgres-container
        ports:
        - containerPort: 5432
        volumeMounts:
        - mountPath: /var/lib/postgresql
          name: my-postgres-volume
      volumes:
      - gcePersistentDisk:
          fsType: ext4
          pdName: my-postgres-disk
        name: my-postgres-volume
I have attempted to create a Job to run a backup:
apiVersion: batch/v1
kind: Job
metadata:
  name: postgres-dump-job
spec:
  template:
    metadata:
      labels:
        app: postgres-dump
    spec:
      containers:
      - command:
        - pg_dump
        - my-database-name
        # `env` value matches `env` from previous configuration.
        image: postgres:9.6.2-alpine
        imagePullPolicy: IfNotPresent
        name: my-postgres-dump-container
        volumeMounts:
        - mountPath: /var/lib/postgresql
          name: my-postgres-volume
          readOnly: true
      restartPolicy: Never
      volumes:
      - gcePersistentDisk:
          fsType: ext4
          pdName: my-postgres-disk
        name: my-postgres-volume
(As far as I understand) this should run the pg_dump command and output the backup data to stdout (which should appear in the kubectl logs).
As an aside, when I inspect the Pods (with kubectl get pods), it shows the Pod never gets out of the "Pending" state, which I gather is due to there not being enough resources to start the Job.
Is it correct to run this process as a Job?
How do I connect the Job to Google Cloud Storage?
Or should I be doing something completely different?
I'm guessing it would be unwise to run pg_dump in the database Container (with kubectl exec) due to a performance hit, but maybe this is ok in a dev/staging server?
As @Marco Lamina said, you can run pg_dump on the postgres pod like this:
DUMP
// pod-name name of the postgres pod
// postgres-user database user that is able to access the database
// database-name name of the database
kubectl exec [pod-name] -- bash -c "pg_dump -U [postgres-user] [database-name]" > database.sql
RESTORE
// pod-name name of the postgres pod
// postgres-user database user that is able to access the database
// database-name name of the database
cat database.sql | kubectl exec -i [pod-name] -- psql -U [postgres-user] -d [database-name]
You can have a job pod that runs this command and exports the dump to a file storage system such as AWS S3.
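Since the question is about Google Cloud, you can also pipe the dump straight into a Cloud Storage bucket, assuming gsutil is installed and authenticated wherever you run this (the bucket name is a placeholder):
// pod-name, postgres-user, database-name as above
kubectl exec [pod-name] -- bash -c "pg_dump -U [postgres-user] [database-name]" | gzip | gsutil cp - gs://[your-bucket]/backup-$(date +%F).sql.gz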
I think running pg_dump as a job is a good idea, but connecting directly to your DB's persistent disk is not. Try having pg_dump connect to your DB over the network! You could then have a second disk onto which your pg_dump command dumps the backups. To be on the safe side, you can create regular snapshots of this second disk.
The reason the Job's pod stays in the Pending state is that it keeps trying to attach/mount the GCE persistent disk and fails, because the disk is already attached/mounted to another pod.
Attaching a persistent disk to multiple pods is only supported if all of them attach/mount the volume in ReadOnly mode. This is of course not a viable solution for you.
I have never worked with GCE, but it should be possible to easily create a snapshot of the PD from within GCE. This would not give a very clean backup, more like something in the state of "crashed in the middle, but recoverable", but this is probably acceptable for you.
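For reference, a snapshot of the disk from the question can be taken with gcloud (a sketch; the zone is a placeholder):
gcloud compute disks snapshot my-postgres-disk \
  --zone=[your-zone] \
  --snapshot-names=postgres-backup-$(date +%F)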
Running pg_dump inside the database pod is a viable solution, with a few drawbacks as you already noticed, especially performance. You'd also have to move the resulting backup out of the pod afterwards, e.g. by using kubectl cp, and run another exec to clean up the backup in the pod.
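That copy-and-clean-up sequence might look roughly like this (pod, user, and database names are placeholders):
kubectl exec [pod-name] -- bash -c "pg_dump -U [postgres-user] -Fc [database-name] -f /tmp/backup.dump"
kubectl cp [pod-name]:/tmp/backup.dump ./backup.dump
kubectl exec [pod-name] -- rm /tmp/backup.dump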
You can use the MinIO Client.
First of all, use a simple Dockerfile to build a Docker image that contains postgres along with the MinIO client (let's name this image postgres_backup):
FROM postgres
RUN apt-get update && apt-get install -y wget
RUN wget https://dl.min.io/client/mc/release/linux-amd64/mc
RUN chmod +x mc
RUN ./mc alias set gcs https://storage.googleapis.com BKIKJAA5BMMU2RHO6IBB V8f1CwQqAcwo80UEIJEjc5gVQUSSx5ohQ9GSrr12
Now you can use the postgres_backup image in your CronJob (I assume you have created a backups bucket in your Google Cloud Storage):
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: backup-job
spec:
  # Backup the database every day at 2AM
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: postgres-backup
            image: postgres_backup
            env:
            - name: POSTGRES_HOST_AUTH_METHOD
              value: trust
            command: ["/bin/sh"]
            args: ["-c", 'pg_dump -Fc -U [Your Postgres Username] -W [Your Postgres Password] -h [Your Postgres Host] [Your Postgres Database] | ./mc pipe gcs/backups/$(date -Iseconds).dump']
          restartPolicy: Never
A lot of tutorials use kubectl cp or transfer the file inside the pod, but you can also pipe the pg_dump container output directly to another process.
kubectl run --env=PGPASSWORD=$PASSWORD --image=bitnami/postgresql postgresql -it --rm -- \
bash -c "pg_dump -U $USER -h $HOST -d $DATABASE" |\
gzip > backup.sql.gz
The easiest way to dump without storing any additional copies on your pod:
kubectl -n [namespace] exec -it [pod name] -- bash -c "export PGPASSWORD='[db password]'; pg_dump -U [db user] [db name]" > [database].sql

Docker: change folder where to store docker volumes

On my Ubuntu EC2 instance I host an application using Docker containers. DB data and upload data are stored in the volumes CaseBook-data-db and CaseBook-data-uploads, which are created with these commands:
docker volume create --name=CaseBook-data-db
docker volume create --name=CaseBook-data-uploads
The volumes are attached through the docker-compose file:
version: '2'
services:
  mongo:
    container_name: "CaseBook-db"
    restart: always
    image: mongo:3.2.7
    ports:
      - "27017"
    volumes:
      - data_db:/data/db
    labels:
      - "ENVIRONMENT_TYPE=meteor"
  app:
    container_name: "CaseBook-app"
    restart: always
    image: "meteor/casebook"
    build: .
    depends_on:
      - mongo
    environment:
      - MONGO_URL=mongodb://mongo:27017/CaseBook
    ports:
      - "80:3000"
    volumes:
      - data_uploads:/Meteor-CaseBook-Container/.uploads
    labels:
      - "ENVIRONMENT_TYPE=meteor"
volumes:
  data_db:
    external:
      name: CaseBook-data-db
  data_uploads:
    external:
      name: CaseBook-data-uploads
I need to store those Docker volumes in a different folder (for example /home/ubuntu/data/) on the host machine. How do I change the folder Docker uses to store volumes? Or is there a better way of doing this? Thank you in advance.
Named volumes will be stored inside docker's folder (/var/lib/docker). If you want to create a volume in a specific host folder, use a host volume with the following syntax:
docker run -v /home/ubuntu/data/app-data:/app-data my-image
Or from your compose file:
version: '2'
services:
  mongo:
    container_name: "CaseBook-db"
    restart: always
    image: mongo:3.2.7
    ports:
      - "27017"
    volumes:
      - /home/ubuntu/data/db:/data/db
    labels:
      - "ENVIRONMENT_TYPE=meteor"
  app:
    container_name: "CaseBook-app"
    restart: always
    image: "meteor/casebook"
    build: .
    depends_on:
      - mongo
    environment:
      - MONGO_URL=mongodb://mongo:27017/CaseBook
    ports:
      - "80:3000"
    volumes:
      - /home/ubuntu/data/uploads:/Meteor-CaseBook-Container/.uploads
    labels:
      - "ENVIRONMENT_TYPE=meteor"
With host volumes, any contents of the volume inside the image will be overlaid with the exact contents of the host folder, including the UIDs of the host folder. An empty host folder is not initialized from the image the way an empty named volume is. UID mappings tend to be the most difficult part of using a host volume.
Edit: from the comments below, if you need a named volume that acts as a host volume, there is a local persist volume plugin that's listed on docker's plugin list. After installing the plugin, you can create volumes that point to host folders, with the feature that even after removing the named volume, the host directory is left behind. Sample usage from the plugin includes:
docker volume create -d local-persist -o mountpoint=/data/images --name=images
docker run -d -v images:/path/to/images/on/one/ one
docker run -d -v images:/path/to/images/on/two/ two
They also include a v2 compose file with the following volume example:
volumes:
  data:
    driver: local-persist
    driver_opts:
      mountpoint: /data/local-persist/data
One additional option that I've been made aware of in the past month is to use the local volume driver's mount options to manually create a bind mount. This is similar to a host volume in docker with the following differences:
If the directory doesn't exist, trying to start a container with a named volume pointing to a bind mount will fail. With host volumes, docker will initialize it to an empty directory owned by root.
If the directory is empty, a named volume will initialize the bind mount with the contents of the image at the mount location, including file and directory ownership/permissions. With a host volume, there is no initialization of the host directory contents.
To create a named volume as a bind mount, you can create it in advance with:
docker volume create --driver local \
--opt type=none \
--opt device=/home/user/test \
--opt o=bind \
test_vol
From a docker run command, this can be done with --mount:
docker run -it --rm \
--mount type=volume,dst=/container/path,volume-driver=local,volume-opt=type=none,volume-opt=o=bind,volume-opt=device=/home/user/test \
foo
Or in a compose file, you can create the named volume with:
volumes:
  data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /home/user/test
My preference would be to use the named volume with the local driver instead of the local-persist 3rd party driver if you need the named volume features.
Another way, with the built-in local driver:
docker volume create --opt type=none --opt device=/home/ubuntu/data/ --opt o=bind data_db
(This uses DimonVersace's example, with data_db declared as an external named volume in docker-compose and /home/ubuntu/data/ as the folder on the host machine.)