I have a Postgres DB container running in a Kubernetes cluster. I need to write a Kubernetes Job that connects to the Postgres DB container and runs the scripts from an SQL file. I need to understand two things here:
1. the commands to run the SQL script
2. how to load the SQL file in the Job YAML file
Here is my sample YAML file for the Kubernetes Job:
apiVersion: batch/v1
kind: Job
metadata:
  name: init-db
spec:
  template:
    metadata:
      name: init-db
      labels:
        app: init-postgresdb
    spec:
      containers:
        - image: "docker.io/bitnami/postgresql:11.5.0-debian-9-r60"
          name: init-db
          command:
            - psql -U postgres
          env:
            - name: DB_HOST
              value: "knotted-iguana-postgresql"
            - name: DB_DATABASE
              value: "postgres"
      restartPolicy: OnFailure
You have to mount the SQL file as a volume from a ConfigMap and use the psql CLI to execute the commands from the mounted file.
To execute commands from a file, change the command parameter in the YAML to:
psql -a -f sqlCommands.sql
The ConfigMap needs to be created from the file you intend to mount:
kubectl create configmap sql-commands --from-file=sqlCommands.sql
Then add the ConfigMap and the volume mount to your Job YAML and modify the command to use the mounted file.
apiVersion: batch/v1
kind: Job
metadata:
  name: init-db
spec:
  template:
    metadata:
      name: init-db
      labels:
        app: init-postgresdb
    spec:
      containers:
        - image: "docker.io/bitnami/postgresql:11.5.0-debian-9-r60"
          name: init-db
          command: [ "/bin/sh", "-c", "psql -a -f /sql/sqlCommands.sql" ]
          volumeMounts:
            - name: sql-command
              mountPath: /sql
          env:
            - name: DB_HOST
              value: "knotted-iguana-postgresql"
            - name: DB_DATABASE
              value: "postgres"
      volumes:
        - name: sql-command
          configMap:
            # Provide the name of the ConfigMap containing the files you want
            # to add to the container
            name: sql-commands
      restartPolicy: OnFailure
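Note that psql still needs to know where and how to connect; by default it tries a local socket. A minimal sketch of the same container with the connection details wired in is shown below; the Secret name and key are assumptions that depend on how your PostgreSQL chart was installed, so verify them with kubectl get secrets first.
command: [ "/bin/sh", "-c", "psql -h \"$DB_HOST\" -U postgres -d \"$DB_DATABASE\" -a -f /sql/sqlCommands.sql" ]
env:
  - name: DB_HOST
    value: "knotted-iguana-postgresql"
  - name: DB_DATABASE
    value: "postgres"
  - name: PGPASSWORD
    valueFrom:
      secretKeyRef:
        name: knotted-iguana-postgresql   # assumed Secret created by the chart at install time
        key: postgresql-password          # assumed key; verify with kubectl describe secret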
Alternatively, you can build a Docker image first, verify that it runs, and then reference that working image in the Kubernetes Job YAML.
You can add an entrypoint.sh to the Dockerfile, where you place the scripts to be executed, as sketched below.
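A minimal sketch of that approach, assuming the same Bitnami image and the DB_HOST/DB_DATABASE variables from the Job above (the file names and the hard-coded postgres user are assumptions):
# Dockerfile (sketch)
FROM docker.io/bitnami/postgresql:11.5.0-debian-9-r60
COPY sqlCommands.sql /sqlCommands.sql
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/bin/sh", "/entrypoint.sh"]

#!/bin/sh
# entrypoint.sh (sketch): DB_HOST and DB_DATABASE come from the Job's env;
# PGPASSWORD must also be provided, e.g. from a Secret.
set -e
psql -h "$DB_HOST" -U postgres -d "$DB_DATABASE" -a -f /sqlCommands.sql
The Job then points its image field at the image built from this Dockerfile instead of the stock PostgreSQL image, and no command needs to be set.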
I'm running a StatefulSet where each replica requires its own unique configuration. To achieve that I'm currently using a configuration with two containers per Pod:
An initContainer prepares the configuration and stores it to a shared volume
A main container consumes the configuration by outputting the contents of the shared volume and passing it to the program as CLI flags.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-app
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: my-app
  serviceName: my-app
  template:
    metadata:
      labels:
        app.kubernetes.io/name: my-app
    spec:
      initContainers:
        - name: generate-config
          image: myjqimage:latest
          command: [ "/bin/sh" ]
          args:
            - -c
            - |
              set -eu -o pipefail
              POD_INDEX="${HOSTNAME##*-}"
              # A configuration is stored as a JSON array in a Secret
              # E.g., [{"param1":"string1","param2":"string2"}]
              echo "$MY_APP_CONFIG" | jq -rc --arg i "$POD_INDEX" '.[$i|tonumber-1].param1' > /config/param1
              echo "$MY_APP_CONFIG" | jq -rc --arg i "$POD_INDEX" '.[$i|tonumber-1].param2' > /config/param2
          env:
            - name: MY_APP_CONFIG
              valueFrom:
                secretKeyRef:
                  name: my-app
                  key: config
          volumeMounts:
            - name: configs
              mountPath: "/config"
      containers:
        - name: my-app
          image: myapp:latest
          command:
            - /bin/sh
          args:
            - -c
            - |
              /myapp --param1 $(cat /config/param1) --param2 $(cat /config/param2)
          volumeMounts:
            - name: configs
              mountPath: "/config"
      volumes:
        - name: configs
          emptyDir:
            medium: "Memory"
---
apiVersion: v1
kind: Secret
metadata:
  name: my-app
  namespace: default
  labels:
    app.kubernetes.io/name: my-app
type: Opaque
data:
  config: W3sicGFyYW0xIjoic3RyaW5nMSIsInBhcmFtMiI6InN0cmluZzIifV0=
Now I want to switch to distroless for my main container. Distroless images contain only the dependencies required to run the program (glibc in my case) and have no shell. So whereas previously I could cat a file and output its contents, now I'm stuck.
Instead of reading the contents from a file, I would have to pass the CLI flags as environment variables, something like this:
containers:
  - name: my-app
    image: myapp:latest
    command: ["/myapp", "--param1", "$(PARAM1)", "--param2", "$(PARAM2)"]
    env:
      - name: PARAM1
        value: somevalue1
      - name: PARAM2
        value: somevalue2
Again, each Pod in a StatefulSet should have a unique configuration. I.e., PARAM1 and PARAM2 should be unique across the Pods in a StatefulSet. How do I achieve that?
Options I considered:
- Using Debug Containers, a new feature of K8s. Somehow use it to edit the configuration of a running container at runtime and inject the required variables. But the feature only became beta in 1.23, and I don't want to mutate my StatefulSet at runtime since I'm using a GitOps approach to store the configuration in Git. It would probably cause continuous configuration drift.
- Using a Job to mutate the configuration at runtime. Again, this looks very ugly and violates the GitOps principle.
- Using shareProcessNamespace. I'm unsure if it can help, but maybe I can somehow inject the environment variables from within the initContainer.
Limitations:
- The application only supports configuration provisioned through CLI flags. No environment variables, no loading the config from a file.
I have a Dockerfile based on the postgres:12 image, which I modified with some DDL scripts. I build the image and can run the container with the docker run command, but I don't know how to use a Kubernetes Job to run the built image, as I don't have much experience with k8s.
This is my Dockerfile, which I build with:
docker build . -t dockerdb
FROM postgres:12
ENV POSTGRES_PASSWORD xyz#123123!233
ENV POSTGRES_DB test
ENV POSTGRES_USER test
COPY ./Scripts /docker-entrypoint-initdb.d/
How can I adapt the Job below to this requirement?
apiVersion: batch/v1
kind: Job
metadata:
  name: job-1
spec:
  template:
    metadata:
      name: job-1
    spec:
      containers:
        - name: postgres
          image: gcr.io/project/pg_12:dev
          command:
            - /bin/sh
            - -c
            - "not sure what command should i give in last line"
I'm not sure how you are running the Docker image. If you run your Docker image without passing any command, you can run the image directly in a Job:
docker run <imagename>
Once your Docker image is built and ready, you can run it directly.
Your Job will be executed without passing any command:
apiVersion: batch/v1
kind: Job
metadata:
  name: job-1
spec:
  template:
    metadata:
      name: job-1
    spec:
      containers:
        - name: postgres
          image: gcr.io/project/pg_12:dev
      restartPolicy: OnFailure
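Assuming the manifest is saved as job.yaml (the file name is just an example), you can apply it and inspect the result with:
kubectl apply -f job.yaml
# pods created by the Job carry the job-name label
kubectl get pods -l job-name=job-1
kubectl logs job/job-1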
If you want to pass a command or arguments, you can do so as in the following example:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: hello
              image: <CHANGE IMAGE URL>
              imagePullPolicy: IfNotPresent
              command:
                - /bin/sh
                - -c
                - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
Just to note, the template above is for a CronJob; a CronJob runs on a schedule.
Currently I'm trying to create a PostgreSQL Deployment for replication, using the postgres:latest image.
The config files are located by default within the data directory /var/lib/postgresql/data. For replication to work I need the data directory to be empty, but that means I have to keep the config files elsewhere.
Referring to the PostgreSQL Documentation:
If you wish to keep the configuration files elsewhere than the data directory, the postgres -D command-line option or PGDATA environment variable must point to the directory containing the configuration files, and the data_directory parameter must be set in postgresql.conf (or on the command line) to show where the data directory is actually located. Notice that data_directory overrides -D and PGDATA for the location of the data directory, but not for the location of the configuration files.
In a physical machine setup, we can manually move the files and set the location of the data directory in the postgresql.conf file. However, in Kubernetes it is not so straightforward.
I tried to use a volumeMount with subPath to mount the config files in another location, and then used command to point PostgreSQL at the new location of postgresql.conf.
Sample .yaml file:
apiVersion: v1
kind: ConfigMap
metadata:
  name: pg-replica
  labels:
    app: postgres
    name: pg-replica
data:
  POSTGRES_DB: postgres
  POSTGRES_USER: postgres
  POSTGRES_PASSWORD: mypassword
  pg_hba.conf: |
    # Contents
  postgresql.conf: |
    data_directory = /var/lib/postgresql/data/data-directory
  recovery.conf: |
    # Contents
---
apiVersion: v1
kind: Service
metadata:
  name: pg-replica
  labels:
    app: postgres
    name: pg-replica
spec:
  type: NodePort
  ports:
    - nodePort: 31000
      port: 5432
  selector:
    app: postgres
    name: pg-replica
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pg-replica
spec:
  selector:
    matchLabels:
      app: postgres
      name: pg-replica
  replicas: 1
  template:
    metadata:
      labels:
        app: postgres
        name: pg-replica
    spec:
      containers:
        - name: pg-replica
          image: postgres:latest
          imagePullPolicy: "IfNotPresent"
          ports:
            - containerPort: 5432
          envFrom:
            - configMapRef:
                name: pg-replica
          volumeMounts:
            - name: pg-replica
              mountPath: /var/lib/postgresql/data
            - name: replica-config
              mountPath: /var/lib/postgresql/postgresql.conf
              subPath: postgresql.conf
            - name: replica-config
              mountPath: /var/lib/postgresql/pg_hba.conf
              subPath: pg_hba.conf
            - name: replica-config
              mountPath: /var/lib/postgresql/recovery.conf
              subPath: recovery.conf
          command:
            - "/bin/bash"
            - "postgres -c config_file=/var/lib/postgresql/postgresql.conf"
      volumes:
        - name: pg-replica
          persistentVolumeClaim:
            claimName: pv-replica-claim
        - name: replica-config
          configMap:
            name: pg-replica
The returned message was as follows:
/bin/bash: postgres -c config_file=/var/lib/postgresql/postgresql.conf: No such file or directory
What is wrong with this configuration? And what steps am I missing to make it work?
Edit:
When using the volumeMount field, the directory was overwritten (all other files were removed) even though I specified the exact file to mount with subPath. What could be the cause of this?
I realized there were a few mistakes here after posting this question...
I had used PostgreSQL 11 for replication before, so I assumed PostgreSQL 12 worked the same way (which of course is wrong; there are some changes). recovery.conf was removed in PostgreSQL 12, and keeping it produced the error FATAL: XX000: using recovery command file "recovery.conf" is not supported, so I had to remove it from my ConfigMap.
I had also confused how Docker's ENTRYPOINT and CMD map to Kubernetes' command and args. After being corrected by my senior that the Kubernetes command overrides the Docker ENTRYPOINT, I only needed to use args.
The following are the changes I made to my ConfigMap and Deployment.
apiVersion: v1
kind: ConfigMap
metadata:
  name: pg-replica
  labels:
    app: postgres
    name: pg-replica
data:
  POSTGRES_DB: postgres
  POSTGRES_USER: postgres
  POSTGRES_PASSWORD: mypassword
  pg_hba.conf: |
    # Contents
  postgresql.conf: |
    data_directory = '/var/lib/postgresql/data'
    # the contents from recovery.conf are integrated into postgresql.conf
    primary_conninfo = # host address and authentication credentials
    promote_trigger_file = # trigger file path
  extra.sh: |
    #!/bin/sh
    postgres -D /var/lib/postgresql
---
apiVersion: v1
kind: Service
metadata:
  name: pg-replica
  labels:
    app: postgres
    name: pg-replica
spec:
  type: NodePort
  ports:
    - nodePort: 31000
      port: 5432
  selector:
    app: postgres
    name: pg-replica
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pg-replica
spec:
  selector:
    matchLabels:
      app: postgres
      name: pg-replica
  replicas: 1
  template:
    metadata:
      labels:
        app: postgres
        name: pg-replica
    spec:
      containers:
        - name: pg-replica
          image: postgres:latest
          imagePullPolicy: "IfNotPresent"
          ports:
            - containerPort: 5432
          envFrom:
            - configMapRef:
                name: pg-replica
          volumeMounts:
            - name: pg-replica
              mountPath: /var/lib/postgresql/data
            - name: replica-config
              mountPath: /var/lib/postgresql/postgresql.conf
              subPath: postgresql.conf
            - name: replica-config
              mountPath: /var/lib/postgresql/pg_hba.conf
              subPath: pg_hba.conf
            - name: replica-config
              mountPath: /docker-entrypoint-initdb.d/extra.sh
              subPath: extra.sh
          args:
            - "-c"
            - "config_file=/var/lib/postgresql/postgresql.conf"
            - "-c"
            - "hba_file=/var/lib/postgresql/pg_hba.conf"
      volumes:
        - name: pg-replica
          persistentVolumeClaim:
            claimName: pv-replica-claim
        - name: replica-config
          configMap:
            name: pg-replica
The arguments in args set the location of the .conf files to the paths I specified.
Further steps for replication:
1. After the pod was up, I manually opened a shell in the pod with kubectl exec.
2. I removed all the files from the data directory so it could receive the files copied from the master pod in step 3:
   rm -rf /var/lib/postgresql/data/*
3. I used pg_basebackup to back up the data from the master node:
   pg_basebackup -h <host IP> --port=<port number used> -D /var/lib/postgresql/data -P -U replica -R -X stream
And that's it. My pg-replica pod is now replicating from my master pod.
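As a quick sanity check (a sketch; the pod name placeholder and the postgres superuser are assumptions based on the setup above), you can confirm the replica is streaming by querying the master:
# run against the master pod
kubectl exec -it <master-pod-name> -- psql -U postgres -c "SELECT client_addr, state FROM pg_stat_replication;"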
As mentioned in the comments, I really encourage you to use the Postgres Helm chart to set up your environment.
The way you solved the issue could work, but if the pod dies for some reason, all the work you have done will be lost and you'll need to reconfigure everything again.
Here you can find all the information about how to create a Postgres deployment with high availability and replication.
To install Helm you can follow this guide.
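For illustration, a minimal install of the Bitnami PostgreSQL chart with replication could look like the sketch below; the exact value names differ between chart versions, so treat them as assumptions and check the chart's documentation for your version:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
# value names are assumptions and vary by chart version
helm install my-postgres bitnami/postgresql \
  --set replication.enabled=true \
  --set replication.readReplicas=1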
As the documentation shows, you should set the env vars when doing a docker run, like the following:
docker run --name some-postgres -e POSTGRES_PASSWORD='foo' -e POSTGRES_USER='bar' postgres
This sets the superuser and password to access the database instead of the defaults of POSTGRES_PASSWORD='' and POSTGRES_USER='postgres'.
However, I'm using Skaffold to spin up a k8s cluster and I'm trying to figure out how to do something similar. How does one go about doing this for Kubernetes and Skaffold?
@P Ekambaram is correct, but I would like to go further into this topic and explain the "whys and hows".
When passing passwords on Kubernetes, it's highly recommended to keep them out of plain manifests, and you can do this by using Secrets.
Creating your own Secrets (Doc)
To be able to use the secrets as described by @P Ekambaram, you need to have a secret in your Kubernetes cluster.
To easily create a secret, you can also create a Secret from generators and then apply it to create the object on the API server. The generators should be specified in a kustomization.yaml inside a directory.
For example, to generate a Secret from literals username=admin and password=secret, you can specify the secret generator in kustomization.yaml as
# Create a kustomization.yaml file with SecretGenerator
$ cat <<EOF >./kustomization.yaml
secretGenerator:
- name: db-user-pass
literals:
- username=admin
- password=secret
EOF
Apply the kustomization directory to create the Secret object.
$ kubectl apply -k .
secret/db-user-pass-dddghtt9b5 created
Using Secrets as Environment Variables (Doc)
This is an example of a pod that uses secrets from environment variables:
apiVersion: v1
kind: Pod
metadata:
  name: secret-env-pod
spec:
  containers:
    - name: mycontainer
      image: redis
      env:
        - name: SECRET_USERNAME
          valueFrom:
            secretKeyRef:
              name: mysecret
              key: username
        - name: SECRET_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysecret
              key: password
  restartPolicy: Never
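Tying this back to the original question, a sketch of a Postgres container consuming the generated Secret could look like the snippet below. Note that the secret generator appends a content hash to the name (db-user-pass-dddghtt9b5 in the example above), so the secretKeyRef name must match the generated name, or you can disable the suffix with generatorOptions.disableNameSuffixHash in the kustomization.
containers:
  - name: postgres
    image: postgres
    env:
      - name: POSTGRES_USER
        valueFrom:
          secretKeyRef:
            name: db-user-pass-dddghtt9b5   # generated name; the suffix depends on the Secret contents
            key: username
      - name: POSTGRES_PASSWORD
        valueFrom:
          secretKeyRef:
            name: db-user-pass-dddghtt9b5
            key: password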
Use the YAML below:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      name: postgres
  template:
    metadata:
      labels:
        name: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:11.2
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_DB
              value: "sampledb"
            - name: POSTGRES_USER
              value: "postgres"
            - name: POSTGRES_PASSWORD
              value: "secret"
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql
      volumes:
        - name: data
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  type: ClusterIP
  ports:
    - port: 5432
  selector:
    name: postgres
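To verify that the database is reachable inside the cluster, you can start a throwaway client pod as a quick check (a sketch using the credentials from the manifest above):
kubectl run pg-client --rm -it --restart=Never \
  --image=postgres:11.2 \
  --env="PGPASSWORD=secret" \
  -- psql -h postgres -U postgres -d sampledb -c "SELECT version();"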
A problem:
Docker arguments pass through from the command line:
docker run -it -p 8080:8080 joethecoder2/spring-boot-web -Dcassandra_ip=127.0.0.1 -Dcassandra_port=9042
However, the Kubernetes Pod arguments are not passed through from the singlePod.yaml file:
apiVersion: v1
kind: Pod
metadata:
  name: spring-boot-web-demo
  labels:
    purpose: demonstrate-spring-boot-web
spec:
  containers:
    - name: spring-boot-web
      image: docker.io/joethecoder2/spring-boot-web
      env: ["name": "-Dcassandra_ip", "value": "127.0.0.1"]
      command: ["java","-jar", "spring-boot-web-0.0.1-SNAPSHOT.jar", "-D","cassandra_ip=127.0.0.1", "-D","cassandra_port=9042"]
      args: ["-Dcassandra_ip=127.0.0.1", "-Dcassandra_port=9042"]
  restartPolicy: OnFailure
when I do:
kubectl create -f ./singlePod.yaml
Why don't you pass the arguments as environment variables? It looks like you're using Spring Boot, so this shouldn't even require code changes, since Spring Boot picks up environment variables.
The following should work:
apiVersion: v1
kind: Pod
metadata:
  name: spring-boot-web-demo
  labels:
    purpose: demonstrate-spring-boot-web
spec:
  containers:
    - name: spring-boot-web
      image: docker.io/joethecoder2/spring-boot-web
      command: ["java","-jar", "spring-boot-web-0.0.1-SNAPSHOT.jar"]
      env:
        - name: cassandra_ip
          value: "127.0.0.1"
        - name: cassandra_port
          value: "9042"