kubernetes lifecycle commands not running - kubernetes

I have been trying to get prestop to run a script before the pod terminates (to prolong the termination until the current job has finished), but command doesn't seem to be executing the commands. I've temporarily added an echo command, which i would expect to see in kubectl logs for the pod, i can't see this either.
This is part of the (otherwise working) deployment spec:
containers:
- name: file-blast-app
image: my_image:stuff
imagePullPolicy: Always
lifecycle:
preStop:
exec:
command: ["echo","PRE STOP!"]
Does anyone know why this would not be working and if i'm correct to expect the logs from hook in kubectl logs for the pod?

You forgot to mention the shell through which you want this command to be executed.
Try using the following in your YAML.
containers:
- name: file-blast-app
image: my_image:stuff
imagePullPolicy: Always
lifecycle:
preStop:
exec:
command: ["/bin/sh","-c","echo PRE STOP!"]
Also, one thing to note is that a preStop hook only gets executed when a pod is terminated, and not when it is completed. You can read more on this here.
You can also refer to the K8S official documentation for lifecycle hooks here.

Related

Ensure pod deletion when one of container terminates [duplicate]

I have a Kubernetes JOB that does database migrations on a CloudSQL database.
One way to access the CloudSQL database from GKE is to use the CloudSQL-proxy container and then connect via localhost. Great - that's working so far. But because I'm doing this inside a K8s JOB the job is not marked as successfully finished because the proxy keeps on running.
$ kubectrl get po
NAME READY STATUS RESTARTS AGE
db-migrations-c1a547 1/2 Completed 0 1m
Even though the output says 'completed' one of the initially two containers is still running - the proxy.
How can I make the proxy exit on completing the migrations inside container 1?
The best way I have found is to share the process namespace between containers and use the SYS_PTRACE securityContext capability to allow you to kill the sidecar.
apiVersion: batch/v1
kind: Job
metadata:
name: my-db-job
spec:
template:
spec:
restartPolicy: OnFailure
shareProcessNamespace: true
containers:
- name: my-db-job-migrations
command: ["/bin/sh", "-c"]
args:
- |
<your migration commands>;
sql_proxy_pid=$(pgrep cloud_sql_proxy) && kill -INT $sql_proxy_pid;
securityContext:
capabilities:
add:
- SYS_PTRACE
- name: cloudsql-proxy
image: gcr.io/cloudsql-docker/gce-proxy:1.17
command:
- "/cloud_sql_proxy"
args:
- "-instances=$(DB_CONNECTION_NAME)=tcp:5432"
One possible solution would be a separate cloudsql-proxy deployment with a matching service. You would then only need your migration container inside the job that connects to your proxy service.
This comes with some downsides:
higher network latency, no pod local mysql communication
possible security issue if you provide the sql port to your whole kubernetes cluster
If you want to open cloudsql-proxy to the whole cluster you have to replace tcp:3306 with tcp:0.0.0.0:3306 in the -instance parameter on the cloudsql-proxy.
There are 3 ways of doing this.
1- Use private IP to connect your K8s job to Cloud SQL, as described by #newoxo in one of the answers. To do that, your cluster needs to be a VPC-native cluster. Mine wasn't and I was not whiling to move all my stuff to a new cluster. So I wasn't able to do this.
2- Put the Cloud SQL Proxy container in a separate deployment with a service, as described by #Christian Kohler. This looks like a good approach, but it is not recommended by Google Cloud Support.
I was about to head in this direction (solution #2) but I decided to try something else.
And here is the solution that worked for me:
3- You can communicate between different containers in the same Pod/Job using the file system. The idea is to tell the Cloud SQL Proxy container when the main job is done, and then kill the cloud sql proxy. Here is how to do it:
In the yaml file (my-job.yaml)
apiVersion: v1
kind: Pod
metadata:
name: my-job-pod
labels:
app: my-job-app
spec:
restartPolicy: OnFailure
containers:
- name: my-job-app-container
image: my-job-image:0.1
command: ["/bin/bash", "-c"]
args:
- |
trap "touch /lifecycle/main-terminated" EXIT
{ your job commands here }
volumeMounts:
- name: lifecycle
mountPath: /lifecycle
- name: cloudsql-proxy-container
image: gcr.io/cloudsql-docker/gce-proxy:1.11
command: ["/bin/sh", "-c"]
args:
- |
/cloud_sql_proxy -instances={ your instance name }=tcp:3306 -credential_file=/secrets/cloudsql/credentials.json &
PID=$!
while true
do
if [[ -f "/lifecycle/main-terminated" ]]
then
kill $PID
exit 0
fi
sleep 1
done
securityContext:
runAsUser: 2 # non-root user
allowPrivilegeEscalation: false
volumeMounts:
- name: cloudsql-instance-credentials
mountPath: /secrets/cloudsql
readOnly: true
- name: lifecycle
mountPath: /lifecycle
volumes:
- name: cloudsql-instance-credentials
secret:
secretName: cloudsql-instance-credentials
- name: lifecycle
emptyDir:
Basically, when your main job is done, it will create a file in /lifecycle that will be identified by the watcher added to the cloud-sql-proxy container, which will kill the proxy and terminate the container.
I hope it helps! Let me know if you have any questions.
Based on: https://stackoverflow.com/a/52156131/7747292
Doesn't look like Kubernetes can do this alone, you would need to manually kill the proxy once the migration exits. Similar question asked here: Sidecar containers in Kubernetes Jobs?
Google cloud sql has recently launched private ip address connectivity for cloudsql. If the cloud sql instance and kubernetes cluster is in same region you can connect to cloudsql without using cloud sql proxy.
https://cloud.google.com/sql/docs/mysql/connect-kubernetes-engine#private-ip
A possible solution would be to set the concurrencyPolicy: Replace in the job spec ... this will agnostically replace the current pod with the new instance whenever it needs to run again. But, you have to make sure that the subsequent cron runs are separated enough.
Unfortunately the other answers weren't working for me because of CloudSQLProxy running in a distroless environment where there is no shell.
I managed to get around this by bundling a CloudSQLProxy binary with my deployment and running a bash script to start up CloudSQLProxy followed by my app.
Dockerfile:
FROM golang:1.19.4
RUN apt update
COPY . /etc/mycode/
WORKDIR /etc/mycode
RUN chmod u+x ./scripts/run_migrations.sh
RUN chmod u+x ./bin/cloud_sql_proxy.linux-amd64
RUN go install
ENTRYPOINT ["./scripts/run_migrations.sh"]
Shell Script (run_migrations.sh):
#!/bin/sh
# This script is run from the parent directory
dbConnectionString=$1
cloudSQLProxyPort=$2
echo "Starting Cloud SQL Proxy"
./bin/cloud_sql_proxy.linux-amd64 -instances=${dbConnectionString}=tcp:5432 -enable_iam_login -structured_logs &
CHILD_PID=$!
echo "CloudSQLProxy PID: $CHILD_PID"
echo "Migrating DB..."
go run ./db/migrations/main.go
MAIN_EXIT_CODE=$?
kill $CHILD_PID;
echo "Migrations complete.";
exit $MAIN_EXIT_CODE
K8s (via Pulumi):
import * as k8s from '#pulumi/kubernetes'
const jobDBMigrations = new k8s.batch.v1.Job("job-db-migrations", {
metadata: {
namespace: namespaceName,
labels: appLabels,
},
spec: {
backoffLimit: 4,
template: {
spec: {
containers: [
{
image: pulumi.interpolate`gcr.io/${gcpProject}/${migrationsId}:${migrationsVersion}`,
name: "server-db-migration",
args: [
dbConnectionString,
],
},
],
restartPolicy: "Never",
serviceAccount: k8sSAMigration.metadata.name,
},
},
},
},
{
provider: clusterProvider,
});

k8s check wait for pod existence

In K8s job file declared in yaml (with helm), I need that the job will run only whether the pod database exists and ready.
For the readiness, it works fine, since I added the following:
initContainers
- name: wait-mysql-ready
image: "bitnami//kubectl:latest"
imagePullPolicy: IfNotPresent
command:
- kubectl
args:
- wait
- pod/mysql-pod
- --for=condition=ready
- --timeout=120s
It works fine, but I need the job to run once (without duplications, and with long time period till it ends).
The job doesn't as usual run once, and it's name, when running kubectl get pods is <jobname-hash>
If it doesn't run once, at a the last attempt it will succeed (because the pod isn't created yet). Other attempt may failed.
So I added the following lines in main spec:
spec:
parallelism: 1
completions: 1
backoffLimit: 0
and in init container section (befor the previous one), I have added:
initContainers
- name: wait-mysql-exist-pod
image: "bitnami/kubectl:latest"
imagePullPolicy: IfNotPresent
command:
- /bin/sh
args:
- -c
- "while !(kubectl get pod mysql-pod); do echo 'Waiting for mysql pod to be existed...'; sleep 5; done"
(I could not find other good syntax for ! - for multiline string. I would be glade knowing how).
Also need another job to wait for the current job.
How can I create a job that run once, and check in the job that the pod exists before checking the ready state?
You can use this script as init-container to wait for other resources.

Execute a command after pod startup without overriding image entrypoint

I want to know if there is any solution in order to submit a flink job to a kubernetes cluster.
In the jobmanager deployment file after startup I tried to add a command option to my jobmanager pod but I realized that the command I passed override the image entrypoint.
So I want to know if there is a solution to do so?
Yes, if you provide a command and/or its args, it overrides the original image's Entrypoint and/or Cmd. If you want to know how exactly it happens, please refer to this fragment of the official kubernetes docs.
If you want to run some additional command immediatelly after your Pod startup, you can do it with a postStart handler, which usage is presented in this example:
apiVersion: v1
kind: Pod
metadata:
name: lifecycle-demo
spec:
containers:
- name: lifecycle-demo-container
image: nginx
lifecycle:
postStart:
exec:
command: ["/bin/sh", "-c", "echo Hello from the postStart handler > /usr/share/message"]
preStop:
exec:
command: ["/bin/sh","-c","nginx -s quit; while killall -0 nginx; do sleep 1; done"]
Not specifically. Really it's up to your application. The only thing Kubernetes can control is what the initial command run inside the container is and there can only be one of those.

Kubernetes can analytical jobs be chained together in a workflow?

Reading the Kubernetes "Run to Completion" documentation, it says that jobs can be run in parallel, but is it possible to chain together a series of jobs that should be run in sequential order (parallel and/or non-parallel).
https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/
Or is it up to the user to keep track of which jobs have finished and triggering the next job using a PubSub messaging service?
I have used initContainers under the PodSpec in the past to solve problems like this: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/
apiVersion: v1
kind: Pod
metadata:
name: myapp-pod
labels:
app: myapp
spec:
containers:
- name: myapp-container
image: busybox
command: ['sh', '-c', 'echo The app is running! && sleep 3600']
initContainers:
- name: init-myservice
image: busybox
command: ['sh', '-c', 'until nslookup myservice; do echo waiting for myservice; sleep 2; done;']
- name: init-mydb
image: busybox
command: ['sh', '-c', 'until nslookup mydb; do echo waiting for mydb; sleep 2; done;']
Take a look here for the chaining of containers using the "depends" keyword is also an option:
https://github.com/kubernetes/kubernetes/issues/1996
Overall, no. Check out things like Airflow for this. Job objects give you a pretty simple way to run a container until it completes, that's about it. The parallelism is in that you can run multiple copies, it's not a full workflow management system :)
It is not possible to manage job workflows with Kubernetes core API objects.
Argo Workflow looks like an interesting tool to manage workflow inside Kubernetes: https://argoproj.github.io/projects/argo.
It looks like it can handle Kubernetes jobs workflow: https://argoproj.github.io/argo/examples/#kubernetes-resources
It is in the CNCF incubator: https://www.cncf.io/blog/2020/04/07/toc-welcomes-argo-into-the-cncf-incubator/
Other alternatives include:
https://www.nextflow.io/
https://www.pachyderm.com/
Airflow: https://airflow.apache.org/docs/apache-airflow/stable/kubernetes.html
This document might also help: https://www.preprints.org/manuscript/202001.0378/v1/download
Nearly 3 years later, I'll throw another answer into the mix.
Kubeflow Pipelines
https://www.kubeflow.org/docs/components/pipelines/overview/pipelines-overview/
Which actually use Argo under the hood.

PreStop hook in kubernetes never gets executed

I am trying to create a little Pod example with two containers that share data via an emptyDir volume. In the first container I am waiting a couple of seconds before it gets destroyed.
In the postStart I am writing a file to the shared volume with the name "started", in the preStop I am writing a file to the shared volume with the name "finished".
In the second container I am looping for a couple of seconds outputting the content of the shared volume but the "finished" file never gets created. Describing the pod doesn't show an error with the hooks either.
Maybe someone has an idea what I am doing wrong
apiVersion: v1
kind: Pod
metadata:
name: shared-data-example
labels:
app: shared-data-example
spec:
volumes:
- name: shared-data
emptyDir: {}
containers:
- name: first-container
image: ubuntu
command: ["/bin/bash"]
args: ["-c", "for i in {1..4}; do echo Welcome $i;sleep 1;done"]
imagePullPolicy: Never
env:
- name: TERM
value: xterm
volumeMounts:
- name: shared-data
mountPath: /myshareddata
lifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "echo First container finished > /myshareddata/finished"]
postStart:
exec:
command: ["/bin/sh", "-c", "echo First container started > /myshareddata/started"]
- name: second-container
image: ubuntu
command: ["/bin/bash"]
args: ["-c", "for i in {1..20}; do ls /myshareddata;sleep 1;done"]
imagePullPolicy: Never
env:
- name: TERM
value: xterm
volumeMounts:
- name: shared-data
mountPath: /myshareddata
restartPolicy: Never
It is happening because the final status of your pod is Completed and applications inside containers stopped without any external calls.
Kubernetes runs preStop hook only if pod resolves an external signal to stop. Hooks were made to implement a graceful custom shutdown for applications inside a pod when you need to stop it. In your case, your application is already gracefully stopped by itself, so Kubernetes has no reason to call the hook.
If you want to check how a hook works, you can try to create Deployment and update it's image by kubectl rolling-update, for example. In that case, Kubernetes will stop the old version of the application, and preStop hook will be called.