Why does a Kubernetes Pod get into Terminated state with reason Completed and exit code 0?

I am struggling to find any answer to this in the Kubernetes documentation. The scenario is the following:
Kubernetes version 1.4 over AWS
8 pods running a NodeJS API (Express) deployed as a Kubernetes Deployment
One of the pods gets restarted for no apparent reason late at night (no traffic, no CPU spikes, no memory pressure, no alerts...). The restart count is incremented as a result.
Logs don't show anything abnormal (ran kubectl logs -p to see the previous container's logs; no errors at all in there)
Resource consumption is normal, cannot see any events about Kubernetes rescheduling the pod into another node or similar
Describing the pod shows a TERMINATED state with a COMPLETED reason and exit code 0. I don't have the exact output from kubectl as this pod has been replaced multiple times by now.
The pods are NodeJS server instances; they cannot complete, they are always running and waiting for requests.
Could this be Kubernetes internally rearranging pods? Is there any way to know when this happens? Shouldn't there be an event somewhere saying why it happened?
Update
This just happened in our prod environment. The result of describing the offending pod is:
api:
  Container ID:   docker://7a117ed92fe36a3d2f904a882eb72c79d7ce66efa1162774ab9f0bcd39558f31
  Image:          1.0.5-RC1
  Image ID:       docker://sha256:XXXX
  Ports:          9080/TCP, 9443/TCP
  State:          Running
    Started:      Mon, 27 Mar 2017 12:30:05 +0100
  Last State:     Terminated
    Reason:       Completed
    Exit Code:    0
    Started:      Fri, 24 Mar 2017 13:32:14 +0000
    Finished:     Mon, 27 Mar 2017 12:29:58 +0100
  Ready:          True
  Restart Count:  1
Update 2
Here is the deployment.yaml file used:
apiVersion: "extensions/v1beta1"
kind: "Deployment"
metadata:
namespace: "${ENV}"
name: "${APP}${CANARY}"
labels:
component: "${APP}${CANARY}"
spec:
replicas: ${PODS}
minReadySeconds: 30
revisionHistoryLimit: 1
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
maxSurge: 1
template:
metadata:
labels:
component: "${APP}${CANARY}"
spec:
serviceAccount: "${APP}"
${IMAGE_PULL_SECRETS}
containers:
- name: "${APP}${CANARY}"
securityContext:
capabilities:
add:
- IPC_LOCK
image: "134078050561.dkr.ecr.eu-west-1.amazonaws.com/${APP}:${TAG}"
env:
- name: "KUBERNETES_CA_CERTIFICATE_FILE"
value: "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
- name: "NAMESPACE"
valueFrom:
fieldRef:
fieldPath: "metadata.namespace"
- name: "ENV"
value: "${ENV}"
- name: "PORT"
value: "${INTERNAL_PORT}"
- name: "CACHE_POLICY"
value: "all"
- name: "SERVICE_ORIGIN"
value: "${SERVICE_ORIGIN}"
- name: "DEBUG"
value: "http,controllers:recommend"
- name: "APPDYNAMICS"
value: "true"
- name: "VERSION"
value: "${TAG}"
ports:
- name: "http"
containerPort: ${HTTP_INTERNAL_PORT}
protocol: "TCP"
- name: "https"
containerPort: ${HTTPS_INTERNAL_PORT}
protocol: "TCP"
The Dockerfile of the image referenced in the above Deployment manifest:
FROM ubuntu:14.04
ENV NVM_VERSION v0.31.1
ENV NODE_VERSION v6.2.0
ENV NVM_DIR /home/app/nvm
ENV NODE_PATH $NVM_DIR/v$NODE_VERSION/lib/node_modules
ENV PATH $NVM_DIR/v$NODE_VERSION/bin:$PATH
ENV APP_HOME /home/app
RUN useradd -c "App User" -d $APP_HOME -m app
RUN apt-get update; apt-get install -y curl
USER app
# Install nvm with node and npm
RUN touch $HOME/.bashrc; curl https://raw.githubusercontent.com/creationix/nvm/${NVM_VERSION}/install.sh | bash \
&& /bin/bash -c 'source $NVM_DIR/nvm.sh; nvm install $NODE_VERSION'
ENV NODE_PATH $NVM_DIR/versions/node/$NODE_VERSION/lib/node_modules
ENV PATH $NVM_DIR/versions/node/$NODE_VERSION/bin:$PATH
# Create app directory
WORKDIR /home/app
COPY . /home/app
# Install app dependencies
RUN npm install
EXPOSE 9080 9443
CMD [ "npm", "start" ]
npm start is an alias for a regular node app.js command that starts a NodeJS server on port 9080.
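For reference, that alias is just a scripts entry in package.json, along these lines (a sketch; the actual file isn't shown in the question):

"scripts": {
  "start": "node app.js"
}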

Check the version of Docker you run, and whether the Docker daemon was restarted during that time.
If the Docker daemon was restarted, all the containers would be terminated (unless you use the new "live restore" feature in 1.12). In some Docker versions, Docker may incorrectly report "exit code 0" for all containers terminated in this situation. See https://github.com/docker/docker/issues/31262 for more details.
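A quick way to check both things on the affected node (standard docker/systemd commands; adjust to your setup, and note that not every Docker version prints the live-restore line):

$ docker version --format '{{.Server.Version}}'
$ systemctl show docker --property=ActiveEnterTimestamp   # when dockerd last (re)started
$ docker info 2>/dev/null | grep -i 'live restore'        # whether live restore is enabled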

If this is still relevant, we just had a similar problem in our cluster.
We managed to find more information by inspecting the logs from Docker itself. SSH onto your k8s node and run the following:
sudo journalctl -fu docker.service

I had a similar problem when we upgraded Airflow to version 2.x: pods got restarted even after the DAGs ran successfully.
I resolved it after a long time of debugging by overriding the pod template and specifying it in the airflow.cfg file.
[kubernetes]
....
pod_template_file = {{ .Values.airflow.home }}/pod_template.yaml
---
# pod_template.yaml
apiVersion: v1
kind: Pod
metadata:
  name: dummy-name
spec:
  serviceAccountName: default
  restartPolicy: Never
  containers:
  - name: base
    image: dummy_image
    imagePullPolicy: IfNotPresent
    ports: []
    command: []

Ensure pod deletion when one of the containers terminates

I have a Kubernetes JOB that does database migrations on a CloudSQL database.
One way to access the CloudSQL database from GKE is to use the CloudSQL-proxy container and then connect via localhost. Great - that's working so far. But because I'm doing this inside a K8s JOB the job is not marked as successfully finished because the proxy keeps on running.
$ kubectl get po
NAME                   READY     STATUS      RESTARTS   AGE
db-migrations-c1a547   1/2       Completed   0          1m
Even though the output says 'Completed', one of the two initial containers is still running - the proxy.
How can I make the proxy exit on completing the migrations inside container 1?
The best way I have found is to share the process namespace between containers and use the SYS_PTRACE securityContext capability to allow you to kill the sidecar.
apiVersion: batch/v1
kind: Job
metadata:
  name: my-db-job
spec:
  template:
    spec:
      restartPolicy: OnFailure
      shareProcessNamespace: true
      containers:
      - name: my-db-job-migrations
        command: ["/bin/sh", "-c"]
        args:
        - |
          <your migration commands>;
          sql_proxy_pid=$(pgrep cloud_sql_proxy) && kill -INT $sql_proxy_pid;
        securityContext:
          capabilities:
            add:
            - SYS_PTRACE
      - name: cloudsql-proxy
        image: gcr.io/cloudsql-docker/gce-proxy:1.17
        command:
        - "/cloud_sql_proxy"
        args:
        - "-instances=$(DB_CONNECTION_NAME)=tcp:5432"
One possible solution would be a separate cloudsql-proxy deployment with a matching service. You would then only need your migration container inside the job that connects to your proxy service.
This comes with some downsides:
higher network latency, no pod local mysql communication
possible security issue if you provide the sql port to your whole kubernetes cluster
If you want to open the cloudsql-proxy to the whole cluster, you have to replace tcp:3306 with tcp:0.0.0.0:3306 in the -instances parameter on the cloudsql-proxy.
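A minimal sketch of that setup (all names and the instance string are illustrative, not taken from the original answer):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cloudsql-proxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cloudsql-proxy
  template:
    metadata:
      labels:
        app: cloudsql-proxy
    spec:
      containers:
      - name: cloudsql-proxy
        image: gcr.io/cloudsql-docker/gce-proxy:1.17
        command: ["/cloud_sql_proxy"]
        args: ["-instances=<your-instance>=tcp:0.0.0.0:3306"]
---
apiVersion: v1
kind: Service
metadata:
  name: cloudsql-proxy
spec:
  selector:
    app: cloudsql-proxy
  ports:
  - port: 3306
    targetPort: 3306

The migration container in the Job would then connect to cloudsql-proxy:3306 instead of localhost.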
There are 3 ways of doing this.
1- Use a private IP to connect your K8s job to Cloud SQL, as described by @newoxo in one of the answers. To do that, your cluster needs to be a VPC-native cluster. Mine wasn't and I was not willing to move all my stuff to a new cluster, so I wasn't able to do this.
2- Put the Cloud SQL Proxy container in a separate deployment with a service, as described by @Christian Kohler. This looks like a good approach, but it is not recommended by Google Cloud Support.
I was about to head in this direction (solution #2) but I decided to try something else.
And here is the solution that worked for me:
3- You can communicate between different containers in the same Pod/Job using the file system. The idea is to tell the Cloud SQL Proxy container when the main job is done, and then kill the cloud sql proxy. Here is how to do it:
In the yaml file (my-job.yaml)
apiVersion: v1
kind: Pod
metadata:
  name: my-job-pod
  labels:
    app: my-job-app
spec:
  restartPolicy: OnFailure
  containers:
  - name: my-job-app-container
    image: my-job-image:0.1
    command: ["/bin/bash", "-c"]
    args:
    - |
      trap "touch /lifecycle/main-terminated" EXIT
      { your job commands here }
    volumeMounts:
    - name: lifecycle
      mountPath: /lifecycle
  - name: cloudsql-proxy-container
    image: gcr.io/cloudsql-docker/gce-proxy:1.11
    command: ["/bin/sh", "-c"]
    args:
    - |
      /cloud_sql_proxy -instances={ your instance name }=tcp:3306 -credential_file=/secrets/cloudsql/credentials.json &
      PID=$!
      while true
      do
        if [[ -f "/lifecycle/main-terminated" ]]
        then
          kill $PID
          exit 0
        fi
        sleep 1
      done
    securityContext:
      runAsUser: 2 # non-root user
      allowPrivilegeEscalation: false
    volumeMounts:
    - name: cloudsql-instance-credentials
      mountPath: /secrets/cloudsql
      readOnly: true
    - name: lifecycle
      mountPath: /lifecycle
  volumes:
  - name: cloudsql-instance-credentials
    secret:
      secretName: cloudsql-instance-credentials
  - name: lifecycle
    emptyDir: {}
Basically, when your main job is done, it creates a file in /lifecycle that is picked up by the watcher loop added to the cloud-sql-proxy container, which then kills the proxy and terminates the container.
I hope it helps! Let me know if you have any questions.
Based on: https://stackoverflow.com/a/52156131/7747292
It doesn't look like Kubernetes can do this alone; you would need to manually kill the proxy once the migration exits. A similar question was asked here: Sidecar containers in Kubernetes Jobs?
Google Cloud SQL has recently launched private IP address connectivity for Cloud SQL. If the Cloud SQL instance and the Kubernetes cluster are in the same region, you can connect to Cloud SQL without using the Cloud SQL Proxy.
https://cloud.google.com/sql/docs/mysql/connect-kubernetes-engine#private-ip
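With private IP there is no proxy sidecar at all; the job container simply points at the instance's private address. A rough sketch of the relevant fragment of the Job's container spec (the address and names are illustrative):

containers:
- name: db-migrations
  image: my-job-image:0.1
  env:
  - name: DB_HOST
    value: "10.0.0.5"   # the Cloud SQL instance's private IP
  - name: DB_PORT
    value: "3306"
  # no cloudsql-proxy sidecar needed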
A possible solution would be to set concurrencyPolicy: Replace in the CronJob spec ... this will agnostically replace the current pod with the new instance whenever it needs to run again. But you have to make sure that the subsequent cron runs are separated enough.
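A minimal sketch of where that field lives (schedule and names are illustrative, not from the original answer):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: db-migrations
spec:
  schedule: "0 3 * * *"
  concurrencyPolicy: Replace   # a run still in progress is killed and replaced by the new one
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: migrations
            image: my-job-image:0.1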
Unfortunately the other answers weren't working for me, because CloudSQLProxy runs in a distroless environment where there is no shell.
I managed to get around this by bundling a CloudSQLProxy binary with my deployment and running a bash script that starts up CloudSQLProxy followed by my app.
Dockerfile:
FROM golang:1.19.4
RUN apt update
COPY . /etc/mycode/
WORKDIR /etc/mycode
RUN chmod u+x ./scripts/run_migrations.sh
RUN chmod u+x ./bin/cloud_sql_proxy.linux-amd64
RUN go install
ENTRYPOINT ["./scripts/run_migrations.sh"]
Shell Script (run_migrations.sh):
#!/bin/sh
# This script is run from the parent directory
dbConnectionString=$1
cloudSQLProxyPort=$2
echo "Starting Cloud SQL Proxy"
./bin/cloud_sql_proxy.linux-amd64 -instances=${dbConnectionString}=tcp:5432 -enable_iam_login -structured_logs &
CHILD_PID=$!
echo "CloudSQLProxy PID: $CHILD_PID"
echo "Migrating DB..."
go run ./db/migrations/main.go
MAIN_EXIT_CODE=$?
kill $CHILD_PID;
echo "Migrations complete.";
exit $MAIN_EXIT_CODE
K8s (via Pulumi):
import * as k8s from '@pulumi/kubernetes'

const jobDBMigrations = new k8s.batch.v1.Job("job-db-migrations", {
  metadata: {
    namespace: namespaceName,
    labels: appLabels,
  },
  spec: {
    backoffLimit: 4,
    template: {
      spec: {
        containers: [
          {
            image: pulumi.interpolate`gcr.io/${gcpProject}/${migrationsId}:${migrationsVersion}`,
            name: "server-db-migration",
            args: [
              dbConnectionString,
            ],
          },
        ],
        restartPolicy: "Never",
        serviceAccount: k8sSAMigration.metadata.name,
      },
    },
  },
},
{
  provider: clusterProvider,
});

Kubernetes Pod permission denied on local volume

I have created a pod on Kubernetes and mounted a local volume, but when I try to execute the ls command on the locally mounted volume, I get a permission denied error. If I disable SELinux, everything works fine. I cannot work out how to make it work with SELinux enabled.
Following is the output of permission denied:
kubectl apply -f testpod.yaml
[root@olcne-operator-ol8 opc]# kubectl get all
NAME          READY   STATUS    RESTARTS   AGE
pod/testpod   1/1     Running   0          5s
# kubectl exec -i -t testpod /bin/bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
[root@testpod /]# cd /u01
[root@testpod u01]# ls
ls: cannot open directory '.': Permission denied
[root@testpod u01]#
Following is the testpod.yaml
cat testpod.yaml
kind: Pod
apiVersion: v1
metadata:
  name: testpod
  labels:
    name: testpod
spec:
  hostname: testpod
  restartPolicy: Never
  volumes:
  - name: swvol
    hostPath:
      path: /u01
  containers:
  - name: testpod
    image: oraclelinux:8
    imagePullPolicy: Always
    securityContext:
      privileged: false
    command: [/usr/sbin/init]
    volumeMounts:
    - mountPath: "/u01"
      name: swvol
SELinux configuration on the worker node:
# sestatus
SELinux status: enabled
SELinuxfs mount: /sys/fs/selinux
SELinux root directory: /etc/selinux
Loaded policy name: targeted
Current mode: enforcing
Mode from config file: enforcing
Policy MLS status: enabled
Policy deny_unknown status: allowed
Memory protection checking: actual (secure)
Max kernel policy version: 31
---
# semanage fcontext -l | grep kub | grep container_file
/var/lib/kubelet/pods(/.*)? all files system_u:object_r:container_file_t:s0
/var/lib/kubernetes/pods(/.*)? all files system_u:object_r:container_file_t:s0
Machine OS Details
rpm -qa | grep kube
kubectl-1.20.6-2.el8.x86_64
kubernetes-cni-0.8.1-1.el8.x86_64
kubeadm-1.20.6-2.el8.x86_64
kubelet-1.20.6-2.el8.x86_64
kubernetes-cni-plugins-0.9.1-1.el8.x86_64
----
cat /etc/oracle-release
Oracle Linux Server release 8.4
---
uname -r
5.4.17-2102.203.6.el8uek.x86_64
This is a community wiki answer posted for better visibility. Feel free to expand it.
SELinux labels can be assigned with seLinuxOptions:
apiVersion: v1
kind: Pod
metadata:
  name: testpod
  labels:
    name: testpod
spec:
  hostname: testpod
  restartPolicy: Never
  volumes:
  - name: swvol
    hostPath:
      path: /u01
  containers:
  - name: testpod
    image: oraclelinux:8
    imagePullPolicy: Always
    command: [/usr/sbin/init]
    volumeMounts:
    - mountPath: "/u01"
      name: swvol
    securityContext:
      seLinuxOptions:
        level: "s0:c123,c456"
From the official documentation:
seLinuxOptions: Volumes that support SELinux labeling are relabeled to be accessible by the label specified under seLinuxOptions. Usually you only need to set the level section. This sets the Multi-Category Security (MCS) label given to all Containers in the Pod as well as the Volumes.
Based on the information from the original post on Stack Overflow:
You can only specify the level portion of an SELinux label when relabeling a path destination pointed to by a hostPath volume. This is done automatically via the seLinuxOptions.level attribute specified in your securityContext.
However, attributes such as seLinuxOptions.type currently have no effect on volume relabeling. As of this writing, this is still an open issue within Kubernetes.
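As a workaround (my own suggestion, not from the original answers), the hostPath directory can also be labeled on the worker node itself so containers may access it:

# persistently mark /u01 as container-accessible, then apply the new context
sudo semanage fcontext -a -t container_file_t "/u01(/.*)?"
sudo restorecon -Rv /u01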

Cannot find module '/usr/src/app/server.js'

I have tested the app using minikube locally and it works. When I use the same Dockerfile with deployment.yml, the pod goes into an Error state with the reason below:
Error: Cannot find module '/usr/src/app/server.js'
Dockerfile:
FROM node:13-alpine
WORKDIR /api
COPY package.json .
RUN npm install
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]
Deployment.yml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-app-dep
  labels:
    app: nodejs-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodejs-app
  template:
    metadata:
      labels:
        app: nodejs-app
    spec:
      serviceAccountName: opp-sa
      imagePullSecrets:
      - name: xxx
      containers:
      - name: nodejs-app
        image: registry.xxxx.net/k8s_app
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3000
Assuming it could be a problem with node_modules, I ran ls on the WORKDIR inside the Dockerfile and it does show node_modules. Does anyone know what else to check to resolve this issue?
Since I can't give you this level of suggestion in a comment, I'm writing you a fully working example so you can compare it to yours and check if there is anything different.
Sources:
Your Dockerfile:
FROM node:13-alpine
WORKDIR /api
COPY package*.json .
RUN npm install
COPY . .
EXPOSE 8080
CMD [ "node", "server.js" ]
Sample package.json:
{
  "name": "docker_web_app",
  "version": "1.0.0",
  "description": "Node.js on Docker",
  "author": "First Last <first.last@example.com>",
  "main": "server.js",
  "scripts": {
    "start": "node server.js"
  },
  "dependencies": {
    "express": "^4.16.1"
  }
}
sample server.js:
'use strict';
const express = require('express');
// Constants
const PORT = 8080;
const HOST = '0.0.0.0';
// App
const app = express();
app.get('/', (req, res) => {
  res.send('Hello World');
});
app.listen(PORT, HOST);
console.log(`Running on http://${HOST}:${PORT}`);
Build image:
$ ls
Dockerfile package.json server.js
$ docker build -t k8s_app .
...
Successfully built 2dfbfe9f6a2f
Successfully tagged k8s_app:latest
$ docker images k8s_app
REPOSITORY   TAG      IMAGE ID       CREATED         SIZE
k8s_app      latest   2dfbfe9f6a2f   4 minutes ago   118MB
Your deployment sample + service for easy access (called nodejs-app.yaml):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-app-dep
  labels:
    app: nodejs-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodejs-app
  template:
    metadata:
      labels:
        app: nodejs-app
    spec:
      containers:
      - name: web-app
        image: k8s_app
        imagePullPolicy: Never
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: web-app-svc
spec:
  type: NodePort
  selector:
    app: nodejs-app
  ports:
  - port: 8080
    targetPort: 8080
Note: I'm using the minikube docker registry for this example, that's why imagePullPolicy: Never is set.
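For completeness, building directly against minikube's Docker daemon usually looks roughly like this (standard minikube commands):

$ eval $(minikube docker-env)   # point the local docker CLI at the daemon inside minikube
$ docker build -t k8s_app .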
Now I'll deploy it:
$ kubectl apply -f nodejs-app.yaml
deployment.apps/nodejs-app-dep created
service/web-app-svc created
$ kubectl get pods
NAME                              READY   STATUS    RESTARTS   AGE
nodejs-app-dep-5d75f54c7d-mfw8x   1/1     Running   0          3s
Whenever you need to troubleshoot inside a pod you can use kubectl exec -it <pod_name> -- /bin/sh (or /bin/bash depending on the base image.)
$ kubectl exec -it nodejs-app-dep-5d75f54c7d-mfw8x -- /bin/sh
/api # ls
Dockerfile node_modules package-lock.json package.json server.js
The pod is running and the files are in the WORKDIR folder as stated in the Dockerfile.
Finally let's test accessing from outside the cluster:
$ minikube service list
|-------------|-------------|--------------|-------------------------|
| NAMESPACE | NAME | TARGET PORT | URL |
|-------------|-------------|--------------|-------------------------|
| default | web-app-svc | 8080 | http://172.17.0.2:31446 |
|-------------|-------------|--------------|-------------------------|
$ curl -i http://172.17.0.2:31446
HTTP/1.1 200 OK
X-Powered-By: Express
Content-Type: text/html; charset=utf-8
Content-Length: 11
ETag: W/"b-Ck1VqNd45QIvq3AZd8XYQLvEhtA"
Date: Thu, 14 May 2020 18:49:40 GMT
Connection: keep-alive
Hello World$
The Hello World is being served as desired.
To Summarize:
I built the Docker image in minikube ssh so it is cached.
Created the manifest containing the deployment pointing to the image, and added the service part to allow access externally using NodePort.
NodePort routes all traffic to the Minikube IP on the port assigned to the service (i.e. 31446) and delivers it to the pods matching the selector, which listen on port 8080.
A few pointers for troubleshooting:
kubectl describe pod <pod_name>: provides precious information when the pod status is in any kind of error.
kubectl exec is great to troubleshoot inside the container as it's running, it's pretty similar to docker run command.
Review your code files to ensure there is no baked path in it.
Try using WORKDIR /usr/src/app instead of /api and see if you get a different result.
Try using a .dockerignore file with node_modules in its contents; a minimal example follows.
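A minimal .dockerignore for a Node.js build might be:

node_modules
npm-debug.log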
Try it out and let me know in the comments if you need further help.
@willrof, thanks for the detailed write-up. A reply to your response is limited to 30 characters, so I'm posting this as a new answer.
My problem was resolved yesterday. It was with COPY . .
It works perfectly fine locally but, when I tried to deploy onto the cluster with the same Dockerfile, I ran into the issue of "cannot find module..."
It finally worked when the directory path was spelled out instead of . . while copying files:
COPY /api /usr/app #copy project basically
WORKDIR /usr/app #set workdir just before npm install
RUN npm install
EXPOSE 3000
Moving the WORKDIR statement before installing node_modules worked in my case. I'm surprised this turned out to be the problem, though, since it worked locally with COPY . .

Kubernetes job pod completed successfully but one of the containers were not ready

I've got some strange looking behavior.
When a job is run, it completes successfully but one of the containers says it's not (or was not..) ready:
NAMESPACE   NAME                                              READY   STATUS      RESTARTS   AGE   IP         NODE
default     **********-migration-22-20-16-29-11-2018-xnffp   1/2     Completed   0          11h   10.4.5.8   gke-******
job yaml:
apiVersion: batch/v1
kind: Job
metadata:
  name: migration-${timestamp_hhmmssddmmyy}
  labels:
    jobType: database-migration
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: app
        image: "${appApiImage}"
        imagePullPolicy: IfNotPresent
        command:
        - php
        - artisan
        - migrate
      - name: cloudsql-proxy
        image: gcr.io/cloudsql-docker/gce-proxy:1.11
        command: ["/cloud_sql_proxy",
                  "-instances=${SQL_INSTANCE_NAME}=tcp:3306",
                  "-credential_file=/secrets/cloudsql/credentials.json"]
        securityContext:
          runAsUser: 2 # non-root user
          allowPrivilegeEscalation: false
        volumeMounts:
        - name: cloudsql-instance-credentials
          mountPath: /secrets/cloudsql
          readOnly: true
      volumes:
      - name: cloudsql-instance-credentials
        secret:
          secretName: cloudsql-instance-credentials
What may be the cause of this behavior? There are no readiness or liveness probes defined on the containers.
If I do a describe on the pod, the relevant info is:
...
Command:
  php
  artisan
  migrate
State:          Terminated
  Reason:       Completed
  Exit Code:    0
  Started:      Thu, 29 Nov 2018 22:20:18 +0000
  Finished:     Thu, 29 Nov 2018 22:20:19 +0000
Ready:          False
Restart Count:  0
Requests:
  cpu:  100m
...
A Pod with a Ready status means it "is able to serve requests and should be added to the load balancing pools of all matching Services", see https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-conditions
In your case, you don't want to serve requests, but simply to execute php artisan migrate once, and done. So you don't have to worry about this status, the important part is the State: Terminated with a Reason: Completed and a zero exit code: your command did whatever and then exited successfully.
If the result of the command is not what you expected, you'd have to investigate the logs from the container that ran this command with kubectl logs your-pod -c app (where app is the name of the container you defined), and/or you would expect the php artisan migrate command to NOT issue a zero exit code.
In my case, I was using Istio and experienced the same issue; removing the Istio sidecar from the job pod solved the problem.
My solution when using Istio:
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "false"

What is the equivalent of depends_on in Kubernetes?

I have a docker compose file with the following entries
version: '2.1'
services:
  mysql:
    container_name: mysql
    image: mysql:latest
    volumes:
      - ./mysqldata:/var/lib/mysql
    environment:
      MYSQL_ROOT_PASSWORD: 'password'
    ports:
      - '3306:3306'
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3306"]
      interval: 30s
      timeout: 10s
      retries: 5
  test1:
    container_name: test1
    image: test1:latest
    ports:
      - '4884:4884'
      - '8443'
    depends_on:
      mysql:
        condition: service_healthy
    links:
      - mysql
The test1 container depends on mysql, which needs to be up and running.
In Docker this can be controlled using the healthcheck and depends_on attributes.
The healthcheck equivalent in Kubernetes is a readinessProbe, which I have already created, but how do we control container startup within the pods?
Any directions on this is greatly appreciated.
My Kubernetes file:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: deployment
    spec:
      containers:
      - name: mysqldb
        image: "dockerregistry:mysqldatabase"
        imagePullPolicy: Always
        ports:
        - containerPort: 3306
        readinessProbe:
          tcpSocket:
            port: 3306
          initialDelaySeconds: 15
          periodSeconds: 10
      - name: test1
        image: "dockerregistry::test1"
        imagePullPolicy: Always
        ports:
        - containerPort: 3000
That's the beauty of Docker Compose and Docker Swarm: their simplicity.
We came across this same Kubernetes shortcoming when deploying the ELK stack.
We solved it by using a side-car (initContainer), which is just another container in the same pod that runs first, and when it completes, Kubernetes automatically starts the [main] container. We made it a simple shell script that loops until Elasticsearch is up and running, then exits, and Kibana's container starts.
Below is an example of a side-car that waits until Grafana is ready.
Add this 'initContainer' block just above your other containers in the Pod:
spec:
  initContainers:
  - name: wait-for-grafana
    image: darthcabs/tiny-tools:1
    args:
    - /bin/bash
    - -c
    - >
      set -x;
      while [[ "$(curl -s -o /dev/null -w ''%{http_code}'' http://grafana:3000/login)" != "200" ]]; do
        echo '.'
        sleep 15;
      done
  containers:
  .
  .
  (your other containers)
  .
  .
This was purposefully left out. The reasoning is that applications should be responsible for their connect/re-connect logic when connecting to service(s) such as a database. This is outside the scope of Kubernetes.
While I don't know the direct answer to your question except for this link (k8s-AppController), I don't think it's wise to use the same deployment for the DB and the app. You are tightly coupling your db with your app and losing the awesome k8s option to scale either one as needed. Furthermore, if your db pod dies, you lose your data as well.
Personally, what I would do is have a separate StatefulSet with a Persistent Volume for the database and a Deployment for the app, and use a Service to handle their communication.
Yes, I have to run a few different commands and may need at least two separate deployment files, but this way I am decoupling them and can scale them as needed. And my data is persistent as well!
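A minimal sketch of the Service piece (names are illustrative, not from the original answer); with this in place the app reaches the database at mysql:3306 through cluster DNS instead of localhost:

apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  selector:
    app: mysql       # must match the labels on the database pods
  ports:
  - port: 3306
    targetPort: 3306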
As mentioned, you should run the database and the application containers in separate pods and connect them with a service.
Unfortunately, neither Kubernetes nor Helm provides functionality similar to what you've described. We had many issues with that and tried a few approaches until we decided to develop a smallish utility that solved this problem for us.
Here's the link to the tool we've developed: https://github.com/Opsfleet/depends-on
You can make pods wait until other pods become ready according to their readinessProbe configuration. It's very close to Docker's depends_on functionality.
In Kubernetes terminology, one docker-compose set is a Pod.
So there is no depends_on equivalent there. Kubernetes checks all containers in a pod, and they all have to be alive to mark the pod as Healthy; it will always run them together.
In your case, you would need to prepare a Deployment configuration like this:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: app-and-db
    spec:
      containers:
      - name: app
        image: nginx
        ports:
        - containerPort: 80
      - name: db
        image: mysql
        ports:
        - containerPort: 3306
After the pod is started, your database will be available on the localhost interface for your application, because of the network model:
Containers within a pod share an IP address and port space, and can find each other via localhost. They can also communicate with each other using standard inter-process communications like SystemV semaphores or POSIX shared memory.
But, as @leninhasda mentioned, it is not a good idea to run a database and an application in one pod without a Persistent Volume. Here is a good tutorial on how to run a stateful application in Kubernetes:
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
What about liveness and readiness probes? They support commands, HTTP requests and more:
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
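Since the answer mentions HTTP checks but only shows an exec probe, here is a sketch of the httpGet variant as a readinessProbe fragment for a container spec (path and port are illustrative):

readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10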