Kubernetes delete deployment after script execution

I'm working on creating a distributed Locust service for benchmarking and REST API testing on the platform. The architecture is as follows:
A first pod running a Docker image with the master flag, which controls the whole process
A collection of pods running a Docker image with the worker flag, which do the actual work (their number can vary depending on the requirements)
Deployment and Service files are:
01-locust-master.yaml
apiVersion: v1
kind: Service
metadata:
  name: locust-master
  labels:
    name: locust
spec:
  type: LoadBalancer
  selector:
    name: locust
    role: master
  ports:
    - port: 8089
      protocol: TCP
      name: master-web
    - port: 5557
      protocol: TCP
      name: master-port1
    - port: 5558
      protocol: TCP
      name: master-port2
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: locust-master
spec:
  replicas: 1
  selector:
    matchLabels:
      name: locust
      role: master
  template:
    metadata:
      labels:
        name: locust
        role: master
    spec:
      containers:
        - name: locust
          image: locust-image:latest
          imagePullPolicy: Always
          env:
            - name: LOCUST_MODE
              value: master
            - name: LOCUST_LOCUSTFILE_PATH
              value: "/locust-tasks/locustfiles/the_file.py"
            - name: LOCUST_TARGET_HOST
              value: "the_endpoint"
            - name: LOCUST_USERS
              value: "300"
            - name: LOCUST_SPAWN_RATE
              value: "100"
            - name: LOCUST_TEST_TIME
              value: "5m"
            - name: LOCUST_OUTPUT_DIR
              value: "/locust-tasks/locust-output"
            - name: LOCUST_TEST_API_TOKEN
              value: "some_api_token"
            - name: LOCUST_S3_OUTPUT_BUCKET
              value: "s3-bucket"
          ports:
            - containerPort: 8089
            - containerPort: 5557
            - containerPort: 5558
          resources:
            limits:
              cpu: 2000m
              memory: 2048Mi
02-locust-worker.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: locust-worker
spec:
  replicas: 3
  selector:
    matchLabels:
      name: locust
  template:
    metadata:
      labels:
        name: locust
        role: worker
    spec:
      containers:
        - name: locust
          image: locust:latest
          imagePullPolicy: Always
          env:
            - name: LOCUST_MODE
              value: worker
            - name: LOCUST_MASTER_NODE_HOST
              value: locust-master
            - name: LOCUST_LOCUSTFILE_PATH
              value: "/locust-tasks/locustfiles/the_file.py"
            - name: LOCUST_TARGET_HOST
              value: "the_endpoint"
            - name: LOCUST_TEST_API_TOKEN
              value: "the_api_token"
            - name: LOCUST_S3_OUTPUT_BUCKET
              value: "s3_bucket"
          resources:
            limits:
              cpu: 1500m
              memory: 850Mi
            requests:
              cpu: 1200m
              memory: 768Mi
Dockerfile
FROM python:3.7.3
# Install packages
COPY requirements.txt /tmp/
RUN pip install --upgrade pip
RUN pip install --requirement /tmp/requirements.txt
RUN pip install awscli
# Add locustfiles
COPY common/ /locust-tasks/common/
COPY templates/ /locust-tasks/templates/
COPY locustfiles/ /locust-tasks/locustfiles/
# Set the entrypoint
COPY docker-entrypoint.sh /
RUN chmod +x /docker-entrypoint.sh
ENTRYPOINT ["/docker-entrypoint.sh"]
EXPOSE 5557 5558 8089
docker-entrypoint.sh
#!/bin/bash -x

LOCUST_MODE=${LOCUST_MODE:="standalone"}
LOCUST_MASTER=${LOCUST_MASTER:=""}
LOCUST_LOCUSTFILE_PATH=${LOCUST_LOCUSTFILE_PATH:="/locust-tasks/locustfiles/the_file.py"}
LOCUST_TARGET_HOST=${LOCUST_TARGET_HOST:="the_endpoint"}
LOCUST_OUTPUT_DIR=${LOCUST_OUTPUT_DIR:="/locust-tasks/locust-output"}
LOCUST_TEST_API_TOKEN=${LOCUST_TEST_API_TOKEN:="the_token"}
LOCUST_S3_OUTPUT_BUCKET=${LOCUST_S3_OUTPUT_BUCKET:="s3_bucket"}

cd /locust-tasks

if [[ ! -e $LOCUST_OUTPUT_DIR ]]; then
    mkdir $LOCUST_OUTPUT_DIR
elif [[ ! -d $LOCUST_OUTPUT_DIR ]]; then
    echo "$LOCUST_OUTPUT_DIR already exists but is not a directory" 1>&2
fi

LOCUST_PATH="/usr/local/bin/locust"
LOCUST_FLAGS="-f $LOCUST_LOCUSTFILE_PATH --host=$LOCUST_TARGET_HOST --csv=$LOCUST_OUTPUT_DIR/locust-${LOCUST_MODE}"

if [[ "$LOCUST_MODE" = "master" ]]; then
    LOCUST_FLAGS="$LOCUST_FLAGS --master --headless -u $LOCUST_USERS -r $LOCUST_SPAWN_RATE -t $LOCUST_TEST_TIME"
elif [[ "$LOCUST_MODE" = "worker" ]]; then
    LOCUST_FLAGS="$LOCUST_FLAGS --worker --master-host=$LOCUST_MASTER_NODE_HOST"
fi

auth_token=$LOCUST_TEST_API_TOKEN $LOCUST_PATH $LOCUST_FLAGS

# Copy test output files to S3
today=$(date +"%Y/%m/%d")
S3_OUTPUT_DIR="s3://${LOCUST_S3_OUTPUT_BUCKET}/${today}/${HOSTNAME}"
echo "Copying locust output files from [$LOCUST_OUTPUT_DIR] to S3 [$S3_OUTPUT_DIR]"
aws s3 cp --recursive $LOCUST_OUTPUT_DIR $S3_OUTPUT_DIR

retVal=$?
if [ $retVal -ne 0 ]; then
    echo "Something went wrong, exit code is ${retVal}"
fi

exit $retVal
So my requirement / idea is to run the script above and, after that, delete the whole thing. But instead of that, I'm getting endless pod restarts:
NAME                             READY   STATUS    RESTARTS   AGE
locust-master-69b4547ddf-7fl4d   1/1     Running   4          23m
locust-worker-59b9689857-l5jhw   1/1     Running   4          23m
locust-worker-59b9689857-l5nd2   1/1     Running   4          23m
locust-worker-59b9689857-lwqbb   1/1     Running   4          23m
How can I delete both deployments after the shell script ends?

I think you are looking for Jobs.
As pods successfully complete, the Job tracks the successful completions. When a specified number of successful completions is reached, the task (i.e., Job) is complete. Deleting a Job will clean up the Pods it created.
You can use the TTL mechanism to clean up the finished Job automatically by specifying the .spec.ttlSecondsAfterFinished field of the Job:
https://kubernetes.io/docs/concepts/workloads/controllers/job/#ttl-mechanism-for-finished-jobs
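For example, the Locust master Deployment above could become a Job along these lines. This is only a minimal sketch: the ttlSecondsAfterFinished value, the backoffLimit, and the trimmed-down env section are illustrative assumptions rather than a tested configuration.
apiVersion: batch/v1
kind: Job
metadata:
  name: locust-master
spec:
  ttlSecondsAfterFinished: 300   # assumed value: remove the Job and its Pod 5 minutes after it finishes
  backoffLimit: 0                # assumed: do not retry a failed test run
  template:
    metadata:
      labels:
        name: locust
        role: master
    spec:
      restartPolicy: Never       # required for Jobs; the Pod finishes instead of being restarted
      containers:
        - name: locust
          image: locust-image:latest
          imagePullPolicy: Always
          env:                   # same env block as in the Deployment above, shortened here
            - name: LOCUST_MODE
              value: master
          ports:
            - containerPort: 8089
            - containerPort: 5557
            - containerPort: 5558
The workers would need the same treatment, e.g. a second Job, or a Deployment that you remove yourself with kubectl delete -f 02-locust-worker.yaml once the master Job has finished.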

Related

Rabbitmq containers throwing error "Bad characters in cookie"

I am trying to set up a RabbitMQ cluster, and when the containers start they fail with the error [error] CRASH REPORT Process <0.200.0> with 0 neighbours crashed with reason: "Bad characters in cookie" in auth:init_no_setcookie/0 line 313. This suggests that the Erlang cookie value passed in is not valid:
kubectl -n demos get pods
NAME                               READY   STATUS             RESTARTS   AGE
mongodb-deployment-6499999-vpcjh   1/1     Running            0          12h
rabbitmq-0                         0/1     CrashLoopBackOff   9          25m
rabbitmq-1                         0/1     CrashLoopBackOff   9          24m
rabbitmq-2                         0/1     CrashLoopBackOff   9          23m
And when I query the logs for one of the pods:
kubectl -n demos logs -p rabbitmq-0 --previous
I get:
WARNING: '/var/lib/rabbitmq/.erlang.cookie' was populated from
'$RABBITMQ_ERLANG_COOKIE', which will no longer happen in 3.9 and later! (https://github.com/docker-library/rabbitmq/pull/424)
Configuring logger redirection
02:04:47.506 [error] Bad characters in cookie
02:04:47.512 [error]
02:04:47.506 [error] Supervisor net_sup had child auth started with auth:start_link() at undefined exit with reason "Bad characters in cookie" in auth:init_no_setcookie/0 line 313 in context start_error
02:04:47.506 [error] CRASH REPORT Process <0.200.0> with 0 neighbours crashed with reason: "Bad characters in cookie" in auth:init_no_setcookie/0 line 313
02:04:47.522 [error] BOOT FAILED
BOOT FAILED
02:04:47.523 [error] ===========
===========
02:04:47.523 [error] Exception during startup:
Exception during startup:
02:04:47.524 [error]
02:04:47.524 [error] supervisor:children_map/4 line 1250
....
....
....
This is how I am generating the cookie in bash:
dd if=/dev/urandom bs=30 count=1 | base64
And in the secrets manifest I have:
apiVersion: v1
kind: Secret
metadata:
  name: rabbit-secret
  namespace: demos
type: Opaque
data:
  # echo -n "cookie-value" | base64
  RABBITMQ_ERLANG_COOKIE: <encoded_cookie_value_here>
And in the StatefulSet I have:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
  namespace: demos
spec:
  serviceName: rabbitmq
  replicas: 3
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      serviceAccountName: rabbitmq
      initContainers:
        - name: config
          image: busybox
          imagePullPolicy: "IfNotPresent"
          command: ['/bin/sh', '-c', 'cp /tmp/config/rabbitmq.conf /config/rabbitmq.conf && ls -l /config/ && cp /tmp/config/enabled_plugins /etc/rabbitmq/enabled_plugins']
          volumeMounts:
            - name: config
              mountPath: /tmp/config/
              readOnly: false
            - name: config-file
              mountPath: /config/
            - name: plugins-file
              mountPath: /etc/rabbitmq/
      containers:
        - name: rabbitmq
          image: rabbitmq:3.8-management
          imagePullPolicy: "IfNotPresent"
          ports:
            - containerPort: 4369
              name: discovery
            - containerPort: 5672
              name: amqp
          env:
            - name: RABBIT_POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: RABBIT_POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: RABBITMQ_NODENAME
              value: rabbit@$(RABBIT_POD_NAME).rabbitmq.$(RABBIT_POD_NAMESPACE).svc.cluster.local
            - name: RABBITMQ_USE_LONGNAME
              value: "true"
            - name: RABBITMQ_CONFIG_FILE
              value: "/config/rabbitmq"
            - name: RABBITMQ_ERLANG_COOKIE
              valueFrom:
                secretKeyRef:
                  name: rabbit-secret
                  key: RABBITMQ_ERLANG_COOKIE
            - name: K8S_HOSTNAME_SUFFIX
              value: .rabbitmq.$(RABBIT_POD_NAMESPACE).svc.cluster.local
          volumeMounts:
            - name: data
              mountPath: /var/lib/rabbitmq
              readOnly: false
            - name: config-file
              mountPath: /config/
            - name: plugins-file
              mountPath: /etc/rabbitmq/
      volumes:
        - name: config-file
          emptyDir: {}
        - name: plugins-file
          emptyDir: {}
        - name: config
          configMap:
            name: rabbitmq-config
            defaultMode: 0755
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: "cinder-csi"
        resources:
          requests:
            storage: 50Mi
---
apiVersion: v1
kind: Service
metadata:
  name: rabbitmq
  namespace: labs
spec:
  clusterIP: None
  ports:
    - port: 4369
      targetPort: 4369
      name: discovery
    - port: 5672
      targetPort: 5672
      name: amqp
  selector:
    app: rabbitmq
What am I missing?
Is there a recommended way of generating the cookie, or is there something else I need to do with the K8s cluster itself?
I have followed the example given here with the only difference being that I am generating my cookie on my local machine and not the k8s host.
This requires you to create the Secret. Just to bring this up, you can run a one-off imperative command to create a random Secret:
kubectl create secret generic rabbitmq \
--from-literal=erlangCookie=$(dd if=/dev/urandom bs=30 count=1 | base64)
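If you prefer to keep it declarative, a roughly equivalent Secret manifest could look like the sketch below; the name rabbitmq and the key erlangCookie simply mirror the command above and are assumptions, so adjust them to whatever your StatefulSet actually references (rabbit-secret / RABBITMQ_ERLANG_COOKIE in the question). Using stringData lets you paste the plain cookie and have the API server store it base64-encoded:
apiVersion: v1
kind: Secret
metadata:
  name: rabbitmq          # assumed name, matching the kubectl command above
  namespace: demos
type: Opaque
stringData:
  erlangCookie: "LONGRANDOMALPHANUMERICCOOKIEVALUE"   # plain text; must contain only characters Erlang accepts in a cookie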
The error comes from the RabbitMQ image's docker-entrypoint.sh:
if [ "${RABBITMQ_ERLANG_COOKIE:-}" ]; then
cookieFile='/var/lib/rabbitmq/.erlang.cookie'
if [ -e "$cookieFile" ]; then
if [ "$(cat "$cookieFile" 2>/dev/null)" != "$RABBITMQ_ERLANG_COOKIE" ]; then
echo >&2
echo >&2 "warning: $cookieFile contents do not match RABBITMQ_ERLANG_COOKIE"
echo >&2
fi
else
echo "$RABBITMQ_ERLANG_COOKIE" > "$cookieFile"
fi
chmod 600 "$cookieFile"
fi
Therefore, you can delete the /var/lib/rabbitmq/.erlang.cookie file; the entrypoint will then recreate it from the contents of the $RABBITMQ_ERLANG_COOKIE environment variable.
If you are working on a production system, be very careful: test this on a local system first and gain some experience.
This code runs only when RabbitMQ starts, and won't run again afterwards.
You can also inspect the cookie RabbitMQ is actually using from the Erlang console via erlang:get_cookie() -> Cookie | nocookie.
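Since the stale cookie lives on the data PersistentVolume, one way to act on that advice (my own sketch, not part of the original answer) is a temporary initContainer that deletes the old file before RabbitMQ starts, assuming the data volume is mounted at /var/lib/rabbitmq exactly as in the StatefulSet above; remove it again once the cluster has formed:
      initContainers:
        - name: remove-stale-cookie        # hypothetical one-off helper, to be deleted after recovery
          image: busybox
          command: ['sh', '-c', 'rm -f /var/lib/rabbitmq/.erlang.cookie']
          volumeMounts:
            - name: data                   # the volumeClaimTemplate name from the StatefulSet
              mountPath: /var/lib/rabbitmq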

How do I ensure all nodes are running at the same time in K8S for jobs with parallelism

I need to run a job with parallelism, but I need all nodes/pods running at the same time. If only 4 nodes are available but I need 5, then they all need to remain pending, or the submission needs to fail as a whole. How do I enforce this in the manifest file? Currently, what I see is that it takes up as many nodes as it can and leaves the rest in the Pending state.
Here's my manifest:
apiVersion: batch/v1
kind: Job
metadata:
  name: myapp-pod
  labels:
    app: myapp
spec:
  parallelism: 93 # expecting the job to stay pending or fail
  activeDeadlineSeconds: 30
  template:
    metadata:
      labels:
        app: myapp
    spec:
      volumes:
        - name: indatapt
          hostPath:
            path: /data # folder path in node, external to container
      containers:
        - name: myapp-container
          image: busybox
          imagePullPolicy: IfNotPresent
          env:
            - name: DEMO_GREETING
              value: "Hello from the environment"
          command: ['sh', '-c', 'echo "b" >> /indata/b.txt && /indata/test.sh && sleep 10s']
          volumeMounts:
            - name: indatapt
              mountPath: /indata # path in the container
      restartPolicy: Never

I want to apt-get install sysstat command in kubernetes yaml file

Cluster information:
Kubernetes version: 1.8
Cloud being used: AWS EKS
Host OS: debian linux
When I deploy the pods, I want them to install and start sysstat automatically.
Below are my two YAML files, but it doesn't work: the pods go into CrashLoopBackOff when I put command: ["/bin/sh", "-c"] and args: ["apt-get install sysstat"] below image:.
cat deploy/db/statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: sbdemo-postgres-sfs
spec:
  serviceName: sbdemo-postgres-service
  replicas: 1
  selector:
    matchLabels:
      app: sbdemo-postgres-sfs
  template:
    metadata:
      labels:
        app: sbdemo-postgres-sfs
    spec:
      containers:
        - name: postgres
          image: dayan888/springdemo:postgres9.6
          ports:
            - containerPort: 5432
          command: ["/bin/bash", "-c"]
          args: ["apt-get install sysstat"]
          volumeMounts:
            - name: pvc-db-volume
              mountPath: /var/lib/postgresql
  volumeClaimTemplates:
    - metadata:
        name: pvc-db-volume
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1G
cat deploy/web/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sbdemo-nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sbdemo-nginx
  template:
    metadata:
      labels:
        app: sbdemo-nginx
    spec:
      containers:
        - name: nginx
          image: gobawoo21/springdemo:nginx
          command: ["/bin/bash", "-c"]
          args: ["apt-get install sysstat"]
          ports:
            - containerPort: 80
          volumeMounts:
            - name: nginx-conf
              mountPath: /etc/nginx/nginx.conf
              subPath: nginx.conf
            - name: server-conf
              mountPath: /etc/nginx/conf.d/server.conf
              subPath: server.conf
      volumes:
        - name: nginx-conf
          configMap:
            name: nginx-conf
            items:
              - key: nginx.conf
                path: nginx.conf
        - name: server-conf
          configMap:
            name: server-conf
            items:
              - key: server.conf
                path: server.conf
Does anyone know how to have the package installed automatically when the pods are deployed?
Regards
The best practice is to install packages at the image build stage. You can simply add this step to your Dockerfile:
FROM postgres:9.6

RUN apt-get update && \
    apt-get install sysstat -y && \
    rm -rf /var/lib/apt/lists/*

COPY deploy/db/init_ddl.sh /docker-entrypoint-initdb.d/
RUN chmod +x /docker-entrypoint-initdb.d/init_ddl.sh
Kube Manifest
spec:
  containers:
    - name: postgres
      image: harik8/sof:62298191
      imagePullPolicy: Always
      ports:
        - containerPort: 5432
      env:
        - name: POSTGRES_PASSWORD
          value: password
      volumeMounts:
        - name: pvc-db-volume
          mountPath: /var/lib/postgresql
It should run (Please ignore POSTGRES_PASSWORD env variable)
$ kubectl get po
NAME                    READY   STATUS    RESTARTS   AGE
sbdemo-postgres-sfs-0   1/1     Running   0          8m46s
Validation
$ kubectl exec -it sbdemo-postgres-sfs-0 bash
root@sbdemo-postgres-sfs-0:/# iostat
Linux 4.19.107 (sbdemo-postgres-sfs-0) 06/10/2020 _x86_64_ (4 CPU)
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          10.38    0.01    6.28    0.24    0.00   83.09

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
vda             115.53      1144.72      1320.48    1837135    2119208
scd0              0.02         0.65         0.00       1048          0
If this is possible, something is wrong: your container should not be running as root, so even if you fixed this approach it shouldn't work. What you need to do is put this in your container build instead (i.e. in the Dockerfile).

Serialize creation of Pods in a deployment manifest using Helm charts

So I have a Helm chart that deploys a pod, and the next task is to create another pod once the first pod is running.
So I created a simple pod.yaml in chart/templates which creates a simple pod-b, and the next step is to only create pod-b after pod-a is running.
So far I have only looked at Helm hooks, but I don't think they care about pod status.
Another idea is to use an init container like below, but I'm not sure how to write a command that checks whether a pod is running:
spec:
  containers:
    - name: myapp-container
      image: busybox
      command: ['sh', '-c', 'echo The app is running! && sleep 3600']
  initContainers:
    - name: init-myservice
      image: busybox
      command: ['sh', '-c', 'until nslookup myservice; do echo waiting for myservice; sleep 2; done;']
Another idea is a simple script to check the pod status, something like:
i=0
while [ $i -le 5 ]
do
    y=$(kubectl get po -l app=am -o 'jsonpath={.items[0].status.phase}')
    if [[ "$y" == "Running" ]]; then
        break
    fi
    i=$((i+1))
    sleep 5
done
Any advice would be great.
If you want your post-install/post-upgrade chart hooks to work, you should add readiness probes to your first pod and use the --wait flag.
helm upgrade --install -n test --wait mychart .
pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: readiness-exec
  labels:
    test: readiness
spec:
  containers:
    - name: readiness
      image: k8s.gcr.io/busybox
      args:
        - /bin/sh
        - -c
        - sleep 30; touch /tmp/healthy; sleep 600
      readinessProbe:
        exec:
          command:
            - cat
            - /tmp/healthy
        initialDelaySeconds: 10
        periodSeconds: 5
        failureThreshold: 10
hook.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: "post-deploy"
  annotations:
    "helm.sh/hook": post-upgrade,post-install
    "helm.sh/hook-delete-policy": before-hook-creation
spec:
  backoffLimit: 1
  template:
    metadata:
      name: "post-deploy"
    spec:
      restartPolicy: Never
      containers:
        - name: post-deploy
          image: k8s.gcr.io/busybox
          args:
            - /bin/sh
            - -c
            - echo "executed only after previous pod is ready"

Kubernetes Pod is changing status from Running to Completed very soon, how do I prevent that?

I created a pod using YAML, and once the pod is created I run kubectl exec to run my Gatling perf test code:
kubectl exec gradlecommandfromcommandline -- ./gradlew gatlingRun-simulations.RuntimeParameters -DUSERS=500 -DRAMP_DURATION=5 -DDURATION=30
but this ends at the kubectl console with the message below:
command terminated with exit code 137
On investigation, I found that the pod is changing status from Running to Completed.
How do I increase the lifespan of the pod so that it waits for my command to be executed? Here is the pod YAML:
apiVersion: v1
kind: Pod
metadata:
  name: gradlecommandfromcommandline
  labels:
    purpose: gradlecommandfromcommandline
spec:
  containers:
    - name: gradlecommandfromcommandline
      image: tarunkumard/tarungatlingscript:v1.0
      workingDir: /opt/gatling-fundamentals/
      command: ["./gradlew"]
      args: ["gatlingRun-simulations.RuntimeParameters", "-DUSERS=500", "-DRAMP_DURATION=5", "-DDURATION=30"]
  restartPolicy: OnFailure
Here is the YAML file to keep the pod running:
apiVersion: v1
kind: Pod
metadata:
  name: gradlecommandfromcommandline
  labels:
    purpose: gradlecommandfromcommandline
spec:
  volumes:
    - name: docker-sock
      hostPath:
        path: /home/vagrant/k8s/pods/gatling/user-files/simulations # A file or directory location on the node that you want to mount into the Pod
  # command: [ "git clone https://github.com/TarunKDas2k18/PerfGatl.git" ]
  containers:
    - name: gradlecommandfromcommandline
      image: tarunkumard/tarungatlingscript:v1.0
      workingDir: /opt/gatling-fundamentals/
      command: ["./gradlew"]
      args: ["gatlingRun-simulations.RuntimeParameters", "-DUSERS=500", "-DRAMP_DURATION=5", "-DDURATION=30"]
    - name: gatlingperftool
      image: tarunkumard/gatling:FirstScript # Run the ubuntu 16.04
      command: [ "/bin/bash", "-c", "--" ] # You need to run some task inside a container to keep it running
      args: [ "while true; do sleep 10; done;" ] # Our simple program just sleeps inside an infinite loop
      volumeMounts:
        - mountPath: /opt/gatling/user-files/simulations # The mount path within the container
          name: docker-sock # Name must match the hostPath volume name
      ports:
        - containerPort: 80