On my test cluster, starting a pod seems to consistently take ~12 seconds, give or take one. I would like to know if that is reasonable, or if I am doing something wrong, either in configuring the pod, in measuring the time, or in configuring the cluster.
According to https://github.com/kubernetes/kubernetes/issues/3952 and https://medium.com/google-cloud/profiling-gke-startup-time-9052d81e0052, I believe what I am getting is excessively slow.
The way I measure startup is by running the following script and counting how many times it prints "Pending"; that count is my startup time in seconds, since the sleep command makes the loop print almost exactly one "Pending" per second.
id=mypod1
tee job.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: clusterrunner-build-${id}
spec:
  containers:
  - name: clusterrunner-slave
    image: jdanekrh/clusterrunner-slave
    command: ["bash", "-c", "echo bof; sleep 5; echo lek"]
  restartPolicy: Never
EOF
kubectl create -f job.yaml
while kubectl get pod/clusterrunner-build-${id} -o jsonpath='{.status.phase}' | grep Pending; do
sleep 1
done
kubectl logs -f po/clusterrunner-build-${id}
kubectl delete -f job.yaml
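A lower-overhead way to measure roughly the same thing, assuming a kubectl recent enough to have the wait subcommand, is to block until the pod reports Ready and then compare the pod's own timestamps (note that this particular pod exits after ~5 s, so the wait has to complete before then):
kubectl create -f job.yaml
kubectl wait --for=condition=Ready pod/clusterrunner-build-${id} --timeout=60s
# startup latency = Ready transition time minus creation time
kubectl get pod clusterrunner-build-${id} -o jsonpath='{.metadata.creationTimestamp}{"\n"}{.status.conditions[?(@.type=="Ready")].lastTransitionTime}{"\n"}'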
Related
Currently it takes quite a long time before the pod can be terminated after a kubectl delete command. I have the feeling that it could be because of the sleep command.
How can I make the container stop faster?
What best practices should I use here?
apiVersion: apps/v1
kind: Deployment
...
spec:
  template:
    spec:
      containers:
      - image: alpine
        ...
        command:
        - /bin/sh
        - -c
        - |
          trap : TERM INT
          while true; do
            # some code to check something
            sleep 10
          done
Is my approach with "trap : TERM INT" correct? At the moment I don't see any positive effect...
When I terminate the pod it takes several seconds for the command to come back.
kubectl delete pod my-pod
Adding terminationGracePeriodSeconds to your spec will do:
...
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 10 # <-- default is 30; this is how long Kubernetes waits after SIGTERM before force-killing the pod with SIGKILL
      containers:
      - image: alpine
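Independently of the grace period, the trap in the question appears to do nothing because the shell does not run trap handlers while a foreground child (sleep 10) is running; it waits for the sleep to finish first. A common shell pattern (a sketch, not specific to Kubernetes) is to background the sleep and wait on it, since wait is interruptible by signals:
command:
- /bin/sh
- -c
- |
  trap 'exit 0' TERM INT # exit promptly instead of just ignoring the signal
  while true; do
    # some code to check something
    sleep 10 &           # run sleep in the background...
    wait $!              # ...so SIGTERM interrupts the wait and fires the trap immediately
  done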
I have a cronjob that keeps restarting, despite its RestartPolicy set to Never:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: cron-zombie-pod-killer
spec:
  schedule: "*/9 * * * *"
  successfulJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        metadata:
          name: cron-zombie-pod-killer
        spec:
          containers:
          - name: cron-zombie-pod-killer
            image: bitnami/kubectl
            command:
            - "/bin/sh"
            args:
            - "-c"
            - "kubectl get pods --all-namespaces --field-selector=status.phase=Failed | awk '{print $2 \" --namespace=\" $1}' | xargs kubectl delete pod > /dev/null"
          serviceAccountName: pod-read-and-delete
          restartPolicy: Never
I would expect it to run every 9th minute, but that's not the case.
What happens is that when there are pods to clean up (i.e., when there is something for the pod to do), it runs normally. Once everything is cleaned up, it keeps restarting -> failing -> starting, etc. in a loop every second.
Is there something I need to do to tell k8s that the job has been successful, even if there's nothing to do (no pods to clean up)? What makes the job loop in restarts and failures?
That is by design. restartPolicy is not applied to the CronJob, but to the Pods it creates.
If restartPolicy is set to Never, the Job controller will just create new pods whenever the previous one fails. Setting it to OnFailure causes the Pod itself to be restarted instead, and prevents the stream of new Pods.
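In other words, the policy lives in the pod template inside the CronJob; a sketch of the relevant fragment:
jobTemplate:
  spec:
    template:
      spec:
        restartPolicy: OnFailure # restart the same pod on failure instead of spawning new ones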
This was discussed in this GitHub issue: Job being constanly recreated despite RestartPolicy: Never #20255
Your kubectl command results in exit code 123 from xargs ("any invocation exited with a non-zero status") when there are no Pods in the Failed state. This causes the Job to fail and restart constantly.
You can fix that by forcing the kubectl command to exit with code 0. Add || exit 0 to the end of it:
kubectl get pods --all-namespaces --field-selector=status.phase=Failed | awk '{print $2 \" --namespace=\" $1}' | xargs kubectl delete pod > /dev/null || exit 0
...Once everything is cleared up, it keeps restarting -> failing -> starting, etc. in a loop every second.
When your first command returns no pods, the trailing commands (e.g. awk, xargs) fail and return a non-zero exit code. The controller perceives such an exit code as a job failure and therefore starts a new pod to re-run the job. You should just exit with zero when no pods are returned.
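A sketch of an alternative fix, assuming GNU xargs (present in most Linux images): -r (--no-run-if-empty) skips running kubectl delete entirely when the input is empty, --no-headers keeps the header row out of awk, and -n2 runs one delete per pod so each pod name is paired with its own --namespace flag:
kubectl get pods --all-namespaces --no-headers --field-selector=status.phase=Failed \
  | awk '{print $2 " --namespace=" $1}' \
  | xargs -r -n2 kubectl delete pod > /dev/null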
I'm running a pod with 3 containers (telegraf, fluentd and an in-house agent) that makes use of shareProcessNamespace: true.
I've written a python script to fetch the initial config for telegraf and fluentd from a central controller API endpoint. Since this is a one-time operation, I plan to use a helm post-install hook.
apiVersion: batch/v1
kind: Job
metadata:
  name: agent-postinstall
  annotations:
    "helm.sh/hook-weight": "3"
    "helm.sh/hook": "post-install"
spec:
  template:
    spec:
      containers:
      - name: agent-postinstall
        image: "{{ .Values.image.agent.repository }}:{{ .Values.image.agent.tag | default .Chart.AppVersion }}"
        imagePullPolicy: IfNotPresent
        command: ['python3', 'getBaseCfg.py']
        volumeMounts:
        - name: config-agent-volume
          mountPath: /etc/config
      volumes:
      - name: config-agent-volume
        configMap:
          name: agent-cm
      restartPolicy: Never
  backoffLimit: 1
The python script needs to check whether the telegraf/fluentd/agent processes are up before getting the config. I intend to wait (with a timeout) until pgrep <telegraf/fluentd/agent> returns true and then fire the APIs. Is there a way to enable shareProcessNamespace for the post-install hook as well? Thanks.
PS: Currently, the agent calls the python script along with its own startup script. It works, but it is kludgy. I'd like to move it out of agent container.
shareProcessNamespace
The most important aspect of this flag is that it works only within one pod: all containers in the same pod share processes with each other.
The described approach uses a Job, and a Job creates its own separate pod, so it won't work this way. The container would have to be part of the "main" pod alongside all the other containers to have access to that pod's running processes.
More details about process sharing.
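For reference, the flag sits at the pod level, so all three containers would have to live in the same pod spec; a minimal sketch (names and images are placeholders for the ones in the question):
apiVersion: v1
kind: Pod
metadata:
  name: agent-pod # hypothetical name
spec:
  shareProcessNamespace: true # every container below can see the others' processes
  containers:
  - name: telegraf
    image: telegraf
  - name: fluentd
    image: fluentd
  - name: agent
    image: my-agent-image # hypothetical in-house image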
A possible way to solve it
It's possible to check the processes in the containers directly using the kubectl command.
Below is an example of how to check the state of the processes using the pgrep command. The pgrepContainer container needs to have the pgrep command already installed.
job.yaml:
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ .Release.Name }}-postinstall-hook"
  annotations:
    "helm.sh/hook": post-install
spec:
  template:
    spec:
      serviceAccountName: config-user # a service account with appropriate permissions is required for this approach
      volumes:
      - name: check-script
        configMap:
          name: check-script
      restartPolicy: Never
      containers:
      - name: post-install-job
        image: "bitnami/kubectl" # this image ships kubectl, so the script can connect to the cluster
        command: ["bash", "/mnt/script/checkScript.sh"]
        volumeMounts:
        - name: check-script
          mountPath: /mnt/script
And configmap.yaml, which contains the script and the logic that checks the three processes in a loop, up to 60 iterations of 10 seconds each:
apiVersion: v1
kind: ConfigMap
metadata:
  name: check-script
data:
  checkScript.sh: |
    #!/bin/bash
    podName=test
    pgrepContainer=app-1
    process1=sleep
    process2=pause
    process3=postgres
    attempts=0
    until [ $attempts -eq 60 ]; do
      kubectl exec ${podName} -c ${pgrepContainer} -- pgrep ${process1} 1>/dev/null 2>&1 \
        && kubectl exec ${podName} -c ${pgrepContainer} -- pgrep ${process2} 1>/dev/null 2>&1 \
        && kubectl exec ${podName} -c ${pgrepContainer} -- pgrep ${process3} 1>/dev/null 2>&1
      if [ $? -eq 0 ]; then
        break
      fi
      attempts=$((attempts + 1))
      sleep 10
      echo "Waiting for all containers to be ready...$(( attempts * 10 )) s"
    done
    if [ $attempts -eq 60 ]; then
      echo "ERROR: Timeout"
      exit 1
    fi
    echo "All containers are ready!"
    echo "Configuring telegraf and fluentd services"
The final result will look like this:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
test 2/2 Running 0 20m
test-postinstall-hook-dgrc9 0/1 Completed 0 20m
$ kubectl logs test-postinstall-hook-dgrc9
Waiting for all containers to be ready...10 s
All containers are ready!
Configuring telegraf and fluentd services
The above is an alternative approach; you can use its logic as a base to achieve your end goal.
postStart
A postStart hook can also be considered as a place for this logic; it runs right after the container is created. Since the main application takes time to start and there is already logic that waits for it, it is not an issue that:
there is no guarantee that the hook will execute before the container ENTRYPOINT
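A minimal sketch of that option (the image name is hypothetical, and getBaseCfg.py is assumed to be baked into the agent image):
containers:
- name: agent
  image: my-agent-image # hypothetical
  lifecycle:
    postStart:
      exec:
        command: ["python3", "getBaseCfg.py"] # runs inside the agent container, alongside its entrypoint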
I'm running a K8S job, with the following flags:
apiVersion: batch/v1
kind: Job
metadata:
  name: my-EP
spec:
  template:
    metadata:
      labels:
        app: EP
    spec:
      restartPolicy: "Never"
      containers:
      - name: EP
        image: myImage
The Job starts, runs my script that runs some application that sends me an email, and then terminates. The application returns the exit code to the bash script.
When I run the command kubectl get pods, I get the following:
NAME READY STATUS RESTARTS AGE
my-EP-94rh8 0/1 Completed 0 2m2s
Sometimes there are issues, e.g. the network is not connected or no license is available. I would like that to be visible to the pod user.
My question is: can I propagate the script's exit code so it is visible when I run the above get pods command? I.e., instead of the "Completed" status, I would like to see my application's exit code: 0, 1, 2, 3...
Or maybe there is a way to see it in the Pod Statuses, in the describe command?
Currently I see:
Pods Statuses: 0 Running / 1 Succeeded / 0 Failed
Is this possible?
A non-zero exit code on k8s jobs will fall into the Failed pod status. There really isn't a way to have the exit code shown with kubectl get pods, but you could output the pod status with -ojson and then pipe it into jq, looking for the exit code. Something like the following from this post might work:
kubectl get pod pod_name -n namespace -ojson | jq .status.containerStatuses[].state.terminated.exitCode
or this, with items[] in the JSON:
kubectl get pods -ojson | jq .items[].status.containerStatuses[].state.terminated.exitCode
Alternatively, as u/blaimi mentioned, you can do it without jq, like this:
kubectl get pod pod_name -o jsonpath --template='{.status.containerStatuses[*].state.terminated.exitCode}'
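If you would rather not look up pod names first, Kubernetes labels every pod a Job creates with job-name, so (using the job name from the question) you can print the name and exit code of all its pods in one go:
kubectl get pods -l job-name=my-EP \
  -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.status.containerStatuses[*].state.terminated.exitCode}{"\n"}{end}'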
When I try to create a Deployment as type Job, it isn't pulling any image.
Below is .yaml:
apiVersion: batch/v1
kind: Job
metadata:
  name: copyartifacts
spec:
  backoffLimit: 1
  template:
    metadata:
      name: copyartifacts
    spec:
      restartPolicy: "Never"
      volumes:
      - name: sharedvolume
        persistentVolumeClaim:
          claimName: shared-pvc
      - name: dockersocket
        hostPath:
          path: /var/run/docker.sock
      containers:
      - name: copyartifacts
        image: alpine:3.7
        imagePullPolicy: Always
        command: ["sh", "-c", "ls -l /shared; rm -rf /shared/*; ls -l /shared; while [ ! -d /shared/artifacts ]; do echo Waiting for artifacts to be copied; sleep 2; done; sleep 10; ls -l /shared/artifacts; "]
        volumeMounts:
        - mountPath: /shared
          name: sharedvolume
Can you please guide here?
Regards,
Vikas
There could be two possible reasons for not seeing the pod:
The pod hasn't been created yet.
The pod has completed its task and terminated before you noticed.
1. Pod hasn't been created:
If the pod hasn't been created yet, you have to find out why the job failed to create it. You can view the job's events to see if there are any failure events. Use the following command to describe the job:
kubectl describe job <job-name> -n <namespace>
Then, check the Events: field. There might be some events showing pod creation failure with the respective reason.
2. Pod has completed and terminated:
Jobs are used to perform a one-time task, rather than to serve an application that needs to maintain a desired state. When the task is complete, the pod goes to the Completed state and terminates (but is not deleted). If your Job is intended for a task that does not take much time, the pod may terminate after completing the task before you notice.
As the pod is terminated, kubectl get pods will not show it. However, you will be able to see it with kubectl get pods -a, as it hasn't been deleted. (Note: the -a/--show-all flag was removed in kubectl 1.14; recent versions show completed pods by default.)
You can also describe the job and check for completion or success events.
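For example, to find the pods a specific Job created (Kubernetes adds a job-name label to them) and then inspect the Job's events:
kubectl get pods --selector=job-name=copyartifacts
kubectl describe job copyartifacts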
If you used kind to create the K8s cluster, all the cluster nodes run as Docker containers. If you reboot your computer or VM, the cluster (pod) IP addresses may change, which can break communication between the cluster nodes. In this case, the cluster manager logs will contain an error message: the Job is created, but the pod is not.
Try re-creating the cluster, or changing the node configuration for the IP addresses.