Does kubernetes pod restart on failure on ImgPullBackOff - kubernetes

I have a kubernetes cluster working perfectly fine. I have 10 worker nodes and 1 master device. I have a below deployment.yaml file type as DaemonSet for pods and containers.
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: deployment
namespace: mynamespace
spec:
replicas: 2
selector:
matchLabels:
name: deployment
template:
metadata:
labels:
name: deployment
spec:
#List of all the containers
containers:
- name: container1
image: CRname/container1
imagePullPolicy: Always
volumeMounts:
- mountPath: /share
name: share-files
securityContext:
privileged: true
- name: container2
image: CRname/container2
imagePullPolicy: Always
volumeMounts:
- mountPath: /share
name: share-files
securityContext:
privileged: true
volumes:
- name: share-files
hostPath:
path: /home/user/shared-folder
imagePullSecrets:
- name: Mysecret
nodeSelector:
NodeType: ALL
On starting above, these two containers start running on all the worker nodes and runs perfectly fine. But I have observed that sometimes, few nodes show error as ImagePullBackOff which means that due to some network or any other issues, it wasn't able to download the image. I did use describe command to check which image failed. But the problem is it didn't try to automatically redownload the image. I had to delete the pod and thus it is automatically created and then it works fine.
I just want to know why the pod shows this error and do not try to redownload the image. Is there anything which I can add in the yaml file so that it delete and recreate the pod automatically on any type of error.
EDIT I would also like to ask when the deployment is created, node starts to pull the image from the container registry initially for the first time. Once they are downloaded locally on the nodes, why does it has to pull the image again when the image is locally present.?
Please suggest some good options. Thanks.

imagePullPolicy: Always cause you download the image everytime pod restart.
And the pod is always restarted by daemonset, so just wait a bit more time.

Related

container level securityContext fsGroup

I'm trying to play with single pod multi container scenario.
The problem is one of my container (directus) is a node app that run as user 'node' with uid 1000
First try, I use hostpath as storage back end. With this, I need to change the host's directory mode with chmod manualy.
Now, I'm trying using longhorn.
And basicaly I don't want to change a host directory mod/ownership each time i deploy this deployment.
Here is my manifest
apiVersion: apps/v1
kind: Deployment
metadata:
name: lh-directus
namespace: lh-directus
spec:
replicas: 1
selector:
matchLabels:
app: lh-directus
template:
metadata:
labels:
app: lh-directus
spec:
nodeSelector:
kubernetes.io/os: linux
isGeneralDeployment: "true"
volumes:
- name: lh-directus-uploads-volume
persistentVolumeClaim:
claimName: lh-directus-uploads-pvc
- name: lh-directus-dbdata-volume
persistentVolumeClaim:
claimName: lh-directus-dbdata-pvc
containers:
# Redis Cache
- name: redis
image: redis:6
# Database
- name: database
image: postgres:12
volumeMounts:
- name: lh-directus-dbdata-volume
mountPath: /var/lib/postgresql/data
# Directus
- name: directus
image: directus/directus:latest
securityContext:
fsGroup: 1000
volumeMounts:
- name: lh-directus-uploads-volume
mountPath: /directus/uploads
When I Appy the manifest, I got error
error: error validating "lh-directus.yaml": error validating data: ValidationError(Deployment.spec.template.spec.containers[2].securityContext): unknown field "fsGroup" in io.k8s.api.core.v1.SecurityContext; if you choose to ignore these errors, turn validation off with --validate=false
I reads about initContainer ....
But Kindly please tell me how to fix this problem without initContainer and without manualy set/change host's path ownership/mod.
Sincerely
-bino-

Executing a Script using a Cronjob Kubernetes Cluster

I have a 3 node K8 v1.21 cluster in AWS and looking for SOLID config to run a script using a cronjob. I have seen many documents on here and Google using cronjob and hostPath to Persistent Volumes/Claims to using ConfigMaps, the list goes one.
I keep getting "Back-off restarting failed container/CrashLoopBackOff" errors.
Any help is much appreciated.
cronjob.yaml
The script I am trying to run is basic for testing only
#! /bin/<br/>
kubectl create deployment nginx --image=nginx
Still getting the same error.
kubectl describe pod/xxxx
This hostPath in AWS cluster created using eksctl works.
apiVersion: v1
kind: Pod
metadata:
name: redis-hostpath
spec:
containers:
- image: redis
name: redis-container
volumeMounts:
- mountPath: /test-mnt
name: test-vol
volumes:
- name: test-vol
hostPath:
path: /test-vol
UPDATE
Tried running your config in GCP on a fresh cluster. Only thing I changed was the /home/script.sh to /home/admin/script.sh
Did you test this on your cluster?
Warning FailedPostStartHook 5m27s kubelet Exec lifecycle hook ([/home/mchung/script.sh]) for Container "busybox" in Pod "dumb-job-1635012900-qphqr_default(305c4ed4-08d1-4585-83e0-37a2bc008487)" failed - error: rpc error: code = Unknown desc = failed to exec in container: failed to create exec "0f9f72ccc6279542f18ebe77f497e8c2a8fd52f8dfad118c723a1ba025b05771": cannot exec in a deleted state: unknown, message: ""
Normal Killing 5m27s kubelet FailedPostStartHook
Assuming you're running it in a remote multi-node cluster (since you mentioned AWS in your question), hostPath is NOT an option there for volume mount. Your best choice would be to use a ConfigMap and use it as volume mount.
apiVersion: v1
kind: ConfigMap
metadata:
name: redis-script
data:
script.sh: |
# write down your script here
And then:
apiVersion: batch/v1
kind: CronJob
metadata:
name: redis-job
spec:
schedule: '*/5 * * * *'
jobTemplate:
spec:
template:
spec:
containers:
- name: redis-container
image: redis
args:
- /bin/sh
- -c
- /home/user/script.sh
volumeMounts:
- name: redis-data
mountPath: /home/user/script.sh
subPath: script.sh
volumes:
- name: redis-data
configMap:
name: redis-script
Hope this helps. Let me know if you face any difficulties.
Update:
I think you're doing something wrong. kubectl isn't something you should run from another container / pod. Because it requires the necessary binary to be existed into that container and an appropriate context set. I'm putting a working manifest below for you to understand the whole concept of running a script as a part of cron job:
apiVersion: v1
kind: ConfigMap
metadata:
name: script-config
data:
script.sh: |-
name=StackOverflow
echo "I love $name <3"
---
apiVersion: batch/v1
kind: CronJob
metadata:
name: dumb-job
spec:
schedule: '*/1 * * * *' # every minute
jobTemplate:
spec:
template:
spec:
containers:
- name: busybox
image: busybox:stable
lifecycle:
postStart:
exec:
command:
- /home/script.sh
volumeMounts:
- name: some-volume
mountPath: /home/script.sh
volumes:
- name: some-volume
configMap:
name: script-config
restartPolicy: OnFailure
What it'll do is it'll print some texts in the STDOUT in every minute. Please note that I have put only the commands that container is capable to execute, and kubectl is certainly not one of them which exists in that container out-of-the-box. I hope that is enough to answer your question.

K8S cronjob scheduling on existing pod?

I have my application running in K8S pods. my application writes logs to a particular path for which I already have volume mounted on the pod. my requirement is to schedule cronjob which will trigger weekly once and read the logs from that pod volume and generate a report base on my script (which is basically filtering the logs based on some keywords). and send the report via mail.
unfortunately I am not sure how I will proceed on this as I couldn't get any doc or blog which talks about integrating conrjob to existing pod or volume.
apiVersion: v1
kind: Pod
metadata:
name: webserver
spec:
volumes:
- name: shared-logs
emptyDir: {}
containers:
- name: nginx
image: nginx
volumeMounts:
- name: shared-logs
mountPath: /var/log/nginx
- name: sidecar-container
image: busybox
command: ["sh","-c","while true; do cat /var/log/nginx/access.log /var/log/nginx/error.log; sleep 30; done"]
volumeMounts:
- name: shared-logs
mountPath: /var/log/nginx
apiVersion: batch/v1beta1
kind: CronJob
metadata:
name: "discovery-cronjob"
labels:
app.kubernetes.io/name: discovery
spec:
schedule: "*/5 * * * *"
jobTemplate:
spec:
template:
spec:
containers:
- name: log-report
image: busybox
command: ['/bin/sh']
args: ['-c', 'cat /var/log/nginx/access.log > nginx.log']
volumeMounts:
- mountPath: /log
name: shared-logs
restartPolicy: Never
I see two things here that you need to know:
Unfortunately, it is not possible to schedule a cronjob on an existing pod. Pods are ephemeral and job needs to finish. It would be impossible to tell if the job completed or not. This is by design.
Also in order to be able to see the files from one pod to another you must use a PVC. The logs created by your app have to be persisted if your job wants to access it. Here you can find some examples of how to Create ReadWriteMany PersistentVolumeClaims on your Kubernetes Cluster:
Kubernetes allows us to provision our PersistentVolumes dynamically
using PersistentVolumeClaims. Pods treat these claims as volumes. The
access mode of the PVC determines how many nodes can establish a
connection to it. We can refer to the resource provider’s docs for
their supported access modes.

hostPath as volume in kubernetes

I am trying to configure hostPath as the volume in kubernetes. I have logged into VM server, from where I usually use kubernetes commands such as kubectl.
Below is the pod yaml:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: helloworldanilhostpath
spec:
replicas: 1
template:
metadata:
labels:
run: helloworldanilhostpath
spec:
volumes:
- name: task-pv-storage
hostPath:
path: /home/openapianil/samplePV
type: Directory
containers:
- name: helloworldv1
image: ***/helloworldv1:v1
ports:
- containerPort: 9123
volumeMounts:
- name: task-pv-storage
mountPath: /mnt/sample
In VM server, I have created "/home/openapianil/samplePV" folder and I have a file present in it. It has a sample.txt file.
once I try to create this deployment. it does not happen with error -
Warning FailedMount 28s (x7 over 59s) kubelet, aks-nodepool1-39499429-1 MountVolume.SetUp failed for volume "task-pv-storage" : hostPath type check failed: /home/openapianil/samplePV is not a directory.
Can anyone please help me in understanding the problem here.
hostPath type volumes refer to directories on the Node (VM/machine) where your Pod is scheduled for running (aks-nodepool1-39499429-1 in this case). So you'd need to create this directory at least on that Node.
To make sure your Pod is consistently scheduled on that specific Node you need to set spec.nodeSelector in the PodTemplate:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: helloworldanilhostpath
spec:
replicas: 1
template:
metadata:
labels:
run: helloworldanilhostpath
spec:
nodeSelector:
kubernetes.io/hostname: aks-nodepool1-39499429-1
volumes:
- name: task-pv-storage
hostPath:
path: /home/openapianil/samplePV
type: Directory
containers:
- name: helloworldv1
image: ***/helloworldv1:v1
ports:
- containerPort: 9123
volumeMounts:
- name: task-pv-storage
mountPath: /mnt/sample
In most cases it's a bad idea to use this type of volume; there are some special use cases, but chance are yours is not one them!
If you need local storage for some reason then a slightly better solution is to use local PersistentVolumes.

How do I retain/access the custom log files from a completed cronjob?

I have a cronjob that is completing and outputting several log files.
I want to persist these files and be able access them after the pod has succeeded.
I've found I can access the stdout with oc logs -f <pod>, but I really need to access the log files.
I'm aware Openshift 2 apparently had an environment variable location OPENSHIFT_LOG_DIR that log files were written to, but Openshift 3.5 doesn't appear to have this.
What's my best way of logging and accessing the logs from the CronJob after the pod has succeeded and finished?
After a Job runs to completion, the Pod terminates, but it is not automatically deleted. Since it has completed, you need to use -a to see it. Once you have the Pod name, kubectl logs works as you would expect.
$ kubectl get pods -a
NAME READY STATUS RESTARTS AGE
curator-1499817660-6rzmf 0/1 Completed 0 28d
$ kubectl logs curator-1499817660-6rzmf
2017-07-12 00:01:10,409 INFO ...
A bit late but I hope this answer helps someone facing an almost similar context. For my case I needed to access some files generated by a CronJob and because the pod (and logs) are no longer accessible once the job completes I could not do so I was getting an error:
kubectl logs mongodump-backup-29087654-hj89
Error from server (NotFound): pods "mongodump-backup-27640120-n8p7z" not found
My solution was to deploy a Pod that could access the PVC. The Pod runs a busybox image as below :
apiVersion: v1
kind: Pod
metadata:
name: pvc-inspector
namespace: demos
spec:
containers:
- image: busybox
imagePullPolicy: "IfNotPresent"
name: pvc-inspector
command: ["tail"]
args: ["-f", "/dev/null"]
volumeMounts:
- mountPath: "/tmp"
name: pvc-mount
volumes:
- name: pvc-mount
persistentVolumeClaim:
claimName: mongo-backup-toolset-claim
After deploying this pod side by side to the CronJob I can exec into the pvc-inspector pod and then access files generated by the CronJob :
kubectl exec -it mongodump-backup-29087654-hj89 -- sh
cd tmp
ls
The pvc-inspector has to use the same persistentVolumeClaim as the CronJob and it also has to mount to the same directory as the CronJob.
The CronJob is a simple utility that is doing database backups of Mongo instances :
apiVersion: batch/v1beta1
kind: CronJob
metadata:
name: mongodump-backup
spec:
schedule: "*/5 * * * *" #Cron job every 5 minutes
startingDeadlineSeconds: 60
concurrencyPolicy: Forbid
successfulJobsHistoryLimit: 1
failedJobsHistoryLimit: 2
jobTemplate:
spec:
template:
spec:
containers:
- name: mongodump-backup
image: golide/mongo-backup-toolset:0.0.2
imagePullPolicy: "IfNotPresent"
env:
- name: DATABASE_NAME
value: "claims"
- name: MONGODB_URI
value: mongodb://root:mypasswordhere#mongodb-af1/claims
volumeMounts:
- mountPath: "/tmp"
name: mongodump-volume
command: ['sh', '-c',"./dumpp.sh"]
restartPolicy: OnFailure
volumes:
- name: mongodump-volume
persistentVolumeClaim:
claimName: mongo-backup-toolset-claim