why does my daemonset crash when a node goes down? - kubernetes

I have configured this DaemonSet (taken from the official Kubernetes documentation) in my cluster, and everything works fine: it distributes the replicas of my application across my two available worker nodes. The problem comes when one node goes down; then all the replicas start running on the other node. Once the downed node recovers, the pods are not automatically redistributed between my nodes, so I have to manually remove all the replicas and scale them again to get the DaemonSet to work.
How can I fix this?
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: fluentd-elasticsearch
        image: gcr.io/fluentd-elasticsearch/fluentd:v2.5.1
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

Related

Nexus on k3s does not persist users and data on restart

I have installed Nexus on a K3s Raspberry Pi cluster with the following setup, for Kubernetes learning purposes. First I created a StatefulSet:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nexus
  namespace: dev-ops
spec:
  serviceName: "nexus"
  replicas: 1
  selector:
    matchLabels:
      app: nexus-server
  template:
    metadata:
      labels:
        app: nexus-server
    spec:
      containers:
      - name: nexus
        image: klo2k/nexus3:latest
        env:
        - name: MAX_HEAP
          value: "800m"
        - name: MIN_HEAP
          value: "300m"
        resources:
          limits:
            memory: "4Gi"
            cpu: "1000m"
          requests:
            memory: "2Gi"
            cpu: "500m"
        ports:
        - containerPort: 8081
        volumeMounts:
        - name: nexusstorage
          mountPath: /sonatype-work
      volumes:
      - name: nexusstorage
        persistentVolumeClaim:
          claimName: nexusstorage
Storage class
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nexusstorage
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "30"
  fsType: "ext4"
  diskSelector: "ssd"
  nodeSelector: "ssd"
pvc
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nexusstorage
  namespace: dev-ops
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: nexusstorage
  resources:
    requests:
      storage: 50Gi
Service
apiVersion: v1
kind: Service
metadata:
  name: nexus-server
  namespace: dev-ops
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/path: /
    prometheus.io/port: '8081'
spec:
  selector:
    app: nexus-server
  type: LoadBalancer
  ports:
    - port: 8081
      targetPort: 8081
      nodePort: 32000
This setup will spin up Nexus, but if I restart the pod the data does not persist and I have to recreate all the settings and users from scratch.
What am I missing in this case?
UPDATE
I got it working: Nexus needs the right permissions on the mounted directory. The working StatefulSet looks as follows:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nexus
  namespace: dev-ops
spec:
  serviceName: "nexus"
  replicas: 1
  selector:
    matchLabels:
      app: nexus-server
  template:
    metadata:
      labels:
        app: nexus-server
    spec:
      securityContext:
        runAsUser: 200
        runAsGroup: 200
        fsGroup: 200
      containers:
      - name: nexus
        image: klo2k/nexus3:latest
        env:
        - name: MAX_HEAP
          value: "800m"
        - name: MIN_HEAP
          value: "300m"
        resources:
          limits:
            memory: "4Gi"
            cpu: "1000m"
          requests:
            memory: "2Gi"
            cpu: "500m"
        ports:
        - containerPort: 8081
        volumeMounts:
        - name: nexus-storage
          mountPath: /nexus-data
      volumes:
      - name: nexus-storage
        persistentVolumeClaim:
          claimName: nexus-storage
The important snippet to get it working:
securityContext:
  runAsUser: 200
  runAsGroup: 200
  fsGroup: 200
I'm not familiar with that image, although checking Docker Hub, they mention using a Dockerfile similar to Sonatype's. So I would change the mount point for your volume to /nexus-data.
This is the default path for storing data (they set this env var, then declare a VOLUME), which we can confirm by looking at the repository that most likely produced your arm-capable image.
And following up on your last comment, let's try to also mount it in /opt/sonatype/sonatype-work/nexus3...
In your StatefulSet, change volumeMounts to this:
volumeMounts:
- name: nexusstorage
  mountPath: /nexus-data
- name: nexusstorage
  mountPath: /opt/sonatype/sonatype-work/nexus3
volumes:
- name: nexusstorage
  persistentVolumeClaim:
    claimName: nexusstorage
Although the second volumeMount entry should not be necessary, as far as I understand. Maybe something's wrong with your storage provider?
Are you sure your PVC is writable? Reverting back to your initial configuration, enter your pod (kubectl exec -it) and try to write a file at the root of your PVC.
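A minimal sketch of that check, assuming the pod is named nexus-0 and the volume is mounted at /sonatype-work as in the initial StatefulSet:
# Open a shell in the pod (pod name assumed; check with `kubectl get pods -n dev-ops`)
kubectl exec -it nexus-0 -n dev-ops -- /bin/sh
# Inside the container, try to write a file at the root of the mounted PVC
touch /sonatype-work/write-test && ls -l /sonatype-work/write-test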

Kubernetes cluster won't send logs to Loggly

I have a Kubernetes cluster on which I am trying to run fluentd to send logs to Loggly for viewing. I know my token is correct, since it was working on another cluster before.
Below is my manifest.yaml file. I have no logs to share, as the pod isn't logging anything.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-es-v1.20
  namespace: kube-system
  labels:
    k8s-app: fluentd-loggly
    kubernetes.io/cluster-service: "true"
    version: v1.20
spec:
  selector:
    matchLabels:
      k8s-app: fluentd-loggly
  template:
    metadata:
      labels:
        k8s-app: fluentd-loggly
        kubernetes.io/cluster-service: "true"
        version: v1.20
    spec:
      containers:
      - name: fluentd-loggly
        image: garland/kubernetes-fluentd-loggly:1.0
        command:
        - '/bin/sh'
        - '-c'
        - '/usr/sbin/td-agent 2>&1 >> /var/log/fluentd.log'
        env:
        - name: LOGGLY_URL
          value: "https://logs-01.loggly.com/inputs/<token>/tag/<tag>"
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
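One way to pull logs straight from the DaemonSet pods, assuming the k8s-app=fluentd-loggly label above (kubectl logs itself shows nothing here because the td-agent output is redirected to a file):
# Find the fluentd pods created by the DaemonSet
kubectl get pods -n kube-system -l k8s-app=fluentd-loggly
# Tail the file the container command writes to (placeholder pod name)
kubectl exec -n kube-system <fluentd-pod-name> -- tail -n 50 /var/log/fluentd.log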
Edit 1
Logs requested
2021-06-21 16:33:43 +0000 [warn]: pattern not match: "2021-06-21T16:33:43.548145693Z stdout F 2021-06-21 16:33:43 UTC | CORE | ERROR | (pkg/autodiscovery/config_poller.go:123 in collect) | Unable to collect configurations from provider docker: temporary failure in dockerutil, will retry later: try delay not elapsed yet"
This repeats over and over with nothing else in the logs.

Time Zone in Kubernetes Pods Using Environment Variable

I am trying to update my pod's time to the Asia/Kolkata zone, as described in "kubernetes timezone in POD", using command and args. However, the time still remains in UTC; only the time zone name changes from UTC to Asia.
I was able to fix it using volume mounts as below. Create a config map and apply the deployment YAML:
kubectl create configmap tz --from-file=/usr/share/zoneinfo/Asia/Kolkata -n <required namespace>
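If you go the ConfigMap route, a hedged sketch of mounting it into the pod could look like the snippet below; the key name Kolkata is an assumption based on the --from-file source file name:
volumeMounts:
  - name: tz-config
    mountPath: /etc/localtime
    subPath: Kolkata   # ConfigMap key created by --from-file (file basename); assumed
volumes:
  - name: tz-config
    configMap:
      name: tz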
Why is the environment variable method not working? And if we use the volume mount approach, will a pod eviction from one host to another affect the mounted time zone after the pod is rescheduled?
The env-variable deployment YAML, which does not update the time, is below:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: connector
  labels:
    app: connector
  namespace: clients
spec:
  replicas: 1
  selector:
    matchLabels:
      app: connector
  template:
    metadata:
      labels:
        app: connector
    spec:
      containers:
        - image: connector
          name: connector
          resources:
            requests:
              memory: "32Mi" # "64M"
              cpu: "250m"
            limits:
              memory: "64Mi" # "128M"
              cpu: "500m"
          ports:
            - containerPort: 3307
              protocol: TCP
          env:
            - name: TZ
              value: Asia/Kolkata
          volumeMounts:
            - name: connector-rd
              mountPath: /home/mongobi/mongosqld.conf
              subPath: mongosqld.conf
      volumes:
        - name: connector-rd
          configMap:
            name: connector-rd
            items:
              - key: mongod.conf
The volume mount YAML is below.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: connector
  labels:
    app: connector
  namespace: clients
spec:
  replicas: 1
  selector:
    matchLabels:
      app: connector
  template:
    metadata:
      labels:
        app: connector
    spec:
      containers:
        - image: connector
          name: connector
          resources:
            requests:
              memory: "32Mi" # "64M"
              cpu: "250m"
            limits:
              memory: "64Mi" # "128M"
              cpu: "500m"
          ports:
            - containerPort: 3307
              protocol: TCP
          volumeMounts:
            - name: tz-config
              mountPath: /etc/localtime
            - name: connector-rd
              mountPath: /home/mongobi/mongosqld.conf
              subPath: mongosqld.conf
      volumes:
        - name: connector-rd
          configMap:
            name: connector-rd
            items:
              - key: mongod.conf
                path: mongosqld.conf
        - name: tz-config
          hostPath:
            path: /usr/share/zoneinfo/Asia/Kolkata
In this scenario you need to set the type attribute to File for the hostPath volume in the deployment configuration. The configuration below should work for you:
- name: tz-config
  hostPath:
    path: /usr/share/zoneinfo/Asia/Kolkata
    type: File
Simply setting the TZ env variable in the deployment works for me.
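A quick way to check whether the container actually picked up the zone, assuming the image ships a standard date binary (the pod name is a placeholder):
# Local time inside the container vs. UTC; with Asia/Kolkata applied they should differ by +05:30
kubectl exec -n clients <connector-pod-name> -- date
kubectl exec -n clients <connector-pod-name> -- date -u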

How can I see my failing jobs with Kubernetes

I have an issue with a Job in kubernetes. When I want to debug, I do:
kubectl describe job -n influx pamela-1578898800
Name:           pamela-1578898800
Namespace:      influx
Selector:       controller-uid=xxx
Labels:         controller-uid=xxx
                job-name=pamela-1578898800
Annotations:    <none>
Controlled By:  CronJob/pamela
Parallelism:    1
Completions:    1
Start Time:     Mon, 13 Jan 2020 08:00:04 +0100
Pods Statuses:  0 Running / 0 Succeeded / 5 Failed
Pod Template:
  Labels:  controller-uid=53110b24-35d2-11ea-bca1-06ecc706e86a
           job-name=pamela-1578898800
  Containers:
   pamela:
    Image:      registry.gitlab.com/xxx/pamela:latest
    Port:       <none>
    Host Port:  <none>
    Limits:
      cpu:     800m
      memory:  1000Mi
    Requests:
      cpu:     800m
      memory:  1000Mi
    Environment Variables from:
      pamela-env  Secret  Optional: false
    Environment:  <none>
    Mounts:
      /config from pamela-keys (rw)
      /log from pamela-claim (rw,path="log")
      /raw from pamela-claim (rw,path="raw")
  Volumes:
   pamela-claim:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pamela-claim
    ReadOnly:   false
   pamela-keys:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  pamela-keys
    Optional:    false
Events:          <none>
Here you can see 5 failed pods, but I don't know how to see their logs.
When I do:
kubectl get po -A
I see no "pamela-xxx" pods.
How can I see what the issue is?
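Job pods are labelled with job-name=<job>, so (assuming they have not already been garbage-collected) something like this should list them and show their logs; the pod name in the second command is a placeholder:
# List the pods created by this particular job
kubectl get pods -n influx -l job-name=pamela-1578898800
# Show the logs of one failed pod
kubectl logs -n influx <pamela-pod-name>
# Or dump logs from all pods of the job at once
kubectl logs -n influx -l job-name=pamela-1578898800 --all-containers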
EDIT:
Here are the manifests I use to trigger the job.
# job.yaml - THIS ONE WORKS, LAUNCHING MANUALLY
apiVersion: batch/v1
kind: Job
metadata:
  name: pamela-singlerun
  namespace: influx
spec:
  template:
    spec:
      containers:
      - image: registry.gitlab.com/xxx/pamela:latest
        envFrom:
        - secretRef:
            name: pamela-env
        name: pamela
        volumeMounts:
        - mountPath: /raw
          name: pamela-claim
          subPath: raw
        - mountPath: /log
          name: pamela-claim
          subPath: log
        - mountPath: /config
          name: pamela-keys
      restartPolicy: Never
      volumes:
      - name: pamela-claim
        persistentVolumeClaim:
          claimName: pamela-claim
      - name: pamela-keys
        secret:
          secretName: pamela-keys
          items:
          - key: keys.yml
            path: keys.yml
      nodeSelector:
        kops.k8s.io/instancegroup: pamela-nodes
      imagePullSecrets:
      - name: gitlab-registry
And cronjob.yml, THIS ONE DOESN'T WORK
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: pamela
  namespace: influx
spec:
  schedule: "0 7,19 * * *"
  concurrencyPolicy: Replace
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - image: registry.gitlab.com/xxx/pamela:latest
            envFrom:
            - secretRef:
                name: pamela-env
            name: pamela
            resources:
              limits:
                cpu: 800m
                memory: 1000Mi
              requests:
                cpu: 800m
                memory: 1000Mi
            volumeMounts:
            - mountPath: /raw
              name: pamela-claim
              subPath: raw
            - mountPath: /log
              name: pamela-claim
              subPath: log
            - mountPath: /config
              name: pamela-keys
          restartPolicy: Never
          volumes:
          - name: pamela-claim
            persistentVolumeClaim:
              claimName: pamela-claim
          - name: pamela-keys
            secret:
              secretName: pamela-keys
              items:
              - key: keys.yml
                path: keys.yml
          nodeSelector:
            kops.k8s.io/instancegroup: pamela-nodes
          imagePullSecrets:
          - name: gitlab-registry
EDIT 2: After running the cron job every 10 minutes, I can see my jobs and I get the expected results (meaning it works):
pamela-1578992400-ppgtx 0/1 Completed 0 21m
pamela-1578993000-kn8nd 0/1 Completed 0 11m
But right after this, I get:
Error from server (NotFound): pods "pamela-1578992400-ppgtx" not found
when trying to get the logs, which means the TTL seems to be around 10 minutes. When trying to increase the TTL, I get a feature-gates disabled issue; I'm checking how to fix it.
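For reference, a hedged sketch of where a TTL would go on the jobs the CronJob creates; ttlSecondsAfterFinished lives in the Job spec, and on clusters of this vintage it sits behind the TTLAfterFinished feature gate (which matches the feature-gates error above):
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: pamela
  namespace: influx
spec:
  schedule: "0 7,19 * * *"
  concurrencyPolicy: Replace
  jobTemplate:
    spec:
      # Keep finished Jobs (and their pods) around for 24 hours before cleanup
      ttlSecondsAfterFinished: 86400
      template:
        spec:
          # ... same pod spec as in the manifests above ...
          restartPolicy: Never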
It is weird; after setting the cron job to run every 10 minutes, I get:
➜ kubectl get jobs -n influx
NAME COMPLETIONS DURATION AGE
pamela-1578898800 0/1 32h 32h
pamela-1579007400 1/1 99s 159m
pamela-1579011000 1/1 97s 99m
pamela-1579014600 1/1 108s 39m
I use: schedule: "10 * * * *"
I don't understand what's going on here...

Deploy pods on different nodes

I have a namespace called airflow that has 2 pods: webserver and scheduler. I want to deploy the scheduler on node A and the webserver on node B.
Here are the deployment files:
scheduler:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: airflow
  name: airflow-scheduler
  labels:
    name: airflow-scheduler
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: airflow-scheduler
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: scheduler
        image: 123423.dkr.ecr.us-east-1.amazonaws.com/airflow:$COMMIT_SHA1
        volumeMounts:
        - name: logs
          mountPath: /logs
        command: ["airflow"]
        args: ["scheduler"]
        imagePullPolicy: Always
        resources:
          limits:
            memory: "3072Mi"
          requests:
            cpu: "500m"
            memory: "2048Mi"
      volumes:
      - name: logs
        persistentVolumeClaim:
          claimName: logs
webserver:
apiVersion: v1
kind: Service
metadata:
  name: airflow-webserver
  namespace: airflow
  labels:
    run: airflow-webserver
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    run: airflow-webserver
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: airflow-webserver
  namespace: airflow
  annotations:
    kubernetes.io/ingress.class: nginx
    certmanager.k8s.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
  - hosts:
    - airflow.awesome.com.br
    secretName: airflow-crt
  rules:
  - host: airflow.awesome.com.br
    http:
      paths:
      - path: /
        backend:
          serviceName: airflow-webserver
          servicePort: 80
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: airflow
  name: airflow-webserver
  labels:
    run: airflow-webserver
spec:
  replicas: 1
  template:
    metadata:
      labels:
        run: airflow-webserver
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: webserver
        image: 123423.dkr.ecr.us-east-1.amazonaws.com/airflow:$COMMIT_SHA1
        volumeMounts:
        - name: logs
          mountPath: /logs
        ports:
        - containerPort: 8080
        command: ["airflow"]
        args: ["webserver"]
        imagePullPolicy: Always
        resources:
          limits:
            cpu: "200m"
            memory: "3072Mi"
          requests:
            cpu: "100m"
            memory: "2048Mi"
      volumes:
      - name: logs
        persistentVolumeClaim:
          claimName: logs
What's the proper way to ensure that the pods will be deployed on different nodes?
edit 1:
anti-affinity is not working:
I've tried to set podAntiAffinity on the scheduler, but it's not working:
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: name
          operator: In
          values:
          - airflow-webserver
      topologyKey: "kubernetes.io/hostname"
If you want to have these pods run on different nodes but you don't care which nodes exactly, you can use the pod anti-affinity feature. It basically defines that pod X should not run on the same node as pod Y (it can also be used with failure domains / zones etc., not just nodes), and it uses labels to specify the pods. So you will need to add some labels and reference them in the affinity spec. More info about it is in the Kube docs.
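A minimal sketch of what that could look like on the scheduler Deployment, keying off the run: airflow-webserver label that the webserver pods already carry (note the label key is run, not name as in the attempt above):
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          # Do not schedule this pod on a node that already runs a pod labelled run=airflow-webserver
          - labelSelector:
              matchExpressions:
              - key: run
                operator: In
                values:
                - airflow-webserver
            topologyKey: "kubernetes.io/hostname"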
If in addition you want to specify on which node it should run, you can use the node affinity feature. See the Kube docs for more details.
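For completeness, a minimal node-affinity sketch, assuming a hypothetical role=webserver label that you would add to the target node yourself:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        # Hypothetical node label, e.g. set with `kubectl label node <node> role=webserver`
        - key: role
          operator: In
          values:
          - webserver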