How to introduce a delay between Pod restarts in a StatefulSet?

I am using the command below to restart Pods in a StatefulSet:
kubectl rollout restart statefulset ts
If I have to introduce a delay between pod rotations, is there an argument or another method to achieve it? I am using a sidecar that writes the Pod IP address to a configuration file; if a Pod restarts before the IP address has been updated in the config file, the service is not healthy. I am looking for a way to introduce a delay between pod restarts/rotations.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: typesense
  namespace: typesense
  labels:
    service: typesense
    app: typesense
spec:
  serviceName: ts
  podManagementPolicy: Parallel
  replicas: 3
  selector:
    matchLabels:
      service: typesense
      app: typesense
  template:
    metadata:
      labels:
        service: typesense
        app: typesense
    spec:
      serviceAccountName: typesense-service-account
      securityContext:
        fsGroup: 2000
        runAsUser: 10000
        runAsGroup: 3000
        runAsNonRoot: true
      terminationGracePeriodSeconds: 300
      containers:
        - name: typesense
          envFrom:
            # - configMapRef:
            #     name: typesense-config
            - secretRef:
                name: typesense-secret
          image: typesense/typesense:0.23.0.rc43
          command:
            - "/opt/typesense-server"
            - "-d"
            - "/usr/share/typesense/data"
            - "--api-port"
            - "8108"
            - "--peering-port"
            - "8107"
            - "--nodes"
            - "/usr/share/typesense/nodes"
          ports:
            - containerPort: 8108
              name: http
          resources:
            requests:
              memory: 100Mi
              cpu: "100m"
            limits:
              memory: 1Gi
              cpu: "1000m"
          volumeMounts:
            - name: nodeslist
              mountPath: /usr/share/typesense
            - name: data
              mountPath: /usr/share/typesense/data
        - name: typesense-node-resolver
          image: alasano/typesense-node-resolver
          command:
            - "/opt/tsns"
            - "-namespace=typesense"
          volumeMounts:
            - name: nodeslist
              mountPath: /usr/share/typesense
      volumes:
        - name: nodeslist
          emptyDir: {}
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: nfs
        resources:
          requests:
            storage: 1Gi
You can find the full manifest here.

Maybe using an init container?
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: typesense
  namespace: typesense
  labels:
    service: typesense
    app: typesense
spec:
  serviceName: ts
  podManagementPolicy: Parallel
  replicas: 3
  selector:
    matchLabels:
      service: typesense
      app: typesense
  template:
    metadata:
      labels:
        service: typesense
        app: typesense
    spec:
      serviceAccountName: typesense-service-account
      securityContext:
        fsGroup: 2000
        runAsUser: 10000
        runAsGroup: 3000
        runAsNonRoot: true
      terminationGracePeriodSeconds: 300
      initContainers:
        - name: init-myservice
          image: busybox:1.34.1
          # placeholder command: write the Pod IP into the config file here
          command: ["echo", "ip", "to", "config", "file"]
      containers:
        - name: typesense
          envFrom:
            # - configMapRef:
            #     name: typesense-config
            - secretRef:
                name: typesense-secret
          image: typesense/typesense:0.23.0.rc43
          command:
            - "/opt/typesense-server"
            - "-d"
            - "/usr/share/typesense/data"
            - "--api-port"
            - "8108"
            - "--peering-port"
            - "8107"
            - "--nodes"
            - "/usr/share/typesense/nodes"
          ports:
            - containerPort: 8108
              name: http
          resources:
            requests:
              memory: 100Mi
              cpu: "100m"
            limits:
              memory: 1Gi
              cpu: "1000m"
          volumeMounts:
            - name: nodeslist
              mountPath: /usr/share/typesense
            - name: data
              mountPath: /usr/share/typesense/data
        - name: typesense-node-resolver
          image: alasano/typesense-node-resolver
          command:
            - "/opt/tsns"
            - "-namespace=typesense"
          volumeMounts:
            - name: nodeslist
              mountPath: /usr/share/typesense
      volumes:
        - name: nodeslist
          emptyDir: {}
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: nfs
        resources:
          requests:
            storage: 1Gi

Maybe you could try terminationGracePeriodSeconds and a preStop hook with the following command.
command: [ "/bin/bash", "-c", "sleep 30" ]
It introduces a 30-second delay when the control plane asks the Pod to stop, before the container receives SIGTERM. With terminationGracePeriodSeconds, the control plane won't force-kill your container within that period.
The complete example should look like this:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test
  labels:
    app: test
spec:
  replicas: 1
  serviceName: "test"
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      containers:
        - image: nginx
          name: test
          lifecycle:
            preStop:
              exec:
                command: [ "/bin/bash", "-c", "sleep 60" ]
      terminationGracePeriodSeconds: 60
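If you need more control over the pacing than a fixed preStop sleep, a cruder alternative is to skip rollout restart and delete the pods one at a time, waiting for each replacement to become Ready plus an extra buffer. A minimal sketch, assuming the pods are named typesense-0 through typesense-2 in the typesense namespace and that 30 seconds is enough for the sidecar to refresh the nodes file:
for i in 2 1 0; do
  kubectl delete pod typesense-$i -n typesense   # the StatefulSet controller recreates it with the same name
  sleep 10                                       # give the controller a moment to recreate the pod
  kubectl wait --for=condition=Ready pod/typesense-$i -n typesense --timeout=300s
  sleep 30                                       # extra time for the sidecar to update the nodes file
done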

Related

I have the following YAML file whose errors I want to ignore

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  creationTimestamp: null
  name: mysqldb-1
  labels:
    app.kubernetes.io: mysqldb-1
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  creationTimestamp: null
  name: mysqldb-2
  labels:
    app.kubernetes.io: mysqldb-2
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
---
apiVersion: v1
kind: Service
metadata:
  name: mysqldb-service
  labels:
    app.kubernetes.io: mysqldb
spec:
  ports:
    - name: "5306"
      port: 5306
      targetPort: 3306
  selector:
    app.kubernetes.io: mysqldb
status:
  loadBalancer: {}
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: mysqldb
labels:
app.kubernetes.io: mysqldb
spec:
replicas: 3
selector:
matchLabels:
app.kubernetes.io: mysqldb
template:
metadata:
labels:
app.kubernetes.io: mysqldb
spec:
containers:
-name: mysqldb
image: mysql:8.0
ports:
- containerPort: 3306
volumeMounts:
- mountPath: /var/lib/mysql
name: mysqldb-1
- mountPath: /docker-entrypoint-initdb.d/init.sql
name: mysqldb-2
restartPolicy: Always
volumes:
- name: mysqldb-1
persistentVolumeClaim:
claimName: mysqldb-1
- name: mysqldb-2
persistentVolumeClaim:
claimName: mysqldb-2
status: {}
I got this error
Error from server (BadRequest): error when creating "mysqldb.yaml": Deployment in version "v1" cannot be handled as a Deployment: strict decoding error: unknown field "spec.spec"
How do I go about ignoring the errors for an included yaml file?
There are indentation issues in your Deployment resource. Also, add resource limits for the containers.
Try this one:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysqldb
  labels:
    app.kubernetes.io: mysqldb
spec:
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io: mysqldb
  template:
    metadata:
      labels:
        app.kubernetes.io: mysqldb
    spec:
      containers:
        - name: mysqldb
          image: mysql:8.0
          ports:
            - containerPort: 3306
          resources:
            requests:
              memory: "256Mi"
              cpu: "500m"
            limits:
              memory: "512Mi"
              cpu: "1000m"
          volumeMounts:
            - mountPath: /var/lib/mysql
              name: mysqldb-1
            - mountPath: /docker-entrypoint-initdb.d/init.sql
              name: mysqldb-2
      restartPolicy: Always
      volumes:
        - name: mysqldb-1
          persistentVolumeClaim:
            claimName: mysqldb-1
        - name: mysqldb-2
          persistentVolumeClaim:
            claimName: mysqldb-2
status: {}
NOTE: You can use an IDE extension for Kubernetes to get errors and warnings for YAML resource issues easily.
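You can also catch this kind of error without applying anything, by asking the API server to validate the manifest in a dry run, for example:
kubectl apply --dry-run=server -f mysqldb.yaml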

Nexus on k3s does not persist users and data on restart

I have installed Nexus on a K3s Raspberry Pi cluster with the following setup, for Kubernetes learning purposes. First I created a StatefulSet:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nexus
  namespace: dev-ops
spec:
  serviceName: "nexus"
  replicas: 1
  selector:
    matchLabels:
      app: nexus-server
  template:
    metadata:
      labels:
        app: nexus-server
    spec:
      containers:
        - name: nexus
          image: klo2k/nexus3:latest
          env:
            - name: MAX_HEAP
              value: "800m"
            - name: MIN_HEAP
              value: "300m"
          resources:
            limits:
              memory: "4Gi"
              cpu: "1000m"
            requests:
              memory: "2Gi"
              cpu: "500m"
          ports:
            - containerPort: 8081
          volumeMounts:
            - name: nexusstorage
              mountPath: /sonatype-work
      volumes:
        - name: nexusstorage
          persistentVolumeClaim:
            claimName: nexusstorage
Storage class
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nexusstorage
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "30"
  fsType: "ext4"
  diskSelector: "ssd"
  nodeSelector: "ssd"
PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nexusstorage
  namespace: dev-ops
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: nexusstorage
  resources:
    requests:
      storage: 50Gi
Service
apiVersion: v1
kind: Service
metadata:
  name: nexus-server
  namespace: dev-ops
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/path: /
    prometheus.io/port: '8081'
spec:
  selector:
    app: nexus-server
  type: LoadBalancer
  ports:
    - port: 8081
      targetPort: 8081
      nodePort: 32000
This setup spins up Nexus, but if I restart the pod the data does not persist and I have to recreate all the settings and users from scratch.
What am I missing in this case?
UPDATE
I got it working; Nexus needs permissions on the mounted directory. The working StatefulSet looks as follows:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nexus
  namespace: dev-ops
spec:
  serviceName: "nexus"
  replicas: 1
  selector:
    matchLabels:
      app: nexus-server
  template:
    metadata:
      labels:
        app: nexus-server
    spec:
      securityContext:
        runAsUser: 200
        runAsGroup: 200
        fsGroup: 200
      containers:
        - name: nexus
          image: klo2k/nexus3:latest
          env:
            - name: MAX_HEAP
              value: "800m"
            - name: MIN_HEAP
              value: "300m"
          resources:
            limits:
              memory: "4Gi"
              cpu: "1000m"
            requests:
              memory: "2Gi"
              cpu: "500m"
          ports:
            - containerPort: 8081
          volumeMounts:
            - name: nexus-storage
              mountPath: /nexus-data
      volumes:
        - name: nexus-storage
          persistentVolumeClaim:
            claimName: nexus-storage
The important snippet to get it working:
securityContext:
  runAsUser: 200
  runAsGroup: 200
  fsGroup: 200
I'm not familiar with that image, but checking Docker Hub, they mention using a Dockerfile similar to Sonatype's. I would therefore change the mount point for your volume to /nexus-data.
This is the default path for storing data (they set this env var, then declare a VOLUME), which we can confirm by looking at the repository that most likely produced your ARM-capable image.
Following up on your last comment, let's also try mounting it at /opt/sonatype/sonatype-work/nexus3...
In your StatefulSet, change volumeMounts to this:
volumeMounts:
  - name: nexusstorage
    mountPath: /nexus-data
  - name: nexusstorage
    mountPath: /opt/sonatype/sonatype-work/nexus3
volumes:
  - name: nexusstorage
    persistentVolumeClaim:
      claimName: nexusstorage
Although the second volumeMount entry should not be necessary, as far as I understand. Maybe something's wrong with your storage provider?
Are you sure your PVC is writable? Reverting back to your initial configuration, enter your pod (kubectl exec -it) and try to write a file at the root of your PVC.
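For example, assuming the StatefulSet's single pod is nexus-0 and the original /sonatype-work mount path:
kubectl exec -it nexus-0 -n dev-ops -- sh -c 'touch /sonatype-work/write-test && ls -ln /sonatype-work'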

$(POD_NAME) in subPath of Statefulset + Kustomize not expanding

I have a StatefulSet with a volume that uses a subPath: $(POD_NAME). I've also tried $HOSTNAME, which doesn't work either. How does one set the subPath of a volumeMount to the name of the pod or to $HOSTNAME?
Here's what I have:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ravendb
  namespace: pltfrmd
  labels:
    app: ravendb
spec:
  serviceName: ravendb
  template:
    metadata:
      labels:
        app: ravendb
    spec:
      containers:
        - command:
            # ["/bin/sh", "-ec", "while :; do echo '.'; sleep 6 ; done"]
            - /bin/sh
            - -c
            - /opt/RavenDB/Server/Raven.Server --log-to-console --config-path /configuration/settings.json
          image: ravendb/ravendb:latest
          imagePullPolicy: Always
          name: ravendb
          env:
            - name: POD_HOST_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: RAVEN_Logs_Mode
              value: Information
          ports:
            - containerPort: 8080
              name: http-api
              protocol: TCP
            - containerPort: 38888
              name: tcp-server
              protocol: TCP
            - containerPort: 161
              name: snmp
              protocol: TCP
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /data
              name: data
              subPath: $(POD_NAME)
            - mountPath: /configuration
              name: configuration
              subPath: ravendb
            - mountPath: /certificates
              name: certificates
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      terminationGracePeriodSeconds: 120
      volumes:
        - name: certificates
          secret:
            secretName: ravendb-certificate
        - name: configuration
          persistentVolumeClaim:
            claimName: configuration
        - name: data
          persistentVolumeClaim:
            claimName: ravendb
And the Persistent Volume:
apiVersion: v1
kind: PersistentVolume
metadata:
  namespace: pltfrmd
  name: ravendb
  labels:
    type: local
spec:
  storageClassName: local-storage
  capacity:
    storage: 30Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /volumes/ravendb
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  namespace: pltfrmd
  name: ravendb
spec:
  storageClassName: local-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
$HOSTNAME used to work, but doesn't anymore for some reason. I'm wondering if it's a bug in the hostPath storage provider?
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
    - port: 80
      targetPort: 80
  selector:
    app: nginx
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx
spec:
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.10
          ports:
            - containerPort: 80
          volumeMounts:
            - name: nginx
              mountPath: /usr/share/nginx/html
              subPath: $(POD_NAME)
      volumes:
        - name: nginx
          configMap:
            name: nginx
OK, so after much experimentation I found a way that still works.
Step one: map an environment variable:
env:
  - name: POD_HOST_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
This creates $(POD_HOST_NAME) based on the field metadata.name
Then in your mount you do this:
volumeMounts:
  - mountPath: /data
    name: data
    subPathExpr: $(POD_HOST_NAME)
It's important to use subPathExpr, as subPath (which worked before) no longer works. It will then use the environment variable you created and expand it properly.
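Putting the two pieces together, the relevant part of the container spec from the question would look roughly like this:
containers:
  - name: ravendb
    image: ravendb/ravendb:latest
    env:
      - name: POD_HOST_NAME
        valueFrom:
          fieldRef:
            fieldPath: metadata.name
    volumeMounts:
      - mountPath: /data
        name: data
        subPathExpr: $(POD_HOST_NAME)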

Kubernetes Error With Deployment YAML File

I have the following file using which I'm setting up Prometheus on my Kubernetes cluster:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-deployment
  namespace: plant-simulator-monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      name: prometheus-server
  template:
    metadata:
      labels:
        app: prometheus-server
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus:latest
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus/"
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: prometheus-config-volume
              mountPath: /etc/prometheus/
            - name: prometheus-storage-volume
              mountPath: /prometheus/
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
      volumes:
        - name: prometheus-config-volume
          configMap:
            defaultMode: 420
            name: prometheus-server-conf
        - name: prometheus-storage-volume
          emptyDir: {}
When I apply this to my Kubernetes cluster, I see the following error:
ts=2020-03-16T21:40:33.123641578Z caller=sync.go:165 component=daemon err="plant-simulator-monitoring:deployment/prometheus-deployment: running kubectl: The Deployment \"prometheus-deployment\" is invalid: spec.template.metadata.labels: Invalid value: map[string]string{\"app\":\"prometheus-server\"}: `selector` does not match template `labels`"
I could not see anything wrong with my yaml file. Is there something that I'm missing?
As I mentioned in the comments, you have an issue with matching labels.
In spec.selector.matchLabels you have name: prometheus-server, while in spec.template.metadata.labels you have app: prometheus-server. The values need to be the same. Below is what I get when I use your YAML:
$ kubectl apply -f deploymentoriginal.yaml
The Deployment "prometheus-deployment" is invalid: spec.template.metadata.labels: Invalid value: map[string]string{"app":"prometheus-server"}: `selector` does not match template `labels`
And the output when I used the YAML below with matching labels:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-deployment
  namespace: plant-simulator-monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      name: prometheus-server
  template:
    metadata:
      labels:
        name: prometheus-server
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus:latest
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus/"
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: prometheus-config-volume
              mountPath: /etc/prometheus/
            - name: prometheus-storage-volume
              mountPath: /prometheus/
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
      volumes:
        - name: prometheus-config-volume
          configMap:
            defaultMode: 420
            name: prometheus-server-conf
        - name: prometheus-storage-volume
          emptyDir: {}
$ kubectl apply -f deploymentselectors.yaml
deployment.apps/prometheus-deployment created
More detailed info about selectors and labels can be found in the official Kubernetes docs.
There is a mismatch between the label in the selector (name: prometheus-server) and in the template metadata (app: prometheus-server). The below should work.
selector:
  matchLabels:
    app: prometheus-server
template:
  metadata:
    labels:
      app: prometheus-server
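Once the labels match, you can sanity-check that the selector actually finds the pods (namespace taken from the question):
kubectl get pods -n plant-simulator-monitoring -l app=prometheus-server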

Kubernetes Permission denied for mounted nfs volume

The following is the k8s definition used:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pv-provisioning-demo
  labels:
    demo: nfs-pv-provisioning
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 200Gi
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: nfs-server
spec:
  securityContext:
    runAsUser: 1000
    fsGroup: 2000
  replicas: 1
  selector:
    role: nfs-server
  template:
    metadata:
      labels:
        role: nfs-server
    spec:
      containers:
        - name: nfs-server
          image: k8s.gcr.io/volume-nfs:0.8
          ports:
            - name: nfs
              containerPort: 2049
            - name: mountd
              containerPort: 20048
            - name: rpcbind
              containerPort: 111
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /exports
              name: mypvc
      volumes:
        - name: mypvc
          persistentVolumeClaim:
            claimName: nfs-pv-provisioning-demo
---
kind: Service
apiVersion: v1
metadata:
  name: nfs-server
spec:
  ports:
    - name: nfs
      port: 2049
    - name: mountd
      port: 20048
    - name: rpcbind
      port: 111
  selector:
    role: nfs-server
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  nfs:
    # FIXME: use the right IP
    server: nfs-server
    path: "/"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 1Gi
---
# This mounts the nfs volume claim into /mnt and continuously
# overwrites /mnt/index.html with the time and hostname of the pod.
apiVersion: v1
kind: ReplicationController
metadata:
  name: nfs-busybox
spec:
  securityContext:
    runAsUser: 1000
    fsGroup: 2000
  replicas: 2
  selector:
    name: nfs-busybox
  template:
    metadata:
      labels:
        name: nfs-busybox
    spec:
      containers:
        - image: busybox
          command:
            - sh
            - -c
            - 'while true; do date > /mnt/index.html; hostname >> /mnt/index.html; sleep $(($RANDOM % 5 + 5)); done'
          imagePullPolicy: IfNotPresent
          name: busybox
          volumeMounts:
            # name must match the volume name below
            - name: nfs
              mountPath: "/mnt"
      volumes:
        - name: nfs
          persistentVolumeClaim:
            claimName: nfs
Now the /mnt directory in nfs-busybox should have 2000 as its gid (as per the docs), but it still has root as both user and group. Since the application runs as 1000/2000, it is not able to create any logs or data in the /mnt directory.
chmod might solve the issue, but it looks like a workaround. Is there any permanent solution for this?
Observation: if I replace nfs with some other PVC, it works fine, as described in the docs.
Have you tried the initContainers method? It fixes permissions on the exported directory:
initContainers:
  - name: volume-mount-hack
    image: busybox
    command: ["sh", "-c", "chmod -R 777 /exports"]
    volumeMounts:
      - name: nfs
        mountPath: /exports
If you use a standalone NFS server on a Linux box, I suggest using the no_root_squash option:
/exports *(rw,no_root_squash,no_subtree_check)
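After changing /etc/exports on the NFS host, re-export the shares so the option takes effect (standard Linux NFS server tooling assumed):
sudo exportfs -ra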
To manage the directory permissions on nfs-server, you need to change the security context and raise it to privileged mode:
apiVersion: v1
kind: Pod
metadata:
  name: nfs-server
  labels:
    role: nfs-server
spec:
  containers:
    - name: nfs-server
      image: nfs-server
      ports:
        - name: nfs
          containerPort: 2049
      securityContext:
        privileged: true
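To confirm whether fsGroup actually took effect on the mount, you can check the numeric group ownership of /mnt from inside one of the busybox pods (the pod name is looked up via its label rather than hard-coded, since it has a generated suffix):
kubectl exec "$(kubectl get pod -l name=nfs-busybox -o jsonpath='{.items[0].metadata.name}')" -- ls -ldn /mnt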