Cannot deploy MongoDB StatefulSet with volumes when replicas are greater than one

Context
I'm sharing the /data/db directory, mounted as a Network File System (NFS) volume, among all pods controlled by the StatefulSet.
Problem
With replicas: 1 the StatefulSet deploys MongoDB correctly. The problem starts when I scale up to more than one replica (e.g. replicas: 2):
all subsequent pods end up in CrashLoopBackOff status.
Question
I understand the error message (see the Debug section below), but I don't understand why it happens. What I'm trying to achieve is a stateful deployment of MongoDB, so that data persists even after the pods are deleted. However, mongod refuses to start because another mongod instance is already running on the /data/db directory.
My questions are: What am I doing wrong? How can I deploy MongoDB so that it is stateful and persists data while scaling up the StatefulSet?
Debug
Cluster state
$ kubectl get svc,sts,po,pv,pvc --output=wide
NAME            TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)     AGE   SELECTOR
service/mongo   ClusterIP   None         <none>        27017/TCP   10h   run=mongo

NAME                     READY   AGE     CONTAINERS   IMAGES
statefulset.apps/mongo   1/2     8m50s   mongo        mongo:4.2.0-bionic

NAME          READY   STATUS             RESTARTS   AGE     IP          NODE        NOMINATED NODE   READINESS GATES
pod/mongo-0   1/1     Running            0          8m50s   10.44.0.2   web01       <none>           <none>
pod/mongo-1   0/1     CrashLoopBackOff   6          8m48s   10.36.0.3   compute01   <none>           <none>

NAME                                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS   REASON   AGE   VOLUMEMODE
persistentvolume/phenex-nfs-mongo   1Gi        RWX            Retain           Bound    phenex-nfs-mongo                           22m   Filesystem

NAME                                     STATUS   VOLUME             CAPACITY   ACCESS MODES   STORAGECLASS   AGE   VOLUMEMODE
persistentvolumeclaim/phenex-nfs-mongo   Bound    phenex-nfs-mongo   1Gi        RWX                           22m   Filesystem
Log
$ kubectl logs -f mongo-1
2019-08-14T23:52:30.632+0000 I CONTROL [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
2019-08-14T23:52:30.635+0000 I CONTROL [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=mongo-1
2019-08-14T23:52:30.635+0000 I CONTROL [initandlisten] db version v4.2.0
2019-08-14T23:52:30.635+0000 I CONTROL [initandlisten] git version: a4b751dcf51dd249c5865812b390cfd1c0129c30
2019-08-14T23:52:30.635+0000 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.1.1 11 Sep 2018
2019-08-14T23:52:30.635+0000 I CONTROL [initandlisten] allocator: tcmalloc
2019-08-14T23:52:30.635+0000 I CONTROL [initandlisten] modules: none
2019-08-14T23:52:30.635+0000 I CONTROL [initandlisten] build environment:
2019-08-14T23:52:30.635+0000 I CONTROL [initandlisten] distmod: ubuntu1804
2019-08-14T23:52:30.635+0000 I CONTROL [initandlisten] distarch: x86_64
2019-08-14T23:52:30.635+0000 I CONTROL [initandlisten] target_arch: x86_64
2019-08-14T23:52:30.635+0000 I CONTROL [initandlisten] options: { net: { bindIp: "0.0.0.0" }, replication: { replSet: "rs0" } }
2019-08-14T23:52:30.642+0000 I STORAGE [initandlisten] exception in initAndListen: DBPathInUse: Unable to lock the lock file: /data/db/mongod.lock (Resource temporarily unavailable). Another mongod instance is already running on the /data/db directory, terminating
2019-08-14T23:52:30.643+0000 I NETWORK [initandlisten] shutdown: going to close listening sockets...
2019-08-14T23:52:30.643+0000 I NETWORK [initandlisten] removing socket file: /tmp/mongodb-27017.sock
2019-08-14T23:52:30.643+0000 I - [initandlisten] Stopping further Flow Control ticket acquisitions.
2019-08-14T23:52:30.643+0000 I CONTROL [initandlisten] now exiting
2019-08-14T23:52:30.643+0000 I CONTROL [initandlisten] shutting down with code:100
Error
Unable to lock the lock file: /data/db/mongod.lock (Resource temporarily unavailable).
Another mongod instance is already running on the /data/db directory, terminating
YAML files
# StatefulSet
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo
  replicas: 2
  selector:
    matchLabels:
      run: mongo
      tier: backend
  template:
    metadata:
      labels:
        run: mongo
        tier: backend
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: mongo
        image: mongo:4.2.0-bionic
        command:
        - mongod
        args:
        - "--replSet=rs0"
        - "--bind_ip=0.0.0.0"
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: phenex-nfs-mongo
          mountPath: /data/db
      volumes:
      - name: phenex-nfs-mongo
        persistentVolumeClaim:
          claimName: phenex-nfs-mongo

# PersistentVolume
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: phenex-nfs-mongo
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 1Gi
  nfs:
    server: master
    path: /nfs/data/phenex/production/permastore/mongo
  claimRef:
    name: phenex-nfs-mongo
  persistentVolumeReclaimPolicy: Retain

# PersistentVolumeClaim
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: phenex-nfs-mongo
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 100Mi

Problem:
You are deploying more than one pod using the same PVC and PV.
Solution:
Use volumeClaimTemplates, so that each pod gets its own PVC and volume.
Example:
# StatefulSet
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo
  replicas: 2
  selector:
    matchLabels:
      run: mongo
      tier: backend
  template:
    metadata:
      labels:
        run: mongo
        tier: backend
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: mongo
        image: mongo:4.2.0-bionic
        command:
        - mongod
        args:
        - "--replSet=rs0"
        - "--bind_ip=0.0.0.0"
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: phenex-nfs-mongo
          mountPath: /data/db
  volumeClaimTemplates:
  - metadata:
      name: phenex-nfs-mongo
    spec:
      accessModes:
      - ReadWriteMany
      resources:
        requests:
          storage: 100Mi
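With volumeClaimTemplates the StatefulSet controller creates a separate PVC for every replica, named <template-name>-<pod-name> (e.g. phenex-nfs-mongo-mongo-0), so each mongod gets its own /data/db and the lock-file conflict goes away. Keep in mind that every generated PVC still needs a PersistentVolume it can bind to (one pre-created PV per replica, or a dynamic provisioner). A rough sketch of what you should see after applying the manifest above (exact columns and output will differ):

$ kubectl get pvc
NAME                       STATUS   CAPACITY   ACCESS MODES
phenex-nfs-mongo-mongo-0   Bound    100Mi      RWX
phenex-nfs-mongo-mongo-1   Bound    100Mi      RWX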

Related

MongoDb (k8s pod) is failing due to Operation is not permitted

I'm running a MongoDB (5.0.12) instance as a Kubernetes pod. Suddenly the pod started failing, and I need some help understanding the logs:
{"t":{"$date":"2022-09-13T18:39:51.104+00:00"},"s":"E", "c":"STORAGE", "id":22435, "ctx":"AuthorizationManager-1","msg":"WiredTiger error","attr":{"error":1,"message":"[1663094391:104664][1:0x7fc5224cc700], file:index-9--3195476868760592993.wt, WT_SESSION.open_cursor: __posix_open_file, 808: /data/db/index-9--3195476868760592993.wt: handle-open: open: Operation not permitted"}}
{"t":{"$date":"2022-09-13T18:39:51.104+00:00"},"s":"F", "c":"STORAGE", "id":50882, "ctx":"AuthorizationManager-1","msg":"Failed to open WiredTiger cursor. This may be due to data corruption","attr":{"uri":"table:index-9--3195476868760592993","config":"overwrite=false","error":{"code":8,"codeName":"UnknownError","errmsg":"1: Operation not permitted"},"message":"Please read the documentation for starting MongoDB with --repair here: http://dochub.mongodb.org/core/repair"}}
{"t":{"$date":"2022-09-13T18:39:51.104+00:00"},"s":"F", "c":"-", "id":23091, "ctx":"AuthorizationManager-1","msg":"Fatal assertion","attr":{"msgid":50882,"file":"src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp","line":109}}
{"t":{"$date":"2022-09-13T18:39:51.104+00:00"},"s":"F", "c":"-", "id":23092, "ctx":"AuthorizationManager-1","msg":"\n\n***aborting after fassert() failure\n\n"}
So why is the operation not permitted? I already ran mongod --repair, but the error still occurs.
This is how the pod is deployed:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      hostname: mongodb
      # securityContext:
      #   runAsUser: 999
      #   runAsGroup: 3000
      #   fsGroup: 2000
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: data
      containers:
      - name: mongodb
        image: mongo:5.0.12
        args: ["--auth", "--dbpath", "/data/db"]
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 27017
        volumeMounts:
        - mountPath: /data/db
          name: data
        # securityContext:
        #   allowPrivilegeEscalation: false
Update
The PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
You can try checking the permissions on that file before starting mongod:
ls -l
Then, using chmod, you can change the permissions and try again.
OR
You can refer to the Kubernetes security context documentation, which might help:
https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
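If the data files turn out to be owned by a different user than the one mongod runs as, a pod-level securityContext is usually a cleaner fix than a manual chmod. A minimal sketch, assuming the official image's mongodb user (uid/gid 999; verify this for your image and tag):

spec:
  securityContext:
    fsGroup: 999            # make the mounted volume writable for gid 999
  containers:
  - name: mongodb
    image: mongo:5.0.12
    securityContext:
      runAsUser: 999
      runAsGroup: 999
      allowPrivilegeEscalation: false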
You should have a look at setting the umask on the directory:
http://www.cyberciti.biz/tips/understanding-linux-unix-umask-value-usage.html
That will ensure new files in the directory are created with the specified permissions/ownerships.

Kubernetes Mongo Deployment Loses Data After Pod Restart

Recently, the managed pod in my mongo deployment on GKE was automatically deleted, and a new one was created in its place. As a result, all my db data was lost.
I specified a PV for the deployment and the PVC was bound too, and I used the standard storage class (google persistent disk). The Persistent Volume Claim had not been deleted either.
Here's an image of the result from kubectl get pv:
[screenshot of kubectl get pv output omitted]
My mongo deployment, along with the persistent volume claim and service, were all created using Kubernetes' kompose tool from a docker-compose.yml for a Prisma 1 + MongoDB deployment.
Here are my yamls:
mongo-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    kompose.cmd: kompose -f docker-compose.yml convert
    kompose.version: 1.21.0 (992df58d8)
  creationTimestamp: null
  labels:
    io.kompose.service: mongo
  name: mongo
  namespace: dbmode
spec:
  replicas: 1
  selector:
    matchLabels:
      io.kompose.service: mongo
  strategy:
    type: Recreate
  template:
    metadata:
      annotations:
        kompose.cmd: kompose -f docker-compose.yml convert
        kompose.version: 1.21.0 (992df58d8)
      creationTimestamp: null
      labels:
        io.kompose.service: mongo
    spec:
      containers:
      - env:
        - name: MONGO_INITDB_ROOT_PASSWORD
          value: prisma
        - name: MONGO_INITDB_ROOT_USERNAME
          value: prisma
        image: mongo:3.6
        imagePullPolicy: ""
        name: mongo
        ports:
        - containerPort: 27017
        resources: {}
        volumeMounts:
        - mountPath: /var/lib/mongo
          name: mongo
      restartPolicy: Always
      serviceAccountName: ""
      volumes:
      - name: mongo
        persistentVolumeClaim:
          claimName: mongo
status: {}
mongo-persistentvolumeclaim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  creationTimestamp: null
  labels:
    io.kompose.service: mongo
  name: mongo
  namespace: dbmode
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
status: {}
mongo-service.yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    kompose.cmd: kompose -f docker-compose.yml convert
    kompose.version: 1.21.0 (992df58d8)
  creationTimestamp: null
  labels:
    io.kompose.service: mongo
  name: mongo
  namespace: dbmode
spec:
  ports:
  - name: "27017"
    port: 27017
    targetPort: 27017
  selector:
    io.kompose.service: mongo
status:
  loadBalancer: {}
I've tried checking the contents mounted in /var/lib/mongo, and all I got was an empty lost+found/ folder. I've also searched the Google persistent disks, but there was nothing in the root directory and I didn't know where else to look.
I guess that for some reason the mongo deployment is not pulling from the persistent volume for the old data when it starts a new pod, which is extremely perplexing.
I also have another Kubernetes project where the same thing happened, except that the old pod was still shown but had an Evicted status.
I've tried checking the contents mounted in /var/lib/mongo and all I
got was an empty lost+found/ folder,
OK, but have you checked that it was actually saving data there before the pod restart and data loss? I suspect it was never saving any data in that directory.
I checked the image you used by running a simple Pod:
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-pod
    image: mongo:3.6
When you connect to it by running:
kubectl exec -ti my-pod -- /bin/bash
and check the default mongo configuration file:
root@my-pod:/var/lib# cat /etc/mongod.conf.orig
# mongod.conf

# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/

# Where and how to store data.
storage:
  dbPath: /var/lib/mongodb # 👈
  journal:
    enabled: true
#  engine:
#  mmapv1:
#  wiredTiger:
you can see among other things that dbPath is actually set to /var/lib/mongodb and NOT to /var/lib/mongo.
So chances are that your mongo wasn't actually saving any data to your PV, i.e. to the /var/lib/mongo directory where it was mounted, but to /var/lib/mongodb, as stated in its configuration file.
You should be able to check it easily by kubectl exec to your running mongo pod:
kubectl exec -ti <mongo-pod-name> -- /bin/bash
and verify where the data is saved.
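For example (just a sketch; the paths come from the config template above):

kubectl exec -ti <mongo-pod-name> -- ls -la /var/lib/mongo     # where your PV is mounted
kubectl exec -ti <mongo-pod-name> -- ls -la /var/lib/mongodb   # dbPath from the default config template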
If you didn't overwrite the original config file in any way (e.g. by providing a ConfigMap), mongo should save its data to /var/lib/mongodb, and this directory, not being a mount point for your volume, is part of the Pod's filesystem and is ephemeral.
Update:
The above-mentioned /etc/mongod.conf.orig is only a template, so it doesn't reflect the actual configuration that has been applied.
If you run:
kubectl logs your-mongo-pod
it will show where the data directory is located:
$ kubectl logs my-pod
2020-12-16T22:20:47.472+0000 I CONTROL [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=my-pod
2020-12-16T22:20:47.473+0000 I CONTROL [initandlisten] db version v3.6.21
...
As we can see, data is saved in /data/db:
dbpath=/data/db
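Given that, one likely fix (a sketch, keeping the rest of the Deployment unchanged) is to mount the volume at the path mongod actually uses:

        volumeMounts:
        - mountPath: /data/db   # matches dbpath=/data/db from the startup log
          name: mongo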

minikube mongodb statefulset with persistent volume config for 3 replicas

The pod fails to start, and I am having a tough time debugging it with my limited experience. I have also mounted a volume from my local machine into the minikube VM, like so:
minikube start --cpus 4 --memory 8192 --mount-string="/data/minikube:/data" --mount
Any help would be much appreciated.
Storage Class:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: db-sc
provisioner: kubernetes.io/no-provisioner
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
Persistent Volume:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: db-pv
spec:
  storageClassName: db-sc
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  persistentVolumeReclaimPolicy: Retain
  accessModes:
  - ReadWriteOnce
  local:
    path: "/data"
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - minikube
StatefulSet:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test-db
  labels:
    app: test-db
spec:
  serviceName: test-db
  replicas: 3
  selector:
    matchLabels:
      app: test-db
      replicaset: MainRepSet
  template:
    metadata:
      labels:
        app: test-db
        replicaset: MainRepSet
    spec:
      containers:
      - name: test-db
        image: mongo:3.4
        command:
        - "numactl"
        - "--interleave=all"
        - "mongod"
        - "--bind_ip"
        - "0.0.0.0"
        - "--replSet"
        - "MainRepSet"
        - "--auth"
        - "--clusterAuthMode"
        - "keyFile"
        - "--keyFile"
        - "/etc/secrets-volume/internal-auth-db-keyfile"
        - "--setParameter"
        - "authenticationMechanisms=SCRAM-SHA-1"
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: secrets-volume
          readOnly: true
          mountPath: /etc/secrets-volume
        - name: db-pvc
          mountPath: /data/db
        resources:
          requests:
            cpu: 150m
            memory: 150Mi
          limits:
            cpu: 400m
            memory: 400Mi
      volumes:
      - name: secrets-volume
        secret:
          secretName: db-user
          defaultMode: 256
  volumeClaimTemplates:
  - metadata:
      name: db-pvc
    spec:
      storageClassName: db-sc
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 500Mi
Service:
apiVersion: v1
kind: Service
metadata:
  name: test-db
  labels:
    name: test-db
spec:
  selector:
    app: test-db
  clusterIP: None
  ports:
  - port: 27017
Pod logs from test-db-0:
Loading...2019-09-29T20:10:16.413+0000 I CONTROL [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=test-db-0
2019-09-29T20:10:16.413+0000 I CONTROL [initandlisten] db version v3.4.23
2019-09-29T20:10:16.413+0000 I CONTROL [initandlisten] git version: 324017ede1dbb1c9554dd2dceb15f8da3c59d0e8
2019-09-29T20:10:16.413+0000 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.2g 1 Mar 2016
2019-09-29T20:10:16.413+0000 I CONTROL [initandlisten] allocator: tcmalloc
2019-09-29T20:10:16.413+0000 I CONTROL [initandlisten] modules: none
2019-09-29T20:10:16.413+0000 I CONTROL [initandlisten] build environment:
2019-09-29T20:10:16.413+0000 I CONTROL [initandlisten] distmod: ubuntu1604
2019-09-29T20:10:16.413+0000 I CONTROL [initandlisten] distarch: x86_64
2019-09-29T20:10:16.413+0000 I CONTROL [initandlisten] target_arch: x86_64
2019-09-29T20:10:16.413+0000 I CONTROL [initandlisten] options: { net: { bindIp: "0.0.0.0" }, replication: { replSet: "MainRepSet" }, security: { authorization: "enabled", clusterAuthMode: "keyFile", keyFile: "/etc/secrets-volume/internal-auth-db-keyfile" }, setParameter: { authenticationMechanisms: "SCRAM-SHA-1" } }
2019-09-29T20:10:16.416+0000 I STORAGE [initandlisten] exception in initAndListen: 98 Unable to create/open lock file: /data/db/mongod.lock Unknown error 526 Is a mongod instance already running?, terminating
2019-09-29T20:10:16.416+0000 I NETWORK [initandlisten] shutdown: going to close listening sockets...
2019-09-29T20:10:16.416+0000 I NETWORK [initandlisten] shutdown: going to flush diaglog...
2019-09-29T20:10:16.417+0000 I CONTROL [initandlisten] now exiting
2019-09-29T20:10:16.417+0000 I CONTROL [initandlisten] shutting down with code:100

mongo setup in k8s not using persistent volume

I'm trying to mount a local folder as the /data/db of mongo in my minikube cluster. So far no luck :(
So, I followed these steps. They describe how to create a persistent volume, a persistent volume claim, a service and a pod.
The config files make sense, but when I eventually spin up the pod, it restarts due to an error and then keeps running. The log from the pod (kubectl logs mongo-0) is:
2019-07-02T13:51:49.177+0000 I CONTROL [main] note: noprealloc may hurt performance in many applications
2019-07-02T13:51:49.180+0000 I CONTROL [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
2019-07-02T13:51:49.184+0000 I CONTROL [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=mongo-0
2019-07-02T13:51:49.184+0000 I CONTROL [initandlisten] db version v4.0.10
2019-07-02T13:51:49.184+0000 I CONTROL [initandlisten] git version: c389e7f69f637f7a1ac3cc9fae843b635f20b766
2019-07-02T13:51:49.184+0000 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.2g 1 Mar 2016
2019-07-02T13:51:49.184+0000 I CONTROL [initandlisten] allocator: tcmalloc
2019-07-02T13:51:49.184+0000 I CONTROL [initandlisten] modules: none
2019-07-02T13:51:49.184+0000 I CONTROL [initandlisten] build environment:
2019-07-02T13:51:49.184+0000 I CONTROL [initandlisten] distmod: ubuntu1604
2019-07-02T13:51:49.184+0000 I CONTROL [initandlisten] distarch: x86_64
2019-07-02T13:51:49.184+0000 I CONTROL [initandlisten] target_arch: x86_64
2019-07-02T13:51:49.184+0000 I CONTROL [initandlisten] options: { net: { bindIp: "0.0.0.0" }, storage: { mmapv1: { preallocDataFiles: false, smallFiles: true } } }
2019-07-02T13:51:49.186+0000 I STORAGE [initandlisten] Detected data files in /data/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2019-07-02T13:51:49.186+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=483M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
2019-07-02T13:51:51.913+0000 I STORAGE [initandlisten] WiredTiger message [1562075511:913047][1:0x7ffa7b8fca80], txn-recover: Main recovery loop: starting at 3/1920 to 4/256
2019-07-02T13:51:51.914+0000 I STORAGE [initandlisten] WiredTiger message [1562075511:914009][1:0x7ffa7b8fca80], txn-recover: Recovering log 3 through 4
2019-07-02T13:51:51.948+0000 I STORAGE [initandlisten] WiredTiger message [1562075511:948068][1:0x7ffa7b8fca80], txn-recover: Recovering log 4 through 4
2019-07-02T13:51:51.976+0000 I STORAGE [initandlisten] WiredTiger message [1562075511:976820][1:0x7ffa7b8fca80], txn-recover: Set global recovery timestamp: 0
2019-07-02T13:51:51.979+0000 I RECOVERY [initandlisten] WiredTiger recoveryTimestamp. Ts: Timestamp(0, 0)
2019-07-02T13:51:51.986+0000 W STORAGE [initandlisten] Detected configuration for non-active storage engine mmapv1 when current storage engine is wiredTiger
2019-07-02T13:51:51.986+0000 I CONTROL [initandlisten]
2019-07-02T13:51:51.986+0000 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
2019-07-02T13:51:51.986+0000 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
2019-07-02T13:51:51.986+0000 I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2019-07-02T13:51:51.986+0000 I CONTROL [initandlisten]
2019-07-02T13:51:52.003+0000 I FTDC [initandlisten] Initializing full-time diagnostic data capture with directory '/data/db/diagnostic.data'
2019-07-02T13:51:52.005+0000 I NETWORK [initandlisten] waiting for connections on port 27017
If I connect to the MongoDB/pod, mongo is just running fine!
But, it is not using the persistent volume. Here is my pv.yaml:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: mongo-pv
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: "/k8s/mongo"
Inside the mongo pod I see the mongo files in /data/db, but on my local machine (/k8s/mongo) the folder is empty.
Below I'll also list the persistent volume claim (pvc) and pod/service yaml
pvc.yaml:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mongo-pv-claim
spec:
  storageClassName: manual
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
mongo.yaml:
apiVersion: v1
kind: Service
metadata:
  name: mongo
  labels:
    name: mongo
spec:
  clusterIP: None
  ports:
  - port: 27017
    targetPort: 27017
  selector:
    role: mongo
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: "mongo"
  replicas: 1
  template:
    metadata:
      labels:
        role: mongo
        environment: test
    spec:
      terminationGracePeriodSeconds: 10
      volumes:
      - name: mongo-pv-storage
        persistentVolumeClaim:
          claimName: mongo-pv-claim
      containers:
      - name: mongo
        image: mongo
        command:
        - mongod
        - "--bind_ip"
        - 0.0.0.0
        - "--smallfiles"
        - "--noprealloc"
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongo-pv-storage
          mountPath: /data/db
I've also tried, instead of using a persistentVolumeClaim, the following:
volumes:
- name: mongo-pv-storage
  hostPath:
    path: /k8s/mongo
This gives the same issue, except there is no error during creation.
Any suggestions on what the problem might be, or where to look next for more details?
Also, how are the PV and the PVC connected?
Please try this
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongo
  labels:
    app: mongodb
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: mongo
        image: mongo:3
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongo-persistent-volume
          mountPath: /data/db
  volumeClaimTemplates:
  - metadata:
      name: mongo-persistent-volume
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 5Gi
You can create a whole new PVC and use it here, or change the name. This works for me; I also faced the same issue when configuring MongoDB with commands passed in. Remove the commands and try it.
For more details check this GitHub.
Some suggestions (may or may not help):
Change your storage class name to a quoted string:
storageClassName: "manual"
This one is very weird, but it worked for me: make sure your path /k8s/mongo has the correct permissions.
chmod 777 /k8s/mongo
I can confirm that it does work in the Kubernetes docker-for-desktop environment, so the issue is related to minikube. I've tested minikube with the hyperkit and VirtualBox drivers; in both cases the files written to /data/db are not visible in the local folder (/k8s/mongo).
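One thing worth remembering is that hostPath (and local) paths are resolved on the node's filesystem, which for minikube is the VM, not your host machine. A quick way to check where the files actually land (a sketch, assuming the default minikube profile):

minikube ssh
ls -la /k8s/mongo   # the path as seen from inside the minikube VM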

Kubernetes statefulset ends in completed state

I'm running a k8s cluster on Google GKE where I have StatefulSets running Redis and Elasticsearch.
Every now and then the pods end up in a Completed state, so they aren't running anymore and the services that depend on them fail.
These pods also never restart by themselves; a simple kubectl delete pod x resolves the problem, but I want my pods to heal by themselves.
I'm running the latest available version, 1.6.4, and I have no clue why they aren't picked up and restarted like any other regular pod. Maybe I'm missing something obvious.
Edit: I've also noticed the pod gets a termination signal and shuts down properly, so I'm wondering where that is coming from. I'm not shutting it down manually, and I experience the same with Elasticsearch.
This is my statefulset resource declaration:
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: "redis"
  replicas: 1
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:3.2-alpine
        ports:
        - name: redis-server
          containerPort: 6379
        volumeMounts:
        - name: redis-storage
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: redis-storage
      annotations:
        volume.alpha.kubernetes.io/storage-class: anything
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
Check the version of docker you run, and whether the docker daemon was restarted during that time.
If the docker daemon was restarted, all the containers would be terminated (unless you use the new "live restore" feature in 1.12). In some docker versions, docker may incorrectly report "exit code 0" for all containers terminated in this situation. See https://github.com/docker/docker/issues/31262 for more details.
source: https://stackoverflow.com/a/43051371/5331893
I am using the same configuration as you, but removing the annotation in the volumeClaimTemplates since I am trying this on minikube:
$ cat sc.yaml
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: "redis"
  replicas: 1
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:3.2-alpine
        ports:
        - name: redis-server
          containerPort: 6379
        volumeMounts:
        - name: redis-storage
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: redis-storage
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
Now, to simulate the case where Redis fails, I exec into the pod and kill the Redis server process:
$ k exec -it redis-0 sh
/data # kill 1
/data # $
Immediately after the process dies, I can see that the STATUS has changed to Completed:
$ k get pods
NAME      READY   STATUS      RESTARTS   AGE
redis-0   0/1     Completed   1          38s
It took some time for me to get Redis up and running again:
$ k get pods
NAME      READY   STATUS    RESTARTS   AGE
redis-0   1/1     Running   2          52s
But soon after that I could see it restarting the pod. Can you see the events triggered when this happened? For example, was there a problem re-attaching the volume to the pod?
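In case it helps, a sketch of the commands I would use to pull up those events (pod name taken from the output above):

kubectl describe pod redis-0
kubectl get events --sort-by=.metadata.creationTimestamp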