I'm running a MongoDB (5.0.12) instance as a Kubernetes pod. Suddenly the pod started failing and I need some help understanding the logs:
{"t":{"$date":"2022-09-13T18:39:51.104+00:00"},"s":"E", "c":"STORAGE", "id":22435, "ctx":"AuthorizationManager-1","msg":"WiredTiger error","attr":{"error":1,"message":"[1663094391:104664][1:0x7fc5224cc700], file:index-9--3195476868760592993.wt, WT_SESSION.open_cursor: __posix_open_file, 808: /data/db/index-9--3195476868760592993.wt: handle-open: open: Operation not permitted"}}
{"t":{"$date":"2022-09-13T18:39:51.104+00:00"},"s":"F", "c":"STORAGE", "id":50882, "ctx":"AuthorizationManager-1","msg":"Failed to open WiredTiger cursor. This may be due to data corruption","attr":{"uri":"table:index-9--3195476868760592993","config":"overwrite=false","error":{"code":8,"codeName":"UnknownError","errmsg":"1: Operation not permitted"},"message":"Please read the documentation for starting MongoDB with --repair here: http://dochub.mongodb.org/core/repair"}}
{"t":{"$date":"2022-09-13T18:39:51.104+00:00"},"s":"F", "c":"-", "id":23091, "ctx":"AuthorizationManager-1","msg":"Fatal assertion","attr":{"msgid":50882,"file":"src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp","line":109}}
{"t":{"$date":"2022-09-13T18:39:51.104+00:00"},"s":"F", "c":"-", "id":23092, "ctx":"AuthorizationManager-1","msg":"\n\n***aborting after fassert() failure\n\n"}
So why is the operation not permitted? I have already run mongod --repair, but the error still occurs.
This is how the pod is deployed:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      hostname: mongodb
      # securityContext:
      #   runAsUser: 999
      #   runAsGroup: 3000
      #   fsGroup: 2000
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: data
      containers:
        - name: mongodb
          image: mongo:5.0.12
          args: ["--auth", "--dbpath", "/data/db"]
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 27017
          volumeMounts:
            - mountPath: /data/db
              name: data
          # securityContext:
          #   allowPrivilegeEscalation: false
Update
The PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
You can try checking the permissions on that file before execution:
ls -l
Then, using chmod, you can change the permissions and try again.
Alternatively, you can refer to the Kubernetes documentation on security contexts, which might help:
https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
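For example, the pod-level securityContext that is commented out in your Deployment could be adjusted along these lines (a sketch only; uid 999 is the mongodb user in the official mongo image, and the group ids here are illustrative):

```yaml
spec:
  template:
    spec:
      securityContext:
        runAsUser: 999   # uid of the mongodb user in the official mongo image
        runAsGroup: 999  # illustrative value
        fsGroup: 999     # kubelet makes the mounted volume group-owned by this gid
```

With fsGroup set, the kubelet changes group ownership of the mounted volume so the non-root process can write to it.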
You should have a look at setting the umask on the directory:
http://www.cyberciti.biz/tips/understanding-linux-unix-umask-value-usage.html
That will ensure new files in the directory are created with the specified permissions and ownership.
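As a quick illustration of how umask shapes the mode of newly created files (a generic shell sketch, not specific to the mongo container):

```shell
# With umask 022, a newly created file comes out as mode 644 (rw-r--r--):
# write permission is masked out for group and others.
umask 022
demo="/tmp/umask_demo_$$"
rm -f "$demo"
touch "$demo"
ls -l "$demo"
rm -f "$demo"
```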
Related
I'm trying to run MongoDB within a Kubernetes cluster, secured with a keyFile. For this, I created a simple StatefulSet and a ConfigMap, where I stored the keyfile:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongodb
spec:
  serviceName: mongodb
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
        - name: mongodb
          image: mongo:4.4
          args:
            - --bind_ip
            - '0.0.0.0,::'
            - --replSet
            - MySetname01
            - --auth
            - --keyFile
            - /etc/mongodb/keyfile/keyfile
          env:
            - name: MONGO_INITDB_ROOT_USERNAME
              value: MyUsername
            - name: MONGO_INITDB_ROOT_PASSWORD
              value: MyPassword
          ports:
            - containerPort: 27017
              name: mongodb
          volumeMounts:
            - name: mongodb-persistent-storage
              mountPath: /data/db
            - name: mongodb-keyfile
              mountPath: /etc/mongodb/keyfile
              readOnly: true
      volumes:
        - name: mongodb-keyfile
          configMap:
            name: mongodb-keyfile
  volumeClaimTemplates:
    - metadata:
        name: mongodb-persistent-storage
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi
---
apiVersion: v1
kind: Service
metadata:
  name: mongodb
  labels:
    app: mongodb
spec:
  ports:
    - port: 27017
  selector:
    app: mongodb
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: mongodb-keyfile
data:
  keyfile: |
    +PN6gXEU8NeRsyjlWDnTesHCoPOn6uQIEI5pNorDkphREi6RyoSHCIaXOzLrUpPq
    jpSGhSc5/MZj17R7K5anjerhvR6f5JtWjBuQcrjdJdNBceck71F2bly/u9ICfCOy
    STFzv6foMQJBJTBYqLwtfyEO7CQ9ywodM0K5r9jtT7x5BiJaqso+F8VN/VFtIYpe
    vnzKj7uU3GwDbmw6Yduybgv6P88BGXyW3w6HG8VLMgud5aV7wxIIPE6nAcr2nYmM
    1BqC7wp8G6uCcMiHx5pONPA5ONYAIF+u3zj2wAthgMe2UeQxx2L2ERx8Zdsa9HLR
    qYOmy9XhfolwdCTwwYvqYRO+RqXGoPGczenC/CKJPj14yfkua+0My5NBWvpL/fIB
    osu0lQNw1vFu0rcT1/9OcaJHuwFWocec2qBih9tk2C3c7jNMuxkPo3dxjv8J/lex
    vN3Et6tK/wDsQo2+8j6uLYkPFQbHZJQzf/oQiekV4RaC6/pejAf9fSAo4zbQXh29
    8BIMpRL3fik+hvamjrtS/45yfqGf/Q5DQ7o8foI4HYmhy+SU2+Bxyc0ZLTn659zl
    myesNjB6uC9lMWtpjas0XphNy8GvJxfjvz+bckccPUVczxyC3QSEIcVMMH9vhzes
    AcQscswhFMgzp1Z0fbNKy0FqQiDy1hUSir06ZZ3xBGLKeIySRsw9D1Pyh1Y11HlH
    NdGwF14cLqm53TGVd9gYeIAm2siQYMKm8rEjxmecc3yGgn0B69gtMcBmxr+z3xMU
    X256om6l8L2BJjm3W1zUTiZABuKzeNKjhmXQdEFPQvxhubvCinTYs68XL76ZdVdJ
    Q909MmllkOXKbAhi/TMdWmpV9nhINUCBrnu3F08jAQ3UkmVb923XZBzcbdPlpuHe
    Orp11/f3Dke4x0niqATccidRHf6Hz+ufVkwIrucBZwcHhK4SBY/RU90n233nV06t
    JXlBl/4XjWifB7iJi9mxy/66k
The problem is: MongoDB stays in a CrashLoopBackOff because the permissions on the keyfile are too open:
{"t":{"$date":"2022-12-19T12:41:41.399+00:00"},"s":"I", "c":"CONTROL", "id":23285, "ctx":"main","msg":"Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'"}
{"t":{"$date":"2022-12-19T12:41:41.402+00:00"},"s":"I", "c":"NETWORK", "id":4648601, "ctx":"main","msg":"Implicit TCP FastOpen unavailable. If TCP FastOpen is required, set tcpFastOpenServer, tcpFastOpenClient, and tcpFastOpenQueueSize."}
{"t":{"$date":"2022-12-19T12:41:41.402+00:00"},"s":"I", "c":"ACCESS", "id":20254, "ctx":"main","msg":"Read security file failed","attr":{"error":{"code":30,"codeName":"InvalidPath","errmsg":"permissions on /etc/mongodb/keyfile/keyfile are too open"}}}
I don't have an explanation for this.
I already set the volume mount of the ConfigMap to readOnly (as you can see in the mongo StatefulSet). I also tried using commands and lifecycle hooks to chmod 600/400 the file, and I tried different versions of MongoDB, but I always got the same error.
Of course I also checked whether the ConfigMap is included correctly; it is (I uncommented the args and username/password for that one).
The permissions are shown as:
lrwxrwxrwx 1 root root 14 Dec 19 12:50 keyfile -> ..data/keyfile
Maybe it's related to the fact that the file is shown as a symlink?
I expect a Kubernetes YAML which is able to start with a keyfile. Thank you very much.
EDIT: I tried to mount the file directly with subPath, not as a link. Now I get the following permissions:
-rw-r--r-- 1 root root 1001 Dec 19 13:34 mongod.key
But sadly the DB will not start with that one either; it still crashes with the same error.
EDIT2:
Adding defaultMode: 0600 to the volume in the StatefulSet at least led to the correct permissions, but also to another error (already mentioned in one of my comments):
file: /var/lib/mongo/mongod.key: bad file
So I tried mounting it in different places in the pod (here /var/lib/, for example), and I tried including the keyfile as a Secret instead. But none of it works.
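For completeness, the defaultMode change from EDIT2 looked roughly like this (a sketch of the volumes section from my StatefulSet; the names match the manifests above):

```yaml
volumes:
  - name: mongodb-keyfile
    configMap:
      name: mongodb-keyfile
      defaultMode: 0600   # yields -rw------- on the projected keyfile
```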
I am trying to deploy a single-instance MongoDB inside a Kubernetes cluster (RKE2, specifically) on an AWS EC2 instance running Red Hat 8.5. I am just trying to use the local file system, i.e. no EBS. I am having trouble getting my application to work with persistent volumes, so I have a few questions. Below is my pv.yaml:
kind: Namespace
apiVersion: v1
metadata:
  name: mongo
  labels:
    name: mongo
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mongodb-pv
  namespace: mongo
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/home/ec2-user/database"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongodb-pvc
  namespace: mongo
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
And here is my mongo deployment (I know having the user/password in plain text is not secure, but this is for the sake of the example):
apiVersion: v1
kind: Pod
metadata:
  name: mongodb-pod
  namespace: mongo
  labels:
    app.kubernetes.io/name: mongodb-pod
spec:
  containers:
    - name: mongo
      image: mongo:latest
      imagePullPolicy: Always
      ports:
        - containerPort: 27017
          name: mongodb-cp
      env:
        - name: MONGO_INITDB_ROOT_USERNAME
          value: "user"
        - name: MONGO_INITDB_ROOT_PASSWORD
          value: "password"
      volumeMounts:
        - mountPath: /data/db
          name: mongodb-storage
  volumes:
    - name: mongodb-storage
      persistentVolumeClaim:
        claimName: mongodb-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: mongodb
  namespace: mongo
spec:
  selector:
    app.kubernetes.io/name: mongodb-pod
  ports:
    - name: mongodb-cp
      port: 27017
      targetPort: mongodb-cp
When I run the above configuration files, I get the following errors from the mongo pod:
find: '/data/db': Permission denied
chown: changing ownership of '/data/db': Permission denied
I tried creating a mongodb user on the host with a uid and gid of 1001, since that is the process owner inside the mongo container, and chowning the hostPath mentioned above, but the error persists.
I have tried adding a securityContext block at both the pod and container level like so:
securityContext:
  runAsUser: 1001
  runAsGroup: 1001
  fsGroup: 1001
which does get me further, but I now get the following error:
{"t":{"$date":"2022-06-02T20:32:13.015+00:00"},"s":"E", "c":"CONTROL", "id":20557, "ctx":"initandlisten","msg":"DBException in initAndListen, terminating","attr":{"error":"IllegalOperation: Attempted to create a lock file on a read-only directory: /data/db"}}
and then the pod dies. If I set the container securityContext to privileged
securityContext:
  privileged: true
Everything runs fine. So the two questions are: is it secure to run a pod as privileged? If not (which is my assumption), what is the correct and secure way to use persistent volumes with the above configuration/example?
I'm trying to deploy PostgreSQL on Azure Kubernetes Service with data persistency, so I'm using a PVC.
I searched lots of posts on here; most of them offered YAML files like the ones below, but I'm getting this error:
chmod: changing permissions of '/var/lib/postgresql/data/pgdata': Operation not permitted
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
initdb: error: could not change permissions of directory "/var/lib/postgresql/data/pgdata": Operation not permitted
fixing permissions on existing directory /var/lib/postgresql/data/pgdata ...
The deployment YAML file is below:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgresql
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgresql
  template:
    metadata:
      labels:
        app: postgresql
    spec:
      containers:
        - name: postgresql
          image: postgres:13.2
          securityContext:
            runAsUser: 999
          imagePullPolicy: "IfNotPresent"
          ports:
            - containerPort: 5432
          envFrom:
            - secretRef:
                name: postgresql-secret
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: postgredb-kap
      volumes:
        - name: postgredb-kap
          persistentVolumeClaim:
            claimName: postgresql-pvc
The Secret YAML is below:
apiVersion: v1
kind: Secret
metadata:
  name: postgresql-secret
type: Opaque
data:
  POSTGRES_DB: a2V5sd4=
  POSTGRES_USER: cG9zdGdyZXNhZG1pbg==
  POSTGRES_PASSWORD: c234Rw==
  PGDATA: L3Za234dGF0YQ==
The PVC and StorageClass YAML files are below:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: postgresql-pvc
  labels:
    app: postgresql
spec:
  storageClassName: postgresql-sc
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: postgresql-sc
mountOptions:
  - dir_mode=0777
  - file_mode=0777
  - uid=1000
  - gid=1000
parameters:
  skuName: Standard_LRS
provisioner: kubernetes.io/azure-file
reclaimPolicy: Retain
So when I use the mount path "- mountPath: /var/lib/postgresql/", it works: I can reach the DB and it's good. But when I delete the pod and it is recreated, there is no DB! So no data persistency.
Can you please help? What am I missing here?
Thanks!
One thing you could try is to change uid=1000,gid=1000 in the mount options to 999, since this is the uid of the postgres user in the postgres container (I didn't test this).
Another solution that will certainly solve this issue involves init containers.
The postgres container needs to start as root to be able to chown the pgdata dir, since it is mounted as a root-owned directory. After it does this, it drops root privileges and runs as the postgres user.
But you can use an init container (running as root) to chown the volume dir so that you can run the main container as non-root.
Here is an example:
initContainers:
  - name: init
    image: alpine
    command: ["sh", "-c", "chown 999:999 /var/lib/postgresql/data"]
    volumeMounts:
      - mountPath: /var/lib/postgresql/data
        name: postgredb-kap
Based on the helpful answer from Matt: for Bitnami PostgreSQL the initContainer also works, but with a slightly different configuration:
initContainers:
  - name: init
    image: alpine
    command: ["sh", "-c", "chown 1001:1001 /bitnami/postgresql"]
    volumeMounts:
      - mountPath: /bitnami/postgresql
        name: postgres-volume
Recently, the managed pod in my mongo deployment on GKE was automatically deleted and a new one was created in its place. As a result, all my DB data was lost.
I specified a PV for the deployment and the PVC was bound too, and I used the standard storage class (google persistent disk). The Persistent Volume Claim had not been deleted either.
Here's an image of the result from kubectl get pv:
[screenshot: kubectl get pv output]
My mongo deployment, along with the persistent volume claim and service, was created by using Kubernetes' kompose tool from a docker-compose.yml for a prisma 1 + mongodb deployment.
Here are my yamls:
mongo-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    kompose.cmd: kompose -f docker-compose.yml convert
    kompose.version: 1.21.0 (992df58d8)
  creationTimestamp: null
  labels:
    io.kompose.service: mongo
  name: mongo
  namespace: dbmode
spec:
  replicas: 1
  selector:
    matchLabels:
      io.kompose.service: mongo
  strategy:
    type: Recreate
  template:
    metadata:
      annotations:
        kompose.cmd: kompose -f docker-compose.yml convert
        kompose.version: 1.21.0 (992df58d8)
      creationTimestamp: null
      labels:
        io.kompose.service: mongo
    spec:
      containers:
        - env:
            - name: MONGO_INITDB_ROOT_PASSWORD
              value: prisma
            - name: MONGO_INITDB_ROOT_USERNAME
              value: prisma
          image: mongo:3.6
          imagePullPolicy: ""
          name: mongo
          ports:
            - containerPort: 27017
          resources: {}
          volumeMounts:
            - mountPath: /var/lib/mongo
              name: mongo
      restartPolicy: Always
      serviceAccountName: ""
      volumes:
        - name: mongo
          persistentVolumeClaim:
            claimName: mongo
status: {}
mongo-persistentvolumeclaim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  creationTimestamp: null
  labels:
    io.kompose.service: mongo
  name: mongo
  namespace: dbmode
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
status: {}
mongo-service.yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    kompose.cmd: kompose -f docker-compose.yml convert
    kompose.version: 1.21.0 (992df58d8)
  creationTimestamp: null
  labels:
    io.kompose.service: mongo
  name: mongo
  namespace: dbmode
spec:
  ports:
    - name: "27017"
      port: 27017
      targetPort: 27017
  selector:
    io.kompose.service: mongo
status:
  loadBalancer: {}
I've tried checking the contents mounted in /var/lib/mongo, and all I got was an empty lost+found/ folder. I've also tried searching the Google persistent disks, but there was nothing in the root directory and I didn't know where else to look.
I guess that for some reason the mongo deployment is not pulling the old data from the persistent volume when it starts a new pod, which is extremely perplexing.
I also have another Kubernetes project where the same thing happened, except that the old pod was still shown, with an Evicted status.
I've tried checking the contents mounted in /var/lib/mongo and all I got was an empty lost+found/ folder
OK, but have you checked whether it was actually saving data there before the Pod restart and the data loss? I guess it was never saving any data in that directory.
I checked the image you used by running a simple Pod:
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
    - name: my-pod
      image: mongo:3.6
When you connect to it by running:
kubectl exec -ti my-pod -- /bin/bash
and check the default mongo configuration file:
root@my-pod:/var/lib# cat /etc/mongod.conf.orig
# mongod.conf

# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/

# Where and how to store data.
storage:
  dbPath: /var/lib/mongodb # 👈
  journal:
    enabled: true
#  engine:
#  mmapv1:
#  wiredTiger:
you can see, among other things, that dbPath is actually set to /var/lib/mongodb and NOT to /var/lib/mongo.
So chances are that your mongo wasn't actually saving any data to your PV, i.e. to the /var/lib/mongo directory where it was mounted, but to /var/lib/mongodb, as stated in its configuration file.
You should be able to check this easily by kubectl exec-ing into your running mongo pod:
kubectl exec -ti <mongo-pod-name> -- /bin/bash
and verifying where the data is saved.
If you didn't override the original config file in any way (e.g. by providing a ConfigMap), mongo should save its data to /var/lib/mongodb, and this directory, not being a mount point for your volume, is part of the Pod's filesystem and is ephemeral.
Update:
The above-mentioned /etc/mongod.conf.orig is only a template, so it doesn't reflect the actual configuration that has been applied.
If you run:
kubectl logs your-mongo-pod
it will show where the data directory is located:
$ kubectl logs my-pod
2020-12-16T22:20:47.472+0000 I CONTROL [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=my-pod
2020-12-16T22:20:47.473+0000 I CONTROL [initandlisten] db version v3.6.21
...
As we can see, data is saved in /data/db:
dbpath=/data/db
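Given that log line, a minimal fix is to mount the PVC at the directory mongod actually uses. A sketch of the adjusted volumeMounts section (volume name as in the Deployment above):

```yaml
volumeMounts:
  - mountPath: /data/db   # the actual dbpath reported in the startup log
    name: mongo
```

With the mount point matching dbpath, the data survives pod restarts because it lands on the persistent volume instead of the ephemeral container filesystem.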
In Google Container Engine (GKE) Kubernetes I have 3 nodes, each with 3.75 GB of RAM.
I also have an API that is called from a single endpoint. This endpoint makes batch inserts into MongoDB like this:
IMongoCollection<T> stageCollection = Database.GetCollection<T>(StageName);

foreach (var batch in entites.Batch(1000))
{
    await stageCollection.InsertManyAsync(batch);
}
Now it happens very often that we end up in out-of-memory scenarios.
On the one hand we limited wiredTigerCacheSizeGB to 1.5, and on the other hand we defined a resource limit for the pod.
But still the same result.
To me it looks like MongoDB isn't aware of the memory limit of the pod it is running in.
Is this a known issue? How can I deal with it without scaling to "monster" machines?
The configuration YAML looks like this:
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
---
apiVersion: v1
kind: Service
metadata:
  name: mongo
  labels:
    name: mongo
spec:
  ports:
    - port: 27017
      targetPort: 27017
  clusterIP: None
  selector:
    role: mongo
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: "mongo"
  replicas: 1
  template:
    metadata:
      labels:
        role: mongo
        environment: test
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: mongo
          image: mongo:3.6
          command:
            - mongod
            - "--replSet"
            - rs0
            - "--bind_ip"
            - "0.0.0.0"
            - "--noprealloc"
            - "--wiredTigerCacheSizeGB"
            - "1.5"
          resources:
            limits:
              memory: "2Gi"
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: mongo-persistent-storage
              mountPath: /data/db
        - name: mongo-sidecar
          image: cvallance/mongo-k8s-sidecar
          env:
            - name: MONGO_SIDECAR_POD_LABELS
              value: "role=mongo,environment=test"
  volumeClaimTemplates:
    - metadata:
        name: mongo-persistent-storage
        annotations:
          volume.beta.kubernetes.io/storage-class: "fast"
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 32Gi
UPDATE
In the meantime I also configured pod anti-affinity to make sure that on the nodes where MongoDB is running we don't have any interference in RAM, but we still get the OOM scenarios.
I'm facing a similar issue where the pod gets OOMKilled even with resource limits and the WiredTiger cache limit set.
This PR tackles the issue of MongoDB taking the node's memory into account rather than the container's memory limit.
In your case my advice is to update the MongoDB container image to a more recent version (the PR fix landed in 3.6.13, and you are running 3.6).
It may still be the case that your pod gets OOMKilled, given that I'm using 4.4.10 and am still facing this issue.
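For illustration, the suggested image bump is a one-line change in the StatefulSet (the exact patch tag here is an assumption; pick any 3.6.x release at or above 3.6.13, where the fix landed):

```yaml
containers:
  - name: mongo
    image: mongo:3.6.23   # assumed patch tag; any 3.6.x >= 3.6.13 includes the cgroup memory fix
```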