redis cluster in Kubernetes doesn't write nodes.conf file - kubernetes

I'm trying to set up a Redis cluster and I followed this guide here: https://rancher.com/blog/2019/deploying-redis-cluster/
Basically I'm creating a StatefulSet with a replica 6, so that I can have 3 master nodes and 3 slave nodes.
After all the nodes are up, I create the cluster, and it all works fine... but if I look into the file "nodes.conf" (where the configuration of all the nodes should be saved) of each redis node, I can see it's empty.
This is a problem, because whenever a redis node gets restarted, it searches into that file for the configuration of the node to update the IP address of itself and MEET the other nodes, but he finds nothing, so it basically starts a new cluster on his own, with a new ID.
My storage is an NFS connected shared folder. The YAML responsible for the storage access is this one:
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
name: nfs-provisioner-raid5
spec:
replicas: 1
strategy:
type: Recreate
template:
metadata:
labels:
app: nfs-provisioner-raid5
spec:
serviceAccountName: nfs-provisioner-raid5
containers:
- name: nfs-provisioner-raid5
image: quay.io/external_storage/nfs-client-provisioner:latest
volumeMounts:
- name: nfs-raid5-root
mountPath: /persistentvolumes
env:
- name: PROVISIONER_NAME
value: 'nfs.raid5'
- name: NFS_SERVER
value: 10.29.10.100
- name: NFS_PATH
value: /raid5
volumes:
- name: nfs-raid5-root
nfs:
server: 10.29.10.100
path: /raid5
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: nfs-provisioner-raid5
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: nfs.raid5
provisioner: nfs.raid5
parameters:
archiveOnDelete: "false"
This is the YAML of the redis cluster StatefulSet:
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: redis-cluster
labels:
app: redis-cluster
spec:
serviceName: redis-cluster
replicas: 6
selector:
matchLabels:
app: redis-cluster
template:
metadata:
labels:
app: redis-cluster
spec:
containers:
- name: redis
image: redis:5-alpine
ports:
- containerPort: 6379
name: client
- containerPort: 16379
name: gossip
command: ["/conf/fix-ip.sh", "redis-server", "/conf/redis.conf"]
readinessProbe:
exec:
command:
- sh
- -c
- "redis-cli -h $(hostname) ping"
initialDelaySeconds: 15
timeoutSeconds: 5
livenessProbe:
exec:
command:
- sh
- -c
- "redis-cli -h $(hostname) ping"
initialDelaySeconds: 20
periodSeconds: 3
env:
- name: POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
volumeMounts:
- name: conf
mountPath: /conf
readOnly: false
- name: data
mountPath: /data
readOnly: false
volumes:
- name: conf
configMap:
name: redis-cluster
defaultMode: 0755
volumeClaimTemplates:
- metadata:
name: data
labels:
name: redis-cluster
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: nfs.raid5
resources:
requests:
storage: 1Gi
This is the configMap:
apiVersion: v1
kind: ConfigMap
metadata:
name: redis-cluster
labels:
app: redis-cluster
data:
fix-ip.sh: |
#!/bin/sh
CLUSTER_CONFIG="/data/nodes.conf"
echo "creating nodes"
if [ -f ${CLUSTER_CONFIG} ]; then
if [ -z "${POD_IP}" ]; then
echo "Unable to determine Pod IP address!"
exit 1
fi
echo "Updating my IP to ${POD_IP} in ${CLUSTER_CONFIG}"
sed -i.bak -e "/myself/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/${POD_IP}/" ${CLUSTER_CONFIG}
echo "done"
fi
exec "$#"
redis.conf: |+
cluster-enabled yes
cluster-require-full-coverage no
cluster-node-timeout 15000
cluster-config-file /data/nodes.conf
cluster-migration-barrier 1
appendonly yes
protected-mode no
and I created the cluster using the command:
kubectl exec -it redis-cluster-0 -- redis-cli --cluster create --cluster-replicas 1 $(kubectl get pods -l app=redis-cluster -o jsonpath='{range.items[*]}{.status.podIP}:6379 ')
what am I doing wrong?
this is what I see into the /data folder:
the nodes.conf file shows 0 bytes.
Lastly, this is the log from the redis-cluster-0 pod:
creating nodes
1:C 07 Nov 2019 13:01:31.166 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 07 Nov 2019 13:01:31.166 # Redis version=5.0.4, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 07 Nov 2019 13:01:31.166 # Configuration loaded
1:M 07 Nov 2019 13:01:31.179 * No cluster configuration found, I'm e55801f9b5d52f4e599fe9dba5a0a1e8dde2cdcb
1:M 07 Nov 2019 13:01:31.182 * Running mode=cluster, port=6379.
1:M 07 Nov 2019 13:01:31.182 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 07 Nov 2019 13:01:31.182 # Server initialized
1:M 07 Nov 2019 13:01:31.182 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:M 07 Nov 2019 13:01:31.185 * Ready to accept connections
1:M 07 Nov 2019 13:08:04.264 # configEpoch set to 1 via CLUSTER SET-CONFIG-EPOCH
1:M 07 Nov 2019 13:08:04.306 # IP address for this node updated to 10.40.0.27
1:M 07 Nov 2019 13:08:09.216 # Cluster state changed: ok
1:M 07 Nov 2019 13:08:10.144 * Replica 10.44.0.14:6379 asks for synchronization
1:M 07 Nov 2019 13:08:10.144 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for '27972faeb07fe922f1ab581cac0fe467c85c3efd', my replication IDs are '31944091ef93e3f7c004908e3ff3114fd733ea6a' and '0000000000000000000000000000000000000000')
1:M 07 Nov 2019 13:08:10.144 * Starting BGSAVE for SYNC with target: disk
1:M 07 Nov 2019 13:08:10.144 * Background saving started by pid 1041
1041:C 07 Nov 2019 13:08:10.161 * DB saved on disk
1041:C 07 Nov 2019 13:08:10.161 * RDB: 0 MB of memory used by copy-on-write
1:M 07 Nov 2019 13:08:10.233 * Background saving terminated with success
1:M 07 Nov 2019 13:08:10.243 * Synchronization with replica 10.44.0.14:6379 succeeded
thank you for the help.

Looks to be an issue with the shell script that is mounted from configmap. can you update as below
fix-ip.sh: |
#!/bin/sh
CLUSTER_CONFIG="/data/nodes.conf"
echo "creating nodes"
if [ -f ${CLUSTER_CONFIG} ]; then
echo "[ INFO ]File:${CLUSTER_CONFIG} is Found"
else
touch $CLUSTER_CONFIG
fi
if [ -z "${POD_IP}" ]; then
echo "Unable to determine Pod IP address!"
exit 1
fi
echo "Updating my IP to ${POD_IP} in ${CLUSTER_CONFIG}"
sed -i.bak -e "/myself/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/${POD_IP}/" ${CLUSTER_CONFIG}
echo "done"
exec "$#"
I just deployed with the updated script and it worked. see below the output
master $ kubectl get po
NAME READY STATUS RESTARTS AGE
redis-cluster-0 1/1 Running 0 83s
redis-cluster-1 1/1 Running 0 54s
redis-cluster-2 1/1 Running 0 45s
redis-cluster-3 1/1 Running 0 38s
redis-cluster-4 1/1 Running 0 31s
redis-cluster-5 1/1 Running 0 25s
master $ kubectl exec -it redis-cluster-0 -- redis-cli --cluster create --cluster-replicas 1 $(kubectl getpods -l app=redis-cluster -o jsonpath='{range.items[*]}{.status.podIP}:6379 ')
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 10.40.0.4:6379 to 10.40.0.1:6379
Adding replica 10.40.0.5:6379 to 10.40.0.2:6379
Adding replica 10.40.0.6:6379 to 10.40.0.3:6379
M: 9984141f922bed94bfa3532ea5cce43682fa524c 10.40.0.1:6379
slots:[0-5460] (5461 slots) master
M: 76ebee0dd19692c2b6d95f0a492d002cef1c6c17 10.40.0.2:6379
slots:[5461-10922] (5462 slots) master
M: 045b27c73069bff9ca9a4a1a3a2454e9ff640d1a 10.40.0.3:6379
slots:[10923-16383] (5461 slots) master
S: 1bc8d1b8e2d05b870b902ccdf597c3eece7705df 10.40.0.4:6379
replicates 9984141f922bed94bfa3532ea5cce43682fa524c
S: 5b2b019ba8401d3a8c93a8133db0766b99aac850 10.40.0.5:6379
replicates 76ebee0dd19692c2b6d95f0a492d002cef1c6c17
S: d4b91700b2bb1a3f7327395c58b32bb4d3521887 10.40.0.6:6379
replicates 045b27c73069bff9ca9a4a1a3a2454e9ff640d1a
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
....
>>> Performing Cluster Check (using node 10.40.0.1:6379)
M: 9984141f922bed94bfa3532ea5cce43682fa524c 10.40.0.1:6379
slots:[0-5460] (5461 slots) master
1 additional replica(s)
M: 045b27c73069bff9ca9a4a1a3a2454e9ff640d1a 10.40.0.3:6379
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
S: 1bc8d1b8e2d05b870b902ccdf597c3eece7705df 10.40.0.4:6379
slots: (0 slots) slave
replicates 9984141f922bed94bfa3532ea5cce43682fa524c
S: d4b91700b2bb1a3f7327395c58b32bb4d3521887 10.40.0.6:6379
slots: (0 slots) slave
replicates 045b27c73069bff9ca9a4a1a3a2454e9ff640d1a
M: 76ebee0dd19692c2b6d95f0a492d002cef1c6c17 10.40.0.2:6379
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: 5b2b019ba8401d3a8c93a8133db0766b99aac850 10.40.0.5:6379
slots: (0 slots) slave
replicates 76ebee0dd19692c2b6d95f0a492d002cef1c6c17
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
master $ kubectl exec -it redis-cluster-0 -- redis-cli cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:61
cluster_stats_messages_pong_sent:76
cluster_stats_messages_sent:137
cluster_stats_messages_ping_received:71
cluster_stats_messages_pong_received:61
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:137
master $ for x in $(seq 0 5); do echo "redis-cluster-$x"; kubectl exec redis-cluster-$x -- redis-cli role;echo; done
redis-cluster-0
master
588
10.40.0.4
6379
588
redis-cluster-1
master
602
10.40.0.5
6379
602
redis-cluster-2
master
588
10.40.0.6
6379
588
redis-cluster-3
slave
10.40.0.1
6379
connected
602
redis-cluster-4
slave
10.40.0.2
6379
connected
602
redis-cluster-5
slave
10.40.0.3
6379
connected
588

Related

could not write to mounted volume when using non-root user (same when using fsGroup)

I hope I miss something, but could not find answer anywhere. I am using ROOK storage class resource to provision PV that are later attached to POD. in example adding 3 volumes (1 emptyDir, and 2 volumes from ROOK cluster)
test.yaml
apiVersion: v1
kind: Pod
metadata:
name: security-context-demo
namespace: misc
spec:
securityContext:
runAsUser: 1000
runAsGroup: 3000
fsGroup: 2000
# runAsNonRoot: true
volumes:
- name: keymanager-keys
persistentVolumeClaim:
claimName: keymanager-pvc
readOnly: false
- name: keymanager-keyblock
persistentVolumeClaim:
claimName: keymanager-block-pvc
readOnly: false
- name: keymanager-key-n
persistentVolumeClaim:
claimName: keymanager-ext4x-pvc
readOnly: false
- name: local-keys
emptyDir: {}
- name: ephemeral-storage
ephemeral:
volumeClaimTemplate:
spec:
accessModes: ["ReadWriteMany"]
storageClassName: "ceph-filesystem"
resources:
requests:
storage: 10Mi
containers:
- name: sec-ctx-demo
image: busybox:1.28
command: [ "sh", "-c", "sleep 1h" ]
# securityContext:
# privileged: true
volumeMounts:
- name: keymanager-keys
mountPath: /data/pvc-filesystem-preconfigured
- name: local-keys
mountPath: /data/emptydir
- name: ephemeral-storage
mountPath: /data/pvc-filesystem-ephemeral-storage
- name: keymanager-keyblock
mountPath: /data/pvc-block-storage
- name: keymanager-key-n
mountPath: /data/testing-with-fstypes
POD information:
kubectl get pods -n misc
NAME READY STATUS RESTARTS AGE
security-context-demo 1/1 Running 15 (30m ago) 15h
kubectl exec -it -n misc security-context-demo -- sh
/ $ ls -l /data/
total 5
drwxrwsrwx 2 root 2000 4096 Feb 13 17:31 emptydir
drwxrwsr-x 3 root 2000 1024 Feb 13 07:08 pvc-block-storage
drwxr-xr-x 2 root root 0 Feb 13 17:31 pvc-filesystem-ephemeral-storage
drwxr-xr-x 2 root root 0 Feb 9 11:29 pvc-filesystem-preconfigured
drwxr-xr-x 2 root root 0 Feb 13 08:28 testing-with-fstypes
/data/emptydir and /data/pvc-block-storage directories are not mounted as expected to group 2000.
valumes that uses Rook CephFS ignores fsGroup field in the PodSecurityContext.If Rook filesystem support fsGroup this should not be happening.
Expected behavior:
would like to see mounted volumes to specific group (that will be used by unprivileged users). Don't know weather doing something wrong or what is happening. Expecting to be able to write to mounted volume with app user.
Environment:
OS (e.g. from /etc/os-release): Debian GNU/Linux 11 (bullseye) Kernel (e.g. uname -a):
Linux worker1 5.10.0-9-amd64 1 SMP Debian 5.10.70-1 (2021-09-30) x86_64 GNU/Linux
Rook version (use rook version inside of a Rook Pod): v1.10.6
Storage backend version (e.g. for ceph do ceph -v): ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)
Kubernetes version (use kubectl version): v1.24.1
cluster:
id: 25e0134a-515e-4f62-8eec-ae5d8cb3e650
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,c,d (age 7w)
mgr: a(active, since 10w)
mds: 1/1 daemons up, 1 hot standby
osd: 3 osds: 3 up (since 11w), 3 in (since 11w)
rgw: 1 daemon active (1 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 12 pools, 185 pgs
objects: 421 objects, 1.2 MiB
usage: 1.5 GiB used, 249 GiB / 250 GiB avail
pgs: 185 active+clean
if more information needed let me know and I will update this post. also if helps can add PV and PVC configs that are created after Pod creation.

The server must be started by the user that owns the data directory

I am trying to get some persistant storage for a docker instance of PostgreSQL running on Kubernetes. However, the pod fails with
FATAL: data directory "/var/lib/postgresql/data" has wrong ownership
HINT: The server must be started by the user that owns the data directory.
This is the NFS configuration:
% exportfs -v
/srv/nfs/postgresql/postgres-registry
kubehost*.example.com(rw,wdelay,insecure,no_root_squash,no_subtree_check,sec=sys,rw,no_root_squash,no_all_squash)
$ ls -ldn /srv/nfs/postgresql/postgres-registry
drwxrwxrwx. 3 999 999 4096 Jul 24 15:02 /srv/nfs/postgresql/postgres-registry
$ ls -ln /srv/nfs/postgresql/postgres-registry
total 4
drwx------. 2 999 999 4096 Jul 25 08:36 pgdata
The full log from the pod:
2019-07-25T07:32:50.617532000Z The files belonging to this database system will be owned by user "postgres".
2019-07-25T07:32:50.618113000Z This user must also own the server process.
2019-07-25T07:32:50.619048000Z The database cluster will be initialized with locale "en_US.utf8".
2019-07-25T07:32:50.619496000Z The default database encoding has accordingly been set to "UTF8".
2019-07-25T07:32:50.619943000Z The default text search configuration will be set to "english".
2019-07-25T07:32:50.620826000Z Data page checksums are disabled.
2019-07-25T07:32:50.621697000Z fixing permissions on existing directory /var/lib/postgresql/data ... ok
2019-07-25T07:32:50.647445000Z creating subdirectories ... ok
2019-07-25T07:32:50.765065000Z selecting default max_connections ... 20
2019-07-25T07:32:51.035710000Z selecting default shared_buffers ... 400kB
2019-07-25T07:32:51.062039000Z selecting default timezone ... Etc/UTC
2019-07-25T07:32:51.062828000Z selecting dynamic shared memory implementation ... posix
2019-07-25T07:32:51.218995000Z creating configuration files ... ok
2019-07-25T07:32:51.252788000Z 2019-07-25 07:32:51.251 UTC [79] FATAL: data directory "/var/lib/postgresql/data" has wrong ownership
2019-07-25T07:32:51.253339000Z 2019-07-25 07:32:51.251 UTC [79] HINT: The server must be started by the user that owns the data directory.
2019-07-25T07:32:51.262238000Z child process exited with exit code 1
2019-07-25T07:32:51.263194000Z initdb: removing contents of data directory "/var/lib/postgresql/data"
2019-07-25T07:32:51.380205000Z running bootstrap script ...
The deployment has the following in:
securityContext:
runAsUser: 999
supplementalGroups: [999,1000]
fsGroup: 999
What am I doing wrong?
Edit: Added storage.yaml file:
kind: PersistentVolume
apiVersion: v1
metadata:
name: postgres-registry-pv-volume
spec:
capacity:
storage: 5Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
nfs:
server: 192.168.3.7
path: /srv/nfs/postgresql/postgres-registry
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: postgres-registry-pv-claim
labels:
app: postgres-registry
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 5Gi
Edit: And the full deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: postgres-registry
spec:
replicas: 1
template:
metadata:
labels:
app: postgres-registry
spec:
securityContext:
runAsUser: 999
supplementalGroups: [999,1000]
fsGroup: 999
containers:
- name: postgres-registry
image: postgres:latest
imagePullPolicy: "IfNotPresent"
ports:
- containerPort: 5432
env:
- name: POSTGRES_DB
value: postgresdb
- name: POSTGRES_USER
value: postgres
- name: POSTGRES_PASSWORD
value: Sekret
volumeMounts:
- mountPath: /var/lib/postgresql/data
subPath: "pgdata"
name: postgredb-registry-persistent-storage
volumes:
- name: postgredb-registry-persistent-storage
persistentVolumeClaim:
claimName: postgres-registry-pv-claim
Even more debugging adding:
command: ["/bin/bash", "-c"]
args:["id -u; ls -ldn /var/lib/postgresql/data"]
Which returned:
999
drwx------. 2 99 99 4096 Jul 25 09:11 /var/lib/postgresql/data
Clearly, the UID/GID are wrong. Why?
Even with the work around suggested by Jakub Bujny, I get this:
2019-07-25T09:32:08.734807000Z The files belonging to this database system will be owned by user "postgres".
2019-07-25T09:32:08.735335000Z This user must also own the server process.
2019-07-25T09:32:08.736976000Z The database cluster will be initialized with locale "en_US.utf8".
2019-07-25T09:32:08.737416000Z The default database encoding has accordingly been set to "UTF8".
2019-07-25T09:32:08.737882000Z The default text search configuration will be set to "english".
2019-07-25T09:32:08.738754000Z Data page checksums are disabled.
2019-07-25T09:32:08.739648000Z fixing permissions on existing directory /var/lib/postgresql/data ... ok
2019-07-25T09:32:08.766606000Z creating subdirectories ... ok
2019-07-25T09:32:08.852381000Z selecting default max_connections ... 20
2019-07-25T09:32:09.119031000Z selecting default shared_buffers ... 400kB
2019-07-25T09:32:09.145069000Z selecting default timezone ... Etc/UTC
2019-07-25T09:32:09.145730000Z selecting dynamic shared memory implementation ... posix
2019-07-25T09:32:09.168161000Z creating configuration files ... ok
2019-07-25T09:32:09.200134000Z 2019-07-25 09:32:09.199 UTC [70] FATAL: data directory "/var/lib/postgresql/data" has wrong ownership
2019-07-25T09:32:09.200715000Z 2019-07-25 09:32:09.199 UTC [70] HINT: The server must be started by the user that owns the data directory.
2019-07-25T09:32:09.208849000Z child process exited with exit code 1
2019-07-25T09:32:09.209316000Z initdb: removing contents of data directory "/var/lib/postgresql/data"
2019-07-25T09:32:09.274741000Z running bootstrap script ... 999
2019-07-25T09:32:09.278124000Z drwx------. 2 99 99 4096 Jul 25 09:32 /var/lib/postgresql/data
Using your setup and ensuring the nfs mount is owned by 999:999 it worked just fine.
You're also missing an 's' in your name: postgredb-registry-persistent-storage
And with your subPath: "pgdata" do you need to change the $PGDATA? I didn't include the subpath for this.
$ sudo mount 172.29.0.218:/test/nfs ./nfs
$ sudo su -c "ls -al ./nfs" postgres
total 8
drwx------ 2 postgres postgres 4096 Jul 25 14:44 .
drwxrwxr-x 3 rei rei 4096 Jul 25 14:44 ..
$ kubectl apply -f nfspv.yaml
persistentvolume/postgres-registry-pv-volume created
persistentvolumeclaim/postgres-registry-pv-claim created
$ kubectl apply -f postgres.yaml
deployment.extensions/postgres-registry created
$ sudo su -c "ls -al ./nfs" postgres
total 124
drwx------ 19 postgres postgres 4096 Jul 25 14:46 .
drwxrwxr-x 3 rei rei 4096 Jul 25 14:44 ..
drwx------ 3 postgres postgres 4096 Jul 25 14:46 base
drwx------ 2 postgres postgres 4096 Jul 25 14:46 global
drwx------ 2 postgres postgres 4096 Jul 25 14:46 pg_commit_ts
. . .
I noticed using nfs: directly in the persistent volume took significantly longer to initialize the database, whereas using hostPath: to the mounted nfs volume behaved normally.
So after a few minutes:
$ kubectl logs postgres-registry-675869694-9fp52 | tail -n 3
2019-07-25 21:50:57.181 UTC [30] LOG: database system is ready to accept connections
done
server started
$ kubectl exec -it postgres-registry-675869694-9fp52 psql
psql (11.4 (Debian 11.4-1.pgdg90+1))
Type "help" for help.
postgres=#
Checking the uid/gid
$ kubectl exec -it postgres-registry-675869694-9fp52 bash
postgres#postgres-registry-675869694-9fp52:/$ whoami && id -u && id -g
postgres
999
999
nfspv.yaml:
kind: PersistentVolume
apiVersion: v1
metadata:
name: postgres-registry-pv-volume
spec:
capacity:
storage: 5Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
nfs:
server: 172.29.0.218
path: /test/nfs
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: postgres-registry-pv-claim
labels:
app: postgres-registry
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 5Gi
postgres.yaml:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: postgres-registry
spec:
replicas: 1
template:
metadata:
labels:
app: postgres-registry
spec:
securityContext:
runAsUser: 999
supplementalGroups: [999,1000]
fsGroup: 999
containers:
- name: postgres-registry
image: postgres:latest
imagePullPolicy: "IfNotPresent"
ports:
- containerPort: 5432
env:
- name: POSTGRES_DB
value: postgresdb
- name: POSTGRES_USER
value: postgres
- name: POSTGRES_PASSWORD
value: Sekret
volumeMounts:
- mountPath: /var/lib/postgresql/data
name: postgresdb-registry-persistent-storage
volumes:
- name: postgresdb-registry-persistent-storage
persistentVolumeClaim:
claimName: postgres-registry-pv-claim
I cannot explain why those 2 IDs are different but as workaround I would try to override postgres's entrypoint with
command: ["/bin/bash", "-c"]
args: ["chown -R 999:999 /var/lib/postgresql/data && ./docker-entrypoint.sh postgres"]
This type of errors is quite common when you link a NTFS directory into your docker container. NTFS directories don't support ext3 file & directory access control. The only way to make it work is to link directory from a ext3 drive into your container.
I got a bit desperate when I played around Apache / PHP containers with linking the www folder. After I linked files reside on a ext3 filesystem the problem disappear.
I published a short Docker tutorial on youtube, may it helps to understand this problem: https://www.youtube.com/watch?v=eS9O05TTFjM

Kubernetes job pod completed successfully but one of the containers were not ready

I've got some strange looking behavior.
When a job is run, it completes successfully but one of the containers says it's not (or was not..) ready:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
default **********-migration-22-20-16-29-11-2018-xnffp 1/2 Completed 0 11h 10.4.5.8 gke-******
job yaml:
apiVersion: batch/v1
kind: Job
metadata:
name: migration-${timestamp_hhmmssddmmyy}
labels:
jobType: database-migration
spec:
backoffLimit: 0
template:
spec:
restartPolicy: Never
containers:
- name: app
image: "${appApiImage}"
imagePullPolicy: IfNotPresent
command:
- php
- artisan
- migrate
- name: cloudsql-proxy
image: gcr.io/cloudsql-docker/gce-proxy:1.11
command: ["/cloud_sql_proxy",
"-instances=${SQL_INSTANCE_NAME}=tcp:3306",
"-credential_file=/secrets/cloudsql/credentials.json"]
securityContext:
runAsUser: 2 # non-root user
allowPrivilegeEscalation: false
volumeMounts:
- name: cloudsql-instance-credentials
mountPath: /secrets/cloudsql
readOnly: true
volumes:
- name: cloudsql-instance-credentials
secret:
secretName: cloudsql-instance-credentials
What may be the cause of this behavior? There is no readiness or liveness probes defined on the containers.
If I do a describe on the pod, the relevant info is:
...
Command:
php
artisan
migrate
State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 29 Nov 2018 22:20:18 +0000
Finished: Thu, 29 Nov 2018 22:20:19 +0000
Ready: False
Restart Count: 0
Requests:
cpu: 100m
...
A Pod with a Ready status means it "is able to serve requests and should be added to the load balancing pools of all matching Services", see https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-conditions
In your case, you don't want to serve requests, but simply to execute php artisan migrate once, and done. So you don't have to worry about this status, the important part is the State: Terminated with a Reason: Completed and a zero exit code: your command did whatever and then exited successfully.
If the result of the command is not what you expected, you'd have to investigate the logs from the container that ran this command with kubectl logs your-pod -c app (where app is the name of the container you defined), and/or you would expect the php artisan migrate command to NOT issue a zero exit code.
In my case, I was using istio, and experienced the same issue, removing istio-sidecar from the job pod solves this problem.
my solution if using istio:
spec:
template:
metadata:
annotations:
sidecar.istio.io/inject: "false"

Why Kubernetes Pod gets into Terminated state giving Completed reason and exit code 0?

I am struggling to find any answer to this in the Kubernetes documentation. The scenario is the following:
Kubernetes version 1.4 over AWS
8 pods running a NodeJS API (Express) deployed as a Kubernetes Deployment
One of the pods gets restarted for no apparent reason late at night (no traffic, no CPU spikes, no memory pressure, no alerts...). Number of restarts is increased as a result of this.
Logs don't show anything abnormal (ran kubectl -p to see previous logs, no errors at all in there)
Resource consumption is normal, cannot see any events about Kubernetes rescheduling the pod into another node or similar
Describing the pod gives back TERMINATED state, giving back COMPLETED reason and exit code 0. I don't have the exact output from kubectl as this pod has been replaced multiple times now.
The pods are NodeJS server instances, they cannot complete, they are always running waiting for requests.
Would this be internal Kubernetes rearranging of pods? Is there any way to know when this happens? Shouldn't be an event somewhere saying why it happened?
Update
This just happened in our prod environment. The result of describing the offending pod is:
api:
Container ID: docker://7a117ed92fe36a3d2f904a882eb72c79d7ce66efa1162774ab9f0bcd39558f31
Image: 1.0.5-RC1
Image ID: docker://sha256:XXXX
Ports: 9080/TCP, 9443/TCP
State: Running
Started: Mon, 27 Mar 2017 12:30:05 +0100
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 24 Mar 2017 13:32:14 +0000
Finished: Mon, 27 Mar 2017 12:29:58 +0100
Ready: True
Restart Count: 1
Update 2
Here it is the deployment.yaml file used:
apiVersion: "extensions/v1beta1"
kind: "Deployment"
metadata:
namespace: "${ENV}"
name: "${APP}${CANARY}"
labels:
component: "${APP}${CANARY}"
spec:
replicas: ${PODS}
minReadySeconds: 30
revisionHistoryLimit: 1
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
maxSurge: 1
template:
metadata:
labels:
component: "${APP}${CANARY}"
spec:
serviceAccount: "${APP}"
${IMAGE_PULL_SECRETS}
containers:
- name: "${APP}${CANARY}"
securityContext:
capabilities:
add:
- IPC_LOCK
image: "134078050561.dkr.ecr.eu-west-1.amazonaws.com/${APP}:${TAG}"
env:
- name: "KUBERNETES_CA_CERTIFICATE_FILE"
value: "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
- name: "NAMESPACE"
valueFrom:
fieldRef:
fieldPath: "metadata.namespace"
- name: "ENV"
value: "${ENV}"
- name: "PORT"
value: "${INTERNAL_PORT}"
- name: "CACHE_POLICY"
value: "all"
- name: "SERVICE_ORIGIN"
value: "${SERVICE_ORIGIN}"
- name: "DEBUG"
value: "http,controllers:recommend"
- name: "APPDYNAMICS"
value: "true"
- name: "VERSION"
value: "${TAG}"
ports:
- name: "http"
containerPort: ${HTTP_INTERNAL_PORT}
protocol: "TCP"
- name: "https"
containerPort: ${HTTPS_INTERNAL_PORT}
protocol: "TCP"
The Dockerfile of the image referenced in the above Deployment manifest:
FROM ubuntu:14.04
ENV NVM_VERSION v0.31.1
ENV NODE_VERSION v6.2.0
ENV NVM_DIR /home/app/nvm
ENV NODE_PATH $NVM_DIR/v$NODE_VERSION/lib/node_modules
ENV PATH $NVM_DIR/v$NODE_VERSION/bin:$PATH
ENV APP_HOME /home/app
RUN useradd -c "App User" -d $APP_HOME -m app
RUN apt-get update; apt-get install -y curl
USER app
# Install nvm with node and npm
RUN touch $HOME/.bashrc; curl https://raw.githubusercontent.com/creationix/nvm/${NVM_VERSION}/install.sh | bash \
&& /bin/bash -c 'source $NVM_DIR/nvm.sh; nvm install $NODE_VERSION'
ENV NODE_PATH $NVM_DIR/versions/node/$NODE_VERSION/lib/node_modules
ENV PATH $NVM_DIR/versions/node/$NODE_VERSION/bin:$PATH
# Create app directory
WORKDIR /home/app
COPY . /home/app
# Install app dependencies
RUN npm install
EXPOSE 9080 9443
CMD [ "npm", "start" ]
npm start is an alias for a regular node app.js command that starts a NodeJS server on port 9080.
Check the version of docker you run, and whether the docker daemon was restarted during that time.
If the docker daemon was restarted, all the container would be terminated (unless you use the new "live restore" feature in 1.12). In some docker versions, docker may incorrectly reports "exit code 0" for all containers terminated in this situation. See https://github.com/docker/docker/issues/31262 for more details.
If this is still relevant, we just had a similar problem in our cluster.
We have managed to find more information by inspecting the logs from docker itself. ssh onto your k8s node and run the following:
sudo journalctl -fu docker.service
I hade similar problem when we upgraded to version 2.x Pos get restarted even after the Dags ran successfully.
I later resolved it after a long time of debugging by overriding the pod template and specifying it in the airflow.cfg file.
[kubernetes]
....
pod_template_file = {{ .Values.airflow.home }}/pod_template.yaml
---
# pod_template.yaml
apiVersion: v1
kind: Pod
metadata:
name: dummy-name
spec:
serviceAccountName: default
restartPolicy: Never
containers:
- name: base
image: dummy_image
imagePullPolicy: IfNotPresent
ports: []
command: []

Mongodb container's data becomes "read-only" after restarting kubernetes, with glusterfs as storage?

My mongo is running as a docker container on the kubernetes, with glusterfs providing persistent volume. After I restart kuberntes (the machine power off and restart), all the mongo pods cannot come back, their logs:
chown: changing ownership of `/data/db/user_management.ns': Read-only file system
chown: changing ownership of `/data/db/storage.bson': Read-only file system
chown: changing ownership of `/data/db/local.ns': Read-only file system
chown: changing ownership of `/data/db/mongod.lock': Read-only file system
Here /data/db/ is the mounted gluster volume and I can make sure it's rw mode!:
# kubectl get pod mongoxxx -o yaml
apiVersion: v1
kind: Pod
spec:
containers:
- image: mongo:3.0.5
imagePullPolicy: IfNotPresent
name: mongo
ports:
- containerPort: 27017
protocol: TCP
volumeMounts:
- mountPath: /data/db
name: mongo-storage
volumes:
- name: mongo-storage
persistentVolumeClaim:
claimName: auth-mongo-data
# kubectl describe pod mongoxxx
...
Volume Mounts:
/data/db from mongo-storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-wdrfp (ro)
Environment Variables: <none>
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
mongo-storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: auth-mongo-data
ReadOnly: false
...
# kubect get pv xxx
apiVersion: v1
kind: PersistentVolume
metadata:
annotations:
pv.kubernetes.io/bound-by-controller: "yes"
name: auth-mongo-data
resourceVersion: "215201"
selfLink: /api/v1/persistentvolumes/auth-mongo-data
uid: fb74a4b9-e0a3-11e6-b0d1-5254003b48ea
spec:
accessModes:
- ReadWriteMany
capacity:
storage: 4Gi
claimRef:
apiVersion: v1
kind: PersistentVolumeClaim
name: auth-mongo-data
namespace: default
glusterfs:
endpoints: glusterfs-cluster
path: infra-auth-mongo
persistentVolumeReclaimPolicy: Retain
status:
phase: Bound
And when I ls on the kubernetes node:
# ls -ls /var/lib/kubelet/pods/fc6c9ef3-e0a3-11e6-b0d1-5254003b48ea/volumes/kubernetes.io~glusterfs/auth-mongo-data/
total 163849
4 drwxr-xr-x. 2 mongo input 4096 1月 22 21:18 journal
65536 -rw-------. 1 mongo input 67108864 1月 22 21:16 local.0
16384 -rw-------. 1 mongo root 16777216 1月 23 17:15 local.ns
1 -rwxr-xr-x. 1 mongo root 2 1月 23 17:15 mongod.lock
1 -rw-r--r--. 1 mongo root 69 1月 23 17:15 storage.bson
4 drwxr-xr-x. 2 mongo input 4096 1月 22 21:18 _tmp
65536 -rw-------. 1 mongo input 67108864 1月 22 21:18 user_management.0
16384 -rw-------. 1 mongo root 16777216 1月 23 17:15 user_management.ns
I cannot chown though the volume is mounted as rw.
My host is CentOs 7.3:
Linux c4v160 3.10.0-514.el7.x86_64 #1 SMP Tue Nov 22 16:42:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux.
I guess that it is because the glusterfs volume I have provided is unclean. The glusterfs volume infra-auth-mongo may consist of dirty directories. One solution is to remove this volume and create another.
Another solution is to hack mongodb dockerfile, force it change the ownership of /data/db before starting mongodb process. Like this: https://github.com/harryge00/mongo/commit/143bfc317e431692010f09b5c0d1f28395d2055b