Prometheus & Alertmanager keep crashing after updating the EKS version to 1.16

prometheus-prometheus-kube-prometheus-prometheus-0 0/2 Terminating 0 4s
alertmanager-prometheus-kube-prometheus-alertmanager-0 0/2 Terminating 0 10s
After updating the EKS cluster from 1.15 to 1.16, everything works fine except these two pods: they keep terminating and are unable to initialise, so Prometheus monitoring does not work. I am getting the errors below when describing the pods.
Error: failed to start container "prometheus": Error response from daemon: OCI runtime create failed: container_linux.go:362: creating new parent process caused: container_linux.go:1941: running lstat on namespace path "/proc/29271/ns/ipc" caused: lstat /proc/29271/ns/ipc: no such file or directory: unknown
Error: failed to start container "config-reloader": Error response from daemon: cannot join network of a non running container: 7e139521980afd13dad0162d6859352b0b2c855773d6d4062ee3e2f7f822a0b3
Error: cannot find volume "config" to mount into container "config-reloader"
Error: cannot find volume "config" to mount into container "prometheus"
Here is the YAML of the pod:
apiVersion: v1
kind: Pod
metadata:
annotations:
kubernetes.io/psp: eks.privileged
creationTimestamp: "2021-04-30T16:39:14Z"
deletionGracePeriodSeconds: 600
deletionTimestamp: "2021-04-30T16:49:14Z"
generateName: prometheus-prometheus-kube-prometheus-prometheus-
labels:
app: prometheus
app.kubernetes.io/instance: prometheus-kube-prometheus-prometheus
app.kubernetes.io/managed-by: prometheus-operator
app.kubernetes.io/name: prometheus
app.kubernetes.io/version: 2.26.0
controller-revision-hash: prometheus-prometheus-kube-prometheus-prometheus-56d9fcf57
operator.prometheus.io/name: prometheus-kube-prometheus-prometheus
operator.prometheus.io/shard: "0"
prometheus: prometheus-kube-prometheus-prometheus
statefulset.kubernetes.io/pod-name: prometheus-prometheus-kube-prometheus-prometheus-0
name: prometheus-prometheus-kube-prometheus-prometheus-0
namespace: monitoring
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: StatefulSet
name: prometheus-prometheus-kube-prometheus-prometheus
uid: 326a09f2-319c-449d-904a-1dd0019c6d80
resourceVersion: "9337443"
selfLink: /api/v1/namespaces/monitoring/pods/prometheus-prometheus-kube-prometheus-prometheus-0
uid: e2be062f-749d-488e-a6cc-42ef1396851b
spec:
containers:
- args:
- --web.console.templates=/etc/prometheus/consoles
- --web.console.libraries=/etc/prometheus/console_libraries
- --config.file=/etc/prometheus/config_out/prometheus.env.yaml
- --storage.tsdb.path=/prometheus
- --storage.tsdb.retention.time=10d
- --web.enable-lifecycle
- --storage.tsdb.no-lockfile
- --web.external-url=http://prometheus-kube-prometheus-prometheus.monitoring:9090
- --web.route-prefix=/
image: quay.io/prometheus/prometheus:v2.26.0
imagePullPolicy: IfNotPresent
name: prometheus
ports:
- containerPort: 9090
name: web
protocol: TCP
readinessProbe:
failureThreshold: 120
httpGet:
path: /-/ready
port: web
scheme: HTTP
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 3
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /etc/prometheus/config_out
name: config-out
readOnly: true
- mountPath: /etc/prometheus/certs
name: tls-assets
readOnly: true
- mountPath: /prometheus
name: prometheus-prometheus-kube-prometheus-prometheus-db
- mountPath: /etc/prometheus/rules/prometheus-prometheus-kube-prometheus-prometheus-rulefiles-0
name: prometheus-prometheus-kube-prometheus-prometheus-rulefiles-0
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: prometheus-kube-prometheus-prometheus-token-mh66q
readOnly: true
- args:
- --listen-address=:8080
- --reload-url=http://localhost:9090/-/reload
- --config-file=/etc/prometheus/config/prometheus.yaml.gz
- --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
- --watched-dir=/etc/prometheus/rules/prometheus-prometheus-kube-prometheus-prometheus-rulefiles-0
command:
- /bin/prometheus-config-reloader
env:
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: SHARD
value: "0"
image: quay.io/prometheus-operator/prometheus-config-reloader:v0.47.0
imagePullPolicy: IfNotPresent
name: config-reloader
ports:
- containerPort: 8080
name: reloader-web
protocol: TCP
resources:
limits:
cpu: 100m
memory: 50Mi
requests:
cpu: 100m
memory: 50Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /etc/prometheus/config
name: config
- mountPath: /etc/prometheus/config_out
name: config-out
- mountPath: /etc/prometheus/rules/prometheus-prometheus-kube-prometheus-prometheus-rulefiles-0
name: prometheus-prometheus-kube-prometheus-prometheus-rulefiles-0
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: prometheus-kube-prometheus-prometheus-token-mh66q
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
hostname: prometheus-prometheus-kube-prometheus-prometheus-0
nodeName: ip-10-1-49-45.ec2.internal
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 2000
runAsGroup: 2000
runAsNonRoot: true
runAsUser: 1000
serviceAccount: prometheus-kube-prometheus-prometheus
serviceAccountName: prometheus-kube-prometheus-prometheus
subdomain: prometheus-operated
terminationGracePeriodSeconds: 600
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- name: config
secret:
defaultMode: 420
secretName: prometheus-prometheus-kube-prometheus-prometheus
- name: tls-assets
secret:
defaultMode: 420
secretName: prometheus-prometheus-kube-prometheus-prometheus-tls-assets
- emptyDir: {}
name: config-out
- configMap:
defaultMode: 420
name: prometheus-prometheus-kube-prometheus-prometheus-rulefiles-0
name: prometheus-prometheus-kube-prometheus-prometheus-rulefiles-0
- emptyDir: {}
name: prometheus-prometheus-kube-prometheus-prometheus-db
- name: prometheus-kube-prometheus-prometheus-token-mh66q
secret:
defaultMode: 420
secretName: prometheus-kube-prometheus-prometheus-token-mh66q
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2021-04-30T16:39:14Z"
status: "True"
type: PodScheduled
phase: Pending
qosClass: Burstable

In case someone needs the answer: in my case (the situation above) there were two Prometheus Operators running in different namespaces, one in the default namespace and another in the monitoring namespace. Removing the one from the default namespace resolved my pod-crashing issue.
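If you are not sure whether the same thing is happening in your cluster, a quick way to check is to list the operator deployments across all namespaces and remove the extra one (the Helm release name "prometheus" below is an assumption; adjust it to however the duplicate operator was installed):
# there should be exactly one prometheus-operator deployment in the cluster
kubectl get deployments --all-namespaces | grep -i prometheus-operator
# remove the duplicate, e.g. a stray Helm release in the default namespace
helm uninstall prometheus -n default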

Related

Kubernetes CronJob - Multiple CronJob configuration is not working

I have to run two CronJobs in Kubernetes (AWS EKS) and I have the configuration below. When I apply the template, only one CronJob gets created, and it is always the second one, so it looks like the first one is being overwritten by the second. I am unable to figure out what I am doing wrong.
# Source: deploy-k8s-app/templates/multicron.yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
labels:
app: my-app
name: my-app
namespace: commercial
spec:
schedule: '5/15 * * * *'
concurrencyPolicy: Forbid
jobTemplate:
spec:
parallelism: 1
completions: 1
activeDeadlineSeconds: 900
template:
metadata:
labels:
app: my-app
name: my-app
namespace: commercial
spec:
containers:
- env:
- name: SERVER_SERVLET_CONTEXT_PATH
value: "/my-app"
- name: IS_JACOCO_ENABLED
value: "false"
- name: SPRING_PROFILES_ACTIVE
value: "int-dc4"
- name: METRICS_ADDRESS
value: "NA"
- name: APP_MODULE
value: "expand"
- name: JAVA_TOOL_OPTIONS
value: "-Xms256M -Xmx512M"
image: "xxxxx.dkr.ecr.us-east-1.amazonaws.com/my-ecr:my-app-latest-10"
imagePullPolicy: IfNotPresent
name: my-app
ports:
- name: http
containerPort: 8080
protocol: TCP
resources:
limits:
cpu: 160m
memory: 1024Mi
requests:
cpu: 100m
memory: 256Mi
volumeMounts:
- name: apps-logs
mountPath: /var/log/containers
- name: fluentdconf
mountPath: /fluentd/etc
- name: fluentd
image: fluent/fluentd-kubernetes-daemonset:v1.11.2-debian-cloudwatch-1.0
env:
- name: REGION
value: us-east-1
- name: AWS_REGION
value: us-east-1
- name: CLUSTER_NAME
value: MY-EKS-Cluster
- name: CI_VERSION
value: "k8s/1.0.1"
- name: LOG_GROUP_NAME
value: /aws/containerinsights/MY-EKS-Cluster/springapp
resources:
limits:
cpu: 160m
memory: 1024Mi
requests:
cpu: 100m
memory: 256Mi
volumeMounts:
- name: fluentdconf
mountPath: /fluentd/etc
- name: apps-logs
mountPath: /var/log/containers
volumes:
- name: fluentdconf
configMap:
name: fluentd-spring-config
- name: apps-logs
emptyDir: {}
- name: my-app-shared
emptyDir: {}
restartPolicy: OnFailure
apiVersion: batch/v1beta1
kind: CronJob
metadata:
labels:
app: my-app
name: my-app-addl
namespace: commercial
spec:
schedule: '15/30 * * * *'
concurrencyPolicy: Forbid
jobTemplate:
spec:
parallelism: 1
completions: 1
activeDeadlineSeconds: 1800
template:
metadata:
labels:
app: my-app
name: my-app
namespace: commercial
spec:
containers:
- env:
- name: SERVER_SERVLET_CONTEXT_PATH
value: "/my-app"
- name: IS_JACOCO_ENABLED
value: "false"
- name: SPRING_PROFILES_ACTIVE
value: "int-dc4"
- name: METRICS_ADDRESS
value: "NA"
- name: APP_MODULE
value: "expand"
- name: JAVA_TOOL_OPTIONS
value: "-Xms256M -Xmx512M"
image: "xxxxx.dkr.ecr.us-east-1.amazonaws.com/my-ecr:my-app-latest-10"
imagePullPolicy: IfNotPresent
name: my-app
ports:
- name: http
containerPort: 8080
protocol: TCP
resources:
limits:
cpu: 160m
memory: 1024Mi
requests:
cpu: 100m
memory: 256Mi
volumeMounts:
- name: apps-logs
mountPath: /var/log/containers
- name: fluentdconf
mountPath: /fluentd/etc
- name: fluentd
image: fluent/fluentd-kubernetes-daemonset:v1.11.2-debian-cloudwatch-1.0
env:
- name: REGION
value: us-east-1
- name: AWS_REGION
value: us-east-1
- name: CLUSTER_NAME
value: MY-EKS-Cluster
- name: CI_VERSION
value: "k8s/1.0.1"
- name: LOG_GROUP_NAME
value: /aws/containerinsights/MY-EKS-Cluster/springapp
resources:
limits:
cpu: 160m
memory: 1024Mi
requests:
cpu: 100m
memory: 256Mi
volumeMounts:
- name: fluentdconf
mountPath: /fluentd/etc
- name: apps-logs
mountPath: /var/log/containers
volumes:
- name: fluentdconf
configMap:
name: fluentd-spring-config
- name: apps-logs
emptyDir: {}
- name: my-app-shared
emptyDir: {}
restartPolicy: OnFailure
kubectl apply -f multicron.yaml
cronjob.batch/my-app-addl created
(Expectation: Two CronJobs to be created. Actual: Only one is created, and that is the second one)
kubectl get cronjob -n commercial
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
my-app-addl 15/30 * * * * False 0 <none> 9s
Thanks!
Abhilash
I could solve this by separating the documents with --- between the CronJob entries.
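For reference, a trimmed-down sketch of the fixed file, with the container specs reduced to the app image only (the full specs from the question go back in their place; batch/v1beta1 is kept to match the original):
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: my-app
  namespace: commercial
spec:
  schedule: '5/15 * * * *'
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: my-app
            image: "xxxxx.dkr.ecr.us-east-1.amazonaws.com/my-ecr:my-app-latest-10"
          restartPolicy: OnFailure
--- # the missing document separator
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: my-app-addl
  namespace: commercial
spec:
  schedule: '15/30 * * * *'
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: my-app
            image: "xxxxx.dkr.ecr.us-east-1.amazonaws.com/my-ecr:my-app-latest-10"
          restartPolicy: OnFailure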

Unable to deploy mongodb community operator in openshift

I'm trying to deploy the MongoDB community operator in OpenShift 3.11, using the following commands:
git clone https://github.com/mongodb/mongodb-kubernetes-operator.git
cd mongodb-kubernetes-operator
oc new-project mongodb
oc create -f deploy/crds/mongodb.com_mongodb_crd.yaml -n mongodb
oc create -f deploy/operator/role.yaml -n mongodb
oc create -f deploy/operator/role_binding.yaml -n mongodb
oc create -f deploy/operator/service_account.yaml -n mongodb
oc apply -f deploy/openshift/operator_openshift.yaml -n mongodb
oc apply -f deploy/crds/mongodb.com_v1_mongodb_openshift_cr.yaml -n mongodb
The operator pod is running successfully, but the MongoDB replica set pods do not spin up. The error is as follows:
[kubenode@master mongodb-kubernetes-operator]$ oc get pods
NAME READY STATUS RESTARTS AGE
example-openshift-mongodb-0 1/2 CrashLoopBackOff 4 2m
mongodb-kubernetes-operator-66bfcbcf44-9xvj7 1/1 Running 0 2m
[kubenode@master mongodb-kubernetes-operator]$ oc logs -f example-openshift-mongodb-0 -c mongodb-agent
panic: Failed to get current user: user: unknown userid 1000510000
goroutine 1 [running]:
com.tengen/cm/util.init.3()
/data/mci/2f46ec94982c5440960d2b2bf2b6ae15/mms-automation/build/go-dependencies/src/com.tengen/cm/util/user.go:14 +0xe5
I have gone through all the issues raised on the mongodb-kubernetes-operator repository that are related to this issue (reference), and found a suggestion to set the MANAGED_SECURITY_CONTEXT environment variable to true in the operator, mongodb, and mongodb-agent containers.
I have done so for all of these containers, but am still facing the same issue.
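One way to set these variables (resource and container names taken from the outputs below; adjust them if yours differ) is:
oc set env deployment/mongodb-kubernetes-operator MANAGED_SECURITY_CONTEXT=true -n mongodb
oc set env statefulset/example-openshift-mongodb -c mongodb-agent MANAGED_SECURITY_CONTEXT=true -n mongodb
oc set env statefulset/example-openshift-mongodb -c mongod MANAGED_SECURITY_CONTEXT=true -n mongodb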
Here is the confirmation that the environment variables are correctly set:
[kubenode@master mongodb-kubernetes-operator]$ oc set env statefulset.apps/example-openshift-mongodb --list
# statefulsets/example-openshift-mongodb, container mongodb-agent
AGENT_STATUS_FILEPATH=/var/log/mongodb-mms-automation/healthstatus/agent-health-status.json
AUTOMATION_CONFIG_MAP=example-openshift-mongodb-config
HEADLESS_AGENT=true
MANAGED_SECURITY_CONTEXT=true
# POD_NAMESPACE from field path metadata.namespace
# statefulsets/example-openshift-mongodb, container mongod
AGENT_STATUS_FILEPATH=/healthstatus/agent-health-status.json
MANAGED_SECURITY_CONTEXT=true
[kubenode@master mongodb-kubernetes-operator]$ oc set env deployment.apps/mongodb-kubernetes-operator --list
# deployments/mongodb-kubernetes-operator, container mongodb-kubernetes-operator
# WATCH_NAMESPACE from field path metadata.namespace
# POD_NAME from field path metadata.name
MANAGED_SECURITY_CONTEXT=true
OPERATOR_NAME=mongodb-kubernetes-operator
AGENT_IMAGE=quay.io/mongodb/mongodb-agent:10.19.0.6562-1
VERSION_UPGRADE_HOOK_IMAGE=quay.io/mongodb/mongodb-kubernetes-operator-version-upgrade-post-start-hook:1.0.2
Operator Information
Operator Version: 0.3.0
MongoDB Image used: 4.2.6
Cluster Information
[kubenode@master mongodb-kubernetes-operator]$ openshift version
openshift v3.11.0+62803d0-1
[kubenode@master mongodb-kubernetes-operator]$ kubectl version
Client Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.0+d4cacc0", GitCommit:"d4cacc0", GitTreeState:"clean", BuildDate:"2018-10-15T09:45:30Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.0+d4cacc0", GitCommit:"d4cacc0", GitTreeState:"clean", BuildDate:"2020-12-07T17:59:40Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}
Update
When I check the replica pod YAML (see below), I see three occurrences of the runAsUser security context set to 1000510000. I'm not sure how, but this is being set even though I'm not setting it manually.
[kubenode@master mongodb-kubernetes-operator]$ oc get -o yaml pod example-openshift-mongodb-0
apiVersion: v1
kind: Pod
metadata:
annotations:
openshift.io/scc: restricted
creationTimestamp: 2021-01-19T07:45:05Z
generateName: example-openshift-mongodb-
labels:
app: example-openshift-mongodb-svc
controller-revision-hash: example-openshift-mongodb-6549495b
statefulset.kubernetes.io/pod-name: example-openshift-mongodb-0
name: example-openshift-mongodb-0
namespace: mongodb
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: StatefulSet
name: example-openshift-mongodb
uid: 3e91eb40-5a2a-11eb-a5e0-0050569b1f59
resourceVersion: "15616863"
selfLink: /api/v1/namespaces/mongodb/pods/example-openshift-mongodb-0
uid: 3ea17a28-5a2a-11eb-a5e0-0050569b1f59
spec:
containers:
- command:
- agent/mongodb-agent
- -cluster=/var/lib/automation/config/cluster-config.json
- -skipMongoStart
- -noDaemonize
- -healthCheckFilePath=/var/log/mongodb-mms-automation/healthstatus/agent-health-status.json
- -serveStatusPort=5000
- -useLocalMongoDbTools
env:
- name: AGENT_STATUS_FILEPATH
value: /var/log/mongodb-mms-automation/healthstatus/agent-health-status.json
- name: AUTOMATION_CONFIG_MAP
value: example-openshift-mongodb-config
- name: HEADLESS_AGENT
value: "true"
- name: MANAGED_SECURITY_CONTEXT
value: "true"
- name: POD_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
image: quay.io/mongodb/mongodb-agent:10.19.0.6562-1
imagePullPolicy: Always
name: mongodb-agent
readinessProbe:
exec:
command:
- /var/lib/mongodb-mms-automation/probes/readinessprobe
failureThreshold: 60
initialDelaySeconds: 5
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources: {}
securityContext:
capabilities:
drop:
- KILL
- MKNOD
- SETGID
- SETUID
runAsUser: 1000510000
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/lib/automation/config
name: automation-config
readOnly: true
- mountPath: /data
name: data-volume
- mountPath: /var/lib/mongodb-mms-automation/authentication
name: example-openshift-mongodb-agent-scram-credentials
- mountPath: /var/log/mongodb-mms-automation/healthstatus
name: healthstatus
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: mongodb-kubernetes-operator-token-lr9l4
readOnly: true
- command:
- /bin/sh
- -c
- |2
# run post-start hook to handle version changes
/hooks/version-upgrade
# wait for config to be created by the agent
while [ ! -f /data/automation-mongod.conf ]; do sleep 3 ; done ; sleep 2 ;
# start mongod with this configuration
exec mongod -f /data/automation-mongod.conf ;
env:
- name: AGENT_STATUS_FILEPATH
value: /healthstatus/agent-health-status.json
- name: MANAGED_SECURITY_CONTEXT
value: "true"
image: mongo:4.2.6
imagePullPolicy: IfNotPresent
name: mongod
resources: {}
securityContext:
capabilities:
drop:
- KILL
- MKNOD
- SETGID
- SETUID
runAsUser: 1000510000
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /data
name: data-volume
- mountPath: /var/lib/mongodb-mms-automation/authentication
name: example-openshift-mongodb-agent-scram-credentials
- mountPath: /healthstatus
name: healthstatus
- mountPath: /hooks
name: hooks
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: mongodb-kubernetes-operator-token-lr9l4
readOnly: true
dnsPolicy: ClusterFirst
hostname: example-openshift-mongodb-0
imagePullSecrets:
- name: mongodb-kubernetes-operator-dockercfg-jhplw
initContainers:
- command:
- cp
- version-upgrade-hook
- /hooks/version-upgrade
image: quay.io/mongodb/mongodb-kubernetes-operator-version-upgrade-post-start-hook:1.0.2
imagePullPolicy: Always
name: mongod-posthook
resources: {}
securityContext:
capabilities:
drop:
- KILL
- MKNOD
- SETGID
- SETUID
runAsUser: 1000510000
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /hooks
name: hooks
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: mongodb-kubernetes-operator-token-lr9l4
readOnly: true
nodeName: node1.192.168.27.116.nip.io
nodeSelector:
node-role.kubernetes.io/compute: "true"
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 1000510000
seLinuxOptions:
level: s0:c23,c2
serviceAccount: mongodb-kubernetes-operator
serviceAccountName: mongodb-kubernetes-operator
subdomain: example-openshift-mongodb-svc
terminationGracePeriodSeconds: 30
volumes:
- name: data-volume
persistentVolumeClaim:
claimName: data-volume-example-openshift-mongodb-0
- name: automation-config
secret:
defaultMode: 416
secretName: example-openshift-mongodb-config
- name: example-openshift-mongodb-agent-scram-credentials
secret:
defaultMode: 384
secretName: example-openshift-mongodb-agent-scram-credentials
- emptyDir: {}
name: healthstatus
- emptyDir: {}
name: hooks
- name: mongodb-kubernetes-operator-token-lr9l4
secret:
defaultMode: 420
secretName: mongodb-kubernetes-operator-token-lr9l4
status:
conditions:
- lastProbeTime: null
lastTransitionTime: 2021-01-19T07:46:45Z
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: 2021-01-19T07:46:39Z
message: 'containers with unready status: [mongodb-agent]'
reason: ContainersNotReady
status: "False"
type: Ready
- lastProbeTime: null
lastTransitionTime: null
message: 'containers with unready status: [mongodb-agent]'
reason: ContainersNotReady
status: "False"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: 2021-01-19T07:45:05Z
status: "True"
type: PodScheduled
containerStatuses:
- containerID: docker://bd3ede9178bb78267bc19d1b5da0915d3bcd1d4dcee3e142c7583424bd2aa777
image: docker.io/mongo:4.2.6
imageID: docker-pullable://docker.io/mongo@sha256:c880f6b56f443bb4d01baa759883228cd84fa8d78fa1a36001d1c0a0712b5a07
lastState: {}
name: mongod
ready: true
restartCount: 0
state:
running:
startedAt: 2021-01-19T07:46:55Z
- containerID: docker://5e39c0b6269b8231bbf9cabb4ff3457d9f91e878eff23953e318a9475fb8a90e
image: quay.io/mongodb/mongodb-agent:10.19.0.6562-1
imageID: docker-pullable://quay.io/mongodb/mongodb-agent@sha256:790c2670ef7cefd61cfaabaf739de16dbd2e07dc3b539add0da21ab7d5ac7626
lastState:
terminated:
containerID: docker://5e39c0b6269b8231bbf9cabb4ff3457d9f91e878eff23953e318a9475fb8a90e
exitCode: 2
finishedAt: 2021-01-19T19:39:58Z
reason: Error
startedAt: 2021-01-19T19:39:58Z
name: mongodb-agent
ready: false
restartCount: 144
state:
waiting:
message: Back-off 5m0s restarting failed container=mongodb-agent pod=example-openshift-mongodb-0_mongodb(3ea17a28-5a2a-11eb-a5e0-0050569b1f59)
reason: CrashLoopBackOff
hostIP: 192.168.27.116
initContainerStatuses:
- containerID: docker://7c31cef2a68e3e6100c2cc9c83e3780313f1e8ab43bebca79ad4d48613f124bd
image: quay.io/mongodb/mongodb-kubernetes-operator-version-upgrade-post-start-hook:1.0.2
imageID: docker-pullable://quay.io/mongodb/mongodb-kubernetes-operator-version-upgrade-post-start-hook@sha256:e99105b1c54e12913ddaf470af8025111a6e6e4c8917fc61be71d1bc0328e7d7
lastState: {}
name: mongod-posthook
ready: true
restartCount: 0
state:
terminated:
containerID: docker://7c31cef2a68e3e6100c2cc9c83e3780313f1e8ab43bebca79ad4d48613f124bd
exitCode: 0
finishedAt: 2021-01-19T07:46:45Z
reason: Completed
startedAt: 2021-01-19T07:46:44Z
phase: Running
podIP: 10.129.0.119
qosClass: BestEffort
startTime: 2021-01-19T07:46:39Z

How to mount a kubernetes.io/dockerconfigjson

I have a secret of type kubernetes.io/dockerconfigjson:
$ kubectl describe secrets dockerjson
Name: dockerjson
Namespace: my-prd
Labels: <none>
Annotations: <none>
Type: kubernetes.io/dockerconfigjson
Data
====
.dockerconfigjson: 1335 bytes
When I try to mount this secret into a container, I cannot find a config.json:
- name: dump
image: kaniko-executor:debug
imagePullPolicy: Always
command: ["/busybox/find", "/", "-name", "config.json"]
volumeMounts:
- name: docker-config
mountPath: /foobar
volumes:
- name: docker-config
secret:
secretName: dockerjson
defaultMode: 256
which only prints:
/kaniko/.docker/config.json
Is this supported at all or am I doing something wrong?
I am using OpenShift 3.9, which should be Kubernetes 1.9.
apiVersion: v1
kind: Pod
metadata:
name: kaniko
spec:
containers:
- name: kaniko
image: gcr.io/kaniko-project/executor:debug-v0.9.0
command:
- /busybox/cat
resources:
limits:
cpu: 2
memory: 2Gi
requests:
cpu: 0.5
memory: 500Mi
tty: true
volumeMounts:
- name: docker-config
mountPath: /kaniko/.docker/
volumes:
- name: docker-config
secret:
secretName: dockerjson
items:
- key: .dockerconfigjson
path: config.json
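The items mapping above is what projects the secret's .dockerconfigjson key to a file named config.json under the mount path. Assuming the pod is named kaniko as shown and is running, the mount can be checked with:
# the file should now exist at the path kaniko expects
kubectl exec kaniko -- /busybox/cat /kaniko/.docker/config.json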

Why Istio "Authentication Policy" Example Page isn't working as expected?

The article here: https://istio.io/docs/tasks/security/authn-policy/
Specifically, when I follow the instructions in the Setup section, I can't connect to any httpbin residing in the foo and bar namespaces, but the one in legacy is fine. I suspect there is something wrong with the sidecar proxy being installed.
Here is the output of the httpbin pod YAML file (after being injected with the istioctl kube-inject --includeIPRanges "10.32.0.0/16" command). I use --includeIPRanges so that the pod can communicate with external IPs (for my debugging purposes, e.g. to install the dnsutils package).
apiVersion: v1
kind: Pod
metadata:
annotations:
sidecar.istio.io/inject: "true"
sidecar.istio.io/status: '{"version":"4120ea817406fd7ed43b7ecf3f2e22abe453c44d3919389dcaff79b210c4cd86","initContainers":["istio-init"],"containers":["istio-proxy"],"volumes":["istio-envoy","istio-certs"],"imagePullSecrets":null}'
creationTimestamp: 2018-08-15T11:40:59Z
generateName: httpbin-8b9cf99f5-
labels:
app: httpbin
pod-template-hash: "465795591"
version: v1
name: httpbin-8b9cf99f5-9c47z
namespace: foo
ownerReferences:
- apiVersion: extensions/v1beta1
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: httpbin-8b9cf99f5
uid: 1450d75d-a080-11e8-aece-42010a940168
resourceVersion: "65722138"
selfLink: /api/v1/namespaces/foo/pods/httpbin-8b9cf99f5-9c47z
uid: 1454b68d-a080-11e8-aece-42010a940168
spec:
containers:
- image: docker.io/citizenstig/httpbin
imagePullPolicy: IfNotPresent
name: httpbin
ports:
- containerPort: 8000
protocol: TCP
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: default-token-pkpvf
readOnly: true
- args:
- proxy
- sidecar
- --configPath
- /etc/istio/proxy
- --binaryPath
- /usr/local/bin/envoy
- --serviceCluster
- httpbin
- --drainDuration
- 45s
- --parentShutdownDuration
- 1m0s
- --discoveryAddress
- istio-pilot.istio-system:15007
- --discoveryRefreshDelay
- 1s
- --zipkinAddress
- zipkin.istio-system:9411
- --connectTimeout
- 10s
- --statsdUdpAddress
- istio-statsd-prom-bridge.istio-system.istio-system:9125
- --proxyAdminPort
- "15000"
- --controlPlaneAuthPolicy
- NONE
env:
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
- name: INSTANCE_IP
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.podIP
- name: ISTIO_META_POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: ISTIO_META_INTERCEPTION_MODE
value: REDIRECT
image: docker.io/istio/proxyv2:1.0.0
imagePullPolicy: IfNotPresent
name: istio-proxy
resources:
requests:
cpu: 10m
securityContext:
privileged: false
readOnlyRootFilesystem: true
runAsUser: 1337
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /etc/istio/proxy
name: istio-envoy
- mountPath: /etc/certs/
name: istio-certs
readOnly: true
dnsPolicy: ClusterFirst
initContainers:
- args:
- -p
- "15001"
- -u
- "1337"
- -m
- REDIRECT
- -i
- 10.32.0.0/16
- -x
- ""
- -b
- 8000,
- -d
- ""
image: docker.io/istio/proxy_init:1.0.0
imagePullPolicy: IfNotPresent
name: istio-init
resources: {}
securityContext:
capabilities:
add:
- NET_ADMIN
privileged: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
nodeName: gke-tvlk-data-dev-default-medium-pool-46397778-q2sb
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- name: default-token-pkpvf
secret:
defaultMode: 420
secretName: default-token-pkpvf
- emptyDir:
medium: Memory
name: istio-envoy
- name: istio-certs
secret:
defaultMode: 420
optional: true
secretName: istio.default
status:
conditions:
- lastProbeTime: null
lastTransitionTime: 2018-08-15T11:41:01Z
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: 2018-08-15T11:44:28Z
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: 2018-08-15T11:40:59Z
status: "True"
type: PodScheduled
containerStatuses:
- containerID: docker://758e130a4c31a15c1b8bc1e1f72bd7739d5fa1103132861eea9ae1a6ae1f080e
image: citizenstig/httpbin:latest
imageID: docker-pullable://citizenstig/httpbin@sha256:b81c818ccb8668575eb3771de2f72f8a5530b515365842ad374db76ad8bcf875
lastState: {}
name: httpbin
ready: true
restartCount: 0
state:
running:
startedAt: 2018-08-15T11:41:01Z
- containerID: docker://9c78eac46a99457f628493975f5b0c5bbffa1dac96dab5521d2efe4143219575
image: istio/proxyv2:1.0.0
imageID: docker-pullable://istio/proxyv2@sha256:77915a0b8c88cce11f04caf88c9ee30300d5ba1fe13146ad5ece9abf8826204c
lastState:
terminated:
containerID: docker://52299a80a0fa8949578397357861a9066ab0148ac8771058b83e4c59e422a029
exitCode: 255
finishedAt: 2018-08-15T11:44:27Z
reason: Error
startedAt: 2018-08-15T11:41:02Z
name: istio-proxy
ready: true
restartCount: 1
state:
running:
startedAt: 2018-08-15T11:44:28Z
hostIP: 10.32.96.27
initContainerStatuses:
- containerID: docker://f267bb44b70d2d383ce3f9943ab4e917bb0a42ecfe17fe0ed294bde4d8284c58
image: istio/proxy_init:1.0.0
imageID: docker-pullable://istio/proxy_init@sha256:345c40053b53b7cc70d12fb94379e5aa0befd979a99db80833cde671bd1f9fad
lastState: {}
name: istio-init
ready: true
restartCount: 0
state:
terminated:
containerID: docker://f267bb44b70d2d383ce3f9943ab4e917bb0a42ecfe17fe0ed294bde4d8284c58
exitCode: 0
finishedAt: 2018-08-15T11:41:00Z
reason: Completed
startedAt: 2018-08-15T11:41:00Z
phase: Running
podIP: 10.32.19.61
qosClass: Burstable
startTime: 2018-08-15T11:40:59Z
Here is the example command where I got the error (sleep.legacy -> httpbin.foo):
> kubectl exec $(kubectl get pod -l app=sleep -n legacy -o jsonpath={.items..metadata.name}) -c sleep -n legacy -- curl http://httpbin.foo:8000/ip -s -o /dev/null -w "%{http_code}\n"
000
command terminated with exit code 7
Here is the example command where I get a success status (sleep.legacy -> httpbin.legacy):
> kubectl exec $(kubectl get pod -l app=sleep -n legacy -o jsonpath={.items..metadata.name}) -csleep -n legacy -- curl http://httpbin.legacy:8000/ip -s -o /dev/null -w "%{http_code}\n"
200
I have followed the instructions to ensure there is no mTLS policy defined, etc.
> kubectl get policies.authentication.istio.io --all-namespaces
No resources found.
> kubectl get meshpolicies.authentication.istio.io
No resources found.
> kubectl get destinationrules.networking.istio.io --all-namespaces -o yaml | grep "host:"
host: istio-policy.istio-system.svc.cluster.local
host: istio-telemetry.istio-system.svc.cluster.local
Never mind, I think I found out why: the configuration was messed up on my part.
If you take a look at the statsd address, it is defined with the unresolvable hostname istio-statsd-prom-bridge.istio-system.istio-system:9125 (the .istio-system suffix is duplicated). I noticed that after seeing the proxy container restart/crash multiple times.
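A quick way to spot a mistake like this is to pull the injected sidecar arguments straight from the running pod and check the addresses (pod name taken from the dump above; adjust the namespace to yours):
# print the relevant istio-proxy arguments; each address should resolve in-cluster
kubectl get pod httpbin-8b9cf99f5-9c47z -n foo -o yaml | grep -A 1 -E 'statsdUdpAddress|discoveryAddress|zipkinAddress'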

How to change user and group owner for VolumeMount

I want to set up a pod with two containers running inside it, both of which access a mounted file, /var/run/udspath.
In container serviceC, I need to change the user and group owner of /var/run/udspath, so I added a command to the YAML file, but it does not work.
kubectl apply does not complain, but container serviceC is not created.
Without "command: ['/bin/sh', '-c', 'sudo chown 1337:1337 /var/run/udspath']", the container is created.
apiVersion: v1
kind: Service
metadata:
name: clitool
labels:
app: httpbin
spec:
ports:
- name: http
port: 8000
selector:
app: httpbin
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
creationTimestamp: null
name: clitool
spec:
replicas: 1
strategy: {}
template:
metadata:
annotations:
sidecar.istio.io/status: '{"version":"1c09c07e5751560367349d807c164267eaf5aea4018b4588d884f7d265cf14a4","initContainers":["istio-init"],"containers":["serviceC"],"volumes":["istio-envoy","istio-certs"],"imagePullSecrets":null}'
creationTimestamp: null
labels:
app: httpbin
version: v1
spec:
containers:
- image:
name: serviceA
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /var/run/udspath
name: sdsudspath
- image:
imagePullPolicy: IfNotPresent
name: serviceB
ports:
- containerPort: 8000
resources: {}
- args:
- proxy
- sidecar
- --configPath
- /etc/istio/proxy
- --binaryPath
- /usr/local/bin/envoy
- --serviceCluster
- httpbin
- --drainDuration
- 45s
- --parentShutdownDuration
- 1m0s
- --discoveryAddress
- istio-pilot.istio-system:15007
- --discoveryRefreshDelay
- 1s
- --zipkinAddress
- zipkin.istio-system:9411
- --connectTimeout
- 10s
- --statsdUdpAddress
- istio-statsd-prom-bridge.istio-system:9125
- --proxyAdminPort
- "15000"
- --controlPlaneAuthPolicy
- NONE
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: INSTANCE_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: ISTIO_META_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: ISTIO_META_INTERCEPTION_MODE
value: REDIRECT
image:
imagePullPolicy: IfNotPresent
command: ["/bin/sh"]
args: ["-c", "sudo chown 1337:1337 /var/run/udspath"]
name: serviceC
resources:
requests:
cpu: 10m
securityContext:
privileged: false
readOnlyRootFilesystem: true
runAsUser: 1337
volumeMounts:
- mountPath: /etc/istio/proxy
name: istio-envoy
- mountPath: /etc/certs/
name: istio-certs
readOnly: true
- mountPath: /var/run/udspath
name: sdsudspath
initContainers:
- args:
- -p
- "15001"
- -u
- "1337"
- -m
- REDIRECT
- -i
- '*'
- -x
- ""
- -b
- 8000,
- -d
- ""
image: docker.io/quanlin/proxy_init:180712-1038
imagePullPolicy: IfNotPresent
name: istio-init
resources: {}
securityContext:
capabilities:
add:
- NET_ADMIN
privileged: true
volumes:
- name: sdsudspath
hostPath:
path: /var/run/udspath
- emptyDir:
medium: Memory
name: istio-envoy
- name: istio-certs
secret:
optional: true
secretName: istio.default
status: {}
---
kubectl describe pod xxx shows that
serviceC:
Container ID:
Image:
Image ID:
Port: <none>
Command:
/bin/sh
Args:
-c
sudo chown 1337:1337 /var/run/udspath
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 30 Jul 2018 10:30:04 -0700
Finished: Mon, 30 Jul 2018 10:30:04 -0700
Ready: False
Restart Count: 2
Requests:
cpu: 10m
Environment:
POD_NAME: clitool-5d548b856-6v9p9 (v1:metadata.name)
POD_NAMESPACE: default (v1:metadata.namespace)
INSTANCE_IP: (v1:status.podIP)
ISTIO_META_POD_NAME: clitool-5d548b856-6v9p9 (v1:metadata.name)
ISTIO_META_INTERCEPTION_MODE: REDIRECT
Mounts:
/etc/certs/ from certs (ro)
/etc/istio/proxy from envoy (rw)
/var/run/udspath from sdsudspath (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-g2zzv (ro)
More information would be helpful, such as what error you are getting.
Nevertheless, it really depends on what is defined in serviceC's Dockerfile ENTRYPOINT and CMD.
The mapping between Docker and Kubernetes is:
Docker ENTRYPOINT --> Pod command (the command run by the container)
Docker CMD --> Pod args (the arguments passed to the command)
https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/
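In the pod spec above, command and args fully replace the image's ENTRYPOINT and CMD, so serviceC runs only the chown, exits with code 0 (which matches the Completed / CrashLoopBackOff state in the describe output), and never starts its real workload. A minimal illustration of the mapping, with a hypothetical image and values:
apiVersion: v1
kind: Pod
metadata:
  name: command-demo            # hypothetical example, not part of the original question
spec:
  restartPolicy: Never
  containers:
  - name: demo
    image: busybox
    # replaces the image's ENTRYPOINT
    command: ["/bin/ping"]
    # replaces the image's CMD
    args: ["-c", "3", "localhost"]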