Grafana Loki, AlertManager - unable to read rule dir, open /tmp/loki/rules/fake: no such file or directory - kubernetes-helm

I've deployed Promtail, Grafana, Loki and AlertManager using Helm charts on a k3d cluster on my local machine. I would like to define some rules in Loki so that AlertManager is notified when something happens. For now I've tried only a simple rule, just to check that it works.
My Loki version: {"version":"2.6.1","revision":"6bd05c9a4","branch":"HEAD","buildUser":"root@ea1e89b8da02","buildDate":"2022-07-18T08:49:07Z","goVersion":""}
My Grafana version:
And the Loki configuration:
loki:
# should loki be deployed on cluster?
enabled: true
image:
repository: grafana/loki
pullPolicy: Always
pullSecrets:
- registry
priorityClassName: normal
resources:
limits:
memory: 3Gi
cpu: 0
requests:
memory: 0
cpu: 0
config:
chunk_store_config:
max_look_back_period: 30d
table_manager:
retention_deletes_enabled: true
retention_period: 30d
query_range:
split_queries_by_interval: 0
parallelise_shardable_queries: false
querier:
max_concurrent: 2048
frontend:
max_outstanding_per_tenant: 4096
compress_responses: true
ingester:
wal:
enabled: true
dir: /tmp/wal
schema_config:
configs:
- from: 2022-12-05
store: boltdb-shipper
object_store: filesystem
schema: v11
index:
prefix: index_
period: 24h
storage_config:
boltdb_shipper:
active_index_directory: /tmp/loki/boltdb-shipper-active
cache_location: /tmp/loki/boltdb-shipper-cache
cache_ttl: 24h # Can be increased for faster performance over longer query periods, uses more disk space
shared_store: filesystem
filesystem:
directory: /tmp/loki/chunks
compactor:
working_directory: /tmp/loki/boltdb-shipper-compactor
shared_store: filesystem
ruler:
storage:
type: local
local:
directory: /tmp/loki/rules/
ring:
kvstore:
store: inmemory
rule_path: /tmp/loki/rules-temp
alertmanager_url: http://onprem-kube-prometheus-alertmanager.svc.mylocal-monitoring:9093
enable_api: true
enable_alertmanager_v2: true
write:
extraVolumeMounts:
- name: rules-config
mountPath: /tmp/loki/rules/fake/
extraVolumes:
- name: rules-config
configMap:
name: rules-cfgmap
items:
- key: "rules.yaml"
path: "rules.yaml"
read:
extraVolumeMounts:
- name: rules-config
mountPath: /tmp/loki/rules/fake/
extraVolumes:
- name: rules-config
configMap:
name: rules-cfgmap
items:
- key: "rules.yaml"
path: "rules.yaml"
promtail:
image:
registry: docker
pullPolicy: Always
imagePullSecrets:
- name: registry
priorityClassName: normal
resources:
limits:
memory: 256Mi
cpu: 0
requests:
memory: 0
cpu: 0
livenessProbe:
failureThreshold: 5
httpGet:
path: /ready
port: http-metrics
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
config:
snippets:
pipelineStages:
- cri: {}
common:
- action: replace
source_labels:
- __meta_kubernetes_pod_node_name
target_label: node_name
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- action: replace
source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: replace
replacement: /var/log/pods/*$1/*.log
separator: /
source_labels:
- __meta_kubernetes_pod_uid
- __meta_kubernetes_pod_container_name
target_label: __path__
- action: replace
replacement: /var/log/pods/*$1/*.log
regex: true/(.*)
separator: /
source_labels:
- __meta_kubernetes_pod_annotationpresent_kubernetes_io_config_hash
- __meta_kubernetes_pod_annotation_kubernetes_io_config_hash
- __meta_kubernetes_pod_container_name
target_label: __path__
monitoring:
enabled: false
networkPolicies:
enabled: false
The problem is that when I check the rules with curl -X GET localhost:3100/loki/api/v1/rules, it returns: unable to read rule dir /tmp/loki/rules/fake: open /tmp/loki/rules/fake: no such file or directory.
So it seems Loki can't find the rules file.
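For context, my understanding (based on the error path rather than the chart docs) is that the ruler's local rule storage reads one sub-directory per tenant under ruler.storage.local.directory, and with auth/multi-tenancy disabled the tenant ID is fake, hence the /tmp/loki/rules/fake path in the error. A sketch of the layout it appears to expect:
# /tmp/loki/rules/        <- ruler.storage.local.directory
#   fake/                 <- one sub-directory per tenant ("fake" when auth is disabled)
#     rules.yaml          <- one or more rule-group files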
I also tried to change the config like this:
write:
extraVolumeMounts:
- name: rules-conf
mountPath: /tmp/loki/rules/fake/rules.yaml
extraVolumes:
- name: rules-conf
read:
extraVolumeMounts:
- name: rules-conf
mountPath: /tmp/loki/rules/fake/rules.yaml
extraVolumes:
- name: rules-conf
And my configmap:
---
apiVersion: v1
kind: ConfigMap
metadata:
name: rules-cfgmap
namespace: mylocal-monitoring
data:
rules.yaml: |
groups:
- name: PrometheusAlertsGroup
rules:
- alert: test1
expr: |
1 > 0
for: 0m
labels:
severity: critical
annotations:
summary: "TEST: testing test"
description: test
And the rules file:
groups:
- name: PrometheusAlertsGroup
rules:
- alert: test1
expr: |
1 > 0
for: 0m
labels:
severity: critical
annotations:
summary: "TEST: testing test"
description: test
But the problem is the same. Any ideas?
It finally worked when I created /tmp/loki/rules/fake/rules.yaml inside the container manually, but that is not the point; I don't want to have to create it by hand.
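For what it's worth, one way to get the file into place without creating it by hand would be a subPath mount, which projects a single ConfigMap key as a file instead of mounting a whole directory over the target path. A minimal sketch, assuming the chart really does forward these write:/read: extraVolumeMounts/extraVolumes values into the pods; the names match the ConfigMap above:
write:
  extraVolumeMounts:
    - name: rules-config
      mountPath: /tmp/loki/rules/fake/rules.yaml
      subPath: rules.yaml            # project only this key as a file, not a directory
  extraVolumes:
    - name: rules-config
      configMap:
        name: rules-cfgmap
read:
  extraVolumeMounts:
    - name: rules-config
      mountPath: /tmp/loki/rules/fake/rules.yaml
      subPath: rules.yaml
  extraVolumes:
    - name: rules-config
      configMap:
        name: rules-cfgmap
Note that with a subPath mount, later changes to the ConfigMap are not propagated into a running container, so the pods need a restart when the rules change.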

Related

Error on Telegraf Helm Chart update: Error parsing data

I'm trying to deploy the Telegraf Helm chart on Kubernetes:
helm upgrade --install telegraf-instance -f values.yaml influxdata/telegraf
When I add the modbus input plugin with holding_registers, I get this error:
[telegraf] Error running agent: Error loading config file /etc/telegraf/telegraf.conf: Error parsing data: line 49: key `name` is in conflict with line 2fd
My values.yaml is below:
## Default values.yaml for Telegraf
## This is a YAML-formatted file.
## ref: https://hub.docker.com/r/library/telegraf/tags/
replicaCount: 1
image:
repo: "telegraf"
tag: "1.21.4"
pullPolicy: IfNotPresent
podAnnotations: {}
podLabels: {}
imagePullSecrets: []
args: []
env:
- name: HOSTNAME
value: "telegraf-polling-service"
resources: {}
nodeSelector: {}
affinity: {}
tolerations: []
service:
enabled: true
type: ClusterIP
annotations: {}
rbac:
create: true
clusterWide: false
rules: []
serviceAccount:
create: false
name:
annotations: {}
config:
agent:
interval: 60s
round_interval: true
metric_batch_size: 1000000
metric_buffer_limit: 100000000
collection_jitter: 0s
flush_interval: 60s
flush_jitter: 0s
precision: ''
hostname: '9825128'
omit_hostname: false
processors:
- enum:
mapping:
field: "status"
dest: "status_code"
value_mappings:
healthy: 1
problem: 2
critical: 3
inputs:
- modbus:
name: "PS MAIN ENGINE"
controller: 'tcp://192.168.0.101:502'
slave_id: 1
holding_registers:
- name: "Coolant Level"
byte_order: CDAB
data_type: FLOAT32
scale: 0.001
address: [51410, 51411]
- modbus:
name: "SB MAIN ENGINE"
controller: 'tcp://192.168.0.102:502'
slave_id: 1
holding_registers:
- name: "Coolant Level"
byte_order: CDAB
data_type: FLOAT32
scale: 0.001
address: [51410, 51411]
outputs:
- influxdb_v2:
token: token
organization: organisation
bucket: bucket
urls:
- "url"
metrics:
health:
enabled: true
service_address: "http://:8888"
threshold: 5000.0
internal:
enabled: true
collect_memstats: false
pdb:
create: true
minAvailable: 1
The problem was resolved by doing the following steps:
1. Deleted the config section of my values.yaml.
2. Added my telegraf.conf to the /additional_config path.
3. Added a ConfigMap to Kubernetes with the following command:
kubectl create configmap external-config --from-file=/additional_config
4. Added the following to values.yaml (see also the ConfigMap sketch after this block):
volumes:
- name: my-config
configMap:
name: external-config
volumeMounts:
- name: my-config
mountPath: /additional_config
args:
- "--config=/etc/telegraf/telegraf.conf"
- "--config-directory=/additional_config"

Mount Dynamic created PV to multiple containers on the same pod

I am working on a use case where I need to add a new container to the JupyterHub pod. This new container (a sidecar) monitors JupyterHub directories.
The JupyterHub container, while coming up, creates a dynamic PV; see the section below:
singleuser:
podNameTemplate:
extraTolerations: []
nodeSelector: {}
extraNodeAffinity:
required: []
preferred: []
extraPodAffinity:
required: []
preferred: []
extraPodAntiAffinity:
required: []
preferred: []
networkTools:
image:
name: jupyterhub/k8s-network-tools
tag: "set-by-chartpress"
pullPolicy:
pullSecrets: []
resources: {}
cloudMetadata:
# block set to true will append a privileged initContainer using the
# iptables to block the sensitive metadata server at the provided ip.
blockWithIptables: true
ip: 169.254.169.254
networkPolicy:
enabled: true
ingress: []
egress:
# Required egress to communicate with the hub and DNS servers will be
# augmented to these egress rules.
#
# This default rule explicitly allows all outbound traffic from singleuser
# pods, except to a typical IP used to return metadata that can be used by
# someone with malicious intent.
- to:
- ipBlock:
cidr: 0.0.0.0/0
except:
- 169.254.169.254/32
interNamespaceAccessLabels: ignore
allowedIngressPorts: []
events: true
extraAnnotations: {}
extraLabels:
hub.jupyter.org/network-access-hub: "true"
extraFiles: {}
extraEnv: {}
lifecycleHooks: {}
initContainers: []
extraContainers: []
uid: 1000
fsGid: 100
serviceAccountName:
storage:
type: dynamic
extraLabels: {}
extraVolumes: []
extraVolumeMounts: []
static:
pvcName:
subPath: "{username}"
capacity: 10Gi
homeMountPath: /home/jovyan
dynamic:
storageClass:
pvcNameTemplate: claim-{username}{servername}
volumeNameTemplate: volume-{username}{servername}
storageAccessModes: [ReadWriteOnce]
image:
name: jupyterhub/k8s-singleuser-sample
tag: "set-by-chartpress"
pullPolicy:
pullSecrets: []
startTimeout: 300
cpu:
limit:
guarantee:
memory:
limit:
guarantee: 1G
extraResource:
limits: {}
guarantees: {}
cmd:
defaultUrl:
extraPodConfig: {}
profileList: []
I have added my new container in the extraContainers section of the values file; my container does start, but the dynamic PV is not mounted into that container.
Is the use case I am trying to achieve technically possible at the Kubernetes level?
The full YAML file is here for reference:
https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/main/jupyterhub/values.yaml
ConfigMap for reference:
singleuser:
baseUrl: /
cloudMetadata:
enabled: true
ip: xx.xx.xx.xx
cpu: {}
events: true
extraAnnotations: {}
extraConfigFiles:
config_files:
- cm_key: ''
content: ''
file_path: ''
- cm_key: ''
content: ''
file_path: ''
enabled: false
extraContainers:
- image: 'docker:19.03-rc-dind'
lifecycle:
postStart:
exec:
command:
- sh
- '-c'
- update-ca-certificates; echo Certificates Updated
name: dind
securityContext:
privileged: true
volumeMounts:
- mountPath: /var/lib/docker
name: dind-storage
- mountPath: /usr/local/share/ca-certificates/
name: docker-cert
extraEnv:
ACTUAL_HADOOP_CONF_DIR: ''
ACTUAL_HIVE_CONF_DIR: ''
ACTUAL_SPARK_CONF_DIR: ''
CDH_PARCEL_DIR: ''
DOCKER_HOST: ''
JAVA_HOME:
LIVY_URL: ''
SPARK2_PARCEL_DIR: ''
extraLabels:
hub.jupyter.org/network-access-hub: 'true'
extraNodeAffinity:
preferred: []
required: []
extraPodAffinity:
preferred: []
required: []
extraPodAntiAffinity:
preferred: []
required: []
extraPodConfig: {}
extraResource:
guarantees: {}
limits: {}
extraTolerations: []
fsGid: 0
image:
name: >-
/jupyterhub/jpt-spark-magic
pullPolicy: IfNotPresent
tag: xxx
imagePullSecret:
email: null
enabled: false
registry: null
username: null
initContainers: []
lifecycleHooks: {}
memory:
guarantee: 1G
networkPolicy:
egress:
- to:
- ipBlock:
cidr: 0.0.0.0/0
except:
- 169.254.169.254/32
enabled: false
ingress: []
networkTools:
image:
name: >-
/k8s-hub-multispawn
pullPolicy: IfNotPresent
tag: '12345'
nodeSelector: {}
profileList:
- description: Python for data enthusiasts
display_name: 0
kubespawner_override:
cmd:
- jpt-entry-cmd.sh
cpu_limit: 1
environment:
XYZ_SERVICE_URL: 'http://XYZ:8080'
CURL_CA_BUNDLE: /etc/ssl/certs/ca-certificates.crt
DOCKER_HOST: 'tcp://localhost:2375'
HADOOP_CONF_DIR: /etc/hadoop/conf
HADOOP_HOME: /usr/hdp/3.1.5.6091-7/hadoop/
HDP_DIR: /usr/hdp/3.1.5.6091-7
HDP_HOME_DIR: /usr/hdp/3.1.5.6091-7
HDP_VERSION: 3.1.5.6091-7
HIVE_CONF_DIR: /usr/hdp/3.1.5.6091-7/hive
HIVE_HOME: /usr/hdp/3.1.5.6091-7/hive
INTEGRATION_ENV: HDP3
JAVA_HOME: /usr/jdk64/jdk1.8.0_112
LD_LIBRARY_PATH: >-
/usr/hdp/3.1.5.6091-7/hadoop/lib/native:/usr/jdk64/jdk1.8.0_112/jre:/usr/hdp/3.1.5.6091-7/usr/lib/:/usr/hdp/3.1.5.6091-7/usr/lib/
LIVY_URL: 'http://ammaster01.fake.org:8999'
MLFLOW_TRACKING_URI: 'http://mlflow:5100'
NO_PROXY: mlflow
SPARK_CONF_DIR: /etc/spark2/conf
SPARK_HOME: /usr/hdp/3.1.5.6091-7/spark2
SPARK2_PARCEL_DIR: /usr/hdp/3.1.5.6091-7/spark2
TOOLS_BASE_PATH: /usr/local/bin
image: >-
/jupyterhub/jpt-spark-magic:1.1.2
mem_limit: 4096M
uid: 0
- description: R for data enthusiasts
display_name: 1
kubespawner_override:
cmd:
- start-all.sh
environment:
XYZ_SERVICE_URL: 'http://XYZ-service:8080'
DISABLE_AUTH: 'true'
XYZ: /home/rstudio/kitematic
image: '/jupyterhub/rstudio:364094'
uid: 0
- description: Python for data enthusiasts test2
display_name: 2
kubespawner_override:
cmd:
- jpt-entry-cmd.sh
cpu_limit: 4
environment:
XYZ_SERVICE_URL: 'http://XYZ-service:8080'
CURL_CA_BUNDLE: /etc/ssl/certs/ca-certificates.crt
DOCKER_HOST: 'tcp://localhost:2375'
HADOOP_CONF_DIR: /etc/hadoop/conf
HADOOP_HOME: /usr/hdp/3.1.5.6091-7/hadoop/
HDP_DIR: /usr/hdp/3.1.5.6091-7
HDP_HOME_DIR: /usr/hdp/3.1.5.6091-7
HDP_VERSION: 3.1.5.6091-7
HIVE_CONF_DIR: /usr/hdp/3.1.5.6091-7/hive
HIVE_HOME: /usr/hdp/3.1.5.6091-7/hive
INTEGRATION_ENV: HDP3
JAVA_HOME: /usr/jdk64/jdk1.8.0_112
LD_LIBRARY_PATH: >-
/usr/hdp/3.1.5.6091-7/hadoop/lib/native:/usr/jdk64/jdk1.8.0_112/jre:/usr/hdp/3.1.5.6091-7/usr/lib/:/usr/hdp/3.1.5.6091-7/usr/lib/
LIVY_URL: 'http://xyz:8999'
MLFLOW_TRACKING_URI: 'http://mlflow:5100'
NO_PROXY: mlflow
SPARK_CONF_DIR: /etc/spark2/conf
SPARK_HOME: /usr/hdp/3.1.5.6091-7/spark2
SPARK2_PARCEL_DIR: /usr/hdp/3.1.5.6091-7/spark2
TOOLS_BASE_PATH: /usr/local/bin
image: >-
/jupyterhub/jpt-spark-magic:1.1.2
mem_limit: 8192M
uid: 0
- description: Python for data enthusiasts test3
display_name: 3
kubespawner_override:
cmd:
- jpt-entry-cmd.sh
cpu_limit: 8
environment:
XYZ_SERVICE_URL: 'http://XYZ-service:8080'
CURL_CA_BUNDLE: /etc/ssl/certs/ca-certificates.crt
DOCKER_HOST: 'tcp://localhost:2375'
HADOOP_CONF_DIR: /etc/hadoop/conf
HADOOP_HOME: /usr/hdp/3.1.5.6091-7/hadoop/
HDP_DIR: /usr/hdp/3.1.5.6091-7
HDP_HOME_DIR: /usr/hdp/3.1.5.6091-7
HDP_VERSION: 3.1.5.6091-7
HIVE_CONF_DIR: /usr/hdp/3.1.5.6091-7/hive
HIVE_HOME: /usr/hdp/3.1.5.6091-7/hive
INTEGRATION_ENV: HDP3
JAVA_HOME: /usr/jdk64/jdk1.8.0_112
LD_LIBRARY_PATH: >-
/usr/hdp/3.1.5.6091-7/hadoop/lib/native:/usr/jdk64/jdk1.8.0_112/jre:/usr/hdp/3.1.5.6091-7/usr/lib/:/usr/hdp/3.1.5.6091-7/usr/lib/
LIVY_URL: 'http://fake.org:8999'
MLFLOW_TRACKING_URI: 'http://mlflow:5100'
NO_PROXY: mlflow
SPARK_CONF_DIR: /etc/spark2/conf
SPARK_HOME: /usr/hdp/3.1.5.6091-7/spark2
SPARK2_PARCEL_DIR: /usr/hdp/3.1.5.6091-7/spark2
TOOLS_BASE_PATH: /usr/local/bin
image: >-
/jupyterhub/jpt-spark-magic:1.1.2
mem_limit: 16384M
uid: 0
startTimeout: 300
storage:
capacity: 10Gi
dynamic:
pvcNameTemplate: 'claim-{username}{servername}'
storageAccessModes:
- ReadWriteOnce
storageClass: nfs-client
volumeNameTemplate: 'volume-{username}{servername}'
extraLabels: {}
extraVolumeMounts:
- mountPath: /etc/krb5.conf
name: krb
readOnly: true
- mountPath: /usr/jdk64/jdk1.8.0_112
name: java-home
readOnly: true
- mountPath: /xyz/conda/envs
name: xyz-conda-envs
readOnly: false
- mountPath: /usr/hdp/
name: bigdata
readOnly: true
subPath: usr-hdp
- mountPath: /etc/hadoop/
name: bigdata
readOnly: true
subPath: HDP
- mountPath: /etc/hive/
name: bigdata
readOnly: true
subPath: hdp-hive
- mountPath: /etc/spark2/
name: bigdata
readOnly: true
subPath: hdp-spark2
extraVolumes:
- emptyDir: {}
name: dind-storage
- name: docker-cert
secret:
secretName: docker-cert
- hostPath:
path: /var/lib/ut_xyz_ts/jdk1.8.0_112
type: Directory
name: java-home
- hostPath:
path: /xyz/conda/envs
type: Directory
name: xyz-conda-envs
- hostPath:
path: /etc/krb5.conf
type: File
name: krb
- name: bigdata
persistentVolumeClaim:
claimName: bigdata
homeMountPath: '/home/{username}'
static:
subPath: '{username}'
type: dynamic
uid: 0
Thanks in advance.
To answer your question: yes, two or more containers of the same pod can technically share the same volume. Refer here: https://youtu.be/GQJP9QdHHs8?t=82 .
But you also need a volumeMount (see the example in the video as well) defined in your extra container's spec, along the lines of the sketch below. If you can check that, or share the output of kubectl describe deployment <your-deployment>, I can confirm it.
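As a rough illustration, the extra container carries its own volumeMounts entry referencing the same volume the notebook container gets for its dynamic home storage. This is a sketch only: the sidecar name and image are illustrative, and whether the {username}{servername} placeholders are expanded inside extraContainers depends on the KubeSpawner version, so check the rendered pod (kubectl get pod <pod-name> -o yaml) for the actual volume name.
singleuser:
  extraContainers:
    - name: dir-monitor                                   # illustrative sidecar
      image: busybox:1.36                                 # illustrative image
      command: ["sh", "-c", "while true; do sleep 3600; done"]
      volumeMounts:
        - name: volume-{username}{servername}             # must match the volume created from volumeNameTemplate
          mountPath: /home/jovyan                         # same path as singleuser.storage.homeMountPath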

Prometheus & Alert Manager keeps crashing after updating the EKS version to 1.16

prometheus-prometheus-kube-prometheus-prometheus-0 0/2 Terminating 0 4s
alertmanager-prometheus-kube-prometheus-alertmanager-0 0/2 Terminating 0 10s
After updating the EKS cluster from 1.15 to 1.16, everything works fine except these two pods; they keep terminating and are unable to initialise, so Prometheus monitoring does not work. I get the errors below when describing the pods.
Error: failed to start container "prometheus": Error response from daemon: OCI runtime create failed: container_linux.go:362: creating new parent process caused: container_linux.go:1941: running lstat on namespace path "/proc/29271/ns/ipc" caused: lstat /proc/29271/ns/ipc: no such file or directory: unknown
Error: failed to start container "config-reloader": Error response from daemon: cannot join network of a non running container: 7e139521980afd13dad0162d6859352b0b2c855773d6d4062ee3e2f7f822a0b3
Error: cannot find volume "config" to mount into container "config-reloader"
Error: cannot find volume "config" to mount into container "prometheus"
Here is the YAML of the pod:
apiVersion: v1
kind: Pod
metadata:
annotations:
kubernetes.io/psp: eks.privileged
creationTimestamp: "2021-04-30T16:39:14Z"
deletionGracePeriodSeconds: 600
deletionTimestamp: "2021-04-30T16:49:14Z"
generateName: prometheus-prometheus-kube-prometheus-prometheus-
labels:
app: prometheus
app.kubernetes.io/instance: prometheus-kube-prometheus-prometheus
app.kubernetes.io/managed-by: prometheus-operator
app.kubernetes.io/name: prometheus
app.kubernetes.io/version: 2.26.0
controller-revision-hash: prometheus-prometheus-kube-prometheus-prometheus-56d9fcf57
operator.prometheus.io/name: prometheus-kube-prometheus-prometheus
operator.prometheus.io/shard: "0"
prometheus: prometheus-kube-prometheus-prometheus
statefulset.kubernetes.io/pod-name: prometheus-prometheus-kube-prometheus-prometheus-0
name: prometheus-prometheus-kube-prometheus-prometheus-0
namespace: mo
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: StatefulSet
name: prometheus-prometheus-kube-prometheus-prometheus
uid: 326a09f2-319c-449d-904a-1dd0019c6d80
resourceVersion: "9337443"
selfLink: /api/v1/namespaces/monitoring/pods/prometheus-prometheus-kube-prometheus-prometheus-0
uid: e2be062f-749d-488e-a6cc-42ef1396851b
spec:
containers:
- args:
- --web.console.templates=/etc/prometheus/consoles
- --web.console.libraries=/etc/prometheus/console_libraries
- --config.file=/etc/prometheus/config_out/prometheus.env.yaml
- --storage.tsdb.path=/prometheus
- --storage.tsdb.retention.time=10d
- --web.enable-lifecycle
- --storage.tsdb.no-lockfile
- --web.external-url=http://prometheus-kube-prometheus-prometheus.monitoring:9090
- --web.route-prefix=/
image: quay.io/prometheus/prometheus:v2.26.0
imagePullPolicy: IfNotPresent
name: prometheus
ports:
- containerPort: 9090
name: web
protocol: TCP
readinessProbe:
failureThreshold: 120
httpGet:
path: /-/ready
port: web
scheme: HTTP
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 3
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /etc/prometheus/config_out
name: config-out
readOnly: true
- mountPath: /etc/prometheus/certs
name: tls-assets
readOnly: true
- mountPath: /prometheus
name: prometheus-prometheus-kube-prometheus-prometheus-db
- mountPath: /etc/prometheus/rules/prometheus-prometheus-kube-prometheus-prometheus-rulefiles-0
name: prometheus-prometheus-kube-prometheus-prometheus-rulefiles-0
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: prometheus-kube-prometheus-prometheus-token-mh66q
readOnly: true
- args:
- --listen-address=:8080
- --reload-url=http://localhost:9090/-/reload
- --config-file=/etc/prometheus/config/prometheus.yaml.gz
- --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
- --watched-dir=/etc/prometheus/rules/prometheus-prometheus-kube-prometheus-prometheus-rulefiles-0
command:
- /bin/prometheus-config-reloader
env:
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: SHARD
value: "0"
image: quay.io/prometheus-operator/prometheus-config-reloader:v0.47.0
imagePullPolicy: IfNotPresent
name: config-reloader
ports:
- containerPort: 8080
name: reloader-web
protocol: TCP
resources:
limits:
cpu: 100m
memory: 50Mi
requests:
cpu: 100m
memory: 50Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /etc/prometheus/config
name: config
- mountPath: /etc/prometheus/config_out
name: config-out
- mountPath: /etc/prometheus/rules/prometheus-prometheus-kube-prometheus-prometheus-rulefiles-0
name: prometheus-prometheus-kube-prometheus-prometheus-rulefiles-0
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: prometheus-kube-prometheus-prometheus-token-mh66q
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
hostname: prometheus-prometheus-kube-prometheus-prometheus-0
nodeName: ip-10-1-49-45.ec2.internal
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 2000
runAsGroup: 2000
runAsNonRoot: true
runAsUser: 1000
serviceAccount: prometheus-kube-prometheus-prometheus
serviceAccountName: prometheus-kube-prometheus-prometheus
subdomain: prometheus-operated
terminationGracePeriodSeconds: 600
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- name: config
secret:
defaultMode: 420
secretName: prometheus-prometheus-kube-prometheus-prometheus
- name: tls-assets
secret:
defaultMode: 420
secretName: prometheus-prometheus-kube-prometheus-prometheus-tls-assets
- emptyDir: {}
name: config-out
- configMap:
defaultMode: 420
name: prometheus-prometheus-kube-prometheus-prometheus-rulefiles-0
name: prometheus-prometheus-kube-prometheus-prometheus-rulefiles-0
- emptyDir: {}
name: prometheus-prometheus-kube-prometheus-prometheus-db
- name: prometheus-kube-prometheus-prometheus-token-mh66q
secret:
defaultMode: 420
secretName: prometheus-kube-prometheus-prometheus-token-mh66q
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2021-04-30T16:39:14Z"
status: "True"
type: PodScheduled
phase: Pending
qosClass: Burstable
If someone needs to know the answer: in my case (the situation above) there were two Prometheus operators running in different namespaces, one in default and another in the monitoring namespace. I removed the one from the default namespace, and that resolved the pod-crashing issue.

Unable to deploy mongodb community operator in openshift

I'm trying to deploy the mongodb community operator in openshift 3.11, using the following commands:
git clone https://github.com/mongodb/mongodb-kubernetes-operator.git
cd mongodb-kubernetes-operator
oc new-project mongodb
oc create -f deploy/crds/mongodb.com_mongodb_crd.yaml -n mongodb
oc create -f deploy/operator/role.yaml -n mongodb
oc create -f deploy/operator/role_binding.yaml -n mongodb
oc create -f deploy/operator/service_account.yaml -n mongodb
oc apply -f deploy/openshift/operator_openshift.yaml -n mongodb
oc apply -f deploy/crds/mongodb.com_v1_mongodb_openshift_cr.yaml -n mongodb
The operator pod is running successfully, but the MongoDB replica set pods do not spin up. The error is as follows:
[kubenode@master mongodb-kubernetes-operator]$ oc get pods
NAME READY STATUS RESTARTS AGE
example-openshift-mongodb-0 1/2 CrashLoopBackOff 4 2m
mongodb-kubernetes-operator-66bfcbcf44-9xvj7 1/1 Running 0 2m
[kubenode@master mongodb-kubernetes-operator]$ oc logs -f example-openshift-mongodb-0 -c mongodb-agent
panic: Failed to get current user: user: unknown userid 1000510000
goroutine 1 [running]:
com.tengen/cm/util.init.3()
/data/mci/2f46ec94982c5440960d2b2bf2b6ae15/mms-automation/build/go-dependencies/src/com.tengen/cm/util/user.go:14 +0xe5
I have gone through all of the issues on the mongodb-kubernetes-operator repository that relate to this problem (reference) and found a suggestion to set the MANAGED_SECURITY_CONTEXT environment variable to true in the operator, mongodb and mongodb-agent containers.
I have done so for all of these containers (for the operator deployment, roughly as sketched below), but am still facing the same issue.
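For reference, a minimal sketch of how that variable ends up in the operator Deployment; only the env entry is shown, the field placement follows the standard Deployment schema, and the same idea applies to the StatefulSet containers:
# excerpt (sketch) from the operator Deployment manifest
spec:
  template:
    spec:
      containers:
        - name: mongodb-kubernetes-operator
          env:
            - name: MANAGED_SECURITY_CONTEXT
              value: "true"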
Here is confirmation that the environment variables are set correctly:
[kubenode@master mongodb-kubernetes-operator]$ oc set env statefulset.apps/example-openshift-mongodb --list
# statefulsets/example-openshift-mongodb, container mongodb-agent
AGENT_STATUS_FILEPATH=/var/log/mongodb-mms-automation/healthstatus/agent-health-status.json
AUTOMATION_CONFIG_MAP=example-openshift-mongodb-config
HEADLESS_AGENT=true
MANAGED_SECURITY_CONTEXT=true
# POD_NAMESPACE from field path metadata.namespace
# statefulsets/example-openshift-mongodb, container mongod
AGENT_STATUS_FILEPATH=/healthstatus/agent-health-status.json
MANAGED_SECURITY_CONTEXT=true
[kubenode@master mongodb-kubernetes-operator]$ oc set env deployment.apps/mongodb-kubernetes-operator --list
# deployments/mongodb-kubernetes-operator, container mongodb-kubernetes-operator
# WATCH_NAMESPACE from field path metadata.namespace
# POD_NAME from field path metadata.name
MANAGED_SECURITY_CONTEXT=true
OPERATOR_NAME=mongodb-kubernetes-operator
AGENT_IMAGE=quay.io/mongodb/mongodb-agent:10.19.0.6562-1
VERSION_UPGRADE_HOOK_IMAGE=quay.io/mongodb/mongodb-kubernetes-operator-version-upgrade-post-start-hook:1.0.2
Operator Information
Operator Version: 0.3.0
MongoDB Image used: 4.2.6
Cluster Information
[kubenode@master mongodb-kubernetes-operator]$ openshift version
openshift v3.11.0+62803d0-1
[kubenode@master mongodb-kubernetes-operator]$ kubectl version
Client Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.0+d4cacc0", GitCommit:"d4cacc0", GitTreeState:"clean", BuildDate:"2018-10-15T09:45:30Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.0+d4cacc0", GitCommit:"d4cacc0", GitTreeState:"clean", BuildDate:"2020-12-07T17:59:40Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}
Update
When I check the replica pod's YAML (see below), I see three occurrences of the runAsUser security context set to 1000510000. I'm not sure how this is being set, since I'm not setting it manually.
[kubenode@master mongodb-kubernetes-operator]$ oc get -o yaml pod example-openshift-mongodb-0
apiVersion: v1
kind: Pod
metadata:
annotations:
openshift.io/scc: restricted
creationTimestamp: 2021-01-19T07:45:05Z
generateName: example-openshift-mongodb-
labels:
app: example-openshift-mongodb-svc
controller-revision-hash: example-openshift-mongodb-6549495b
statefulset.kubernetes.io/pod-name: example-openshift-mongodb-0
name: example-openshift-mongodb-0
namespace: mongodb
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: StatefulSet
name: example-openshift-mongodb
uid: 3e91eb40-5a2a-11eb-a5e0-0050569b1f59
resourceVersion: "15616863"
selfLink: /api/v1/namespaces/mongodb/pods/example-openshift-mongodb-0
uid: 3ea17a28-5a2a-11eb-a5e0-0050569b1f59
spec:
containers:
- command:
- agent/mongodb-agent
- -cluster=/var/lib/automation/config/cluster-config.json
- -skipMongoStart
- -noDaemonize
- -healthCheckFilePath=/var/log/mongodb-mms-automation/healthstatus/agent-health-status.json
- -serveStatusPort=5000
- -useLocalMongoDbTools
env:
- name: AGENT_STATUS_FILEPATH
value: /var/log/mongodb-mms-automation/healthstatus/agent-health-status.json
- name: AUTOMATION_CONFIG_MAP
value: example-openshift-mongodb-config
- name: HEADLESS_AGENT
value: "true"
- name: MANAGED_SECURITY_CONTEXT
value: "true"
- name: POD_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
image: quay.io/mongodb/mongodb-agent:10.19.0.6562-1
imagePullPolicy: Always
name: mongodb-agent
readinessProbe:
exec:
command:
- /var/lib/mongodb-mms-automation/probes/readinessprobe
failureThreshold: 60
initialDelaySeconds: 5
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources: {}
securityContext:
capabilities:
drop:
- KILL
- MKNOD
- SETGID
- SETUID
runAsUser: 1000510000
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/lib/automation/config
name: automation-config
readOnly: true
- mountPath: /data
name: data-volume
- mountPath: /var/lib/mongodb-mms-automation/authentication
name: example-openshift-mongodb-agent-scram-credentials
- mountPath: /var/log/mongodb-mms-automation/healthstatus
name: healthstatus
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: mongodb-kubernetes-operator-token-lr9l4
readOnly: true
- command:
- /bin/sh
- -c
- |2
# run post-start hook to handle version changes
/hooks/version-upgrade
# wait for config to be created by the agent
while [ ! -f /data/automation-mongod.conf ]; do sleep 3 ; done ; sleep 2 ;
# start mongod with this configuration
exec mongod -f /data/automation-mongod.conf ;
env:
- name: AGENT_STATUS_FILEPATH
value: /healthstatus/agent-health-status.json
- name: MANAGED_SECURITY_CONTEXT
value: "true"
image: mongo:4.2.6
imagePullPolicy: IfNotPresent
name: mongod
resources: {}
securityContext:
capabilities:
drop:
- KILL
- MKNOD
- SETGID
- SETUID
runAsUser: 1000510000
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /data
name: data-volume
- mountPath: /var/lib/mongodb-mms-automation/authentication
name: example-openshift-mongodb-agent-scram-credentials
- mountPath: /healthstatus
name: healthstatus
- mountPath: /hooks
name: hooks
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: mongodb-kubernetes-operator-token-lr9l4
readOnly: true
dnsPolicy: ClusterFirst
hostname: example-openshift-mongodb-0
imagePullSecrets:
- name: mongodb-kubernetes-operator-dockercfg-jhplw
initContainers:
- command:
- cp
- version-upgrade-hook
- /hooks/version-upgrade
image: quay.io/mongodb/mongodb-kubernetes-operator-version-upgrade-post-start-hook:1.0.2
imagePullPolicy: Always
name: mongod-posthook
resources: {}
securityContext:
capabilities:
drop:
- KILL
- MKNOD
- SETGID
- SETUID
runAsUser: 1000510000
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /hooks
name: hooks
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: mongodb-kubernetes-operator-token-lr9l4
readOnly: true
nodeName: node1.192.168.27.116.nip.io
nodeSelector:
node-role.kubernetes.io/compute: "true"
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 1000510000
seLinuxOptions:
level: s0:c23,c2
serviceAccount: mongodb-kubernetes-operator
serviceAccountName: mongodb-kubernetes-operator
subdomain: example-openshift-mongodb-svc
terminationGracePeriodSeconds: 30
volumes:
- name: data-volume
persistentVolumeClaim:
claimName: data-volume-example-openshift-mongodb-0
- name: automation-config
secret:
defaultMode: 416
secretName: example-openshift-mongodb-config
- name: example-openshift-mongodb-agent-scram-credentials
secret:
defaultMode: 384
secretName: example-openshift-mongodb-agent-scram-credentials
- emptyDir: {}
name: healthstatus
- emptyDir: {}
name: hooks
- name: mongodb-kubernetes-operator-token-lr9l4
secret:
defaultMode: 420
secretName: mongodb-kubernetes-operator-token-lr9l4
status:
conditions:
- lastProbeTime: null
lastTransitionTime: 2021-01-19T07:46:45Z
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: 2021-01-19T07:46:39Z
message: 'containers with unready status: [mongodb-agent]'
reason: ContainersNotReady
status: "False"
type: Ready
- lastProbeTime: null
lastTransitionTime: null
message: 'containers with unready status: [mongodb-agent]'
reason: ContainersNotReady
status: "False"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: 2021-01-19T07:45:05Z
status: "True"
type: PodScheduled
containerStatuses:
- containerID: docker://bd3ede9178bb78267bc19d1b5da0915d3bcd1d4dcee3e142c7583424bd2aa777
image: docker.io/mongo:4.2.6
imageID: docker-pullable://docker.io/mongo@sha256:c880f6b56f443bb4d01baa759883228cd84fa8d78fa1a36001d1c0a0712b5a07
lastState: {}
name: mongod
ready: true
restartCount: 0
state:
running:
startedAt: 2021-01-19T07:46:55Z
- containerID: docker://5e39c0b6269b8231bbf9cabb4ff3457d9f91e878eff23953e318a9475fb8a90e
image: quay.io/mongodb/mongodb-agent:10.19.0.6562-1
imageID: docker-pullable://quay.io/mongodb/mongodb-agent@sha256:790c2670ef7cefd61cfaabaf739de16dbd2e07dc3b539add0da21ab7d5ac7626
lastState:
terminated:
containerID: docker://5e39c0b6269b8231bbf9cabb4ff3457d9f91e878eff23953e318a9475fb8a90e
exitCode: 2
finishedAt: 2021-01-19T19:39:58Z
reason: Error
startedAt: 2021-01-19T19:39:58Z
name: mongodb-agent
ready: false
restartCount: 144
state:
waiting:
message: Back-off 5m0s restarting failed container=mongodb-agent pod=example-openshift-mongodb-0_mongodb(3ea17a28-5a2a-11eb-a5e0-0050569b1f59)
reason: CrashLoopBackOff
hostIP: 192.168.27.116
initContainerStatuses:
- containerID: docker://7c31cef2a68e3e6100c2cc9c83e3780313f1e8ab43bebca79ad4d48613f124bd
image: quay.io/mongodb/mongodb-kubernetes-operator-version-upgrade-post-start-hook:1.0.2
imageID: docker-pullable://quay.io/mongodb/mongodb-kubernetes-operator-version-upgrade-post-start-hook@sha256:e99105b1c54e12913ddaf470af8025111a6e6e4c8917fc61be71d1bc0328e7d7
lastState: {}
name: mongod-posthook
ready: true
restartCount: 0
state:
terminated:
containerID: docker://7c31cef2a68e3e6100c2cc9c83e3780313f1e8ab43bebca79ad4d48613f124bd
exitCode: 0
finishedAt: 2021-01-19T07:46:45Z
reason: Completed
startedAt: 2021-01-19T07:46:44Z
phase: Running
podIP: 10.129.0.119
qosClass: BestEffort
startTime: 2021-01-19T07:46:39Z

Why Istio "Authentication Policy" Example Page isn't working as expected?

The article here: https://istio.io/docs/tasks/security/authn-policy/
Specifically, when I follow the instructions in the Setup section, I can't connect to any of the httpbin instances residing in the foo and bar namespaces, but the one in legacy is fine. I suspect something is wrong with the sidecar proxy that was installed.
Here is the httpbin pod's YAML (after injection with the istioctl kube-inject --includeIPRanges "10.32.0.0/16" command). I use --includeIPRanges so that the pod can communicate with external IPs (for debugging purposes, e.g. to install the dnsutils package).
apiVersion: v1
kind: Pod
metadata:
annotations:
sidecar.istio.io/inject: "true"
sidecar.istio.io/status: '{"version":"4120ea817406fd7ed43b7ecf3f2e22abe453c44d3919389dcaff79b210c4cd86","initContainers":["istio-init"],"containers":["istio-proxy"],"volumes":["istio-envoy","istio-certs"],"imagePullSecrets":null}'
creationTimestamp: 2018-08-15T11:40:59Z
generateName: httpbin-8b9cf99f5-
labels:
app: httpbin
pod-template-hash: "465795591"
version: v1
name: httpbin-8b9cf99f5-9c47z
namespace: foo
ownerReferences:
- apiVersion: extensions/v1beta1
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: httpbin-8b9cf99f5
uid: 1450d75d-a080-11e8-aece-42010a940168
resourceVersion: "65722138"
selfLink: /api/v1/namespaces/foo/pods/httpbin-8b9cf99f5-9c47z
uid: 1454b68d-a080-11e8-aece-42010a940168
spec:
containers:
- image: docker.io/citizenstig/httpbin
imagePullPolicy: IfNotPresent
name: httpbin
ports:
- containerPort: 8000
protocol: TCP
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: default-token-pkpvf
readOnly: true
- args:
- proxy
- sidecar
- --configPath
- /etc/istio/proxy
- --binaryPath
- /usr/local/bin/envoy
- --serviceCluster
- httpbin
- --drainDuration
- 45s
- --parentShutdownDuration
- 1m0s
- --discoveryAddress
- istio-pilot.istio-system:15007
- --discoveryRefreshDelay
- 1s
- --zipkinAddress
- zipkin.istio-system:9411
- --connectTimeout
- 10s
- --statsdUdpAddress
- istio-statsd-prom-bridge.istio-system.istio-system:9125
- --proxyAdminPort
- "15000"
- --controlPlaneAuthPolicy
- NONE
env:
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
- name: INSTANCE_IP
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.podIP
- name: ISTIO_META_POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: ISTIO_META_INTERCEPTION_MODE
value: REDIRECT
image: docker.io/istio/proxyv2:1.0.0
imagePullPolicy: IfNotPresent
name: istio-proxy
resources:
requests:
cpu: 10m
securityContext:
privileged: false
readOnlyRootFilesystem: true
runAsUser: 1337
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /etc/istio/proxy
name: istio-envoy
- mountPath: /etc/certs/
name: istio-certs
readOnly: true
dnsPolicy: ClusterFirst
initContainers:
- args:
- -p
- "15001"
- -u
- "1337"
- -m
- REDIRECT
- -i
- 10.32.0.0/16
- -x
- ""
- -b
- 8000,
- -d
- ""
image: docker.io/istio/proxy_init:1.0.0
imagePullPolicy: IfNotPresent
name: istio-init
resources: {}
securityContext:
capabilities:
add:
- NET_ADMIN
privileged: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
nodeName: gke-tvlk-data-dev-default-medium-pool-46397778-q2sb
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- name: default-token-pkpvf
secret:
defaultMode: 420
secretName: default-token-pkpvf
- emptyDir:
medium: Memory
name: istio-envoy
- name: istio-certs
secret:
defaultMode: 420
optional: true
secretName: istio.default
status:
conditions:
- lastProbeTime: null
lastTransitionTime: 2018-08-15T11:41:01Z
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: 2018-08-15T11:44:28Z
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: 2018-08-15T11:40:59Z
status: "True"
type: PodScheduled
containerStatuses:
- containerID: docker://758e130a4c31a15c1b8bc1e1f72bd7739d5fa1103132861eea9ae1a6ae1f080e
image: citizenstig/httpbin:latest
imageID: docker-pullable://citizenstig/httpbin@sha256:b81c818ccb8668575eb3771de2f72f8a5530b515365842ad374db76ad8bcf875
lastState: {}
name: httpbin
ready: true
restartCount: 0
state:
running:
startedAt: 2018-08-15T11:41:01Z
- containerID: docker://9c78eac46a99457f628493975f5b0c5bbffa1dac96dab5521d2efe4143219575
image: istio/proxyv2:1.0.0
imageID: docker-pullable://istio/proxyv2@sha256:77915a0b8c88cce11f04caf88c9ee30300d5ba1fe13146ad5ece9abf8826204c
lastState:
terminated:
containerID: docker://52299a80a0fa8949578397357861a9066ab0148ac8771058b83e4c59e422a029
exitCode: 255
finishedAt: 2018-08-15T11:44:27Z
reason: Error
startedAt: 2018-08-15T11:41:02Z
name: istio-proxy
ready: true
restartCount: 1
state:
running:
startedAt: 2018-08-15T11:44:28Z
hostIP: 10.32.96.27
initContainerStatuses:
- containerID: docker://f267bb44b70d2d383ce3f9943ab4e917bb0a42ecfe17fe0ed294bde4d8284c58
image: istio/proxy_init:1.0.0
imageID: docker-pullable://istio/proxy_init@sha256:345c40053b53b7cc70d12fb94379e5aa0befd979a99db80833cde671bd1f9fad
lastState: {}
name: istio-init
ready: true
restartCount: 0
state:
terminated:
containerID: docker://f267bb44b70d2d383ce3f9943ab4e917bb0a42ecfe17fe0ed294bde4d8284c58
exitCode: 0
finishedAt: 2018-08-15T11:41:00Z
reason: Completed
startedAt: 2018-08-15T11:41:00Z
phase: Running
podIP: 10.32.19.61
qosClass: Burstable
startTime: 2018-08-15T11:40:59Z
Here is an example command where I get the error (sleep.legacy -> httpbin.foo):
> kubectl exec $(kubectl get pod -l app=sleep -n legacy -o jsonpath={.items..metadata.name}) -c sleep -n legacy -- curl http://httpbin.foo:8000/ip -s -o /dev/null -w "%{http_code}\n"
000
command terminated with exit code 7
Here is an example command where I get a success status (sleep.legacy -> httpbin.legacy):
> kubectl exec $(kubectl get pod -l app=sleep -n legacy -o jsonpath={.items..metadata.name}) -c sleep -n legacy -- curl http://httpbin.legacy:8000/ip -s -o /dev/null -w "%{http_code}\n"
200
I have followed the instructions to ensure there is no mTLS policy defined, etc.:
> kubectl get policies.authentication.istio.io --all-namespaces
No resources found.
> kubectl get meshpolicies.authentication.istio.io
No resources found.
> kubectl get destinationrules.networking.istio.io --all-namespaces -o yaml | grep "host:"
host: istio-policy.istio-system.svc.cluster.local
host: istio-telemetry.istio-system.svc.cluster.local
Never mind, I think I found the cause: some configuration was messed up on my side.
If you look at the statsd address, it is defined with an unresolvable hostname, istio-statsd-prom-bridge.istio-system.istio-system:9125 (the namespace suffix is duplicated). I noticed this after seeing that the proxy container had restarted/crashed multiple times.
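For reference, the fix would presumably be to drop the duplicated namespace suffix from the sidecar argument shown in the pod spec above; a sketch of the corrected pair of args, everything else unchanged:
- --statsdUdpAddress
- istio-statsd-prom-bridge.istio-system:9125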