no endpoints available for service \"kubernetes-dashboard\" after kubernetes upgrade from 1.17.5 to 1.18.0 - kubernetes-dashboard

I have upgraded my 3 master and 6 nodes kubernetes cluster from kubernetes version 1.17.5 to 1.18.0 and post upgrade kubernetes-dashboard stopped working with below error :
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {
},
"status": "Failure",
"message": "no endpoints available for service \"kubernetes-dashboard\"",
"reason": "ServiceUnavailable",
"code": 503
}
dashboard status Before upgrade:
kubectl get deployment kubernetes-dashboard --namespace=kubernetes-dashboard
NAME READY UP-TO-DATE AVAILABLE AGE
kubernetes-dashboard 1/1 1 1 35m
kubectl describe service kubernetes-dashboard --namespace=kubernetes-dashboard
Name: kubernetes-dashboard
Namespace: kubernetes-dashboard
Labels: k8s-app=kubernetes-dashboard
Annotations: Selector: k8s-app=kubernetes-dashboard
Type: ClusterIP
IP: 10.y.y.y
Port: <unset> 443/TCP
TargetPort: 8443/TCP
Endpoints: 192.168.x.x:8443
Session Affinity: None
Events: <none>
kubectl --namespace=kubernetes-dashboard get ep kubernetes-dashboard
NAME ENDPOINTS AGE
kubernetes-dashboard 192.168.x.x:8443 36m
dashboard status After Upgrade:
kubectl get deployment kubernetes-dashboard --namespace=kubernetes-dashboard
NAME READY UP-TO-DATE AVAILABLE AGE
kubernetes-dashboard 0/1 0 0 88m
kubectl describe service kubernetes-dashboard --namespace=kubernetes-dashboard
Name: kubernetes-dashboard
Namespace: kubernetes-dashboard
Labels: k8s-app=kubernetes-dashboard
Annotations: <none>
Selector: k8s-app=kubernetes-dashboard
Type: ClusterIP
IP: 10.y.y.y
Port: <unset> 443/TCP
TargetPort: 8443/TCP
Endpoints: <none>
Session Affinity: None
Events: <none>
kubectl --namespace=kubernetes-dashboard get ep kubernetes-dashboard
NAME ENDPOINTS AGE
kubernetes-dashboard <none> 88m
No pods are present under namespace kubernetes-dashboard while it was there before upgrade. Here is some part of kubernetes-dashboard deployment yaml output :
manager: kubectl
operation: Update
time: "2020-05-14T20:56:04Z"
name: kubernetes-dashboard
namespace: kubernetes-dashboard
resourceVersion: "66272"
selfLink: /apis/apps/v1/namespaces/kubernetes-dashboard/deployments/kubernetes-dashboard
uid: 11d1b3e7-b333-4cf0-a542-a72a09f3d56a
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
k8s-app: kubernetes-dashboard
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
k8s-app: kubernetes-dashboard
spec:
containers:
- args:
- --auto-generate-certificates
- --namespace=kubernetes-dashboard
image: kubernetesui/dashboard:v2.0.0
imagePullPolicy: Always
livenessProbe:
failureThreshold: 3
httpGet:
path: /
port: 8443
scheme: HTTPS
initialDelaySeconds: 30
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 30
name: kubernetes-dashboard
ports:
- containerPort: 8443
protocol: TCP
resources: {}
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsGroup: 2001
runAsUser: 1001
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /certs
name: kubernetes-dashboard-certs
- mountPath: /tmp
name: tmp-volume
dnsPolicy: ClusterFirst
nodeSelector:
kubernetes.io/os: linux
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: kubernetes-dashboard
serviceAccountName: kubernetes-dashboard
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoSchedule
key: node-role.kubernetes.io/master
volumes:
- name: kubernetes-dashboard-certs
secret:
defaultMode: 420
secretName: kubernetes-dashboard-certs
- emptyDir: {}
name: tmp-volume
status:
conditions:
- lastTransitionTime: "2020-05-14T20:56:04Z"
lastUpdateTime: "2020-05-14T20:56:04Z"
message: Created new replica set "kubernetes-dashboard-7b544877d5"
reason: NewReplicaSetCreated
status: "True"
type: Progressing
- lastTransitionTime: "2020-05-14T20:56:04Z"
lastUpdateTime: "2020-05-14T20:56:04Z"
message: Deployment does not have minimum availability.
reason: MinimumReplicasUnavailable
status: "False"
type: Available
- lastTransitionTime: "2020-05-14T20:56:04Z"
lastUpdateTime: "2020-05-14T20:56:04Z"
message: 'pods "kubernetes-dashboard-7b544877d5-" is forbidden: unable to validate
against any pod security policy: []'
reason: FailedCreate
status: "True"
type: ReplicaFailure
observedGeneration: 1
unavailableReplicas: 1

Related

Kubernetes Service does not have active Endpoint

I created a Deployment, Service and an Ingress. Unfortunately, the ingress-nginx-controller pods are complaining that my Service does not have an Active Endpoint:
controller.go:920] Service "<namespace>/web-server" does not have any active Endpoint.
My Service definition:
apiVersion: v1
kind: Service
metadata:
annotations:
prometheus.io/should_be_scraped: "false"
creationTimestamp: "2021-06-22T07:07:18Z"
labels:
chart: <namespace>-core-1.9.2
release: <namespace>
name: web-server
namespace: <namespace>
resourceVersion: "9050796"
selfLink: /api/v1/namespaces/<namespace>/services/web-server
uid: 82b3c3b4-a181-4ba2-887a-a4498346bc81
spec:
clusterIP: 10.233.56.52
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
selector:
app: web-server
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
My Deployment definition:
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "1"
creationTimestamp: "2021-06-22T07:07:19Z"
generation: 1
labels:
app: web-server
chart: <namespace>-core-1.9.2
release: <namespace>
name: web-server
namespace: <namespace>
resourceVersion: "9051062"
selfLink: /apis/apps/v1/namespaces/<namespace>/deployments/web-server
uid: fb085727-9e8a-4931-8067-fd4ed410b8ca
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app: web-server
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app: web-server
spec:
containers:
- env:
<removed environment variables>
image: <url>/<namespace>/web-server:1.10.1
imagePullPolicy: IfNotPresent
name: web-server
ports:
- containerPort: 8080
name: http
protocol: TCP
- containerPort: 8082
name: metrics
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /actuator/health
port: 8080
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 1
resources:
limits:
memory: 1Gi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /config
name: <namespace>-config
dnsPolicy: ClusterFirst
hostAliases:
- hostnames:
- <url>
ip: 10.0.1.178
imagePullSecrets:
- name: registry-pull-secret
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
volumes:
- configMap:
defaultMode: 420
name: <namespace>-config
name: <namespace>-config
status:
conditions:
- lastTransitionTime: "2021-06-22T07:07:19Z"
lastUpdateTime: "2021-06-22T07:07:19Z"
message: Deployment does not have minimum availability.
reason: MinimumReplicasUnavailable
status: "False"
type: Available
- lastTransitionTime: "2021-06-22T07:17:20Z"
lastUpdateTime: "2021-06-22T07:17:20Z"
message: ReplicaSet "web-server-6df6d6565b" has timed out progressing.
reason: ProgressDeadlineExceeded
status: "False"
type: Progressing
observedGeneration: 1
replicas: 1
unavailableReplicas: 1
updatedReplicas: 1
In the same namespace, I have more Service and Deployment resources, all of them work, except this one (+ another, see below).
# kubectl get endpoints -n <namespace>
NAME ENDPOINTS AGE
activemq 10.233.64.3:61613,10.233.64.3:8161,10.233.64.3:61616 + 1 more... 26d
content-backend 10.233.96.17:8080 26d
datastore3 10.233.96.16:8080 26d
web-server 74m
web-server-metrics 26d
As you can see, the selector/label are the same (web-server) in the Service as well as in the Deployment definition.
C-Nan has solved the problem, and has posted a solution as a comment:
I found the issue. The Pod was started, but not in Ready state due to a failing readinessProbe. I wasn't aware that an endpoint wouldn't be created until the Pod is in Ready state. Removing the readinessProbe created the Endpoint.
From the status of your Deployment, it seems that no pod is running for the Deployment.
status:
conditions:
- lastTransitionTime: "2021-06-22T07:07:19Z"
lastUpdateTime: "2021-06-22T07:07:19Z"
message: Deployment does not have minimum availability.
reason: MinimumReplicasUnavailable
status: "False"
type: Available
- lastTransitionTime: "2021-06-22T07:17:20Z"
lastUpdateTime: "2021-06-22T07:17:20Z"
message: ReplicaSet "web-server-6df6d6565b" has timed out progressing.
reason: ProgressDeadlineExceeded
status: "False"
type: Progressing
observedGeneration: 1
replicas: 1
unavailableReplicas: 1
updatedReplicas: 1
The unavailableReplicas: 1 filed is indicating that the desired pod is not available. As a result the Service has no active endpoint.
You can describe the deployment to see why the pod is unavailable.

Calico: networkPlugin cni failed to set up pod, i/o timeout

I have got an issue with deploy some pods on my k8s node. The error is following:
Failed create pod sandbox: rpc error: code = Unknown desc = failed to
set up sandbox container
"7da8bce09dd6820a65754073b1b4e52e640291dcb82f1da87ae99570c6964d1b"
network for pod "webservices-8675d4667d-7mdf9": networkPlugin cni
failed to set up pod "webservices-8675d4667d-7mdf9_default" network:
Get https://[10.233.0.1]:443/api/v1/namespaces/default: dial tcp
10.233.0.1:443: i/o timeout
However, some pods are deployed, for example kubernetes-dashboard:
Update:
NAME STATUS ROLES AGE VERSION LABELS
k8s-master.mariyo.eu Ready master 3d15h v1.16.6 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master.mariyo.eu,kubernetes.io/os=linux,node-role.kubernetes.io/master=
k8s-node-1.mariyo.eu Ready <none> 3d15h v1.16.6 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node-1.mariyo.eu,kubernetes.io/os=linux
Deployment for coredns:
kind: Deployment
apiVersion: apps/v1
metadata:
name: coredns
namespace: kube-system
selfLink: /apis/apps/v1/namespaces/kube-system/deployments/coredns
uid: bd5451ec-2a33-443d-8519-ffcec935ac0c
resourceVersion: '397508'
generation: 2
creationTimestamp: '2020-01-24T16:14:37Z'
labels:
addonmanager.kubernetes.io/mode: Reconcile
k8s-app: kube-dns
kubernetes.io/cluster-service: 'true'
kubernetes.io/name: coredns
annotations:
deployment.kubernetes.io/revision: '1'
kubectl.kubernetes.io/last-applied-configuration: >
{"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"addonmanager.kubernetes.io/mode":"Reconcile","k8s-app":"kube-dns","kubernetes.io/cluster-service":"true","kubernetes.io/name":"coredns"},"name":"coredns","namespace":"kube-system"},"spec":{"selector":{"matchLabels":{"k8s-app":"kube-dns"}},"strategy":{"rollingUpdate":{"maxSurge":"10%","maxUnavailable":0},"type":"RollingUpdate"},"template":{"metadata":{"annotations":{"seccomp.security.alpha.kubernetes.io/pod":"docker/default"},"labels":{"k8s-app":"kube-dns"}},"spec":{"affinity":{"nodeAffinity":{"preferredDuringSchedulingIgnoredDuringExecution":[{"preference":{"matchExpressions":[{"key":"node-role.kubernetes.io/master","operator":"In","values":[""]}]},"weight":100}]},"podAntiAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":[{"labelSelector":{"matchLabels":{"k8s-app":"kube-dns"}},"topologyKey":"kubernetes.io/hostname"}]}},"containers":[{"args":["-conf","/etc/coredns/Corefile"],"image":"docker.io/coredns/coredns:1.6.0","imagePullPolicy":"IfNotPresent","livenessProbe":{"failureThreshold":10,"httpGet":{"path":"/health","port":8080,"scheme":"HTTP"},"successThreshold":1,"timeoutSeconds":5},"name":"coredns","ports":[{"containerPort":53,"name":"dns","protocol":"UDP"},{"containerPort":53,"name":"dns-tcp","protocol":"TCP"},{"containerPort":9153,"name":"metrics","protocol":"TCP"}],"readinessProbe":{"failureThreshold":10,"httpGet":{"path":"/ready","port":8181,"scheme":"HTTP"},"successThreshold":1,"timeoutSeconds":5},"resources":{"limits":{"memory":"170Mi"},"requests":{"cpu":"100m","memory":"70Mi"}},"securityContext":{"allowPrivilegeEscalation":false,"capabilities":{"add":["NET_BIND_SERVICE"],"drop":["all"]},"readOnlyRootFilesystem":true},"volumeMounts":[{"mountPath":"/etc/coredns","name":"config-volume"}]}],"dnsPolicy":"Default","nodeSelector":{"beta.kubernetes.io/os":"linux"},"priorityClassName":"system-cluster-critical","serviceAccountName":"coredns","tolerations":[{"effect":"NoSchedule","key":"node-role.kubernetes.io/master"},{"key":"CriticalAddonsOnly","operator":"Exists"}],"volumes":[{"configMap":{"items":[{"key":"Corefile","path":"Corefile"}],"name":"coredns"},"name":"config-volume"}]}}}}
spec:
replicas: 2
selector:
matchLabels:
k8s-app: kube-dns
template:
metadata:
creationTimestamp: null
labels:
k8s-app: kube-dns
annotations:
seccomp.security.alpha.kubernetes.io/pod: docker/default
spec:
volumes:
- name: config-volume
configMap:
name: coredns
items:
- key: Corefile
path: Corefile
defaultMode: 420
containers:
- name: coredns
image: 'docker.io/coredns/coredns:1.6.0'
args:
- '-conf'
- /etc/coredns/Corefile
ports:
- name: dns
containerPort: 53
protocol: UDP
- name: dns-tcp
containerPort: 53
protocol: TCP
- name: metrics
containerPort: 9153
protocol: TCP
resources:
limits:
memory: 170Mi
requests:
cpu: 100m
memory: 70Mi
volumeMounts:
- name: config-volume
mountPath: /etc/coredns
livenessProbe:
httpGet:
path: /health
port: 8080
scheme: HTTP
timeoutSeconds: 5
periodSeconds: 10
successThreshold: 1
failureThreshold: 10
readinessProbe:
httpGet:
path: /ready
port: 8181
scheme: HTTP
timeoutSeconds: 5
periodSeconds: 10
successThreshold: 1
failureThreshold: 10
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
imagePullPolicy: IfNotPresent
securityContext:
capabilities:
add:
- NET_BIND_SERVICE
drop:
- all
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
restartPolicy: Always
terminationGracePeriodSeconds: 30
dnsPolicy: Default
nodeSelector:
beta.kubernetes.io/os: linux
serviceAccountName: coredns
serviceAccount: coredns
securityContext: {}
affinity:
nodeAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
preference:
matchExpressions:
- key: node-role.kubernetes.io/master
operator: In
values:
- ''
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
k8s-app: kube-dns
topologyKey: kubernetes.io/hostname
schedulerName: default-scheduler
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
- key: CriticalAddonsOnly
operator: Exists
priorityClassName: system-cluster-critical
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 0
maxSurge: 10%
revisionHistoryLimit: 10
progressDeadlineSeconds: 600
status:
observedGeneration: 2
replicas: 2
updatedReplicas: 2
readyReplicas: 1
availableReplicas: 1
unavailableReplicas: 1
conditions:
- type: Progressing
status: 'True'
lastUpdateTime: '2020-01-24T16:14:42Z'
lastTransitionTime: '2020-01-24T16:14:37Z'
reason: NewReplicaSetAvailable
message: ReplicaSet "coredns-58687784f9" has successfully progressed.
- type: Available
status: 'False'
lastUpdateTime: '2020-01-27T17:42:57Z'
lastTransitionTime: '2020-01-27T17:42:57Z'
reason: MinimumReplicasUnavailable
message: Deployment does not have minimum availability.
Deployment for webservices:
kind: Deployment
apiVersion: apps/v1
metadata:
name: webservices
namespace: default
selfLink: /apis/apps/v1/namespaces/default/deployments/webservices
uid: da75d3d8-92f4-4d06-86d6-e2fb325806a5
resourceVersion: '398529'
generation: 1
creationTimestamp: '2020-01-27T08:05:16Z'
labels:
run: webservices
annotations:
deployment.kubernetes.io/revision: '1'
spec:
replicas: 5
selector:
matchLabels:
run: webservices
template:
metadata:
creationTimestamp: null
labels:
run: webservices
spec:
containers:
- name: webservices
image: nginx
ports:
- containerPort: 80
protocol: TCP
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
imagePullPolicy: Always
restartPolicy: Always
terminationGracePeriodSeconds: 30
dnsPolicy: ClusterFirst
securityContext: {}
schedulerName: default-scheduler
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 25%
maxSurge: 25%
revisionHistoryLimit: 10
progressDeadlineSeconds: 600
status:
observedGeneration: 1
replicas: 5
updatedReplicas: 5
unavailableReplicas: 5
conditions:
- type: Available
status: 'False'
lastUpdateTime: '2020-01-27T08:05:16Z'
lastTransitionTime: '2020-01-27T08:05:16Z'
reason: MinimumReplicasUnavailable
message: Deployment does not have minimum availability.
- type: Progressing
status: 'False'
lastUpdateTime: '2020-01-27T17:52:58Z'
lastTransitionTime: '2020-01-27T17:52:58Z'
reason: ProgressDeadlineExceeded
message: ReplicaSet "webservices-8675d4667d" has timed out progressing.
Finally, I decided to reinstall nodes from Debian 10 to Ubuntu 18.04 and everything works as expected.
Thank you for your time
Problem is that kube-proxy isn't functioning correctly as I believe the 10.233.0.1 is the kubernetes api service address which it is responsible for configuring/setting up. You should check kube-proxy logs and see that it is healthy and create the iptables rules for the kubernetes services.
Take a look here: calico-timeout-pod.
I had to set the following on the worker node as well, before joining it, for it to work:
sudo sysctl net.bridge.bridge-nf-call-iptables=1
I was having a similar issue. I am using microk8s in my instance. it seems the node needs to advertise itself to the cluster. I hope it points you in the right direction (repost from github):
microk8s stop
# or for workers: sudo snap stop microk8s
sudo vim.tiny /var/snap/microk8s/current/args/kubelet
# Add this to bottom: --node-ip=<this-specific-node-lan-ip>
sudo vim.tiny /var/snap/microk8s/current/args/kube-apiserver
# Add this to bottom: --advertise-address=<this-specific-node-lan-ip>
microk8s start
# or for workers: sudo snap start microk8s

No resources found when installing kubernetes dashboard

I am install kubernetes dashboard using this command:
[root#iZuf63refzweg1d9dh94t8Z ~]# kubectl create -f kubernetes-dashboard.yaml
secret/kubernetes-dashboard-certs created
serviceaccount/kubernetes-dashboard created
role.rbac.authorization.k8s.io/kubernetes-dashboard-minimal created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard-minimal created
deployment.apps/kubernetes-dashboard created
service/kubernetes-dashboard created
this is my kubernetes yaml config:
kind: Deployment
apiVersion: apps/v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kube-system
spec:
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
k8s-app: kubernetes-dashboard
template:
metadata:
labels:
k8s-app: kubernetes-dashboard
spec:
containers:
- name: kubernetes-dashboard
image: registry.cn-beijing.aliyuncs.com/minminmsn/kubernetes-dashboard:v1.10.1
ports:
- containerPort: 8443
protocol: TCP
args:
- --auto-generate-certificates
# Uncomment the following line to manually specify Kubernetes API server Host
# If not specified, Dashboard will attempt to auto discover the API server and connect
# to it. Uncomment only if the default does not work.
# - --apiserver-host=http://my-address:port
volumeMounts:
- name: kubernetes-dashboard-certs
mountPath: /certs
# Create on-disk volume to store exec logs
- mountPath: /tmp
name: tmp-volume
livenessProbe:
httpGet:
scheme: HTTPS
path: /
port: 8443
initialDelaySeconds: 30
timeoutSeconds: 30
volumes:
- name: kubernetes-dashboard-certs
secret:
secretName: kubernetes-dashboard-certs
- name: tmp-volume
emptyDir: {}
serviceAccountName: kubernetes-dashboard
# Comment the following tolerations if Dashboard must not be deployed on master
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
---
# ------------------- Dashboard Service ------------------- #
kind: Service
apiVersion: v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kube-system
spec:
type: NodePort
ports:
- port: 443
targetPort: 8443
selector:
k8s-app: kubernetes-dashboard
get the result:
[root#iZuf63refzweg1d9dh94t8Z ~]# kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.43.0.10 <none> 53/UDP,53/TCP 102d
kubernetes-dashboard NodePort 10.254.180.117 <none> 443:31720/TCP 58s
metrics-server ClusterIP 10.43.96.112 <none> 443/TCP 102d
[root#iZuf63refzweg1d9dh94t8Z ~]# kubectl get pods -n kube-system
No resources found.
but when I check the port 31720:
lsof -i:31720
the output is empty.Is the service deploy success? How to check the deploy log? Why the port not binding success?
it is under its own namespace - "kubernetes-dashboard". So, just use kubectl get all -n kubernetes-dashboard to see everything

Prometheus server in pending state after installation using Helm

I am new to k8s and trying to setup prometheus monitoring for k8s. I used
"helm install" to setup prometheus. Now:
two pods are still in pending state:
prometheus-server
prometheus-alertmanager
I manually created persistent volume for both
Can anyone help me with how to map these PV with PVC created by helm chart?
[centos#k8smaster1 ~]$ kubectl get pod -n monitoring
NAME READY STATUS RESTARTS AGE
prometheus-alertmanager-7757d759b8-x6bd7 0/2 Pending 0 44m
prometheus-kube-state-metrics-7f85b5d86c-cq9kr 1/1 Running 0 44m
prometheus-node-exporter-5rz2k 1/1 Running 0 44m
prometheus-pushgateway-5b8465d455-672d2 1/1 Running 0 44m
prometheus-server-7f8b5fc64b-w626v 0/2 Pending 0 44m
[centos#k8smaster1 ~]$ kubectl get pv
prometheus-alertmanager 3Gi RWX Retain Available 22m
prometheus-server 12Gi RWX Retain Available 30m
[centos#k8smaster1 ~]$ kubectl get pvc -n monitoring
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
prometheus-alertmanager Pending 20m
prometheus-server Pending 20m
[centos#k8smaster1 ~]$ kubectl describe pvc prometheus-alertmanager -n monitoring
Name: prometheus-alertmanager
Namespace: monitoring
StorageClass:
Status: Pending
Volume:
Labels: app=prometheus
chart=prometheus-8.15.0
component=alertmanager
heritage=Tiller
release=prometheus
Annotations: <none>
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 116s (x83 over 22m) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
Mounted By: prometheus-alertmanager-7757d759b8-x6bd7
I am expecting the pods to get into running state
!!!UPDATE!!!
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
prometheus-alertmanager Pending local-storage 4m29s
prometheus-server Pending local-storage 4m29s
[centos#k8smaster1 prometheus_pv_storage]$ kubectl describe pvc prometheus-server -n monitoring
Name: prometheus-server
Namespace: monitoring
StorageClass: local-storage
Status: Pending
Volume:
Labels: app=prometheus
chart=prometheus-8.15.0
component=server
heritage=Tiller
release=prometheus
Annotations: <none>
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal WaitForFirstConsumer 11s (x22 over 4m59s) persistentvolume-controller waiting for first consumer to be created before binding
Mounted By: prometheus-server-7f8b5fc64b-bqf42
!!UPDATE-2!!
[centos#k8smaster1 ~]$ kubectl get pods prometheus-server-7f8b5fc64b-bqf42 -n monitoring -o yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: "2019-08-18T16:10:54Z"
generateName: prometheus-server-7f8b5fc64b-
labels:
app: prometheus
chart: prometheus-8.15.0
component: server
heritage: Tiller
pod-template-hash: 7f8b5fc64b
release: prometheus
name: prometheus-server-7f8b5fc64b-bqf42
namespace: monitoring
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: prometheus-server-7f8b5fc64b
uid: c1979bcb-c1d2-11e9-819d-fa163ebb8452
resourceVersion: "2461054"
selfLink: /api/v1/namespaces/monitoring/pods/prometheus-server-7f8b5fc64b-bqf42
uid: c19890d1-c1d2-11e9-819d-fa163ebb8452
spec:
containers:
- args:
- --volume-dir=/etc/config
- --webhook-url=http://127.0.0.1:9090/-/reload
image: jimmidyson/configmap-reload:v0.2.2
imagePullPolicy: IfNotPresent
name: prometheus-server-configmap-reload
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /etc/config
name: config-volume
readOnly: true
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: prometheus-server-token-7h2df
readOnly: true
- args:
- --storage.tsdb.retention.time=15d
- --config.file=/etc/config/prometheus.yml
- --storage.tsdb.path=/data
- --web.console.libraries=/etc/prometheus/console_libraries
- --web.console.templates=/etc/prometheus/consoles
- --web.enable-lifecycle
image: prom/prometheus:v2.11.1
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /-/healthy
port: 9090
scheme: HTTP
initialDelaySeconds: 30
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 30
name: prometheus-server
ports:
- containerPort: 9090
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /-/ready
port: 9090
scheme: HTTP
initialDelaySeconds: 30
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 30
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /etc/config
name: config-volume
- mountPath: /data
name: storage-volume
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: prometheus-server-token-7h2df
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 65534
runAsGroup: 65534
runAsNonRoot: true
runAsUser: 65534
serviceAccount: prometheus-server
serviceAccountName: prometheus-server
terminationGracePeriodSeconds: 300
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- configMap:
defaultMode: 420
name: prometheus-server
name: config-volume
- name: storage-volume
persistentVolumeClaim:
claimName: prometheus-server
- name: prometheus-server-token-7h2df
secret:
defaultMode: 420
secretName: prometheus-server-token-7h2df
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2019-08-18T16:10:54Z"
message: '0/2 nodes are available: 1 node(s) didn''t find available persistent
volumes to bind, 1 node(s) had taints that the pod didn''t tolerate.'
reason: Unschedulable
status: "False"
type: PodScheduled
phase: Pending
qosClass: BestEffort
Also I have the volumes created and assigned to local storage
[centos#k8smaster1 prometheus_pv]$ kubectl get pv -n monitoring
prometheus-alertmanager 3Gi RWX Retain Available local-storage 2d19h
prometheus-server 12Gi RWX Retain Available local-storage 2d19h
If you are in EKS, your node need to have the next permission
arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy
and the Amazon EBS CSI Driver Add-on
Prometheus will try to create PersiatentVolumeClaims with accessModes as ReadWriteOnce, PVC will get matched to PersistentVolume only if accessmodes are same. Change your accessmode of PV to ReadWriteOnce, it should work.

A pod cannot connect to a service by IP

My ingress pod is having trouble reaching two clusterIP services by IP. There are plenty of other clusterIP services it has no trouble reaching. Including in the same namespace. Another pod has no problem reaching the service (I tried the default backend in the same namespace and it was fine).
Where should I look? Here are my actual services, it cannot reach the first but can reach the second:
- apiVersion: v1
kind: Service
metadata:
creationTimestamp: "2019-08-23T16:59:10Z"
labels:
app: pka-168-emtpy-id
app.kubernetes.io/instance: palletman-pka-168-emtpy-id
app.kubernetes.io/managed-by: Tiller
helm.sh/chart: pal-0.0.1
release: palletman-pka-168-emtpy-id
name: pka-168-emtpy-id
namespace: palletman
resourceVersion: "108574168"
selfLink: /api/v1/namespaces/palletman/services/pka-168-emtpy-id
uid: 539364f9-c5c7-11e9-8699-0af40ce7ce3a
spec:
clusterIP: 100.65.111.47
ports:
- port: 80
protocol: TCP
targetPort: 8080
selector:
app: pka-168-emtpy-id
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
- apiVersion: v1
kind: Service
metadata:
creationTimestamp: "2019-03-05T19:57:26Z"
labels:
app: production
app.kubernetes.io/instance: palletman
app.kubernetes.io/managed-by: Tiller
helm.sh/chart: pal-0.0.1
release: palletman
name: production
namespace: palletman
resourceVersion: "81337664"
selfLink: /api/v1/namespaces/palletman/services/production
uid: e671c5e0-3f80-11e9-a1fc-0af40ce7ce3a
spec:
clusterIP: 100.65.82.246
ports:
- port: 80
protocol: TCP
targetPort: 8080
selector:
app: production
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
My ingress pod:
apiVersion: v1
kind: Pod
metadata:
annotations:
sumologic.com/format: text
sumologic.com/sourceCategory: 103308/CT/LI/kube_ingress
sumologic.com/sourceName: kube_ingress
creationTimestamp: "2019-08-21T19:34:48Z"
generateName: ingress-nginx-65877649c7-
labels:
app: ingress-nginx
k8s-addon: ingress-nginx.addons.k8s.io
pod-template-hash: "2143320573"
name: ingress-nginx-65877649c7-5npmp
namespace: kube-ingress
ownerReferences:
- apiVersion: extensions/v1beta1
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: ingress-nginx-65877649c7
uid: 97db28a9-c43f-11e9-920a-0af40ce7ce3a
resourceVersion: "108278133"
selfLink: /api/v1/namespaces/kube-ingress/pods/ingress-nginx-65877649c7-5npmp
uid: bcd92d96-c44a-11e9-8699-0af40ce7ce3a
spec:
containers:
- args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/nginx-default-backend
- --configmap=$(POD_NAMESPACE)/ingress-nginx
- --publish-service=$(POD_NAMESPACE)/ingress-nginx
env:
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
image: gcr.io/google_containers/nginx-ingress-controller:0.9.0-beta.13
imagePullPolicy: Always
livenessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 10254
scheme: HTTP
initialDelaySeconds: 30
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
name: ingress-nginx
ports:
- containerPort: 80
name: http
protocol: TCP
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: default-token-dg5wn
readOnly: true
dnsPolicy: ClusterFirst
nodeName: ip-10-55-131-177.eu-west-1.compute.internal
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 60
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- name: default-token-dg5wn
secret:
defaultMode: 420
secretName: default-token-dg5wn
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2019-08-21T19:34:48Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2019-08-21T19:34:50Z"
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: "2019-08-21T19:34:48Z"
status: "True"
type: PodScheduled
containerStatuses:
- containerID: docker://d597673f4f38392a52e9537e6dd2473438c62c2362a30e3d58bf8a98e177eb12
image: gcr.io/google_containers/nginx-ingress-controller:0.9.0-beta.13
imageID: docker-pullable://gcr.io/google_containers/nginx-ingress-controller#sha256:c9d2e67f8096d22564a6507794e1a591fbcb6461338fc655a015d76a06e8dbaa
lastState: {}
name: ingress-nginx
ready: true
restartCount: 0
state:
running:
startedAt: "2019-08-21T19:34:50Z"
hostIP: 10.55.131.177
phase: Running
podIP: 172.6.218.18
qosClass: BestEffort
startTime: "2019-08-21T19:34:48Z"
It could be connectivity to the node where your Pod is running. (Or network overlay related) You can check where that pod is running:
$ kubectl get pod -o=json | jq .items[0].spec.nodeName
Check if the node is 'Ready':
$ kubectl get node <node-from-above>
If it's ready, then ssh into the node to further troubleshoot:
$ ssh <node-from-above>
Is your overlay pod running on the node? (Calico, Weave, CNI, etc)
You can further troubleshoot connecting to the pod/container
# From <node-from-above>
$ docker exec -it <container-id-in-pod> bash
# Check connectivity (ping, dig, curl, etc)
Also, from using the kubectl command line (if you have network connectivity to the node)
$ kubectl exec -it <pod-id> -c <container-name> bash
# Troubleshoot...