k8s Prometheus: pod has unbound PersistentVolumeClaims - kubernetes

I installed Kubernetes 1.10.3 on two VirtualBox VMs (CentOS 7.4) on my Windows 10 machine. I used git clone to get the Prometheus YAML files:
git clone https://github.com/kubernetes/kubernetes
Then I entered kubernetes/cluster/addons/prometheus and created the resources in this order:
alertmanager-configmap.yaml
alertmanager-pvc.yaml
alertmanager-deployment.yaml
alertmanager-service.yaml
kube-state-metrics-rbac.yaml
kube-state-metrics-deployment.yaml
kube-state-metrics-service.yaml
node-exporter-ds.yml
node-exporter-service.yaml
prometheus-configmap.yaml
prometheus-rbac.yaml
prometheus-statefulset.yaml
prometheus-service.yaml
But the Prometheus and Alertmanager pods are stuck in Pending state:
kube-system alertmanager-6bd9584b85-j4h5m 0/2 Pending 0 9m
kube-system calico-etcd-pnwtr 1/1 Running 0 16m
kube-system calico-kube-controllers-5d74847676-mjq4j 1/1 Running 0 16m
kube-system calico-node-59xfk 2/2 Running 1 16m
kube-system calico-node-rqsh5 2/2 Running 1 16m
kube-system coredns-7997f8864c-ckhsq 1/1 Running 0 16m
kube-system coredns-7997f8864c-jjtvq 1/1 Running 0 16m
kube-system etcd-master16g 1/1 Running 0 15m
kube-system heapster-589b7db6c9-mpmks 1/1 Running 0 16m
kube-system kube-apiserver-master16g 1/1 Running 0 15m
kube-system kube-controller-manager-master16g 1/1 Running 0 15m
kube-system kube-proxy-hqq49 1/1 Running 0 16m
kube-system kube-proxy-l8hmh 1/1 Running 0 16m
kube-system kube-scheduler-master16g 1/1 Running 0 16m
kube-system kube-state-metrics-8595f97c4-g6x5x 2/2 Running 0 8m
kube-system kubernetes-dashboard-7d5dcdb6d9-944xl 1/1 Running 0 16m
kube-system monitoring-grafana-7b767fb8dd-mg6dd 1/1 Running 0 16m
kube-system monitoring-influxdb-54bd58b4c9-z9tgd 1/1 Running 0 16m
kube-system node-exporter-f6pmw 1/1 Running 0 8m
kube-system node-exporter-zsd9b 1/1 Running 0 8m
kube-system prometheus-0 0/2 Pending 0 7m
I checked the Prometheus pod with the command shown below:
[root@master16g prometheus]# kubectl describe pod prometheus-0 -n kube-system
Name: prometheus-0
Namespace: kube-system
Node: <none>
Labels: controller-revision-hash=prometheus-8fc558cb5
k8s-app=prometheus
statefulset.kubernetes.io/pod-name=prometheus-0
Annotations: scheduler.alpha.kubernetes.io/critical-pod=
Status: Pending
IP:
Controlled By: StatefulSet/prometheus
Init Containers:
init-chown-data:
Image: busybox:latest
Port: <none>
Host Port: <none>
Command:
chown
-R
65534:65534
/data
Environment: <none>
Mounts:
/data from prometheus-data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-token-f6v42 (ro)
Containers:
prometheus-server-configmap-reload:
Image: jimmidyson/configmap-reload:v0.1
Port: <none>
Host Port: <none>
Args:
--volume-dir=/etc/config
--webhook-url=http://localhost:9090/-/reload
Limits:
cpu: 10m
memory: 10Mi
Requests:
cpu: 10m
memory: 10Mi
Environment: <none>
Mounts:
/etc/config from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-token-f6v42 (ro)
prometheus-server:
Image: prom/prometheus:v2.2.1
Port: 9090/TCP
Host Port: 0/TCP
Args:
--config.file=/etc/config/prometheus.yml
--storage.tsdb.path=/data
--web.console.libraries=/etc/prometheus/console_libraries
--web.console.templates=/etc/prometheus/consoles
--web.enable-lifecycle
Limits:
cpu: 200m
memory: 1000Mi
Requests:
cpu: 200m
memory: 1000Mi
Liveness: http-get http://:9090/-/healthy delay=30s timeout=30s period=10s #success=1 #failure=3
Readiness: http-get http://:9090/-/ready delay=30s timeout=30s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/data from prometheus-data (rw)
/etc/config from config-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-token-f6v42 (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
prometheus-data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: prometheus-data-prometheus-0
ReadOnly: false
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: prometheus-config
Optional: false
prometheus-token-f6v42:
Type: Secret (a volume populated by a Secret)
SecretName: prometheus-token-f6v42
Optional: false
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 42s (x22 over 5m) default-scheduler pod has unbound PersistentVolumeClaims (repeated 2 times)
The last line shows the warning message: pod has unbound PersistentVolumeClaims (repeated 2 times)
The Prometheus log says:
[root@master16g prometheus]# kubectl logs prometheus-0 -n kube-system
Error from server (BadRequest): a container name must be specified for pod prometheus-0, choose one of: [prometheus-server-configmap-reload prometheus-server] or one of the init containers: [init-chown-data]
Then I described the alertmanager pod and checked its logs:
[root@master16g prometheus]# kubectl describe pod alertmanager-6bd9584b85-j4h5m -n kube-system
Name: alertmanager-6bd9584b85-j4h5m
Namespace: kube-system
Node: <none>
Labels: k8s-app=alertmanager
pod-template-hash=2685140641
version=v0.14.0
Annotations: scheduler.alpha.kubernetes.io/critical-pod=
Status: Pending
IP:
Controlled By: ReplicaSet/alertmanager-6bd9584b85
Containers:
prometheus-alertmanager:
Image: prom/alertmanager:v0.14.0
Port: 9093/TCP
Host Port: 0/TCP
Args:
--config.file=/etc/config/alertmanager.yml
--storage.path=/data
--web.external-url=/
Limits:
cpu: 10m
memory: 50Mi
Requests:
cpu: 10m
memory: 50Mi
Readiness: http-get http://:9093/%23/status delay=30s timeout=30s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/data from storage-volume (rw)
/etc/config from config-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-snfrt (ro)
prometheus-alertmanager-configmap-reload:
Image: jimmidyson/configmap-reload:v0.1
Port: <none>
Host Port: <none>
Args:
--volume-dir=/etc/config
--webhook-url=http://localhost:9093/-/reload
Limits:
cpu: 10m
memory: 10Mi
Requests:
cpu: 10m
memory: 10Mi
Environment: <none>
Mounts:
/etc/config from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-snfrt (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: alertmanager-config
Optional: false
storage-volume:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: alertmanager
ReadOnly: false
default-token-snfrt:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-snfrt
Optional: false
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 3m (x26 over 9m) default-scheduler pod has unbound PersistentVolumeClaims (repeated 2 times)
And its log:
[root@master16g prometheus]# kubectl logs alertmanager-6bd9584b85-j4h5m -n kube-system
Error from server (BadRequest): a container name must be specified for pod alertmanager-6bd9584b85-j4h5m, choose one of: [prometheus-alertmanager prometheus-alertmanager-configmap-reload]
It has the same warning message as Prometheus:
pod has unbound PersistentVolumeClaims (repeated 2 times)
Then I listed the PVCs with the following command:
[root@master16g prometheus]# kubectl get pvc --all-namespaces
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
kube-system alertmanager Pending standard 20m
kube-system prometheus-data-prometheus-0 Pending standard 19m
My question is: how do I get the PersistentVolumeClaims to bind? And why does the log say a container name must be specified?
===============================================================
Second edit
Since the PVC file specifies a storage class, I need to define a StorageClass YAML. How do I do that if I want NFS or GlusterFS? That way I could avoid a cloud vendor, like Google or AWS.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: alertmanager
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: EnsureExists
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: "2Gi"

This log entry:
Error from server (BadRequest): a container name must be specified for pod alertmanager-6bd9584b85-j4h5m, choose one of: [prometheus-alertmanager prometheus-alertmanager-configmap-reload]
means that the Pod alertmanager-6bd9584b85-j4h5m consists of two containers:
prometheus-alertmanager
prometheus-alertmanager-configmap-reload
When you use kubectl logs on a Pod that consists of more than one container, you must specify the name of the container whose logs you want to view. Command template:
kubectl -n <namespace> logs <pod_name> <container_name>
For example, if you want to view the logs of the container prometheus-alertmanager, which is part of the Pod alertmanager-6bd9584b85-j4h5m in the namespace kube-system, you should use this command:
kubectl -n kube-system logs alertmanager-6bd9584b85-j4h5m prometheus-alertmanager
The Pending status of the PVCs most likely means there are no corresponding PVs for them to bind to (and no dynamic provisioner for the standard storage class).
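One way to get them bound without a cloud provider is to create a matching PV for each claim by hand. Below is a minimal sketch for the Prometheus claim, assuming an NFS server at 192.168.1.100 exporting /exports/prometheus (both hypothetical values); the alertmanager claim needs a second PV of at least 2Gi in the same way. The storageClassName must match the standard class the PVCs request:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-data-pv
spec:
  capacity:
    storage: 16Gi                   # must be at least what the PVC requests
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: standard        # matches the storageClassName in the PVC
  nfs:                              # assumed NFS export; adjust server and path to your environment
    server: 192.168.1.100
    path: /exports/prometheus
Once a PV with a matching class, access mode and sufficient capacity exists for each claim, the PVCs should become Bound and the Pending pods should schedule.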

Related

Why do I keep getting error "5 pod has unbound immediate PersistentVolumeClaims"?

I am following the book Kubernetes for Developers, and it seems the book may be heavily outdated now.
Recently I have been trying to get Prometheus up and running on Kubernetes following the instructions from the book, which suggested installing and using Helm to get Prometheus and Grafana up and running.
helm install monitor stable/prometheus --namespace monitoring
This resulted in:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
monitor-kube-state-metrics-578cdbb5b7-pdjzw 0/1 CrashLoopBackOff 14 36m 192.168.23.1 kube-worker-vm3 <none> <none>
monitor-prometheus-alertmanager-7b4c476678-gr4s6 0/2 Pending 0 35m <none> <none> <none> <none>
monitor-prometheus-node-exporter-5kz8x 1/1 Running 0 14h 192.168.1.13 rockpro64 <none> <none>
monitor-prometheus-node-exporter-jjrjh 1/1 Running 1 14h 192.168.1.35 osboxes <none> <none>
monitor-prometheus-node-exporter-k62fn 1/1 Running 1 14h 192.168.1.37 kube-worker-vm3 <none> <none>
monitor-prometheus-node-exporter-wcg2k 1/1 Running 1 14h 192.168.1.36 kube-worker-vm2 <none> <none>
monitor-prometheus-pushgateway-6898f8475b-sk4dz 1/1 Running 0 36m 192.168.90.200 osboxes <none> <none>
monitor-prometheus-server-74d7dc5d4c-vlqmm 0/2 Pending 0 14h <none> <none> <none
For the Prometheus server, I checked why it is Pending:
# kubectl describe pod monitor-prometheus-server-74d7dc5d4c-vlqmm -n monitoring
Name: monitor-prometheus-server-74d7dc5d4c-vlqmm
Namespace: monitoring
Priority: 0
Node: <none>
Labels: app=prometheus
chart=prometheus-13.8.0
component=server
heritage=Helm
pod-template-hash=74d7dc5d4c
release=monitor
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/monitor-prometheus-server-74d7dc5d4c
Containers:
prometheus-server-configmap-reload:
Image: jimmidyson/configmap-reload:v0.4.0
Port: <none>
Host Port: <none>
Args:
--volume-dir=/etc/config
--webhook-url=http://127.0.0.1:9090/-/reload
Environment: <none>
Mounts:
/etc/config from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from monitor-prometheus-server-token-n49ls (ro)
prometheus-server:
Image: prom/prometheus:v2.20.1
Port: 9090/TCP
Host Port: 0/TCP
Args:
--storage.tsdb.retention.time=15d
--config.file=/etc/config/prometheus.yml
--storage.tsdb.path=/data
--web.console.libraries=/etc/prometheus/console_libraries
--web.console.templates=/etc/prometheus/consoles
--web.enable-lifecycle
Liveness: http-get http://:9090/-/healthy delay=30s timeout=30s period=15s #success=1 #failure=3
Readiness: http-get http://:9090/-/ready delay=30s timeout=30s period=5s #success=1 #failure=3
Environment: <none>
Mounts:
/data from storage-volume (rw)
/etc/config from config-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from monitor-prometheus-server-token-n49ls (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: monitor-prometheus-server
Optional: false
storage-volume:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: monitor-prometheus-server
ReadOnly: false
monitor-prometheus-server-token-n49ls:
Type: Secret (a volume populated by a Secret)
SecretName: monitor-prometheus-server-token-n49ls
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 28m (x734 over 14h) default-scheduler 0/6 nodes are available: 6 pod has unbound immediate PersistentVolumeClaims.
Warning FailedScheduling 3m5s (x23 over 24m) default-scheduler 0/5 nodes are available: 5 pod has unbound immediate PersistentVolumeClaims.
However, the message I am seeing, 0/5 nodes are available: 5 pod has unbound immediate PersistentVolumeClaims, also came up with all the other Node.js StatefulSets and RabbitMQ Deployments I have tried to create. For RabbitMQ and Node.js I figured out that I needed to create a PersistentVolume and a StorageClass, whose name I had to specify in the PV and the PVC, and then it all worked. But now that it is the Prometheus server, do I have to do the same for Prometheus as well? Why is that not handled by the Helm chart?
Has something changed in the Kubernetes API recently, so that I always have to create a PV and a StorageClass explicitly for a PVC?
Unless you configure your cluster with dynamic volume provisioning, you will have to create the PV manually each time. Even if you are not on a cloud, you can set up dynamic storage provisioners. There are a number of options for provisioners and you can find many here. Ceph and MinIO are popular choices.
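As a concrete sketch of the non-cloud case, the NFS subdir external provisioner can be installed with Helm and then creates PVs automatically for any PVC that uses its StorageClass. The NFS server address, export path and release name below are hypothetical, and the repository URL should be checked against the project's current documentation:
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm install nfs-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
  --set nfs.server=192.168.1.100 \
  --set nfs.path=/exported/path
# optionally mark its StorageClass (nfs-client by default) as the cluster default,
# so PVCs created by charts bind without extra configuration
kubectl patch storageclass nfs-client \
  -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'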

Volume is not mounted even when pvc is created

I am using a Helm chart for the installation of the application, but the volume is not mounted. I am doing something wrong but am not sure what it is. I am new to DevOps.
values.yaml
persistence:
  enabled: true
  existingClaim: grafana-persistent-storage
  mountPath: "/dev/grafana/"
pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-block-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block
  storageClassName: grafana-persistent-storage
  resources:
    requests:
      storage: 10Gi
storageClass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: grafana-persistent-storage
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  iopsPerGB: "10"
reclaimPolicy: Retain
allowVolumeExpansion: true
mountOptions:
  - debug
volumeBindingMode: Immediate
The PVC is created:
kubectl --kubeconfig=<configfile> get pvc -n grafana
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
grafana-block-pvc Bound pvc-6dc39e0d-471e-11ea-b432-0a505018290a 10Gi RWO grafana-persistent-storage 10m
The PV is created too:
pvc-6dc39e0d-471e-11ea-b432-0a505018290a 10Gi RWO Retain Bound grafana/grafana-block-pvc grafana-persistent-storage 10m
kubectl describe pod - the description of the created pod:
Name: grafana1-v1-79fb988995-lnnl6
Namespace: grafana
Priority: 0
Node: ip-10-10-108-165.ap-southeast-1.compute.internal/10.10.108.165
Start Time: Tue, 04 Feb 2020 13:15:17 +0530
Labels: app.kubernetes.io/instance=grafana1
app.kubernetes.io/name=grafana1
pod-template-hash=79fb988995
Annotations: kubernetes.io/psp: eks.privileged
sidecar.istio.io/status:
{"version":"761ebc53976754715f22fcf548f05270fb4b8db07324894aebdb31fa81d960","initContainers":["istio-init"],"containers":["istio-proxy"]...
Status: Running
IP: 10.10.127.38
IPs: <none>
Controlled By: ReplicaSet/grafana1-v1-79fb988995
Init Containers:
istio-init:
Container ID: docker://a95db52c5b45c8147fb6c6d0ce4013bef6d495752dc820565188032bc36926
Image: docker.io/istio/proxy_init:1.2.5
Image ID: docker-pullable://istio/proxy_init@sha256:c9964a8c28b85cc631bbc90390eac238c90f82c8f929495d1e9f9a9135b724
Port: <none>
Host Port: <none>
Args:
-p
15001
-u
1337
-m
REDIRECT
-i
*
-x
-b
3000
-d
15020
State: Terminated
Reason: Completed
Exit Code: 0
Started: Tue, 04 Feb 2020 13:15:18 +0530
Finished: Tue, 04 Feb 2020 13:15:19 +0530
Ready: True
Restart Count: 0
Limits:
cpu: 100m
memory: 50Mi
Requests:
cpu: 10m
memory: 10Mi
Environment: <none>
Mounts: <none>
Containers:
grafana1:
Container ID: docker://92338e43bbf69a2c0919e81f5ae16948e6f7966353a3db52274a5a14902599
Image: grafana/grafana:latest
Image ID: docker-pullable://grafana/grafana@sha256:4319ca3e5592ee408f5842ce5b5955312549d89dc1572d2543f2f6d67ca619
Port: 3000/TCP
Host Port: 0/TCP
State: Running
Started: Tue, 04 Feb 2020 13:15:23 +0530
Ready: True
Restart Count: 0
Requests:
cpu: 100m
memory: 200Mi
Environment:
GF_SECURITY_ADMIN_PASSWORD: deskera#reports
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-99rfk (ro)
istio-proxy:
Container ID: docker://21b965ec954474b3bcb941a20782f99642f002bb0e9a212aed20e19838c2f0
Image: docker.io/istio/proxyv2:1.2.5
Image ID: docker-pullable://istio/proxyv2@sha256:8f210c3d09beb6b8658a55d9ac30e25549295834a44083ed67d652ad7453e4
Port: 15090/TCP
Host Port: 0/TCP
Args:
proxy
sidecar
--domain
$(POD_NAMESPACE).svc.cluster.local
--configPath
/etc/istio/proxy
--binaryPath
/usr/local/bin/envoy
--serviceCluster
istio-proxy.grafana
--drainDuration
45s
--parentShutdownDuration
1m0s
--discoveryAddress
istio-pilot.istio-system:15010
--zipkinAddress
zipkin.istio-system:9411
--dnsRefreshRate
300s
--connectTimeout
10s
--proxyAdminPort
15000
--concurrency
2
--controlPlaneAuthPolicy
NONE
--statusPort
15020
--applicationPorts
3000
State: Running
Started: Tue, 04 Feb 2020 13:15:23 +0530
Ready: True
Restart Count: 0
Limits:
cpu: 2
memory: 1Gi
Requests:
cpu: 100m
memory: 128Mi
Readiness: http-get http://:15020/healthz/ready delay=1s timeout=1s period=2s #success=1 #failure=30
Environment:
POD_NAME: grafana1-v1-79fb988995-lnnl6 (v1:metadata.name)
POD_NAMESPACE: grafana (v1:metadata.namespace)
INSTANCE_IP: (v1:status.podIP)
ISTIO_META_POD_NAME: grafana1-v1-79fb988995-lnnl6 (v1:metadata.name)
ISTIO_META_CONFIG_NAMESPACE: grafana (v1:metadata.namespace)
ISTIO_META_INTERCEPTION_MODE: REDIRECT
ISTIO_META_INCLUDE_INBOUND_PORTS: 3000
ISTIO_METAJSON_ANNOTATIONS: {"kubernetes.io/psp":"eks.privileged"}
ISTIO_METAJSON_LABELS: {"app.kubernetes.io/instance":"grafana1","app.kubernetes.io/name":"grafana1","pod-template-hash":"79fb988995"}
Mounts:
/etc/certs/ from istio-certs (ro)
/etc/istio/proxy from istio-envoy (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-99rfk (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
default-token-99rfk:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-99rfk
Optional: false
istio-envoy:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium: Memory
SizeLimit: <unset>
istio-certs:
Type: Secret (a volume populated by a Secret)
SecretName: istio.default
Optional: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 13m default-scheduler Successfully assigned grafana/grafana1-v1-79fb988995-lnnl6 to ip-10-10-108-165.ap-southeast-1.compute.internal
Normal Pulled 13m kubelet, ip-10-10-108-165.ap-southeast-1.compute.internal Container image "docker.io/istio/proxy_init:1.2.5" already present on machine
Normal Created 13m kubelet, ip-10-10-108-165.ap-southeast-1.compute.internal Created container istio-init
Normal Started 13m kubelet, ip-10-10-108-165.ap-southeast-1.compute.internal Started container istio-init
Normal Pulling 13m kubelet, ip-10-10-108-165.ap-southeast-1.compute.internal Pulling image "grafana/grafana:latest"
Normal Pulled 13m kubelet, ip-10-10-108-165.ap-southeast-1.compute.internal Successfully pulled image "grafana/grafana:latest"
Normal Created 13m kubelet, ip-10-10-108-165.ap-southeast-1.compute.internal Created container grafana1
Normal Started 13m kubelet, ip-10-10-108-165.ap-southeast-1.compute.internal Started container grafana1
Normal Pulled 13m kubelet, ip-10-10-108-165.ap-southeast-1.compute.internal Container image "docker.io/istio/proxyv2:1.2.5" already present on machine
Normal Created 13m kubelet, ip-10-10-108-165.ap-southeast-1.compute.internal Created container istio-proxy
Normal Started 13m kubelet, ip-10-10-108-165.ap-southeast-1.compute.internal Started container istio-proxy
Please refer to the describe output of the pod above. The volume is still not mounted even after changing existingClaim to the PVC:
persistence:
  enabled: true
  existingClaim: grafana-block-pvc
  mountPath: "/dev/grafana/"
Claim name should be grafana-block-pvc rather than grafana-persistent-storage in your values.yaml

kubernetes: waiting for first consumer to be created before binding

I have been trying to run Kafka/ZooKeeper on Kubernetes. Using Helm charts I am able to install ZooKeeper on the cluster; however, the ZK pods are stuck in Pending state. When I issued describe on one of the pods, the reason for the scheduling failure was "didn't find available persistent volumes to bind, 1 node(s) had taints that the pod didn't tolerate." But when I issue describe on the PVC, I get "waiting for first consumer to be created before binding". I tried to re-spawn the whole cluster but the result is the same. I am trying to use https://kubernetes.io/blog/2018/04/13/local-persistent-volumes-beta/ as a guide.
Can someone please guide me here?
kubectl get pods -n zoo-keeper
kubectl get pods -n zoo-keeper
NAME READY STATUS RESTARTS AGE
zoo-keeper-zk-0 0/1 Pending 0 20m
zoo-keeper-zk-1 0/1 Pending 0 20m
zoo-keeper-zk-2 0/1 Pending 0 20m
kubectl get sc
kubectl get sc
NAME PROVISIONER AGE
local-storage kubernetes.io/no-provisioner 25m
kubectl describe sc
kubectl describe sc
Name: local-storage
IsDefaultClass: No
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{},"name":"local-storage"},"provisioner":"kubernetes.io/no-provisioner","volumeBindingMode":"WaitForFirstConsumer"}
Provisioner: kubernetes.io/no-provisioner
Parameters: <none>
AllowVolumeExpansion: <unset>
MountOptions: <none>
ReclaimPolicy: Delete
VolumeBindingMode: WaitForFirstConsumer
Events: <none>
kubectl describe pod foob-zookeeper-0 -n zoo-keeper
ubuntu@kmaster:~$ kubectl describe pod foob-zookeeper-0 -n zoo-keeper
Name: foob-zookeeper-0
Namespace: zoo-keeper
Priority: 0
PriorityClassName: <none>
Node: <none>
Labels: app=foob-zookeeper
app.kubernetes.io/instance=data-coord
app.kubernetes.io/managed-by=Tiller
app.kubernetes.io/name=foob-zookeeper
app.kubernetes.io/version=foob-zookeeper-9.1.0-15
controller-revision-hash=foob-zookeeper-5321f8ff5
release=data-coord
statefulset.kubernetes.io/pod-name=foob-zookeeper-0
Annotations: foobar.com/product-name: zoo-keeper ZK
foobar.com/product-revision: ABC
Status: Pending
IP:
Controlled By: StatefulSet/foob-zookeeper
Containers:
foob-zookeeper:
Image: repo.data.foobar.se/latest/zookeeper-3.4.10:1.6.0-15
Ports: 2181/TCP, 2888/TCP, 3888/TCP, 10007/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP
Limits:
cpu: 2
memory: 4Gi
Requests:
cpu: 1
memory: 2Gi
Liveness: exec [zkOk.sh] delay=15s timeout=5s period=10s #success=1 #failure=3
Readiness: tcp-socket :2181 delay=15s timeout=5s period=10s #success=1 #failure=3
Environment:
ZK_REPLICAS: 3
ZK_HEAP_SIZE: 1G
ZK_TICK_TIME: 2000
ZK_INIT_LIMIT: 10
ZK_SYNC_LIMIT: 5
ZK_MAX_CLIENT_CNXNS: 60
ZK_SNAP_RETAIN_COUNT: 3
ZK_PURGE_INTERVAL: 1
ZK_LOG_LEVEL: INFO
ZK_CLIENT_PORT: 2181
ZK_SERVER_PORT: 2888
ZK_ELECTION_PORT: 3888
JMXPORT: 10007
Mounts:
/var/lib/zookeeper from datadir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-nfcfx (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
datadir:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: datadir-foob-zookeeper-0
ReadOnly: false
default-token-nfcfx:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-nfcfx
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 69s (x4 over 3m50s) default-scheduler 0/2 nodes are available: 1 node(s) didn't find available persistent volumes to bind, 1 node(s) had taints that the pod didn't tolerate.
kubectl get pv
ubuntu@kmaster:~$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
local-pv 50Gi RWO Retain Available local-storage 10m
ubuntu@kmaster:~$
kubectl get pvc local-claim
ubuntu@kmaster:~$ kubectl get pvc local-claim
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
local-claim Pending local-storage 8m9s
ubuntu@kmaster:~$
kubectl describe pvc local-claim
ubuntu@kmaster:~$ kubectl describe pvc local-claim
Name: local-claim
Namespace: default
StorageClass: local-storage
Status: Pending
Volume:
Labels: <none>
Annotations: <none>
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal WaitForFirstConsumer 2m3s (x26 over 7m51s) persistentvolume-controller waiting for first consumer to be created before binding
Mounted By: <none>
My PV files:
cat create-pv.yml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/kafka-mount
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - kmaster
cat pvc.yml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: local-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  resources:
    requests:
      storage: 50Gi
It looks like you created your PV on the master node. By default, the master node is marked unschedulable for ordinary pods by a so-called taint. To be able to run a service on the master node you have two options:
1) Add a toleration to the service to allow it to run on the master node:
tolerations:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
You may even specify that the service runs only on the master node:
nodeSelector:
  node-role.kubernetes.io/master: ""
2) You can remove the taint from the master node, so any pod can run on it. You should know that this is dangerous, because it can make your cluster very unstable.
kubectl taint nodes --all node-role.kubernetes.io/master-
Read more about taints and tolerations here: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
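A quick way to check whether the taint is really what blocks scheduling is to look at it on the node directly, something like:
kubectl describe node kmaster | grep -i taints
# shows node-role.kubernetes.io/master:NoSchedule while the taint is set,
# and <none> once it has been removed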

How to run Tiller on Kubernetes cluster on AWS EKS

I created an EKS Kubernetes cluster with Terraform. It all went fine; the cluster is created and there is one EC2 machine in it. However, I can't init Helm and install Tiller there. All the code is at https://github.com/amorfis/aws-eks-terraform
As stated in the README.md, after cluster creation I update ~/.kube/config, create the RBAC resources, and try to init Helm. However, its pod is still pending:
$> kubectl --namespace kube-system get pods
NAME READY STATUS RESTARTS AGE
coredns-7554568866-8mnsm 0/1 Pending 0 3h
coredns-7554568866-mng65 0/1 Pending 0 3h
tiller-deploy-77c96688d7-87rb8 0/1 Pending 0 1h
As are the other 2 coredns pods.
What am I missing?
UPDATE: Output of describe:
$> kubectl describe pod tiller-deploy-77c96688d7-87rb8 --namespace kube-system
Name: tiller-deploy-77c96688d7-87rb8
Namespace: kube-system
Priority: 0
PriorityClassName: <none>
Node: <none>
Labels: app=helm
name=tiller
pod-template-hash=3375224483
Annotations: <none>
Status: Pending
IP:
Controlled By: ReplicaSet/tiller-deploy-77c96688d7
Containers:
tiller:
Image: gcr.io/kubernetes-helm/tiller:v2.12.2
Ports: 44134/TCP, 44135/TCP
Host Ports: 0/TCP, 0/TCP
Liveness: http-get http://:44135/liveness delay=1s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:44135/readiness delay=1s timeout=1s period=10s #success=1 #failure=3
Environment:
TILLER_NAMESPACE: kube-system
TILLER_HISTORY_MAX: 0
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from tiller-token-b9x6d (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
tiller-token-b9x6d:
Type: Secret (a volume populated by a Secret)
SecretName: tiller-token-b9x6d
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
Try to allow the master to run pods, according to this issue from GitHub:
kubectl taint nodes --all node-role.kubernetes.io/master-

Kube-dns always in pending state

I have deployed Kubernetes on a virt-manager VM following this link:
https://kubernetes.io/docs/setup/independent/install-kubeadm/
When I join another VM to the cluster, I find that kube-dns is in Pending state.
root@ubuntu1:~# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system etcd-ubuntu1 1/1 Running 0 7m
kube-system kube-apiserver-ubuntu1 1/1 Running 0 8m
kube-system kube-controller-manager-ubuntu1 1/1 Running 0 8m
kube-system kube-dns-86f4d74b45-br6ck 0/3 Pending 0 8m
kube-system kube-proxy-sh9lg 1/1 Running 0 8m
kube-system kube-proxy-zwdt5 1/1 Running 0 7m
kube-system kube-scheduler-ubuntu1 1/1 Running 0 8m
root@ubuntu1:~# kubectl --namespace=kube-system describe pod kube-dns-86f4d74b45-br6ck
Name: kube-dns-86f4d74b45-br6ck
Namespace: kube-system
Node: <none>
Labels: k8s-app=kube-dns
pod-template-hash=4290830601
Annotations: <none>
Status: Pending
IP:
Controlled By: ReplicaSet/kube-dns-86f4d74b45
Containers:
kubedns:
Image: k8s.gcr.io/k8s-dns-kube-dns-amd64:1.14.8
Ports: 10053/UDP, 10053/TCP, 10055/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
--domain=cluster.local.
--dns-port=10053
--config-dir=/kube-dns-config
--v=2
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:10054/healthcheck/kubedns delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8081/readiness delay=3s timeout=5s period=10s #success=1 #failure=3
Environment:
PROMETHEUS_PORT: 10055
Mounts:
/kube-dns-config from kube-dns-config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-dns-token-4fjt4 (ro)
dnsmasq:
Image: k8s.gcr.io/k8s-dns-dnsmasq-nanny-amd64:1.14.8
Ports: 53/UDP, 53/TCP
Host Ports: 0/UDP, 0/TCP
Args:
-v=2
-logtostderr
-configDir=/etc/k8s/dns/dnsmasq-nanny
-restartDnsmasq=true
--
-k
--cache-size=1000
--no-negcache
--log-facility=-
--server=/cluster.local/127.0.0.1#10053
--server=/in-addr.arpa/127.0.0.1#10053
--server=/ip6.arpa/127.0.0.1#10053
Requests:
cpu: 150m
memory: 20Mi
Liveness: http-get http://:10054/healthcheck/dnsmasq delay=60s timeout=5s period=10s #success=1 #failure=5
Environment: <none>
Mounts:
/etc/k8s/dns/dnsmasq-nanny from kube-dns-config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-dns-token-4fjt4 (ro)
sidecar:
Image: k8s.gcr.io/k8s-dns-sidecar-amd64:1.14.8
Port: 10054/TCP
Host Port: 0/TCP
Args:
--v=2
--logtostderr
--probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,SRV
--probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,SRV
Requests:
cpu: 10m
memory: 20Mi
Liveness: http-get http://:10054/metrics delay=60s timeout=5s period=10s #success=1 #failure=5
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-dns-token-4fjt4 (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
kube-dns-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: kube-dns
Optional: true
kube-dns-token-4fjt4:
Type: Secret (a volume populated by a Secret)
SecretName: kube-dns-token-4fjt4
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: CriticalAddonsOnly
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 6m (x7 over 7m) default-scheduler 0/1 nodes are available: 1 node(s) were not ready.
Warning FailedScheduling 3s (x19 over 6m) default-scheduler 0/2 nodes are available: 2 node(s) were not ready.
Can anyone help me deconstruct this and find the actual issue?
Any help would be of great use.
Thanks in advance.
In addition to what @justcompile wrote, you will need a minimum of 2 CPU cores in order to run all pods from the kube-system namespace without issues.
You need to verify how many resources you have on that box and compare that with the CPU reservations each of the Pods makes.
For example, in the output you provided I can see that your DNS service tries to make a reservation for 10% of a CPU core:
Requests:
cpu: 100m
You can check each of the deployed pods and their CPU reservations using:
kubectl describe pods --namespace=kube-system
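As a rough check of what the node can still offer versus what is already reserved, the Allocatable and "Allocated resources" sections of the node description are usually enough (node name taken from the output above):
kubectl describe node ubuntu1 | grep -A 6 "Allocatable:"
kubectl describe node ubuntu1 | grep -A 8 "Allocated resources:"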
In your case, the kubectl get pods --all-namespaces output does not show any pod-network pods.
So you should choose a network implementation and install a Pod Network add-on before kube-dns can be deployed fully. For details, see kube-dns is stuck in the Pending state and install pod network solution.
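For example, with a kubeadm cluster you would apply one of the network add-on manifests on the master and then watch kube-dns come up. The Flannel URL below is only illustrative and may have moved, so check the add-on's current documentation first:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# once the flannel (or calico, weave, ...) pods are Running on every node,
# kube-dns should leave Pending and start
kubectl get pods -n kube-system -w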
Firstly, if you run kubectl get nodes, does it show both/all nodes in a Ready state?
If they are: I faced this problem and found, when inspecting kubectl get events, that the pods were failing because they required a minimum of 2 CPUs to run.
As I was initially running this on an old MacBook Pro via VirtualBox, I had to give up and use AWS (other cloud platforms are of course available) in order to get multiple CPUs per node.