calico/node is not ready: BIRD is not ready: BGP not established - kubernetes

I'm running Kubernetes 1.13.2, setup using kubeadm and struggling with getting calico 3.5 up and running. The cluster is run on top of KVM.
Setup:
kubeadm init --apiserver-advertise-address=10.255.253.20 --pod-network-cidr=192.168.0.0/16
modified calico.yaml file to include:
- name: IP_AUTODETECTION_METHOD
value: "interface=ens.*"
applied rbac.yaml, etcd.yaml, calico.yaml
Output from kubectl describe pods:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 23m default-scheduler Successfully assigned kube-system/calico-node-hjwrc to k8s-master-01
Normal Pulling 23m kubelet, k8s-master-01 pulling image "quay.io/calico/cni:v3.5.0"
Normal Pulled 23m kubelet, k8s-master-01 Successfully pulled image "quay.io/calico/cni:v3.5.0"
Normal Created 23m kubelet, k8s-master-01 Created container
Normal Started 23m kubelet, k8s-master-01 Started container
Normal Pulling 23m kubelet, k8s-master-01 pulling image "quay.io/calico/node:v3.5.0"
Normal Pulled 23m kubelet, k8s-master-01 Successfully pulled image "quay.io/calico/node:v3.5.0"
Warning Unhealthy 23m kubelet, k8s-master-01 Readiness probe failed: calico/node is not ready: felix is not ready: Get http://localhost:9099/readiness: dial tcp [::1]:9099: connect: connection refused
Warning Unhealthy 23m kubelet, k8s-master-01 Liveness probe failed: Get http://localhost:9099/liveness: dial tcp [::1]:9099: connect: connection refused
Normal Created 23m (x2 over 23m) kubelet, k8s-master-01 Created container
Normal Started 23m (x2 over 23m) kubelet, k8s-master-01 Started container
Normal Pulled 23m kubelet, k8s-master-01 Container image "quay.io/calico/node:v3.5.0" already present on machine
Warning Unhealthy 3m32s (x23 over 7m12s) kubelet, k8s-master-01 Readiness probe failed: calico/node is not ready: BIRD is not ready: BGP not established with 10.255.253.22
Output from calicoctl node status:
Calico process is running.
IPv4 BGP status
+---------------+-------------------+-------+----------+---------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+---------------+-------------------+-------+----------+---------+
| 10.255.253.22 | node-to-node mesh | start | 16:24:44 | Passive |
+---------------+-------------------+-------+----------+---------+
IPv6 BGP status
No IPv6 peers found.
Output from ETCD_ENDPOINTS=http://localhost:6666 calicoctl get nodes -o yaml:
apiVersion: projectcalico.org/v3
items:
- apiVersion: projectcalico.org/v3
kind: Node
metadata:
annotations:
projectcalico.org/kube-labels: '{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","kubernetes.io/hostname":"k8s-master-01","node-role.kubernetes.io/master":""}'
creationTimestamp: 2019-01-31T16:08:56Z
labels:
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/hostname: k8s-master-01
node-role.kubernetes.io/master: ""
name: k8s-master-01
resourceVersion: "28"
uid: 82fee4dc-2572-11e9-8ab7-5254002c725d
spec:
bgp:
ipv4Address: 10.255.253.20/24
ipv4IPIPTunnelAddr: 192.168.151.128
orchRefs:
- nodeName: k8s-master-01
orchestrator: k8s
- apiVersion: projectcalico.org/v3
kind: Node
metadata:
annotations:
projectcalico.org/kube-labels: '{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","kubernetes.io/hostname":"k8s-worker-01"}'
creationTimestamp: 2019-01-31T16:24:44Z
labels:
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/hostname: k8s-worker-01
name: k8s-worker-01
resourceVersion: "170"
uid: b7c2c5a6-2574-11e9-aaa4-5254007d5f6a
spec:
bgp:
ipv4Address: 10.255.253.22/24
ipv4IPIPTunnelAddr: 192.168.36.192
orchRefs:
- nodeName: k8s-worker-01
orchestrator: k8s
kind: NodeList
metadata:
resourceVersion: "395"
Output from ETCD_ENDPOINTS=http://localhost:6666 calicoctl get bgppeers:
NAME PEERIP NODE ASN
Ouput from kubectl logs:
2019-01-31 17:01:20.519 [INFO][48] int_dataplane.go 751: Applying dataplane updates
2019-01-31 17:01:20.519 [INFO][48] ipsets.go 223: Asked to resync with the dataplane on next update. family="inet"
2019-01-31 17:01:20.519 [INFO][48] ipsets.go 254: Resyncing ipsets with dataplane. family="inet"
2019-01-31 17:01:20.523 [INFO][48] ipsets.go 304: Finished resync family="inet" numInconsistenciesFound=0 resyncDuration=3.675284ms
2019-01-31 17:01:20.523 [INFO][48] int_dataplane.go 765: Finished applying updates to dataplane. msecToApply=4.124166000000001
bird: BGP: Unexpected connect from unknown address 10.255.253.14 (port 36329)
bird: BGP: Unexpected connect from unknown address 10.255.253.14 (port 52383)
2019-01-31 17:01:23.182 [INFO][48] health.go 150: Overall health summary=&health.HealthReport{Live:true, Ready:true}
bird: BGP: Unexpected connect from unknown address 10.255.253.14 (port 39661)
2019-01-31 17:01:25.433 [INFO][48] health.go 150: Overall health summary=&health.HealthReport{Live:true, Ready:true}
bird: BGP: Unexpected connect from unknown address 10.255.253.14 (port 57359)
bird: BGP: Unexpected connect from unknown address 10.255.253.14 (port 47151)
bird: BGP: Unexpected connect from unknown address 10.255.253.14 (port 39243)
2019-01-31 17:01:30.943 [INFO][48] int_dataplane.go 751: Applying dataplane updates
2019-01-31 17:01:30.943 [INFO][48] ipsets.go 223: Asked to resync with the dataplane on next update. family="inet"
2019-01-31 17:01:30.943 [INFO][48] ipsets.go 254: Resyncing ipsets with dataplane. family="inet"
2019-01-31 17:01:30.945 [INFO][48] ipsets.go 304: Finished resync family="inet" numInconsistenciesFound=0 resyncDuration=2.369997ms
2019-01-31 17:01:30.946 [INFO][48] int_dataplane.go 765: Finished applying updates to dataplane. msecToApply=2.8165820000000004
bird: BGP: Unexpected connect from unknown address 10.255.253.14 (port 60641)
2019-01-31 17:01:33.190 [INFO][48] health.go 150: Overall health summary=&health.HealthReport{Live:true, Ready:true}
Note: the above unknown address (10.255.253.14) is the IP under br0 on the KVM host, not too sure why it's made an appearance.

I got the solution :
The first preference of ifconfig(in my case) through that it will try to connect to the worker-nodes which is not the right ip.
Solution:Change the calico.yaml file to override that ip to etho-ip by using the following steps.
Need to open port Calico networking (BGP) - TCP 179
# Specify interface
- name: IP_AUTODETECTION_METHOD
value: "interface=eth1"
calico.yaml
---
# Source: calico/templates/calico-config.yaml
# This ConfigMap is used to configure a self-hosted Calico installation.
kind: ConfigMap
apiVersion: v1
metadata:
name: calico-config
namespace: kube-system
data:
# Typha is disabled.
typha_service_name: "none"
# Configure the backend to use.
calico_backend: "bird"
# Configure the MTU to use
veth_mtu: "1440"
# The CNI network configuration to install on each node. The special
# values in this config will be automatically populated.
cni_network_config: |-
{
"name": "k8s-pod-network",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "calico",
"log_level": "info",
"datastore_type": "kubernetes",
"nodename": "__KUBERNETES_NODE_NAME__",
"mtu": __CNI_MTU__,
"ipam": {
"type": "calico-ipam"
},
"policy": {
"type": "k8s"
},
"kubernetes": {
"kubeconfig": "__KUBECONFIG_FILEPATH__"
}
},
{
"type": "portmap",
"snat": true,
"capabilities": {"portMappings": true}
}
]
}
---
# Source: calico/templates/kdd-crds.yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: felixconfigurations.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: FelixConfiguration
plural: felixconfigurations
singular: felixconfiguration
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: ipamblocks.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: IPAMBlock
plural: ipamblocks
singular: ipamblock
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: blockaffinities.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: BlockAffinity
plural: blockaffinities
singular: blockaffinity
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: ipamhandles.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: IPAMHandle
plural: ipamhandles
singular: ipamhandle
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: ipamconfigs.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: IPAMConfig
plural: ipamconfigs
singular: ipamconfig
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: bgppeers.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: BGPPeer
plural: bgppeers
singular: bgppeer
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: bgpconfigurations.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: BGPConfiguration
plural: bgpconfigurations
singular: bgpconfiguration
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: ippools.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: IPPool
plural: ippools
singular: ippool
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: hostendpoints.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: HostEndpoint
plural: hostendpoints
singular: hostendpoint
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: clusterinformations.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: ClusterInformation
plural: clusterinformations
singular: clusterinformation
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: globalnetworkpolicies.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: GlobalNetworkPolicy
plural: globalnetworkpolicies
singular: globalnetworkpolicy
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: globalnetworksets.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: GlobalNetworkSet
plural: globalnetworksets
singular: globalnetworkset
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: networkpolicies.crd.projectcalico.org
spec:
scope: Namespaced
group: crd.projectcalico.org
version: v1
names:
kind: NetworkPolicy
plural: networkpolicies
singular: networkpolicy
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: networksets.crd.projectcalico.org
spec:
scope: Namespaced
group: crd.projectcalico.org
version: v1
names:
kind: NetworkSet
plural: networksets
singular: networkset
---
# Source: calico/templates/rbac.yaml
# Include a clusterrole for the kube-controllers component,
# and bind it to the calico-kube-controllers serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: calico-kube-controllers
rules:
# Nodes are watched to monitor for deletions.
- apiGroups: [""]
resources:
- nodes
verbs:
- watch
- list
- get
# Pods are queried to check for existence.
- apiGroups: [""]
resources:
- pods
verbs:
- get
# IPAM resources are manipulated when nodes are deleted.
- apiGroups: ["crd.projectcalico.org"]
resources:
- ippools
verbs:
- list
- apiGroups: ["crd.projectcalico.org"]
resources:
- blockaffinities
- ipamblocks
- ipamhandles
verbs:
- get
- list
- create
- update
- delete
# Needs access to update clusterinformations.
- apiGroups: ["crd.projectcalico.org"]
resources:
- clusterinformations
verbs:
- get
- create
- update
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: calico-kube-controllers
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: calico-kube-controllers
subjects:
- kind: ServiceAccount
name: calico-kube-controllers
namespace: kube-system
---
# Include a clusterrole for the calico-node DaemonSet,
# and bind it to the calico-node serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: calico-node
rules:
# The CNI plugin needs to get pods, nodes, and namespaces.
- apiGroups: [""]
resources:
- pods
- nodes
- namespaces
verbs:
- get
- apiGroups: [""]
resources:
- endpoints
- services
verbs:
# Used to discover service IPs for advertisement.
- watch
- list
# Used to discover Typhas.
- get
- apiGroups: [""]
resources:
- nodes/status
verbs:
# Needed for clearing NodeNetworkUnavailable flag.
- patch
# Calico stores some configuration information in node annotations.
- update
# Watch for changes to Kubernetes NetworkPolicies.
- apiGroups: ["networking.k8s.io"]
resources:
- networkpolicies
verbs:
- watch
- list
# Used by Calico for policy information.
- apiGroups: [""]
resources:
- pods
- namespaces
- serviceaccounts
verbs:
- list
- watch
# The CNI plugin patches pods/status.
- apiGroups: [""]
resources:
- pods/status
verbs:
- patch
# Calico monitors various CRDs for config.
- apiGroups: ["crd.projectcalico.org"]
resources:
- globalfelixconfigs
- felixconfigurations
- bgppeers
- globalbgpconfigs
- bgpconfigurations
- ippools
- ipamblocks
- globalnetworkpolicies
- globalnetworksets
- networkpolicies
- networksets
- clusterinformations
- hostendpoints
verbs:
- get
- list
- watch
# Calico must create and update some CRDs on startup.
- apiGroups: ["crd.projectcalico.org"]
resources:
- ippools
- felixconfigurations
- clusterinformations
verbs:
- create
- update
# Calico stores some configuration information on the node.
- apiGroups: [""]
resources:
- nodes
verbs:
- get
- list
- watch
# These permissions are only requried for upgrade from v2.6, and can
# be removed after upgrade or on fresh installations.
- apiGroups: ["crd.projectcalico.org"]
resources:
- bgpconfigurations
- bgppeers
verbs:
- create
- update
# These permissions are required for Calico CNI to perform IPAM allocations.
- apiGroups: ["crd.projectcalico.org"]
resources:
- blockaffinities
- ipamblocks
- ipamhandles
verbs:
- get
- list
- create
- update
- delete
- apiGroups: ["crd.projectcalico.org"]
resources:
- ipamconfigs
verbs:
- get
# Block affinities must also be watchable by confd for route aggregation.
- apiGroups: ["crd.projectcalico.org"]
resources:
- blockaffinities
verbs:
- watch
# The Calico IPAM migration needs to get daemonsets. These permissions can be
# removed if not upgrading from an installation using host-local IPAM.
- apiGroups: ["apps"]
resources:
- daemonsets
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: calico-node
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: calico-node
subjects:
- kind: ServiceAccount
name: calico-node
namespace: kube-system
---
# Source: calico/templates/calico-node.yaml
# This manifest installs the calico-node container, as well
# as the CNI plugins and network config on
# each master and worker node in a Kubernetes cluster.
kind: DaemonSet
apiVersion: apps/v1
metadata:
name: calico-node
namespace: kube-system
labels:
k8s-app: calico-node
spec:
selector:
matchLabels:
k8s-app: calico-node
updateStrategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
template:
metadata:
labels:
k8s-app: calico-node
annotations:
# This, along with the CriticalAddonsOnly toleration below,
# marks the pod as a critical add-on, ensuring it gets
# priority scheduling and that its resources are reserved
# if it ever gets evicted.
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
nodeSelector:
beta.kubernetes.io/os: linux
hostNetwork: true
tolerations:
# Make sure calico-node gets scheduled on all nodes.
- effect: NoSchedule
operator: Exists
# Mark the pod as a critical add-on for rescheduling.
- key: CriticalAddonsOnly
operator: Exists
- effect: NoExecute
operator: Exists
serviceAccountName: calico-node
# Minimize downtime during a rolling upgrade or deletion; tell Kubernetes to do a "force
# deletion": https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods.
terminationGracePeriodSeconds: 0
priorityClassName: system-node-critical
initContainers:
# This container performs upgrade from host-local IPAM to calico-ipam.
# It can be deleted if this is a fresh installation, or if you have already
# upgraded to use calico-ipam.
- name: upgrade-ipam
image: calico/cni:v3.8.2
command: ["/opt/cni/bin/calico-ipam", "-upgrade"]
env:
- name: KUBERNETES_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: CALICO_NETWORKING_BACKEND
valueFrom:
configMapKeyRef:
name: calico-config
key: calico_backend
volumeMounts:
- mountPath: /var/lib/cni/networks
name: host-local-net-dir
- mountPath: /host/opt/cni/bin
name: cni-bin-dir
# This container installs the CNI binaries
# and CNI network config file on each node.
- name: install-cni
image: calico/cni:v3.8.2
command: ["/install-cni.sh"]
env:
# Name of the CNI config file to create.
- name: CNI_CONF_NAME
value: "10-calico.conflist"
# The CNI network config to install on each node.
- name: CNI_NETWORK_CONFIG
valueFrom:
configMapKeyRef:
name: calico-config
key: cni_network_config
# Set the hostname based on the k8s node name.
- name: KUBERNETES_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
# CNI MTU Config variable
- name: CNI_MTU
valueFrom:
configMapKeyRef:
name: calico-config
key: veth_mtu
# Prevents the container from sleeping forever.
- name: SLEEP
value: "false"
volumeMounts:
- mountPath: /host/opt/cni/bin
name: cni-bin-dir
- mountPath: /host/etc/cni/net.d
name: cni-net-dir
# Adds a Flex Volume Driver that creates a per-pod Unix Domain Socket to allow Dikastes
# to communicate with Felix over the Policy Sync API.
- name: flexvol-driver
image: calico/pod2daemon-flexvol:v3.8.2
volumeMounts:
- name: flexvol-driver-host
mountPath: /host/driver
containers:
# Runs calico-node container on each Kubernetes node. This
# container programs network policy and routes on each
# host.
- name: calico-node
image: calico/node:v3.8.2
env:
# Use Kubernetes API as the backing datastore.
- name: DATASTORE_TYPE
value: "kubernetes"
# Wait for the datastore.
- name: WAIT_FOR_DATASTORE
value: "true"
# Set based on the k8s node name.
- name: NODENAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
# Choose the backend to use.
- name: CALICO_NETWORKING_BACKEND
valueFrom:
configMapKeyRef:
name: calico-config
key: calico_backend
# Cluster type to identify the deployment type
- name: CLUSTER_TYPE
value: "k8s,bgp"
# Specify interface
- name: IP_AUTODETECTION_METHOD
value: "interface=eth1"
# Auto-detect the BGP IP address.
- name: IP
value: "autodetect"
# Enable IPIP
- name: CALICO_IPV4POOL_IPIP
value: "Always"
# Set MTU for tunnel device used if ipip is enabled
- name: FELIX_IPINIPMTU
valueFrom:
configMapKeyRef:
name: calico-config
key: veth_mtu
# The default IPv4 pool to create on startup if none exists. Pod IPs will be
# chosen from this range. Changing this value after installation will have
# no effect. This should fall within `--cluster-cidr`.
- name: CALICO_IPV4POOL_CIDR
value: "192.168.0.0/16"
# Disable file logging so `kubectl logs` works.
- name: CALICO_DISABLE_FILE_LOGGING
value: "true"
# Set Felix endpoint to host default action to ACCEPT.
- name: FELIX_DEFAULTENDPOINTTOHOSTACTION
value: "ACCEPT"
# Disable IPv6 on Kubernetes.
- name: FELIX_IPV6SUPPORT
value: "false"
# Set Felix logging to "info"
- name: FELIX_LOGSEVERITYSCREEN
value: "info"
- name: FELIX_HEALTHENABLED
value: "true"
securityContext:
privileged: true
resources:
requests:
cpu: 250m
livenessProbe:
httpGet:
path: /liveness
port: 9099
host: localhost
periodSeconds: 10
initialDelaySeconds: 10
failureThreshold: 6
readinessProbe:
exec:
command:
- /bin/calico-node
- -bird-ready
- -felix-ready
periodSeconds: 10
volumeMounts:
- mountPath: /lib/modules
name: lib-modules
readOnly: true
- mountPath: /run/xtables.lock
name: xtables-lock
readOnly: false
- mountPath: /var/run/calico
name: var-run-calico
readOnly: false
- mountPath: /var/lib/calico
name: var-lib-calico
readOnly: false
- name: policysync
mountPath: /var/run/nodeagent
volumes:
# Used by calico-node.
- name: lib-modules
hostPath:
path: /lib/modules
- name: var-run-calico
hostPath:
path: /var/run/calico
- name: var-lib-calico
hostPath:
path: /var/lib/calico
- name: xtables-lock
hostPath:
path: /run/xtables.lock
type: FileOrCreate
# Used to install CNI.
- name: cni-bin-dir
hostPath:
path: /opt/cni/bin
- name: cni-net-dir
hostPath:
path: /etc/cni/net.d
# Mount in the directory for host-local IPAM allocations. This is
# used when upgrading from host-local to calico-ipam, and can be removed
# if not using the upgrade-ipam init container.
- name: host-local-net-dir
hostPath:
path: /var/lib/cni/networks
# Used to create per-pod Unix Domain Sockets
- name: policysync
hostPath:
type: DirectoryOrCreate
path: /var/run/nodeagent
# Used to install Flex Volume Driver
- name: flexvol-driver-host
hostPath:
type: DirectoryOrCreate
path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: calico-node
namespace: kube-system
---
# Source: calico/templates/calico-kube-controllers.yaml
# See https://github.com/projectcalico/kube-controllers
apiVersion: apps/v1
kind: Deployment
metadata:
name: calico-kube-controllers
namespace: kube-system
labels:
k8s-app: calico-kube-controllers
spec:
# The controllers can only have a single active instance.
replicas: 1
selector:
matchLabels:
k8s-app: calico-kube-controllers
strategy:
type: Recreate
template:
metadata:
name: calico-kube-controllers
namespace: kube-system
labels:
k8s-app: calico-kube-controllers
annotations:
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
nodeSelector:
beta.kubernetes.io/os: linux
tolerations:
# Mark the pod as a critical add-on for rescheduling.
- key: CriticalAddonsOnly
operator: Exists
- key: node-role.kubernetes.io/master
effect: NoSchedule
serviceAccountName: calico-kube-controllers
priorityClassName: system-cluster-critical
containers:
- name: calico-kube-controllers
image: calico/kube-controllers:v3.8.2
env:
# Choose which controllers to run.
- name: ENABLED_CONTROLLERS
value: node
- name: DATASTORE_TYPE
value: kubernetes
readinessProbe:
exec:
command:
- /usr/bin/check-status
- -r
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: calico-kube-controllers
namespace: kube-system
---
# Source: calico/templates/calico-etcd-secrets.yaml
---
# Source: calico/templates/calico-typha.yaml
---
# Source: calico/templates/configure-canal.yaml

As an addition to this it can also be set using the following kubectl command;
kubectl set env daemonset/calico-node -n kube-system
IP_AUTODETECTION_METHOD=interface=eth1

I restarted the docker service on that node, and the issue has been fixed.

the best solution :
Make sure that TCP port 179 is open https://projectcalico.docs.tigera.io/getting-started/kubernetes/requirements
firewall-cmd --zone=public --add-port=179/tcp --permanent
firewall-cmd --reload
2. Then put this command
kubectl set env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=interface=eth.*

In my case i just uncommented - name: CALICO_IPV4POOL_CIDR value: "192.168.0.0/16"
or you can change the CALICO_IPV4POOL_CIDR value to the same range as kubernete's --pod-network-cidr value.

Disable Firewall
sudo systemctl stop firewalld.service
sudo systemctl disable firewalld.service
And restart your pod
kubectl delete pod <pod-name> -n <name-ns>
check
watch kubectl get pod -n <names-ns>

This is normal behavior. There must be some not working node in normal status. You can use kubectl get node --all-namespaces to check it. After you recover the problem node, the problem will do away.

Related

ErrImageNeverPull: Container image "myjenkins:latest" is not present with pull policy of Never

I'm going through a tutorial on running jenkins on your kubernetes cluster. In the tutorial they're using minikube and for my existing cluster it's running on eks. When I apply my jenkins.yaml file, the pod it creates gets this error
Normal Scheduled 27m default-scheduler Successfully assigned default/jenkins-799666d8db-ft642 to ip-192-168-84-126.us-west-2.compute.internal
Warning Failed 24m (x12 over 27m) kubelet Error: ErrImageNeverPull
Warning ErrImageNeverPull 114s (x116 over 27m) kubelet Container image "myjenkins:latest" is not present with pull policy of Never
This was from describing the pod ^
Here's my jenkins.yaml file that I'm using to try to run jenkins on my cluster
apiVersion: v1
kind: ServiceAccount
metadata:
name: jenkins
namespace: default
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: jenkins
namespace: default
rules:
- apiGroups: [""]
resources: ["pods","services"]
verbs: ["create","delete","get","list","patch","update","watch"]
- apiGroups: ["apps"]
resources: ["deployments"]
verbs: ["create","delete","get","list","patch","update","watch"]
- apiGroups: [""]
resources: ["pods/exec"]
verbs: ["create","delete","get","list","patch","update","watch"]
- apiGroups: [""]
resources: ["pods/log"]
verbs: ["get","list","watch"]
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get"]
- apiGroups: [""]
resources: ["persistentvolumeclaims"]
verbs: ["create","delete","get","list","patch","update","watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: jenkins
namespace: default
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: jenkins
subjects:
- kind: ServiceAccount
name: jenkins
---
# Allows jenkins to create persistent volumes
# This cluster role binding allows anyone in the "manager" group to read secrets in any namespace.
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: jenkins-crb
subjects:
- kind: ServiceAccount
namespace: default
name: jenkins
roleRef:
kind: ClusterRole
name: jenkinsclusterrole
apiGroup: rbac.authorization.k8s.io
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
# "namespace" omitted since ClusterRoles are not namespaced
name: jenkinsclusterrole
rules:
- apiGroups: [""]
resources: ["persistentvolumes"]
verbs: ["create","delete","get","list","patch","update","watch"]
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: jenkins
namespace: default
spec:
selector:
matchLabels:
app: jenkins
replicas: 1
template:
metadata:
labels:
app: jenkins
spec:
containers:
- name: jenkins
image: myjenkins:latest
env:
- name: JAVA_OPTS
value: -Djenkins.install.runSetupWizard=false
ports:
- name: http-port
containerPort: 8080
- name: jnlp-port
containerPort: 50000
volumeMounts:
- name: jenkins-home
mountPath: /var/jenkins_home
- name: docker-sock-volume
mountPath: "/var/run/docker.sock"
imagePullPolicy: Never
volumes:
# This allows jenkins to use the docker daemon on the host, for running builds
# see https://stackoverflow.com/questions/27879713/is-it-ok-to-run-docker-from-inside-docker
- name: docker-sock-volume
hostPath:
path: /var/run/docker.sock
- name: jenkins-home
hostPath:
path: /mnt/jenkins-store
serviceAccountName: jenkins
---
apiVersion: v1
kind: Service
metadata:
name: jenkins
namespace: default
spec:
type: NodePort
ports:
- name: ui
port: 8080
targetPort: 8080
nodePort: 31000
- name: jnlp
port: 50000
targetPort: 50000
selector:
app: jenkins
Edit:
So far I tried removing imagePullPolicy: Never and tried it again and got a different error
Warning Failed 17s (x2 over 32s) kubelet Failed to pull image "myjenkins:latest": rpc error: code = Unknown desc = Error response from daemon: pull access denied for myjenkins, repository does not exist or may re
quire 'docker login': denied: requested access to the resource is denied
I tried running docker login and logging in and I'm still getting this same error ^. I tried changing imagePullPolicy: Never to Always and received the same error
After changing the image to jenkins/jenkins:lts it's still crashing and when I describe, this is what it says
Normal Scheduled 4m37s default-scheduler Successfully assigned default/jenkins-776574886b-x2l8p to ip-192-168-77-17.us-west-2.compute.internal
Normal Pulled 4m26s kubelet Successfully pulled image "jenkins/jenkins:lts" in 11.07948886s
Normal Pulled 4m22s kubelet Successfully pulled image "jenkins/jenkins:lts" in 908.246481ms
Normal Pulled 4m7s kubelet Successfully pulled image "jenkins/jenkins:lts" in 885.936781ms
Normal Created 3m39s (x4 over 4m23s) kubelet Created container jenkins
Normal Started 3m39s (x4 over 4m23s) kubelet Started container jenkins
Normal Pulled 3m39s kubelet Successfully pulled image "jenkins/jenkins:lts" in 895.651242ms
Warning BackOff 3m3s (x8 over 4m20s) kubelet Back-off restarting failed container
When I try to run "kubectl logs" on that pod I even get an error for that, which I've never received before when getting logs
touch: cannot touch '/var/jenkins_home/copy_reference_file.log': Permission denied
Can not write to /var/jenkins_home/copy_reference_file.log. Wrong volume permissions?
Also had to change my volumemount for jenkins to this and it worked!
I found another resource online saying to change my jenkins volume mount to this to fix the permissions issue and my container works now
`
volumeMounts:
- mountPath: /var
name: jenkins-volume
subPath: jenkins_home`
As you already did, removing imagePullPolicy: Never would solve your first problem. Your second problem comes from the fact that you are trying to pull an image called myjenkins:latest, which doesn't exist. What you most likely want is this image.
Change
image: myjenkins:latest
to
image: jenkins/jenkins:lts

filebeat Failed to list *v1.Pod: Unauthorized - permmision issue

i am trying to deploy a filebeat deamonset on my aks cluster
i want it to run on every node and collect all the logs generated by the pods
to do so i have 5 steps
1.create user
2.create role with appropriate permissions
3.bind them
4.create config map
5.create deamonset utilizing the config map
everything was created just fine.
however upon inspection of the filebeat logs i see the following messages indicating filebeat does not have permission to list pods:
E0519 16:19:18.243183 1 reflector.go:125] github.com/elastic/beats/libbeat/common/kubernetes/watcher.go:235: Failed to list *v1.Pod: Unauthorized
E0519 16:19:19.251644 1 reflector.go:125] github.com/elastic/beats/libbeat/common/kubernetes/watcher.go:235: Failed to list *v1.Pod: Unauthorized
this is my yml:
apiVersion: v1
kind: ServiceAccount
metadata:
name: filebeat
namespace: default
labels:
k8s-app: filebeat
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: filebeat
namespace: default
labels:
k8s-app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
resources:
- namespaces
- pods
verbs:
- get
- watch
- list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: filebeat
namespace: default
subjects:
- kind: ServiceAccount
name: filebeat
namespace: default
roleRef:
kind: ClusterRole
name: filebeat
apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ConfigMap
metadata:
name: filebeat-config
labels:
k8s-app: filebeat
data:
filebeat.yml: |-
filebeat.inputs:
- type: container
enabled: true
paths:
- /var/log/containers/*.log
# If you setup helm for your cluster and want to investigate its logs, comment out this section.
exclude_files: ['tiller-deploy-*']
# To be used by Logstash for distinguishing index names while writing to elasticsearch.
fields_under_root: true
fields:
index_prefix: k8s-logs
# Enrich events with k8s, cloud metadata
processors:
- add_cloud_metadata:
- add_host_metadata:
- add_kubernetes_metadata:
host: ${NODE_NAME}
matchers:
- logs_path:
logs_path: "/var/log/containers/"
# Send events to Logstash.
output.logstash:
enabled: true
hosts: ["logstash-logstash-headless.elk-stack:9600"]
# You can set logging.level to debug to see the generated events by the running filebeat instance.
logging.level: info
logging.to_files: false
logging.files:
path: /var/log/filebeat
name: filebeat
keepfiles: 7
permissions: 0644
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: filebeat
labels:
k8s-app: filebeat
spec:
selector:
matchLabels:
k8s-app: filebeat
template:
metadata:
labels:
k8s-app: filebeat
spec:
# Refers to our previously defined ServiceAccount.
serviceAccountName: filebeat
terminationGracePeriodSeconds: 30
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
containers:
- name: filebeat
image: docker.elastic.co/beats/filebeat:7.5.0
args: [
"-c", "/etc/filebeat.yml",
"-e",
]
env:
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
securityContext:
runAsUser: 0
# If using Red Hat OpenShift uncomment this:
#privileged: true
resources: # comment out for using full speed
limits:
memory: 200Mi
requests:
cpu: 500m
memory: 100Mi
volumeMounts:
- name: config
mountPath: /etc/filebeat.yml
readOnly: true
subPath: filebeat.yml
- name: data
mountPath: /usr/share/filebeat/data
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
volumes:
# Bind previously defined ConfigMap
- name: config
configMap:
defaultMode: 0600
name: filebeat-config
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
- name: varlog
hostPath:
path: /var/log
# data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
- name: data
hostPath:
path: /var/lib/filebeat-data
type: DirectoryOrCreate
any idea what might be the problem?

How to fix "Failed to watch *v1beta1.IngressClass: failed to list *v1beta1.IngressClass: ingressclasses.networking.k8s.io is forbidden"

I have HA proxy ingress in Kubernetes AKS. After upgrading Kubernetes version, I get errors from HA proxy. I tried to solve the problem modifying my old haproxy.yaml to avoid deprecated API's and to get the latest image of HA proxy ingress. But the error persist. How can I fix the errors?.
I also tried this answer, but it doesn't work for me.
I checked this issue on github, but despite I use v0.12-snapshot.3 the error persist.
This is my modified haproxy.yaml:
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: ingress-controller
namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: ingress-controller
rules:
- apiGroups:
- ""
resources:
- configmaps
- endpoints
- nodes
- pods
- secrets
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- apiGroups:
- ""
resources:
- services
verbs:
- get
- list
- watch
- apiGroups:
- "extensions"
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- events
verbs:
- create
- patch
- apiGroups:
- "extensions"
resources:
- ingresses/status
verbs:
- update
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
name: ingress-controller
namespace: default
rules:
- apiGroups:
- ""
resources:
- configmaps
- pods
- secrets
- namespaces
verbs:
- get
- apiGroups:
- ""
resources:
- configmaps
verbs:
- get
- update
- apiGroups:
- ""
resources:
- configmaps
verbs:
- create
- apiGroups:
- ""
resources:
- endpoints
verbs:
- get
- create
- update
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: ingress-controller
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: ingress-controller
subjects:
- kind: ServiceAccount
name: ingress-controller
namespace: default
- apiGroup: rbac.authorization.k8s.io
kind: User
name: ingress-controller
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
name: ingress-controller
namespace: default
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: ingress-controller
subjects:
- kind: ServiceAccount
name: ingress-controller
namespace: default
- apiGroup: rbac.authorization.k8s.io
kind: User
name: ingress-controller
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
run: ingress-default-backend
name: ingress-default-backend
namespace: default
spec:
selector:
matchLabels:
run: ingress-default-backend
template:
metadata:
labels:
run: ingress-default-backend
spec:
containers:
- name: ingress-default-backend
image: gcr.io/google_containers/defaultbackend:1.0
ports:
- containerPort: 8080
resources:
limits:
cpu: 10m
memory: 20Mi
---
apiVersion: v1
kind: Service
metadata:
name: ingress-default-backend
namespace: default
spec:
ports:
- port: 8080
selector:
run: ingress-default-backend
---
apiVersion: v1
kind: ConfigMap
metadata:
name: haproxy-ingress
namespace: default
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
run: haproxy-ingress
name: haproxy-ingress
spec:
selector:
matchLabels:
run: haproxy-ingress
template:
metadata:
labels:
run: haproxy-ingress
spec:
serviceAccountName: ingress-controller
containers:
- name: haproxy-ingress
image: quay.io/jcmoraisjr/haproxy-ingress:v0.12.1
imagePullPolicy: Always
resources:
requests:
memory: "64Mi"
cpu: "75m"
limits:
memory: "256Mi"
cpu: "500m"
args:
- --default-backend-service=$(POD_NAMESPACE)/ingress-default-backend
- --configmap=$(POD_NAMESPACE)/haproxy-ingress
- --reload-strategy=reusesocket
ports:
- name: https
containerPort: 443
- name: stat
containerPort: 1936
livenessProbe:
httpGet:
path: /healthz
port: 10253
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
---
apiVersion: v1
kind: Service
metadata:
labels:
run: haproxy-ingress
name: haproxy-ingress
namespace: default
spec:
type: LoadBalancer
externalTrafficPolicy: Local
ports:
- name: https
port: 443
- name: stat
port: 1936
selector:
run: haproxy-ingress
The following is the output of kubectl logs :
I0307 20:52:16.873675 6 launch.go:215]
Name: HAProxy
Release: v0.12-snapshot.3
Build: git-b34edd0
Repository: https://github.com/jcmoraisjr/haproxy-ingress
I0307 20:52:16.873776 6 launch.go:218] watching for ingress resources with 'kubernetes.io/ingress.class' annotation: haproxy
I0307 20:52:16.873787 6 launch.go:225] watching for ingress resources with IngressClass' controller name: haproxy-ingress.github.io/controller
I0307 20:52:16.873802 6 launch.go:230] ignoring ingress resources without any class reference - --watch-ingress-without-class is false
I0307 20:52:16.873968 6 launch.go:492] Creating API client for https://10.0.0.1:443
I0307 20:52:16.902520 6 launch.go:504] Running in Kubernetes Cluster version v1.17 (v1.17.16) - git (clean) commit d88fadbd65c5e8bde22630d251766a634c7613b0 - platform linux/amd64
I0307 20:52:16.908078 6 launch.go:257] validated default/ingress-default-backend as the default backend
I0307 20:52:18.693995 6 listers.go:134] loading object cache...
E0307 20:52:18.696953 6 reflector.go:127] pkg/mod/k8s.io/client-go#v0.19.0/tools/cache/reflector.go:156: Failed to watch *v1beta1.IngressClass: failed to list *v1beta1.IngressClass: ingressclasses.networking.k8s.io is forbidden: User "system:serviceaccount:default:ingress-controller" cannot list resource "ingressclasses" in API group "networking.k8s.io" at the cluster scope
E0307 20:52:19.982962 6 reflector.go:127] pkg/mod/k8s.io/client-go#v0.19.0/tools/cache/reflector.go:156: Failed to watch *v1beta1.IngressClass: failed to list *v1beta1.IngressClass: ingressclasses.networking.k8s.io is forbidden: User "system:serviceaccount:default:ingress-controller" cannot list resource "ingressclasses" in API group "networking.k8s.io" at the cluster scope
E0307 20:52:23.089836 6 reflector.go:127] pkg/mod/k8s.io/client-go#v0.19.0/tools/cache/reflector.go:156: Failed to watch *v1beta1.IngressClass: failed to list *v1beta1.IngressClass: ingressclasses.networking.k8s.io is forbidden: User "system:serviceaccount:default:ingress-controller" cannot list resource "ingressclasses" in API group "networking.k8s.io" at the cluster scope
E0307 20:52:28.419408 6 reflector.go:127] pkg/mod/k8s.io/client-go#v0.19.0/tools/cache/reflector.go:156: Failed to watch *v1beta1.IngressClass: failed to list *v1beta1.IngressClass: ingressclasses.networking.k8s.io is forbidden: User "system:serviceaccount:default:ingress-controller" cannot list resource "ingressclasses" in API group "networking.k8s.io" at the cluster scope
E0307 20:52:37.624105 6 reflector.go:127] pkg/mod/k8s.io/client-go#v0.19.0/tools/cache/reflector.go:156: Failed to watch *v1beta1.IngressClass: failed to list *v1beta1.IngressClass: ingressclasses.networking.k8s.io is forbidden: User "system:serviceaccount:default:ingress-controller" cannot list resource "ingressclasses" in API group "networking.k8s.io" at the cluster scope
I0307 20:52:45.320562 6 main.go:47] Shutting down with signal terminated
I0307 20:52:45.320631 6 controller.go:208] shutting down controller queues
E0307 20:52:45.320675 6 listers.go:132] initial cache sync has timed out or shutdown has requested
I0307 20:52:45.320711 6 controller.go:87] HAProxy Ingress successfully initialized
I0307 20:52:45.320722 6 main.go:40] Exiting (0)
As per #jesús-lópez comment, upgrading the kubernetes version to 1.18.4 from 1.17 and reinstalling haproxy resolved the issue.

kuberentes communication is not working cross nodes

I have a procedure of installing kubernetes cluster via kubeadm and it worked multiple times.
for some reason now I have a cluster which I installed and for some reason the nodes are having trouble communicating.
the problem reflect in couple of ways :
sometimes the cluster is unable to resolve global dns records such as mirrorlist.centos.org
sometimes one pod from a specific node has no connectivity to another pod in different node
my kubernetes version is 1.9.2
my hosts are centOS 7.4
I use flannel as cni plugin in version 0.9.1
my cluster is built on AWS
mt debugging so far was :
kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}' - to see subnets
10.244.0.0/24 10.244.1.0/24
I tried adding configurations to kubedns ( even though it is needed in all my other clusters ) like https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/#configure-stub-domain-and-upstream-dns-servers
I tried installing busybox and ding nslookup to cluster kubernetes.default and it only works of busybox is on the same node as the dns ( tried this link https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/
I even tried creating an AMI from other running environments and deploying it as a node to this cluster and it still fails.
I tried checking if some port is missing so I even opened all ports between nodes
I also disabled iptables and firewall and all nodes just to make sure it is not the reason
nothing helps.
please any tip would help
edit :
I added my flannel configuration:
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: flannel
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- ""
resources:
- nodes
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes/status
verbs:
- patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: flannel
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: flannel
subjects:
- kind: ServiceAccount
name: flannel
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: flannel
namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
name: kube-flannel-cfg
namespace: kube-system
labels:
tier: node
app: flannel
data:
cni-conf.json: |
{
"name": "cbr0",
"type": "flannel",
"delegate": {
"isDefaultGateway": true
}
}
net-conf.json: |
{
"Network": "10.244.0.0/16",
"Backend": {
"Type": "vxlan"
}
}
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
name: kube-flannel-ds
namespace: kube-system
labels:
tier: node
app: flannel
spec:
template:
metadata:
labels:
tier: node
app: flannel
spec:
hostNetwork: true
nodeSelector:
beta.kubernetes.io/arch: amd64
tolerations:
- key: node-role.kubernetes.io/master
operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: quay.io/coreos/flannel:v0.9.1-amd64
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conf
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: quay.io/coreos/flannel:v0.9.1-amd64
command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr" ]
securityContext:
privileged: true
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
the issues was that the AWS machines were provisioned not by me and the team that provisioned the machines assured that all internal traffic is opened.
after a lot of debugging with nmap I found out that UDP ports are not opened and since flannel requires UDP traffic the communication was not working properly.
once UDP was opened issues got solved.

K8s ingress: nginx ingress controller is not in running mode

I have a jenkins image, I made service as NodeType. It works well. Since I will add more services, I need to use ingress nginx to divert traffic to different kinds of services.
At this moment, I use my win10 to set up two vms (Centos 7.5). One vm as master1, it has two internal IPv4 address (10.0.2.9 and 192.168.56.103) and one vm as worker node4 (10.0.2.6 and 192.168.56.104).
All images are local. I have downloaded into local docker image repository. The problem is that Nginx ingress does not run.
My configuration as follows:
ingress-nginx-ctl.yaml:
apiVersion: extensions/v1beta1
metadata:
name: ingress-nginx
namespace: default
spec:
replicas: 1
template:
metadata:
labels:
app: ingress-nginx
spec:
terminationGracePeriodSeconds: 60
containers:
- image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.20.0
name: ingress-nginx
imagePullPolicy: Never
ports:
- name: http
containerPort: 80
protocol: TCP
- name: https
containerPort: 443
protocol: TCP
livenessProbe:
httpGet:
path: /healthz
port: 10254
scheme: HTTP
initialDelaySeconds: 30
timeoutSeconds: 5
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/nginx-default-backend
ingress-nginx-res.yaml:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: my-ingress
namespace: default
spec:
rules:
- host:
http:
paths:
- path: /
backend:
serviceName: shinyinfo-jenkins-svc
servicePort: 8080
nginx-default-backend.yaml
kind: Service
apiVersion: v1
metadata:
name: nginx-default-backend
namespace: default
spec:
ports:
- port: 80
targetPort: http
selector:
app: nginx-default-backend
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
name: nginx-default-backend
namespace: default
spec:
replicas: 1
template:
metadata:
labels:
app: nginx-default-backend
spec:
terminationGracePeriodSeconds: 60
containers:
- name: default-http-backend
image: chenliujin/defaultbackend
imagePullPolicy: Never
livenessProbe:
httpGet:
path: /healthz
port: 8080
scheme: HTTP
initialDelaySeconds: 30
timeoutSeconds: 5
resources:
limits:
cpu: 10m
memory: 10Mi
requests:
cpu: 10m
memory: 10Mi
ports:
- name: http
containerPort: 8080
protocol: TCP
shinyinfo-jenkins-pod.yaml
apiVersion: v1
kind: Pod
metadata:
name: shinyinfo-jenkins
labels:
app: shinyinfo-jenkins
spec:
containers:
- name: shinyinfo-jenkins
image: shinyinfo_jenkins
imagePullPolicy: Never
ports:
- containerPort: 8080
containerPort: 50000
volumeMounts:
- mountPath: /devops/password
name: jenkins-password
- mountPath: /var/jenkins_home
name: jenkins-home
volumes:
- name: jenkins-password
hostPath:
path: /jenkins/password
- name: jenkins-home
hostPath:
path: /jenkins
shinyinfo-jenkins-svc.yaml
apiVersion: v1
kind: Service
metadata:
name: shinyinfo-jenkins-svc
labels:
name: shinyinfo-jenkins-svc
spec:
selector:
app: shinyinfo-jenkins
type: NodePort
ports:
- name: tcp
port: 8080
nodePort: 30003
There is something wrong with nginx ingress, the console output is as follows:
[master#master1 config]$ sudo kubectl apply -f ingress-nginx-ctl.yaml
service/ingress-nginx created
deployment.extensions/ingress-nginx created
[master#master1 config]$ sudo kubectl apply -f ingress-nginx-res.yaml
ingress.extensions/my-ingress created
Images is CrashLoopBackOff, Why???
[master#master1 config]$ sudo kubectl get po
NAME READY STATUS RESTARTS AGE
ingress-nginx-66df6b6d9-mhmj9 0/1 CrashLoopBackOff 1 9s
nginx-default-backend-645546c46f-x7s84 1/1 Running 0 6m
shinyinfo-jenkins 1/1 Running 0 20m
describe pod:
[master#master1 config]$ sudo kubectl describe po ingress-nginx-66df6b6d9-mhmj9
Name: ingress-nginx-66df6b6d9-mhmj9
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: node4/192.168.56.104
Start Time: Thu, 08 Nov 2018 16:45:46 +0800
Labels: app=ingress-nginx
pod-template-hash=228926285
Annotations: <none>
Status: Running
IP: 100.127.10.211
Controlled By: ReplicaSet/ingress-nginx-66df6b6d9
Containers:
ingress-nginx:
Container ID: docker://2aba164d116758585abef9d893a5fa0f0c5e23c04a13466263ce357ebe10cb0a
Image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.20.0
Image ID: docker://sha256:a3f21ec4bd119e7e17c8c8b2bf8a3b9e42a8607455826cd1fa0b5461045d2fa9
Ports: 80/TCP, 443/TCP
Host Ports: 0/TCP, 0/TCP
Args:
/nginx-ingress-controller
--default-backend-service=$(POD_NAMESPACE)/nginx-default-backend
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 255
Started: Thu, 08 Nov 2018 16:46:09 +0800
Finished: Thu, 08 Nov 2018 16:46:09 +0800
Ready: False
Restart Count: 2
Liveness: http-get http://:10254/healthz delay=30s timeout=5s period=10s #success=1 #failure=3
Environment:
POD_NAME: ingress-nginx-66df6b6d9-mhmj9 (v1:metadata.name)
POD_NAMESPACE: default (v1:metadata.namespace)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-24hnm (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-24hnm:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-24hnm
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 40s default-scheduler Successfully assigned default/ingress-nginx-66df6b6d9-mhmj9 to node4
Normal Pulled 18s (x3 over 39s) kubelet, node4 Container image "quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.20.0" already present on machine
Normal Created 18s (x3 over 39s) kubelet, node4 Created container
Normal Started 17s (x3 over 39s) kubelet, node4 Started container
Warning BackOff 11s (x5 over 36s) kubelet, node4 Back-off restarting failed container
logs of pod:
[master#master1 config]$ sudo kubectl logs ingress-nginx-66df6b6d9-mhmj9
-------------------------------------------------------------------------------
NGINX Ingress controller
Release: 0.20.0
Build: git-e8d8103
Repository: https://github.com/kubernetes/ingress-nginx.git
-------------------------------------------------------------------------------
nginx version: nginx/1.15.5
W1108 08:47:16.081042 6 client_config.go:552] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I1108 08:47:16.081234 6 main.go:196] Creating API client for https://10.96.0.1:443
I1108 08:47:16.122315 6 main.go:240] Running in Kubernetes cluster version v1.11 (v1.11.3) - git (clean) commit a4529464e4629c21224b3d52edfe0ea91b072862 - platform linux/amd64
F1108 08:47:16.123661 6 main.go:97] ✖ The cluster seems to be running with a restrictive Authorization mode and the Ingress controller does not have the required permissions to operate normally.
Could experts here drop me some hints?
You need set ingress-nginx to use a seperate serviceaccount and give neccessary privilege to the serviceaccount.
here is a example:
apiVersion: v1
kind: ServiceAccount
metadata:
name: lb
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: nginx-ingress-normal
rules:
- apiGroups:
- ""
resources:
- configmaps
- endpoints
- nodes
- pods
- secrets
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- apiGroups:
- ""
resources:
- services
verbs:
- get
- list
- watch
- apiGroups:
- "extensions"
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- events
verbs:
- create
- patch
- apiGroups:
- "extensions"
resources:
- ingresses/status
verbs:
- update
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
name: nginx-ingress-minimal
namespace: kube-system
rules:
- apiGroups:
- ""
resources:
- configmaps
- pods
- secrets
- namespaces
verbs:
- get
- apiGroups:
- ""
resources:
- configmaps
resourceNames:
- "ingress-controller-leader-dev"
- "ingress-controller-leader-prod"
verbs:
- get
- update
- apiGroups:
- ""
resources:
- configmaps
verbs:
- create
- apiGroups:
- ""
resources:
- endpoints
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
name: nginx-ingress-minimal
namespace: kube-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: nginx-ingress-minimal
subjects:
- kind: ServiceAccount
name: lb
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: nginx-ingress-normal
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: nginx-ingress-normal
subjects:
- kind: ServiceAccount
name: lb
namespace: kube-system