Kube-Flannel can't get CIDR although PodCIDR available on node - kubernetes

I am currently setting up Kubernetes in an environment with 1 master and 2 nodes.
I successfully initialized the master and added the nodes to the cluster:
kubectl get nodes
When I joined the nodes to the cluster, the kube-proxy pods started successfully, but the kube-flannel pods hit an error and go into CrashLoopBackOff.
flannel-pod.log:
I0613 09:03:36.820387 1 main.go:475] Determining IP address of default interface,
I0613 09:03:36.821180 1 main.go:488] Using interface with name ens160 and address 172.17.11.2,
I0613 09:03:36.821233 1 main.go:505] Defaulting external address to interface address (172.17.11.2),
I0613 09:03:37.015163 1 kube.go:131] Waiting 10m0s for node controller to sync,
I0613 09:03:37.015436 1 kube.go:294] Starting kube subnet manager,
I0613 09:03:38.015675 1 kube.go:138] Node controller sync successful,
I0613 09:03:38.015767 1 main.go:235] Created subnet manager: Kubernetes Subnet Manager - caasfaasslave1.XXXXXX.local,
I0613 09:03:38.015828 1 main.go:238] Installing signal handlers,
I0613 09:03:38.016109 1 main.go:353] Found network config - Backend type: vxlan,
I0613 09:03:38.016281 1 vxlan.go:120] VXLAN config: VNI=1 Port=0 GBP=false DirectRouting=false,
E0613 09:03:38.016872 1 main.go:280] Error registering network: failed to acquire lease: node "caasfaasslave1.XXXXXX.local" pod cidr not assigned,
I0613 09:03:38.016966 1 main.go:333] Stopping shutdownHandler...,
On the node, I can verify that the PodCIDR is available:
kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}'
172.17.12.0/24
In the master's kube-controller-manager manifest, the pod CIDR is also set:
[root@caasfaasmaster manifests]# cat kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
annotations:
scheduler.alpha.kubernetes.io/critical-pod: ""
creationTimestamp: null
labels:
component: kube-controller-manager
tier: control-plane
name: kube-controller-manager
namespace: kube-system
spec:
containers:
- command:
- kube-controller-manager
- --leader-elect=true
- --controllers=*,bootstrapsigner,tokencleaner
- --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
- --address=127.0.0.1
- --use-service-account-credentials=true
- --kubeconfig=/etc/kubernetes/controller-manager.conf
- --root-ca-file=/etc/kubernetes/pki/ca.crt
- --service-account-private-key-file=/etc/kubernetes/pki/sa.key
- --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
- --allocate-node-cidrs=true
- --cluster-cidr=172.17.12.0/24
- --node-cidr-mask-size=24
env:
- name: http_proxy
value: http://ntlmproxy.XXXXXX.local:3154
- name: https_proxy
value: http://ntlmproxy.XXXXXX.local:3154
- name: no_proxy
value: .XXXXX.local,172.17.11.0/24,172.17.12.0/24
image: k8s.gcr.io/kube-controller-manager-amd64:v1.10.4
livenessProbe:
failureThreshold: 8
httpGet:
host: 127.0.0.1
path: /healthz
port: 10252
scheme: HTTP
initialDelaySeconds: 15
timeoutSeconds: 15
name: kube-controller-manager
resources:
requests:
cpu: 200m
volumeMounts:
- mountPath: /etc/kubernetes/pki
name: k8s-certs
readOnly: true
- mountPath: /etc/ssl/certs
name: ca-certs
readOnly: true
- mountPath: /etc/kubernetes/controller-manager.conf
name: kubeconfig
readOnly: true
- mountPath: /etc/pki
name: ca-certs-etc-pki
readOnly: true
hostNetwork: true
volumes:
- hostPath:
path: /etc/pki
type: DirectoryOrCreate
name: ca-certs-etc-pki
- hostPath:
path: /etc/kubernetes/pki
type: DirectoryOrCreate
name: k8s-certs
- hostPath:
path: /etc/ssl/certs
type: DirectoryOrCreate
name: ca-certs
- hostPath:
path: /etc/kubernetes/controller-manager.conf
type: FileOrCreate
name: kubeconfig
status: {}
XXXXX is used for anonymization.
I initialized the master with the following kubeadm command (which also went through without any errors):
kubeadm init --pod-network-cidr=172.17.12.0/24 --service-cidr=172.17.11.129/25 --service-dns-domain=dcs.XXXXX.local
Does anyone know what could cause my issues and how to fix them?
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system etcd-caasfaasmaster.XXXXXX.local 1/1 Running 0 16h 172.17.11.1 caasfaasmaster.XXXXXX.local
kube-system kube-apiserver-caasfaasmaster.XXXXXX.local 1/1 Running 1 16h 172.17.11.1 caasfaasmaster.XXXXXX.local
kube-system kube-controller-manager-caasfaasmaster.XXXXXX.local 1/1 Running 0 16h 172.17.11.1 caasfaasmaster.XXXXXX.local
kube-system kube-dns-75c5968bf9-qfh96 3/3 Running 0 16h 172.17.12.2 caasfaasmaster.XXXXXX.local
kube-system kube-flannel-ds-4b6kf 0/1 CrashLoopBackOff 205 16h 172.17.11.2 caasfaasslave1.XXXXXX.local
kube-system kube-flannel-ds-j2fz6 0/1 CrashLoopBackOff 191 16h 172.17.11.3 caasfassslave2.XXXXXX.local
kube-system kube-flannel-ds-qjd89 1/1 Running 0 16h 172.17.11.1 caasfaasmaster.XXXXXX.local
kube-system kube-proxy-h4z54 1/1 Running 0 16h 172.17.11.3 caasfassslave2.XXXXXX.local
kube-system kube-proxy-sjwl2 1/1 Running 0 16h 172.17.11.2 caasfaasslave1.XXXXXX.local
kube-system kube-proxy-zc5xh 1/1 Running 0 16h 172.17.11.1 caasfaasmaster.XXXXXX.local
kube-system kube-scheduler-caasfaasmaster.XXXXXX.local 1/1 Running 0 16h 172.17.11.1 caasfaasmaster.XXXXXX.local

"Failed to acquire lease" simply means that the node did not get a podCIDR. It happened to me as well: although the kube-controller-manager manifest on the master node has the pod CIDR settings enabled, it still did not work and flannel kept going into CrashLoopBackOff.
This is what I did to fix it.
From the master node, first find out your flannel CIDR:
sudo cat /etc/kubernetes/manifests/kube-controller-manager.yaml | grep -i cluster-cidr
Output:
- --cluster-cidr=172.168.10.0/24
Then run the following from the master node:
kubectl patch node slave-node-1 -p '{"spec":{"podCIDR":"172.168.10.0/24"}}'
where
slave-node-1 is the node on which acquiring the lease fails
podCIDR is the CIDR that you found with the previous command
Hope this helps.
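If the patch has to be applied to several nodes, the command line can be generated instead of typed by hand. The following is only a minimal sketch: the helper name build_pod_cidr_patch is hypothetical, and the node name and CIDR are the placeholder values from the answer above, not values taken from a real cluster.

```python
import json
import shlex


def build_pod_cidr_patch(node_name: str, pod_cidr: str) -> str:
    """Return a kubectl command line that sets spec.podCIDR on one node."""
    # The patch body mirrors the JSON document used in the answer above.
    patch = json.dumps({"spec": {"podCIDR": pod_cidr}}, separators=(",", ":"))
    # shlex.quote keeps the JSON intact when the command is pasted into a shell.
    return f"kubectl patch node {shlex.quote(node_name)} -p {shlex.quote(patch)}"


# "slave-node-1" and the CIDR are the placeholder values from the answer.
command = build_pod_cidr_patch("slave-node-1", "172.168.10.0/24")
print(command)
```

The sketch only builds the string; running it against a cluster is left to the operator.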

According to Flannel documentation:
At the bare minimum, you must tell flannel an IP range (subnet) that
it should use for the overlay. Here is an example of the minimum
flannel configuration:
{ "Network": "10.1.0.0/16" }
Therefore, you need to specify a pod network that is large enough to be split into one subnet per node (for example a /16, which can be divided into 256 /24 subnets), and it should not be a part of your existing network, because Flannel uses encapsulation to connect pods on different nodes to one overlay network.
Here is the part of documentation which describes it:
With Docker, each container is assigned an IP address that can be used
to communicate with other containers on the same host. For
communicating over a network, containers are tied to the IP addresses
of the host machines and must rely on port-mapping to reach the
desired container. This makes it difficult for applications running
inside containers to advertise their external IP and port as that
information is not available to them.
flannel solves the problem by giving each container an IP that can be
used for container-to-container communication. It uses packet
encapsulation to create a virtual overlay network that spans the whole
cluster. More specifically, flannel gives each host an IP subnet
(/24 by default) from which the Docker daemon is able to allocate
IPs to the individual containers.
In other words, you should recreate your cluster with settings like these:
kubeadm init --pod-network-cidr=10.17.0.0/16 --service-cidr=10.18.0.0/24 --service-dns-domain=dcs.XXXXX.local
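The size argument can be checked with a little arithmetic. The sketch below is an illustration, not part of flannel or kubeadm: the function name node_subnet_count is made up, and it assumes that every node receives one subnet of --node-cidr-mask-size bits carved out of --cluster-cidr, as the flags in the question's manifest suggest.

```python
import ipaddress


def node_subnet_count(cluster_cidr: str, node_mask_size: int = 24) -> int:
    """Count how many per-node subnets fit inside the cluster CIDR."""
    network = ipaddress.ip_network(cluster_cidr)
    if node_mask_size < network.prefixlen:
        # A node subnet cannot be larger than the whole pod network.
        return 0
    return 2 ** (node_mask_size - network.prefixlen)


question_capacity = node_subnet_count("172.17.12.0/24")  # CIDR from the question
suggested_capacity = node_subnet_count("10.17.0.0/16")   # CIDR suggested above
print(question_capacity, suggested_capacity)
```

Under that assumption the original /24 leaves room for a single node subnet, which would be consistent with only one podCIDR showing up in the jsonpath output for three nodes.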

Related

metrics-service in kubernetes not working

I'm running Kubernetes on an EC2 machine on AWS.
The node runs Ubuntu.
My metrics-server version:
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.7/components.yaml
components.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: metrics-server
namespace: kube-system
labels:
k8s-app: metrics-server
spec:
serviceAccountName: metrics-server
volumes:
# mount in tmp so we can safely use from-scratch images and/or read-only containers
- name: tmp-dir
emptyDir: {}
containers:
- name: metrics-server
image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
imagePullPolicy: IfNotPresent
args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-preferred-address-type=InternalIP,ExternalIP,Hostname
- --kubelet-insecure-tls
Even after adding args, the error appears.
error :
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
or
error: metrics not available yet
No matter how long I wait, that error appears.
My kops version: Version 1.18.0 (git-698bf974d8)
I use Calico for networking.
Please help.
Update:
I also tried v0.5.0: wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.5.0/components.yaml
Then I viewed the logs:
kubectl logs -n kube-system deploy/metrics-server
"Failed to scrape node" err="GET "https://172.20.51.226:10250/stats/summary?only_cpu_and_memory=true": bad status code "401 Unauthorized"" node="ip-172-20-51-226.ap-northeast-2.compute.internal"
"Failed probe" probe="metric-storage-ready" err="not metrics to serve"
Download the components.yaml file manually:
wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Then edit the args section under Deployment:
spec:
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=443
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    - --metric-resolution=15s
Add two more lines there:
    - --kubelet-insecure-tls=true
    - --kubelet-preferred-address-types=InternalIP
The kubelet serves its API on port 10250 over HTTPS, so the connection normally has to be verified with a TLS certificate. Adding --kubelet-insecure-tls tells metrics-server not to verify the certificate presented by the kubelet.
After this modification just apply the manifest:
kubectl apply -f components.yaml
Wait a minute and you will see that the metrics-server pod is up.
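When the same change has to be made to several manifests, the edit can be expressed as a small list operation. This is a sketch only: ensure_flag is a hypothetical helper, and the args list is copied from the snippet above rather than read from a real components.yaml.

```python
from typing import List


def ensure_flag(args: List[str], flag: str) -> List[str]:
    """Return a copy of args with flag appended unless that flag is already set."""
    name = flag.split("=", 1)[0]
    for existing in args:
        if existing.split("=", 1)[0] == name:
            # The flag is already present (possibly with another value); keep it.
            return list(args)
    return [*args, flag]


# Container args as shown in the answer above.
args = [
    "--cert-dir=/tmp",
    "--secure-port=443",
    "--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname",
    "--kubelet-use-node-status-port",
    "--metric-resolution=15s",
]
patched = ensure_flag(args, "--kubelet-insecure-tls=true")
print(patched[-1])
```

Calling the helper a second time leaves the list unchanged, so it is safe to run repeatedly.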
The last comment is useful. You can also edit the deployment directly; adding the line "--kubelet-insecure-tls=true" was enough for me:
Edit deploy:
$ kubectl edit deployment.apps/metrics-server -n kube-system
Add the line:
- --kubelet-insecure-tls=true
Similar result:
containers:
- args:
  - --cert-dir=/tmp
  - --secure-port=4443
  - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  - --kubelet-use-node-status-port
  - --metric-resolution=15s
  - --kubelet-insecure-tls=true
And save with ":wq" and enjoy.
~$ kubectl top pods -n kube-system
NAME CPU(cores) MEMORY(bytes)
coredns-6d4b75cb6d-k8dmc 3m 18Mi
coredns-6d4b75cb6d-wxxn6 3m 17Mi
kube-apiserver-k8s-master1 82m 306Mi
kube-apiserver-k8s-master2 65m 247Mi
kube-controller-manager-k8s-master1 32m 47Mi
kube-controller-manager-k8s-master2 4m 19Mi
kube-proxy-9dbgk 1m 9Mi
kube-proxy-bwhdm 1m 14Mi
kube-proxy-fz8v8 1m 15Mi
kube-proxy-vcnrc 1m 9Mi
kube-scheduler-k8s-master1 7m 18Mi
kube-scheduler-k8s-master2 4m 16Mi
metrics-server-79576f7ff-97tpc 6m 15Mi
metrics-server-79576f7ff-qzczp 4m 13Mi
~$ kubectl top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-master1 318m 15% 1047Mi 55%
k8s-master2 208m 10% 1002Mi 52%
k8s-worker1 30m 3% 804Mi 42%
k8s-worker2 35m 3% 550Mi 29%

Hashicorp Consul, Agent/Client access

I am trying to set up Consul on Kubernetes with the Helm chart: https://www.consul.io/docs/k8s/helm
From my pre-Kubernetes experience: services access Consul through a Consul agent that runs on each host and listens on the host's IP.
Now I have deployed it to a Kubernetes cluster via the Helm chart. My first confusion is about terminology: Consul Agent vs. Client in this setup? I presume they are the same.
The setup:
Helm chart config (Terraform fragment), with nothing specific to the clients/agents and their service:
global:
  name: "consul"
  datacenter: "${var.consul_config.datacenter}"
server:
  storage: "${var.consul_config.storage}"
  connect: false
syncCatalog:
  enabled: true
  default: true
  k8sAllowNamespaces: ['*']
  k8sDenyNamespaces: [${join(",", var.consul_config.k8sDenyNamespaces)}]
Pods: the client/agent ones are a DaemonSet, not in host network mode
kubectl get pods
NAME READY STATUS RESTARTS AGE
consul-8l587 1/1 Running 0 11h
consul-cfd8z 1/1 Running 0 11h
consul-server-0 1/1 Running 0 11h
consul-server-1 1/1 Running 0 11h
consul-server-2 1/1 Running 0 11h
consul-sync-catalog-8b688ff9b-klqrv 1/1 Running 0 11h
consul-vrmtp 1/1 Running 0 11h
Services
kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
consul ExternalName <none> consul.service.consul <none> 11h
consul-dns ClusterIP 172.20.124.238 <none> 53/TCP,53/UDP 11h
consul-server ClusterIP None <none> 8500/TCP,8301/TCP,8301/UDP,8302/TCP,8302/UDP,8300/TCP,8600/TCP,8600/UDP 11h
consul-ui ClusterIP 172.20.131.29 <none> 80/TCP 11h
Question 1: Where is the service that targets the client (agent) pods but not the server pods? Did I miss it in the Helm chart?
My plan, given that I am not going to use host (Kubernetes node) networking:
Find the client/agent service or create my own, so that it can be used by Consul's consumers. For example, I would specify this service address for the Consul Template init pod in the application that consumes the config.
kubectl get pods --selector app=consul,component=client,release=consul
consul-8l587 1/1 Running 0 11h
consul-cfd8z 1/1 Running 0 11h
consul-vrmtp 1/1 Running 0 11h
Optional: add topologyKeys to the agent service, so that each consumer does not cross the host boundary
Question 2: Is this the right approach, or is it different for Consul deployments on Kubernetes?
You can use the Kubernetes downward API to inject the host's IP as an environment variable into your pod.
apiVersion: v1
kind: Pod
metadata:
  name: consul-example
spec:
  containers:
    - name: example
      image: 'consul:latest'
      env:
        - name: HOST_IP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
      command:
        - '/bin/sh'
        - '-ec'
        - |
          export CONSUL_HTTP_ADDR="${HOST_IP}:8500"
          consul kv put hello world
  restartPolicy: Never
See https://www.consul.io/docs/k8s/installation/install#accessing-the-consul-http-api for more info.
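An application can do the same thing in its own code instead of in a shell wrapper. The sketch below assumes the pod defines HOST_IP through the downward API as in the manifest above and that the local client agent is reachable on port 8500 of the node; the function name consul_http_addr is hypothetical.

```python
import os


def consul_http_addr(env=os.environ, port: int = 8500) -> str:
    """Build the Consul HTTP address from the HOST_IP injected by the downward API."""
    host_ip = env.get("HOST_IP")
    if not host_ip:
        # Fail loudly: without HOST_IP the pod spec is missing the fieldRef entry.
        raise RuntimeError("HOST_IP is not set; check the env section of the pod spec")
    return f"{host_ip}:{port}"


# Example with an explicit mapping; 10.0.0.5 is a made-up node address.
print(consul_http_addr({"HOST_IP": "10.0.0.5"}))
```

Passing the environment as a parameter keeps the helper easy to test outside a cluster.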

How to scale a DaemonSet in Kubernetes using kubectl

Right now I only have terminal access to the Kubernetes cluster. I checked the ingress controller like this:
$ k get daemonset --all-namespaces
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system traefik-ingress-controller 0 0 0 0 0 IngressProxy=true 60d
logging fluentd-es 0 0 0 0 0 beta.kubernetes.io/fluentd-ds-ready=true 28d
I am using kubectl (v1.15.2) to scale the DaemonSet like this:
kubectl scale --replicas=1 DaemonSet/traefik-ingress-controller -n kube-system
but it shows:
Error from server (NotFound): the server could not find the requested resource
What should I do to start Traefik from the command line? This is the describe output of my DaemonSet:
~/Library/Mobile Documents/com~apple~CloudDocs/Document/k8s/work/traefik-deployment-yaml/k8s-backup ⌚ 17:49:58
$ k describe daemonset traefik-ingress-controller -n kube-system
Name: traefik-ingress-controller
Selector: app=traefik
Node-Selector: IngressProxy=true
Labels: app=traefik
Annotations: deprecated.daemonset.template.generation: 18
kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"apps/v1","kind":"DaemonSet","metadata":{"annotations":{},"labels":{"app":"traefik"},"name":"traefik-ingress-controller","na...
Desired Number of Nodes Scheduled: 0
Current Number of Nodes Scheduled: 0
Number of Nodes Scheduled with Up-to-date Pods: 0
Number of Nodes Scheduled with Available Pods: 0
Number of Nodes Misscheduled: 0
Pods Status: 0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=traefik
Service Account: traefik-ingress-controller
Containers:
traefik-ingress-lb:
Image: traefik:v2.1.6
Ports: 80/TCP, 443/TCP, 8080/TCP
Host Ports: 80/TCP, 443/TCP, 0/TCP
Args:
--configfile=/config/traefik.yaml
--logLevel=INFO
--metrics=true
--metrics.prometheus=true
--entryPoints.metrics.address=:8080
--metrics.prometheus.entryPoint=metrics
--metrics.prometheus.addServicesLabels=true
--metrics.prometheus.addEntryPointsLabels=true
--metrics.prometheus.buckets=0.100000, 0.300000, 1.200000, 5.000000
Limits:
cpu: 2
memory: 1Gi
Requests:
cpu: 1
memory: 1Gi
Environment: <none>
Mounts:
/config from config (rw)
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: traefik-config
Optional: false
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedDaemonPod 3h32m daemonset-controller Found failed daemon pod kube-system/traefik-ingress-controller-wdpsq on node azshara-k8s03, will try to kill it
Normal SuccessfulDelete 3h32m daemonset-controller Deleted pod: traefik-ingress-controller-wdpsq
Normal SuccessfulCreate 3h32m daemonset-controller Created pod: traefik-ingress-controller-qmttl
Warning FailedDaemonPod 3h32m daemonset-controller Found failed daemon pod kube-system/traefik-ingress-controller-qmttl on node azshara-k8s03, will try to kill it
Normal SuccessfulDelete 3h32m daemonset-controller Deleted pod: traefik-ingress-controller-qmttl
Normal SuccessfulCreate 3h32m daemonset-controller Created pod: traefik-ingress-controller-nlxwc
You do not need to scale a DaemonSet on K8s.
A DaemonSet ensures that all eligible nodes run a copy of a Pod.
As nodes are added to the cluster, Pods are added to them. So you need to add a new node to the cluster, and the DaemonSet will be scheduled there unless the node has a taint that disallows this DaemonSet. Note that in your output the DaemonSet has the node selector IngressProxy=true and 0 desired pods, which suggests that no node currently carries that label.
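The way the desired count follows from node labels can be sketched in a few lines. This is a simplified model, not the controller's actual logic: it ignores taints and tolerations, the function name is made up, and apart from azshara-k8s03 (which appears in the events above) the node names and all labels are hypothetical.

```python
def eligible_nodes(nodes: dict, node_selector: dict) -> list:
    """Return the names of nodes whose labels satisfy every selector entry."""
    return sorted(
        name
        for name, labels in nodes.items()
        if all(labels.get(key) == value for key, value in node_selector.items())
    )


selector = {"IngressProxy": "true"}

# Hypothetical cluster state: no node carries the IngressProxy label.
nodes = {"azshara-k8s01": {}, "azshara-k8s02": {}, "azshara-k8s03": {}}
before = eligible_nodes(nodes, selector)

# After labeling one node, the DaemonSet would want one pod there.
nodes["azshara-k8s03"] = {"IngressProxy": "true"}
after = eligible_nodes(nodes, selector)
print(before, after)
```

If the missing label is indeed the cause, labeling a node (for example with kubectl label node <node-name> IngressProxy=true) is the usual way to make it eligible; there is no replica count to set.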

Jenkins app is not accessible outside Kubernetes cluster

On CentOS 7.4, I have set up a Kubernetes master node, pulled down jenkins image and deployed it to the cluster defining the jenkins service on a NodePort as below.
I can curl the Jenkins app from the worker or master nodes using the IP defined by the service. However, I cannot access the Jenkins app (dashboard) from my browser (outside the cluster) using the public IP of the master node.
[administrator@abcdefgh ~]$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
abcdefgh Ready master 19h v1.13.1
hgfedcba Ready <none> 19h v1.13.1
[administrator@abcdefgh ~]$ sudo docker pull jenkinsci/jenkins:2.154-alpine
[administrator@abcdefgh ~]$ sudo docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
k8s.gcr.io/kube-proxy v1.13.1 fdb321fd30a0 5 days ago 80.2MB
k8s.gcr.io/kube-controller-manager v1.13.1 26e6f1db2a52 5 days ago 146MB
k8s.gcr.io/kube-apiserver v1.13.1 40a63db91ef8 5 days ago 181MB
k8s.gcr.io/kube-scheduler v1.13.1 ab81d7360408 5 days ago 79.6MB
jenkinsci/jenkins 2.154-alpine aa25058d8320 2 weeks ago 222MB
k8s.gcr.io/coredns 1.2.6 f59dcacceff4 6 weeks ago 40MB
k8s.gcr.io/etcd 3.2.24 3cab8e1b9802 2 months ago 220MB
quay.io/coreos/flannel v0.10.0-amd64 f0fad859c909 10 months ago 44.6MB
k8s.gcr.io/pause 3.1 da86e6ba6ca1 12 months ago 742kB
[administrator@abcdefgh ~]$ ls -l
total 8
-rw------- 1 administrator administrator 678 Dec 18 06:12 jenkins-deployment.yaml
-rw------- 1 administrator administrator 410 Dec 18 06:11 jenkins-service.yaml
[administrator@abcdefgh ~]$ cat jenkins-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: jenkins-ui
spec:
  type: NodePort
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
      name: ui
  selector:
    app: jenkins-master
---
apiVersion: v1
kind: Service
metadata:
  name: jenkins-discovery
spec:
  selector:
    app: jenkins-master
  ports:
    - protocol: TCP
      port: 50000
      targetPort: 50000
      name: jenkins-slaves
[administrator@abcdefgh ~]$ cat jenkins-deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: jenkins
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: jenkins-master
    spec:
      containers:
        - image: jenkins/jenkins:2.154-alpine
          name: jenkins
          ports:
            - containerPort: 8080
              name: http-port
            - containerPort: 50000
              name: jnlp-port
          env:
            - name: JAVA_OPTS
              value: -Djenkins.install.runSetupWizard=false
          volumeMounts:
            - name: jenkins-home
              mountPath: /var/jenkins_home
      volumes:
        - name: jenkins-home
          emptyDir: {}
[administrator@abcdefgh ~]$ kubectl create -f jenkins-service.yaml
service/jenkins-ui created
service/jenkins-discovery created
[administrator@abcdefgh ~]$ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
jenkins-discovery ClusterIP 10.98.--.-- <none> 50000/TCP 19h
jenkins-ui NodePort 10.97.--.-- <none> 8080:31587/TCP 19h
kubernetes ClusterIP 10.96.--.-- <none> 443/TCP 20h
[administrator@abcdefgh ~]$ kubectl create -f jenkins-deployment.yaml
deployment.extensions/jenkins created
[administrator@abcdefgh ~]$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
jenkins 1/1 1 1 19h
[administrator@abcdefgh ~]$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default jenkins-6497cf9dd4-f9r5b 1/1 Running 0 19h
kube-system coredns-86c58d9df4-jfq5b 1/1 Running 0 20h
kube-system coredns-86c58d9df4-s4k6d 1/1 Running 0 20h
kube-system etcd-abcdefgh 1/1 Running 1 20h
kube-system kube-apiserver-abcdefgh 1/1 Running 1 20h
kube-system kube-controller-manager-abcdefgh 1/1 Running 5 20h
kube-system kube-flannel-ds-amd64-2w68w 1/1 Running 1 20h
kube-system kube-flannel-ds-amd64-6zl4g 1/1 Running 1 20h
kube-system kube-proxy-9r4xt 1/1 Running 1 20h
kube-system kube-proxy-s7fj2 1/1 Running 1 20h
kube-system kube-scheduler-abcdefgh 1/1 Running 8 20h
[administrator@abcdefgh ~]$ kubectl describe pod jenkins-6497cf9dd4-f9r5b
Name: jenkins-6497cf9dd4-f9r5b
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: hgfedcba/10.41.--.--
Start Time: Tue, 18 Dec 2018 06:32:50 -0800
Labels: app=jenkins-master
pod-template-hash=6497cf9dd4
Annotations: <none>
Status: Running
IP: 10.244.--.--
Controlled By: ReplicaSet/jenkins-6497cf9dd4
Containers:
jenkins:
Container ID: docker://55912512a7aa1f782784690b558d74001157f242a164288577a85901ecb5d152
Image: jenkins/jenkins:2.154-alpine
Image ID: docker-pullable://jenkins/jenkins@sha256:b222875a2b788f474db08f5f23f63369b0f94ed7754b8b32ac54b8b4d01a5847
Ports: 8080/TCP, 50000/TCP
Host Ports: 0/TCP, 0/TCP
State: Running
Started: Tue, 18 Dec 2018 07:16:32 -0800
Ready: True
Restart Count: 0
Environment:
JAVA_OPTS: -Djenkins.install.runSetupWizard=false
Mounts:
/var/jenkins_home from jenkins-home (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-wqph5 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
jenkins-home:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-wqph5:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-wqph5
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
[administrator@abcdefgh ~]$ kubectl describe svc jenkins-ui
Name: jenkins-ui
Namespace: default
Labels: <none>
Annotations: <none>
Selector: app=jenkins-master
Type: NodePort
IP: 10.97.--.--
Port: ui 8080/TCP
TargetPort: 8080/TCP
NodePort: ui 31587/TCP
Endpoints: 10.244.--.--:8080
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
# Check if NodePort along with Kubernetes ports are open
[administrator@abcdefgh ~]$ sudo su root
[root@abcdefgh administrator]# systemctl start firewalld
[root@abcdefgh administrator]# firewall-cmd --permanent --add-port=6443/tcp # Kubernetes API Server
Warning: ALREADY_ENABLED: 6443:tcp
success
[root@abcdefgh administrator]# firewall-cmd --permanent --add-port=2379-2380/tcp # etcd server client API
Warning: ALREADY_ENABLED: 2379-2380:tcp
success
[root@abcdefgh administrator]# firewall-cmd --permanent --add-port=10250/tcp # Kubelet API
Warning: ALREADY_ENABLED: 10250:tcp
success
[root@abcdefgh administrator]# firewall-cmd --permanent --add-port=10251/tcp # kube-scheduler
Warning: ALREADY_ENABLED: 10251:tcp
success
[root@abcdefgh administrator]# firewall-cmd --permanent --add-port=10252/tcp # kube-controller-manager
Warning: ALREADY_ENABLED: 10252:tcp
success
[root@abcdefgh administrator]# firewall-cmd --permanent --add-port=10255/tcp # Read-Only Kubelet API
Warning: ALREADY_ENABLED: 10255:tcp
success
[root@abcdefgh administrator]# firewall-cmd --permanent --add-port=31587/tcp # NodePort of jenkins-ui service
Warning: ALREADY_ENABLED: 31587:tcp
success
[root@abcdefgh administrator]# firewall-cmd --reload
success
[administrator@abcdefgh ~]$ kubectl cluster-info
Kubernetes master is running at https://10.41.--.--:6443
KubeDNS is running at https://10.41.--.--:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
[administrator@hgfedcba ~]$ curl 10.41.--.--:8080
curl: (7) Failed connect to 10.41.--.--:8080; Connection refused
# Successfully curl jenkins app using its service IP from the worker node
[administrator@hgfedcba ~]$ curl 10.97.--.--:8080
<!DOCTYPE html><html><head resURL="/static/5882d14a" data-rooturl="" data-resurl="/static/5882d14a">
<title>Dashboard [Jenkins]</title><link rel="stylesheet" ...
...
Would you know how to do that? Happy to provide additional logs. Also, I have installed jenkins from yum on another similar machine without any docker or kubernetes and it's possible to access it through 10.20.30.40:8080 in my browser so there is no provider firewall preventing me from doing that.
Your Jenkins Service is of type NodePort. That means that a specific port number, on any node within your cluster, will deliver your Jenkins UI.
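When you described your Service, the output showed that the assigned node port was 31587, while 8080 is only the port on the Service's cluster IP.
You should be able to browse to http://SOME_IP:31587, where SOME_IP is the IP address of any node in the cluster.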
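The mapping between the two ports can be read straight from the PORT(S) column of kubectl get services. The following sketch shows that reading; the helper name node_port_url is hypothetical and 203.0.113.10 is a documentation address standing in for a real node IP.

```python
def node_port_url(node_ip: str, ports_column: str, scheme: str = "http") -> str:
    """Build a URL from a node IP and a PORT(S) cell such as '8080:31587/TCP'."""
    mapping = ports_column.split("/", 1)[0]  # e.g. '8080:31587'
    if ":" not in mapping:
        # ClusterIP services show only the service port, e.g. '50000/TCP'.
        raise ValueError(f"no node port in {ports_column!r}")
    node_port = int(mapping.split(":", 1)[1])
    return f"{scheme}://{node_ip}:{node_port}"


# PORT(S) value of the jenkins-ui service from the question.
url = node_port_url("203.0.113.10", "8080:31587/TCP")
print(url)
```

The first number in the cell is the Service port and the second is the port opened on every node, which is the one a browser outside the cluster has to use.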

Kube flannel in CrashLoopBackOff status

We have just started creating our Kubernetes cluster.
Now we are trying to deploy Tiller, but we get an error:
NetworkPlugin cni failed to set up pod
"tiller-deploy-64c9d747bd-br9j7_kube-system" network: open
/run/flannel/subnet.env: no such file or directory
After that I ran:
kubectl get pods --all-namespaces -o wide
And got this response:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
kube-system coredns-78fcdf6894-ksdvt 1/1 Running 2 7d 192.168.0.4 kube-master <none>
kube-system coredns-78fcdf6894-p4l9q 1/1 Running 2 7d 192.168.0.5 kube-master <none>
kube-system etcd-kube-master 1/1 Running 2 7d 10.168.209.20 kube-master <none>
kube-system kube-apiserver-kube-master 1/1 Running 2 7d 10.168.209.20 kube-master <none>
kube-system kube-controller-manager-kube-master 1/1 Running 2 7d 10.168.209.20 kube-master <none>
kube-system kube-flannel-ds-amd64-42rl7 0/1 CrashLoopBackOff 2135 7d 10.168.209.17 node5 <none>
kube-system kube-flannel-ds-amd64-5fx2p 0/1 CrashLoopBackOff 2164 7d 10.168.209.14 node2 <none>
kube-system kube-flannel-ds-amd64-6bw5g 0/1 CrashLoopBackOff 2166 7d 10.168.209.15 node3 <none>
kube-system kube-flannel-ds-amd64-hm826 1/1 Running 1 7d 10.168.209.20 kube-master <none>
kube-system kube-flannel-ds-amd64-thjps 0/1 CrashLoopBackOff 2160 7d 10.168.209.16 node4 <none>
kube-system kube-flannel-ds-amd64-w99ch 0/1 CrashLoopBackOff 2166 7d 10.168.209.13 node1 <none>
kube-system kube-proxy-d6v2n 1/1 Running 0 7d 10.168.209.13 node1 <none>
kube-system kube-proxy-lcckg 1/1 Running 0 7d 10.168.209.16 node4 <none>
kube-system kube-proxy-pgblx 1/1 Running 1 7d 10.168.209.20 kube-master <none>
kube-system kube-proxy-rnqq5 1/1 Running 0 7d 10.168.209.14 node2 <none>
kube-system kube-proxy-wc959 1/1 Running 0 7d 10.168.209.15 node3 <none>
kube-system kube-proxy-wfqqs 1/1 Running 0 7d 10.168.209.17 node5 <none>
kube-system kube-scheduler-kube-master 1/1 Running 2 7d 10.168.209.20 kube-master <none>
kube-system kubernetes-dashboard-6948bdb78-97qcq 0/1 ContainerCreating 0 7d <none> node5 <none>
kube-system tiller-deploy-64c9d747bd-br9j7 0/1 ContainerCreating 0 45m <none> node4 <none>
Some flannel pods are in CrashLoopBackOff status, for example kube-flannel-ds-amd64-42rl7.
When I run:
kubectl describe pod -n kube-system kube-flannel-ds-amd64-42rl7
the pod status is shown as Running:
Name: kube-flannel-ds-amd64-42rl7
Namespace: kube-system
Priority: 0
PriorityClassName: <none>
Node: node5/10.168.209.17
Start Time: Wed, 22 Aug 2018 16:47:10 +0300
Labels: app=flannel
controller-revision-hash=911701653
pod-template-generation=1
tier=node
Annotations: <none>
Status: Running
IP: 10.168.209.17
Controlled By: DaemonSet/kube-flannel-ds-amd64
Init Containers:
install-cni:
Container ID: docker://eb7ee47459a54d401969b1770ff45b39dc5768b0627eec79e189249790270169
Image: quay.io/coreos/flannel:v0.10.0-amd64
Image ID: docker-pullable://quay.io/coreos/flannel@sha256:88f2b4d96fae34bfff3d46293f7f18d1f9f3ca026b4a4d288f28347fcb6580ac
Port: <none>
Host Port: <none>
Command:
cp
Args:
-f
/etc/kube-flannel/cni-conf.json
/etc/cni/net.d/10-flannel.conflist
State: Terminated
Reason: Completed
Exit Code: 0
Started: Wed, 22 Aug 2018 16:47:24 +0300
Finished: Wed, 22 Aug 2018 16:47:24 +0300
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/etc/cni/net.d from cni (rw)
/etc/kube-flannel/ from flannel-cfg (rw)
/var/run/secrets/kubernetes.io/serviceaccount from flannel-token-9wmch (ro)
Containers:
kube-flannel:
Container ID: docker://521b457c648baf10f01e26dd867b8628c0f0a0cc0ea416731de658e67628d54e
Image: quay.io/coreos/flannel:v0.10.0-amd64
Image ID: docker-pullable://quay.io/coreos/flannel@sha256:88f2b4d96fae34bfff3d46293f7f18d1f9f3ca026b4a4d288f28347fcb6580ac
Port: <none>
Host Port: <none>
Command:
/opt/bin/flanneld
Args:
--ip-masq
--kube-subnet-mgr
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Thu, 30 Aug 2018 10:15:04 +0300
Finished: Thu, 30 Aug 2018 10:15:08 +0300
Ready: False
Restart Count: 2136
Limits:
cpu: 100m
memory: 50Mi
Requests:
cpu: 100m
memory: 50Mi
Environment:
POD_NAME: kube-flannel-ds-amd64-42rl7 (v1:metadata.name)
POD_NAMESPACE: kube-system (v1:metadata.namespace)
Mounts:
/etc/kube-flannel/ from flannel-cfg (rw)
/run from run (rw)
/var/run/secrets/kubernetes.io/serviceaccount from flannel-token-9wmch (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
run:
Type: HostPath (bare host directory volume)
Path: /run
HostPathType:
cni:
Type: HostPath (bare host directory volume)
Path: /etc/cni/net.d
HostPathType:
flannel-cfg:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: kube-flannel-cfg
Optional: false
flannel-token-9wmch:
Type: Secret (a volume populated by a Secret)
SecretName: flannel-token-9wmch
Optional: false
QoS Class: Guaranteed
Node-Selectors: beta.kubernetes.io/arch=amd64
Tolerations: node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/unreachable:NoExecute
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulled 51m (x2128 over 7d) kubelet, node5 Container image "quay.io/coreos/flannel:v0.10.0-amd64" already present on machine
Warning BackOff 1m (x48936 over 7d) kubelet, node5 Back-off restarting failed container
Here is kube-controller-manager.yaml:
apiVersion: v1
kind: Pod
metadata:
annotations:
scheduler.alpha.kubernetes.io/critical-pod: ""
creationTimestamp: null
labels:
component: kube-controller-manager
tier: control-plane
name: kube-controller-manager
namespace: kube-system
spec:
containers:
- command:
- kube-controller-manager
- --address=127.0.0.1
- --allocate-node-cidrs=true
- --cluster-cidr=192.168.0.0/24
- --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
- --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
- --controllers=*,bootstrapsigner,tokencleaner
- --kubeconfig=/etc/kubernetes/controller-manager.conf
- --leader-elect=true
- --node-cidr-mask-size=24
- --root-ca-file=/etc/kubernetes/pki/ca.crt
- --service-account-private-key-file=/etc/kubernetes/pki/sa.key
- --use-service-account-credentials=true
image: k8s.gcr.io/kube-controller-manager-amd64:v1.11.2
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 8
httpGet:
host: 127.0.0.1
path: /healthz
port: 10252
scheme: HTTP
initialDelaySeconds: 15
timeoutSeconds: 15
name: kube-controller-manager
resources:
requests:
cpu: 200m
volumeMounts:
- mountPath: /etc/ssl/certs
name: ca-certs
readOnly: true
- mountPath: /etc/kubernetes/controller-manager.conf
name: kubeconfig
readOnly: true
- mountPath: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
name: flexvolume-dir
- mountPath: /etc/pki
name: etc-pki
readOnly: true
- mountPath: /etc/kubernetes/pki
name: k8s-certs
readOnly: true
hostNetwork: true
priorityClassName: system-cluster-critical
volumes:
- hostPath:
path: /etc/ssl/certs
type: DirectoryOrCreate
name: ca-certs
- hostPath:
path: /etc/kubernetes/controller-manager.conf
type: FileOrCreate
name: kubeconfig
- hostPath:
path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
type: DirectoryOrCreate
name: flexvolume-dir
- hostPath:
path: /etc/pki
type: DirectoryOrCreate
name: etc-pki
- hostPath:
path: /etc/kubernetes/pki
type: DirectoryOrCreate
name: k8s-certs
status: {}
OS is CentOS Linux release 7.5.1804
Logs from one of the pods:
# kubectl logs --namespace kube-system kube-flannel-ds-amd64-5fx2p
main.go:475] Determining IP address of default interface
main.go:488] Using interface with name eth0 and address 10.168.209.14
main.go:505] Defaulting external address to interface address (10.168.209.14)
kube.go:131] Waiting 10m0s for node controller to sync
kube.go:294] Starting kube subnet manager
kube.go:138] Node controller sync successful
main.go:235] Created subnet manager: Kubernetes Subnet Manager - node2
main.go:238] Installing signal handlers
main.go:353] Found network config - Backend type: vxlan
vxlan.go:120] VXLAN config: VNI=1 Port=0 GBP=false DirectRouting=false
main.go:280] Error registering network: failed to acquire lease: node "node2" pod cidr not assigned
main.go:333] Stopping shutdownHandler...
Where is the error?
For flannel to work correctly, you must pass --pod-network-cidr=10.244.0.0/16 to kubeadm init.
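One likely reason the range matters here: the controller manager in the question runs with --cluster-cidr=192.168.0.0/24 and --node-cidr-mask-size=24, so the cluster range holds exactly one node-sized subnet and only one node can receive a podCIDR. A small Python sketch of that arithmetic (the function name node_subnet_count is made up for illustration, not part of any Kubernetes tooling):

```python
import ipaddress


def node_subnet_count(cluster_cidr, node_cidr_mask_size):
    """Return how many node-sized subnets fit inside the cluster CIDR."""
    network = ipaddress.ip_network(cluster_cidr)
    if node_cidr_mask_size < network.prefixlen:
        raise ValueError("node mask must not be shorter than the cluster prefix")
    # Each extra prefix bit doubles the number of available subnets.
    return 2 ** (node_cidr_mask_size - network.prefixlen)


# Values from the question's manifest: a /24 cluster range with /24 node subnets.
print(node_subnet_count("192.168.0.0/24", 24))  # 1
# The range flannel expects by default, with the same node mask.
print(node_subnet_count("10.244.0.0/16", 24))   # 256
```

With a single available subnet, the first node to register takes it and every later node is left with "pod cidr not assigned".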
Try this:
"Failed to acquire lease" simply means the pod didn't get a podCIDR. It happened to me as well: even though the manifest on the master node says podCIDR is true, it still wasn't working and flannel kept going into CrashLoopBackOff.
This is what I did to fix it.
From the master node, first find out your flannel CIDR:
sudo cat /etc/kubernetes/manifests/kube-controller-manager.yaml | grep -i cluster-cidr
Output:
- --cluster-cidr=172.168.10.0/24
Then run the following from the master node:
kubectl patch node slave-node-1 -p '{"spec":{"podCIDR":"172.168.10.0/24"}}'
where
slave-node-1 is the node on which acquiring the lease fails
podCIDR is the CIDR that you found with the previous command
Hope this helps.
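If several nodes need patching, the commands can be generated rather than typed by hand. This is only a sketch under assumptions that go beyond the answer above: it gives each node its own /24 carved from a larger cluster range instead of reusing the cluster CIDR itself, it assumes none of those subnets is already held by another node, and the 10.244.0.0/16 range, the second node name, and the helper name patch_commands are illustrative.

```python
import ipaddress
import itertools
import json
import shlex


def patch_commands(cluster_cidr, node_names, node_prefix=24):
    """Build one `kubectl patch node` command per node, each with a distinct subnet."""
    network = ipaddress.ip_network(cluster_cidr)
    # Take only as many node-sized subnets as there are nodes.
    subnets = list(
        itertools.islice(network.subnets(new_prefix=node_prefix), len(node_names))
    )
    if len(subnets) < len(node_names):
        raise ValueError(
            f"{cluster_cidr} has too few /{node_prefix} subnets for {len(node_names)} nodes"
        )
    commands = []
    for name, subnet in zip(node_names, subnets):
        # Compact JSON, quoted for the shell, in the same shape as the patch above.
        payload = json.dumps({"spec": {"podCIDR": str(subnet)}}, separators=(",", ":"))
        commands.append(
            f"kubectl patch node {shlex.quote(name)} -p {shlex.quote(payload)}"
        )
    return commands


# Hypothetical node names; the script only prints commands, it runs nothing.
for command in patch_commands("10.244.0.0/16", ["slave-node-1", "slave-node-2"]):
    print(command)
```

As the error quoted further down in this thread shows, the API server only accepts such a patch while the node's podCIDR is still empty.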
The reason is that:
flannel runs with CIDR=10.244.0.0/16, NOT 10.244.0.0/24!
CNI conflicts occur because the node has multiple CNI plugins installed in /etc/cni/net.d/.
The two interfaces flannel.1 and cni0 do not match each other.
E.g.:
flannel.1=10.244.0.0 and cni0=10.244.1.1 will fail. It should be
flannel.1=10.244.0.0 and cni0=10.244.0.1
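The matching rule in the example above can be checked mechanically. A minimal Python sketch, assuming each node owns a /24 and that you have already read the two addresses from the interfaces yourself (the function name same_node_subnet is hypothetical):

```python
import ipaddress


def same_node_subnet(flannel_addr, cni0_addr, prefix=24):
    """Return True if the cni0 address lies inside the flannel.1 node subnet."""
    # strict=False lets a host address stand in for its network.
    flannel_net = ipaddress.ip_network(f"{flannel_addr}/{prefix}", strict=False)
    return ipaddress.ip_address(cni0_addr) in flannel_net


# The two cases from the example above.
print(same_node_subnet("10.244.0.0", "10.244.1.1"))  # False: mismatch, will fail
print(same_node_subnet("10.244.0.0", "10.244.0.1"))  # True: interfaces agree
```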
To fix this, follow the steps below:
Step 0: Reset all nodes in your cluster. Run on every node:
kubeadm reset --force;
Step 1: Bring down the cni0 and flannel.1 interfaces.
sudo ifconfig cni0 down;
sudo ifconfig flannel.1 down;
Step 2: Delete the cni0 and flannel.1 interfaces.
sudo ip link delete cni0;
sudo ip link delete flannel.1;
Step 3: Remove all items within /etc/cni/net.d/.
sudo rm -rf /etc/cni/net.d/;
Step 4: Re-bootstrap your Kubernetes cluster.
kubeadm init --control-plane-endpoint="..." --pod-network-cidr=10.244.0.0/16;
Step 5: Re-deploy CNIs.
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml;
Step 6: Restart your container runtime; here I used containerd.
systemctl restart containerd;
This will ensure that CoreDNS works properly.
I had a similar problem. I did the following steps to make it work:
Remove the worker node from the cluster by running kubeadm reset on the worker node.
Clear the iptables rules with iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X.
Clear the config file with rm -rf $HOME/.kube/config.
Reboot the worker node.
Disable swap on the worker node with swapoff -a.
Join the worker node to the master again.
Also ensure SELinux is set to permissive or disabled.
# getenforce
Permissive
I had the same issue. When I followed the solution mentioned by @PanDe, I got the following error.
[root@xxxxxx]# kubectl patch node myslavenode -p '{"spec":{"podCIDR":"10.244.0.0/16"}}'
The Node "myslavenode" is invalid:
spec.podCIDRs: Forbidden: node updates may not change podCIDR except from "" to valid
[]: Forbidden: node updates may only change labels, taints, or capacity (or configSource, if the DynamicKubeletConfig feature gate is enabled).
In the end, when I checked SELinux, it turned out to be enabled. Setting it to permissive resolved the issue. Thanks @senthil murugan.
Regards,
Vivek