Kubernetes API server does not automatically start after the master reboots

I have set up a small cluster with kubeadm; it was working fine and port 6443 was up. But after rebooting my system, the cluster no longer comes up.
What should I do?
Here is some information:
systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Sun 2020-04-05 14:16:44 UTC; 6s ago
Docs: https://kubernetes.io/docs/home/
Main PID: 31079 (kubelet)
Tasks: 20 (limit: 4915)
CGroup: /system.slice/kubelet.service
└─31079 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet
k8s.io/kubernetes/pkg/kubelet/kubelet.go:458: Failed to list *v1.Node: Get https://infra01.mydomainname.com:6443/api/v1/nodes?fieldSelector=metadata.name%3Dtest-infra01&limit=500&resourceVersion=0: dial tcp 116.66.187.210:6443: connect: connection refused
kubectl get nodes
The connection to the server infra01.mydomainname.com:6443 was refused - did you specify the right host or port?
kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:12:12Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
journalctl -xeu kubelet
6 18167 reflector.go:153] k8s.io/kubernetes/pkg/kubelet/kubelet.go:458: Failed to list *v1.Node: Get https://infra01.mydomainname.com
1 18167 reflector.go:153] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://huawei-infra01.s
4 18167 aws_credentials.go:77] while getting AWS credentials NoCredentialProviders: no valid providers in chain. Deprecated. messaging see aws.Config.CredentialsChainVerboseErrors
6 18167 kuberuntime_manager.go:211] Container runtime docker initialized, version: 19.03.7, apiVersion: 1.40.0
6 18167 server.go:1113] Started kubelet
1 18167 kubelet.go:1302] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageF
8 18167 server.go:144] Starting to listen on 0.0.0.0:10250
4 18167 server.go:778] Starting healthz server failed: listen tcp 127.0.0.1:10248: bind: address already in use
5 18167 fs_resource_analyzer.go:64] Starting FS ResourceAnalyzer
4 18167 volume_manager.go:265] Starting Kubelet Volume Manager
1 18167 desired_state_of_world_populator.go:138] Desired state populator starts to run
3 18167 server.go:384] Adding debug handlers to kubelet server.
4 18167 server.go:158] listen tcp 0.0.0.0:10250: bind: address already in use
Docker
docker run hello-world
Hello from Docker!
ubuntu
lsb_release -a
Ubuntu 18.04.2 LTS
swap && kubeconfig
swap is turned off and kubeconfig was correctly exported
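(For reference, both can be double-checked with standard commands:)
swapon --show         # prints nothing when swap is off
kubectl config view   # shows which kubeconfig is currently in use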
Note
Things can be fixed by resetting the cluster, but that should be the last resort.

Kubelet is not starting because the port is already in use, and hence it is not able to create the static pod for the API server.
Use the following command to find out which process is holding port 10250:
[root@master admin]# ss -lntp | grep 10250
LISTEN 0 128 :::10250 :::* users:(("kubelet",pid=23373,fd=20))
This gives you the PID and the name of the process holding the port. If an unwanted process is holding it, you can always kill that process so the port becomes available for kubelet to use.
After killing the process, run the command above again; it should return no output.
Just to be on the safe side, run kubeadm reset and then kubeadm init, and it should go through.
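Putting the above together, a minimal sketch (PID 23373 is just the value from the example output; use whatever ss reports on your node):
ss -lntp | grep 10250       # identify the process holding the port
kill 23373                  # example PID taken from the ss output above
ss -lntp | grep 10250       # should now print nothing
systemctl restart kubelet   # let kubelet come back up and bind the port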
Edit:
Using snap stop kubelet did the trick of stopping kubelet on the node.

Related

Unable to start Kube cluster

I am trying to set up the kube cluster using Oracle VM VirtualBox. The kubeadm command is failing to start the cluster.
It waits on below:
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
Then fails because of below:
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
OS: Ubuntu 16.04 (Xenial), Docker version: 18.09.7, Kube version: Kubernetes v1.23.5, Cluster type: Flannel
OS: Ubuntu 16.04 (Xenial), Docker version: 20.10.7, Kube version: Kubernetes v1.23.5, Cluster type: Calico
What I tried so far, with the help of Google:
turning off swap (which was already done)
the Kubernetes/Docker combinations listed above
restarting the kubelet service
other bits I do not remember
ensuring that static IPs have been allocated, and other prerequisites
Can anyone assist? I am new to Kube.

kubernetes worker node in "NotReady" status

I am trying to set up my first cluster using Kubernetes 1.13.1. The master got initialized okay, but both of my worker nodes are NotReady. kubectl describe node shows that the kubelet stopped posting node status on both worker nodes. On one of the worker nodes I get log output like
> kubelet[3680]: E0107 20:37:21.196128 3680 kubelet.go:2266] node "xyz" not found.
Here is the full details:
I am using Centos 7 & Kubernetes 1.13.1.
Initializing was done as follows:
[root@master ~]# kubeadm init --apiserver-advertise-address=10.142.0.4 --pod-network-cidr=10.142.0.0/24
Successfully initialized the cluster:
You can now join any number of machines by running the following on each node
as root:
`kubeadm join 10.142.0.4:6443 --token y0epoc.zan7yp35sow5rorw --discovery-token-ca-cert-hash sha256:f02d43311c2696e1a73e157bda583247b9faac4ffb368f737ee9345412c9dea4`
deployed the flannel CNI:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
The join command worked fine.
[kubelet-start] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "node01" as an annotation
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the master to see this node join the cluster.
Result of kubectl get nodes:
[root@master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready master 9h v1.13.1
node01 NotReady <none> 9h v1.13.1
node02 NotReady <none> 9h v1.13.1
on both nodes:
[root@node01 ~]# service kubelet status
Redirecting to /bin/systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Tue 2019-01-08 04:49:20 UTC; 32s ago
Docs: https://kubernetes.io/docs/
Main PID: 4224 (kubelet)
Memory: 31.3M
CGroup: /system.slice/kubelet.service
└─4224 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfi
`Jan 08 04:54:10 node01 kubelet[4224]: E0108 04:54:10.957115 4224 kubelet.go:2266] node "node01" not found`
I would appreciate your advice on how to troubleshoot this.
The previous answer sounds correct. You can verify that by running
kubectl describe node node01 on the master, or wherever kubectl is correctly configured.
It seems that the reason for this error is an incorrect subnet. The Flannel documentation states that you should use /16, not /24, for the pod network.
NOTE: If kubeadm is used, then pass --pod-network-cidr=10.244.0.0/16
to kubeadm init to ensure that the podCIDR is set.
I tried to run kubeadm with /24 and, although the nodes were in the Ready state, the flannel pods did not run properly, which resulted in some issues.
You can check whether your flannel pods are running properly with:
kubectl get pods -n kube-system. If the status is anything other than Running, something is wrong. In that case you can check the details by running kubectl describe pod PODNAME -n kube-system. Try changing the subnet and update us if that fixes the problem.
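For reference, a minimal sketch of re-initializing with the /16 range from the Flannel note, reusing the apiserver address from the question (note that kubeadm reset wipes the existing control plane):
kubeadm reset
kubeadm init --apiserver-advertise-address=10.142.0.4 --pod-network-cidr=10.244.0.0/16
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml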
I ran into almost the same problem, and in the end I found that the reason was that the firewall was not turned off. You can try the following commands:
sudo ufw disable
or
systemctl disable firewalld
or
setenforce 0
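As an alternative to disabling the firewall entirely, a common approach (not part of the answer above) is to open just the ports the control plane and kubelet need, for example with firewalld:
firewall-cmd --permanent --add-port=6443/tcp     # API server
firewall-cmd --permanent --add-port=10250/tcp    # kubelet
firewall-cmd --reload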

How to debug when Kubernetes nodes are in 'Not Ready' state

I initialized the master node and added 2 worker nodes, but only the master and one of the worker nodes show up when I run the following command:
kubectl get nodes
Also, both of these nodes are in the 'Not Ready' state.
What steps should I take to understand what the problem could be?
I can ping all the nodes from each of the other nodes.
The version of Kubernetes is 1.8.
OS is CentOS 7
I used the following repo to install Kubernetes:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://yum.kubernetes.io/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
EOF
yum install kubelet kubeadm kubectl kubernetes-cni
First, describe nodes and see if it reports anything:
$ kubectl describe nodes
Look for conditions, capacity and allocatable:
Conditions:
Type Status
---- ------
OutOfDisk False
MemoryPressure False
DiskPressure False
Ready True
Capacity:
cpu: 2
memory: 2052588Ki
pods: 110
Allocatable:
cpu: 2
memory: 1950188Ki
pods: 110
If everything is alright here, SSH into the node and look at the kubelet logs to see if they report anything, like certificate errors, authentication errors, etc.
If kubelet is running as a systemd service, you can use
$ journalctl -u kubelet
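For example, to follow the kubelet log live while the node tries to register, or to scan it for obvious errors (standard journalctl/grep usage):
$ journalctl -u kubelet -f --no-pager
$ journalctl -u kubelet --no-pager | grep -iE 'error|fail'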
Steps to debug:
If you face any issue in Kubernetes, the first step is to check whether the Kubernetes system applications are running fine or not.
Command to check: kubectl get pods -n kube-system
If you see any pod crashing, check its logs.
If you get a NotReady state error, check the network (CNI) pod logs.
If you cannot resolve it with the above, follow the steps below:
kubectl get nodes # Check which node is not in ready state
kubectl describe node nodename #nodename which is not in readystate
ssh to that node
execute systemctl status kubelet # Make sure kubelet is running
systemctl status docker # Make sure docker service is running
journalctl -u kubelet # To Check logs in depth
Most probably you will find the error here. After fixing it, restart kubelet with the commands below:
systemctl daemon-reload
systemctl restart kubelet
If you still haven't found the root cause, check the following:
Make sure your node has enough disk space and memory; check the /var directory space especially.
Commands to check: df -kh, free -m
Verify CPU utilization with the top command, and make sure no process is consuming an unexpected amount of memory.
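For example, a quick resource check on the affected node, sketching the checks listed above:
df -kh /var               # disk space, especially /var
free -m                   # memory
top -b -n 1 | head -n 20  # one-shot snapshot of CPU and memory usage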
I was having a similar issue, but for a different reason:
Error:
cord@node1:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
node1 Ready master 17h v1.13.5
node2 Ready <none> 17h v1.13.5
node3 NotReady <none> 9m48s v1.13.5
cord@node1:~$ kubectl describe node node3
Name: node3
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
Ready False Thu, 18 Apr 2019 01:15:46 -0400 Thu, 18 Apr 2019 01:03:48 -0400 KubeletNotReady runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Addresses:
InternalIP: 192.168.2.6
Hostname: node3
cord@node3:~$ journalctl -u kubelet
Apr 18 01:24:50 node3 kubelet[54132]: W0418 01:24:50.649047 54132 cni.go:149] Error loading CNI config list file /etc/cni/net.d/10-calico.conflist: error parsing configuration list: no 'plugins' key
Apr 18 01:24:50 node3 kubelet[54132]: W0418 01:24:50.649086 54132 cni.go:203] Unable to update cni config: No valid networks found in /etc/cni/net.d
Apr 18 01:24:50 node3 kubelet[54132]: E0418 01:24:50.649402 54132 kubelet.go:2192] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Apr 18 01:24:55 node3 kubelet[54132]: W0418 01:24:55.650816 54132 cni.go:149] Error loading CNI config list file /etc/cni/net.d/10-calico.conflist: error parsing configuration list: no 'plugins' key
Apr 18 01:24:55 node3 kubelet[54132]: W0418 01:24:55.650845 54132 cni.go:203] Unable to update cni config: No valid networks found in /etc/cni/net.d
Apr 18 01:24:55 node3 kubelet[54132]: E0418 01:24:55.651056 54132 kubelet.go:2192] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Apr 18 01:24:57 node3 kubelet[54132]: I0418 01:24:57.248519 54132 setters.go:72] Using node IP: "192.168.2.6"
Issue:
My file 10-calico.conflist was incorrect. I verified this against a different node and against the sample file "calico.conflist.template" in the same directory.
Resolution:
Fixing the file "10-calico.conflist" and restarting the service with "systemctl restart kubelet" resolved my issue:
NAME STATUS ROLES AGE VERSION
node1 Ready master 18h v1.13.5
node2 Ready <none> 18h v1.13.5
node3 Ready <none> 48m v1.13.5
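For reference, a quick way to spot what is wrong in such a case is to compare the broken file with the template mentioned above and then restart kubelet (paths as given in this answer):
diff /etc/cni/net.d/calico.conflist.template /etc/cni/net.d/10-calico.conflist
systemctl restart kubelet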
I recently started using VMware Octant (https://github.com/vmware-tanzu/octant). This is a better UI than the Kubernetes Dashboard. You can view the Kubernetes cluster and look at the details of the cluster and the pods. It also allows you to check the logs and open a terminal into the pod(s).
I found that applying the network add-on and rebooting both nodes did the trick for me.
kubectl apply -f [podnetwork].yaml
I recently had this issue. The known-issues page on the kind website (https://kind.sigs.k8s.io/docs/user/known-issues/) says that the main problem mostly comes from a lack of memory allocated to Docker. They actually advise allocating 8GB to Docker; I went from 3GB up to 6GB and it worked fine for me. This is the kind version I am running at the moment:
$ kind version
kind v0.10.0 go1.15.7 darwin/amd64
and this is the Docker version:
$ docker version
Client:
Cloud integration: 1.0.17
Version: 20.10.8
API version: 1.41
Go version: go1.16.6
Git commit: 3967b7d
Built: Fri Jul 30 19:55:20 2021
OS/Arch: darwin/amd64
Context: default
Experimental: true
Server: Docker Engine - Community
Engine:
Version: 20.10.8
API version: 1.41 (minimum version 1.12)
Go version: go1.16.6
Git commit: 75249d8
Built: Fri Jul 30 19:52:10 2021
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.9
GitCommit: e25210fe30a0a703442421b0f60afac609f950a3
runc:
Version: 1.0.1
GitCommit: v1.0.1-0-g4144b63
docker-init:
Version: 0.19.0
GitCommit: de40ad0
I hope this helps you or anyone facing the same issue.
And here is the output from kind:
$ k get node
NAME STATUS ROLES AGE VERSION
test2-control-plane Ready control-plane,master 4m42s v1.20.2

using network plugins "cni": cni config unintialized; Skipping pod

I created the Kubernetes cluster using kubeadm (kubeadm init).
I am getting error messages in /var/log/messages.
Oct 20 10:09:52 aws08 kubelet: I1020 10:09:52.015921 7116 docker_manager.go:1787] DNS ResolvConfPath exists: /var/lib/docker/containers/717adf7a8481637ac20a9ba103d8f97635a88bf05f18bd4299f0d164e48f2920/resolv.conf. Will attempt to add ndots option: options ndots:5
Oct 20 10:09:52 aws08 kubelet: I1020 10:09:52.015963 7116 docker_manager.go:2121] Calling network plugin cni to setup pod for kube-dns-2247936740-cjij4_kube-system(3b296413-96aa-11e6-8c40-02fff663a168)
Oct 20 10:09:52 aws08 kubelet: E1020 10:09:52.015982 7116 docker_manager.go:2127] Failed to setup network for pod "kube-dns-2247936740-cjij4_kube-system(3b296413-96aa-11e6-8c40-02fff663a168)" using network plugins "cni": cni config unintialized; Skipping pod
Oct 20 10:09:52 aws08 kubelet: I1020 10:09:52.018824 7116 docker_manager.go:1492] Killing container "717adf7a8481637ac20a9ba103d8f97635a88bf05f18bd4299f0d164e48f2920 kube-system/kube-dns-2247936740-cjij4" with 30 second grace period
The DNS pod is failing:
kube-system kube-dns-2247936740-j5rtc 0/3 ContainerCreating 21 1h
If I disable CNI, the DNS pod runs, but the DNS issue persists.
The method to disable CNI is to comment out the KUBELET_NETWORK_ARGS line in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf and restart the kubelet service:
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--kubeconfig=/etc/kubernetes/kubelet.conf --require-kubeconfig=true"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
# Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=100.64.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_EXTRA_ARGS=--v=4"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_EXTRA_ARGS
followed by:
sudo systemctl restart kubelet
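Note that because 10-kubeadm.conf is a systemd drop-in, the unit files also need to be reloaded for the edit to take effect, so the full sequence is:
sudo systemctl daemon-reload
sudo systemctl restart kubelet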
I'm guessing that you forgot to set up the pod network.
From the documentation:
It is necessary to do this before you try to deploy any applications to your cluster, and before kube-dns will start up. Note also that kubeadm only supports CNI based networks and therefore kubenet based networks will not work.
You can install a pod network add-on with the following command:
kubectl apply -f <add-on.yaml>
Example:
kubectl create -f https://git.io/weave-kube
to install the Weave Net add-on.
After you have done this, you might need to recreate kube-dns pod.
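One way to do that (a sketch, assuming the DNS pods carry the usual k8s-app=kube-dns label) is to delete the pod and let its controller recreate it:
kubectl -n kube-system delete pod -l k8s-app=kube-dns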
The CNI initialization is completed during kubelet initialization, so try restarting the kubelet service and make sure that the CNI configuration can be parsed correctly.

Errors when running kubelet

I'm trying to start a kubelet in a Fedora 24 LXC container, but I am getting an error which appears to be related to libvirt/iptables.
Docker (installed using dnf/yum):
[root@node2 ~]# docker version
Client:
Version: 1.12.0
API version: 1.24
Go version: go1.6.3
Git commit: 8eab29e
Built:
OS/Arch: linux/amd64
Server:
Version: 1.12.0
API version: 1.24
Go version: go1.6.3
Git commit: 8eab29e
Built:
OS/Arch: linux/amd64
Kubernetes (downloaded v1.3.3 and extracted tar):
[root@node2 bin]# ./kubectl version
Client Version: version.Info{
Major:"1", Minor:"3", GitVersion:"v1.3.3",
GitCommit:"c6411395e09da356c608896d3d9725acab821418",
GitTreeState:"clean", BuildDate:"2016-07-22T20:29:38Z",
GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Startup, params, and error:
[root@node2 bin]# ./kubelet --address=0.0.0.0 --api-servers=http://master1:8080 --container-runtime=docker --hostname-override=node1 --port=10250
I0802 17:43:04.264454 2348 docker.go:327] Start docker client with request timeout=2m0s
W0802 17:43:04.271850 2348 server.go:487] Could not load kubeconfig file /var/lib/kubelet/kubeconfig: stat /var/lib/kubelet/kubeconfig: no such file or directory. Trying auth path instead.
W0802 17:43:04.271906 2348 server.go:448] Could not load kubernetes auth path /var/lib/kubelet/kubernetes_auth: stat /var/lib/kubelet/kubernetes_auth: no such file or directory. Continuing with defaults.
I0802 17:43:04.272241 2348 manager.go:138] cAdvisor running in container: "/"
W0802 17:43:04.275956 2348 manager.go:146] unable to connect to Rkt api service: rkt: cannot tcp Dial rkt api service: dial tcp 127.0.0.1:15441: getsockopt: connection refused
I0802 17:43:04.280283 2348 fs.go:139] Filesystem partitions: map[/dev/mapper/fedora_kg--fedora-root:{mountpoint:/ major:253 minor:0 fsType:ext4 blockSize:0}]
I0802 17:43:04.284868 2348 manager.go:192] Machine: {NumCores:4 CpuFrequency:3192789
MemoryCapacity:4125679616 MachineID:1e80444278b7442385a762b9545cec7b
SystemUUID:5EC24D56-9CA6-B237-EE21-E0899C3C16AB BootID:44212209-ff1d-4340-8433-11a93274d927
Filesystems:[{Device:/dev/mapper/fedora_kg--fedora-root
Capacity:52710469632 Type:vfs Inodes:3276800}]
DiskMap:map[8:0:{Name:sda Major:8 Minor:0 Size:85899345920 Scheduler:cfq}
253:0:{Name:dm-0 Major:253 Minor:0 Size:53687091200 Scheduler:none}
253:1:{Name:dm-1 Major:253 Minor:1 Size:4160749568 Scheduler:none}
253:2:{Name:dm-2 Major:253 Minor:2 Size:27518828544 Scheduler:none}
253:3:{Name:dm-3 Major:253 Minor:3 Size:107374182400 Scheduler:none}]
NetworkDevices:[
{Name:eth0 MacAddress:00:16:3e:b9:ce:f3 Speed:10000 Mtu:1500}
{Name:flannel.1 MacAddress:fa:ed:34:75:d6:1d Speed:0 Mtu:1450}]
Topology:[
{Id:0 Memory:4125679616
Cores:[{Id:0 Threads:[0]
Caches:[]} {Id:1 Threads:[1] Caches:[]}]
Caches:[{Size:8388608 Type:Unified Level:3}]}
{Id:1 Memory:0 Cores:[{Id:0 Threads:[2]
Caches:[]} {Id:1 Threads:[3] Caches:[]}]
Caches:[{Size:8388608 Type:Unified Level:3}]}]
CloudProvider:Unknown InstanceType:Unknown InstanceID:None}
I0802 17:43:04.285649 2348 manager.go:198]
Version: {KernelVersion:4.6.4-301.fc24.x86_64 ContainerOsVersion:Fedora 24 (Twenty Four)
DockerVersion:1.12.0 CadvisorVersion: CadvisorRevision:}
I0802 17:43:04.286366 2348 server.go:768] Watching apiserver
W0802 17:43:04.286477 2348 kubelet.go:561] Hairpin mode set to "promiscuous-bridge" but configureCBR0 is false, falling back to "hairpin-veth"
I0802 17:43:04.286575 2348 kubelet.go:384] Hairpin mode set to "hairpin-veth"
W0802 17:43:04.303188 2348 plugins.go:170] can't set sysctl net/bridge/bridge-nf-call-iptables: open /proc/sys/net/bridge/bridge-nf-call-iptables: no such file or directory
I0802 17:43:04.307700 2348 docker_manager.go:235] Setting dockerRoot to /var/lib/docker
I0802 17:43:04.310175 2348 server.go:730] Started kubelet v1.3.3
E0802 17:43:04.311636 2348 kubelet.go:933] Image garbage collection failed: unable to find data for container /
E0802 17:43:04.312800 2348 kubelet.go:994] Failed to start ContainerManager [open /proc/sys/kernel/panic: read-only file system, open /proc/sys/kernel/panic_on_oops: read-only file system, open /proc/sys/vm/overcommit_memory: read-only file system]
I0802 17:43:04.312962 2348 status_manager.go:123] Starting to sync pod status with apiserver
I0802 17:43:04.313080 2348 kubelet.go:2468] Starting kubelet main sync loop.
I0802 17:43:04.313187 2348 kubelet.go:2477] skipping pod synchronization - [Failed to start ContainerManager [open /proc/sys/kernel/panic: read-only file system, open /proc/sys/kernel/panic_on_oops: read-only file system, open /proc/sys/vm/overcommit_memory: read-only file system] network state unknown container runtime is down]
I0802 17:43:04.313525 2348 server.go:117] Starting to listen on 0.0.0.0:10250
I0802 17:43:04.315021 2348 volume_manager.go:216] Starting Kubelet Volume Manager
I0802 17:43:04.325998 2348 factory.go:228] Registering Docker factory
E0802 17:43:04.326049 2348 manager.go:240] Registration of the rkt container factory failed: unable to communicate with Rkt api service: rkt: cannot tcp Dial rkt api service: dial tcp 127.0.0.1:15441: getsockopt: connection refused
I0802 17:43:04.326073 2348 factory.go:54] Registering systemd factory
I0802 17:43:04.326545 2348 factory.go:86] Registering Raw factory
I0802 17:43:04.326993 2348 manager.go:1072] Started watching for new ooms in manager
I0802 17:43:04.331164 2348 oomparser.go:185] oomparser using systemd
I0802 17:43:04.331904 2348 manager.go:281] Starting recovery of all containers
I0802 17:43:04.368958 2348 manager.go:286] Recovery completed
I0802 17:43:04.419959 2348 kubelet.go:1185] Node node1 was previously registered
I0802 17:43:09.313871 2348 kubelet.go:2477] skipping pod synchronization - [Failed to start ContainerManager [open /proc/sys/kernel/panic: read-only file system, open /proc/sys/kernel/panic_on_oops: read-only file system, open /proc/sys/vm/overcommit_memory: read-only file system]]
Flannel (installed using dnf/yum):
[root@node2 bin]# systemctl status flanneld
● flanneld.service - Flanneld overlay address etcd agent
Loaded: loaded (/usr/lib/systemd/system/flanneld.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2016-08-01 22:14:06 UTC; 21h ago
Process: 1203 ExecStartPost=/usr/libexec/flannel/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker (code=exited, status=0/SUCCESS)
Main PID: 1195 (flanneld)
Tasks: 11 (limit: 512)
Memory: 2.7M
CPU: 4.012s
CGroup: /system.slice/flanneld.service
└─1195 /usr/bin/flanneld -etcd-endpoints=http://master1:2379 -etcd-prefix=/flannel/network
LXC settings for the container:
[root@kg-fedora node2]# cat config
# Template used to create this container: /usr/share/lxc/templates/lxc-fedora
# Parameters passed to the template:
# For additional config options, please look at lxc.container.conf(5)
# Uncomment the following line to support nesting containers:
#lxc.include = /usr/share/lxc/config/nesting.conf
# (Be aware this has security implications)
lxc.network.type = veth
lxc.network.link = virbr0
lxc.network.hwaddr = 00:16:3e:b9:ce:f3
lxc.network.flags = up
lxc.network.ipv4 = 192.168.122.23/24
lxc.network.ipv4.gateway = 192.168.80.2
# Include common configuration
lxc.include = /usr/share/lxc/config/fedora.common.conf
lxc.arch = x86_64
# When using LXC with apparmor, uncomment the next line to run unconfined:
#lxc.aa_profile = unconfined
# example simple networking setup, uncomment to enable
#lxc.network.type = veth
#lxc.network.flags = up
#lxc.network.link = lxcbr0
#lxc.network.name = eth0
# Additional example for veth network type
# static MAC address,
#lxc.network.hwaddr = 00:16:3e:77:52:20
# persistent veth device name on host side
# Note: This may potentially collide with other containers of same name!
#lxc.network.veth.pair = v-fedora-template-e0
lxc.cgroup.devices.allow = a
lxc.cap.drop =
lxc.rootfs = /var/lib/lxc/node2/rootfs
lxc.rootfs.backend = dir
lxc.utsname = node2
libvirt-1.3.3.2-1.fc24.x86_64:
[root@kg-fedora node2]# systemctl status libvirtd
● libvirtd.service - Virtualization daemon
Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2016-07-29 16:33:09 EDT; 3 days ago
Docs: man:libvirtd(8)
http://libvirt.org
Main PID: 1191 (libvirtd)
Tasks: 18 (limit: 512)
Memory: 7.3M
CPU: 9.108s
CGroup: /system.slice/libvirtd.service
├─1191 /usr/sbin/libvirtd
├─1597 /sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
└─1599 /sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
Flannel/Docker config:
[root@node2 ~]# systemctl stop docker
[root@node2 ~]# ip link delete docker0
[root@node2 ~]# systemctl start docker
[root@node2 ~]# ip -4 a|grep inet
inet 127.0.0.1/8 scope host lo
inet 10.100.72.0/16 scope global flannel.1
inet 172.17.0.1/16 scope global docker0
inet 192.168.122.23/24 brd 192.168.122.255 scope global dynamic eth0
Notice that the docker0 interface is not using the same IP range as the flannel.1 interface.
Any pointers would be much appreciated!
For anyone who may look for the solution to this issue:
Since you are using LXC, you need to make sure that the filesystems in question are mounted read-write. For this, you need to specify the following option in the LXC config file:
raw.lxc: "lxc.apparmor.profile=unconfined\nlxc.cap.drop= \nlxc.cgroup.devices.allow=a\nlxc.mount.auto=proc:rw sys:rw"
or just
lxc.mount.auto: proc:rw sys:rw
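With classic LXC, as in the container config shown earlier, one hypothetical way to apply this is to append the option to the container's config file (path assumed from the lxc.rootfs setting above) and restart the container:
echo 'lxc.mount.auto = proc:rw sys:rw' >> /var/lib/lxc/node2/config
lxc-stop -n node2
lxc-start -n node2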
Here are the references:
https://medium.com/@kvaps/run-kubernetes-in-lxc-container-f04aa94b6c9c
https://github.com/corneliusweig/kubernetes-lxd