I'm trying to provision a new node pool with gVisor sandboxing in GKE. I use the GCP web console to add a new node pool, select the cos_containerd node image, and check the Enable gVisor Sandboxing checkbox, but the node pool provisioning fails each time with an "Unknown Error" in the GCP console notifications. The nodes never join the Kubernetes cluster.
The GCE VM seems to boot fine, and when I look at journalctl on the node I see that cloud-init finished successfully, but the kubelet can't seem to start. I see error messages like this:
Oct 12 16:58:07 gke-main-sanboxes-dd9b8d84-dmzz kubelet[1143]: E1012 16:58:07.184163 1143 kubelet.go:2271] node "gke-main-sanboxes-dd9b8d84-dmzz" not found
Oct 12 16:58:07 gke-main-sanboxes-dd9b8d84-dmzz kubelet[1143]: E1012 16:58:07.284735 1143 kubelet.go:2271] node "gke-main-sanboxes-dd9b8d84-dmzz" not found
Oct 12 16:58:07 gke-main-sanboxes-dd9b8d84-dmzz kubelet[1143]: E1012 16:58:07.385229 1143 kubelet.go:2271] node "gke-main-sanboxes-dd9b8d84-dmzz" not found
Oct 12 16:58:07 gke-main-sanboxes-dd9b8d84-dmzz kubelet[1143]: E1012 16:58:07.485626 1143 kubelet.go:2271] node "gke-main-sanboxes-dd9b8d84-dmzz" not found
Oct 12 16:58:07 gke-main-sanboxes-dd9b8d84-dmzz kubelet[1143]: E1012 16:58:07.522961 1143 eviction_manager.go:251] eviction manager: failed to get summary stats: failed to get node info: node "gke-main-sanboxes-dd9b8d84-dmzz" not found
Oct 12 16:58:07 gke-main-sanboxes-dd9b8d84-dmzz containerd[976]: time="2020-10-12T16:58:07.576735750Z" level=error msg="Failed to load cni configuration" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"
Oct 12 16:58:07 gke-main-sanboxes-dd9b8d84-dmzz kubelet[1143]: E1012 16:58:07.577353 1143 kubelet.go:2191] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
Oct 12 16:58:07 gke-main-sanboxes-dd9b8d84-dmzz kubelet[1143]: E1012 16:58:07.587824 1143 kubelet.go:2271] node "gke-main-sanboxes-dd9b8d84-dmzz" not found
Oct 12 16:58:07 gke-main-sanboxes-dd9b8d84-dmzz kubelet[1143]: E1012 16:58:07.989869 1143 kubelet.go:2271] node "gke-main-sanboxes-dd9b8d84-dmzz" not found
Oct 12 16:58:08 gke-main-sanboxes-dd9b8d84-dmzz kubelet[1143]: E1012 16:58:08.090287 1143
Oct 12 16:58:09 gke-main-sanboxes-dd9b8d84-dmzz kubelet[1143]: E1012 16:58:09.296365 1143 kubelet.go:2271] node "gke-main-sanboxes-dd9b8d84-dmzz" not found
Oct 12 16:58:09 gke-main-sanboxes-dd9b8d84-dmzz kubelet[1143]: E1012 16:58:09.396933 1143 kubelet.go:2271] node "gke-main-sanboxes-dd9b8d84-dmzz" not found
Oct 12 16:58:09 gke-main-sanboxes-dd9b8d84-dmzz node-problem-detector[1166]: F1012 16:58:09.449446 2481 main.go:71] cannot create certificate signing request: Post https://172.17.0.2/apis/certificates.k8s.io/v1beta1/certificatesigningrequests?timeout=5m0s: dial tcp 172.17.0.2:443: connect: no route
Oct 12 16:58:09 gke-main-sanboxes-dd9b8d84-dmzz node-problem-detector[1166]: E1012 16:58:09.450695 1166 manager.go:162] failed to update node conditions: Patch https://172.17.0.2/api/v1/nodes/gke-main-sanboxes-dd9b8d84-dmzz/status: getting credentials: exec: exit status 1
Oct 12 16:58:09 gke-main-sanboxes-dd9b8d84-dmzz kubelet[1143]: E1012 16:58:09.453825 2486 cache.go:125] failed reading existing private key: open /var/lib/kubelet/pki/kubelet-client.key: no such file or directory
Oct 12 16:58:09 gke-main-sanboxes-dd9b8d84-dmzz kubelet[1143]: E1012 16:58:09.543449 1143 kubelet.go:2271] node "gke-main-sanboxes-dd9b8d84-dmzz" not found
Oct 12 16:58:09 gke-main-sanboxes-dd9b8d84-dmzz kubelet[1143]: E1012 16:58:09.556623 2486 tpm.go:124] failed reading AIK cert: tpm2.NVRead(AIK cert): decoding NV_ReadPublic response: handle 1, error code 0xb : the handle is not correct for the use
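For reference, this is roughly how I'm pulling those logs (a sketch; the zone is a placeholder, I just SSH into the node VM and tail the kubelet unit):
gcloud compute ssh gke-main-sanboxes-dd9b8d84-dmzz --zone <zone>
# on the node: show the most recent kubelet messages
sudo journalctl -u kubelet --no-pager | tail -n 100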
I am not really sure what might be causing this, and I'd like to be able to use autoscaling with this node pool, so I don't want to fix it manually for this node and then have to do the same for every new node that joins. How can I configure the node pool so that the gVisor-based nodes provision correctly on their own?
My cluster details:
GKE version: 1.17.9-gke.6300
Cluster type: Regional
VPC-native
Private cluster
Shielded GKE Nodes
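For reference, the node pool I'm trying to create should be roughly equivalent to this gcloud command (a sketch only; I actually used the console, and the pool, cluster, and region names below are placeholders):
gcloud container node-pools create gvisor-pool \
  --cluster=<cluster-name> --region=<region> \
  --image-type=cos_containerd \
  --sandbox type=gvisor \
  --enable-autoscaling --min-nodes=1 --max-nodes=3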
You can report issues with Google products via the public Issue Tracker:
Cloud.google.com: Support: Docs: Issue Trackers
You will need to choose Create new Google Kubernetes Engine issue under the Compute section.
I can confirm that I stumbled upon the same issue when creating a cluster as described in the question (private, shielded, etc.):
Create a cluster with one node pool.
Add the node pool with gVisor enabled after the cluster has been created successfully.
Creating a cluster like the above pushes the GKE cluster into the RECONCILING state:
NAME LOCATION MASTER_VERSION MASTER_IP MACHINE_TYPE NODE_VERSION NUM_NODES STATUS
gke-gvisor europe-west3 1.17.9-gke.6300 XX.XXX.XXX.XXX e2-medium 1.17.9-gke.6300 6 RECONCILING
The changes in the cluster state:
Provisioning - creating the cluster
Running - created the cluster
Reconciling - added the node pool
Running - the node pool was added (for about a minute)
Reconciling - the cluster went into that state for about 25 minutes
GCP Cloud Console (Web UI) reports: Repairing Cluster
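If you want to watch those state transitions from the CLI rather than the web UI, something like this should work (a sketch; it assumes gcloud is already pointed at the right project):
# the STATUS column flips between RUNNING and RECONCILING
gcloud container clusters list
# the underlying create/repair/upgrade operations and their status
gcloud container operations list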
Running Openshift 4.1 on K8s v1.13.4. I'm trying to add a second network (for NFS storage) to my compute nodes, and as soon as I do, the node stops reporting NodeReady.
See the kubelet logs below. I'm completely lost. How can I add another interface to my nodes?
FieldPath:""}): type: 'Normal' reason: 'NodeReady' Node compute-0 status is now: NodeReady
Jun 26 05:41:22 compute-0 hyperkube[923]: E0626 05:41:22.367174 923 kubelet_node_status.go:380] Error updating node status, will retry: failed to patch status "{\"status\":{\"$setElementOrder/addresses\":[{\"type\":\"ExternalIP\"},{\"type\":\"InternalIP\"},{\"type\":\"ExternalIP\"},{\"type\":\"InternalIP\"},{\"type\":\"Hostname\"}],\"$setElementOrder/conditions\":[{\"type\":\"MemoryPressure\"},{\"type\":\"Dis
...
Jun 26 05:41:22 compute-0 hyperkube[923]: [map[address:10.90.49.111 type:ExternalIP] map[type:ExternalIP address:10.90.51.94] map[address:10.90.49.111 type:InternalIP] map[address:10.90.51.94 type:InternalIP]]
Jun 26 05:41:22 compute-0 hyperkube[923]: doesn't match $setElementOrder list:
The resolution was to delete the node from the cluster (kubectl delete node compute-0), reboot it, and let Ignition rejoin it to the cluster.
This is a known bug.
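A minimal sketch of that recovery sequence (it assumes you can SSH to the node as the core user; adjust for your environment):
# remove the stale Node object from the cluster
kubectl delete node compute-0
# reboot the machine; on boot the kubelet re-registers the node
ssh core@compute-0 sudo systemctl reboot
# watch for compute-0 to come back and report Ready
kubectl get nodes -w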
I am trying to install Kubernetes on Debian 9.3. I followed the instructions in this document: https://kubernetes.io/docs/setup/independent/install-kubeadm/, but it failed to create the cluster with a timeout error. The commands I used are as follows:
export HTTP_PROXY=http://192.168.56.1:1080 # this is my internet proxy
export HTTPS_PROXY=http://192.168.56.1:1080
export NO_PROXY=127.0.0.1,192.168.56.*,10.244.*,10.96.*
kubeadm init --apiserver-advertise-address=192.168.56.101 --pod-network-cidr=10.244.0.0/16
The last command hung for about an hour and then failed with a timeout. Using docker ps I found that several containers were running, including kube-controller-manager-amd64, etcd-amd64, kube-apiserver-amd64, kube-scheduler-amd64, and 4 instances of pause-amd64.
The error messages are as follows:
duler-debvm01_kube-system(660259102d57385a8043d025ac189c87)": Get https://192.168.56.101:6443/api/v1/namespaces/kube-system/pods/kube-scheduler-debvm01: net/http: TLS handshake timeout
Apr 06 21:44:49 DebVM01 kubelet[10665]: E0406 21:44:49.923017 10665 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:474: Failed to list *v1.Node: Get https://192.168.56.101:6443/api/v1/nodes?fieldSelector=metadata.name%3Ddebvm01&limit=500&resourceVersion=0: net/http: TLS handshake timeout
Apr 06 21:44:49 DebVM01 kubelet[10665]: E0406 21:44:49.924966 10665 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:465: Failed to list *v1.Service: Get https://192.168.56.101:6443/api/v1/services?limit=500&resourceVersion=0: net/http: TLS handshake timeout
Apr 06 21:44:49 DebVM01 kubelet[10665]: E0406 21:44:49.925892 10665 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get xxx/api/v1/pods?fieldSelector=spec.nodeName%3Ddebvm01&limit=500&resourceVersion=0: net/http: TLS handshake timeout
Apr 06 21:44:50 DebVM01 kubelet[10665]: E0406 21:44:50.029333 10665 eviction_manager.go:238] eviction manager: unexpected err: failed to get node info: node "debvm01" not found
Apr 06 21:44:50 DebVM01 kubelet[10665]: E0406 21:44:50.379543 10665 kubelet_node_status.go:106] Unable to register node "debvm01" with API server: Post xxx: net/http: TLS handshake timeout
Apr 06 21:44:52 DebVM01 kubelet[10665]: E0406 21:44:52.575452 10665 event.go:209] Unable to write event: 'Post xxxx: net/http: TLS handshake timeout' (may retry after sleeping)
Apr 06 21:44:57 DebVM01 kubelet[10665]: I0406 21:44:57.380498 10665 kubelet_node_status.go:273] Setting node annotation to enable volume controller attach/detach
Apr 06 21:44:57 DebVM01 kubelet[10665]: I0406 21:44:57.430059 10665 kubelet_node_status.go:82] Attempting to register node debvm01
Apr 06 21:45:00 DebVM01 kubelet[10665]: E0406 21:45:00.030635 10665 eviction_manager.go:238] eviction manager: unexpected err: failed to get node info: node "debvm01" not found
Apr 06 21:45:01 DebVM01 kubelet[10665]: I0406 21:45:01.484580 10665 kubelet_node_status.go:85] Successfully registered node debvm01
The error messages above have already been trimmed; many repeated lines like the following were removed:
Apr 06 22:46:20 DebVM01 kubelet[10665]: E0406 22:46:20.773690 10665 kubelet.go:2104] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Apr 06 22:46:25 DebVM01 kubelet[10665]: W0406 22:46:25.779141 10665 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Kubernetes v1.9.3
Could anyone help me?
kubeadm init --apiserver-advertise-address=192.168.56.101
--pod-network-cidr=10.244.0.0/16
From kubeadm documentation:
--apiserver-advertise-address ip-address    The IP address the API Server will advertise it's listening on. Specify '0.0.0.0' to use the address of the default network interface.
Unless otherwise specified, kubeadm uses the default gateway's network interface to advertise the master's IP. If you want to use a different network interface, specify --apiserver-advertise-address=ip-address.
From kubernetes api-server documentation:
--advertise-address ip-address    The IP address on which to advertise the apiserver to members of the cluster. This address must be reachable by the rest of the cluster. If blank, the --bind-address will be used. If --bind-address is unspecified, the host's default interface will be used.
I've done a couple of experiments which confirm that the ip-address must be configured (or added as a secondary IP) on one of the master instance's interfaces.
Just double-check that the interface is up.
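A quick way to verify both points from the master itself (a sketch; the address and port are taken from the question, 6443 being the kubeadm default):
# confirm the advertise address is assigned to an interface and that the interface is UP
ip -br addr show | grep 192.168.56.101
# confirm the API server answers on the advertised address
curl -k https://192.168.56.101:6443/healthz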
The last error message,
network plugin is not ready: cni config uninitialized
means that the Kubernetes networking subsystem is absent or broken. Try to install or reinstall it with:
kubectl apply -f https://docs.projectcalico.org/v3.0/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml
This part is described in the section "(3/4) Installing a pod network" of the document you mentioned.
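After applying the manifest you can confirm the network plugin actually came up (a sketch; the exact pod names depend on the Calico version):
# the calico pods should reach Running, and the node should flip to Ready shortly after
kubectl get pods -n kube-system -o wide
kubectl get nodes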
If you are stuck, try to reinstall your cluster following this manual.
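If you do start over, a minimal reset-and-retry sequence looks roughly like this (assuming you keep the same advertise address and pod network CIDR as above):
# tear down the failed control-plane state on this machine
sudo kubeadm reset
# re-initialize with the same settings as before
sudo kubeadm init --apiserver-advertise-address=192.168.56.101 --pod-network-cidr=10.244.0.0/16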
I am trying to setup Kubernetes cluster using the instruction at https://coreos.com/kubernetes/docs/latest/getting-started.html.
I am at step 2 (Deploy master): when I start the master service, it shows active status but it cannot communicate with the API server. Also, six containers are started but their logs are empty. Please find the kubelet log below:
Jan 26 07:54:18 kubernetes-1.novalocal systemd[1]: Started kubelet.service.
Jan 26 07:54:20 kubernetes-1.novalocal kubelet[1115]: W0126 07:54:20.214551 1115 server.go:585] Could not load kubeconfig file /var/lib/kubelet/kubeconfig: stat /var/lib/kubelet/kubeconfig: no such file or directory. Trying auth path instead.
Jan 26 07:54:20 kubernetes-1.novalocal kubelet[1115]: W0126 07:54:20.214631 1115 server.go:547] Could not load kubernetes auth path /var/lib/kubelet/kubernetes_auth: stat /var/lib/kubelet/kubernetes_auth: no such file or directory. Continuing with defaults.
Jan 26 07:54:20 kubernetes-1.novalocal kubelet[1115]: I0126 07:54:20.217269 1115 plugins.go:71] No cloud provider specified.
Jan 26 07:54:20 kubernetes-1.novalocal kubelet[1115]: I0126 07:54:20.219217 1115 manager.go:128] cAdvisor running in container: "/system.slice/kubelet.service"
Jan 26 07:54:20 kubernetes-1.novalocal kubelet[1115]: I0126 07:54:20.672952 1115 fs.go:108] Filesystem partitions: map[/dev/vda9:{mountpoint:/ major:254 minor:9 fsType: blockSize:0} /dev/vda3:{mountpoint:/usr major:254 minor:3 fsType: blockSize:0} /dev/vda6:{mountpoi
Jan 26 07:54:20 kubernetes-1.novalocal kubelet[1115]: I0126 07:54:20.856238 1115 manager.go:163] Machine: {NumCores:2 CpuFrequency:1999999 MemoryCapacity:4149022720 MachineID:5a493caa9327449cabd050ac6cd2e065 SystemUUID:5A493CAA-9327-449C-ABD0-50AC6CD2E065 BootID:541d
Jan 26 07:54:20 kubernetes-1.novalocal kubelet[1115]: I0126 07:54:20.858067 1115 manager.go:169] Version: {KernelVersion:4.3.3-coreos-r2 ContainerOsVersion:CoreOS 899.5.0 DockerVersion:1.9.1 CadvisorVersion: CadvisorRevision:}
Jan 26 07:54:20 kubernetes-1.novalocal kubelet[1115]: I0126 07:54:20.862564 1115 server.go:798] Adding manifest file: /etc/kubernetes/manifests
Jan 26 07:54:20 kubernetes-1.novalocal kubelet[1115]: I0126 07:54:20.862655 1115 server.go:808] Watching apiserver
Jan 26 07:54:21 kubernetes-1.novalocal kubelet[1115]: I0126 07:54:21.165506 1115 plugins.go:56] Registering credential provider: .dockercfg
Jan 26 07:54:21 kubernetes-1.novalocal kubelet[1115]: E0126 07:54:21.171563 1115 kubelet.go:2284] Error updating node status, will retry: error getting node "192.168.111.32": Get http://127.0.0.1:8080/api/v1/nodes/192.168.111.32: dial tcp 127.0.0.1:8080: connection r
Jan 26 07:54:21 kubernetes-1.novalocal kubelet[1115]: E0126 07:54:21.172329 1115 kubelet.go:2284] Error updating node status, will retry: error getting node "192.168.111.32": Get http://127.0.0.1:8080/api/v1/nodes/192.168.111.32: dial tcp 127.0.0.1:8080: connection r
Jan 26 07:54:21 kubernetes-1.novalocal kubelet[1115]: E0126 07:54:21.173114 1115 kubelet.go:2284] Error updating node status, will retry: error getting node "192.168.111.32": Get http://127.0.0.1:8080/api/v1/nodes/192.168.111.32: dial tcp 127.0.0.1:8080: connection refused
Also, the following are the containers launched.
2bf275350996 gcr.io/google_containers/podmaster:1.1 "/podmaster --etcd-se" 26 minutes ago Up 26 minutes k8s_controller-manager-elector.5b0f7cea_kube-podmaster-192.168.111.32_kube-system_3b8350635fe89ab366063da0be8969fd_1f370f8c
c64042286744 gcr.io/google_containers/podmaster:1.1 "/podmaster --etcd-se" 26 minutes ago Up 26 minutes k8s_scheduler-elector.bc3d71be_kube-podmaster-192.168.111.32_kube-system_3b8350635fe89ab366063da0be8969fd_c9ecb387
81bd74d0396a gcr.io/google_containers/hyperkube:v1.1.2 "/hyperkube proxy --m" 26 minutes ago Up 26 minutes k8s_kube-proxy.176f5569_kube-proxy-192.168.111.32_kube-system_8a987aa8c76c4d76bd80ccff5b65ffea_840d8228
39494ed8e814 gcr.io/google_containers/pause:0.8.0 "/pause" 27 minutes ago Up 27 minutes k8s_POD.6d00e006_kube-podmaster-192.168.111.32_kube-system_3b8350635fe89ab366063da0be8969fd_36b73b1d
632dc0a2f612 gcr.io/google_containers/pause:0.8.0 "/pause" 27 minutes ago Up 27 minutes k8s_POD.6d00e006_kube-apiserver-192.168.111.32_kube-system_86819bf93f678db0ee778b8c8bb658dc_815c6627
361b297b37f9 gcr.io/google_containers/pause:0.8.0 "/pause" 27 minutes ago Up 27 minutes k8s_POD.6d00e006_kube-proxy-192.168.111.32_kube-system_8a987aa8c76c4d76bd80ccff5b65ffea_7a6182ed
These are trying to talk to the insecure version of the API, which shouldn't work between machines. That will only work on the master. Additionally, the master isn't set up to accept work (register_node=false), so it is not expected to report back its status.
The key piece of info we're missing: which machine did that log come from?
Did you set the MASTER_HOST= parameter correctly?
The address of the master node. In most cases this will be the publicly routable IP of the node. Worker nodes must be able to reach the master node(s) via this address on port 443.
Also, note this section of the docs:
Note that the kubelet running on a master node may log repeated attempts to post its status to the API server. These warnings are expected behavior and can be ignored. Future Kubernetes releases plan to handle this common deployment consideration more gracefully.
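A quick way to tell which endpoint a given node can actually reach (a sketch; <MASTER_HOST> is whatever you configured, and the ports are the defaults used by this guide):
# on the master itself: the insecure port, which is local-only by design
curl http://127.0.0.1:8080/version
# from a worker: the secure port that MASTER_HOST must expose
curl -k https://<MASTER_HOST>:443/version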
I have installed K8S on OpenStack following this guide.
The installation went fine and I was able to run pods, but after some time my applications stop working. I can still create pods, but requests won't reach the services, either from outside the cluster or from within the pods. Basically, something in the networking gets messed up. iptables -L -vnt nat still shows the proper configuration, but things don't work.
To get it working again I have to rebuild the cluster; removing all services and replication controllers doesn't help.
I tried to look into the logs. Below is the journal for kube-proxy:
Dec 20 02:12:18 minion01.novalocal systemd[1]: Started Kubernetes Proxy.
Dec 20 02:15:52 minion01.novalocal kube-proxy[1030]: I1220 02:15:52.269784 1030 proxier.go:487] Opened iptables from-containers public port for service "default/opensips:sipt" on TCP port 5060
Dec 20 02:15:52 minion01.novalocal kube-proxy[1030]: I1220 02:15:52.278952 1030 proxier.go:498] Opened iptables from-host public port for service "default/opensips:sipt" on TCP port 5060
Dec 20 03:05:11 minion01.novalocal kube-proxy[1030]: W1220 03:05:11.806927 1030 api.go:224] Got error status on WatchEndpoints channel: &{TypeMeta:{Kind: APIVersion:} ListMeta:{SelfLink: ResourceVersion:} Status:Failure Message:401: The event in requested index is outdated and cleared (the requested history has been cleared [1433/544]) [2432] Reason: Details:<nil> Code:0}
Dec 20 03:06:08 minion01.novalocal kube-proxy[1030]: W1220 03:06:08.177225 1030 api.go:153] Got error status on WatchServices channel: &{TypeMeta:{Kind: APIVersion:} ListMeta:{SelfLink: ResourceVersion:} Status:Failure Message:401: The event in requested index is outdated and cleared (the requested history has been cleared [1476/207]) [2475] Reason: Details:<nil> Code:0}
..
..
..
Dec 20 16:01:23 minion01.novalocal kube-proxy[1030]: E1220 16:01:23.448570 1030 proxier.go:161] Failed to ensure iptables: error creating chain "KUBE-PORTALS-CONTAINER": fork/exec /usr/sbin/iptables: too many open files:
Dec 20 16:01:23 minion01.novalocal kube-proxy[1030]: W1220 16:01:23.448749 1030 iptables.go:203] Error checking iptables version, assuming version at least 1.4.11: %vfork/exec /usr/sbin/iptables: too many open files
Dec 20 16:01:23 minion01.novalocal kube-proxy[1030]: E1220 16:01:23.448868 1030 proxier.go:409] Failed to install iptables KUBE-PORTALS-CONTAINER rule for service "default/kubernetes:"
Dec 20 16:01:23 minion01.novalocal kube-proxy[1030]: E1220 16:01:23.448906 1030 proxier.go:176] Failed to ensure portal for "default/kubernetes:": error checking rule: fork/exec /usr/sbin/iptables: too many open files:
Dec 20 16:01:23 minion01.novalocal kube-proxy[1030]: W1220 16:01:23.449006 1030 iptables.go:203] Error checking iptables version, assuming version at least 1.4.11: %vfork/exec /usr/sbin/iptables: too many open files
Dec 20 16:01:23 minion01.novalocal kube-proxy[1030]: E1220 16:01:23.449133 1030 proxier.go:409] Failed to install iptables KUBE-PORTALS-CONTAINER rule for service "default/repo-client:"
I found a few posts relating to "failed to install iptables", but they don't seem relevant, since initially everything works and it only gets messed up after a few hours.
What version of Kubernetes is this? A long time ago (~1.0.4) we had a bug in the kube-proxy where it leaked sockets/file-descriptors.
If you aren't running a 1.1.3 binary, consider upgrading.
Also, you should be able to use lsof to figure out who has all of the files open.
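A sketch of how to check whether kube-proxy really is the process hoarding file descriptors (it assumes the binary is actually named kube-proxy and that lsof is installed):
# how many files does kube-proxy currently hold open?
sudo ls /proc/$(pidof kube-proxy)/fd | wc -l
# compare against its limit
grep "open files" /proc/$(pidof kube-proxy)/limits
# full listing of what is open, via lsof
sudo lsof -p $(pidof kube-proxy)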
Yesterday the service worked fine, but today when I checked the service's state I saw:
Mar 11 14:03:16 coreos-1 systemd[1]: scheduler.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Mar 11 14:03:16 coreos-1 systemd[1]: Unit scheduler.service entered failed state.
Mar 11 14:03:16 coreos-1 systemd[1]: scheduler.service failed.
Mar 11 14:03:16 coreos-1 systemd[1]: Starting Kubernetes Scheduler...
Mar 11 14:03:16 coreos-1 systemd[1]: Started Kubernetes Scheduler.
Mar 11 14:08:16 coreos-1 kube-scheduler[4659]: E0311 14:08:16.808349 4659 reflector.go:118] watch of *api.Service ended with error: very short watch
Mar 11 14:08:16 coreos-1 kube-scheduler[4659]: E0311 14:08:16.811434 4659 reflector.go:118] watch of *api.Pod ended with error: unexpected end of JSON input
Mar 11 14:08:16 coreos-1 kube-scheduler[4659]: E0311 14:08:16.847595 4659 reflector.go:118] watch of *api.Pod ended with error: unexpected end of JSON input
It's really confusing, because etcd, flannel, and the apiserver all work fine.
The only strange logs are from etcd:
Mar 11 20:22:21 coreos-1 etcd[472]: [etcd] Mar 11 20:22:21.572 INFO | aba44aa0670b4b2e8437c03a0286d779: warning: heartbeat time out peer="6f4934635b6b4291bf29763add9bf4c7" missed=1 backoff="2s"
Mar 11 20:22:48 coreos-1 etcd[472]: [etcd] Mar 11 20:22:48.269 INFO | aba44aa0670b4b2e8437c03a0286d779: warning: heartbeat time out peer="6f4934635b6b4291bf29763add9bf4c7" missed=1 backoff="2s"
Mar 11 20:48:12 coreos-1 etcd[472]: [etcd] Mar 11 20:48:12.070 INFO | aba44aa0670b4b2e8437c03a0286d779: warning: heartbeat time out peer="6f4934635b6b4291bf29763add9bf4c7" missed=1 backoff="2s"
So I'm really stuck and don't know what's wrong. How can I resolve this problem? Or how can I check detailed logs for the scheduler? journalctl gives me the same logs as the systemd status output.
Please see: https://github.com/GoogleCloudPlatform/kubernetes/issues/5311
It means apiserver accepted the watch request but then immediately terminated the connection.
If you see it occasionally, it implies a transient error and is not alarming. If you see it repeatedly, it implies that apiserver (or etcd) is sick.
Is something actually not working for you?
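If you want to check whether the apiserver or etcd is actually unhealthy rather than just noisy, a quick sanity check looks like this (a sketch; the endpoints assume the defaults for this kind of CoreOS setup):
# apiserver health on the local insecure port
curl http://127.0.0.1:8080/healthz
# etcd cluster health (etcd 2.x tooling; older etcdctl versions may lack this subcommand)
etcdctl cluster-health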