Kubernetes - Minion cannot talk to master

We have deployed Kubernetes on CoreOS using OpenStack Heat. The command below for fetching nodes does not return any results:
kubectl -s http://<Master FIP>:8080 get nodes
Looking at the minion, we saw that kubelet cannot talk to the master. kubelet on the minion has these errors.
On the master node, the hyperkube controller container shows the errors below (10.0.0.4 is the private IP of the master):
W0909 17:42:34.411146 1 request.go:347] Field selector: v1 - serviceaccounts - metadata.name - default: need to check if this is versioned correctly.
I0909 17:42:34.465422 1 endpoints_controller.go:322] Waiting for pods controller to sync, requeuing service default/kubernetes
W0909 17:43:04.249935 1 nodecontroller.go:433] Unable to find Node: 10.0.0.4, deleting all assigned Pods.
E0909 17:43:04.284611 1 nodecontroller.go:434] pods "kube-apiserver-10.0.0.4" not found
I am not sure how we should debug this. Could someone please suggest what could be wrong?
Thanks

This was resolved by making the hyperkube version the same on the master and minion nodes. In our case, we updated it to v1.3.4 (using gcr.io/google_containers/hyperkube:v1.3.4).
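A minimal sketch of checking for the mismatch, assuming the nodes run the components as Docker containers (which unit or manifest pins the image tag depends on how Heat provisioned the nodes):
# Run on the master and on each minion; the hyperkube tags should match
docker ps --format '{{.Image}}' | grep hyperkube
# Pull the same tag everywhere, then point the kubelet unit/manifest at it
docker pull gcr.io/google_containers/hyperkube:v1.3.4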


I cannot load the node information on kubernetes

When I run the command below, I get the following message:
bistel@BISTelResearchDev-DN03:~$ kubectl get nodes
The connection to the server localhost:8080 was refused - did you specify the right host or port?
While on the master node, I get the information below:
bistel@BISTelResearchDev-NN:/etc/kubernetes$ kubectl get nodes
NAME                     STATUS     ROLES    AGE   VERSION
bistelresearchdev-dn03   NotReady   <none>   62s   v1.19.3
bistelresearchdev-nn     Ready      master   57m   v1.19.3
bistel@BISTelResearchDev-NN:/etc/kubernetes$
bistelresearchdev-dn03 is the worker node, and the message The connection to the server localhost:8080 was refused - did you specify the right host or port? appears whenever I run any kubectl command on it.
I have googled a lot, but nothing I tried worked for me.
Thanks,
kubectl works only on the master node in the cluster, so if you are getting this error on the worker node, that in itself is not an issue.
The actual issue here is that the node is in NotReady status. For that, you can check the following (a command sketch follows the list):
Check that kubelet is running on node bistelresearchdev-dn03 with systemctl status kubelet.
Check that a network plugin is installed on your cluster.
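A minimal sketch of those checks (the Calico manifest URL is only an example CNI, not something from the question):
# On bistelresearchdev-dn03
systemctl status kubelet
journalctl -u kubelet -f     # look for CNI / "network plugin is not ready" errors
# From the master, install a CNI plugin if none is present, for example:
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml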
The first machine you ran kubectl on is missing the kube config file.
Normally kubectl expects to find it at
~/.kube/config
If you get the one off the master node and copy it onto your machine, kubectl will see it and be able to use it.
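A rough sketch of doing that, assuming a kubeadm-style cluster where the admin config lives at /etc/kubernetes/admin.conf (the user name and paths are placeholders):
# On the worker node
mkdir -p $HOME/.kube
scp <user>@bistelresearchdev-nn:/etc/kubernetes/admin.conf $HOME/.kube/config
kubectl get nodes   # should now reach the API server instead of localhost:8080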

Metric server not working: unable to handle the request (get nodes.metrics.k8s.io)

I am running the command kubectl top nodes and getting this error:
node@kubemaster:~/Desktop/metric$ kubectl top nodes
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
The Metrics Server pod is running with the following params:
command:
- /metrics-server
- --metric-resolution=30s
- --requestheader-allowed-names=aggregator
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
Most of the answers I have found suggest the params above, but I am still getting this error:
E0601 18:33:22.012798 1 manager.go:111] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:kubemaster: unable to fetch metrics from Kubelet kubemaster (192.168.56.30): Get https://192.168.56.30:10250/stats/summary?only_cpu_and_memory=true: context deadline exceeded, unable to fully scrape metrics from source kubelet_summary:kubenode1: unable to fetch metrics from Kubelet kubenode1 (192.168.56.31): Get https://192.168.56.31:10250/stats/summary?only_cpu_and_memory=true: dial tcp 192.168.56.31:10250: i/o timeout]
I have deployed Metrics Server using:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
What am I missing?
I am using Calico for pod networking.
On the GitHub page of Metrics Server, under FAQ:
[Calico] Check whether the value of CALICO_IPV4POOL_CIDR in the calico.yaml conflicts with the local physical network segment. The default: 192.168.0.0/16.
Could this be the reason? Can someone explain this to me?
I have set up Calico using:
kubectl apply -f https://docs.projectcalico.org/v3.14/manifests/calico.yaml
My node IPs are 192.168.56.30 / 192.168.56.31 / 192.168.56.32.
I initiated the cluster with --pod-network-cidr=20.96.0.0/12, so my pod IPs are 20.96.205.192 and so on.
I am also getting this in the apiserver logs:
E0601 19:29:59.362627 1 available_controller.go:420] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.100.152.145:443/apis/metrics.k8s.io/v1beta1: Get https://10.100.152.145:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
where 10.100.152.145 is the ClusterIP of service/metrics-server.
Surprisingly, it works on another cluster whose node IPs are in the 172.16.0.0 range.
Everything else is the same: set up using kubeadm, Calico, and the same pod CIDR.
It started working after I edited the metrics-server deployment YAML to run the pod on the host network:
hostNetwork: true
Refer to the link below:
https://www.linuxsysadmins.com/service-unavailable-kubernetes-metrics/
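A hedged sketch of applying that change with a patch instead of editing the YAML by hand (the deployment name and namespace are the defaults from the official components.yaml):
kubectl -n kube-system patch deployment metrics-server \
  -p '{"spec":{"template":{"spec":{"hostNetwork":true}}}}'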
The default value of the Calico pod network CIDR is 192.168.0.0/16.
There is a comment in the YAML file:
The default IPv4 pool to create on startup if none exists. Pod IPs
will be chosen from this range. Changing this value after installation
will have no effect. This should fall within --cluster-cidr.
- name: CALICO_IPV4POOL_CIDR
  value: "192.168.0.0/16"
So it's better to use a different one if your home network is contained in 192.168.0.0/16.
Also, if you used kubeadm, you can check your CIDR in k8s:
kubeadm config view | grep Subnet
Or you can use kubectl:
kubectl --namespace kube-system get configmap kubeadm-config -o yaml
The default one in a "self-hosted" Kubernetes cluster is 10.96.0.0/12.
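As a sketch of aligning Calico's pool with the cluster's pod CIDR (the sed edit is only illustrative, 20.96.0.0/12 is the --pod-network-cidr from the question, and in some manifest versions CALICO_IPV4POOL_CIDR must first be uncommented):
curl -sLO https://docs.projectcalico.org/v3.14/manifests/calico.yaml
sed -i 's#192.168.0.0/16#20.96.0.0/12#' calico.yaml
kubectl apply -f calico.yaml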
I had the same problem trying to run metrics on Docker Desktop. I followed suren's answer and it worked.
The default configuration is:
- --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
And I changed it to:
- --kubelet-preferred-address-types=InternalIP
I had the same issue in my on-prem k8s v1.26 cluster (CNI = Calico).
I think this issue is because of the Metrics Server version (v0.6).
I solved my issue by applying Metric-Server v0.5.2:
1- Download the YAML file from the official source
2- Add - --kubelet-insecure-tls=true below the args: section (see the snippet after these steps)
3- Apply the YAML
enjoy ;)
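A hedged sketch of what the edited container args might look like (the surrounding flags are taken from a typical metrics-server manifest and may differ in your version):
containers:
- name: metrics-server
  args:
  - --cert-dir=/tmp
  - --secure-port=4443
  - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  - --kubelet-insecure-tls=true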

Argo Workflow distribution on KOPS cluster

Using the KOPS tool, I deployed a cluster with:
1 Master
2 slaves
1 Load Balancer
Now I am trying to deploy an Argo Workflow, but I don't know the process. Will it be installed on a worker node or on the master of the k8s cluster I built? How does it work?
Basically, if anyone could describe the functional flow or the steps of deploying an Argo Workflow on Kubernetes, that would be nice. First, I need to understand where it is deployed: on the master or on a worker node?
Usually, kops creates a Kubernetes cluster with taints on the master node that prevent regular pods from being scheduled on it.
However, there have been issues with some cluster network implementations, and sometimes you get a cluster without taints on the master.
You can change taints on the master node by running the following commands:
add taints (no pods on master):
kubectl taint node kube-master node-role.kubernetes.io/master:NoSchedule
remove taints (allow pods to be scheduled on the master):
kubectl taint nodes --all node-role.kubernetes.io/master-
If you want to know whether the taints are applied to the master node or not, run the following command:
kubectl get node node-master --export -o yaml
Find the spec: section. If taints are present, you should see something like this:
...
spec:
  externalID: node-master
  podCIDR: 192.168.0.0/24
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
...
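As an alternative sketch to removing the taint cluster-wide, a single workload can opt in to running on the master by declaring a toleration (the pod below is purely illustrative, not part of the Argo manifests):
apiVersion: v1
kind: Pod
metadata:
  name: tolerates-master
spec:
  containers:
  - name: app
    image: nginx
  tolerations:
  - key: node-role.kubernetes.io/master
    effect: NoSchedule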

Kubernetes monitoring service heapster keeps restarting

I am running a Kubernetes cluster using Azure's container engine. I have an issue with one of the Kubernetes services, the one that does resource monitoring, heapster. The pod is relaunched every minute or so. I have tried removing the heapster deployment, replicaset and pods, and recreating the deployment. It instantly goes back to the same behaviour.
When I look at the resources with the heapster label it looks a little bit weird:
$ kubectl get deploy,rs,po -l k8s-app=heapster --namespace=kube-system
NAME              DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deploy/heapster   1         1         1            1           17h
NAME                     DESIRED   CURRENT   READY   AGE
rs/heapster-2708163903   1         1         1       17h
rs/heapster-867061013    0         0         0       17h
NAME                           READY   STATUS    RESTARTS   AGE
po/heapster-2708163903-vvs1d   2/2     Running   0          0s
For some reason there are two replica sets. The one called rs/heapster-867061013 keeps reappearing even when I delete all of the resources and redeploy them. The output above also shows that the pod has just started, and this is the issue: it keeps getting created, runs for a few seconds, and then a new one is created. I am new to running Kubernetes, so I am unsure which log files are relevant to this issue.
Logs from heapster container
heapster.go:72] /heapster source=kubernetes.summary_api:""
heapster.go:73] Heapster version v1.3.0
configs.go:61] Using Kubernetes client with master "https://10.0.0.1:443" and version v1
configs.go:62] Using kubelet port 10255
heapster.go:196] Starting with Metric Sink
heapster.go:106] Starting heapster on port 8082
Logs from heapster-nanny container
pod_nanny.go:56] Invoked by [/pod_nanny --cpu=80m --extra-cpu=0.5m --memory=140Mi --extra-memory=4Mi --threshold=5 --deployment=heapster --container=heapster --poll-period=300000 --estimator=exponential]
pod_nanny.go:68] Watching namespace: kube-system, pod: heapster-2708163903-mqlsq, container: heapster.
pod_nanny.go:69] cpu: 80m, extra_cpu: 0.5m, memory: 140Mi, extra_memory: 4Mi, storage: MISSING, extra_storage: 0Gi
pod_nanny.go:110] Resources: [{Base:{i:{value:80 scale:-3} d:{Dec:<nil>} s:80m Format:DecimalSI} ExtraPerNode:{i:{value:5 scale:-4} d:{Dec:<nil>} s: Format:DecimalSI} Name:cpu} {Base:{i:{value:146800640 scale:0} d:{Dec:<nil>} s:140Mi Format:BinarySI} ExtraPerNode:{i:{value:4194304 scale:0} d:{Dec:<nil>} s:4Mi Format:BinarySI} Name:memory}]
It is completely normal and important that the Deployment Controller keeps old ReplicaSet resources in order to do fast rollbacks.
A Deployment resource manages ReplicaSet resources. Your heapster Deployment is configured to run 1 pod - this means it will always try to create one ReplicaSet with 1 pod. In case you make an update to the Deployment (say, a new heapster version), then the Deployment resource creates a new ReplicaSet which will schedule a pod with the new version. At the same time, the old ReplicaSet resource sets its desired pods to 0, but the resource itself is still kept for easy rollbacks. As you can see, the old ReplicaSet rs/heapster-867061013 has 0 pods running. In case you make a rollback, the Deployment deploy/heapster will increase the number of pods in rs/heapster-867061013 to 1 and decrease the number in rs/heapster-2708163903 back to 0. You should also checkout the documentation about the Deployment Controller (in case you haven't done it yet).
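For example, a minimal sketch of inspecting and triggering such a rollback (namespace assumed to be kube-system, as in the question):
kubectl -n kube-system rollout history deployment/heapster
kubectl -n kube-system rollout undo deployment/heapster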
Still, it seems odd to me that your newly created Deployment would instantly create 2 ReplicaSets. Did you wait a few seconds (say, 20) after deleting the Deployment and before creating a new one? For me it sometimes takes a while before deletions propagate throughout the whole cluster, and if I recreate too quickly, the same resource is reused.
Concerning the heapster pod recreation you mentioned: pods have a restartPolicy. If set to Never, the pod will be recreated by its ReplicaSet in case it exits (this means a new pod resource is created and the old one is being deleted). My guess is that your heapster pod has this Never policy set. It might exit due to some error and reach a Failed state (you need to check that with the logs). Then after a short while the ReplicaSet creates a new pod.
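A quick sketch of checking that, using the pod name from the output above:
kubectl -n kube-system get pod heapster-2708163903-vvs1d -o jsonpath='{.spec.restartPolicy}'
kubectl -n kube-system logs heapster-2708163903-vvs1d -c heapster --previous   # logs of the last terminated container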
OK, so it turned out to be a problem in the Azure Container Service default Kubernetes configuration. I got some help from an Azure supporter.
The problem is fixed by adding the label addonmanager.kubernetes.io/mode: EnsureExists to the heapster deployment. Here is the pull request that the supporter referenced: https://github.com/Azure/acs-engine/pull/1133
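A sketch of applying that label directly (assuming heapster lives in kube-system, as shown above):
kubectl -n kube-system label deployment heapster addonmanager.kubernetes.io/mode=EnsureExists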

Will (can) Kubernetes run Docker containers on the master node(s)?

Kubernetes has master and minion nodes.
Will (can) Kubernetes run specified Docker containers on the master node(s)?
I guess another way of saying it is: can a master also be a minion?
Thanks for any assistance.
Update 2015-08-06: As of PR #12349 (available in 1.0.3 and will be available in 1.1 when it ships), the master node is now one of the available nodes in the cluster and you can schedule pods onto it just like any other node in the cluster.
A docker container can only be scheduled onto a kubernetes node running a kubelet (what you refer to as a minion). There is nothing preventing you from creating a cluster where the same machine (physical or virtual) runs both the kubernetes master software and a kubelet, but the current cluster provisioning scripts separate the master onto a distinct machine.
This is going to change significantly when Issue #6087 is implemented.
You need to remove the taint from your master node in order to run containers on it, although this is not recommended.
Run this on your master node:
kubectl taint nodes --all node-role.kubernetes.io/master-
Courtesy of Alex Ellis' blog post here.
You can try this code:
kubectl label node [name_of_node] node-short-name=node-1
Create a YAML file (first.yaml):
apiVersion: v1
kind: Pod
metadata:
  name: nginxtest
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    node-short-name: node-1
Create the pod:
kubectl create -f first.yaml