Kubernetes HA master set up - kubernetes

I have set up an HA Kubernetes cluster. First I added a node, then joined the other node with the master role.
I basically did the multi-etcd setup, and it worked fine for me. The failover testing also worked fine. The problem is that once I was done working, I drained and deleted the other node and then shut down the other machine (a VM on GCP). After that my kubectl commands stopped working. Let me share the steps:
kubectl get node (when the multi-node setup is running):
NAME STATUS ROLES AGE VERSION
instance-1 Ready <none> 17d v1.15.1
instance-3 Ready <none> 25m v1.15.1
masternode Ready master 18d v1.16.0
kubectl get nodes (after I shut down the other node):
root@masternode:~# kubectl get nodes
The connection to the server k8smaster:6443 was refused - did you specify the right host or port?
Any clue?

After rebooting the server you need to run the steps below:
sudo -i
swapoff -a
exit
strace -eopenat kubectl version
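If swap really is the cause, it usually comes back after a reboot because it is still listed in /etc/fstab, and the kubelet refuses to start with swap on by default. Here is a minimal sketch for checking that before running swapoff -a. The function name swap_active and its optional path argument are my own naming, not part of any tool:

```shell
# Report whether any swap area is active, based on a file in /proc/swaps format.
# swap_active and its optional path argument are illustrative assumptions.
swap_active() {
    swaps_file="${1:-/proc/swaps}"
    # /proc/swaps starts with one header line; any further non-empty
    # line describes an active swap area.
    awk 'NR > 1 && NF > 0 { found = 1 } END { exit(found ? 0 : 1) }' "$swaps_file"
}

# Possible usage on the master after a reboot (as root, not executed here):
#   if swap_active; then swapoff -a && systemctl restart kubelet; fi
```

Note that swapoff -a only lasts until the next reboot, so the swap entry in /etc/fstab would also need to be commented out to make it stick.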

Related

minikube service url connection refused

I am a beginner with Kubernetes. I am trying to install minikube so that I can run my application in Kubernetes. I am using Ubuntu 16.04.
I followed the installation instructions provided here:
https://kubernetes.io/docs/setup/learning-environment/minikube/#using-minikube-with-an-http-proxy
Issue1:
After installing kubectl, VirtualBox and minikube, I ran the command
minikube start --vm-driver=virtualbox
It fails with the following error:
Starting local Kubernetes v1.10.0 cluster...
Starting VM...
Getting VM IP address...
Moving files into cluster...
Setting up certs...
Connecting to cluster...
Setting up kubeconfig...
Starting cluster components...
E0912 17:39:12.486830 17689 start.go:305] Error restarting
cluster: restarting kube-proxy: waiting for kube-proxy to be
up for configmap update: timed out waiting for the condition
But when I checked VirtualBox, the minikube VM was running, and kubectl commands worked:
kubectl create deployment hello-minikube --image=k8s.gcr.io/echoserver:1.10
I see the deployments
kubectl get deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
hello-minikube 1 1 1 1 27m
I exposed the hello-minikube deployment as service
kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hello-minikube LoadBalancer 10.102.236.236 <pending> 8080:31825/TCP 15m
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 19h
I got the url for the service
minikube service hello-minikube --url
http://192.168.99.100:31825
When I try to curl the url I am getting the following error
curl http://192.168.99.100:31825
curl: (7) Failed to connect to 192.168.99.100 port 31825: Connection refused
1) If the minikube cluster failed while starting, how was kubectl able to connect to minikube to create deployments and services?
2) If the cluster is fine, why am I getting connection refused?
I was looking at this proxy section (https://kubernetes.io/docs/setup/learning-environment/minikube/#starting-a-cluster). What is my_proxy in it?
Is it the minikube IP and some port?
I have tried this:
Error restarting cluster: restarting kube-proxy: waiting for kube-proxy to be up for configmap update: timed out waiting for the condition
but I do not understand how #3 (set proxy) in the solution is done. Can someone help me with instructions for the proxy?
Adding the command output which was asked in the comments
kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
etcd-minikube 1/1 Running 0 4m
kube-addon-manager-minikube 1/1 Running 0 5m
kube-apiserver-minikube 1/1 Running 0 4m
kube-controller-manager-minikube 1/1 Running 0 6m
kube-dns-86f4d74b45-sdj6p 3/3 Running 0 5m
kube-proxy-7ndvl 1/1 Running 0 5m
kube-scheduler-minikube 1/1 Running 0 5m
kubernetes-dashboard-5498ccf677-4x7sr 1/1 Running 0 5m
storage-provisioner 1/1 Running 0 5m
I deleted minikube, removed all files under ~/.minikube and
reinstalled minikube. Now it is working fine. I did not get the output
before, but I have attached it to the question now that it is working. Can
you tell me what the output of this command tells?
It will be very difficult or even impossible to tell what was exactly wrong with your Minikube Kubernetes cluster when it is already removed and set up again.
Basically there were a few things that you could do to properly troubleshoot or debug your issue.
Adding the command output which was asked in the comments
The output you posted is actually only part of what @Eduardo Baitello asked you to do. The kubectl get po -n kube-system command simply shows you a list of Pods in the kube-system namespace. In other words, this is the list of system pods forming your Kubernetes cluster and, as you can imagine, proper functioning of each of these components is crucial. As you can see in your output, the STATUS of your kube-proxy pod is Running:
kube-proxy-7ndvl 1/1 Running 0 5m
You were also asked in @Eduardo's comment to check its logs. Since the pod lives in the kube-system namespace, you can do that by issuing:
kubectl logs kube-proxy-7ndvl -n kube-system
It could tell you what was wrong with this particular pod at the time the problem occurred. Additionally, in such a case you may use the describe command to see other pod details (sometimes looking at pod events is very helpful for figuring out what's going on with it):
kubectl describe pod kube-proxy-7ndvl -n kube-system
The suggestion to check this particular Pod status and logs was most probably motivated by this fragment of the error messages shown during your Minikube startup process:
E0912 17:39:12.486830 17689 start.go:305] Error restarting
cluster: restarting kube-proxy: waiting for kube-proxy to be
up for configmap update: timed out waiting for the condition
As you can see, this message clearly suggests that there is, in short, "something wrong" with kube-proxy, so it made a lot of sense to check it first.
There is one more thing you may have not noticed:
kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hello-minikube LoadBalancer 10.102.236.236 <pending> 8080:31825/TCP 15m
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 19h
Your hello-minikube service was not completely ready. In the EXTERNAL-IP column you can see that its state was pending. Just as you can use the describe command on Pods, you can use it to get details of the service. A simple:
kubectl describe service hello-minikube
could tell you quite a lot in such a case.
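If you have many services, a small filter can point out the ones still waiting for an external IP. This is only a sketch that assumes the default column layout shown above (EXTERNAL-IP as the fourth column). The function name pending_services is made up for illustration:

```shell
# Print the names of services whose EXTERNAL-IP column is still <pending>.
# Reads `kubectl get service` output on stdin. Assumes the default layout
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE (fourth column is EXTERNAL-IP).
pending_services() {
    awk 'NR > 1 && $4 == "<pending>" { print $1 }'
}

# Possible usage (not executed here):
#   kubectl get service | pending_services
```

With the output you posted, this would print hello-minikube and skip the kubernetes ClusterIP service.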
1)If minikube cluster got failed while starting, how did the kubectl
able to connect to minikube to do deployments and services? 2) If
cluster is fine, then why am i getting connection refused ?
Remember that a Kubernetes cluster is not a monolithic structure; it consists of many parts that depend on one another. The fact that kubectl worked and you could create a deployment doesn't mean that the whole cluster was working fine. As you can see, the error message suggested that one of its components, namely kube-proxy, might not have been functioning properly.
Going back to the beginning of your question...
I have followed the installation instructions provided here
https://kubernetes.io/docs/setup/learning-environment/minikube/#using-minikube-with-an-http-proxy
Issue1: After installing kubectl, virtualbox and minikube I have run
the command
minikube start --vm-driver=virtualbox
As far as I understand, you don't use an http proxy, so you didn't follow the instructions from this particular fragment of the docs that you posted, did you?
I have the impression that you are mixing up two concepts: kube-proxy, which is a Kubernetes cluster component deployed as a pod in the kube-system namespace, and the http proxy server mentioned in this fragment of the documentation.
I was looking at this
proxy(https://kubernetes.io/docs/setup/learning-environment/minikube/#starting-a-cluster)
what is my_proxy in this ?
If you don't know what your http proxy address is, you most probably simply don't use one, and if you don't use one to connect to the Internet from your computer, it doesn't apply to your case in any way.
Otherwise you need to set it up for your Minikube by providing additional flags when you start it, as follows:
minikube start --docker-env http_proxy=http://$YOURPROXY:PORT \
--docker-env https_proxy=https://$YOURPROXY:PORT
If you were able to start your Minikube and it now works properly using only the command:
minikube start --vm-driver=virtualbox
then your issue was caused by something else, and you don't need the flags mentioned above to tell Minikube which http proxy server you're using.
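To make the "only if you actually use a proxy" logic explicit, here is a rough sketch. It assumes the same proxy URL serves both http and https traffic, and the helper name minikube_proxy_args is hypothetical:

```shell
# Print the --docker-env flags for minikube, or nothing when no proxy is given.
# Assumption: one proxy URL is used for both http_proxy and https_proxy.
minikube_proxy_args() {
    proxy="$1"   # e.g. http://proxy.example.com:3128 (made-up address)
    if [ -n "$proxy" ]; then
        printf '%s\n' "--docker-env http_proxy=$proxy --docker-env https_proxy=$proxy"
    fi
}

# Possible usage (not executed here). The substitution is left unquoted on
# purpose so the flags are split into separate arguments:
#   minikube start --vm-driver=virtualbox $(minikube_proxy_args "$http_proxy")
```

If the http_proxy variable is empty in your shell, no extra flags are added and the command is the plain minikube start shown above.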
As far as I understand, everything is currently up and running and you can access the URL returned by the command minikube service hello-minikube --url without any problem, right? You can also run kubectl get service hello-minikube and check whether its output differs from what you posted before. As you didn't attach any yaml definition files, it's difficult to tell whether anything was wrong with your service definition. Also note that LoadBalancer is a service type designed to work with external load balancers provided by cloud providers; on minikube such a service is reached through its NodePort instead.

kubectl get nodes shows NotReady

I have installed a two-node Kubernetes 1.12.1 cluster on cloud VMs, both behind an internet proxy. Each VM has a floating IP associated with it for SSH access; kube-01 is the master and kube-02 is a node. I executed:
export no_proxy=127.0.0.1,localhost,10.157.255.185,192.168.0.153,kube-02,192.168.0.25,kube-01
before running kubeadm init, but I am getting the following status from kubectl get nodes:
NAME STATUS ROLES AGE VERSION
kube-01 NotReady master 89m v1.12.1
kube-02 NotReady <none> 29s v1.12.2
Am I missing any configuration? Do I need to add 192.168.0.153 and 192.168.0.25 to each VM's /etc/hosts?
It looks like a pod network is not installed on your cluster yet. You can install Weave, for example, with the command below:
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
After a few seconds, a Weave Net pod should be running on each Node and any further pods you create will be automatically attached to the Weave network.
You can install a pod network of your choice. Here is a list
After this, check
$ kubectl describe nodes
and verify that everything looks fine, as below:
Conditions:
Type Status
---- ------
OutOfDisk False
MemoryPressure False
DiskPressure False
Ready True
Capacity:
cpu: 2
memory: 2052588Ki
pods: 110
Allocatable:
cpu: 2
memory: 1950188Ki
pods: 110
Next, ssh to the node which is not ready and inspect the kubelet logs. The most likely errors relate to certificates and authentication.
You can also use journalctl on systemd hosts to check kubelet errors.
$ journalctl -u kubelet
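To decide which nodes to ssh into, you can filter the kubectl get nodes output for anything that is not Ready. A minimal sketch, assuming the default column layout with STATUS as the second column. The name not_ready_nodes is my own:

```shell
# Print the names of nodes whose STATUS is not Ready.
# Reads `kubectl get nodes` output on stdin. A status such as
# "Ready,SchedulingDisabled" still counts as Ready here.
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" && $2 !~ /^Ready,/ { print $1 }'
}

# Possible usage (not executed here):
#   kubectl get nodes | not_ready_nodes
```

For the output in the question this lists both kube-01 and kube-02, which fits a missing pod network, since that affects every node at once.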
Try this:
Your coredns is in the Pending state. Check the networking plugin you have used and check that the proper addons are added.
Check the Kubernetes troubleshooting guide:
https://kubernetes.io/docs/setup/independent/troubleshooting-kubeadm/#coredns-or-kube-dns-is-stuck-in-the-pending-state
https://kubernetes.io/docs/concepts/cluster-administration/addons/
And install the following with those
And check
kubectl get pods -n kube-system
On the off chance it might be the same for someone else, in my case, I was using the wrong AMI image to create the nodegroup.
Run
journalctl -u kubelet
Then check the node logs. If you get the error below, disable swap using swapoff -a
"Failed to run kubelet" err="failed to run Kubelet: running with swap on is not supported, please disable swap! or set --fa
Main process exited, code=exited, status=1/FAILURE
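If the journal is long, you can grep it for this specific message instead of reading it all. A small sketch that matches the error text quoted above. The function name kubelet_swap_error is hypothetical:

```shell
# Succeed if the kubelet log on stdin contains the "swap on is not supported" error.
kubelet_swap_error() {
    grep -q 'running with swap on is not supported'
}

# Possible usage on the affected node (not executed here):
#   journalctl -u kubelet --no-pager | kubelet_swap_error && echo "swap is enabled"
```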

Kubernetes worker node is in Not Ready state

I am comparatively new to Kubernetes, but I have successfully created many clusters before. Now I am facing an issue where I tried to add a node to an already existing cluster. At first kubeadm join seemed to be successful, but even after initializing the pod network only the master became Ready.
root@master# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master-virtual-machine Ready master 18h v1.9.0
testnode-virtual-machine NotReady <none> 16h v1.9.0
OS: Ubuntu 16.04
Any help will be appreciated.
Thanks.
Try the following on the slave node, then check the status again on the master.
> sudo swapoff -a
> exit

kubernetes service IPs not reachable

So I've got a Kubernetes cluster up and running using the Kubernetes on CoreOS Manual Installation Guide.
$ kubectl get no
NAME STATUS AGE
coreos-master-1 Ready,SchedulingDisabled 1h
coreos-worker-1 Ready 54m
$ kubectl get cs
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health": "true"}
etcd-2 Healthy {"health": "true"}
etcd-1 Healthy {"health": "true"}
$ kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
default curl-2421989462-h0dr7 1/1 Running 1 53m 10.2.26.4 coreos-worker-1
kube-system busybox 1/1 Running 0 55m 10.2.26.3 coreos-worker-1
kube-system kube-apiserver-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-controller-manager-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-proxy-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-proxy-coreos-worker-1 1/1 Running 0 58m 192.168.0.204 coreos-worker-1
kube-system kube-scheduler-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
$ kubectl get svc --all-namespaces
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes 10.3.0.1 <none> 443/TCP 1h
As with the guide, I've setup a service network 10.3.0.0/16 and a pod network 10.2.0.0/16. Pod network seems fine as busybox and curl containers get IPs. But the services network has problems. Originally, I've encountered this when deploying kube-dns: the service IP 10.3.0.1 couldn't be reached, so kube-dns couldn't start all containers and DNS was ultimately not working.
From within the curl pod, I can reproduce the issue:
[ root@curl-2421989462-h0dr7:/ ]$ curl https://10.3.0.1
curl: (7) Failed to connect to 10.3.0.1 port 443: No route to host
[ root@curl-2421989462-h0dr7:/ ]$ ip route
default via 10.2.26.1 dev eth0
10.2.0.0/16 via 10.2.26.1 dev eth0
10.2.26.0/24 dev eth0 src 10.2.26.4
It seems OK that there's only a default route in the container. As I understand it, the request (to the default route) should be intercepted by the kube-proxy on the worker node and forwarded to the proxy on the master node, where the IP is translated via iptables to the master's public IP.
There seems to be a common problem with a bridge/netfilter sysctl setting, but that seems fine in my setup:
core@coreos-worker-1 ~ $ sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 1
I'm having a really hard time troubleshooting this, as I lack an understanding of what the service IP is used for, how the service network is supposed to work in terms of traffic flow, and how best to debug it.
So here're the questions I have:
What is the 1st IP of the service network (10.3.0.1 in this case) used for?
Is above description of the traffic flow correct? If not, what steps does it take for a container to reach a service IP?
What are the best ways to debug each step in the traffic flow? (I can't get any idea what's wrong from the logs)
Thanks!
The Service network provides fixed IPs for Services. It is not a routable network (so don't expect ip ro to show anything, nor will ping work) but a collection of iptables rules managed by kube-proxy on each node (see iptables -L; iptables -t nat -L on the nodes, not Pods). These virtual IPs (see the pics!) act as a load-balancing proxy for endpoints (kubectl get ep), which are usually ports of Pods (but not always) with a specific set of labels as defined in the Service.
The first IP on the Service network is for reaching the kube-apiserver itself. It's listening on port 443 (kubectl describe svc kubernetes).
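To illustrate what "first IP" means, here is a small sketch that derives it from the service CIDR with plain shell arithmetic (network address plus one). The function name first_service_ip is made up, and it only handles IPv4:

```shell
# Print the first usable address of an IPv4 CIDR, i.e. network address + 1.
# For a service network this is the address the "kubernetes" Service gets.
first_service_ip() {
    cidr="$1"
    addr="${cidr%/*}"      # e.g. 10.3.0.0
    prefix="${cidr#*/}"    # e.g. 16
    # Split the dotted quad into the positional parameters $1..$4.
    old_ifs="$IFS"
    IFS=.
    set -- $addr
    IFS="$old_ifs"
    ip=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    first=$(( (ip & mask) + 1 ))
    printf '%d.%d.%d.%d\n' \
        $(( (first >> 24) & 255 )) $(( (first >> 16) & 255 )) \
        $(( (first >> 8) & 255 )) $(( first & 255 ))
}

first_service_ip 10.3.0.0/16   # prints 10.3.0.1
```

That matches the CLUSTER-IP of the kubernetes Service in your kubectl get svc output.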
Troubleshooting is different on each network/cluster setup. I would generally check:
Is kube-proxy running on each node? On some setups it's run via systemd and on others there is a DaemonSet that schedules a Pod on each node. On your setup it is deployed as static Pods created by the kubelets themselves from /etc/kubernetes/manifests/kube-proxy.yaml
Locate the logs for kube-proxy and look for clues (can you post some?)
Change kube-proxy into userspace mode. Again, the details depend on your setup. For you it's in the file I mentioned above. Append --proxy-mode=userspace as a parameter on each node
Is the overlay (pod) network functional?
If you leave comments I will get back to you..
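One more check that may help while kube-proxy is in iptables mode: see whether a NAT rule for the service IP exists at all on the node. This is only a sketch and assumes kube-proxy writes its per-service rules into a KUBE-SERVICES chain with a -d <ip>/32 match, which is what I have seen in iptables mode. The function name has_service_rule is hypothetical:

```shell
# Succeed if the NAT rules on stdin contain a KUBE-SERVICES rule whose
# destination (-d) is the given service IP.
# Assumption: iptables proxy mode, rules in `iptables-save` format.
has_service_rule() {
    awk -v ip="$1/32" '
        $1 == "-A" && $2 == "KUBE-SERVICES" {
            for (i = 3; i < NF; i++)
                if ($i == "-d" && $(i + 1) == ip) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Possible usage on a node, as root (not executed here):
#   iptables-save -t nat | has_service_rule 10.3.0.1 || echo "no rule for 10.3.0.1"
```

If no rule shows up on the worker, kube-proxy is probably not programming iptables there, and its logs are the next place to look.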
I had this same problem, and the ultimate solution that worked for me was enabling IP forwarding on all nodes in the cluster, which I had neglected to do.
$ sudo sysctl net.ipv4.ip_forward=1
net.ipv4.ip_forward = 1
Service IPs and DNS started working immediately afterwards.
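To check the current setting on each node before changing anything, something like the sketch below works. The function name ip_forward_enabled and its optional path argument are my own:

```shell
# Succeed if IPv4 forwarding is enabled, based on the kernel flag file.
# The optional argument lets you point at another file for testing.
ip_forward_enabled() {
    flag_file="${1:-/proc/sys/net/ipv4/ip_forward}"
    [ "$(cat "$flag_file")" = "1" ]
}

# Possible usage on each node (not executed here):
#   ip_forward_enabled || sudo sysctl net.ipv4.ip_forward=1
```

Keep in mind that a value set with the sysctl command is lost on reboot unless it is also written to a sysctl configuration file, for example one under /etc/sysctl.d/.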
I had the same issue; it turned out to be a configuration issue in kube-proxy.yaml. For the "master" parameter I had the IP address, as in - --master=192.168.3.240, but it actually needed to be a URL, like - --master=https://192.168.3.240
FYI my kube-proxy successfully uses --proxy-mode=iptables (v1.6.x)
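A trivial sanity check for that value, in case it helps someone. It only tests for an http:// or https:// prefix, and the function name has_url_scheme is made up:

```shell
# Succeed if the value starts with an http:// or https:// scheme.
has_url_scheme() {
    case "$1" in
        http://*|https://*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible usage with the value taken from kube-proxy.yaml:
#   has_url_scheme "192.168.3.240" || echo "--master needs a URL with a scheme"
```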

kubectl get nodes not showing workers

I am following this tutorial with 2 vms running CentOS7. Everything looks fine (no errors during installation/setup) but I can't see my nodes.
NOTE:
I am running this on VMWare VMs
kub1 is my master and kub2 my worker node
kubectl cluster-info output:
[root@kub1 ~]# kubectl cluster-info
Kubernetes master is running at http://kub1:8080
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
[root@kub2 ~]# kubectl cluster-info
Kubernetes master is running at http://kub1:8080
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
nodes:
[root@kub1 ~]# kubectl get nodes
[root@kub1 ~]# kubectl get nodes -a
[root@kub1 ~]#
[root@kub2 ~]# kubectl get nodes -a
[root@kub2 ~]# kubectl get no
[root@kub2 ~]#
cluster events:
[root@kub1 ~]# kubectl get events -a
LASTSEEN FIRSTSEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
1h 1h 1 kub2.local Node Normal Starting {kube-proxy kub2.local} Starting kube-proxy.
1h 1h 1 kub2.local Node Normal Starting {kube-proxy kub2.local} Starting kube-proxy.
1h 1h 1 kub2.local Node Normal Starting {kubelet kub2.local} Starting kubelet.
1h 1h 1 node-kub2 Node Normal Starting {kubelet node-kub2} Starting kubelet.
1h 1h 1 node-kub2 Node Normal Starting {kubelet node-kub2} Starting kubelet.
/var/log/messages:
kubelet.go:1194] Unable to construct api.Node object for kubelet: can't get ip address of node node-kub2: lookup node-kub2: no such host
QUESTION: any idea why my nodes are not shown using "kubectl get nodes"?
My issue was that the KUBELET_HOSTNAME value in /etc/kubernetes/kubelet didn't match the hostname.
I commented out that line, restarted the services, and could see my worker after that.
Hope that helps.
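To spot that kind of mismatch quickly, you can pull the override out of the file and compare it with the real hostname. This sketch assumes the line has the form KUBELET_HOSTNAME="--hostname-override=<name>", which is how it looked in my file. The function name kubelet_hostname_override is hypothetical:

```shell
# Print the --hostname-override value from an uncommented KUBELET_HOSTNAME line.
# Assumption: the line looks like KUBELET_HOSTNAME="--hostname-override=<name>".
kubelet_hostname_override() {
    sed -n 's/^KUBELET_HOSTNAME=.*--hostname-override=\([^" ]*\).*/\1/p' "$1"
}

# Possible usage on the worker (not executed here):
#   override=$(kubelet_hostname_override /etc/kubernetes/kubelet)
#   [ -z "$override" ] || [ "$override" = "$(hostname)" ] || echo "mismatch: $override"
```

In the question, the override would come out as node-kub2, which the log shows cannot be resolved ("lookup node-kub2: no such host").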
Not sure about your scenario, but I solved mine after 3-4 hours of effort.
Solved
I was facing this issue because my Docker cgroup driver was different from the Kubernetes cgroup driver.
I just updated it to cgroupfs using the following commands mentioned in the docs.
cat << EOF > /etc/docker/daemon.json
{
"exec-opts": ["native.cgroupdriver=cgroupfs"]
}
EOF
Restart the Docker service: service docker restart
Reset Kubernetes on the slave node: kubeadm reset
Join the master again: kubeadm join <><>
After that the node was visible on the master using kubectl get nodes.
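To see which cgroup driver Docker is using before and after the change, you can parse the docker info output. A minimal sketch that assumes docker info prints a "Cgroup Driver: <name>" line. The function name docker_cgroup_driver is my own:

```shell
# Print the cgroup driver reported in `docker info` output read from stdin.
# Assumes a line of the form "Cgroup Driver: <name>", possibly indented.
docker_cgroup_driver() {
    awk -F': *' '/Cgroup Driver/ { print $2; exit }'
}

# Possible usage (not executed here):
#   docker info 2>/dev/null | docker_cgroup_driver
```

The value printed should be the same as the cgroup driver the kubelet is configured with, otherwise the node does not register properly.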
I had a similar problem after installing k8s using kubespray on Fedora 31. To debug the issue, I tried to run a random container directly using docker run, which failed with:
docker: Error response from daemon: cgroups: cgroup mountpoint does not exist: unknown.
This is a known problem caused by the cgroup version on Fedora 31, and the fix is to update grub to use the previous version:
sudo dnf install grubby
sudo grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=0"