In our environment we usually use a canary mechanism for our microservices: we implement one Service that points to the pods of two different DeploymentConfigs (DCs). We found an issue: after we remove one of the DCs from the Service, we can still see traffic reaching the removed DC's pods. This seems to persist indefinitely; the traffic is not redistributed to the remaining (enabled) DC even after we wait a couple of hours.
The traffic is only distributed to the enabled DC's pods after we restart the pods that call the Service. We use OpenShift 3.6 in our environment.
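To make the setup concrete, here is a rough, hypothetical sketch of the pattern we use; names, labels, and images are placeholders, not our real manifests:

```yaml
# Hypothetical canary pattern: one Service selects pods from two
# DeploymentConfigs via a shared label; names and images are placeholders.
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-service          # matches pods from both DCs
  ports:
  - port: 8080
    targetPort: 8080
---
apiVersion: v1
kind: DeploymentConfig
metadata:
  name: my-service-stable
spec:
  replicas: 2
  selector:
    app: my-service
    version: stable
  template:
    metadata:
      labels:
        app: my-service
        version: stable
    spec:
      containers:
      - name: app
        image: registry.example.com/my-service:stable
---
apiVersion: v1
kind: DeploymentConfig
metadata:
  name: my-service-canary
spec:
  replicas: 1
  selector:
    app: my-service
    version: canary
  template:
    metadata:
      labels:
        app: my-service
        version: canary
    spec:
      containers:
      - name: app
        image: registry.example.com/my-service:canary
# Removing one DC from the Service means either narrowing the Service
# selector (e.g. adding version: stable) or scaling/deleting that DC, after
# which its pods should drop out of the Service endpoints.
```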
I have also attached the flow of the service for this case.
We badly need help with this issue.
If kube-proxy is down, the pods on a Kubernetes node will not be able to communicate with the external world. Does Kubernetes do anything special to guarantee the reliability of kube-proxy?
Similarly, how does Kubernetes guarantee the reliability of the kubelet?
It guarantees their reliability by:
Having multiple nodes: If one kubelet crashes, one node goes down. Similarly, every node runs a kube-proxy instance, so losing one node means losing the kube-proxy instance on that node. Kubernetes is designed to handle node failures. And if you designed the app you run on Kubernetes to be scalable, you will not be running it as a single instance but as multiple instances; kube-scheduler will distribute your workload across multiple nodes, which means your application will still be accessible (see the sketch after these points).
Supporting a highly-available setup: If you set up your Kubernetes cluster in high-availability mode properly, there won't be one master node but several. This means you can even tolerate losing some master nodes. The managed Kubernetes offerings of the cloud providers are always highly available.
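To illustrate the first point, here is a minimal, hypothetical Deployment that runs several replicas and asks kube-scheduler to spread them across nodes, so losing a single node's kubelet and kube-proxy does not take the application offline:

```yaml
# Hypothetical example: 3 replicas spread across nodes via anti-affinity,
# so a single node (and its kubelet/kube-proxy) failing leaves the app up.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: web
              topologyKey: kubernetes.io/hostname
      containers:
      - name: web
        image: nginx:1.25
        ports:
        - containerPort: 80
```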
These are the first two things that come to my mind. However, this is a broad question, so I can go into more detail if you elaborate a bit on what you mean by "reliability".
I faced an issue with Kubernetes after an OOM on the master node. The Kubernetes services looked OK and there were no error or warning messages in the logs, but Kubernetes failed to process a new deployment which was created after the OOM happened.
I reloaded Kubernetes with systemctl restart kube-*, and that solved the issue; Kubernetes began working normally again.
I just wonder: is this expected behavior or a bug in Kubernetes?
It would be great if you could share the kube-controller's log. When the API server crashes or is OOMKilled, there can be synchronization problems in early versions of Kubernetes (I remember we saw similar problems with DaemonSets, and I have a bug filed with the Kubernetes community), but they are rare.
Meanwhile, we did a lot of work to make Kubernetes production-ready: both tuning Kubernetes and crafting the other microservices that need to talk to Kubernetes. I hope these blog entries help:
https://applatix.com/making-kubernetes-production-ready-part-2/ This is about the 30+ knobs we used to tune Kubernetes
https://applatix.com/making-kubernetes-production-ready-part-3/ This is about microservice behavior to ensure cluster stability
It seems the problem wasn't caused by the OOM. It was caused by kube-controller, regardless of whether the OOM happened or not.
If I restart kube-controller, Kubernetes begins processing deployments and pods normally again.
We have an application with 4 pods running behind a load balancer. We want to try a rolling update, but we are not sure what happens when a pod goes down. The documentation is unclear. In particular, this quote from Termination of Pods:
Pod is removed from endpoints list for service, and are no longer considered part of the set of running pods for replication controllers. Pods that shutdown slowly can continue to serve traffic as load balancers (like the service proxy) remove them from their rotations.
So, it would help if someone could guide us on the following questions:
1.) When a pod is shutting down, can it still serve new requests? Or does the load balancer stop considering it?
2.) Does it complete the requests it is processing until the grace period is exhausted, and then kill the container even if a process is still running?
3.) Also, the quote mentions replication controllers; what we have is a Deployment, and a Deployment uses ReplicaSets, so will there be any difference?
We went through this question, but the answers are conflicting and cite no sources: Does a Kubernetes rolling-update gracefully remove pods from a service load balancer
1) When a Pod is shutting down, its state is changed to Terminating and it is no longer considered by the load balancer, as described in the Pod termination docs.
2) Yes; once the grace period is exhausted, the container is killed even if a process is still running. You might want to look at the pod.Spec.TerminationGracePeriodSeconds configuration to gain some control (see the sketch after these answers). You'll find details in the API documentation.
3) No. The ReplicaSet and the Deployment take care of scheduling Pods; there's no difference when it comes to the shutdown behaviour of the Pods.
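To illustrate point 2, here is a minimal, hypothetical pod template that raises the grace period and adds a preStop hook to drain in-flight requests; the values are examples, not recommendations:

```yaml
# Hypothetical example: the container gets up to 60s to finish in-flight
# requests after the Pod enters Terminating; then it is killed regardless.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 4
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      terminationGracePeriodSeconds: 60   # default is 30 seconds
      containers:
      - name: my-app
        image: registry.example.com/my-app:1.0
        lifecycle:
          preStop:
            exec:
              command: ["sh", "-c", "sleep 10"]  # crude drain delay before SIGTERM
```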
I've been trying to figure out what happens when the Kubernetes master fails in a cluster that only has one master. Do web requests still get routed to pods if this happens, or does the entire system just shut down?
According to the OpenShift 3 documentation, which is built on top of Kubernetes (https://docs.openshift.com/enterprise/3.2/architecture/infrastructure_components/kubernetes_infrastructure.html), if a master fails, nodes continue to function properly, but the system loses its ability to manage pods. Is this the same for vanilla Kubernetes?
In typical setups, the master nodes run both the API and etcd and are either largely or fully responsible for managing the underlying cloud infrastructure. When they are offline or degraded, the API will be offline or degraded.
In the event that they, etcd, or the API are fully offline, the cluster ceases to be a cluster and is instead a bunch of ad-hoc nodes for this period. The cluster will not be able to respond to node failures, create new resources, move pods to new nodes, etc. until both of the following hold:
Enough etcd instances are back online to form a quorum and make progress (for a visual explanation of how this works and what these terms mean, see this page).
At least one API server can service requests
In a partially degraded state, the API server may be able to respond to requests that only read data.
However, in any case, life for applications will continue as normal unless nodes are rebooted or there is a dramatic failure of some sort during this time, because TCP/UDP services, load balancers, DNS, the dashboard, etc. should all continue to function for at least some time. Eventually, these things will all fail on different timescales. In single-master setups or a complete API failure, DNS failure will probably happen first as caches expire (on the order of minutes, though the exact timing is configurable; see the coredns cache plugin documentation). This is a good reason to consider a multi-master setup: DNS and service routing can continue to function indefinitely in a degraded state, even if etcd can no longer make progress.
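For reference, the knob in question lives in the CoreDNS Corefile; an illustrative (not authoritative) ConfigMap looks roughly like this, where the cache plugin line controls how long answers are kept:

```yaml
# Illustrative CoreDNS ConfigMap; "cache 30" is the cache plugin TTL (seconds)
# mentioned above and can be tuned.
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        cache 30
        forward . /etc/resolv.conf
    }
```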
There are actions you could take as an operator which would accelerate failures, especially in a fully degraded state. For instance, rebooting a node would cause DNS queries, and in fact probably all pod and service networking functionality, to fail until at least one master comes back online. Restarting DNS pods or kube-proxy would also be bad.
If you'd like to test this out yourself, I recommend kubeadm-dind-cluster, kind, or, for more exotic setups, kubeadm on VMs or bare metal. Note: kubectl proxy will not work during API failure, as that routes traffic through the master(s).
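For example, a hypothetical kind configuration for experimenting with master failure (stop one or more control-plane node containers and watch what keeps working) could look like:

```yaml
# Hypothetical kind config: 3 control-plane nodes and 2 workers, so you can
# stop control-plane containers one by one and observe the behaviour.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: control-plane
- role: control-plane
- role: worker
- role: worker
```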
A Kubernetes cluster without a master is like a company running without a manager.
No one can instruct the workers (the k8s components) other than the manager (the master node); even you, the owner of the cluster, can only instruct the manager.
Everything works as usual until the work is finished or something stops them (because the master node died after assigning the work).
As there is no manager to re-assign any work to them, the workers will wait and wait until the manager comes back.
The best practice is to assign multiple managers (masters) to your cluster.
Although your data plane and running applications do not immediately start breaking, there are several scenarios where cluster admins will wish they had a multi-master setup. The key to understanding the impact is understanding which components talk to the master, for what and how, and, more importantly, when they will fail if the master fails.
Although your application pods running on the data plane will not be immediately impacted, imagine a very possible scenario: your traffic suddenly surges and your Horizontal Pod Autoscaler kicks in. The autoscaling will not work, because the Metrics Server collects resource metrics from the kubelets and exposes them in the Kubernetes API server through the Metrics API for use by the Horizontal Pod Autoscaler and the Vertical Pod Autoscaler, but your API server is already dead. If your pods' memory shoots up because of the high load, they will eventually be killed by the k8s OOM killer. If any of the pods die, then, since the controller manager and the scheduler talk to the API server to watch the current state of pods, they too will fail to react. In short, a new pod will not be scheduled and your application may stop responding.
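For instance, a typical HorizontalPodAutoscaler (a hypothetical example below) can only act while the API server and the Metrics API are reachable; with the control plane down it simply stops adjusting replicas:

```yaml
# Hypothetical example: this HPA relies on the Metrics API and the API server
# to scale the target Deployment; with the master down it cannot act.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```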
One thing to highlight is that the Kubernetes system components communicate only with the API server; they don't talk to each other directly, so if the API server is down their own functionality is effectively impaired as well. An unavailable control plane can mean several things: failure of any or all of these components (API server, etcd, kube-scheduler, controller manager), or, at worst, the entire master node having crashed.
If the API server is unavailable, no one can use kubectl, as essentially all commands talk to the API server (meaning you cannot connect to the cluster, cannot log into any pods to check anything on a container's file system, and will not be able to see application logs unless you have an additional centralized log management system).
If the etcd database has failed or got corrupted, your entire cluster state is gone, and admins will want to restore it from backups as early as possible.
In short, a failed single-master control plane may not immediately impact your traffic-serving capability, but it cannot be relied on to keep serving your traffic.