kube-apiserver high CPU and requests - kubernetes

We have a Kubernetes 1.7.8 cluster deployed with Kops 1.7 in HA mode with three masters. The cluster has 10 nodes and around 400 pods.
The cluster has heapster, prometheus, and ELK (collecting logs for some pods).
We are seeing very high activity on the masters, with the api-server using over 90% of CPU.
Checking Prometheus numbers, we can see that nearly 5,000 requests to the kube-apiserver are WATCH verbs, while every other verb (GET, LIST, PATCH, PUT) accounts for fewer than 50 requests.
Almost all requests are reported with the client "Go-Http-client/2.0" (the default user agent of the Go HTTP library).
Is this a normal situation?
How can we determine which pods are sending these requests? (And how can we add the source IP to the kube-apiserver logs?)
[kube-apiserver.manifest][1]
Thanks,
Charles
[1]: https://pastebin.com/nGxSXuZb

Given the Kubernetes architecture, this is normal behavior: all cluster components watch the api-server for changes.
That is why you have more than 5,000 WATCH entries in your logs. Please take a look at how the cluster is managed by the kube-apiserver and how master-node communication is organized.
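If you want to see exactly who is issuing the requests, one option is apiserver audit logging, which records the source IPs and user agent of every request (advanced auditing became beta in Kubernetes 1.8; on 1.7 it sits behind the AdvancedAuditing feature gate). A minimal policy sketch, wired into the kube-apiserver manifest via the --audit-policy-file and --audit-log-path flags:

```yaml
# Illustrative audit policy: log metadata only (verb, request URI,
# user agent, source IPs) for every request, keeping log volume low.
apiVersion: audit.k8s.io/v1beta1
kind: Policy
rules:
  - level: Metadata
```

With this in place, each audit log entry carries a sourceIPs field that lets you map the WATCH traffic back to specific nodes and pods.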

Related

What is node/proxy subresource in kubernetes?

You can find mentions of that resource in the following questions: 1, 2. But I am not able to figure out what this resource is used for.
Yes, it's true, the link to the documentation provided in the comments might be confusing, so let me try to clarify this for you.
As per the official documentation the apiserver proxy:
is a bastion built into the apiserver
connects a user outside of the cluster to cluster IPs which otherwise might not be reachable
runs in the apiserver processes
client to proxy uses HTTPS (or http if apiserver so configured)
proxy to target may use HTTP or HTTPS as chosen by proxy using available information
can be used to reach a Node, Pod, or Service
does load balancing when used to reach a Service
So, answering your question - setting the node/proxy resource in a ClusterRole allows Kubernetes services to access kubelet endpoints on a specific node and path.
As per the official documentation:
There are two primary communication paths from the control plane
(apiserver) to the nodes. The first is from the apiserver to the
kubelet process which runs on each node in the cluster. The second is
from the apiserver to any node, pod, or service through the
apiserver's proxy functionality.
The connections from the apiserver to the kubelet are used for:
fetching logs for pods
attaching (through kubectl) to running pods
providing the kubelet's port-forwarding functionality
Here are also a few working examples of using the node/proxy resource in a ClusterRole:
How to Setup Prometheus Monitoring On Kubernetes Cluster
Running Prometheus on Kubernetes
It is hard to find information about this sub-resource in the official Kubernetes documentation.
In the context of RBAC, the format node/proxy can be used to grant access to the sub-resource named proxy of the node resource. The same access can also be granted for pods and services.
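For example, a ClusterRole like the following (the name is illustrative; this is the shape the linked Prometheus setups use to let a scraper reach kubelet endpoints such as /metrics through the apiserver proxy):

```yaml
# Illustrative ClusterRole granting read access to the proxy
# sub-resource of nodes, pods, and services.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: proxy-reader
rules:
  - apiGroups: [""]
    resources: ["nodes/proxy", "pods/proxy", "services/proxy"]
    verbs: ["get", "list", "watch"]
```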
We can see it in the output of available resources from the Kubernetes API server (API version: v1.21.0):
===/api/v1===
...
nodes/proxy
...
pods/proxy
...
services/proxy
...
Detailed information about the usage of the proxy sub-resource can be found in The Kubernetes API reference (depending on the version you use), in the Proxy Operations section for each of the mentioned resources: pods, nodes, services.

Kubernetes Service not distributing the traffic evenly among pods

I am using Kubernetes v1.20.10 on a bare-metal installation with one master node and 3 worker nodes. The application simply serves HTTP requests.
I am scaling the deployment with a Horizontal Pod Autoscaler (HPA), and I noticed that the load is not distributed evenly across pods: the first pod receives about 95% of the load while the others receive very little.
I tried the answer mentioned here, but it did not work: Kubernetes service does not distribute requests between pods
Based on the information provided, I assume that you are using HTTP keep-alive, i.e. a persistent TCP connection.
A Kubernetes Service distributes load per (new) TCP connection. With persistent connections, only additional connections are distributed across pods, which is the effect you observe.
Try disabling HTTP keep-alive, or set the maximum keep-alive time to something like 15 seconds and the maximum number of requests per connection to 50.

Deploying a stateless Go app with Redis on Kubernetes

I have deployed a stateless Go web app with Redis on Kubernetes. The Redis pod is running fine, but the application pod fails with the error dial tcp: i/o timeout in its log. Thank you!!
Please take look: aks-vm-timeout.
Make sure that the default network security group isn't modified and that both port 22 and 9000 are open for connection to the API server. Check whether the tunnelfront pod is running in the kube-system namespace using the kubectl get pods --namespace kube-system command.
If it isn't, force deletion of the pod and it will restart.
Also make sure the Redis port is open.
More info about troubleshooting: dial-backend-troubleshooting.
EDIT:
Answering on your question about tunnelfront:
tunnelfront is an AKS system component that's installed on every cluster that helps to facilitate secure communication from your hosted Kubernetes control plane and your nodes. It's needed for certain operations like kubectl exec, and will be redeployed to your cluster on version upgrades.
Speaking about VM:
I would SSH into it and start watching the disk I/O latency using bpf/bcc tools and the docker/kubelet logs.

latency based routing for service endpoints in kubernetes cluster

We have a single Kubernetes cluster with worker nodes in multiple data centres in different geographic areas.
We have a Service endpoint that connects to application pods spread across these data centres. Say application A has 2 pods running in data centre Y, 2 pods in data centre Z, and 2 pods in data centre X; when a request lands on the Service endpoint, it is routed to any of these 6 pods.
We want to implement latency-based routing for Service endpoints, so that when a request lands on a worker node it is routed to the nearest pod, or the pod with the lowest network latency.
Any suggestions or guidance are much appreciated.
Use kube-proxy in IPVS mode with the sed (shortest expected delay) scheduler.
Refer: https://kubernetes.io/docs/concepts/services-networking/service/#proxy-mode-ipvs
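A KubeProxyConfiguration fragment sketching this setup:

```yaml
# Illustrative kube-proxy configuration: IPVS mode with the
# "sed" (shortest expected delay) scheduler.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
ipvs:
  scheduler: sed
```

Note that IPVS sed picks the backend with the smallest expected queueing delay (active connections relative to weight), not the lowest measured network latency, so it approximates rather than directly implements latency-based routing.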

How does the failover mechanism work in kubernetes service?

According to some of the tech blogs (e.g. Understanding kubernetes networking: services), k8s service dispatch all the requests through iptable rules.
What if one of the upstream pods crashes while a request is being routed to it?
Is there a failover mechanism in the Kubernetes Service?
Will the request be forwarded to the next pod automatically?
How does kubernetes solve this through iptable?
Kubernetes offers a simple Endpoints API that is updated whenever the set of Pods in a Service changes. For non-native applications, Kubernetes offers a virtual-IP-based bridge to Services which redirects to the backend Pods.
Here is the detail k8s service & endpoints
So the answer is the Endpoints object:
kubectl get endpoints,services,pods
There are liveness and readiness checks that decide whether a pod is able to process requests. The kubelet (with Docker) has a mechanism to control the life cycle of pods. If the pod is healthy, it is part of the Endpoints object.
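A readiness probe is what ties a pod's health to the Endpoints object; while the probe fails, the pod is removed from Endpoints and kube-proxy's iptables rules stop routing new connections to it. A minimal sketch (image and health path are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
    - name: app
      image: nginx            # placeholder image
      readinessProbe:
        httpGet:
          path: /healthz      # assumed health endpoint
          port: 80
        periodSeconds: 5
        failureThreshold: 2
```

Note that a request already in flight to a pod that crashes is not retried by the Service; failover here means new connections stop being sent to the unhealthy pod once it drops out of Endpoints.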