Grafana showing k8s pods down for a minute - kubernetes

While using Grafana for monitoring with Prometheus, we saw that Grafana sometimes showed no pods for a service, but when I checked in the cluster, all pods were running without any issue.
This issue is not continuous. Now I have to find out why Grafana is alerting, but I don't know where to start.
Please ask if any info is needed, and please show me where I can start investigating.
Other info
This cluster is AWS EKS, using prometheus:v2.22.1. Both the Prometheus deployment and the EKS cluster are managed by Terraform.
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4", GitCommit:"b695d79d4f967c403a96986f1750a35eb75e75f1", GitTreeState:"clean", BuildDate:"2021-11-17T15:48:33Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.20-eks-8c49e2", GitCommit:"8c49e2efc3cfbb7788a58025e679787daed22018", GitTreeState:"clean", BuildDate:"2021-10-17T05:13:46Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.22) and server (1.18) exceeds the supported minor version skew of +/-1
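Not a full answer, but one place to start: Grafana only renders what Prometheus returns, so run the dashboard panel's query directly against Prometheus around the time of a gap. A minimal sketch, assuming kube-state-metrics is being scraped, Prometheus is reachable on localhost:9090 via a port-forward, and the service/pod names are hypothetical:
$ kubectl port-forward svc/prometheus-server 9090:9090 &
$ # Did Prometheus itself see the pods as Running at the time of the alert?
$ curl -sG 'http://localhost:9090/api/v1/query' \
    --data-urlencode 'query=kube_pod_status_phase{phase="Running",pod=~"my-service.*"}'
$ # Or did the scrape targets drop out, making the pods "disappear" without being down?
$ curl -sG 'http://localhost:9090/api/v1/query' \
    --data-urlencode 'query=up{pod=~"my-service.*"}'
If up drops to 0 (or the series vanish) while kubectl shows the pods healthy, the problem is on the scrape/service-discovery side rather than with the pods themselves.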

Related

Nginx Ingress Controller - Failed to watch *v1.EndpointSlice:

I have deployed the Nginx Ingress Controller in an EKS cluster (v1.20.15-eks) using the helm chart from https://artifacthub.io/packages/helm/ingress-nginx/ingress-nginx/, version 4.4.2.
The controller is deployed successfully, but when creating an Ingress object I get the error below.
W0206 09:46:11.909381 8 reflector.go:424] k8s.io/client-go@v0.25.3/tools/cache/reflector.go:169: failed to list *v1.EndpointSlice: the server could not find the requested resource
E0206 09:46:11.909410 8 reflector.go:140] k8s.io/client-go@v0.25.3/tools/cache/reflector.go:169: Failed to watch *v1.EndpointSlice: failed to list *v1.EndpointSlice: the server could not find the requested resource
The kubectl version output is:
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.0", GitCommit:"af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38", GitTreeState:"clean", BuildDate:"2020-12-08T17:59:43Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.15-eks-fb459a0", GitCommit:"165a4903196b688886b456054e6a5d72ba8cddf4", GitTreeState:"clean", BuildDate:"2022-10-24T20:31:58Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}
Can anyone help me with this? Thank you in advance.
Chart version 4.4.2 has application version 1.5.1.
Version 1.5.1 of ingress-nginx only supports Kubernetes 1.25, 1.24, and 1.23.
For Kubernetes v1.20 the latest supported controller version is v1.3.1.
The chart version for v1.3.1 is v4.2.5.
The error you are facing is due to the controller not finding v1.EndpointSlice: the EndpointSlice API went GA (v1) in Kubernetes v1.21, as can be seen here, so on earlier versions it is only served as alpha/beta, not v1.
Please refer to the supported-versions table here.
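For reference, a minimal sketch of pinning the chart to the matching version with helm (the release name and namespace are hypothetical):
$ helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
$ helm repo update
$ helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
    --namespace ingress-nginx --create-namespace --version 4.2.5
$ # Verify the controller image tag is now v1.3.1
$ kubectl -n ingress-nginx get deployment -o wide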

k8s job pod resource usage

For regular pods (in the Running state), we can check the actual runtime resource utilisation with the kubectl top pod <pod_name> command.
However, for Job pods whose execution is already complete, is there any way to fetch how much resources those pods consumed?
Getting this info would help to tune the resource allocation better, and to see whether we are over- or under-provisioning the requests for the Job pods.
Kubernetes version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"clean", BuildDate:"2022-01-25T21:25:17Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.13", GitCommit:"a43c0904d0de10f92aa3956c74489c45e6453d6e", GitTreeState:"clean", BuildDate:"2022-08-17T18:23:45Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
If there is no direct way, maybe there is a workaround to get this info.
There is no built-in command that can show a Job pod's resource utilisation after it has finished. The only options are an external tool such as Prometheus, or a sidecar container that logs resource usage.
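If Prometheus is already scraping the kubelet/cAdvisor metrics, a minimal sketch of querying the peak usage of a finished Job pod (the pod name and the 1h window are hypothetical, Prometheus is assumed reachable on localhost:9090, and the series must still be within retention):
$ # Peak working-set memory of the Job pod over the last hour
$ curl -sG 'http://localhost:9090/api/v1/query' \
    --data-urlencode 'query=max_over_time(container_memory_working_set_bytes{pod="my-job-abc123"}[1h])'
$ # Average CPU usage in cores over the same window (PromQL subquery)
$ curl -sG 'http://localhost:9090/api/v1/query' \
    --data-urlencode 'query=avg_over_time(rate(container_cpu_usage_seconds_total{pod="my-job-abc123"}[5m])[1h:5m])'
Comparing these numbers against the Job's requests and limits shows whether it is over- or under-provisioned.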

istio v1.9 - istio-proxy envoy config grpc: received message larger than max

I'm getting this error message using Istio v1.9 on a Kubernetes cluster. I tried searching for it online but haven't found anything related to Istio, a config that I could modify, or an additional log to analyze. Do you have an idea of what could be happening?
I have a few apps running, but nothing really big, just one pod for each of a few apps. I'm not sure which limit I'm reaching here. Could this be an Istio limit?
istio proxy sidecar log:
....
....
istio-proxy 2021-02-22T13:44:03.255958Z warning envoy config StreamSecrets gRPC config stream closed: 8, grpc: received message larger than max (XXXXXXX vs. 4194304)
....
$ istioctl version
client version: 1.9.0
control plane version: 1.9.0
data plane version: 1.9.0 (5 proxies)
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:50:19Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", BuildDate:"2020-09-16T13:32:58Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}

"kubectl describe ingress ..." could not find the requested resource

I am trying to execute describe on an ingress, but it does not work. The get command works fine, but not describe. Is there anything that I am doing wrong? I am running this against AKS.
usr@test:/mnt/c/Repos/user/charts/canary$ kubectl get ingress
NAME            HOSTS                           ADDRESS   PORTS   AGE
ingress-route   xyz.westus.cloudapp.azure.com             80      6h
usr@test:/mnt/c/Repos/user/charts/canary$ kubectl describe ingress ingress-route
Error from server (NotFound): the server could not find the requested resource
Version seems fine:
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.3", ..}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.10"...}
This could be caused by an incompatibility between the Kubernetes cluster version and the kubectl version.
Run kubectl version to print the client and server versions for the current context, for example:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.1", GitCommit:"d647ddbd755faf07169599a625faf302ffc34458", GitTreeState:"clean", BuildDate:"2019-10-02T17:01:15Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.10-gke.0", GitCommit:"569511c9540f78a94cc6a41d895c382d0946c11a", GitTreeState:"clean", BuildDate:"2019-08-21T23:28:44Z", GoVersion:"go1.11.13b4", Compiler:"gc", Platform:"linux/amd64"}
You might want to upgrade your cluster version or downgrade your kubectl version. See more details in https://github.com/kubernetes/kubectl/issues/675
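If you choose to downgrade, a minimal sketch of fetching a kubectl binary that matches the server's minor version (v1.13.10 here; the install path is an arbitrary choice):
$ curl -LO https://dl.k8s.io/release/v1.13.10/bin/linux/amd64/kubectl
$ chmod +x kubectl && sudo mv kubectl /usr/local/bin/kubectl-1.13
$ kubectl-1.13 describe ingress ingress-route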

Kubectl Unable to describe on HPA

When I try to describe an HPA, the following error is thrown:
kubectl describe hpa go-auth
Error from server (NotFound): the server could not find the requested resource
My kubectl version is :
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:11:31Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12+", GitVersion:"v1.12.7-gke.7", GitCommit:"b80664a77d3bce5b4701bc881d972b1a702290bf", GitTreeState:"clean", BuildDate:"2019-04-04T03:12:09Z", GoVersion:"go1.10.8b4", Compiler:"gc", Platform:"linux/amd64"}
Beware of kubectl version skew. Running kubectl v1.14 with kube-apiserver v1.12 is not supported.
As per kubectl docs:
You must use a kubectl version that is within one minor version
difference of your cluster. For example, a v1.2 client should work
with v1.1, v1.2, and v1.3 master. Using the latest version of kubectl
helps avoid unforeseen issues.
Give it another try using kubectl v1.12.x and you will probably get rid of this problem. Also, take a look at issue #568 (especially this comment), which addresses the same problem that you have.
If you are wondering on how to manage multiple kubectl versions, I recommend this read: Using different kubectl versions with multiple Kubernetes clusters.
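For completeness, a minimal sketch of keeping several kubectl versions side by side and selecting one per cluster (the paths and alias name are arbitrary choices):
$ curl -Lo ~/bin/kubectl-1.12 https://dl.k8s.io/release/v1.12.7/bin/linux/amd64/kubectl
$ chmod +x ~/bin/kubectl-1.12
$ # Use the version-matched client when talking to this cluster
$ alias kubectl-gke='~/bin/kubectl-1.12'
$ kubectl-gke describe hpa go-auth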