kubectl commands to GKE Autopilot sometimes return a Forbidden error - kubernetes

env
GKE Autopilot v1.22.12-gke.2300
kubectl run from an Ubuntu 20.04 VM
using gke-gcloud-auth-plugin
what happens
kubectl commands sometimes return a (Forbidden) error, e.g.:
kubectl get pod
Error from server (Forbidden): pods is forbidden: User "my-mail#domain.com" cannot list resource "pods" in API group "" in the namespace "default": GKEAutopilot authz: the request was sent before policy enforcement is enabled
It doesn't happen every time (roughly 40% of requests), so it can't simply be an IAM problem.
Previously, on GKE Autopilot v1.21.xxxx, this error didn't happen, or at least not nearly as frequently.
I couldn't find any helpful information searching for "GKEAutopilot authz" or "the request was sent before policy enforcement is enabled".
I'd appreciate any ideas from anyone who has faced the same issue.
Thank you in advance.
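Until a fix lands server-side, a transient error like this can usually be papered over with a client-side retry. A minimal sketch (the `retry` helper and the `flaky` demo command are illustrative, not part of the original setup):

```shell
#!/usr/bin/env bash
# Retry a flaky command a fixed number of times, sleeping between attempts.
retry() {
  local attempts=$1; shift
  local i
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    echo "attempt $i failed, retrying..." >&2
    sleep 1
  done
  return 1
}

# Real usage would be: retry 5 kubectl get pod
# Demo with a stand-in command that succeeds on the 3rd call:
n=0
flaky() { n=$((n+1)); [ "$n" -ge 3 ]; }
retry 5 flaky && echo "succeeded after $n attempts"
```

This only makes sense for read-only, idempotent commands; it masks the symptom rather than fixing the cause.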

I asked Google Cloud support.
They said it was a bug on the GKE master and that they had fixed it.
This problem doesn't happen anymore.

Related

Why does kubectl give an "unable to list cronjobs" error while trying to list pods?

I'm having a weird issue at times. When trying to list pods in a Kubernetes cluster, I get the exact same error, which has nothing to do with cronjobs. The issue gets fixed if I restart the terminal (sometimes I have to restart the computer). When it happens, other computers I've checked don't have any issues, so I believe something is wrong on my end. Does anyone have any idea why I keep running into this?
➜ ~ kubectl get pods
Error from server (NotFound): Unable to list "tap.linkerd.io/v1alpha1, Resource=cronjobs": the server could not find the requested resource (get cronjobs.tap.linkerd.io)
Edit:
I can list deployments and cronjobs without any issues. This happens only when I do get pods. It also fixes itself if I wait some time.
This may be an indication of a version mismatch between your kubectl client and your server, not anything specific to Linkerd. You can confirm with kubectl version --short whether or not that is the case.
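kubectl officially supports only one minor version of skew between client and server, so a quick check of the two version numbers is worthwhile. A sketch of that check (the version numbers below are illustrative; take the real ones from `kubectl version --short`):

```shell
# Read the minor versions off `kubectl version --short` output, e.g.
#   Client Version: v1.21.3
#   Server Version: v1.19.8
client_minor=21
server_minor=19

# Compute the absolute skew; anything above 1 is outside the supported
# window and can produce odd discovery errors like the one above.
skew=$((client_minor - server_minor))
skew=${skew#-}   # strip a leading minus sign, if any
if [ "$skew" -gt 1 ]; then
  echo "skew of $skew minor versions exceeds the supported +/-1"
fi
```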

Can't do a kubectl get on the TokenReview kind

I'm searching for out-of-date APIs in my k8s cluster, but when I try kubectl get TokenReview --all-namespaces, it comes back with Error from server (MethodNotAllowed): the server does not allow this method on the requested resource
I was expecting a list of YAML objects of kind "TokenReview", as with other resources.
I'm running k8s 1.21 for both server and client.
Anybody got any ideas? I'm not seeing anything in the k8s docs.
I raised this on GitHub, and apparently "TokenReview create requests are not persisted; they are answered with ephemeral responses. This means the token review API does not support read requests (get, list, watch, etc.) or update/delete/patch requests, only create."
"All of the *Review resources are like this as well: subjectaccessreview, selfsubjectaccessreview, tokenreview, etc. are all non-persisted resources."
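That create-only behaviour can be seen directly: a TokenReview is submitted with create, and the answer comes back in the ephemeral response's status field. A minimal illustration (the token value is a placeholder):

```shell
# TokenReview is create-only: submit a review and read the returned object;
# nothing is persisted, which is why get/list/watch fail with MethodNotAllowed.
kubectl create -o yaml -f - <<'EOF'
apiVersion: authentication.k8s.io/v1
kind: TokenReview
spec:
  token: "<bearer-token-to-check>"
EOF
```

The `-o yaml` prints the server's response, whose `status` field says whether the token authenticated and as which user.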

Why would kubectl logs return Authorization error?

I am trying to get logs from a running pod using kubectl logs grafana-6bfd846fbd-nbv8r
and I am getting the following output:
Error from server (InternalError): Internal error occurred: Authorization error (user=kube-apiserver, verb=get, resource=nodes, subresource=proxy)
I tried to figure out why I wouldn't have this specific authorization even though I can manage everything with this user; no clue. The weirdest part is that when I run kubectl auth can-i get pod/logs I get:
yes
After a few hours of going through ClusterRoles and ClusterRoleBindings, I am stuck and don't know what to do to get authorized. Thanks for your help!
The failure is kube-apiserver trying to access the kubelet; it is not related to your user. This indicates your core system RBAC rules might be corrupted. Check whether your installer or K8s distro has a way to validate or repair them (most don't), or create a new cluster and compare its rules to yours.
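For reference, the built-in system:kubelet-api-admin ClusterRole covers nodes/proxy (the resource named in the error), and on some distros the repair people apply is to bind it to the user from the error message. A hedged sketch (the binding name is made up; compare against your distro's defaults before applying anything):

```shell
kubectl apply -f - <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: apiserver-kubelet-api-admin   # illustrative name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kubelet-api-admin      # default role including nodes/proxy
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: kube-apiserver                # the user from the error message
EOF
```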

Missing edit permissions on a kubernetes cluster on GCP

This is a Google Cloud specific problem.
I returned from vacation and noticed I can no longer manage workloads or the cluster due to this error: "Missing edit permissions on account".
I am the sole person with access to this account (owner role), and yet I see this issue.
The troubleshooting guide suggests checking the system service account's role, and it looks like it's set up correctly (why wouldn't it be, since I haven't edited it?).
If it's not set up correctly, the guide suggests disabling and re-enabling the Kubernetes API on GCP, but pressing "disable" brings up a scary-looking prompt saying your Kubernetes resources will be deleted, so obviously I can't do that.
Upon trying to connect to it I get
gcloud container clusters get-credentials cluster-1 --zone us-west1-b --project PROJECT_ID
Fetching cluster endpoint and auth data.
WARNING: cluster cluster-1 is not running. The kubernetes API may not be available.
In the logs I found a record (the last one) that is 4 days old:
"Readiness probe failed: Get http://10.20.0.5:44135/readiness: net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
Does anyone here have any ideas?
Thanks in advance.
The issue is solved:
I had to upgrade the node versions in the pool.
What a misleading error message.
Hopefully this helps someone.
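For anyone landing here, the fix described above is a node pool upgrade; with gcloud that looks roughly like this (cluster and zone are from the question, the pool name is illustrative):

```shell
# Upgrade the nodes in a pool to match the control-plane version.
gcloud container clusters upgrade cluster-1 \
  --zone us-west1-b \
  --node-pool default-pool
```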

FailedToUpdateEndpoint in Kubernetes

I have a Kubernetes cluster with some deployments and pods. I have experienced an issue with my deployments, with error messages like FailedToUpdateEndpoint and ReadinessProbeFailed.
These errors are unexpected and I have no idea about their cause. When we analysed our logs, it seemed like someone was trying to hack our cluster (not sure about it).
Things to be clear about:
1. Is there any chance someone can illegally access our Kubernetes cluster without having the kubeconfig?
2. Is there any chance that, using the frontend IP, someone can access our apps and change the cluster configuration (i.e., hack the cluster services via the web URL)?
3. Even if the cluster is accessed illegally via the frontend URL, is there any chance to change the configuration of the cluster?
4. Is there any mechanism to detect whether the Kubernetes cluster is in a healthy state or has been hacked by someone?
The points above ask whether there are any security issues with the Kubernetes engine. If not,
then:
5. I am still working to find the reason for these errors. Please provide more information on what may be causing them.
Error Messages:
FailedToUpdateEndpoint: Failed to update endpoint default/job-store: Operation cannot be fulfilled on endpoints "job-store": the object has been modified; please apply your changes to the latest version and try again
The same error happens for all pods in our cluster.
Readiness probe failed: Error verifying datastore: Get https://API_SERVER: context deadline exceeded; Error reaching apiserver: taking a long time to check apiserver
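The FailedToUpdateEndpoint message is the API server's optimistic-concurrency check, not evidence of intrusion: every object carries a resourceVersion, and a write based on a stale copy is rejected with exactly that wording. A minimal stand-alone sketch of the mechanism (no cluster involved; the "store" and "controllers" here are illustrative):

```shell
# Simulate Kubernetes-style optimistic concurrency in plain shell:
# a write must carry the current resourceVersion or it is rejected.
stored_version=1

update() {  # update <version-the-writer-read>
  if [ "$1" -ne "$stored_version" ]; then
    echo "Conflict: the object has been modified; please apply your" \
         "changes to the latest version and try again"
    return 1
  fi
  stored_version=$((stored_version + 1))
  echo "update ok, resourceVersion is now $stored_version"
}

a_version=$stored_version   # controller A reads the object (version 1)
b_version=$stored_version   # controller B reads the same version
update "$a_version"         # A writes first and bumps the version to 2
update "$b_version" || echo "(B must re-read the object and retry)"
```

Controllers handle this automatically by re-reading and retrying, so occasional conflict events like this are generally benign.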