GKE cluster (k8s 1.21) with Dataplane V2 can't kubectl exec/port-forward/logs, etc.

We have a cluster at GKE which spontaneously stopped supporting the kubectl commands exec/port-forward/logs after a cluster upgrade (seemingly to 1.21). Troubleshooting with https://cloud.google.com/kubernetes-engine/docs/troubleshooting#kubect_commands_stops did not solve the issue:
- there are no egress-blocking firewall rules
- the control-plane-to-nodes SSH ingress rule is in place
- there are four project-wide SSH keys set; this does not seem like "too many", and it is unclear whether it would be safe to remove some (and which?)
- there is no ssh-key metadata on the node VMs
- there are no egress-blocking network policies in place in Kubernetes
Another cluster in the same GCP project (1.20, no Dataplane V2) is working fine, and the firewall rules are the same. We really have no clue what the problem could be, and we can't find anything in the logs.
Does anybody have the same issue, or have any idea how we could troubleshoot this further?
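For anyone following the same troubleshooting guide, the checks above roughly map to commands like these (a hedged sketch; NETWORK_NAME is a placeholder for the cluster's VPC network):
# firewall rules on the cluster's network: look for egress-blocking rules and for the control-plane-to-nodes ingress rule
gcloud compute firewall-rules list --filter="network=NETWORK_NAME"
# project-wide SSH keys: check commonInstanceMetadata for ssh-keys entries
gcloud compute project-info describe
# NetworkPolicies in all namespaces that might block traffic
kubectl get networkpolicy --all-namespaces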
Example output
$ kubectl -v 5 exec podname -- echo 'hi'
I0319 10:09:14.318262 8314 gcp.go:122] WARNING: the gcp auth plugin is deprecated in v1.22+, unavailable in v1.25+; use gcloud instead.
To learn more, consult https://kubernetes.io/docs/reference/access-authn-authz/authentication/#client-go-credential-plugins
I0319 10:09:14.396857 8314 request.go:1372] body was not decodable (unable to check for Status): couldn't get version/kind; json parse error: json: cannot unmarshal string into Go value of type struct { APIVersion string "json:\"apiVersion,omitempty\""; Kind string "json:\"kind,omitempty\"" }
I0319 10:09:14.396902 8314 cached_discovery.go:78] skipped caching discovery info due to the server is currently unable to handle the request
I0319 10:09:14.396943 8314 shortcut.go:89] Error loading discovery information: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
I0319 10:09:14.419116 8314 request.go:1372] body was not decodable (unable to check for Status): couldn't get version/kind; json parse error: json: cannot unmarshal string into Go value of type struct { APIVersion string "json:\"apiVersion,omitempty\""; Kind string "json:\"kind,omitempty\"" }
I0319 10:09:14.419128 8314 cached_discovery.go:78] skipped caching discovery info due to the server is currently unable to handle the request
I0319 10:09:14.454275 8314 podcmd.go:88] Defaulting container name to web
I0319 10:09:44.539168 8314 helpers.go:219] server response object: [{
"metadata": {},
"status": "Failure",
"message": "error dialing backend: dial timeout, backstop",
"code": 500
}]
Error from server: error dialing backend: dial timeout, backstop
$ kubectl -v 5 logs podname
I0319 10:12:47.021736 11845 gcp.go:122] WARNING: the gcp auth plugin is deprecated in v1.22+, unavailable in v1.25+; use gcloud instead.
To learn more, consult https://kubernetes.io/docs/reference/access-authn-authz/authentication/#client-go-credential-plugins
I0319 10:12:47.086484 11845 request.go:1372] body was not decodable (unable to check for Status): couldn't get version/kind; json parse error: json: cannot unmarshal string into Go value of type struct { APIVersion string "json:\"apiVersion,omitempty\""; Kind string "json:\"kind,omitempty\"" }
I0319 10:12:47.086501 11845 cached_discovery.go:78] skipped caching discovery info due to the server is currently unable to handle the request
I0319 10:12:47.086528 11845 shortcut.go:89] Error loading discovery information: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
I0319 10:12:47.110462 11845 request.go:1372] body was not decodable (unable to check for Status): couldn't get version/kind; json parse error: json: cannot unmarshal string into Go value of type struct { APIVersion string "json:\"apiVersion,omitempty\""; Kind string "json:\"kind,omitempty\"" }
I0319 10:12:47.110480 11845 cached_discovery.go:78] skipped caching discovery info due to the server is currently unable to handle the request
I0319 10:13:17.158303 11845 helpers.go:219] server response object: [{
"metadata": {},
"status": "Failure",
"message": "Get \"https://10.164.0.17:10250/containerLogs/deploy-name/pod-name/web\": dial timeout, backstop",
"code": 500
}]
Error from server: Get "https://10.164.0.17:10250/containerLogs/deploy-name/pod-name/web": dial timeout, backstop

Since you have a "timeout" error, it seems that kubectl can't communicate with the cluster control plane.
Can you try executing:
gcloud container clusters get-credentials CLUSTER_NAME --region=COMPUTE_REGION
After this, communication with the control plane should be restored.

In the end it turned out that the cause was the network policies we had set up for the kube-system namespace. They used to work just fine, until they didn't, and we had completely forgotten that we had them in the first place. Turns out we were a little overzealous there. When we removed them, everything was fine.
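For anyone hitting the same symptoms, a minimal sketch of how to find and remove such policies (POLICY_NAME is a placeholder). On a Dataplane V2 cluster these policies are actually enforced, and a restrictive policy in kube-system can likely starve the components that proxy exec/logs/port-forward traffic from the control plane to the kubelets, which would be consistent with the "dial timeout" errors above:
# list NetworkPolicies in kube-system
kubectl get networkpolicy -n kube-system
# inspect a suspicious policy before touching it
kubectl describe networkpolicy POLICY_NAME -n kube-system
# remove it once you are sure it is one of yours and not something GKE manages
kubectl delete networkpolicy POLICY_NAME -n kube-system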

Related

failed to do request: Head "https://192.168.56.2:5000/v2/ubn_mysql/manifests/latest": http: server gave HTTP response to HTTPS client

Warning Failed 22s (x2 over 37s) kubelet Failed to pull image "192.168.56.2:5000/ubn_mysql:latest": rpc error: code = Unknown desc = failed to pull and unpack image "192.168.56.2:5000/ubn_mysql:latest": failed to resolve reference "192.168.56.2:5000/ubn_mysql:latest": failed to do request: Head "https://192.168.56.2:5000/v2/ubn_mysql/manifests/latest": http: server gave HTTP response to HTTPS client
Getting the above error while creating a pod in k8s. I am able to pull this image on the worker nodes using "docker pull 192.168.56.2:5000/ubn_mysql:latest" from the registry 'http://192.168.56.2:5000/ubn_mysql:latest', but I only get this issue while creating the pod.
I already have entries in the files below:
vagrant@kubemaster:~/pods_yaml$ cat /etc/docker/daemon.json
{"insecure-registries" : ["192.168.56.2:5000"]}
vagrant@kubemaster:~/pods_yaml$ cat /etc/default/docker
DOCKER_OPTS="--config-file=/etc/docker/daemon.json"
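Not an answer, but to rule out the daemon config itself, it may help to confirm on each worker node that Docker actually loaded the insecure-registry setting, for example:
# "Insecure Registries" should list 192.168.56.2:5000 on every node that runs pods
docker info | grep -A 3 "Insecure Registries"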

How to debug "Internal Server Error" in AWS Cloudformation (MWAA)

I am trying to deploy AWS MWAA via CloudFormation but come across:
Resource handler returned message: "Invalid request provided: Internal server error (Service: Mwaa, Status Code: 500, Request ID: 21fea850-9cf7-4977-a947-d10dd3cc1a13)" (RequestToken: f2384bd0-68a6-44d3-d57d-fed3a52f817b, HandlerErrorCode: InvalidRequest)"
on the MWAA resource itself. The other resources (LoadBalancer, Listener, etc.) are deployed correctly.
What are my options for debugging this, given that the error message is not explicit and the resource is never created (so it has no logs)?

Using Keycloak for defining subjects in policies in Eclipse Ditto

My current use case is: I have a frontend application where a user is logged in via Keycloak. I would like to implement some parts of the Ditto HTTP API in this frontend (https://www.eclipse.org/ditto/http-api-doc.html).
For example, I want to create policies (https://www.eclipse.org/ditto/basic-policy.html) for authorization. I've read in the documentation that one can use an OpenID Connect compliant provider; the form the subjects take is described here: https://www.eclipse.org/ditto/basic-policy.html#who-can-be-addressed.
There's a basic auth example at the bottom of the page; it seems to use the username in this case.
{
  "policyId": "my.namespace:policy-a",
  "entries": {
    "owner": {
      "subjects": {
        "nginx:ditto": {
          "type": "nginx basic auth user"
        }
      },
      ...
    }
  }
}
My question is: What exactly would be the sub-claim if I want to use Keycloak? Is it also the username of the user I want to grant rights to? And how would I get this in my frontend where I want to specify the policy for sending it to Ditto afterwards?
UPDATE 1:
I tried to enable Keycloak authentication in Ditto as suggested below and as described here: https://www.eclipse.org/ditto/installation-operating.html#openid-connect
Because I'm running Ditto with Docker Compose, I added the following line as an environment variable in ditto/deployment/docker/docker-compose.yml in line 136: - Dditto.gateway.authentication.oauth.openid-connect-issuers.keycloak=http://localhost:8090/auth/realms/twin
This URL is the same as in the issuer claim of my token which I'm receiving from keycloak.
Now if I try to make for example a post request with Postman to {{basePath}}/things I get the following error:
<html>
<head>
<title>401 Authorization Required</title>
</head>
<body bgcolor="white">
<center>
<h1>401 Authorization Required</h1>
</center>
<hr>
<center>nginx/1.13.12</center>
</body>
</html>
I chose Bearer Token as the auth type in Postman and pasted a fresh token. Basic Auth with the default ditto user is still working.
Do I have to specify the new subject/my user in Ditto beforehand?
UPDATE 2:
I managed to turn off basic auth in nginx by commenting out "auth_basic" and "auth_basic_user_file" in nginx.conf!
The request seems to be forwarded to Ditto now, because I now get the following error in Postman:
{
"status": 401,
"error": "gateway:jwt.issuer.notsupported",
"message": "The JWT issuer 'localhost:8090/auth/realms/twin' is not supported.",
"description": "Check if your JWT is correct."
}
UPDATE 3:
My configuration in gateway.conf looks now like this:
oauth {
  protocol = "http"
  openid-connect-issuers = {
    keycloak = "localhost:8090/auth/realms/twin"
  }
}
I also tried to add these two lines in the docker-compose.yml:
- Dditto.gateway.authentication.oauth.protocol=http
- Dditto.gateway.authentication.oauth.openid-connect-issuers.keycloak=localhost:8090/auth/realms/twin
Unfortunately I still had no luck, same error as above :/ It seems a user had a similar problem with Keycloak before (https://gitter.im/eclipse/ditto?at=5de3ff186a85195b9edcb1a6), but sadly no solution was mentioned.
EDIT: It turns out that I specified these variables in the wrong way; the correct solution is to add them as part of the command: java ... (more info here).
UPDATE 4:
I tried to build Ditto locally instead of using the latest Docker images, and I think I might be one step further now; it seems like my oauth config is working. Now I get:
{
"status": 503,
"error": "gateway:publickey.provider.unavailable",
"message": "The public key provider is not available.",
"description": "If after retry it is still unavailable, please contact the service team."
}
The error message from the log is:
gateway_1 | 2020-11-05 15:33:18,669 WARN [] o.e.d.s.g.s.a.j.DittoPublicKeyProvider - Got Exception from discovery endpoint <http://localhost:8090/auth/realms/twin/.well-known/openid-configuration>.
gateway_1 | akka.stream.StreamTcpException: Tcp command [Connect(localhost:8090,None,List(),Some(10 seconds),true)] failed because of java.net.ConnectException: Connection refused
gateway_1 | Caused by: java.net.ConnectException: Connection refused
...
gateway_1 | java.util.concurrent.CompletionException: org.eclipse.ditto.services.gateway.security.authentication.jwt.PublicKeyProviderUnavailableException [message='The public key provider is not available.', errorCode=gateway:publickey.provider.unavailable, statusCode=SERVICE_UNAVAILABLE, description='If after retry it is still unavailable, please contact the service team.', href=null, dittoHeaders=ImmutableDittoHeaders [{}]]
...
gateway_1 | Caused by: org.eclipse.ditto.services.gateway.security.authentication.jwt.PublicKeyProviderUnavailableException [message='The public key provider is not available.', errorCode=gateway:publickey.provider.unavailable, statusCode=SERVICE_UNAVAILABLE, description='If after retry it is still unavailable, please contact the service team.', href=null, dittoHeaders=ImmutableDittoHeaders [{}]]
...
gateway_1 | Caused by: akka.stream.StreamTcpException: Tcp command [Connect(localhost:8090,None,List(),Some(10 seconds),true)] failed because of java.net.ConnectException: Connection refused
gateway_1 | Caused by: java.net.ConnectException: Connection refused
My Keycloak is definitely running; I'm able to get tokens. If I open http://localhost:8090/auth/realms/twin/.well-known/openid-configuration, which is the URL in the first error message, I can see my openid-configuration from the Keycloak config.
Edit: It seems that my gateway container cannot reach my Keycloak container; I will try to figure this out.
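To confirm that hunch, a quick check could look like this (the container name docker_gateway_1 is a guess based on default Compose naming, and it assumes curl exists in the image):
# from inside the gateway container, "localhost" is the container itself, not the host,
# so this is expected to fail while Keycloak runs in a separate container
docker exec -it docker_gateway_1 curl -sv http://localhost:8090/auth/realms/twin/.well-known/openid-configuration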
FINAL UPDATE:
The issue was that the Keycloak Docker container was unreachable from the gateway Docker container. I'm now using Traefik:
The Keycloak container has the following alias: keycloak.localhost
The oauth configuration in the gateway now looks like this:
oauth {
  protocol = "http"
  openid-connect-issuers = {
    keycloak = "keycloak.localhost/auth/realms/twin"
  }
}
Now the gateway can find the Keycloak container via the alias, and I can still use the Keycloak admin UI from my localhost: http://keycloak.localhost:8090/auth/admin/
Additional info: Traefik Blog
What exactly would be the sub-claim if I want to use Keycloak?
Keycloak provides you a JWT.
A JWT is a signed, Base64url-encoded JSON object which contains multiple fields called "claims". You can check what your token looks like by pasting it at https://jwt.io. One of those fields is called sub; this is the sub claim.
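For example, a rough shell sketch for extracting the sub claim from a token (assumes jq is installed; the tr/padding handling is needed because JWT segments are base64url-encoded):
JWT="eyJ..."   # paste your access token here (placeholder)
PAYLOAD=$(printf '%s' "$JWT" | cut -d. -f2 | tr '_-' '/+')
# pad to a multiple of 4 so base64 accepts it
while [ $(( ${#PAYLOAD} % 4 )) -ne 0 ]; do PAYLOAD="${PAYLOAD}="; done
printf '%s\n' "$PAYLOAD" | base64 -d | jq -r '.sub'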
To enable your keycloak authentication in eclipse ditto you need to add the issuer to the ditto configuration.
An example can be found here.
The address must match the URL in the issuer claim of your JWT token.
ditto.gateway.authentication {
  oauth {
    protocol = "http"
    openid-connect-issuers = {
      some-name = "localhost:8090/auth/realms/twin"
    }
  }
}
Is it also the username of the user I want to grant rights to?
In eclipse ditto there is not really a concept of "user names". Eclipse ditto authentication is based on authorization subjects. For the basic authentication example you provided, the authorization subject which is generated within ditto is nginx:ditto.
For JWT authentication the authorization subject is generated as a combination of the name for the open id connect issuer which you configured (in my case some-name) and the value of the sub claim. An authorization subject could look like this: some-name:8d078113-3ee5-4dbf-8db1-eb1a6cf0fe81.
And how would I get this in my frontend where I want to specify the policy for sending it to Ditto afterwards?
I'm not sure if I understand the question correctly. If you mean how to authenticate your frontend HTTP requests to eclipse ditto, you need to provide the JWT to eclipse ditto by adding it to the authorization header of your HTTP requests in the following form:
authorization: Bearer yourJWT
If you mean how you would know the sub claim of a JWT, you need to parse the JWT to a JSON object and then read the sub claim out of the payload section.
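Putting both parts together, a hedged sketch of creating a policy whose subject is the Keycloak-issued sub via the Ditto HTTP API (the gateway address, policy ID, issuer name "keycloak", and sub value are placeholders; the resources block follows the structure from the Ditto policy documentation):
curl -X PUT "http://localhost:8080/api/2/policies/my.namespace:policy-a" \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "entries": {
      "owner": {
        "subjects": {
          "keycloak:8d078113-3ee5-4dbf-8db1-eb1a6cf0fe81": { "type": "keycloak user" }
        },
        "resources": {
          "thing:/": { "grant": ["READ", "WRITE"], "revoke": [] },
          "policy:/": { "grant": ["READ", "WRITE"], "revoke": [] }
        }
      }
    }
  }'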

how to get list of pod names using kubernetes rest api (jsonpath)

Is jsonPath supported in the Kubernetes HTTP API?
For example, how does the following translate to the HTTP API?
kubectl get pods -o=jsonpath='{.items[0]}'
It's not supported by the API; you would need to evaluate that jsonpath against the API response.
You can use the verbosity flag -v6 and above to see what API calls are actually being made:
kubectl get pods -o=jsonpath='{.items[0]}' -v6 2>&1
Output:
I0805 11:16:51.632841 76333 loader.go:375] Config loaded from file: /Users/loganath/firetap/config-ctl1
I0805 11:16:53.666539 76333 round_trippers.go:444] GET https://10.x.x.x:6443/api/v1/namespaces/web/pods?limit=500 200 OK in 2021 milliseconds
I0805 11:16:54.901557 76333 table_printer.go:45] Unable to decode server response into a Table. Falling back to hardcoded types: attempt to decode non-Table object
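Concretely, one way to get the list of pod names is to call the same endpoint yourself and do the filtering client-side, for example with jq as a stand-in for jsonpath (server address, namespace, and token are placeholders):
TOKEN="..."   # a valid bearer token for the cluster (placeholder)
curl -sk -H "Authorization: Bearer $TOKEN" "https://10.x.x.x:6443/api/v1/namespaces/web/pods?limit=500" | jq -r '.items[].metadata.name'
# or let kubectl handle authentication and go through a local proxy instead
kubectl proxy --port=8001 &
curl -s "http://127.0.0.1:8001/api/v1/namespaces/web/pods" | jq -r '.items[].metadata.name'
# the equivalent of {.items[0]} would be: ... | jq '.items[0]'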

k8s-visualizer can't read from apiserver

I've tried multiple forks of github.com/brendandburns/gcp-live-k8s-visualizer/issues/6. The current fork I'm trying to get working is (as mentioned by flx in another thread) https://github.com/0ortmann/k8s-visualizer. I can get the interface to start up, but when script.js calls getJSON("/api...") it tries to pull the /api URI from the current port (i.e. 8001), for which it gets an unauthorized response. My apiserver is running on port 8080... any ideas?
Update: the "problem" appears to be related to (a) the fact that I'm making the browser HTTP request from a remote host (i.e. I'm not going to http://localhost) and (b) the request filtering that the kubectl proxy is doing... Adding --disable-filter to the kubectl proxy command and doing a curl <remotehostIP>:8001/api at least gets me a Moved Permanently response instead of unauthorized. However, any curl <remotehostIP>:8001/api/v1/pods or similar gets an HTTP 500 error... Also, the kubectl proxy command outputs:
W1003 15:22:23.805574 8666 proxy.go:116] Request filter disabled, your proxy is vulnerable to XSRF attacks, please be cautious
Starting to serve on [::]:8001
I1003 15:22:23.961109 8666 logs.go:41] http: proxy error: unsupported protocol scheme ""
I1003 15:22:23.961311 8666 logs.go:41] http: proxy error: unsupported protocol scheme ""
I1003 15:22:23.961451 8666 logs.go:41] http: proxy error: unsupported protocol scheme ""
I1003 15:22:23.962003 8666 logs.go:41] http: proxy error: unsupported protocol scheme ""
(unsupported protocol scheme messages repeat forever)...
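Not a full answer, but for the remote-host part of the problem: kubectl proxy can bind to a non-loopback address and accept non-localhost Host headers, which is usually what is needed before reaching for --disable-filter (which, as the warning says, removes XSRF protection). A hedged sketch:
# bind on all interfaces and accept requests from any host; only do this on a trusted network
kubectl proxy --port=8001 --address=0.0.0.0 --accept-hosts='.*'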