Kubernetes API server, serving pod logs - kubernetes

REST API requests to the Kubernetes API server (GET, POST, PUT, etc.) are plain request/response exchanges and simple to understand, for example kubectl create <something>. I wonder how the API server serves pod logs when I run kubectl logs -f <pod-name> (and similar operations such as kubectl attach <pod>). Is it just an HTTP response to a GET in a loop?

My advice is to always check what kubectl does under the covers; to do that, add -v=9 to your command. It will show you the full requests and responses exchanged between the client and the server.

Yep, judging by the source of logs.go, kubectl currently just uses an HTTP GET, although there seems to be a desire to unify and upgrade a couple of commands (exec, port-forward, logs, etc.) to WebSockets.
Showing Maciej's excellent suggestion in action:
$ kubectl run test --image centos:7 \
-- sh -c "while true ; do echo Work ; sleep 2 ; done"
$ kubectl get po
NAME                    READY   STATUS    RESTARTS   AGE
test-769f6f8c9f-2nx7m   1/1     Running   0          2m
$ kubectl logs -v9 -f test-769f6f8c9f-2nx7m
I1019 13:49:34.282007 71247 loader.go:359] Config loaded from file /Users/mhausenblas/.kube/config
I1019 13:49:34.284698 71247 loader.go:359] Config loaded from file /Users/mhausenblas/.kube/config
I1019 13:49:34.292620 71247 loader.go:359] Config loaded from file /Users/mhausenblas/.kube/config
I1019 13:49:34.293136 71247 round_trippers.go:386] curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.12.0 (darwin/amd64) kubernetes/0ed3388" 'https://192.168.64.13:8443/api/v1/namespaces/default/pods/test-769f6f8c9f-2nx7m'
I1019 13:49:34.305016 71247 round_trippers.go:405] GET https://192.168.64.13:8443/api/v1/namespaces/default/pods/test-769f6f8c9f-2nx7m 200 OK in 11 milliseconds
I1019 13:49:34.305039 71247 round_trippers.go:411] Response Headers:
I1019 13:49:34.305047 71247 round_trippers.go:414] Date: Fri, 19 Oct 2018 12:49:34 GMT
I1019 13:49:34.305054 71247 round_trippers.go:414] Content-Type: application/json
I1019 13:49:34.305062 71247 round_trippers.go:414] Content-Length: 2390
I1019 13:49:34.305125 71247 request.go:942] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"test-769f6f8c9f-2nx7m","generateName":"test-769f6f8c9f-","namespace":"default","selfLink":"/api/v1/namespaces/default/pods/test-769f6f8c9f-2nx7m","uid":"0581b0fa-d39d-11e8-9827-42a64713caf8","resourceVersion":"892912","creationTimestamp":"2018-10-19T12:46:39Z","labels":{"pod-template-hash":"3259294759","run":"test"},"ownerReferences":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"test-769f6f8c9f","uid":"057f3ad4-d39d-11e8-9827-42a64713caf8","controller":true,"blockOwnerDeletion":true}]},"spec":{"volumes":[{"name":"default-token-fbx4m","secret":{"secretName":"default-token-fbx4m","defaultMode":420}}],"containers":[{"name":"test","image":"centos:7","args":["sh","-c","while true ; do echo Work ; sleep 2 ; done"],"resources":{},"volumeMounts":[{"name":"default-token-fbx4m","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent"}],"restartPolicy":"Always","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","serviceAccountName":"default","serviceAccount":"default","nodeName":"minikube","securityContext":{},"schedulerName":"default-scheduler","tolerations":[{"key":"node.kubernetes.io/not-ready","operator":"Exists","effect":"NoExecute","tolerationSeconds":300},{"key":"node.kubernetes.io/unreachable","operator":"Exists","effect":"NoExecute","tolerationSeconds":300}]},"status":{"phase":"Running","conditions":[{"type":"Initialized","status":"True","lastProbeTime":null,"lastTransitionTime":"2018-10-19T12:46:39Z"},{"type":"Ready","status":"True","lastProbeTime":null,"lastTransitionTime":"2018-10-19T12:46:40Z"},{"type":"ContainersReady","status":"True","lastProbeTime":null,"lastTransitionTime":null},{"type":"PodScheduled","status":"True","lastProbeTime":null,"lastTransitionTime":"2018-10-19T12:46:39Z"}],"hostIP":"192.168
.64.13","podIP":"172.17.0.11","startTime":"2018-10-19T12:46:39Z","containerStatuses":[{"name":"test","state":{"running":{"startedAt":"2018-10-19T12:46:39Z"}},"lastState":{},"ready":true,"restartCount":0,"image":"centos:7","imageID":"docker-pullable://centos#sha256:67dad89757a55bfdfabec8abd0e22f8c7c12a1856514726470228063ed86593b","containerID":"docker://5c25f5fce576d68d743afc9b46a9ea66f3cd245f5075aa95def623b6c2d93256"}],"qosClass":"BestEffort"}}
I1019 13:49:34.316531 71247 loader.go:359] Config loaded from file /Users/mhausenblas/.kube/config
I1019 13:49:34.317000 71247 round_trippers.go:386] curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.12.0 (darwin/amd64) kubernetes/0ed3388" 'https://192.168.64.13:8443/api/v1/namespaces/default/pods/test-769f6f8c9f-2nx7m/log?follow=true'
I1019 13:49:34.339341 71247 round_trippers.go:405] GET https://192.168.64.13:8443/api/v1/namespaces/default/pods/test-769f6f8c9f-2nx7m/log?follow=true 200 OK in 22 milliseconds
I1019 13:49:34.339380 71247 round_trippers.go:411] Response Headers:
I1019 13:49:34.339390 71247 round_trippers.go:414] Content-Type: text/plain
I1019 13:49:34.339407 71247 round_trippers.go:414] Date: Fri, 19 Oct 2018 12:49:34 GMT
Work
Work
Work
^C
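To make the trace above a bit more concrete, here is a minimal Python sketch of the same idea: one GET against the pod's log subresource, with follow=true keeping the response open. The function names pod_log_url and stream_log_lines are my own (hypothetical), and the bearer-token authentication is an assumption; kubectl in the trace above uses whatever credentials are in the kubeconfig.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen


def pod_log_url(server, namespace, pod, follow=False, container=None):
    """Build the URL of the pod `log` subresource, as seen in the -v9 trace."""
    path = "/api/v1/namespaces/{}/pods/{}/log".format(
        quote(namespace, safe=""), quote(pod, safe="")
    )
    params = {}
    if container:
        params["container"] = container
    if follow:
        params["follow"] = "true"
    query = "?" + urlencode(params) if params else ""
    return server.rstrip("/") + path + query


def stream_log_lines(url, token, context=None):
    """Yield log lines as the server sends them.

    With follow=true the server keeps the response open, so this generator
    only finishes when the connection is closed. `token` is assumed to be a
    bearer token; `context` is an optional ssl.SSLContext for the cluster CA.
    """
    request = Request(url, headers={"Authorization": "Bearer " + token})
    with urlopen(request, context=context) as response:
        for raw_line in response:
            yield raw_line.decode("utf-8", errors="replace").rstrip("\n")
```

So there is no polling loop on the client side in this sketch: it is a single long-lived response that is read line by line.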

If you run kubectl at the highest debugging level (-v 9) together with a streaming option such as -f, for example kubectl logs -f <pod-name> -v 9, you can see that kubectl passes the follow=true parameter in the API request, fetches the logs from the target Pod, and streams them to the output:
curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.12.1 (linux/amd64) kubernetes/4ed3216" 'https://API_server_IP/api/v1/namespaces/default/pods/Pod-name/log?follow=true'
You can also issue your own API requests with the following steps:
Obtain a token for authorization:
MY_TOKEN="$(kubectl get secret <default-secret> -o jsonpath='{$.data.token}' | base64 -d)"
Then retrieve the required data directly from the API server:
curl -k -v -H "Authorization: Bearer $MY_TOKEN" https://API_server_IP/api/v1/namespaces/default/pods
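The same two steps can be sketched in Python. This is only an illustration under assumptions: decode_secret_token and pods_request are hypothetical helper names, and the request is built but not sent, so you still have to pass it to urllib.request.urlopen with a suitable TLS context for your cluster.

```python
import base64
from urllib.request import Request


def decode_secret_token(encoded):
    """Values under a Secret's .data are base64-encoded; decode one to text."""
    return base64.b64decode(encoded).decode("utf-8")


def pods_request(server, namespace, token):
    """Build (but do not send) a GET for the pod list using a bearer token."""
    url = "{}/api/v1/namespaces/{}/pods".format(server.rstrip("/"), namespace)
    return Request(url, headers={
        "Authorization": "Bearer " + token,
        "Accept": "application/json",
    })
```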

Related

GKE throws invalid certificate when fetching logs

I'm trying to fetch the logs from a pod running in GKE, but I get this error:
I0117 11:42:54.468501 96671 round_trippers.go:466] curl -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.26.0 (darwin/arm64) kubernetes/b46a3f8" 'https://x.x.x.x/api/v1/namespaces/pleiades/pods/pleiades-0/log?container=server'
I0117 11:42:54.569122 96671 round_trippers.go:553] GET https://x.x.x.x/api/v1/namespaces/pleiades/pods/pleiades-0/log?container=server 500 Internal Server Error in 100 milliseconds
I0117 11:42:54.569170 96671 round_trippers.go:570] HTTP Statistics: GetConnection 0 ms ServerProcessing 100 ms Duration 100 ms
I0117 11:42:54.569186 96671 round_trippers.go:577] Response Headers:
I0117 11:42:54.569202 96671 round_trippers.go:580] Content-Type: application/json
I0117 11:42:54.569215 96671 round_trippers.go:580] Content-Length: 226
I0117 11:42:54.569229 96671 round_trippers.go:580] Date: Tue, 17 Jan 2023 19:42:54 GMT
I0117 11:42:54.569243 96671 round_trippers.go:580] Audit-Id: a25a554f-c3f5-4f91-9711-3f2970376770
I0117 11:42:54.569332 96671 round_trippers.go:580] Cache-Control: no-cache, private
I0117 11:42:54.571392 96671 request.go:1154] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Get \"https://10.6.128.40:10250/containerLogs/pleiades/pleiades-0/server\": x509: certificate is valid for 127.0.0.1, not 10.6.128.40","code":500}
I0117 11:42:54.572267 96671 helpers.go:246] server response object: [{
"metadata": {},
"status": "Failure",
"message": "Get \"https://10.6.128.40:10250/containerLogs/pleiades/pleiades-0/server\": x509: certificate is valid for 127.0.0.1, not 10.6.128.40",
"code": 500
}]
How do I prevent this from happening?
One possible cause of this error is that both metrics-server and the kubelet listen on port 10250. This is usually not a problem because metrics-server runs in its own namespace, but when metrics-server runs in the host network the conflict prevents it from starting.
You can confirm this behavior by running the following command:
$ kubectl -n kube-system get pods -l k8s-app=metrics-server -o yaml | grep 10250
- --secure-port=10250
- containerPort: 10250
If you see hostPort: 10250 in the YAML of the metrics-server pod, run the following command to delete the metrics-server deployment on that cluster:
$ kubectl -n kube-system delete deployment -l k8s-app=metrics-server
Metrics server will be recreated correctly by GKE infrastructure. It should be recreated in ~15 seconds on clusters with a new addon manager, but could take up to 15 minutes on very old clusters.
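If you would rather not eyeball the grep output, the hostPort check can be scripted. The sketch below assumes you have the pod list as parsed JSON (for example from the same kubectl command with -o json instead of -o yaml); containers_with_host_port is a hypothetical helper name, not part of any Kubernetes tooling.

```python
def containers_with_host_port(pod_list, port):
    """Return (pod name, container name) pairs that bind `port` on the host.

    `pod_list` is the parsed JSON of a Kubernetes List of Pod objects.
    """
    hits = []
    for pod in pod_list.get("items", []):
        pod_name = pod.get("metadata", {}).get("name", "")
        for container in pod.get("spec", {}).get("containers", []):
            for port_spec in container.get("ports", []):
                # containerPort alone is harmless; hostPort is the conflict.
                if port_spec.get("hostPort") == port:
                    hits.append((pod_name, container.get("name", "")))
    return hits
```

An empty result for port 10250 would suggest that this particular conflict is not what you are hitting.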

OpenShift or Kubernetes - Create a Job from CronJob using curl command

I would like to create a Job from a CronJob using a curl command.
I am aware of the Kubernetes kubectl and OpenShift oc commands. The following commands work, but I am looking for the equivalent curl command.
Kubernetes:
kubectl create job --from=cronjob/
OpenShift
oc create job --from=cronjob/
Please help. I am using OpenShift 3.11.
You can run the kubectl command with a high verbosity level; it will show the curl commands and the request body that kubectl uses internally.
kubectl create job --from=cronjob/test-job --v=10
I0324 10:46:36.071067 44400 round_trippers.go:423] curl -k -v -XGET -H "Accept: application/json" -H "User-Agent: kubectl/v1.17.0 (darwin/amd64) kubernetes/70132b0" 'https://127.0.0.1:32768/apis/batch/v1beta1/namespaces/default/cronjobs/test-job'
I0324 10:46:36.110550 44400 round_trippers.go:443] GET https://127.0.0.1:32768/apis/batch/v1beta1/namespaces/default/cronjobs/test-job 200 OK in 39 milliseconds
I0324 10:46:36.110573 44400 round_trippers.go:449] Response Headers:
I0324 10:46:36.110579 44400 round_trippers.go:452] Content-Type: application/json
I0324 10:46:36.110585 44400 round_trippers.go:452] Content-Length: 898
I0324 10:46:36.110590 44400 round_trippers.go:452] Date: Tue, 24 Mar 2020 05:16:36 GMT
I0324 10:46:36.110631 44400 request.go:1017] Response Body: {"kind":"CronJob","apiVersion":"batch/v1beta1","metadata":{"name":"test-job","namespace":"default","selfLink":"/apis/batch/v1beta1/namespaces/default/cronjobs/test-job","uid":"11813788-123d-4379-a103-79e18c7e954c","resourceVersion":"64182","creationTimestamp":"2020-03-24T05:16:03Z"},"spec":{"schedule":"*/1 * * * *","concurrencyPolicy":"Allow","suspend":false,"jobTemplate":{"metadata":{"name":"test-job","creationTimestamp":null},"spec":{"template":{"metadata":{"creationTimestamp":null},"spec":{"containers":[{"name":"test-job","image":"busybox","resources":{},"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"Always"}],"restartPolicy":"OnFailure","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","securityContext":{},"schedulerName":"default-scheduler"}}}},"successfulJobsHistoryLimit":3,"failedJobsHistoryLimit":1},"status":{}}
I0324 10:46:36.117139 44400 request.go:1017] Request Body: {"kind":"Job","apiVersion":"batch/v1","metadata":{"name":"job","creationTimestamp":null,"annotations":{"cronjob.kubernetes.io/instantiate":"manual"},"ownerReferences":[{"apiVersion":"apps/v1","kind":"CronJob","name":"test-job","uid":"11813788-123d-4379-a103-79e18c7e954c","controller":true,"blockOwnerDeletion":true}]},"spec":{"template":{"metadata":{"creationTimestamp":null},"spec":{"containers":[{"name":"test-job","image":"busybox","resources":{},"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"Always"}],"restartPolicy":"OnFailure","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","securityContext":{},"schedulerName":"default-scheduler"}}},"status":{}}
I0324 10:46:36.117189 44400 round_trippers.go:423] curl -k -v -XPOST -H "Accept: application/json, */*" -H "Content-Type: application/json" -H "User-Agent: kubectl/v1.17.0 (darwin/amd64) kubernetes/70132b0" 'https://127.0.0.1:32768/apis/batch/v1/namespaces/default/jobs'
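Reading the trace, kubectl does two things: it GETs the CronJob, then POSTs a Job whose spec is copied from the CronJob's jobTemplate. A rough Python sketch of building that POST body follows. The names job_from_cronjob and jobs_url are hypothetical, and the ownerReferences apiVersion is my assumption (I reuse the CronJob's own apiVersion, whereas the logged body shows a different value).

```python
import copy


def job_from_cronjob(cronjob, job_name):
    """Build a batch/v1 Job body from a CronJob object fetched from the API."""
    job_template = cronjob["spec"]["jobTemplate"]
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            "name": job_name,
            # Same annotation as in the logged request body.
            "annotations": {"cronjob.kubernetes.io/instantiate": "manual"},
            "ownerReferences": [{
                # Assumption: reuse the CronJob's own apiVersion here.
                "apiVersion": cronjob["apiVersion"],
                "kind": "CronJob",
                "name": cronjob["metadata"]["name"],
                "uid": cronjob["metadata"]["uid"],
                "controller": True,
                "blockOwnerDeletion": True,
            }],
        },
        # Deep copy so later edits to the Job do not touch the CronJob dict.
        "spec": copy.deepcopy(job_template["spec"]),
    }


def jobs_url(server, namespace):
    """URL that the Job body is POSTed to, matching the last line of the trace."""
    return "{}/apis/batch/v1/namespaces/{}/jobs".format(
        server.rstrip("/"), namespace
    )
```

You would serialize the result with json.dumps and send it with curl -XPOST and a Content-Type: application/json header to the URL from jobs_url, adding whatever authentication your cluster requires.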

kubeadm init stuck at health check when deploying HA kubernetes master with haproxy

I am deploying an HA Kubernetes master (stacked etcd) with kubeadm. I followed
the instructions on the official website:
https://kubernetes.io/docs/setup/independent/high-availability/
Four nodes are planned in my cluster for now:
One HAProxy server node used to load-balance the masters.
Three stacked-etcd master nodes.
I deployed HAProxy with the following configuration:
global
    daemon
    maxconn 256
defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
frontend haproxy_kube
    bind *:6443
    mode tcp
    option tcplog
    timeout client 10800s
    default_backend masters
backend masters
    mode tcp
    option tcplog
    balance leastconn
    timeout server 10800s
    server master01 <master01-ip>:6443 check
My kubeadm-config.yaml looks like this:
apiVersion: kubeadm.k8s.io/v1beta1
kind: InitConfiguration
nodeRegistration:
  name: "master01"
---
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
apiServer:
  certSANs:
  - "<haproxyserver-dns>"
controlPlaneEndpoint: "<haproxyserver-dns>:6443"
networking:
  serviceSubnet: "172.24.0.0/16"
  podSubnet: "172.16.0.0/16"
My init command is:
kubeadm init --config=kubeadm-config.yaml -v 11
After running the command above on master01, it kept logging the following:
I0122 11:43:44.039849 17489 manifests.go:113] [control-plane] wrote static Pod manifest for component "kube-scheduler" to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
I0122 11:43:44.041038 17489 local.go:57] [etcd] wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
I0122 11:43:44.041068 17489 waitcontrolplane.go:89] [wait-control-plane] Waiting for the API server to be healthy
I0122 11:43:44.042665 17489 loader.go:359] Config loaded from file /etc/kubernetes/admin.conf
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
I0122 11:43:44.044971 17489 round_trippers.go:419] curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.13.2 (linux/amd64) kubernetes/cff46ab" 'https://<haproxyserver-dns>:6443/healthz?timeout=32s'
I0122 11:43:44.120973 17489 round_trippers.go:438] GET https://<haproxyserver-dns>:6443/healthz?timeout=32s in 75 milliseconds
I0122 11:43:44.120988 17489 round_trippers.go:444] Response Headers:
I0122 11:43:44.621201 17489 round_trippers.go:419] curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.13.2 (linux/amd64) kubernetes/cff46ab" 'https://<haproxyserver-dns>:6443/healthz?timeout=32s'
I0122 11:43:44.703556 17489 round_trippers.go:438] GET https://<haproxyserver-dns>:6443/healthz?timeout=32s in 82 milliseconds
I0122 11:43:44.703577 17489 round_trippers.go:444] Response Headers:
I0122 11:43:45.121311 17489 round_trippers.go:419] curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.13.2 (linux/amd64) kubernetes/cff46ab" 'https://<haproxyserver-dns>:6443/healthz?timeout=32s'
I0122 11:43:45.200493 17489 round_trippers.go:438] GET https://<haproxyserver-dns>:6443/healthz?timeout=32s in 79 milliseconds
I0122 11:43:45.200514 17489 round_trippers.go:444] Response Headers:
I0122 11:43:45.621338 17489 round_trippers.go:419] curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.13.2 (linux/amd64) kubernetes/cff46ab" 'https://<haproxyserver-dns>:6443/healthz?timeout=32s'
I0122 11:43:45.698633 17489 round_trippers.go:438] GET https://<haproxyserver-dns>:6443/healthz?timeout=32s in 77 milliseconds
I0122 11:43:45.698652 17489 round_trippers.go:444] Response Headers:
I0122 11:43:46.121323 17489 round_trippers.go:419] curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.13.2 (linux/amd64) kubernetes/cff46ab" 'https://<haproxyserver-dns>:6443/healthz?timeout=32s'
I0122 11:43:46.199641 17489 round_trippers.go:438] GET https://<haproxyserver-dns>:6443/healthz?timeout=32s in 78 milliseconds
I0122 11:43:46.199660 17489 round_trippers.go:444] Response Headers:
After quitting the loop with Ctrl-C, I ran the curl command manually, and everything seemed OK:
curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.13.2 (linux/amd64) kubernetes/cff46ab" 'https://<haproxyserver-dns>:6443/healthz?timeout=32s'
* About to connect() to <haproxyserver-dns> port 6443 (#0)
* Trying <haproxyserver-ip>...
* Connected to <haproxyserver-dns> (10.135.64.223) port 6443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* skipping SSL peer certificate verification
* NSS: client certificate not found (nickname not specified)
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
* subject: CN=kube-apiserver
* start date: Jan 22 03:43:38 2019 GMT
* expire date: Jan 22 03:43:38 2020 GMT
* common name: kube-apiserver
* issuer: CN=kubernetes
> GET /healthz?timeout=32s HTTP/1.1
> Host: <haproxyserver-dns>:6443
> Accept: application/json, */*
> User-Agent: kubeadm/v1.13.2 (linux/amd64) kubernetes/cff46ab
>
< HTTP/1.1 200 OK
< Date: Tue, 22 Jan 2019 04:09:03 GMT
< Content-Length: 2
< Content-Type: text/plain; charset=utf-8
<
* Connection #0 to host <haproxyserver-dns> left intact
ok
I don't know how to find the root cause of this issue. I hope someone who knows about this can give me some suggestions. Thanks!
After several days of searching and experimenting, I managed to solve this problem myself. The problem probably stems from a fairly rare situation:
I had set a proxy on the master node in both /etc/profile and docker.service.d, which caused the requests to HAProxy to fail.
I don't know which of the two settings caused the problem. But after adding a no-proxy rule, the problem was solved and kubeadm successfully initialized a master behind the HAProxy load balancer. Here are my proxy settings:
/etc/profile:
...
export http_proxy=http://<my-proxy-server-dns:port>/
export no_proxy=<my-k8s-master-loadbalance-server-dns>,<my-proxy-server-dns>,localhost
/etc/systemd/system/docker.service.d/http-proxy.conf:
[Service]
Environment="HTTP_PROXY=http://<my-proxy-server-dns:port>/" "NO_PROXY=<my-k8s-master-loadbalance-server-dns>,<my-proxy-server-dns>,localhost, 127.0.0.0/8, 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16"
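For anyone wondering why the no-proxy rule matters: an HTTP client that honours these variables sends every request through the proxy unless the target host matches an entry in the no_proxy list. The Python sketch below illustrates that decision in a deliberately simplified form. bypasses_proxy is a hypothetical function, the host names in the usage lines are made up, and real clients each apply their own matching rules (kubeadm is a Go program, and its rules differ in details such as CIDR handling), so treat this as an illustration only.

```python
def bypasses_proxy(host, no_proxy):
    """Return True if `host` matches an entry of a comma-separated no_proxy list.

    Simplified on purpose: exact host names and domain suffixes only,
    no CIDR ranges, no wildcards, no IPv6 literals.
    """
    hostname = host.lower().split(":")[0]  # drop an optional :port
    for entry in no_proxy.split(","):
        entry = entry.strip().lower().lstrip(".")
        if not entry:
            continue
        if hostname == entry or hostname.endswith("." + entry):
            return True
    return False


# Hypothetical host names standing in for the placeholders in the settings.
rule = "lb.example.internal,proxy.example.internal,localhost"
print(bypasses_proxy("lb.example.internal:6443", rule))
print(bypasses_proxy("kubernetes.io", rule))
```

With the load balancer's name missing from the list, the health check would be routed through the proxy, which is consistent with what I observed.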

Kubernetes v1.12 Problems with kubectl exec

I’ve been learning about Kubernetes using Kelsey Hightower’s excellent kubernetes-the-hard-way-guide.
Using this guide I’ve installed v1.12 on GCE. Everything works perfectly apart from kubectl exec:
$ kubectl exec -it shell-demo -- /bin/bash --kubeconfig=/root/certsconfigs/admin.kubeconfig
error: unable to upgrade connection: Forbidden (user=kubernetes, verb=create, resource=nodes, subresource=proxy)
Note that I have set KUBECONFIG=/root/certsconfigs/admin.kubeconfig.
Apart from exec, all other kubectl functions work as expected with this admin.kubeconfig file, so I deduce that it is valid for use with my cluster.
I’m pretty sure I have made a beginner’s mistake somewhere, but if somebody could advise where I have gone astray, I would be most grateful.
TIA
Shaun
I have double checked that no .kube/config file exists anywhere on my master controller:
root@controller-1:/root/deployment/kubernetes# kubectl get pods
NAME         READY   STATUS    RESTARTS   AGE
shell-demo   1/1     Running   0          23m
Here is the output with -v8:
root@controller-1:/root/deployment/kubernetes# kubectl -v8 exec -it shell-demo -- /bin/bash
I1118 15:18:16.898428 11117 loader.go:359] Config loaded from file /root/certsconfigs/admin.kubeconfig
I1118 15:18:16.899531 11117 loader.go:359] Config loaded from file /root/certsconfigs/admin.kubeconfig
I1118 15:18:16.900611 11117 loader.go:359] Config loaded from file /root/certsconfigs/admin.kubeconfig
I1118 15:18:16.902851 11117 round_trippers.go:383] GET https://127.0.0.1:6443/api/v1/namespaces/default/pods/shell-demo
I1118 15:18:16.902946 11117 round_trippers.go:390] Request Headers:
I1118 15:18:16.903016 11117 round_trippers.go:393] Accept: application/json, */*
I1118 15:18:16.903091 11117 round_trippers.go:393] User-Agent: kubectl/v1.12.0 (linux/amd64) kubernetes/0ed3388
I1118 15:18:16.918699 11117 round_trippers.go:408] Response Status: 200 OK in 15 milliseconds
I1118 15:18:16.918833 11117 round_trippers.go:411] Response Headers:
I1118 15:18:16.918905 11117 round_trippers.go:414] Content-Type: application/json
I1118 15:18:16.918974 11117 round_trippers.go:414] Content-Length: 2176
I1118 15:18:16.919053 11117 round_trippers.go:414] Date: Sun, 18 Nov 2018 15:18:16 GMT
I1118 15:18:16.919218 11117 request.go:942] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"shell-demo","namespace":"default","selfLink":"/api/v1/namespaces/default/pods/shell-demo","uid":"99f320f8-eb42-11e8-a053-42010af0000b","resourceVersion":"13213","creationTimestamp":"2018-11-18T14:59:51Z"},"spec":{"volumes":[{"name":"shared-data","emptyDir":{}},{"name":"default-token-djprb","secret":{"secretName":"default-token-djprb","defaultMode":420}}],"containers":[{"name":"nginx","image":"nginx","resources":{},"volumeMounts":[{"name":"shared-data","mountPath":"/usr/share/nginx/html"},{"name":"default-token-djprb","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"Always"}],"restartPolicy":"Always","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","serviceAccountName":"default","serviceAccount":"default","nodeName":"worker-1","securityContext":{},"schedulerName":"default-scheduler","tolerations":[{"key":"node.kubernet [truncated 1152 chars]
I1118 15:18:16.925240 11117 round_trippers.go:383] POST …
error: unable to upgrade connection: Forbidden (user=kubernetes, verb=create, resource=nodes, subresource=proxy)
According to your logs, the connection between kubectl and the apiserver is fine and is being authenticated correctly.
To satisfy an exec request, the apiserver contacts the kubelet running the pod, and that connection is what is being forbidden.
Your kubelet is configured to authenticate/authorize requests, and the apiserver credential is not authorized to make the exec request against the kubelet's API.
Based on the forbidden message, your apiserver is authenticating as the "kubernetes" user to the kubelet.
You can grant that user full permissions to the kubelet API with the following command:
kubectl create clusterrolebinding apiserver-kubelet-admin --user=kubernetes --clusterrole=system:kubelet-api-admin
See the following docs for more information:
https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-authentication-authorization/#overview
https://kubernetes.io/docs/reference/access-authn-authz/rbac/#other-component-roles
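If you prefer a manifest over the one-liner, the binding can be derived from the error message itself. Below is a small Python sketch: parse_forbidden pulls the user out of the Forbidden message, and cluster_role_binding builds a ClusterRoleBinding object that should be equivalent to the kubectl create clusterrolebinding command above. Both function names are hypothetical, and the regular expression is only written against the message format shown in this question.

```python
import re

# Matches messages like:
#   Forbidden (user=kubernetes, verb=create, resource=nodes, subresource=proxy)
FORBIDDEN_RE = re.compile(
    r"Forbidden \(user=(?P<user>[^,]+), verb=(?P<verb>[^,]+), "
    r"resource=(?P<resource>[^,)]+)(?:, subresource=(?P<subresource>[^)]+))?\)"
)


def parse_forbidden(message):
    """Extract user, verb, resource and subresource, or None if no match."""
    match = FORBIDDEN_RE.search(message)
    return match.groupdict() if match else None


def cluster_role_binding(name, user, cluster_role):
    """Build a ClusterRoleBinding that grants `cluster_role` to `user`."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": name},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": cluster_role,
        },
        "subjects": [{
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "User",
            "name": user,
        }],
    }
```

Dumping the result of cluster_role_binding("apiserver-kubelet-admin", "kubernetes", "system:kubelet-api-admin") as JSON gives something you can feed to kubectl apply -f.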

Kubectl delete -f deployments/ --grace-period=0 --force does not work

What happened:
Force terminate does not work:
[root@master0 manifests]# kubectl delete -f prometheus/deployment.yaml --grace-period=0 --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
deployment.extensions "prometheus-core" force deleted
^C <---- Manual Quit due to hanging. Waited over 5 minutes with no change.
[root@master0 manifests]# kubectl -n monitoring get pods
NAME                                  READY   STATUS        RESTARTS   AGE
alertmanager-668794449d-6dppl         0/1     Terminating   0          22h
grafana-core-576c68c58d-7nvbt         0/1     Terminating   0          22h
kube-state-metrics-69b9d65dd5-rl8td   0/1     Terminating   0          3h
node-directory-size-metrics-6hcfc     2/2     Running       0          3h
node-directory-size-metrics-w7zxh     2/2     Running       0          3h
node-directory-size-metrics-z2m5j     2/2     Running       0          3h
prometheus-core-59778c7987-vh89h      0/1     Terminating   0          3h
prometheus-node-exporter-27fjg        1/1     Running       0          3h
prometheus-node-exporter-2t5v6        1/1     Running       0          3h
prometheus-node-exporter-hhxmv        1/1     Running       0          3h
Then
What you expected to happen:
Pod to be deleted
How to reproduce it (as minimally and precisely as possible):
We suspect there might have been an I/O error with the storage on the pods. Kubernetes has its own dedicated direct storage. Everything is hosted on AWS, using t3.xl instances.
Anything else we need to know?:
It seems to happen randomly, but often enough that we have to reboot the entire cluster. Pods stuck in Terminating can be OK to deal with, but having no logs and no way to really force-remove them and start again is frustrating.
Environment:
- Kubernetes version (use kubectl version):
kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:08:34Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
- Cloud provider or hardware configuration:
AWS
- OS (e.g. from /etc/os-release):
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
Kernel (e.g. uname -a):
Linux 3.10.0-862.6.3.el7.x86_64 #1 SMP Tue Jun 26 16:32:21 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Install tools:
Kubernetes was deployed with Kubespray, with GlusterFS as a container volume and Weave as its networking.
Others:
2 master 1 node setup. We have redeployed the entire setup and still get hit by the same issue.
I have posted this question on their issues page:
https://github.com/kubernetes/kubernetes/issues/68829
But no reply.
Logs from API:
[root@master0 manifests]# kubectl -n monitoring delete pod prometheus-core-59778c7987-bl2h4 --force --grace-period=0 -v9
I0919 13:53:08.770798 19973 loader.go:359] Config loaded from file /root/.kube/config
I0919 13:53:08.771440 19973 loader.go:359] Config loaded from file /root/.kube/config
I0919 13:53:08.772681 19973 loader.go:359] Config loaded from file /root/.kube/config
I0919 13:53:08.780266 19973 loader.go:359] Config loaded from file /root/.kube/config
I0919 13:53:08.780943 19973 loader.go:359] Config loaded from file /root/.kube/config
I0919 13:53:08.781609 19973 loader.go:359] Config loaded from file /root/.kube/config
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
I0919 13:53:08.781876 19973 request.go:897] Request Body: {"gracePeriodSeconds":0,"propagationPolicy":"Foreground"}
I0919 13:53:08.781938 19973 round_trippers.go:386] curl -k -v -XDELETE -H "Accept: application/json" -H "Content-Type: application/json" -H "User-Agent: kubectl/v1.11.0 (linux/amd64) kubernetes/91e7b4f" 'https://10.1.1.28:6443/api/v1/namespaces/monitoring/pods/prometheus-core-59778c7987-bl2h4'
I0919 13:53:08.798682 19973 round_trippers.go:405] DELETE https://10.1.1.28:6443/api/v1/namespaces/monitoring/pods/prometheus-core-59778c7987-bl2h4 200 OK in 16 milliseconds
I0919 13:53:08.798702 19973 round_trippers.go:411] Response Headers:
I0919 13:53:08.798709 19973 round_trippers.go:414] Content-Type: application/json
I0919 13:53:08.798714 19973 round_trippers.go:414] Content-Length: 3199
I0919 13:53:08.798719 19973 round_trippers.go:414] Date: Wed, 19 Sep 2018 13:53:08 GMT
I0919 13:53:08.798758 19973 request.go:897] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"prometheus-core-59778c7987-bl2h4","generateName":"prometheus-core-59778c7987-","namespace":"monitoring","selfLink":"/api/v1/namespaces/monitoring/pods/prometheus-core-59778c7987-bl2h4","uid":"7647d17a-bc11-11e8-bd71-06b8eceafd88","resourceVersion":"676465","creationTimestamp":"2018-09-19T13:39:41Z","deletionTimestamp":"2018-09-19T13:40:18Z","deletionGracePeriodSeconds":0,"labels":{"app":"prometheus","component":"core","pod-template-hash":"1533473543"},"ownerReferences":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"prometheus-core-59778c7987","uid":"75aba047-bc11-11e8-bd71-06b8eceafd88","controller":true,"blockOwnerDeletion":true}],"finalizers":["foregroundDeletion"]},"spec":{"volumes":[{"name":"config-volume","configMap":{"name":"prometheus-core","defaultMode":420}},{"name":"rules-volume","configMap":{"name":"prometheus-rules","defaultMode":420}},{"name":"api-token","secret":{"secretName":"api-token","defaultMode":420}},{"name":"ca-crt","secret":{"secretName":"ca-crt","defaultMode":420}},{"name":"prometheus-k8s-token-trclf","secret":{"secretName":"prometheus-k8s-token-trclf","defaultMode":420}}],"containers":[{"name":"prometheus","image":"prom/prometheus:v1.7.0","args":["-storage.local.retention=12h","-storage.local.memory-chunks=500000","-config.file=/etc/prometheus/prometheus.yaml","-alertmanager.url=http://alertmanager:9093/"],"ports":[{"name":"webui","containerPort":9090,"protocol":"TCP"}],"resources":{"limits":{"cpu":"500m","memory":"500M"},"requests":{"cpu":"500m","memory":"500M"}},"volumeMounts":[{"name":"config-volume","mountPath":"/etc/prometheus"},{"name":"rules-volume","mountPath":"/etc/prometheus-rules"},{"name":"api-token","mountPath":"/etc/prometheus-token"},{"name":"ca-crt","mountPath":"/etc/prometheus-ca"},{"name":"prometheus-k8s-token-trclf","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"terminationMes
sagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent"}],"restartPolicy":"Always","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","serviceAccountName":"prometheus-k8s","serviceAccount":"prometheus-k8s","nodeName":"master1.infra.cde","securityContext":{},"schedulerName":"default-scheduler"},"status":{"phase":"Pending","conditions":[{"type":"Initialized","status":"True","lastProbeTime":null,"lastTransitionTime":"2018-09-19T13:39:41Z"},{"type":"Ready","status":"False","lastProbeTime":null,"lastTransitionTime":"2018-09-19T13:39:41Z","reason":"ContainersNotReady","message":"containers with unready status: [prometheus]"},{"type":"ContainersReady","status":"False","lastProbeTime":null,"lastTransitionTime":null,"reason":"ContainersNotReady","message":"containers with unready status: [prometheus]"},{"type":"PodScheduled","status":"True","lastProbeTime":null,"lastTransitionTime":"2018-09-19T13:39:41Z"}],"hostIP":"10.1.1.187","startTime":"2018-09-19T13:39:41Z","containerStatuses":[{"name":"prometheus","state":{"terminated":{"exitCode":0,"startedAt":null,"finishedAt":null}},"lastState":{},"ready":false,"restartCount":0,"image":"prom/prometheus:v1.7.0","imageID":""}],"qosClass":"Guaranteed"}}
pod "prometheus-core-59778c7987-bl2h4" force deleted
I0919 13:53:08.798864 19973 round_trippers.go:386] curl -k -v -XGET -H "Accept: application/json" -H "User-Agent: kubectl/v1.11.0 (linux/amd64) kubernetes/91e7b4f" 'https://10.1.1.28:6443/api/v1/namespaces/monitoring/pods/prometheus-core-59778c7987-bl2h4'
I0919 13:53:08.801386 19973 round_trippers.go:405] GET https://10.1.1.28:6443/api/v1/namespaces/monitoring/pods/prometheus-core-59778c7987-bl2h4 200 OK in 2 milliseconds
I0919 13:53:08.801403 19973 round_trippers.go:411] Response Headers:
I0919 13:53:08.801409 19973 round_trippers.go:414] Content-Type: application/json
I0919 13:53:08.801415 19973 round_trippers.go:414] Content-Length: 3199
I0919 13:53:08.801420 19973 round_trippers.go:414] Date: Wed, 19 Sep 2018 13:53:08 GMT
I0919 13:53:08.801465 19973 request.go:897] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"prometheus-core-59778c7987-bl2h4","generateName":"prometheus-core-59778c7987-","namespace":"monitoring","selfLink":"/api/v1/namespaces/monitoring/pods/prometheus-core-59778c7987-bl2h4","uid":"7647d17a-bc11-11e8-bd71-06b8eceafd88","resourceVersion":"676465","creationTimestamp":"2018-09-19T13:39:41Z","deletionTimestamp":"2018-09-19T13:40:18Z","deletionGracePeriodSeconds":0,"labels":{"app":"prometheus","component":"core","pod-template-hash":"1533473543"},"ownerReferences":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"prometheus-core-59778c7987","uid":"75aba047-bc11-11e8-bd71-06b8eceafd88","controller":true,"blockOwnerDeletion":true}],"finalizers":["foregroundDeletion"]},"spec":{"volumes":[{"name":"config-volume","configMap":{"name":"prometheus-core","defaultMode":420}},{"name":"rules-volume","configMap":{"name":"prometheus-rules","defaultMode":420}},{"name":"api-token","secret":{"secretName":"api-token","defaultMode":420}},{"name":"ca-crt","secret":{"secretName":"ca-crt","defaultMode":420}},{"name":"prometheus-k8s-token-trclf","secret":{"secretName":"prometheus-k8s-token-trclf","defaultMode":420}}],"containers":[{"name":"prometheus","image":"prom/prometheus:v1.7.0","args":["-storage.local.retention=12h","-storage.local.memory-chunks=500000","-config.file=/etc/prometheus/prometheus.yaml","-alertmanager.url=http://alertmanager:9093/"],"ports":[{"name":"webui","containerPort":9090,"protocol":"TCP"}],"resources":{"limits":{"cpu":"500m","memory":"500M"},"requests":{"cpu":"500m","memory":"500M"}},"volumeMounts":[{"name":"config-volume","mountPath":"/etc/prometheus"},{"name":"rules-volume","mountPath":"/etc/prometheus-rules"},{"name":"api-token","mountPath":"/etc/prometheus-token"},{"name":"ca-crt","mountPath":"/etc/prometheus-ca"},{"name":"prometheus-k8s-token-trclf","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"terminationMes
sagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent"}],"restartPolicy":"Always","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","serviceAccountName":"prometheus-k8s","serviceAccount":"prometheus-k8s","nodeName":"master1.infra.cde","securityContext":{},"schedulerName":"default-scheduler"},"status":{"phase":"Pending","conditions":[{"type":"Initialized","status":"True","lastProbeTime":null,"lastTransitionTime":"2018-09-19T13:39:41Z"},{"type":"Ready","status":"False","lastProbeTime":null,"lastTransitionTime":"2018-09-19T13:39:41Z","reason":"ContainersNotReady","message":"containers with unready status: [prometheus]"},{"type":"ContainersReady","status":"False","lastProbeTime":null,"lastTransitionTime":null,"reason":"ContainersNotReady","message":"containers with unready status: [prometheus]"},{"type":"PodScheduled","status":"True","lastProbeTime":null,"lastTransitionTime":"2018-09-19T13:39:41Z"}],"hostIP":"10.1.1.187","startTime":"2018-09-19T13:39:41Z","containerStatuses":[{"name":"prometheus","state":{"terminated":{"exitCode":0,"startedAt":null,"finishedAt":null}},"lastState":{},"ready":false,"restartCount":0,"image":"prom/prometheus:v1.7.0","imageID":""}],"qosClass":"Guaranteed"}}
I0919 13:53:08.801758 19973 round_trippers.go:386] curl -k -v -XGET -H "Accept: application/json" -H "User-Agent: kubectl/v1.11.0 (linux/amd64) kubernetes/91e7b4f" 'https://10.1.1.28:6443/api/v1/namespaces/monitoring/pods?fieldSelector=metadata.name%3Dprometheus-core-59778c7987-bl2h4&resourceVersion=676465&watch=true'
I0919 13:53:08.803409 19973 round_trippers.go:405] GET https://10.1.1.28:6443/api/v1/namespaces/monitoring/pods?fieldSelector=metadata.name%3Dprometheus-core-59778c7987-bl2h4&resourceVersion=676465&watch=true 200 OK in 1 milliseconds
I0919 13:53:08.803424 19973 round_trippers.go:411] Response Headers:
I0919 13:53:08.803430 19973 round_trippers.go:414] Date: Wed, 19 Sep 2018 13:53:08 GMT
I0919 13:53:08.803436 19973 round_trippers.go:414] Content-Type: application/json
After some investigation and help from the Kubernetes community over on GitHub, we found the solution. Kubernetes 1.11.0 has a known bug that causes this issue. Upgrading to 1.12.0 resolved it for us, and the fix is reported to be included in 1.11.1 as well.
Thanks to cduchesne https://github.com/kubernetes/kubernetes/issues/68829#issuecomment-422878108
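To check whether a cluster is on a release that should already contain the fix, a small version comparison can help. This is only a sketch: version_ge is a helper name made up for illustration, it relies on GNU sort -V, and the 1.11.1 threshold is simply the version mentioned above. Feed it the server version that kubectl version reports for your cluster.

```shell
#!/usr/bin/env bash
# Returns 0 when version $1 is greater than or equal to version $2.
# Hypothetical helper; relies on `sort -V` (GNU coreutils version sort).
# A leading "v" (as in "v1.11.0") is stripped before comparing.
version_ge() {
  local have="${1#v}" want="${2#v}"
  # If the smaller of the two versions is the wanted one, we are new enough.
  [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n1)" = "$want" ]
}

# Example with the versions mentioned in this thread.
for v in v1.11.0 v1.11.1 v1.12.0; do
  if version_ge "$v" 1.11.1; then
    echo "$v: should include the fix"
  else
    echo "$v: affected, consider upgrading"
  fi
done
```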
Sometimes Kubernetes workers run into problems such as zombie processes, kernel panics, or I/O wait. When you then try to delete a pod that uses storage and generates a lot of IOPS, like the Prometheus DB, the worker cannot kill that pod.
I had the same situation as you, but on Container Linux without any cloud platform like AWS or Gcloud. I just rebooted the broken worker and after that deleted the pods normally, without --grace-period=0. Using --grace-period=0 is a very bad idea when your nodes and pods are running without any problems.
Workers can be rebooted when you use K8s; this is a good feature of K8s.
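The reboot approach could look roughly like the sequence below. This is a sketch, not what I ran verbatim: reboot_plan is a made-up helper that only prints the commands, and worker1 is a placeholder node name (the namespace and pod name are taken from the question's log output). The cordon and uncordon steps are my own addition to keep new pods off the node while it is down.

```shell
#!/usr/bin/env bash
# Prints (does not run) a cautious sequence for rebooting a broken worker
# and then deleting the stuck pod normally, without --grace-period=0.
# reboot_plan and its arguments are hypothetical placeholders.
reboot_plan() {
  local node="$1" namespace="$2" pod="$3"
  cat <<EOF
kubectl cordon ${node}
# reboot ${node} out of band (console, IPMI, or ssh + 'sudo reboot')
kubectl get node ${node} --watch   # wait until STATUS shows Ready
kubectl uncordon ${node}
kubectl delete pod ${pod} --namespace ${namespace}
EOF
}

reboot_plan worker1 monitoring prometheus-core-59778c7987-bl2h4
```

Review the printed commands and run them by hand, one at a time, so you can stop if the node does not come back as Ready.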
To run Prometheus as a monitoring system without I/O problems, you should split the load across several Prometheus instances with different configs, or use federation to scale Prometheus.
After you issue the kubectl delete, I would log into the nodes where the pods are running and debug with docker commands (assuming your runtime is Docker):
docker logs <container-with-issue>
docker exec -it <container-with-issue> bash # maybe the application is hanging.
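To find the right container on the node, it may help to filter docker ps by name. The sketch below assumes the kubelet's Docker integration names containers k8s_<container>_<pod>_<namespace>_...; treat that naming as an assumption and verify it on your node. container_name_prefix is a helper invented for this example, and the pod, container and namespace names come from the question's log output. The script only prints the docker command so you can inspect it before running it.

```shell
#!/usr/bin/env bash
# Builds the assumed name prefix of a pod's container under the kubelet's
# Docker integration: k8s_<container>_<pod>_<namespace>_
# container_name_prefix is a hypothetical helper, not part of any CLI.
container_name_prefix() {
  local container="$1" pod="$2" namespace="$3"
  printf 'k8s_%s_%s_%s_' "$container" "$pod" "$namespace"
}

prefix="$(container_name_prefix prometheus prometheus-core-59778c7987-bl2h4 monitoring)"

# Print the command instead of running it, so it can be reviewed first.
# --all also lists containers that have already exited.
echo "docker ps --all --filter name=${prefix}"
```

The container ID from that listing is what goes into the docker logs and docker exec commands above.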
Are you mounting any volumes for Prometheus? It could be that it's trying to release an EBS volume and the AWS API is unresponsive.
Hope it helps!