Kubernetes 1.11 could not find heapster for metrics - kubernetes

I'm using Kubernetes 1.11 on DigitalOcean. When I try to run kubectl top node I get this error:
Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)
but as stated in the docs, heapster is deprecated and no longer required as of Kubernetes 1.10.

If you are running a newer version of Kubernetes and still receiving this error, there is probably a problem with your installation.
Please note that to install metrics server on kubernetes, you should first clone it by typing:
git clone https://github.com/kodekloudhub/kubernetes-metrics-server.git
then you should install it, WITHOUT GOING INTO THE CREATED FOLDER AND WITHOUT MENTIONING A SPECIFIC YAML FILE, only via:
kubectl create -f kubernetes-metrics-server/
In this way all services and components are installed correctly and you can run:
kubectl top nodes
or
kubectl top pods
and get the correct result.

For kubectl top node/pod to work you need either heapster or the metrics server installed on your cluster.
As the warning says, heapster is being deprecated, so the recommended choice now is the metrics server.
So follow the directions here to install the metrics server.
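For reference, the upstream metrics-server project also publishes a single manifest that recent clusters can apply directly; check the project's README to confirm compatibility with your cluster version before using it:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl top nodes   # should return values once the metrics-server pod is Ready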

Related

Fresh cluster and linkerd - viz doesn't start up

I've got an issue. I'm trying to install linkerd on my cluster, and the base install goes well. I followed this official README exactly:
https://linkerd.io/2.11/tasks/install-helm/
and installed it via Helm:
MacBook-Pro-6% helm list -n default
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
linkerd2 default 1 2021-12-15 15:47:10.823551 +0100 CET deployed linkerd2-2.11.1 stable-2.11.1
linkerd itself works, and the linkerd check command does as well:
MacBook-Pro-6% linkerd version
Client version: stable-2.11.1
Server version: stable-2.11.1
but when I try to install the viz dashboard as described in the getting-started page, I run
linkerd viz install | kubectl apply -f -
and then when I run
linkerd check
...
Status check results are √
Linkerd extensions checks
=========================
/ Running viz extension check
and it keeps on checking the viz extension. When I run linkerd dashboard (deprecated, I know) it shows the same message:
Waiting for linkerd-viz extension to become available
Does anyone have a clue what I'm doing wrong? I've been stuck at this part for 2 hours and no one seems to have any answers.
Note: when I ran linkerd check after installation of viz, I got:
linkerd-viz
-----------
√ linkerd-viz Namespace exists
√ linkerd-viz ClusterRoles exist
√ linkerd-viz ClusterRoleBindings exist
√ tap API server has valid cert
√ tap API server cert is valid for at least 60 days
‼ tap API service is running
FailedDiscoveryCheck: failing or missing response from https://10.190.101.142:8089/apis/tap.linkerd.io/v1alpha1: Get "https://10.190.101.142:8089/apis/tap.linkerd.io/v1alpha1": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
see https://linkerd.io/2.11/checks/#l5d-tap-api for hints
‼ linkerd-viz pods are injected
could not find proxy container for grafana-8d54d5f6d-cv7q5 pod
see https://linkerd.io/2.11/checks/#l5d-viz-pods-injection for hints
√ viz extension pods are running
× viz extension proxies are healthy
No "linkerd-proxy" containers found in the "linkerd" namespace
see https://linkerd.io/2.11/checks/#l5d-viz-proxy-healthy for hints
Debugging
From your problem description:
‼ linkerd-viz pods are injected
could not find proxy container for grafana-8d54d5f6d-cv7q5 pod
see https://linkerd.io/2.11/checks/#l5d-viz-pods-injection for hints
and:
MacBook-Pro-6% helm list -n default
I encountered a similar problem, but with the flagger pod rather than the grafana pod (I didn't attempt to install the grafana component like you did).
A side effect of my problem is this:
$ linkerd viz dashboard
Waiting for linkerd-viz extension to become available
Waiting for linkerd-viz extension to become available
Waiting for linkerd-viz extension to become available
... ## repeating for 5 minutes or so before popping up the dashboard in browser.
The cause for my problem turned out to be that I installed the viz extension into the linkerd namespace. It should belong to the linkerd-viz namespace.
Looking at your original problem description, it seems that you installed the control plane into the default namespace (as opposed to the linkerd namespace). While you can use any namespace you want, the control plane must be in a separate namespace from the viz extension. Details can be seen in the discussion I wrote here:
https://github.com/linkerd/website/issues/1309
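As a quick sanity check (the namespace names below are the linkerd defaults; adjust them if you overrode them), you can confirm where each piece actually landed:
# control plane pods are expected in the linkerd namespace
kubectl get pods -n linkerd
# the viz extension is expected in its own linkerd-viz namespace
kubectl get pods -n linkerd-viz
# confirm which namespace each Helm release was installed into
helm list -A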

How to find the pod that led to an error in GKE

If I look at my logs in GCP Logging, I see, for instance, that I got a request that returned a 500:
log_message: "Method: some_cloud_goo.Endpoint failed: INTERNAL_SERVER_ERROR"
I would like to quickly go to that pod and do a kubectl logs on it. But I did not find a way to do this.
I am fairly new to k8s and GKE. Is there any way to trace back the pod that handled that request?
You could run kubectl get pods to check the status of all pods and figure out which one failed, then get a detailed description of the error by running kubectl describe pod <pod-name>.
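As a rough sketch of tying the log entry back to a pod (the filter assumes the default GKE logging agent and uses a placeholder error string; depending on how your app logs, you may need jsonPayload fields instead of textPayload):
# search Cloud Logging for the error and note which pod emitted it
gcloud logging read 'resource.type="k8s_container" AND textPayload:"INTERNAL_SERVER_ERROR"' --limit=10 --format='value(resource.labels.namespace_name, resource.labels.pod_name)'
# then pull that pod's logs directly
kubectl logs <pod-name> -n <namespace>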
As mentioned in @Neelam's answer, you can get the pod names with the command kubectl get pods -A and go through your pods' logs to find the error.
Alternatively, you could deploy a monitoring stack like Elastic GKE Logging, available as a GCP Click-to-Deploy solution.
See here to install it from the Marketplace in a few clicks.
It is a free way to get a complete monitoring system, and you can filter your logs in the Kibana dashboard once it is deployed.

kubernetes kubectl from another node to control plane: x509: certificate signed by unknown authority

I've set up a 3-node cluster following https://linuxacademy.com/blog/containers/building-a-three-node-kubernetes-cluster-quick-guide/. All the nodes are visible from the control plane. When I try to run:
kubectl get nodes
from a worker node, however, I get:
x509: certificate signed by unknown authority.
If I try:
kubectl get nodes --insecure-skip-tls-verify=true
I get:
the server doesn't have a resource type "nodes"
The api-server logs:
Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",
"message":"pods \"kube-apiserver-user1c.mylabserver.com\" not found",
"reason":"NotFound","details":{"name":"kube-apiserver-user1c.mylabserver.com",
"kind":"pods"},"code":404}
The pod kube-apiserver-user1c.mylabserver.com very much does exist, however.
Logs for api-server show:
http: TLS handshake error from worker_node_ip:37596: remote error: tls: bad certificate
So it very much looks like it doesn't like the certificate. I haven't been able to solve this issue. Any help is appreciated.
I followed the steps from this article and checked this lab on LinuxAcademy.
The article was posted on March 20, 2019, when the Kubernetes version was ~1.12, which is quite old (the current version is 1.18).
However, the article uses packages.cloud.google.com, which downloads the latest versions of kubectl and kubeadm. If you follow it, you will end up with the latest versions of Docker (19.03), kubeadm (1.18), and kubectl (1.18), but the CNI link in the article points to a Flannel manifest that was compatible with Kubernetes 1.12.
If you followed this article today, you would get errors like:
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml": no matches for kind "DaemonSet" in version "extensions/v1beta1"
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml": no matches for kind "DaemonSet" in version "extensions/v1beta1"
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml": no matches for kind "DaemonSet" in version "extensions/v1beta1"
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml": no matches for kind "DaemonSet" in version "extensions/v1beta1"
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml": no matches for kind "DaemonSet" in version "extensions/v1beta1"
This issue occurs because between Kubernetes 1.15 and 1.16 there was a big change in apiVersions: DaemonSet moved from extensions/v1beta1 to apps/v1.
In the LinuxAcademy lab, they used a Flannel CNI version that matched their stack, which was Kubernetes and kubeadm 1.12:
$ sudo apt-get install -y kubelet=1.12.7-00 kubeadm=1.12.7-00 kubectl=1.12.7-00
For the current version of Kubernetes you should use this link to apply the CNI, for example as sketched below.
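For example, at the time this answer targets (~Kubernetes 1.18) the Flannel manifest was applied from the project's master branch; verify the URL against the Flannel documentation for your version before using it:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml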
Regarding the issue with certificates, when you use the command:
$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16
you get all the default settings (including certs). More information can be found in Kubeadm Initialization.
After that you should get a join command, which should be used on the worker nodes to add them to the cluster; it will configure each node automatically. More information about join can be found here.
A good exercise would be to try installing a cluster with kubeadm based on the Kubernetes docs.
The last thing I want to mention is that, as you have 1 master node and 2 worker nodes, you should execute kubectl commands only on the master node; worker nodes don't normally run kubectl (unless you copy the admin kubeconfig to them).
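A minimal sketch of that whole flow, with placeholder values for the token and CA-cert hash (the real values are printed by kubeadm init):
# on the master node
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
# make kubectl work for your regular user on the master
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# on each worker node, run the join command printed by kubeadm init
sudo kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>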

Nginx Ingress Controller Installation Error, "dial tcp 10.96.0.1:443: i/o timeout"

I'm trying to set up a Kubernetes cluster with kubeadm and Vagrant. I hit an error while installing the NGINX ingress controller: a timeout when the pod tries to retrieve the ConfigMap through the Kubernetes API. I have looked around and tried to apply the suggested solutions, still with no luck, which is why I'm writing this post.
Environment:
I'm using Vagrant to set up 2 nodes with the ubuntu/xenial image.
kmaster
-------------------------------------------
network:
Adapter1: NAT
Adapter2: HostOnly-network, IP:192.168.2.71
kworker1
-------------------------------------------
network:
Adapter1: NAT
Adapter2: HostOnly-network, IP:192.168.2.72
I followed the kubeadm guide to set up the cluster:
[Setup kubernetes with kubeadm]
And my cluster init command is as below:
kubeadm init --pod-network-cidr=192.168.0.0/16 --apiserver-advertise-address=192.168.2.71
and I applied the Calico network plugin and policy manifests:
kubectl apply -f \
https://docs.projectcalico.org/v3.4/getting-started/kubernetes/installation/hosted/etcd.yaml
kubectl apply -f \
https://docs.projectcalico.org/v3.4/getting-started/kubernetes/installation/hosted/calico.yaml
(Calico is the plugin I currently installed successfully; I will write another post about the Flannel plugin, with which pods were unable to access the service.)
I'm using Helm to install the ingress controller, following the tutorial:
https://kubernetes.github.io/ingress-nginx/deploy/
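The Helm install from that page looks roughly like this (the exact command may differ depending on the controller version, so check the linked page):
helm upgrade --install ingress-nginx ingress-nginx --repo https://kubernetes.github.io/ingress-nginx --namespace ingress-nginx --create-namespace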
This is the error that occurred once I applied the Helm deploy command, when I describe the pod:
I'd appreciate it if someone could help. I know the reason is that the pod is unable to access the Kubernetes API, but shouldn't this already be enabled by Kubernetes by default?
My kube-system pod statuses are as below:
Other solutions suggested on the Kubernetes official website:
1) Install kube-proxy as a sidecar. I'm still new to Kubernetes and I'm looking for an example of how to install kube-proxy as a sidecar; I'd appreciate it if someone could provide one.
2) Use client-go. I'm very confused when I read this; it seems to use the go command to pull a Go program, and I have no clue how it works with Kubernetes pods.
You guys are right. I tested with a DigitalOcean droplet and it works as expected; I now hit another error, "forbidden, user service account not permitted". It looks like the pod is able to access the Kubernetes API already. I also tested installing Istio, with which I had encountered the same issue before, and it now works on the DigitalOcean droplet.
Thank you guys.

How to install influxdb and grafana?

I tried to use the instructions from this link https://github.com/kubernetes/heapster/blob/master/docs/influxdb.md but I was not able to install it. Specifically, I don't know what this instruction means: "Ensure that kubecfg.sh is exported." I don't even know where I can find this; I ran sudo find / -name "kubecfg.sh" and found no results.
Moving on to the next step, kubectl create -f deploy/kube-config/influxdb/, when I did this it said kube-system not found. I am using the latest version of Kubernetes, version 1.0.1.
These instructions seem broken; can anyone provide some instructions on how to install this? I have a Kubernetes cluster up and running, I was able to create and delete pods and so on, and default is the only namespace I have when I do kubectl get pods,svc,rc --all-namespaces.
Changing kube-system to default in the YAML files gets me one step further, but I am unable to access the UI and so on, so creating kube-system makes more sense. However, I don't know how to do it, and any instructions on installing InfluxDB and Grafana to get them up and running would be very helpful.
I am using latest version of kubernetes version 1.0.1
FYI, the latest version is v1.2.3.
... it says kube-system not found
You can create the kube-system namespace by running
kubectl create namespace kube-system.
Hopefully once you've created the kube-system namespace the rest of the instructions will work.
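Putting that together with the step from the heapster docs, the sequence would simply be:
kubectl create namespace kube-system
kubectl create -f deploy/kube-config/influxdb/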
We had the same issue deploying grafana/influxdb. So we dug into it:
Per https://github.com/kubernetes/heapster/blob/master/docs/influxdb.md, since we don't have an external load balancer, we changed the service type on the Grafana service to NodePort, which made it accessible at port 30397.
Then we looked at the controller configuration here: https://github.com/kubernetes/heapster/blob/master/deploy/kube-config/influxdb/influxdb-grafana-controller.yaml and noticed the comment about using the api-server proxy, which we wouldn't be doing by exposing the NodePort, so we deleted the GF_SERVER_ROOT_URL environment variable from the config. At that point Grafana at least seemed to be running, but it looked like it was having trouble reaching InfluxDB.
We then changed the datasource to use localhost instead of monitoring-influxdb and were able to connect. We're getting data on cluster usage now, though individual pod data doesn't seem to be working.
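If you want to reproduce the NodePort change above, a minimal sketch (the service name monitoring-grafana matches the heapster influxdb manifests; adjust it if yours differs):
kubectl -n kube-system patch service monitoring-grafana -p '{"spec":{"type":"NodePort"}}'
# note the node port assigned in the output
kubectl -n kube-system get service monitoring-grafana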