How to fix weave-net CrashLoopBackOff for the second node? - kubernetes

I have got 2 VMs nodes. Both see each other either by hostname (through /etc/hosts) or by ip address. One has been provisioned with kubeadm as a master. Another as a worker node. Following the instructions (http://kubernetes.io/docs/getting-started-guides/kubeadm/) I have added weave-net. The list of pods looks like the following:
vagrant#vm-master:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system etcd-vm-master 1/1 Running 0 3m
kube-system kube-apiserver-vm-master 1/1 Running 0 5m
kube-system kube-controller-manager-vm-master 1/1 Running 0 4m
kube-system kube-discovery-982812725-x2j8y 1/1 Running 0 4m
kube-system kube-dns-2247936740-5pu0l 3/3 Running 0 4m
kube-system kube-proxy-amd64-ail86 1/1 Running 0 4m
kube-system kube-proxy-amd64-oxxnc 1/1 Running 0 2m
kube-system kube-scheduler-vm-master 1/1 Running 0 4m
kube-system kubernetes-dashboard-1655269645-0swts 1/1 Running 0 4m
kube-system weave-net-7euqt 2/2 Running 0 4m
kube-system weave-net-baao6 1/2 CrashLoopBackOff 2 2m
CrashLoopBackOff appears for each worker node connected. I have spent several ours playing with network interfaces, but it seems the network is fine. I have found similar question, where the answer advised to look into the logs and no follow up. So, here are the logs:
vagrant#vm-master:~$ kubectl logs weave-net-baao6 -c weave --namespace=kube-system
2016-10-05 10:48:01.350290 I | error contacting APIServer: Get https://100.64.0.1:443/api/v1/nodes: dial tcp 100.64.0.1:443: getsockopt: connection refused; trying with blank env vars
2016-10-05 10:48:01.351122 I | error contacting APIServer: Get http://localhost:8080/api: dial tcp [::1]:8080: getsockopt: connection refused
Failed to get peers
What I am doing wrong? Where to go from there?

I ran in the same issue too. It seems weaver wants to connect to the Kubernetes Cluster IP address, which is virtual. Just run this to find the cluster ip:
kubectl get svc. It should give you something like this:
$ kubectl get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes 100.64.0.1 <none> 443/TCP 2d
Weaver picks up this IP and tries to connect to it, but worker nodes does not know anything about it. Simple route will solve this issue. On all your worker nodes, execute:
route add 100.64.0.1 gw <your real master IP>

this happens with a single node setup, too. I tried several things like reapplying the configuration and recreation, but the most stable way at the moment is to perform a full tear down (as described in docs) and put the cluster up again.
I use these scripts for relaunching the cluster:
down.sh
#!/bin/bash
systemctl stop kubelet;
docker rm -f -v $(docker ps -q);
find /var/lib/kubelet | xargs -n 1 findmnt -n -t tmpfs -o TARGET -T | uniq | xargs -r umount -v;
rm -r -f /etc/kubernetes /var/lib/kubelet /var/lib/etcd;
up.sh
#!/bin/bash
systemctl start kubelet
kubeadm init
# kubectl taint nodes --all dedicated- # single node!
kubectl create -f https://git.io/weave-kube
edit: I would also give other Pod networks a try, like Calico, if this is a weave related issue

The most common causes for this may be:
- presence of a firewall (e.g. firewalld on CentOS)
- network configuration (e.g. default NAT interface on VirtualBox)
Currently kubeadm is still alpha, and this is one of the issues that has already been reported by many of the alpha testers. We are looking into fixing this by documenting the most common problems, such documentation is going to be ready closer to beta version.
Right there exists a VirtualBox+Vargant+Ansible for Ubunutu and CentOS reference implementation that provides solutions for firewall, SELinux and VirtualBox NAT issues.

/usr/local/bin/weave reset
was the fix for me - Hope its useful - and yes make sure selinux is set to disabled
and firewalld is not running (on redhat / centos) releases
kube-system weave-net-2vlvj 2/2 Running 3 11d
kube-system weave-net-42k6p 1/2 Running 3 11d
kube-system weave-net-wvsk5 2/2 Running 3 11d

Related

Get https://[10.96.0.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.96.0.1:443: i/o timeout]

my pod stucks in ContainerCreating status with this massage :
Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "483590313b7fd092fe5eeec92356152721df3e971d942174464ac5a3f1529898" network for pod "my-nginx": networkPlugin cni failed to set up pod "my-nginx_default" network: CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "483590313b7fd092fe5eeec92356152721df3e971d942174464ac5a3f1529898", failed to clean up sandbox container "483590313b7fd092fe5eeec92356152721df3e971d942174464ac5a3f1529898" network for pod "my-nginx": networkPlugin cni failed to teardown pod "my-nginx_default" network: error getting ClusterInformation: Get https://[10.96.0.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.96.0.1:443: i/o timeout]
the state of worker node is Ready .
but the output of kubectl get pods -n kube-system seems to have issues :
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-6dfcd885bf-ktbhb 1/1 Running 0 22h
calico-node-4fs2v 0/1 Init:RunContainerError 1 22h
calico-node-l9qvc 0/1 Running 0 22h
coredns-f9fd979d6-8pzcd 1/1 Running 0 23h
coredns-f9fd979d6-v4cq8 1/1 Running 0 23h
etcd-k8s-master 1/1 Running 1 23h
kube-apiserver-k8s-master 1/1 Running 128 23h
kube-controller-manager-k8s-master 1/1 Running 4 23h
kube-proxy-bwtwj 0/1 CrashLoopBackOff 342 23h
kube-proxy-stq7q 1/1 Running 1 23h
kube-scheduler-k8s-master 1/1 Running 4 23h
and the resualt of command kubectl -n kube-system logs kube-proxy-bwtwj the resulst was :
failed to try resolving symlinks in path "/var/log/pods/kube-system_kube-proxy-bwtwj_1a0f4b93-cc6f-46b9-bf29-125feba593cb/kube-proxy/348.log": lstat /var/log/pods/kube-system_kube-proxy-bwtwj_1a0f4b93-cc6f-46b9-bf29-125feba593cb/kube-proxy/348.log: no such file or directory
I see two topics here:
The default --pod-network-cidr for calico is 192.168.0.0/16. You can use a different one but always make sure that there are no overlays. However, I have tested with the default one and my cluster runs with no problems. In order to start over with a proper config, you should Remove the node and Clean up the control plane. Than proceed with:
kubeadm init --pod-network-cidr=192.168.0.0/16
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
kubectl apply -f https://docs.projectcalico.org/v3.14/manifests/calico.yaml
After that, join your worker nodes with kubeadm join
Use sudo where/if needed. All necessary details can be found in the official documentation.
The failed to try resolving symlinks error means that kubelet is looking for the pod logs in a wrong directory. In order to fix it you need to pass the --log-dir=/var/log flag to kubelet. After adding the flag you have run systemctl daemon-reload so the kubelet would be restarted. This has to be done on all of your nodes.
Make sure you deploy calico before joining other nodes to your cluster. When you have other nodes in your cluster calico-kube-controllers sometimes gets push to a worker node. This can lead to issues
You need to carefully check logs for calico-node pods.
In my case i have some other network interfaces and the autodetection mechanism in calico was detecting wrong interface (ip address).
You need to consult this documentation https://projectcalico.docs.tigera.io/reference/node/configuration.
What i did in my case, was simply:
kubectl set env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=cidr=172.16.8.0/24
cidr is my "working network".
After this, all calico-nodes restarted and suddenly everything was fine.

New kubernetes install has remnants of old cluster

I did a complete tear down of a v1.13.1 cluster and am now running v1.15.0 with calico cni v3.8.0. All pods are running:
[gms#thalia0 ~]$ kubectl get po --namespace=kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-59f54d6bbc-2mjxt 1/1 Running 0 7m23s
calico-node-57lwg 1/1 Running 0 7m23s
coredns-5c98db65d4-qjzpq 1/1 Running 0 8m46s
coredns-5c98db65d4-xx2sh 1/1 Running 0 8m46s
etcd-thalia0.ahc.umn.edu 1/1 Running 0 8m5s
kube-apiserver-thalia0.ahc.umn.edu 1/1 Running 0 7m46s
kube-controller-manager-thalia0.ahc.umn.edu 1/1 Running 0 8m2s
kube-proxy-lg4cn 1/1 Running 0 8m46s
kube-scheduler-thalia0.ahc.umn.edu 1/1 Running 0 7m40s
But, when I look at the endpoint, I get the following:
[gms#thalia0 ~]$ kubectl get ep --namespace=kube-system
NAME ENDPOINTS AGE
kube-controller-manager <none> 9m46s
kube-dns 192.168.16.194:53,192.168.16.195:53,192.168.16.194:53 + 3 more... 9m30s
kube-scheduler <none> 9m46s
If I look at the log for the apiserver, I get a ton of TLS handshake errors, along the lines of:
I0718 19:35:17.148852 1 log.go:172] http: TLS handshake error from 10.x.x.160:45042: remote error: tls: bad certificate
I0718 19:35:17.158375 1 log.go:172] http: TLS handshake error from 10.x.x.159:53506: remote error: tls: bad certificate
These IP addresses were from nodes in a previous cluster. I had deleted them and done a kubeadm reset on all nodes, including master, so I have no idea why these are showing up. I would assume this is why the endpoints for the controller-manager and the scheduler are showing up as <none>.
In order to completely wipe your cluster you should do next:
1) Reset cluster
$sudo kubeadm reset (or use appropriate to your cluster command)
2) Wipe your local directory with configs
$rm -rf .kube/
3) Remove /etc/kubernetes/
$sudo rm -rf /etc/kubernetes/
4)And one of the main point is to get rid of your previous etc state configuration.
$sudo rm -rf /var/lib/etcd/

Kubernetes dashboard: Get https://10.96.0.1:443/version: dial tcp 10.96.0.1:443: i/o timeout

I have a Kubernetes cluster in vagrant (1.14.0) and installed calico.
I have installed the kubernetes dashboard. When I use kubectl proxy to visit the dashboard:
Error: 'dial tcp 192.168.1.4:8443: connect: connection refused'
Trying to reach: 'https://192.168.1.4:8443/'
Here are my pods (dashboard is restarting frequently):
$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-etcd-cj928 1/1 Running 0 11m
calico-node-4fnb6 1/1 Running 0 18m
calico-node-qjv7t 1/1 Running 0 20m
calico-policy-controller-b9b6749c6-29c44 1/1 Running 1 11m
coredns-fb8b8dccf-jjbhk 1/1 Running 0 20m
coredns-fb8b8dccf-jrc2l 1/1 Running 0 20m
etcd-k8s-master 1/1 Running 0 19m
kube-apiserver-k8s-master 1/1 Running 0 19m
kube-controller-manager-k8s-master 1/1 Running 0 19m
kube-proxy-8mrrr 1/1 Running 0 18m
kube-proxy-cdsr9 1/1 Running 0 20m
kube-scheduler-k8s-master 1/1 Running 0 19m
kubernetes-dashboard-5f7b999d65-nnztw 1/1 Running 3 2m11s
logs of the dasbhoard pod:
2019/03/30 14:36:21 Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service account's configuration) or the --apiserver-host param points to a server that does not exist. Reason: Get https://10.96.0.1:443/version: dial tcp 10.96.0.1:443: i/o timeout
Refer to our FAQ and wiki pages for more information: https://github.com/kubernetes/dashboard/wiki/FAQ
I can telnet from both master and nodes to 10.96.0.1:443.
What is configured wrongly? The rest of the cluster seems to work fine, although I see this logs in kubelet:
failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file "/var/lib/kubelet/config.yaml"
kubelet seems to run fine on the master.
The cluster was created with this command:
kubeadm init --apiserver-advertise-address="192.168.50.10" --apiserver-cert-extra-sans="192.168.50.10" --node-name k8s-master --pod-network-cidr=192.168.0.0/16
you should define your hostname in /etc/hosts
#hostname
YOUR_HOSTNAME
#nano /etc/hosts
YOUR_IP HOSTNAME
if you set your hostname in your master but it did not work try
# systemctl stop kubelet
# systemctl stop docker
# iptables --flush
# iptables -tnat --flush
# systemctl start kubelet
# systemctl start docker
and you should install dashboard before join worker node
and disable your firewall
and you can check your free ram.
Exclude -- node-name parameter from kubeadm init command
try this command
kubeadm init --apiserver-advertise-address=$(hostname -i) --apiserver-cert-extra-sans="192.168.50.10" --pod-network-cidr=192.168.0.0/16
For me the issue was I needed to create a NetworkPolicy that allowed Egress traffic to the kubernetes API

Kubernetes coredns pods stuck in Pending status. Cannot start the dashboard

I am building a Kubernetes cluster following this tutorial, and I have troubles to access the Kubernetes dashboard. I already created another question about it that you can see here, but while digging up into my cluster, I think that the problem might be somewhere else and that's why I create a new question.
I start my master, by running the following commands:
> kubeadm reset
> kubeadm init --apiserver-advertise-address=[MASTER_IP] > file.txt
> tail -2 file.txt > join.sh # I keep this file for later
> kubectl apply -f https://git.io/weave-kube/
> kubectl -n kube-system get pod
NAME READY STATUS RESTARTS AGE
coredns-fb8b8dccf-kb2zq 0/1 Pending 0 2m46s
coredns-fb8b8dccf-nnc5n 0/1 Pending 0 2m46s
etcd-kubemaster 1/1 Running 0 93s
kube-apiserver-kubemaster 1/1 Running 0 93s
kube-controller-manager-kubemaster 1/1 Running 0 113s
kube-proxy-lxhvs 1/1 Running 0 2m46s
kube-scheduler-kubemaster 1/1 Running 0 93s
Here we can see that I have two coredns pods stuck in Pending state forever, and when I run the command :
> kubectl -n kube-system describe pod coredns-fb8b8dccf-kb2zq
I can see in the Events part the following Warning :
Failed Scheduling : 0/1 nodes are available 1 node(s) had taints that the pod didn't tolerate.
Since it is a Warning and not and Error, and that as a Kubernetes newbie, taints does not mean much to me, I tried to connect a node to the master (using the previously saved command) :
> cat join.sh
kubeadm join [MASTER_IP]:6443 --token [TOKEN] \
--discovery-token-ca-cert-hash sha256:[ANOTHER_TOKEN]
> ssh [USER]#[WORKER_IP] 'bash' < join.sh
This node has joined the cluster.
On the master, I check that the node is connected:
> kubectl get nodes
NAME STATUS ROLES AGE VERSION
kubemaster NotReady master 13m v1.14.1
kubeslave1 NotReady <none> 31s v1.14.1
And I check my pods :
> kubectl -n kube-system get pod
NAME READY STATUS RESTARTS AGE
coredns-fb8b8dccf-kb2zq 0/1 Pending 0 14m
coredns-fb8b8dccf-nnc5n 0/1 Pending 0 14m
etcd-kubemaster 1/1 Running 0 13m
kube-apiserver-kubemaster 1/1 Running 0 13m
kube-controller-manager-kubemaster 1/1 Running 0 13m
kube-proxy-lxhvs 1/1 Running 0 14m
kube-proxy-xllx4 0/1 ContainerCreating 0 2m16s
kube-scheduler-kubemaster 1/1 Running 0 13m
We can see that another kube-proxy pod have been created and is stuck in ContainerCreating status.
And when I am doing a describe again :
kubectl -n kube-system describe pod kube-proxy-xllx4
I can see in the Events part multiple identical Warnings :
Failed create pod sandbox : rpx error: code = Unknown desc = failed pulling image "k8s.gcr.io/pause:3.1": Get https://k8s.gcr.io/v1/_ping: dial tcp: lookup k8s.gcr.io on [::1]:53 read up [::1]43133->[::1]:53: read: connection refused
Here are my repositories :
docker image ls
REPOSITORY TAG
k8s.gcr.io/kube-proxy v1.14.1
k8s.gcr.io/kube-apiserver v1.14.1
k8s.gcr.io/kube-controller-manager v1.14.1
k8s.gcr.io/kube-scheduler v1.14.1
k8s.gcr.io/coredns 1.3.1
k8s.gcr.io/etcd 3.3.10
k8s.gcr.io/pause 3.1
And so, for the dashboard part, I tried to start it with the command
> kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/aio/deploy/recommended/kubernetes-dashboard.yaml
But the dashboard pod is stuck in Pending state.
kubectl -n kube-system get pod
NAME READY STATUS RESTARTS AGE
coredns-fb8b8dccf-kb2zq 0/1 Pending 0 40m
coredns-fb8b8dccf-nnc5n 0/1 Pending 0 40m
etcd-kubemaster 1/1 Running 0 38m
kube-apiserver-kubemaster 1/1 Running 0 38m
kube-controller-manager-kubemaster 1/1 Running 0 39m
kube-proxy-lxhvs 1/1 Running 0 40m
kube-proxy-xllx4 0/1 ContainerCreating 0 27m
kube-scheduler-kubemaster 1/1 Running 0 38m
kubernetes-dashboard-5f7b999d65-qn8qn 1/1 Pending 0 8s
So, event though my problem originaly was that I cannot access to my dashboard, I guess that the real problem is deeper thant that.
I know that I just put a lot of information here, but I am a k8s beginner and I am completely lost on this.
There is an issue I experienced with coredns pods stuck in a pending mode when setting up your own cluster; which I resolve by adding pod network.
Looks like because there is no Network Addon installed, the nodes are taint as not-ready. Installing the Addon would remove the taints and the Pods will be able to schedule. In my case adding flannel fixed the issue.
EDIT: There is a note about this in the official k8s documentation - Create cluster with kubeadm:
The network must be deployed before any applications. Also, CoreDNS
will not start up before a network is installed. kubeadm only
supports Container Network Interface (CNI) based networks (and does
not support kubenet).
Actually it is the opposite of a deep or serious issue. This is a trivial issue. Always you see a pod stuck on Pending state, it means the scheduler is having a hard time to schedule the pod; mostly because there are no enough resources on the node.
In your case it is a taint that has the node, and your pod doesn't have the toleration. What you have to do is to describe the node and get the taint:
kubectl describe node | grep -i taints
Note: you might have more then one taint. So you might want to do kubectl describe no NODE since with grep you will only see one taint.
Once you get the taint, that will be something like hello=world:NoSchedule; which means key=value:effect, you will have to add a toleration section in your Deployment. This is an example Deployment so you can see how it should look like:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: nginx
labels:
app: nginx
spec:
replicas: 10
strategy:
type: Recreate
template:
metadata:
labels:
app: nginx
spec:
containers:
- image: nginx
name: nginx
ports:
- containerPort: 80
name: http
tolerations:
- effect: NoExecute #NoSchedule, PreferNoSchedule
key: node
operator: Equal
value: not-ready
tolerationSeconds: 3600
As you can see there is the toleration section in the yaml. So, if I would have a node with node=not-ready:NoExecute taint, no pod would be able to be scheduled on that node, unless would have this toleration.
Also you can remove the taint, if you don need it. To remove a taint you would describe the node, get the key of the taint and do:
kubectl taint node NODE key-
Hope it makes sense. Just add this section to your deployment, and it will work.
Set up the flannel network tool.
Running commands:
$ sysctl net.bridge.bridge-nf-call-iptables=1
$ kubectl apply -f
https://raw.githubusercontent.com/coreos/flannel/62e44c867a2846fefb68bd5f178daf4da3095ccb/Documentation/kube-flannel.yml

Error from server (NotFound): podmetrics.metrics.k8s.io "mem-example/memory-demo" not found

I am following this tutorial: https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/
I have created the memory pod demo and I am trying to get the metrics from the pod but it is not working.
I installed the metrics server by cloning: https://github.com/kubernetes-incubator/metrics-server
And then running this command from top level:
kubectl create -f deploy/1.8+/
I am using kubernetes version 1.10.11.
The pod is definitely created:
λ kubectl get pod memory-demo --namespace=mem-example
NAME READY STATUS RESTARTS AGE
memory-demo 1/1 Running 0 6m
But the metics command does not work and gives an error:
λ kubectl top pod memory-demo --namespace=mem-example
Error from server (NotFound): podmetrics.metrics.k8s.io "mem-example/memory-demo" not found
What did I do wrong?
There are some patches to be done to metrics server deployment to get the metrics working.
Follow the below steps
kubectl delete -f deploy/1.8+/
wait till the metrics server gets undeployed
run the below command
kubectl create -f https://raw.githubusercontent.com/epasham/docker-repo/master/k8s/metrics-server.yaml
master $ kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-78fcdf6894-6zg78 1/1 Running 0 2h
coredns-78fcdf6894-gk4sb 1/1 Running 0 2h
etcd-master 1/1 Running 0 2h
kube-apiserver-master 1/1 Running 0 2h
kube-controller-manager-master 1/1 Running 0 2h
kube-proxy-f5z9p 1/1 Running 0 2h
kube-proxy-ghbvn 1/1 Running 0 2h
kube-scheduler-master 1/1 Running 0 2h
metrics-server-85c54d44c8-rmvxh 2/2 Running 0 1m
weave-net-4j7cl 2/2 Running 1 2h
weave-net-82fzn 2/2 Running 1 2h
master $ kubectl top pod -n kube-system
NAME CPU(cores) MEMORY(bytes)
coredns-78fcdf6894-6zg78 2m 11Mi
coredns-78fcdf6894-gk4sb 2m 9Mi
etcd-master 14m 90Mi
kube-apiserver-master 24m 425Mi
kube-controller-manager-master 26m 62Mi
kube-proxy-f5z9p 2m 19Mi
kube-proxy-ghbvn 3m 17Mi
kube-scheduler-master 8m 14Mi
metrics-server-85c54d44c8-rmvxh 1m 19Mi
weave-net-4j7cl 2m 59Mi
weave-net-82fzn 1m 60Mi
Check and verify the below lines in metrics server deployment manifest.
command:
- /metrics-server
- --metric-resolution=30s
- --kubelet-preferred-address-types=InternalIP
- --kubelet-insecure-tls
On Minikube, I had to wait for 20-25 minutes after enabling the metrics-server addon. I was getting the same error for 20-25 minutes but later I could see the output without attempting for any solution.
I faced the similar issue of
Error from server (NotFound): podmetrics.metrics.k8s.io "default/apple-app" not found
I followed two steps and I was able to resolve the issue.
Download the latest customized components.yaml, which is their official file used for easy deployment.
Update the change
# - /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
to the command section of the deployment specification. I have commented the first line because it is the entrypoint of the image used by kubernetes metrics-server.
$ docker image inspect k8s.gcr.io/metrics-server-amd64:v0.3.6 -f {{.ContainerConfig.Entrypoint}}
[/metrics-server]
Even If you use it or not, it doesn't matter.
Note: You have to wait for few seconds for it to properly work.
After this running the top command will work for you.
$ kubectl top pod apple-app
NAME CPU(cores) MEMORY(bytes)
apple-app 1m 3Mi
I know this is an old thread may be someone will find this answer useful.
You have to checkout the following repo:
https://github.com/kubernetes-incubator/metrics-server
Go to the root of the repo and checkout release-0.3.2.
Remove default metrics server by:
kubectl delete -f deploy/1.8+/
Download the container yaml
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
Edit the container.yaml by adding the following lines to the argument section. You will see these two lines there
args:
- --kubelet-preferred-address-types=InternalIP
- --kubelet-insecure-tls=true
There is only one args parameter in that file.
Deploy your pod/deployment and you should be able to do:
kubectl top pod <pod-name>