Installing Istio on GCP Kubernetes - Istio deps fail

I'm following the tutorial for Istio on the Google Cloud Platform and have been able to get my cluster up and running. I get to the part where I start the demo app by running kubectl apply -f install/kubernetes/istio-demo-auth.yaml, but a number of the pods won't come up.
I'm running Istio 1.0.3
kubectl version --short
Client Version: v1.11.1
Server Version: v1.9.7-gke.6
When I run kubectl get service -n istio-system to verify that the Istio pods are deployed and the containers are running, many of them are in crash cycles. Any tips on how to debug this?
NAME READY STATUS RESTARTS AGE
grafana-7b6d98d887-9dgdc 1/1 Running 0 17h
istio-citadel-778747b96d-cw78t 1/1 Running 0 17h
istio-cleanup-secrets-2vjlf 0/1 Completed 0 17h
istio-egressgateway-7b8f4ccb6-rl69j 1/1 Running 123 17h
istio-galley-7557f8c985-jp975 0/1 ContainerCreating 0 17h
istio-grafana-post-install-n45x4 0/1 Error 202 17h
istio-ingressgateway-5f94fdc55f-dc2q5 1/1 Running 123 17h
istio-pilot-d6b56bf4d-czp9w 1/2 CrashLoopBackOff 328 17h
istio-policy-6c7d8454b-dpvfj 1/2 CrashLoopBackOff 500 17h
istio-security-post-install-qrzpq 0/1 CrashLoopBackOff 201 17h
istio-sidecar-injector-75cf59b857-z7wbc 0/1 ContainerCreating 0 17h
istio-telemetry-69db5c7575-4jp2d 1/2 CrashLoopBackOff 500 17h
istio-tracing-77f9f94b98-vjmhc 1/1 Running 0 17h
prometheus-67d4988588-gjmcp 1/1 Running 0 17h
servicegraph-57d8ff7686-x2r8r 1/1 Running 0 17h

That looks like the output of kubectl -n istio-system get pods, not of kubectl get service.
Tips: check the output of these:
$ kubectl -n istio-system logs istio-pilot-d6b56bf4d-czp9w
$ kubectl -n istio-system logs istio-policy-6c7d8454b-dpvfj
$ kubectl -n istio-system logs istio-grafana-post-install-n45x4
$ kubectl -n istio-system logs istio-telemetry-69db5c7575-4jp2d
Check the deployment/service/configmap definitions in install/kubernetes/istio-demo-auth.yaml for the pods that are crashing.
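For the pods stuck in ContainerCreating, the events shown by kubectl describe are usually more telling than logs; a minimal sketch using the pod names from your output:
$ kubectl -n istio-system describe pod istio-galley-7557f8c985-jp975
$ kubectl -n istio-system describe pod istio-sidecar-injector-75cf59b857-z7wbc
The Events section at the bottom will show image pull, volume mount (e.g. a missing secret), or scheduling failures. Also note that for the 1/2 pods kubectl logs needs a container name via -c (likely -c discovery for pilot and -c mixer for policy/telemetry in Istio 1.0).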

Try installing with Helm via template.
You would usually want to have Grafana, Zipkin & Kiali along. This is what worked for me:
1) kubectl apply -f install/kubernetes/helm/istio/templates/crds.yaml
2) helm template install/kubernetes/helm/istio --name istio --namespace istio-system --set grafana.enabled=true --set servicegraph.enabled=true --set tracing.enabled=true --set kiali.enabled=true --set sidecarInjectorWebhook.enabled=true --set global.tag=1.0.5 > $HOME/istio.yaml
3) kubectl create namespace istio-system
4) kubectl apply -f $HOME/istio.yaml
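Once that is applied, a quick way to check progress is to watch the control-plane pods come up, and, assuming you want automatic sidecar injection for workloads in the default namespace, label that namespace:
kubectl get pods -n istio-system -w
kubectl label namespace default istio-injection=enabled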

I had a similar issue - turned out that my NAT Gateway wasn't configured correctly. The Terraform I used to create the private cluster created an additional default internet gateway that I needed to delete.
Some pods came up and some didn't - I think some of the images were cached somewhere the cluster could reach, such as a Google registry.

Related

Access Prometheus GUI on Kubernetes Cluster with Istio

I have installed Istio on my GKE cluster using the Istio CLI. I have read that Prometheus comes with Istio by default.
How do I confirm whether Prometheus is correctly installed, and how do I access it?
# kubectl get po -n istio-system
NAME READY STATUS RESTARTS AGE
istio-egressgateway-64d976b9b5-pmf8d 1/1 Running 0 18d
istio-ingressgateway-68c86b9fc8-94ftm 1/1 Running 0 18d
istiod-5c986fb85b-h6v4r 1/1 Running 0 18d
prometheus-7bfddb8dbf-x2p2x 2/2 Running 0 18d
zipkin-7fcd647cf9-hp8qs 1/1 Running 0 18d
If it's not there, deploy it with:
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.9/samples/addons/prometheus.yaml
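Your output already shows a prometheus pod, so it is installed. To access the UI you can port-forward the Prometheus service (or use istioctl dashboard prometheus); a sketch assuming the default service name and port in the istio-system namespace:
kubectl -n istio-system port-forward svc/prometheus 9090:9090
Then open http://localhost:9090 in a browser.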

How to config email alert in using grafana and prometheus-operator

I installed prometheus-operator (which includes prometheus/alertmanager/grafana) via Helm. Then I accessed the Grafana UI and configured an email alert. When I clicked to send a test email, I got the message "SMTP not configured, check your grafana.ini config file's [smtp] section".
But I don't know where grafana.ini is so that I can change it in this case.
[root@k8s-master ~]# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5bbc8f45cb-nlqgh 1/1 Running 1 15h
kube-system calico-node-lk2j5 1/1 Running 1 15h
kube-system calico-node-v6wzs 1/1 Running 1 15h
kube-system calico-node-zfh5r 1/1 Running 1 15h
kube-system coredns-5c98db65d4-79c2g 1/1 Running 1 15h
kube-system coredns-5c98db65d4-bqj7g 1/1 Running 1 15h
kube-system etcd-k8s-master 1/1 Running 1 15h
kube-system kube-apiserver-k8s-master 1/1 Running 1 15h
kube-system kube-controller-manager-k8s-master 1/1 Running 2 15h
kube-system kube-proxy-8qmdt 1/1 Running 1 15h
kube-system kube-proxy-qwgbc 1/1 Running 1 15h
kube-system kube-proxy-vhqjd 1/1 Running 1 15h
kube-system kube-scheduler-k8s-master 1/1 Running 1 15h
monitoring alertmanager-prometheus-operator-alertmanager-0 2/2 Running 3 15h
monitoring prometheus-operator-grafana-64848fc9bb-dbnwc 2/2 Running 3 15h
monitoring prometheus-operator-kube-state-metrics-5d46566c59-ck4np 1/1 Running 2 15h
monitoring prometheus-operator-operator-64dcc7bfc-lpdj6 2/2 Running 2 15h
monitoring prometheus-operator-prometheus-node-exporter-ns4kg 1/1 Running 1 15h
monitoring prometheus-operator-prometheus-node-exporter-tdhwq 1/1 Running 2 15h
monitoring prometheus-operator-prometheus-node-exporter-xt8z9 1/1 Running 2 15h
monitoring prometheus-prometheus-operator-prometheus-0 3/3 Running 4 15h
You can override this configuration with Helm values, via the alertmanager.config key.
This key is converted from YAML into the Alertmanager configuration, so any Alertmanager setting can be used.
You should probably also change the grafana.ini configuration to set up SMTP in Grafana (the test email seems to use that configuration). You can check this configuration in Grafana via "Server Admin" > "Settings", then search for "smtp".
As a reference, you can do something like the following for Alertmanager:
helm upgrade --install prometheus stable/prometheus-operator \
-f helm/prometheus-operator.yml \
-f helm/grafana-custom.staging.yml \
--set-string alertmanager.config.global.smtp_smarthost="my.smtp.tld:465" \
--set-string alertmanager.config.global.smtp_auth_username="my@email.tld" \
--set-string alertmanager.config.global.smtp_from="my@email.tld" \
--set-string alertmanager.config.global.smtp_auth_password="MyAmazingPassword" \
--set-string grafana.'grafana\.ini'.smtp.enabled=true \
--set-string grafana.'grafana\.ini'.smtp.host="my.smtp.tld:465" \
--set-string grafana.'grafana\.ini'.smtp.from_address="my@email.tld" \
--set-string grafana.'grafana\.ini'.smtp.user="my@email.tld" \
--set-string grafana.'grafana\.ini'.smtp.password="MyAmazingPassword"
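The same settings can also live in the values file already passed with -f helm/prometheus-operator.yml instead of --set-string flags; a sketch of the equivalent YAML, using the same placeholder values:
alertmanager:
  config:
    global:
      smtp_smarthost: "my.smtp.tld:465"
      smtp_from: "my@email.tld"
      smtp_auth_username: "my@email.tld"
      smtp_auth_password: "MyAmazingPassword"
grafana:
  grafana.ini:
    smtp:
      enabled: true
      host: "my.smtp.tld:465"
      from_address: "my@email.tld"
      user: "my@email.tld"
      password: "MyAmazingPassword"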
The grafana.ini is loaded through a ConfigMap in the prometheus-operator Helm deployment. If you have already installed it via Helm, you can just modify the ConfigMap and then restart the Grafana pod. Below is the minimum config with which I was able to use SMTP.
[smtp]
enabled = true
host = your.smtp.server.name:25
skip_verify = true
from_address = "grafana@xyz.com"
from_name = Grafana
To get the ConfigMap, run the command below and edit it (include the namespace in the command if prometheus-operator is deployed in a namespace other than default):
kubectl get configmap | grep grafana
After editing the ConfigMap, restart the Grafana pod (restarting the other pods is not needed).
Note: skip_verify = true is not recommended.
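A minimal sketch of that edit-and-restart cycle, assuming the release is named prometheus-operator and deployed in the monitoring namespace as in the output above (confirm the ConfigMap name with the grep command first):
kubectl -n monitoring edit configmap prometheus-operator-grafana
kubectl -n monitoring rollout restart deployment prometheus-operator-grafana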

`helm init` says tiller is already on cluster but it's not?

vagrant@ubuntu-xenial:~$ helm init
$HELM_HOME has been configured at /home/vagrant/.helm.
Warning: Tiller is already installed in the cluster.
(Use --client-only to suppress this message, or --upgrade to upgrade Tiller to the current version.)
vagrant@ubuntu-xenial:~$ helm ls
Error: could not find tiller
How can I diagnose this further?
Here are the currently running pods in kube-system:
vagrant@ubuntu-xenial:~$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
canal-dlfzg 2/2 Running 2 72d
canal-kxp4s 2/2 Running 0 29d
canal-lkkbq 2/2 Running 2 72d
coredns-86bc4b7c96-xwq4d 1/1 Running 2 49d
coredns-autoscaler-5d5d49b8ff-l6cxq 1/1 Running 0 49d
metrics-server-58bd5dd8d7-tbj7j 1/1 Running 1 72d
rke-coredns-addon-deploy-job-h4c4q 0/1 Completed 0 49d
rke-ingress-controller-deploy-job-mj82v 0/1 Completed 0 49d
rke-metrics-addon-deploy-job-tggx5 0/1 Completed 0 72d
rke-network-plugin-deploy-job-jzswv 0/1 Completed 0 72d
The issue was that the tiller deployment existed but its service account was not present.
vagrant@ubuntu-xenial:~$ kubectl get deployment tiller-deploy --namespace kube-system
NAME READY UP-TO-DATE AVAILABLE AGE
tiller-deploy 0/1 0 0 24h
vagrant@ubuntu-xenial:~$ kubectl get events --all-namespaces
kube-system 4m52s Warning FailedCreate replicaset/tiller-deploy-7f4d76c4b6 Error creating: pods "tiller-deploy-7f4d76c4b6-" is forbidden: error looking up service account kube-system/tiller: serviceaccount "tiller" not found
I deleted the deployment and ran helm init once again, which then worked:
kubectl delete deployment tiller-deploy --namespace kube-system
helm init
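If helm init still complains about the missing service account afterwards, you may need to create the account and its role binding yourself; a sketch, assuming cluster-admin is acceptable for Tiller in your cluster:
kubectl create serviceaccount tiller --namespace kube-system
kubectl create clusterrolebinding tiller --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
helm init --service-account tiller --upgrade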

Error from server (NotFound): podmetrics.metrics.k8s.io "mem-example/memory-demo" not found

I am following this tutorial: https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/
I have created the memory demo pod and I am trying to get the metrics from the pod, but it is not working.
I installed the metrics server by cloning: https://github.com/kubernetes-incubator/metrics-server
And then running this command from top level:
kubectl create -f deploy/1.8+/
I am using kubernetes version 1.10.11.
The pod is definitely created:
λ kubectl get pod memory-demo --namespace=mem-example
NAME READY STATUS RESTARTS AGE
memory-demo 1/1 Running 0 6m
But the metrics command does not work and gives an error:
λ kubectl top pod memory-demo --namespace=mem-example
Error from server (NotFound): podmetrics.metrics.k8s.io "mem-example/memory-demo" not found
What did I do wrong?
There are some patches to be made to the metrics-server deployment to get metrics working.
Follow the steps below:
kubectl delete -f deploy/1.8+/
Wait until the metrics server is undeployed, then run the command below:
kubectl create -f https://raw.githubusercontent.com/epasham/docker-repo/master/k8s/metrics-server.yaml
master $ kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-78fcdf6894-6zg78 1/1 Running 0 2h
coredns-78fcdf6894-gk4sb 1/1 Running 0 2h
etcd-master 1/1 Running 0 2h
kube-apiserver-master 1/1 Running 0 2h
kube-controller-manager-master 1/1 Running 0 2h
kube-proxy-f5z9p 1/1 Running 0 2h
kube-proxy-ghbvn 1/1 Running 0 2h
kube-scheduler-master 1/1 Running 0 2h
metrics-server-85c54d44c8-rmvxh 2/2 Running 0 1m
weave-net-4j7cl 2/2 Running 1 2h
weave-net-82fzn 2/2 Running 1 2h
master $ kubectl top pod -n kube-system
NAME CPU(cores) MEMORY(bytes)
coredns-78fcdf6894-6zg78 2m 11Mi
coredns-78fcdf6894-gk4sb 2m 9Mi
etcd-master 14m 90Mi
kube-apiserver-master 24m 425Mi
kube-controller-manager-master 26m 62Mi
kube-proxy-f5z9p 2m 19Mi
kube-proxy-ghbvn 3m 17Mi
kube-scheduler-master 8m 14Mi
metrics-server-85c54d44c8-rmvxh 1m 19Mi
weave-net-4j7cl 2m 59Mi
weave-net-82fzn 1m 60Mi
Check and verify the lines below in the metrics-server deployment manifest:
command:
- /metrics-server
- --metric-resolution=30s
- --kubelet-preferred-address-types=InternalIP
- --kubelet-insecure-tls
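Once the metrics-server pod has restarted with those flags, you can confirm that the metrics API itself is registered and serving before retrying kubectl top; these checks only use built-in kubectl features:
kubectl get apiservice v1beta1.metrics.k8s.io
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes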
On Minikube, I had to wait 20-25 minutes after enabling the metrics-server addon. I was getting the same error for those 20-25 minutes, but after that I could see the output without applying any fix.
I faced a similar issue:
Error from server (NotFound): podmetrics.metrics.k8s.io "default/apple-app" not found
I followed two steps and was able to resolve the issue.
Download the latest components.yaml, which is the official file used for easy deployment.
Add the following change
# - /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
to the command section of the deployment specification. The first line is commented out because it is the entrypoint of the image used by the Kubernetes metrics-server.
$ docker image inspect k8s.gcr.io/metrics-server-amd64:v0.3.6 -f {{.ContainerConfig.Entrypoint}}
[/metrics-server]
Whether you include that commented line or not doesn't matter.
Note: you have to wait a few seconds for it to start working properly.
After this running the top command will work for you.
$ kubectl top pod apple-app
NAME CPU(cores) MEMORY(bytes)
apple-app 1m 3Mi
I know this is an old thread; maybe someone will find this answer useful.
You have to checkout the following repo:
https://github.com/kubernetes-incubator/metrics-server
Go to the root of the repo and check out release-0.3.2.
Remove the default metrics server with:
kubectl delete -f deploy/1.8+/
Download the components yaml:
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
Edit components.yaml by adding the following lines to the args section, so that these two lines appear there:
args:
- --kubelet-preferred-address-types=InternalIP
- --kubelet-insecure-tls=true
There is only one args parameter in that file.
Deploy your pod/deployment and you should be able to do:
kubectl top pod <pod-name>
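Instead of hand-editing the downloaded components.yaml, the same two arguments can be appended to the running deployment with a JSON patch; a sketch, assuming the metrics-server container is the first one in the pod spec and already defines an args list (as the v0.3.6 manifest does):
kubectl -n kube-system patch deployment metrics-server --type=json -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}, {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-preferred-address-types=InternalIP"}]'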

Kubernetes Dashboard CrashLoopBackOff, getting error "connect: no route to host" - how can I fix it?

I have deployed the Kubernetes dashboard which ended up in CrashLoopBackOff status. When I run:
$ kubectl logs kubernetes-dashboard-767dc7d4d-mc2sm --namespace=kube-system
the output is:
Error from server: Get https://10.4.211.53:10250/containerLogs/kube-system/kubernetes-dashboard-767dc7d4d-mc2sm/kubernetes-dashboard: dial tcp 10.4.211.53:10250: connect: no route to host
How can I fix this? Does this mean that port 10250 isn't open?
Update:
@LucaBrasi
Error from server (NotFound): pods "kubernetes-dashboard-767dc7d4d-mc2sm" not found
The output of systemctl status kubelet --full is:
kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since 一 2018-09-10 15:04:57 CST; 1 day 23h ago
Docs: https://kubernetes.io/docs/
Main PID: 93440 (kubelet)
Tasks: 21
Memory: 78.9M
CGroup: /system.slice/kubelet.service
└─93440 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --cni-bin-dir=/opt/cni/bin --cni-conf-dir=/etc/cni/net.d --network-plugin=cni
Output for kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-78fcdf6894-qh6zb 1/1 Running 2 3d
kube-system coredns-78fcdf6894-xbzgn 1/1 Running 1 3d
kube-system etcd-twsr-whtestserver01.garenanet.com 1/1 Running 2 3d
kube-system kube-apiserver-twsr-whtestserver01.garenanet.com 1/1 Running 2 3d
kube-system kube-controller-manager-twsr-whtestserver01.garenanet.com 1/1 Running 2 3d
kube-system kube-flannel-ds-amd64-2bnmx 1/1 Running 3 3d
kube-system kube-flannel-ds-amd64-r58j6 1/1 Running 0 3d
kube-system kube-flannel-ds-amd64-wq6ls 1/1 Running 0 3d
kube-system kube-proxy-ds7lg 1/1 Running 0 3d
kube-system kube-proxy-fx46d 1/1 Running 0 3d
kube-system kube-proxy-ph7qq 1/1 Running 2 3d
kube-system kube-scheduler-twsr-whtestserver01.garenanet.com 1/1 Running 1 3d
kube-system kubernetes-dashboard-767dc7d4d-mc2sm 0/1 CrashLoopBackOff 877 3d
I had the same issue when I reproduced all the steps from the tutorial you've linked - my dashboard was in CrashLoopBackOff state. After I performed the steps below and applied the new dashboard yaml from the official GitHub documentation (there seems to be no difference from the one you've posted), the dashboard was working properly.
First, list all the objects related to Kubernetes dashboard:
kubectl get secret,sa,role,rolebinding,services,deployments --namespace=kube-system | grep dashboard
Delete them:
kubectl delete deployment kubernetes-dashboard --namespace=kube-system
kubectl delete service kubernetes-dashboard --namespace=kube-system
kubectl delete role kubernetes-dashboard-minimal --namespace=kube-system
kubectl delete rolebinding kubernetes-dashboard-minimal --namespace=kube-system
kubectl delete sa kubernetes-dashboard --namespace=kube-system
kubectl delete secret kubernetes-dashboard-certs --namespace=kube-system
kubectl delete secret kubernetes-dashboard-key-holder --namespace=kube-system
Now apply Kubernetes dashboard yaml:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
Please tell me if this worked for you as well, and if it did, treat it as a workaround as I don't know the reason yet - I am investigating.
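Regarding the "connect: no route to host" on port 10250: that error comes from the API server trying to reach the kubelet on the node, so it is worth confirming the kubelet port is reachable at all. A sketch, assuming the node runs firewalld (adjust to whatever firewall or network setup your nodes actually use):
curl -k https://10.4.211.53:10250/healthz   # even a 401/403 response proves connectivity; "no route to host" points to the network or firewall
sudo firewall-cmd --permanent --add-port=10250/tcp
sudo firewall-cmd --reload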