I am trying to install istio in a minikube cluster
I followed the tutorial on this page https://istio.io/docs/setup/kubernetes/quick-start/
I am trying to use Option 1 : https://istio.io/docs/setup/kubernetes/quick-start/#option-1-install-istio-without-mutual-tls-authentication-between-sidecars
I can see that the services have been created but the deployment seems to have failed.
kubectl get pods -n istio-system
No resources found
How can i troubleshoot this ?
Here are the results of get deployment
kubectl get deployment -n istio-system
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
grafana 1 0 0 0 4m
istio-citadel 1 0 0 0 4m
istio-egressgateway 1 0 0 0 4m
istio-galley 1 0 0 0 4m
istio-ingressgateway 1 0 0 0 4m
istio-pilot 1 0 0 0 4m
istio-policy 1 0 0 0 4m
istio-sidecar-injector 1 0 0 0 4m
istio-telemetry 1 0 0 0 4m
istio-tracing 1 0 0 0 4m
prometheus 1 0 0 0 4m
servicegraph 1 0 0 0 4m
This is what worked for me. Don't use the --extra-configs while starting minikube. This is crashing kube-controller-manager-minikube as its not able to find the file
error starting controllers: failed to start certificate controller:
error reading CA cert file "/var/lib/localkube/certs/ca.crt": open
/var/lib/localkube/certs/ca.crt: no such file or directory
Just start minikube with this command. I have minikube V0.30.0.
minikube start
Output:
Starting local Kubernetes v1.10.0 cluster...
Starting VM...
Downloading Minikube ISO
170.78 MB / 170.78 MB [============================================] 100.00% 0s
Getting VM IP address...
Moving files into cluster...
Downloading kubelet v1.10.0
Downloading kubeadm v1.10.0
Finished Downloading kubeadm v1.10.0
Finished Downloading kubelet v1.10.0
Setting up certs...
Connecting to cluster...
Setting up kubeconfig...
Starting cluster components...
Kubectl is now configured to use the cluster.
Loading cached images from config file.
Pointing to istio-1.0.4 folder, run this command
kubectl apply -f install/kubernetes/helm/istio/templates/crds.yaml
This should install all the required crds
Run this command
kubectl apply -f install/kubernetes/istio-demo.yaml
After successful creation of rules, services, deployments etc., Run this command
kubectl get pods -n istio-system
NAME READY STATUS RESTARTS AGE
grafana-9cfc9d4c9-h2zn8 1/1 Running 0 5m
istio-citadel-74df865579-d2pbq 1/1 Running 0 5m
istio-cleanup-secrets-ghlbf 0/1 Completed 0 5m
istio-egressgateway-58df7c4d8-4tg4p 1/1 Running 0 5m
istio-galley-8487989b9b-jbp2d 1/1 Running 0 5m
istio-grafana-post-install-dn6bw 0/1 Completed 0 5m
istio-ingressgateway-6fc88db97f-49z88 1/1 Running 0 5m
istio-pilot-74bb7dcdd-xjgvz 0/2 Pending 0 5m
istio-policy-58878f57fb-t6fqt 2/2 Running 0 5m
istio-security-post-install-vqbzw 0/1 Completed 0 5m
istio-sidecar-injector-5cfcf6dd86-lr8ll 1/1 Running 0 5m
istio-telemetry-bf5558589-8hzcc 2/2 Running 0 5m
istio-tracing-ff94688bb-bwzfs 1/1 Running 0 5m
prometheus-f556886b8-9z6vp 1/1 Running 0 5m
servicegraph-55d57f69f5-fvqbg 1/1 Running 0 5m
Also try customizing the installation via Helm template. You don't need to have Tiller deployed for that.
1) kubectl apply -f install/kubernetes/helm/istio/templates/crds.yaml
2) helm template install/kubernetes/helm/istio --name istio --namespace istio-system --set grafana.enabled=true --set servicegraph.enabled=true --set tracing.enabled=true --set kiali.enabled=true --set sidecarInjectorWebhook.enabled=true --set global.tag=1.0.5 > $HOME/istio.yaml
3) kubectl create namespace istio-system
4) kubectl apply -f $HOME/istio.yaml
Related
I have a new install of kubernetes on Ubuntu-18 using version 1.24.3 with Calico. The calico-controller will not start:
$ sudo kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-555bc4b957-z4q2p 0/1 Pending 0 5m14s
kube-system calico-node-jz2j7 1/1 Running 0 5m15s
kube-system coredns-6d4b75cb6d-hwfx9 1/1 Running 0 5m14s
kube-system coredns-6d4b75cb6d-wdh55 1/1 Running 0 5m14s
kube-system etcd-ubuntu-18-extssd 1/1 Running 1 5m27s
kube-system kube-apiserver-ubuntu-18-extssd 1/1 Running 1 5m28s
kube-system kube-controller-manager-ubuntu-18-extssd 1/1 Running 1 5m26s
kube-system kube-proxy-t5z2r 1/1 Running 0 5m15s
kube-system kube-scheduler-ubuntu-18-extssd 1/1 Running 1 5m27s
Someone suggested setting a couple of Calico timeouts to 60 seconds, but that didn't work either.
What could be causing the calico-controller to fail to start, especially since the calico-node is running?
Also, is there a more trouble-free CNI implementation to use? Calico seems very error-prone.
I solved this by installing Weave:
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
with this cidr:
sudo kubeadm init --pod-network-cidr=192.168.0.0/16
I was trying to install the Weave Cloud Agents for my minikube. I used the provided command
curl -Ls https://get.weave.works |sh -s -- --token=xxx
but keep getting the following error:
There was an error while performing a DNS check: checking DNS failed, the DNS in the Kubernetes cluster is not working correctly. Please check that your cluster can download images and run pods.
I have following dns:
kube-system coredns-6955765f44-7zt4x 1/1 Running 0 38m
kube-system coredns-6955765f44-xdnd9 1/1 Running 0 38m
I tried different suggestions such as https://www.jeffgeerling.com/blog/2019/debugging-networking-issues-multi-node-kubernetes-on-virtualbox or https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/. However none of them resolved my issue.
It seems to an issue which happened before https://github.com/weaveworks/launcher/issues/285.
My Kubernetes is on v1.17.3
Reproduced you issue, have the same error.
minikube v1.7.2 on Centos 7.7.1908
Docker 19.03.5
vm-driver=virtualbox
Connecting cluster to "Old Tree 34" (id: old-tree-34) on Weave Cloud
Installing Weave Cloud agents on minikube at https://192.168.99.100:8443
Performing a check of the Kubernetes installation setup.
There was an error while performing a DNS check: checking DNS failed, the DNS in the Kubernetes cluster is not working correctly. Please check that your cluster can download images and run pods.
I wasnt able to fix this problem, instead of that found a workaround - use Helm. You have second tab 'Helm 'in 'Install the Weave Cloud Agents' with provided command, like
helm repo update && helm upgrade --install --wait weave-cloud \
--set token=xxx \
--namespace weave \
stable/weave-cloud
Lets install Helm and use it.
curl https://raw.githubusercontent.com/helm/helm/master/scripts/get | bash
kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
helm init --service-account tiller
.....
Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.
helm repo update
helm upgrade --install --wait weave-cloud \
> --set token=xxx \
> --namespace weave \
> stable/weave-cloud
Release "weave-cloud" does not exist. Installing it now.
NAME: weave-cloud
LAST DEPLOYED: Thu Feb 13 14:52:45 2020
NAMESPACE: weave
STATUS: DEPLOYED
RESOURCES:
==> v1/Deployment
NAME AGE
weave-agent 35s
==> v1/Pod(related)
NAME AGE
weave-agent-69fbf74889-dw77c 35s
==> v1/Secret
NAME AGE
weave-cloud 35s
==> v1/ServiceAccount
NAME AGE
weave-cloud 35s
==> v1beta1/ClusterRole
NAME AGE
weave-cloud 35s
==> v1beta1/ClusterRoleBinding
NAME AGE
weave-cloud 35s
NOTES:
Weave Cloud agents had been installed!
First, verify all Pods are running:
kubectl get pods -n weave
Next, login to Weave Cloud (https://cloud.weave.works) and verify the agents are connect to your instance.
If you need help or have any question, join our Slack to chat to us – https://slack.weave.works.
Happy hacking!
Check(wait around 10 min to deploy everything):
kubectl get pods -n weave
NAME READY STATUS RESTARTS AGE
kube-state-metrics-64599b7996-d8pnw 1/1 Running 0 29m
prom-node-exporter-2lwbn 1/1 Running 0 29m
prometheus-5586cdd667-dtdqq 2/2 Running 0 29m
weave-agent-6c77dbc569-xc9qx 1/1 Running 0 29m
weave-flux-agent-65cb4694d8-sllks 1/1 Running 0 29m
weave-flux-memcached-676f88fcf7-ktwnp 1/1 Running 0 29m
weave-scope-agent-7lgll 1/1 Running 0 29m
weave-scope-cluster-agent-8fb596b6b-mddv8 1/1 Running 0 29m
[vkryvoruchko#nested-vm-image1 bin]$ kubectl get all -n weave
NAME READY STATUS RESTARTS AGE
pod/kube-state-metrics-64599b7996-d8pnw 1/1 Running 0 30m
pod/prom-node-exporter-2lwbn 1/1 Running 0 30m
pod/prometheus-5586cdd667-dtdqq 2/2 Running 0 30m
pod/weave-agent-6c77dbc569-xc9qx 1/1 Running 0 30m
pod/weave-flux-agent-65cb4694d8-sllks 1/1 Running 0 30m
pod/weave-flux-memcached-676f88fcf7-ktwnp 1/1 Running 0 30m
pod/weave-scope-agent-7lgll 1/1 Running 0 30m
pod/weave-scope-cluster-agent-8fb596b6b-mddv8 1/1 Running 0 30m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/prometheus ClusterIP 10.108.197.29 <none> 80/TCP 30m
service/weave-flux-memcached ClusterIP None <none> 11211/TCP 30m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/prom-node-exporter 1 1 1 1 1 <none> 30m
daemonset.apps/weave-scope-agent 1 1 1 1 1 <none> 30m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/kube-state-metrics 1/1 1 1 30m
deployment.apps/prometheus 1/1 1 1 30m
deployment.apps/weave-agent 1/1 1 1 31m
deployment.apps/weave-flux-agent 1/1 1 1 30m
deployment.apps/weave-flux-memcached 1/1 1 1 30m
deployment.apps/weave-scope-cluster-agent 1/1 1 1 30m
NAME DESIRED CURRENT READY AGE
replicaset.apps/kube-state-metrics-64599b7996 1 1 1 30m
replicaset.apps/prometheus-5586cdd667 1 1 1 30m
replicaset.apps/weave-agent-69fbf74889 0 0 0 31m
replicaset.apps/weave-agent-6c77dbc569 1 1 1 30m
replicaset.apps/weave-flux-agent-65cb4694d8 1 1 1 30m
replicaset.apps/weave-flux-memcached-676f88fcf7 1 1 1 30m
replicaset.apps/weave-scope-cluster-agent-8fb596b6b 1 1 1 30m
Login to https://cloud.weave.works/ and check the same:
Started installing agents on Kubernetes cluster v1.17.2
All Weave Cloud agents are connected!
What happened?
kubernetes version: 1.12
promethus operator: release-0.1
I follow the README:
$ kubectl create -f manifests/
# It can take a few seconds for the above 'create manifests' command to fully create the following resources, so verify the resources are ready before proceeding.
$ until kubectl get customresourcedefinitions servicemonitors.monitoring.coreos.com ; do date; sleep 1; echo ""; done
$ until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
$ kubectl apply -f manifests/ # This command sometimes may need to be done twice (to workaround a race condition).
and then I use the command and then is showed like:
[root#VM_8_3_centos /data/hansenwu/kube-prometheus/manifests]# kubectl get pod -n monitoring
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 2/2 Running 0 66s
alertmanager-main-1 1/2 Running 0 47s
grafana-54f84fdf45-kt2j9 1/1 Running 0 72s
kube-state-metrics-65b8dbf498-h7d8g 4/4 Running 0 57s
node-exporter-7mpjw 2/2 Running 0 72s
node-exporter-crfgv 2/2 Running 0 72s
node-exporter-l7s9g 2/2 Running 0 72s
node-exporter-lqpns 2/2 Running 0 72s
prometheus-adapter-5b6f856dbc-ndfwl 1/1 Running 0 72s
prometheus-k8s-0 3/3 Running 1 59s
prometheus-k8s-1 3/3 Running 1 59s
prometheus-operator-5c64c8969-lqvkb 1/1 Running 0 72s
[root#VM_8_3_centos /data/hansenwu/kube-prometheus/manifests]# kubectl get pod -n monitoring
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 0/2 Pending 0 0s
grafana-54f84fdf45-kt2j9 1/1 Running 0 75s
kube-state-metrics-65b8dbf498-h7d8g 4/4 Running 0 60s
node-exporter-7mpjw 2/2 Running 0 75s
node-exporter-crfgv 2/2 Running 0 75s
node-exporter-l7s9g 2/2 Running 0 75s
node-exporter-lqpns 2/2 Running 0 75s
prometheus-adapter-5b6f856dbc-ndfwl 1/1 Running 0 75s
prometheus-k8s-0 3/3 Running 1 62s
prometheus-k8s-1 3/3 Running 1 62s
prometheus-operator-5c64c8969-lqvkb 1/1 Running 0 75s
I don't know why the pod altertmanager-main-0 pending and disaply then restart.
And I see the event, it is showed as:
72s Warning FailedCreate StatefulSet create Pod alertmanager-main-0 in StatefulSet alertmanager-main failed error: The POST operation against Pod could not be completed at this time, please try again.
72s Warning FailedCreate StatefulSet create Pod alertmanager-main-0 in StatefulSet alertmanager-main failed error: The POST operation against Pod could not be completed at this time, please try again.
72s Warning^Z FailedCreate StatefulSet
[10]+ Stopped kubectl get events -n monitoring
Most likely the alertmanager does not get enough time to start correctly.
Have a look at this answer : https://github.com/coreos/prometheus-operator/issues/965#issuecomment-460223268
You can set the paused field to true, and then modify the StatefulSet to try if extending the liveness/readiness solves your issue.
vagrant#ubuntu-xenial:~$ helm init
$HELM_HOME has been configured at /home/vagrant/.helm.
Warning: Tiller is already installed in the cluster.
(Use --client-only to suppress this message, or --upgrade to upgrade Tiller to the current version.)
vagrant#ubuntu-xenial:~$ helm ls
Error: could not find tiller
How can I diagnose this further?
Here are the currently running pods in kube-system:
vagrant#ubuntu-xenial:~$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
canal-dlfzg 2/2 Running 2 72d
canal-kxp4s 2/2 Running 0 29d
canal-lkkbq 2/2 Running 2 72d
coredns-86bc4b7c96-xwq4d 1/1 Running 2 49d
coredns-autoscaler-5d5d49b8ff-l6cxq 1/1 Running 0 49d
metrics-server-58bd5dd8d7-tbj7j 1/1 Running 1 72d
rke-coredns-addon-deploy-job-h4c4q 0/1 Completed 0 49d
rke-ingress-controller-deploy-job-mj82v 0/1 Completed 0 49d
rke-metrics-addon-deploy-job-tggx5 0/1 Completed 0 72d
rke-network-plugin-deploy-job-jzswv 0/1 Completed 0 72d
The issue was with the deployment / service account not being present.
vagrant#ubuntu-xenial:~$ kubectl get deployment tiller-deploy --namespace kube-system
NAME READY UP-TO-DATE AVAILABLE AGE
tiller-deploy 0/1 0 0 24h
vagrant#ubuntu-xenial:~$ kubectl get events --all-namespaces
kube-system 4m52s Warning FailedCreate replicaset/tiller-deploy-7f4d76c4b6 Error creating: pods "tiller-deploy-7f4d76c4b6-" is forbidden: error looking up service account kube-system/tiller: serviceaccount "tiller" not found
I deleted the deployment and ran helm init once again which then worked:
kubectl delete deployment tiller-deploy --namespace kube-system
helm init
I'm running a simple example with Helm. Take a look below at values.yaml file:
cat << EOF | helm install helm/vitess -n vitess -f -
topology:
cells:
- name: 'zone1'
keyspaces:
- name: 'vitess'
shards:
- name: '0'
tablets:
- type: 'replica'
vttablet:
replicas: 1
mysqlProtocol:
enabled: true
authType: secret
username: vitess
passwordSecret: vitess-db-password
etcd:
replicas: 3
vtctld:
replicas: 1
vtgate:
replicas: 3
vttablet:
dataVolumeClaimSpec:
storageClassName: nfs-slow
EOF
Take a look at the output of current pods running below:
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-fb8b8dccf-8f5kt 1/1 Running 0 32m
kube-system coredns-fb8b8dccf-qbd6c 1/1 Running 0 32m
kube-system etcd-master1 1/1 Running 0 32m
kube-system kube-apiserver-master1 1/1 Running 0 31m
kube-system kube-controller-manager-master1 1/1 Running 0 32m
kube-system kube-flannel-ds-amd64-bkg9z 1/1 Running 0 32m
kube-system kube-flannel-ds-amd64-q8vh4 1/1 Running 0 32m
kube-system kube-flannel-ds-amd64-vqmnz 1/1 Running 0 32m
kube-system kube-proxy-bd8mf 1/1 Running 0 32m
kube-system kube-proxy-nlc2b 1/1 Running 0 32m
kube-system kube-proxy-x7cd5 1/1 Running 0 32m
kube-system kube-scheduler-master1 1/1 Running 0 32m
kube-system tiller-deploy-8458f6c667-cx2mv 1/1 Running 0 27m
vitess etcd-global-6pwvnv29th 0/1 Init:0/1 0 16m
vitess etcd-operator-84db9bc774-j4wml 1/1 Running 0 30m
vitess etcd-zone1-zwgvd7spzc 0/1 Init:0/1 0 16m
vitess vtctld-86cd78b6f5-zgfqg 0/1 CrashLoopBackOff 7 16m
vitess vtgate-zone1-58744956c4-x8ms2 0/1 CrashLoopBackOff 7 16m
vitess zone1-vitess-0-init-shard-master-mbbph 1/1 Running 0 16m
vitess zone1-vitess-0-replica-0 0/6 Init:CrashLoopBackOff 7 16m
Running logs I see this error:
$ kubectl logs -n vitess vtctld-86cd78b6f5-zgfqg
++ cat
+ eval exec /vt/bin/vtctld '-cell="zone1"' '-web_dir="/vt/web/vtctld"' '-web_dir2="/vt/web/vtctld2/app"' -workflow_manager_init -workflow_manager_use_election -logtostderr=true -stderrthreshold=0 -port=15000 -grpc_port=15999 '-service_map="grpc-vtctl"' '-topo_implementation="etcd2"' '-topo_global_server_address="etcd-global-client.vitess:2379"' -topo_global_root=/vitess/global
++ exec /vt/bin/vtctld -cell=zone1 -web_dir=/vt/web/vtctld -web_dir2=/vt/web/vtctld2/app -workflow_manager_init -workflow_manager_use_election -logtostderr=true -stderrthreshold=0 -port=15000 -grpc_port=15999 -service_map=grpc-vtctl -topo_implementation=etcd2 -topo_global_server_address=etcd-global-client.vitess:2379 -topo_global_root=/vitess/global
ERROR: logging before flag.Parse: E0422 02:35:34.020928 1 syslogger.go:122] can't connect to syslog
F0422 02:35:39.025400 1 server.go:221] Failed to open topo server (etcd2,etcd-global-client.vitess:2379,/vitess/global): grpc: timed out when dialing
I'm running behind vagrant with 1 master and 2 nodes. I suspect that is a issue with eth1.
The storage are configured to use NFS.
$ kubectl logs etcd-operator-84db9bc774-j4wml
time="2019-04-22T17:26:51Z" level=info msg="skip reconciliation: running ([]), pending ([etcd-zone1-zwgvd7spzc])" cluster-name=etcd-zone1 cluster-namespace=vitess pkg=cluster
time="2019-04-22T17:26:51Z" level=info msg="skip reconciliation: running ([]), pending ([etcd-zone1-zwgvd7spzc])" cluster-name=etcd-global cluster-namespace=vitess pkg=cluster
It appears that etcd is not fully initializing. Note that neither the pod for the global lockserver (etcd-global-6pwvnv29th) nor the local one for cell zone1 (pod etcd-zone1-zwgvd7spzc) are ready.