I've installed Istio 1.1 RC on a fresh GKE cluster, using Helm, and enabled mTLS (some options omitted like Grafana and Kiali):
helm template istio/install/kubernetes/helm/istio \
--set global.mtls.enabled=true \
--set global.controlPlaneSecurityEnabled=true \
--set istio_cni.enabled=true \
--set istio-cni.excludeNamespaces={"istio-system"} \
--name istio \
--namespace istio-system >> istio.yaml
kubectl apply -f istio.yaml
Next, I installed the Bookinfo example app like this:
kubectl label namespace default istio-injection=enabled
kubectl apply -f istio/samples/bookinfo/platform/kube/bookinfo.yaml
kubectl apply -f istio/samples/bookinfo/networking/bookinfo-gateway.yaml
kubectl apply -f istio/samples/bookinfo/networking/destination-rule-all-mtls.yaml
Then I've gone about testing by following the examples at: https://istio.io/docs/tasks/security/mutual-tls/
My results show the config is incorrect, but the verification guide above doesn't provide any hints about how to fix or diagnose issues. Here's what I see:
istio/bin/istioctl authn tls-check productpage.default.svc.cluster.local
Stderr when execute [/usr/local/bin/pilot-discovery request GET /debug/authenticationz ]: gc 1 #0.015s 6%: 0.016+1.4+1.0 ms clock, 0.064+0.31/0.45/1.6+4.0 ms cpu, 4->4->1 MB, 5 MB goal, 4 P
gc 2 #0.024s 9%: 0.007+1.4+1.0 ms clock, 0.029+0.15/1.1/1.1+4.3 ms cpu, 4->4->2 MB, 5 MB goal, 4 P
HOST:PORT STATUS SERVER CLIENT AUTHN POLICY DESTINATION RULE
productpage.default.svc.cluster.local:9080 OK mTLS mTLS default/ default/istio-system
This appears to show that mTLS is OK. And the previous checks all pass, like checking the cachain is present, etc. The above check passes for all the bookinfo components.
However, the following checks show an issue:
1: Confirm that plain-text requests fail as TLS is required to talk to httpbin with the following command:
kubectl exec $(kubectl get pod -l app=productpage -o jsonpath={.items..metadata.name}) -c istio-proxy -- curl http://productpage:9080/productpage -o /dev/null -s -w '%{http_code}\n'
200 <== Error. Should fail.
2: Confirm TLS requests without client certificate also fail:
kubectl exec $(kubectl get pod -l app=productpage -o jsonpath={.items..metadata.name}) -c istio-proxy -- curl https://productpage:9080/productpage -o /dev/null -s -w '%{http_code}\n' -k
000 <=== Correct behaviour
command terminated with exit code 35
3: Confirm TLS request with client certificate succeed:
kubectl exec $(kubectl get pod -l app=productpage -o jsonpath={.items..metadata.name}) -c istio-proxy -- curl https://productpage:9080/productpage -o /dev/null -s -w '%{http_code}\n' --key /etc/certs/key.pem --cert /etc/certs/cert-chain.pem --cacert /etc/certs/root-cert.pem -k
000 <=== Incorrect. Should succeed.
command terminated with exit code 35
What else can I do to debug my installation? I've followed the installation process quite carefully. Here's my cluster info:
Kubernetes master is running at https://<omitted>
calico-typha is running at https://<omitted>/api/v1/namespaces/kube-system/services/calico-typha:calico-typha/proxy
GLBCDefaultBackend is running at https://<omitted>/api/v1/namespaces/kube-system/services/default-http-backend:http/proxy
Heapster is running at https://<omitted>/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://<omitted>/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://<omitted>/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy
Kubernetes version: 1.11.7-gke.4
I guess I'm after either a more comprehensive guide or some specific things I can check.
Edit: Additional info:
Pod status:
default ns:
NAME READY STATUS RESTARTS AGE
details-v1-68868454f5-l7srt 2/2 Running 0 3h
productpage-v1-5cb458d74f-lmf7x 2/2 Running 0 2h
ratings-v1-76f4c9765f-ttstt 2/2 Running 0 2h
reviews-v1-56f6855586-qszpm 2/2 Running 0 2h
reviews-v2-65c9df47f8-ztrss 2/2 Running 0 3h
reviews-v3-6cf47594fd-hq6pc 2/2 Running 0 2h
istio-system ns:
NAME READY STATUS RESTARTS AGE
grafana-7b46bf6b7c-2qzcv 1/1 Running 0 3h
istio-citadel-5bf5488468-wkmvf 1/1 Running 0 3h
istio-cleanup-secrets-release-1.1-latest-daily-zmw7s 0/1 Completed 0 3h
istio-egressgateway-cf8d6dc69-fdmw2 1/1 Running 0 3h
istio-galley-5bcd455cbb-7wjkl 1/1 Running 0 3h
istio-grafana-post-install-release-1.1-latest-daily-vc2ff 0/1 Completed 0 3h
istio-ingressgateway-68b6767bcb-65h2d 1/1 Running 0 3h
istio-pilot-856849455f-29nvq 2/2 Running 0 2h
istio-policy-5568587488-7lkdr 2/2 Running 2 3h
istio-sidecar-injector-744f68bf5f-h22sp 1/1 Running 0 3h
istio-telemetry-7ffd6f6d4-tsmxv 2/2 Running 2 3h
istio-tracing-759fbf95b7-lc7fd 1/1 Running 0 3h
kiali-5d68f4c676-qrxfd 1/1 Running 0 3h
prometheus-c4b6997b-6d5k9 1/1 Running 0 3h
Example destinationrule:
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
creationTimestamp: "2019-02-21T15:15:09Z"
generation: 1
name: productpage
namespace: default
spec:
host: productpage
subsets:
- labels:
version: v1
name: v1
trafficPolicy:
tls:
mode: ISTIO_MUTUAL
If you are using Istio 1.1 RC, you should be looking at the docs at https://preliminary.istio.io/ instead of https://istio.io/. The preliminary.istio.io site is always the working copy of the docs, corresponding to the next to be Istio release (1.1 currently).
That said, those docs are currently changing a lot day-to-day as they are being cleaned up and corrected during final testing before 1.1 is released, probably in the next couple of weeks.
A possible explanation for the plain text http request returning 200 in you test is that you may be running with permissive mode.
Related
I have the following pods on my kubernetes (1.18.3) cluster:
NAME READY STATUS RESTARTS AGE
pod1 1/1 Running 0 14m
pod2 1/1 Running 0 14m
pod3 0/1 Pending 0 14m
pod4 0/1 Pending 0 14m
pod3 and pod4 cannot start because the node has capacity for 2 pods only. When pod1 finishes and quits, then the scheduler picks either pod3 or pod4 and starts it. So far so good.
However, I also have a high priority pod (hpod) that I'd like to start before pod3 or pod4 when either of the running pods finishes and quits.
So I created a priorityclass can be found in the kubernetes docs:
kind: PriorityClass
metadata:
name: high-priority-no-preemption
value: 1000000
preemptionPolicy: Never
globalDefault: false
description: "This priority class should be used for XYZ service pods only."
I've created the following pod yaml:
apiVersion: v1
kind: Pod
metadata:
name: hpod
labels:
app: hpod
spec:
containers:
- name: hpod
image: ...
resources:
requests:
cpu: "500m"
memory: "500Mi"
limits:
cpu: "500m"
memory: "500Mi"
priorityClassName: high-priority-no-preemption
Now the problem is that when I start the high prio pod with kubectl apply -f hpod.yaml, then the scheduler terminates a running pod to allow the high priority pod to start despite I've set 'preemptionPolicy: Never'.
The expected behaviour would be to postpone starting hpod until a currently running pod finishes. And when it does, then let hpod start before pod3 or pod4.
What am I doing wrong?
Prerequisites:
This solution was tested on Kubernetes v1.18.3, docker 19.03 and Ubuntu 18.
Also text editor is required (i.e. sudo apt-get install vim).
In Kubernetes documentation under How to disable preemption you can find Note:
Note: In Kubernetes 1.15 and later, if the feature NonPreemptingPriority is enabled, PriorityClasses have the option to set preemptionPolicy: Never. This will prevent pods of that PriorityClass from preempting other pods.
Also under Non-preempting PriorityClass you have information:
The use of the PreemptionPolicy field requires the NonPreemptingPriority feature gate to be enabled.
Later if you will check thoses Feature Gates info, you will find that NonPreemptingPriority is false, so as default it's disabled.
Output with your current configuration:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-normal 1/1 Running 0 32s
nginx-normal-2 1/1 Running 0 32s
$ kubectl apply -f prio.yaml
pod/nginx-priority created$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-normal-2 1/1 Running 0 48s
nginx-priority 1/1 Running 0 8s
To enable preemptionPolicy: Never you need to apply --feature-gates=NonPreemptingPriority=true to 3 files:
/etc/kubernetes/manifests/kube-apiserver.yaml
/etc/kubernetes/manifests/kube-controller-manager.yaml
/etc/kubernetes/manifests/kube-scheduler.yaml
To check if this feature-gate is enabled you can check by using commands:
ps aux | grep apiserver | grep feature-gates
ps aux | grep scheduler | grep feature-gates
ps aux | grep controller-manager | grep feature-gates
For quite detailed information, why you have to edit thoses files please check this Github thread.
$ sudo su
# cd /etc/kubernetes/manifests/
# ls
etcd.yaml kube-apiserver.yaml kube-controller-manager.yaml kube-scheduler.yaml
Use your text editor to add feature gate to those files
# vi kube-apiserver.yaml
and add - --feature-gates=NonPreemptingPriority=true under spec.containers.command like in example bellow:
spec:
containers:
- command:
- kube-apiserver
- --feature-gates=NonPreemptingPriority=true
- --advertise-address=10.154.0.31
And do the same with 2 other files. After that you can check if this flags were applied.
$ ps aux | grep apiserver | grep feature-gates
root 26713 10.4 5.2 565416 402252 ? Ssl 14:50 0:17 kube-apiserver --feature-gates=NonPreemptingPriority=true --advertise-address=10.154.0.31
Now you have redeploy your PriorityClass.
$ kubectl get priorityclass
NAME VALUE GLOBAL-DEFAULT AGE
high-priority-no-preemption 1000000 false 12m
system-cluster-critical 2000000000 false 23m
system-node-critical 2000001000 false 23m
$ kubectl delete priorityclass high-priority-no-preemption
priorityclass.scheduling.k8s.io "high-priority-no-preemption" deleted
$ kubectl apply -f class.yaml
priorityclass.scheduling.k8s.io/high-priority-no-preemption created
Last step is to deploy pod with this PriorityClass.
TEST
$ kubectl get po
NAME READY STATUS RESTARTS AGE
nginx-normal 1/1 Running 0 4m4s
nginx-normal-2 1/1 Running 0 18m
$ kubectl apply -f prio.yaml
pod/nginx-priority created
$ kubectl get po
NAME READY STATUS RESTARTS AGE
nginx-normal 1/1 Running 0 5m17s
nginx-normal-2 1/1 Running 0 20m
nginx-priority 0/1 Pending 0 67s
$ kubectl delete po nginx-normal-2
pod "nginx-normal-2" deleted
$ kubectl get po
NAME READY STATUS RESTARTS AGE
nginx-normal 1/1 Running 0 5m55s
nginx-priority 1/1 Running 0 105s
I did a complete tear down of a v1.13.1 cluster and am now running v1.15.0 with calico cni v3.8.0. All pods are running:
[gms#thalia0 ~]$ kubectl get po --namespace=kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-59f54d6bbc-2mjxt 1/1 Running 0 7m23s
calico-node-57lwg 1/1 Running 0 7m23s
coredns-5c98db65d4-qjzpq 1/1 Running 0 8m46s
coredns-5c98db65d4-xx2sh 1/1 Running 0 8m46s
etcd-thalia0.ahc.umn.edu 1/1 Running 0 8m5s
kube-apiserver-thalia0.ahc.umn.edu 1/1 Running 0 7m46s
kube-controller-manager-thalia0.ahc.umn.edu 1/1 Running 0 8m2s
kube-proxy-lg4cn 1/1 Running 0 8m46s
kube-scheduler-thalia0.ahc.umn.edu 1/1 Running 0 7m40s
But, when I look at the endpoint, I get the following:
[gms#thalia0 ~]$ kubectl get ep --namespace=kube-system
NAME ENDPOINTS AGE
kube-controller-manager <none> 9m46s
kube-dns 192.168.16.194:53,192.168.16.195:53,192.168.16.194:53 + 3 more... 9m30s
kube-scheduler <none> 9m46s
If I look at the log for the apiserver, I get a ton of TLS handshake errors, along the lines of:
I0718 19:35:17.148852 1 log.go:172] http: TLS handshake error from 10.x.x.160:45042: remote error: tls: bad certificate
I0718 19:35:17.158375 1 log.go:172] http: TLS handshake error from 10.x.x.159:53506: remote error: tls: bad certificate
These IP addresses were from nodes in a previous cluster. I had deleted them and done a kubeadm reset on all nodes, including master, so I have no idea why these are showing up. I would assume this is why the endpoints for the controller-manager and the scheduler are showing up as <none>.
In order to completely wipe your cluster you should do next:
1) Reset cluster
$sudo kubeadm reset (or use appropriate to your cluster command)
2) Wipe your local directory with configs
$rm -rf .kube/
3) Remove /etc/kubernetes/
$sudo rm -rf /etc/kubernetes/
4)And one of the main point is to get rid of your previous etc state configuration.
$sudo rm -rf /var/lib/etcd/
I am working on a tutorial on Create a Kubernetes service to point to the Ambassador deployment.
Tutorial: https://www.bogotobogo.com/DevOps/Docker/Docker-Envoy-Ambassador-API-Gateway-for-Kubernetes.php
On running the command
curl $(minikube service --url ambassador)/httpbin/ip
I'm getting error
curl: (7) Failed to connect to 192.168.99.100 port 30790: Connection refused
curl: (3) <url> malformed
I can actually remove the error of
curl: (3) <url> malformed
by running
minikube service --url ambassador
http://192.168.99.100:30790
and then
curl http://192.168.99.100:30790/httpbin/ip
I've already tried this answer curl: (7) Failed to connect to 192.168.99.100 port 31591: Connection refused also the step mentioned in this answer is already in the blog, and it didn't work.
This is the code from the blog, for ambassador-svc.yaml
---
apiVersion: v1
kind: Service
metadata:
labels:
service: ambassador
name: ambassador
annotations:
getambassador.io/config: |
---
apiVersion: ambassador/v0
kind: Mapping
name: httpbin_mapping
prefix: /httpbin/
service: httpbin.org:80
host_rewrite: httpbin.org
spec:
type: LoadBalancer
ports:
- name: ambassador
port: 80
targetPort: 80
selector:
service: ambassador
Can this be a problem related to VM?
Also, I tried to work on this tutorial first but unfortunately, got the same error.
Let me know if anything else is needed from my side.
Edit:
1.As asked in the comment here is the output of
kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
coredns-fb8b8dccf-qkxwm 1/1 Running 0 5h16m
coredns-fb8b8dccf-rrn4f 1/1 Running 0 5h16m
etcd-minikube 1/1 Running 0 5h15m
kube-addon-manager-minikube 1/1 Running 4 5h15m
kube-apiserver-minikube 1/1 Running 0 5h15m
kube-controller-manager-minikube 1/1 Running 0 3h17m
kube-proxy-wfbxs 1/1 Running 0 5h16m
kube-scheduler-minikube 1/1 Running 0 5h15m
storage-provisioner 1/1 Running 0 5h16m
after running
kubectl apply -f https://docs.projectcalico.org/v3.7/manifests/calico.yaml
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-78f8f67c4d-zqtl2 1/1 Running 0 65s
calico-node-27lcq 1/1 Running 0 65s
coredns-fb8b8dccf-qkxwm 1/1 Running 2 22h
coredns-fb8b8dccf-rrn4f 1/1 Running 2 22h
etcd-minikube 1/1 Running 1 22h
kube-addon-manager-minikube 1/1 Running 5 22h
kube-apiserver-minikube 1/1 Running 1 22h
kube-controller-manager-minikube 1/1 Running 0 8m27s
kube-proxy-wfbxs 1/1 Running 1 22h
kube-scheduler-minikube 1/1 Running 1 22h
storage-provisioner 1/1 Running 2 22h
kubectl get pods --namespace=kube-system should have the network service pod
So you have not set up networking policy to used for DNS.
Try using Network Policy Calico
by using command
kubectl apply -f https://docs.projectcalico.org/v3.7/manifests/calico.yaml
check now kubectl get pods --namespace=kube-system
You should get output like this :-
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-6ff88bf6d4-tgtzb 1/1 Running 0 2m45s
kube-system calico-node-24h85 1/1 Running 0 2m43s
kube-system coredns-846jhw23g9-9af73 1/1 Running 0 4m5s
kube-system coredns-846jhw23g9-hmswk 1/1 Running 0 4m5s
kube-system etcd-jbaker-1 1/1 Running 0 6m22s
kube-system kube-apiserver-jbaker-1 1/1 Running 0 6m12s
kube-system kube-controller-manager-jbaker-1 1/1 Running 0 6m16s
kube-system kube-proxy-8fzp2 1/1 Running 0 5m16s
kube-system kube-scheduler-jbaker-1 1/1 Running 0 5m41s
Here is a checklist to troubleshoot the problem:
Have you created at least one Listener (CRD) ?
If you don't create at least one Listener the edge-stack pod won't respond on any ports for a request. So create the Listeners object as reported in the official doc: https://www.getambassador.io/docs/edge-stack/latest/tutorials/getting-started/
Note that the default port defined in the Ambassador Deployment and in turn in the edge-stack pod are 8080 and 8443 so avoid to change them unless you know what you are doing. You can check if any existing Listener is defined with this command: kubectl get Listener -n ambassador or just kubectl get Listener if maybe you used the default namespace by mistake.
Have you defined at least one Service of type LoadBalancer which reference the Ambassador service ?
Normally this is not needed because once you deployed Ambassador on your node you should have the Service edge-stack in the Namespace ambassador which is configured as LoadBalancer so it exposes some ports in the range 30000-32767 and redirect the traffic to the 8080 and 8443. If you manually created a Service of type LoadBalancer make sure to use the right ports in the port and targetPort fields. The ports are those used by the edge-stack Deployment which are by default 8080 and 8443)
Check that the edge-stack pod is responding to the request on it's port
The easiest way is to use the web based UI of Kubernates or your cloud provider and exec a bash into the edge-stack pod. With the default Kubernates dashboard you can just reach the pod, click on the vertical three dots -> Exec.
If you are running ambassador locally you can use eval $(minikube docker-env), search the related container with docker ps | grep edge-stack and once you got its id you can exec the bash with the command docker exec -it <id> bash
Finally run curl -Lki http://127.0.0.1:8080 You shoudl get something like this: HTTP/1.1 404 Not Founddate: Tue, XX Xxx XXXX XX:XX:XX XXXserver: envoycontent-length: 0
If curl get a response instead of "connection refused" the service is running successfully on the pod and the problem must be in the Service, Deployment or Listener configuration. You can try to use the Cluster IP of the Service edge-stack instead of 127.0.0.1 but you need to run the curl command from a pod different than edge-stack. You can use your own pod or exec into the other ambassador pods which are: edge-stack-agent and edge-stack-redis.
I am following this tutorial: https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/
I have created the memory pod demo and I am trying to get the metrics from the pod but it is not working.
I installed the metrics server by cloning: https://github.com/kubernetes-incubator/metrics-server
And then running this command from top level:
kubectl create -f deploy/1.8+/
I am using kubernetes version 1.10.11.
The pod is definitely created:
λ kubectl get pod memory-demo --namespace=mem-example
NAME READY STATUS RESTARTS AGE
memory-demo 1/1 Running 0 6m
But the metics command does not work and gives an error:
λ kubectl top pod memory-demo --namespace=mem-example
Error from server (NotFound): podmetrics.metrics.k8s.io "mem-example/memory-demo" not found
What did I do wrong?
There are some patches to be done to metrics server deployment to get the metrics working.
Follow the below steps
kubectl delete -f deploy/1.8+/
wait till the metrics server gets undeployed
run the below command
kubectl create -f https://raw.githubusercontent.com/epasham/docker-repo/master/k8s/metrics-server.yaml
master $ kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-78fcdf6894-6zg78 1/1 Running 0 2h
coredns-78fcdf6894-gk4sb 1/1 Running 0 2h
etcd-master 1/1 Running 0 2h
kube-apiserver-master 1/1 Running 0 2h
kube-controller-manager-master 1/1 Running 0 2h
kube-proxy-f5z9p 1/1 Running 0 2h
kube-proxy-ghbvn 1/1 Running 0 2h
kube-scheduler-master 1/1 Running 0 2h
metrics-server-85c54d44c8-rmvxh 2/2 Running 0 1m
weave-net-4j7cl 2/2 Running 1 2h
weave-net-82fzn 2/2 Running 1 2h
master $ kubectl top pod -n kube-system
NAME CPU(cores) MEMORY(bytes)
coredns-78fcdf6894-6zg78 2m 11Mi
coredns-78fcdf6894-gk4sb 2m 9Mi
etcd-master 14m 90Mi
kube-apiserver-master 24m 425Mi
kube-controller-manager-master 26m 62Mi
kube-proxy-f5z9p 2m 19Mi
kube-proxy-ghbvn 3m 17Mi
kube-scheduler-master 8m 14Mi
metrics-server-85c54d44c8-rmvxh 1m 19Mi
weave-net-4j7cl 2m 59Mi
weave-net-82fzn 1m 60Mi
Check and verify the below lines in metrics server deployment manifest.
command:
- /metrics-server
- --metric-resolution=30s
- --kubelet-preferred-address-types=InternalIP
- --kubelet-insecure-tls
On Minikube, I had to wait for 20-25 minutes after enabling the metrics-server addon. I was getting the same error for 20-25 minutes but later I could see the output without attempting for any solution.
I faced the similar issue of
Error from server (NotFound): podmetrics.metrics.k8s.io "default/apple-app" not found
I followed two steps and I was able to resolve the issue.
Download the latest customized components.yaml, which is their official file used for easy deployment.
Update the change
# - /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
to the command section of the deployment specification. I have commented the first line because it is the entrypoint of the image used by kubernetes metrics-server.
$ docker image inspect k8s.gcr.io/metrics-server-amd64:v0.3.6 -f {{.ContainerConfig.Entrypoint}}
[/metrics-server]
Even If you use it or not, it doesn't matter.
Note: You have to wait for few seconds for it to properly work.
After this running the top command will work for you.
$ kubectl top pod apple-app
NAME CPU(cores) MEMORY(bytes)
apple-app 1m 3Mi
I know this is an old thread may be someone will find this answer useful.
You have to checkout the following repo:
https://github.com/kubernetes-incubator/metrics-server
Go to the root of the repo and checkout release-0.3.2.
Remove default metrics server by:
kubectl delete -f deploy/1.8+/
Download the container yaml
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
Edit the container.yaml by adding the following lines to the argument section. You will see these two lines there
args:
- --kubelet-preferred-address-types=InternalIP
- --kubelet-insecure-tls=true
There is only one args parameter in that file.
Deploy your pod/deployment and you should be able to do:
kubectl top pod <pod-name>
I'm following the tutorial for Istio on the Google Cloud Platform and have been able to get my cluster up and running. I get to part where I start the demo app by running kubectl apply -f install/kubernetes/istio-demo-auth.yaml but a number of the pods wont come up.
I'm running Istio 1.0.3
kubectl version --short
Client Version: v1.11.1
Server Version: v1.9.7-gke.6
When I run the command kubectl get service -n istio-system
to verify istio pods are deployed and containers are running many of them are in crash cycles. Any tips on how to debug this?
grafana-7b6d98d887-9dgdc 1/1 Running 0 17h
istio-citadel-778747b96d-cw78t 1/1 Running 0 17h
istio-cleanup-secrets-2vjlf 0/1 Completed 0 17h
istio-egressgateway-7b8f4ccb6-rl69j 1/1 Running 123 17h
istio-galley-7557f8c985-jp975 0/1 ContainerCreating 0 17h
istio-grafana-post-install-n45x4 0/1 Error 202 17h
istio-ingressgateway-5f94fdc55f-dc2q5 1/1 Running 123 17h
istio-pilot-d6b56bf4d-czp9w 1/2 CrashLoopBackOff 328 17h
istio-policy-6c7d8454b-dpvfj 1/2 CrashLoopBackOff 500 17h
istio-security-post-install-qrzpq 0/1 CrashLoopBackOff 201 17h
istio-sidecar-injector-75cf59b857-z7wbc 0/1 ContainerCreating 0 17h
istio-telemetry-69db5c7575-4jp2d 1/2 CrashLoopBackOff 500 17h
istio-tracing-77f9f94b98-vjmhc 1/1 Running 0 17h
prometheus-67d4988588-gjmcp 1/1 Running 0 17h
servicegraph-57d8ff7686-x2r8r 1/1 Running 0 17h
That looks like the output for kubectl -n istio-system get pods
Tips, check the output for these:
$ kubectl -n istio-system logs istio-pilot-d6b56bf4d-czp9w
$ kubectl -n istio-system logs istio-policy-6c7d8454b-dpvfj
$ kubectl -n istio-system logs istio-grafana-post-install-n45x4
$ kubectl -n istio-system logs istio-telemetry-69db5c7575-4jp2d
Check the deployment/service/configmap definitions in install/kubernetes/istio-demo-auth.yaml that you have pods crashing for.
Try installing with Helm via template.
You would usually want to have Grafana, Zipkin & Kiali along. This is what worked for me:
1) kubectl apply -f install/kubernetes/helm/istio/templates/crds.yaml
2) helm template install/kubernetes/helm/istio --name istio --namespace istio-system --set grafana.enabled=true --set servicegraph.enabled=true --set tracing.enabled=true --set kiali.enabled=true --set sidecarInjectorWebhook.enabled=true --set global.tag=1.0.5 > $HOME/istio.yaml
3) kubectl create namespace istio-system
4) kubectl apply -f $HOME/istio.yaml
I had a similar issue - turned out that my NAT Gateway wasn't configured correctly. The Terraform I used to create the private cluster created an additional default internet gateway that I needed to delete.
Some came up, some didn't - I think that maybe some of the images were cached somewhere the cluster could reach, like a Google repo.