Telemetry mixer logs - kubernetes

I deployed Istio 1.2.5 on a Kubernetes cluster.
According to the documentation at https://istio.io/faq/mixer/ (rules section), running:
kubectl get rules --all-namespaces
should return the list of rules. In my cluster I got "No resources found".
But if I use:
kubectl get rules.config.istio.io -n istio-system
I get the list:
NAME AGE
kubeattrgenrulerule 5h
promhttp 5h
promtcp 5h
promtcpconnectionclosed 5h
promtcpconnectionopen 5h
stdio 5h
stdiotcp 5h
Does anyone know the difference?
Also, if I try:
kubectl -n istio-system logs -f istio-telemetry-7df96d454b-4kxs9 -c mixer
I don't see the request logs in the output (I found that it works in another cluster). Do you know why?

I tried to reproduce your issue on both Istio 1.2.5 and Istio 1.3.0, in environments such as GKE, Minikube and Kubeadm.
I installed it both manually and using Helm. Each time everything worked as it should.
Since you mention that it works in another cluster and that you are using bare metal, I would guess that this cluster has some specific configuration, or that some of the Kubernetes/Istio objects have insufficient resources. You can check a node's capacity and allocated resources with:
$ kubectl describe node [node-name]
Please keep in mind that you might have installed an Istio configuration profile that requests too many resources. Each profile requests a different amount of resources for each component (citadel, egress, galley, pilot, telemetry, etc.). For example, the Istio docs state:
The Envoy proxy uses 0.6 vCPU and 50 MB memory per 1000 requests per second going through the proxy.
The istio-telemetry service uses 0.6 vCPU per 1000 mesh-wide requests per second.
Pilot uses 1 vCPU and 1.5 GB of memory.
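To compare these figures with what a node has already handed out, the relevant part of the kubectl describe node output can be isolated. The helper below is only a sketch: the function name allocated_resources is made up for illustration, and it assumes the describe output contains the usual "Allocated resources:" section followed by an "Events:" section.

```shell
# Hypothetical helper: print only the "Allocated resources" section of
# `kubectl describe node` for the node name given as the first argument.
allocated_resources() {
  kubectl describe node "$1" |
    awk '/^Allocated resources:/{p=1} /^Events:/{p=0} p'
}
# Usage (node name is a placeholder): allocated_resources my-node-1
```

If the CPU or memory requests shown there are close to 100%, the telemetry pods may not get the resources they ask for.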

Related

Istio Installation successful but not able to deploy POD

I have successfully installed Istio in a k8s cluster.
Istio version is 1.9.1
Kubernetes CNI plugin used: Calico version 3.18 (Calico POD is up and running)
kubectl get pod -A
istio-system istio-egressgateway-bd477794-8rnr6 1/1 Running 0 124m
istio-system istio-ingressgateway-79df7c789f-fjwf8 1/1 Running 0 124m
istio-system istiod-6dc55bbdd-89mlv 1/1 Running 0 124
When I try to deploy a sample nginx app, I get the error below:
failed calling webhook sidecar-injector.istio.io context deadline exceeded
Post "https://istiod.istio-system.svc:443/inject?timeout=30s":
context deadline exceeded
When I disable automatic proxy sidecar injection, the pod is deployed without any errors.
kubectl label namespace default istio-injection-
I am not sure how to fix this issue. Could someone please help me with this?
In this case, adding hostNetwork: true under spec.template.spec of the istiod Deployment may help.
This seems to be a workaround when using Calico CNI for pod networking (see: failed calling webhook "sidecar-injector.istio.io").
As we can find in the Kubernetes Host namespaces documentation:
HostNetwork - Controls whether the pod may use the node network namespace. Doing so gives the pod access to the loopback device, services listening on localhost, and could be used to snoop on network activity of other pods on the same node.
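One way to apply that change without editing the manifest by hand is a merge patch. This is a minimal sketch, not a confirmed fix: the Deployment name istiod and the namespace istio-system are taken from the pod listing in the question, and the function name is hypothetical.

```shell
# Sketch: set hostNetwork: true under spec.template.spec of the istiod
# Deployment using a JSON merge patch.
enable_istiod_host_network() {
  kubectl -n istio-system patch deployment istiod --type merge \
    -p '{"spec":{"template":{"spec":{"hostNetwork":true}}}}'
}
# Usage: enable_istiod_host_network
# Then watch the restart: kubectl -n istio-system rollout status deployment/istiod
```

Given the security implications quoted above, it is worth treating this as a workaround rather than a permanent setting.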

Unable to see Pods CPU and Memory Utilization, and graphs are missing in Kubernetes dashboard

K8s VERSION = v1.18.6
I have deployed the Kubernetes dashboard using the following command and added a privileged user with which I logged into the dashboard.
However, I am not able to see Pod CPU and memory utilization, and the graphs are missing from the Kubernetes dashboard.
The Kubernetes Metrics Server is an aggregator of resource usage data in your cluster.
Deploy the Metrics Server with the following command:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
Verify that the metrics-server deployment is running the desired number of pods with the following command.
kubectl get deployment metrics-server -n kube-system
Output
NAME READY UP-TO-DATE AVAILABLE AGE
metrics-server 1/1 1 1 6m
You can also validate it with the command below:
kubectl top nodes
This shows node CPU utilisation. If it works, the metrics should then show up in the Dashboard as well.
Resource usage metrics are only available for K8s clusters once Metrics Server has been installed.

GKE in-cluster DNS resolutions stopped working

So this has been working forever. I have a few simple services running in GKE and they refer to each other via the standard service.namespace DNS names.
Today all DNS name resolution stopped working. I haven't changed anything, although this may have been triggered by a master upgrade.
/ambassador # nslookup ambassador-monitor.default
nslookup: can't resolve '(null)': Name does not resolve
nslookup: can't resolve 'ambassador-monitor.default': Try again
/ambassador # cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local c.snowcloud-01.internal google.internal
nameserver 10.207.0.10
options ndots:5
Master version 1.14.7-gke.14
I can talk cross-service using their IP addresses, it's just DNS that's not working.
Not really sure what to do about this...
The easiest way to verify whether there is a problem with your Kube DNS is to look at the logs in Stackdriver [https://cloud.google.com/logging/docs/view/overview].
You should be able to find DNS resolution failures in the logs for the pods, with a filter such as the following:
resource.type="container"
("UnknownHost" OR "lookup fail" OR "gaierror")
Be sure to check logs for each container. Because the exact names and numbers of containers can change with the GKE version, you can find them like so:
kubectl get pod -n kube-system -l k8s-app=kube-dns -o \
jsonpath='{range .items[*].spec.containers[*]}{.name}{"\n"}{end}' | sort -u
kubectl get pods -n kube-system -l k8s-app=kube-dns
Has the pod been restarted frequently? Look for OOMs in the node console. The nodes for each pod can be found like so:
kubectl get pod -n kube-system -l k8s-app=kube-dns -o \
jsonpath='{range .items[*]}{.spec.nodeName} pod={.metadata.name}{"\n"}{end}'
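To answer the restart question in one listing, restart counts and the reason for the last container termination can be printed next to each pod and node. This is a rough sketch: the function name and column headers are made up, and it assumes the kube-dns pods carry the k8s-app=kube-dns label used above.

```shell
# Sketch: list each kube-dns pod with its node, per-container restart counts,
# and the reason for the last container termination (e.g. OOMKilled), if any.
kube_dns_restarts() {
  kubectl get pod -n kube-system -l k8s-app=kube-dns \
    -o 'custom-columns=POD:.metadata.name,NODE:.spec.nodeName,RESTARTS:.status.containerStatuses[*].restartCount,LAST_EXIT:.status.containerStatuses[*].lastState.terminated.reason'
}
# Usage: kube_dns_restarts
```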
The kube-dns pod contains four containers:
kube-dns: watches the Kubernetes master for changes in Services and Endpoints, and maintains in-memory lookup structures to serve DNS requests.
dnsmasq: adds DNS caching to improve performance.
sidecar: provides a single health check endpoint while performing dual health checks (for dnsmasq and kubedns). It also collects dnsmasq metrics and exposes them in the Prometheus format.
prometheus-to-sd: scrapes the metrics exposed by sidecar and sends them to Stackdriver.
By default, the dnsmasq container accepts 150 concurrent requests. Requests beyond this are simply dropped and result in failed DNS resolution, including resolution for metadata. To check for this, view the logs with the following filter:
resource.type="container"
resource.labels.cluster_name="<cluster-name>"
resource.labels.namespace_id="kube-system"
logName="projects/<project-id>/logs/dnsmasq"
"Maximum number of concurrent DNS queries reached"
If legacy stackdriver logging of cluster is disabled, use the following filter:
resource.type="k8s_container"
resource.labels.cluster_name="<cluster-name>"
resource.labels.namespace_name="kube-system"
resource.labels.container_name="dnsmasq"
"Maximum number of concurrent DNS queries reached"
If Stackdriver logging is disabled, execute the following:
kubectl logs --tail=1000 --namespace=kube-system -l k8s-app=kube-dns -c dnsmasq | grep 'Maximum number of concurrent DNS queries reached'
Additionally, you can run dig @10.207.0.10 ambassador-monitor.default.svc.cluster.local from each node to verify whether this is only impacting one node. If it is, you can simply re-create the impacted node.
It appears that I hit a bug that caused the gke-metadata server to start crash looping (which in turn prevented kube-dns from working).
Creating a new pool with a previous version (1.14.7-gke.10) and migrating to it fixed everything.
I am told a fix has already been submitted.
Thank you for your suggestions.
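For anyone needing the same workaround, the pool migration can be sketched roughly as follows. This is an illustration, not the exact procedure used above: the function name and all arguments are placeholders, gcloud is assumed to already have a default project and zone or region configured, and GKE nodes are assumed to carry the cloud.google.com/gke-nodepool label.

```shell
# Sketch: create a node pool pinned to an older node version, then cordon and
# drain the nodes of the old pool so workloads are rescheduled onto the new one.
# Arguments: cluster name, old pool name, new pool name, node version.
migrate_node_pool() {
  cluster="$1"; old_pool="$2"; new_pool="$3"; version="$4"
  gcloud container node-pools create "$new_pool" \
    --cluster "$cluster" --node-version "$version" || return 1
  # Bare node names of the old pool, space separated.
  nodes="$(kubectl get nodes -l "cloud.google.com/gke-nodepool=$old_pool" \
    -o 'jsonpath={.items[*].metadata.name}')"
  # Cordon everything first so evicted pods cannot land on another old node.
  for node in $nodes; do
    kubectl cordon "$node" || return 1
  done
  # Drain may need extra flags if pods use local (emptyDir) storage.
  for node in $nodes; do
    kubectl drain "$node" --ignore-daemonsets || return 1
  done
}
# Usage (names are placeholders):
#   migrate_node_pool my-cluster default-pool pool-1-14-7-gke-10 1.14.7-gke.10
```

The old pool can be deleted once all workloads are confirmed healthy on the new one.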
Start by debugging your Kubernetes services [1]. This will tell you whether it is a Kubernetes resource issue or whether Kubernetes itself is failing. Once you understand that, you can proceed to fix it. You can post the results here if you want to follow up.
[1] https://kubernetes.io/docs/tasks/debug-application-cluster/debug-service/

unable to access dns from a kubernetes pod

I have a Kubernetes master and node set up in two CentOS VMs on my Windows 10 machine.
I used flannel for CNI and deployed ambassador as an API gateway.
As the ambassador routes did not work, I analysed further and found that the DNS service (IP 10.96.0.10) is not accessible from a busybox pod, which means that none of the service names can be resolved. Could I get any suggestions, please?
1. You should use the newest version of Flannel.
Flannel does not set up service IPs; kube-proxy does. Look at kube-proxy on your nodes and ensure it is not reporting errors.
I'd suggest taking a look at https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#tabs-pod-install-4 and ensuring you have met the requirements stated there.
A similar issue, but with the Calico plugin, is described here: https://github.com/projectcalico/calico/issues/1798
2. Check that port 8285 is open. Flannel's udp backend uses UDP port 8285 for sending encapsulated IP packets (the vxlan backend uses UDP port 8472). Make sure this traffic is allowed to pass between the hosts.
3. Ambassador includes an integrated diagnostics service to help with troubleshooting, which may be useful for you. By default, it is not exposed to the Internet. To view it, we'll need to get the name of one of the Ambassador pods:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
ambassador-3655608000-43x86 1/1 Running 0 2m
ambassador-3655608000-w63zf 1/1 Running 0 2m
Forwarding local port 8877 to one of the pods:
kubectl port-forward ambassador-3655608000-43x86 8877
will then let us view the diagnostics at http://localhost:8877/ambassador/v0/diag/.
The first point should solve your problem; if not, try the remaining ones.
I hope this helps.

Kubernetes Autoscaling

I have Kubernetes v1.12.1 installed on my cluster.
I downloaded the metrics-server from the following repo:
https://github.com/kubernetes-incubator/metrics-server
and then ran the following command:
kubectl create -f metrics-server/deploy/1.8+/
and then I tried autoscaling a deployment using:
kubectl autoscale deployment example-app-tier --min 1 --max 3 --cpu-percent 70 --namespace example
but the TARGETS column shows <unknown>/70%:
kubectl get hpa --namespace example
NAMESPACE NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
example example-app-tier Deployment/example-app-tier <unknown>/70% 1 3 1 3h35m
and when I try running kubectl top nodes or kubectl top pods, I get an error saying:
error: Metrics not available for pod default/pi-ss8j6, age: 282h48m5.334137739s
So I'm looking for any tutorial that helps me step by step enabling autoscaling using metrics-server or Prometheus (and not Heapster as it is deprecated and will not be supported anymore)
Thank you!
You need to register your metrics server with the API server and make sure they can communicate.
https://github.com/kubernetes/kubernetes/issues/59438
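A quick way to see whether that registration is in place is to look at the APIService object for the metrics API. This is a minimal sketch: the function name is hypothetical, and it assumes metrics-server registers itself under the name v1beta1.metrics.k8s.io, which is what its standard manifests do.

```shell
# Sketch: print the status of the "Available" condition of the metrics
# APIService. "True" means the API server can reach metrics-server.
metrics_api_status() {
  kubectl get apiservice v1beta1.metrics.k8s.io \
    -o 'jsonpath={.status.conditions[?(@.type=="Available")].status}'
}
# Usage: metrics_api_status
```

If it prints True, kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes should return node metrics as JSON; anything else points at the registration or at connectivity between the API server and metrics-server.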
If that is already done, check the help for the kubectl top command in your version of k8s. The command may default to using Heapster, and you may need to tell it to use the new service instead.
https://github.com/kubernetes/kubernetes/pull/56206
From the help output, it looks like it has not yet been ported to the new metrics server and still looks for Heapster by default.
C02W84XMHTD5:tmp iahmad$ kubectl top node --help
Display Resource (CPU/Memory/Storage) usage of nodes.
The top-node command allows you to see the resource consumption of nodes.
Aliases:
node, nodes, no
Examples:
# Show metrics for all nodes
kubectl top node
# Show metrics for a given node
kubectl top node NODE_NAME
Options:
--heapster-namespace='kube-system': Namespace Heapster service is located in
--heapster-port='': Port name in service to use
--heapster-scheme='http': Scheme (http or https) to connect to Heapster as
--heapster-service='heapster': Name of Heapster service
-l, --selector='': Selector (label query) to filter on, supports '=', '==', and '!='.(e.g. -l
key1=value1,key2=value2)
Usage:
kubectl top node [NAME | -l label] [options]
Use "kubectl options" for a list of global command-line options (applies to all commands).
Note: I am using 1.10; the options may be different in your version.