Kubernetes Health Checks Failing with Network Policies Enabled - kubernetes

When enabling only egress network policies, all readiness and liveness checks fail after pods are restarted.
This is what I see when describing the pod:
Warning Unhealthy 115s (x7 over 2m55s) kubelet, Readiness probe failed: Get http://10.202.158.105:80/health/ready: dial tcp 10.202.158.105:80: connect: connection refused
Warning Unhealthy 115s (x7 over 2m55s) kubelet, Liveness probe failed: Get http://10.202.158.105:80/health/live: dial tcp 10.202.158.105:80: connect: connection refused
If I disable the policies, the health checks immediately resume functioning. If the pod was already healthy before the network policies were applied, it continues to work.
I've also tried to whitelist every namespace with this policy:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector: {}
    ports:
    - protocol: TCP
      port: 80
    - protocol: TCP
      port: 8080
I'm having a hard time finding any guidance on how to resolve this. Is there an egress policy that would need to be enabled to allow the kubelet to monitor the pods' health checks?
The pod is running inside Azure Kubernetes Service (AKS) and using Calico networking.

It looks like kube-probe uses the .1 address of each pod CIDR in AKS. I believe that is the address assigned to the Linux bridge on the agent pool VM, so the host chooses it as the cheapest route to the pods.
There is no pod with this address so I can't see how it can be matched by a selector, unless AKS has some magic built in to their implementation.
kubectl get pods --all-namespaces -o json \
| jq -r '.items[] | [ .status.podIP, .metadata.name ] | join("\t")'
The policy could be made to work with a specific rule for the source .1 IP of all the pod CIDRs.
kubectl get nodes -o json \
| jq '.items[] | [ .metadata.name, .spec.podCIDR ]'
[
"aks-agentpool-12345678-vmss000000",
"10.212.0.0/24"
]
[
"aks-agentpool-12345678-vmss000001",
"10.212.1.0/24"
]
So that would be an ipBlock for each node:
ingress:
- from:
  - ipBlock:
      cidr: 10.212.0.1/32
  - ipBlock:
      cidr: 10.212.1.1/32
Which is a bit horrible, as it's per-cluster and per-node-pool configuration. I only dabble with AKS, so there might be a better solution. If you can't find anything else I'd file a bug on https://github.com/Azure/AKS/
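Putting the pieces together, a full policy for this example cluster might look like the sketch below. The ipBlock CIDRs come from the node listing above and the ports from the probes in the question; both are assumptions that will differ per cluster and per node pool.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-kubelet-probes
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    # assumed .1 (bridge) address of each node's pod CIDR, where kube-probe traffic originates
    - ipBlock:
        cidr: 10.212.0.1/32
    - ipBlock:
        cidr: 10.212.1.1/32
    ports:
    - protocol: TCP
      port: 80
    - protocol: TCP
      port: 8080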

Related

Connection Refused between Kubernetes pods in the same cluster

I am new to Kubernetes and I'm working on deploying an application within a new Kubernetes cluster.
Currently, the running service has multiple pods that need to communicate with each other. I'm looking for a general approach to debugging the issue rather than getting into the specifics of the service, as the question would otherwise become much too specific.
The pods within the cluster are throwing an error:
err="Get \"http://testpod.mynamespace.svc.cluster.local:8080/": dial tcp 10.10.80.100:8080: connect: connection refused"
Both pods are in the same cluster.
What are the best steps to take to debug this?
I have tried running:
kubectl exec -it testpod --namespace mynamespace -- cat /etc/resolv.conf
And this returns:
search mynamespace.svc.cluster.local svc.cluster.local cluster.local us-east-2.compute.internal
Which I found here: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/
First of all, the following pattern:
my-svc.my-namespace.svc.cluster-domain.example
is applicable only to FQDNs of Services, not Pods, which have the following form:
pod-ip-address.my-namespace.pod.cluster-domain.example
e.g.:
172-17-0-3.default.pod.cluster.local
So in fact you're querying the cluster DNS about the FQDN of the Service named testpod, not about the FQDN of the Pod. Judging by the fact that it's being resolved successfully, such a Service already exists in your cluster but is most probably misconfigured. The fact that you're getting the error connection refused can mean the following:
your Service FQDN testpod.mynamespace.svc.cluster.local has been successfully resolved
(otherwise you would receive something like curl: (6) Could not resolve host: testpod.default.svc.cluster.local)
you've successfully reached your testpod Service
(otherwise, i.e. if it existed but wasn't listening on the 8080 port you're trying to connect to, you would receive a timeout, e.g. curl: (7) Failed to connect to testpod.default.svc.cluster.local port 8080: Connection timed out)
you've reached the Pod exposed by the testpod Service (you've been successfully redirected to it by the testpod Service)
but once the Pod is reached, you're trying to connect to an incorrect port, and that's why the connection is being refused by the server
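One way to observe which of these cases you're hitting is to run curl from a throwaway pod inside the cluster; a sketch, where the image and URL are only illustrative:
# Sketch: call the Service FQDN from inside the cluster and read off the failure mode
kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl --command -- \
  curl -v http://testpod.mynamespace.svc.cluster.local:8080/
# "Could not resolve host" -> DNS problem
# "Connection timed out"   -> Service exists but nothing reachable is listening
# "Connection refused"     -> you reached the Pod, but nothing listens on that port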
My best guess is that your Pod in fact listens on a different port, such as 80, but you exposed it via the ClusterIP Service by specifying only the --port value, e.g. with:
kubectl expose pod testpod --port=8080
In such a case both --port (the port of the Service) and --target-port (the port of the Pod) get the same value. In other words, you've created a Service like the one below:
apiVersion: v1
kind: Service
metadata:
  name: testpod
spec:
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
And you probably should've exposed it either this way:
kubectl expose pod testpod --port=8080 --target-port=80
or with the following yaml manifest:
apiVersion: v1
kind: Service
metadata:
  name: testpod
spec:
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 80
Of course your targetPort may be different from 80, but connection refused in such a case can mean only one thing: the target HTTP server (running in the Pod) refuses the connection on port 8080, most probably because it isn't listening on it. You didn't specify what image you're using, whether it's a standard nginx webserver or something based on your custom image. But if it's nginx and it wasn't configured differently, it listens on port 80.
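To confirm the mismatch without guessing, you can compare what the Service targets against what the Pod declares; a sketch (containerPort is informational and may be empty if it wasn't declared in the manifest):
# port(s) the Service forwards traffic to on the Pod
kubectl get svc testpod -n mynamespace -o jsonpath='{.spec.ports[*].targetPort}'; echo
# port(s) declared by the Pod's containers
kubectl get pod testpod -n mynamespace -o jsonpath='{.spec.containers[*].ports[*].containerPort}'; echo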
For further debugging, you can attach to your Pod:
kubectl exec -it testpod --namespace mynamespace -- /bin/sh
and if the netstat command is not present (the most likely scenario), run:
apt update && apt install net-tools
and then check with netstat -ntlp which port your container listens on.
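If the image has no package manager or apt isn't available, ss from iproute2 (present in many images) or /proc can serve as a fallback; a sketch:
# listening TCP sockets via iproute2
ss -ntlp
# last resort: listening sockets straight from /proc (state 0A = LISTEN; addresses and ports are in hex, /proc/net/tcp6 covers IPv6)
awk '$4 == "0A" {print $2}' /proc/net/tcp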
I hope this helps you solve your issue. In case of any doubts, don't hesitate to ask.

Kubernetes Ingress Controller: Failed calling webhook, dial tcp connect: connection refused

I have set up a Kubernetes cluster (a master and a worker) on two Centos 7 machines. They have the following IPs:
Master: 192.168.1.40
Worker: 192.168.1.41
They are accessible by SSH and I am not using a VPN. For both boxes, I have sudo access.
For the work I am doing, I had to add an Nginx Ingress Controller, which I did by doing:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.43.0/deploy/static/provider/baremetal/deploy.yaml
This YAML file seems fine to me and is the one commonly used to add an NGINX ingress controller to a Kubernetes cluster.
I don't see any errors when I do the above command.
However, when I try to install a helm configuration, such as:
helm install dai eggplant/dai --version 0.6.5 -f dai.yaml --namespace dai
I am getting an error with my Nginx Ingress Controller:
W0119 11:58:00.550727 60628 warnings.go:70] extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
Error: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post "https://ingress-nginx-controller-admission.ingress-nginx.svc:443/extensions/v1beta1/ingresses?timeout=30s": dial tcp 10.108.86.48:443: connect: connection refused
I think this is because of some kind of DNS error. I don't know where the IP 10.108.86.48:443 is coming from or how to find out.
I have also enabled a bunch of ports with firewall-cmd.
[root@manager-node ~]# sudo firewall-cmd --list-all
public (active)
target: default
icmp-block-inversion: no
interfaces: ens33
sources:
services: dhcpv6-client ssh
ports: 6443/tcp 2379-2380/tcp 10250/tcp 10251/tcp 10252/tcp 10255/tcp 443/tcp 30154/tcp 31165/tcp
protocols:
masquerade: no
forward-ports:
source-ports:
icmp-blocks:
rich rules:
However, my nginx ingress pod doesn't seem to start either:
NAME READY STATUS RESTARTS AGE
ingress-nginx-controller-7bc44b4bb-rwmh2 0/1 ContainerCreating 0 19h
It remains as ContainerCreating for hours.
The issue is that as part of that kubectl apply -f you are also applying a ValidatingWebhookConfiguration (check the applied manifest file).
See Using Admission Controllers | Kubernetes for more info.
The error you are seeing is because your Deployment is not starting up, so the ValidatingWebhook service configured as part of it isn't starting up either, and the Validating Controller in Kubernetes is therefore failing every request. The webhook is configured by these arguments on the controller container:
- --validating-webhook=:8443
- --validating-webhook-certificate=/usr/local/certificates/cert
- --validating-webhook-key=/usr/local/certificates/key
Your pod is most likely not starting for another reason. More information is required to further debug.
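A good first step is to look at the events for the stuck pod; a sketch, assuming the default ingress-nginx namespace created by that manifest and the pod name from your output:
# events usually reveal why a pod is stuck in ContainerCreating (CNI, image pull, mounts, ...)
kubectl describe pod ingress-nginx-controller-7bc44b4bb-rwmh2 -n ingress-nginx
kubectl get events -n ingress-nginx --sort-by=.metadata.creationTimestamp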
I would recommend removing the ValidatingWebhookConfiguration from the applied manifest.
You can also remove it manually with
kubectl delete validatingwebhookconfiguration ingress-nginx-admission
(ValidatingWebhookConfigurations aren't namespaced)

All Kubernetes proxy targets down - Prometheus Operator

I have a k8s cluster deployed in OpenStack. I have deployed the Prometheus Operator to monitor the cluster, but I am getting a "Kubernetes proxy down" alert for all the nodes.
I would like to understand the basics of how the Prometheus Operator scrapes kube-proxy, and what configuration needs to be done to fix this.
I can see that kube-proxy is running on all nodes on port 10249.
Error:
Get http://10.8.10.11:10249/metrics: dial tcp 10.8.10.11:10249: connect: connection refused
HELM values configuration
kubeProxy:
  enabled: true
  ## If your kube proxy is not deployed as a pod, specify IPs it can be found on
  ##
  endpoints: []
  # - 10.141.4.22
  # - 10.141.4.23
  # - 10.141.4.24
  service:
    port: 10249
    targetPort: 10249
    # selector:
    #   k8s-app: kube-proxy
  serviceMonitor:
    ## Scrape interval. If not set, the Prometheus default scrape interval is used.
    ##
    interval: ""
    ## Enable scraping kube-proxy over https.
    ## Requires proper certs (not self-signed) and delegated authentication/authorization checks
    ##
    https: false
Set the kube-proxy metricsBindAddress configuration value (the --metrics-bind-address argument):
$ kubectl edit cm/kube-proxy -n kube-system
...
kind: KubeProxyConfiguration
metricsBindAddress: 0.0.0.0:10249
...
$ kubectl delete pod -l k8s-app=kube-proxy -n kube-system
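To verify the change took effect once the kube-proxy pods have been recreated, something like this sketch can help (substitute a real node IP for the placeholder):
# the ConfigMap should now carry the new bind address
kubectl -n kube-system get cm kube-proxy -o yaml | grep metricsBindAddress
# from a node, or anything with network access to it, the metrics endpoint should answer
curl -s http://<node-ip>:10249/metrics | head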

dns entries for pods in not ready state

I'm trying to build a simple mongo replica set cluster in Kubernetes.
I have a StatefulSet of mongod instances, with
livenessProbe:
  initialDelaySeconds: 60
  exec:
    command:
    - mongo
    - --eval
    - "db.adminCommand('ping')"
readinessProbe:
  initialDelaySeconds: 60
  exec:
    command:
    - /usr/bin/mongo --quiet --eval 'rs.status()' | grep ok | cut -d ':' -f 2 | tr -dc '0-9' | awk '{ if($0=="0"){ exit 127 }else{ exit 0 } }'
As you can see, my readinessProbe checks whether the mongo replica set is working correctly.
However, I get a circular dependency, with an existing cluster reporting:
"lastHeartbeatMessage" : "Error connecting to mongo-2.mongo:27017 :: caused by :: Could not find address for mongo-2.mongo:27017: SocketException: Host not found (authoritative)",
(where mongo-2 was undergoing a rolling update).
Looking further:
$ kubectl run --generator=run-pod/v1 tmp-shell --rm -i --tty --image nicolaka/netshoot -- /bin/bash
bash-5.0# nslookup mongo-2.mongo
Server: 10.96.0.10
Address: 10.96.0.10#53
** server can't find mongo-2.mongo: NXDOMAIN
bash-5.0# nslookup mongo-0.mongo
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: mongo-0.mongo.cryoem-logbook-dev.svc.cluster.local
Address: 10.27.137.6
So the question is whether there is a way to get Kubernetes to always keep the DNS entries for the mongo pods present. It appears that I have a chicken-and-egg situation: if the pod hasn't passed its readiness and liveness checks, no DNS entry is created, and hence the other mongod instances will not be able to access it.
I ended up just putting in a ClusterIP Service for each of the StatefulSet instances, with a selector for the specific instance, i.e.:
apiVersion: v1
kind: Service
metadata:
  name: mongo-0
spec:
  clusterIP: 10.101.41.87
  ports:
  - port: 27017
    protocol: TCP
    targetPort: 27017
  selector:
    role: mongo
    statefulset.kubernetes.io/pod-name: mongo-0
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
and repeat for the other StatefulSet pods. The key here is the selector:
statefulset.kubernetes.io/pod-name: mongo-0
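To avoid hand-writing one manifest per pod, the same Services can be stamped out in a loop; a sketch, assuming three replicas named mongo-0 to mongo-2 and the role: mongo label used above:
# create one ClusterIP Service per StatefulSet pod, selecting it by the pod-name label
for i in 0 1 2; do
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: mongo-$i
spec:
  ports:
  - port: 27017
    protocol: TCP
    targetPort: 27017
  selector:
    role: mongo
    statefulset.kubernetes.io/pod-name: mongo-$i
EOF
done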
I believe you are misinterpreting the error.
Could not find address for mongo-2.mongo:27017: SocketException: Host not found (authoritative)"
The Pod is created with an IP attached. Then it's registered into DNS:
Pod-0 has the IP 10.0.0.10 and now its FQDN is Pod-0.servicename.namespace.svc.cluster.local
Pod-1 has the IP 10.0.0.11 and now its FQDN is Pod-1.servicename.namespace.svc.cluster.local
Pod-2 has the IP 10.0.0.12 and now its FQDN is Pod-2.servicename.namespace.svc.cluster.local
But DNS is a live service, IPs are dynamically assigned and can't be duplicated.
So whenever it receives a request:
"Connect me with Pod-A.servicename.namespace.svc.cluster.local"
It tries to reach the registered IP, and if the Pod is offline due to a rolling update, it will consider the Pod unavailable and return "Could not find the address (IP) for Pod-0.servicename" until the Pod is online again or until the IP reservation expires, and only then will the DNS record be recycled.
The DNS is not discarding the registered name, it's only answering that it's currently offline.
You can either ignore the errors during the rolling update, or rethink your script and try using the internal js environment, as mentioned in the comments, for continuous monitoring of the mongo status.
EDIT:
When Pods from a StatefulSet with N replicas are being deployed, they are created sequentially, in order from {0..N-1}.
When Pods are being deleted, they are terminated in reverse order, from {N-1..0}.
This is the expected/desired default behavior.
So the error is expected, since the rollingUpdate makes the pod temporarily unavailable.

Istio (1.0) intra ReplicaSet routing - support traffic between pods in a Kubernetes Deployment

How does Istio support IP-based routing between pods in the same Service (or, to be more specific, the same ReplicaSet)?
We would like to deploy a Tomcat application with replicas > 1 within an Istio mesh. The app runs Infinispan, which uses JGroups to sort out communication and clustering. JGroups needs to identify its cluster members, and for that purpose there is KUBE_PING (the Kubernetes discovery protocol for JGroups). It consults the Kubernetes API with a lookup comparable to kubectl get pods. The cluster members can be both pods in other services and pods within the same Service/Deployment.
Despite our issue being driven by rather specific needs, the topic is generic: how do we enable pods to communicate with each other within a replica set?
Example: as a showcase we deploy the demo application https://github.com/jgroups-extras/jgroups-kubernetes. The relevant stuff is:
apiVersion: v1
items:
- apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    name: ispn-perf-test
    namespace: my-non-istio-namespace
  spec:
    replicas: 3
< -- edited for brevity -- >
Running without Istio, the three pods will find each other and form the cluster. Deploying the same with Istio in my-istio-namespace and adding a basic Service definition:
kind: Service
apiVersion: v1
metadata:
  name: ispn-perf-test-service
  namespace: my-istio-namespace
spec:
  selector:
    run: ispn-perf-test
  ports:
  - protocol: TCP
    port: 7800
    targetPort: 7800
    name: "one"
  - protocol: TCP
    port: 7900
    targetPort: 7900
    name: "two"
  - protocol: TCP
    port: 9000
    targetPort: 9000
    name: "three"
Note that the output below is wide - you might need to scroll right to see the IPs
kubectl get pods -n my-istio-namespace -o wide
NAME READY STATUS RESTARTS AGE IP NODE
ispn-perf-test-558666c5c6-g9jb5 2/2 Running 0 1d 10.44.4.63 gke-main-pool-4cpu-15gb-98b104f4-v9bl
ispn-perf-test-558666c5c6-lbvqf 2/2 Running 0 1d 10.44.4.64 gke-main-pool-4cpu-15gb-98b104f4-v9bl
ispn-perf-test-558666c5c6-lhrpb 2/2 Running 0 1d 10.44.3.22 gke-main-pool-4cpu-15gb-98b104f4-x8ln
kubectl get service ispn-perf-test-service -n my-istio-namespace
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ispn-perf-test-service ClusterIP 10.41.13.74 <none> 7800/TCP,7900/TCP,9000/TCP 1d
Guided by https://istio.io/help/ops/traffic-management/proxy-cmd/#deep-dive-into-envoy-configuration, let's peek into the resulting Envoy conf of one of the pods:
istioctl proxy-config listeners ispn-perf-test-558666c5c6-g9jb5 -n my-istio-namespace
ADDRESS PORT TYPE
10.44.4.63 7900 TCP
10.44.4.63 7800 TCP
10.44.4.63 9000 TCP
10.41.13.74 7900 TCP
10.41.13.74 9000 TCP
10.41.13.74 7800 TCP
< -- edited for brevity -- >
The Istio doc describes the listeners above as
Receives outbound non-HTTP traffic for relevant IP:PORT pair from
listener 0.0.0.0_15001
and this all makes sense. The pod ispn-perf-test-558666c5c6-g9jb5 can reach itself on 10.44.4.63 and the service via 10.41.13.74. But... what if the pod sends packets to 10.44.4.64 or 10.44.3.22? Those IPs are not present among the listeners, so as far as I can tell the two "sibling" pods are unreachable for ispn-perf-test-558666c5c6-g9jb5.
Can Istio support this today - then how?
You are right that HTTP routing only supports local access or remote access by service name or service VIP.
That said, for your particular example, above, where the service ports are named "one", "two", "three", the routing will be plain TCP as described here. Therefore, your example should work. The pod ispn-perf-test-558666c5c6-g9jb5 can reach itself on 10.44.4.63 and the other pods at 10.44.4.64 and 10.44.3.22.
If you rename the ports to "http-one", "http-two", and "http-three" then HTTP routing will kick in and the RDS config will restrict the remote calls to ones using recognized service domains.
To see the difference in the RDS config, look at the output from the following command when the port is named "one", and when it is changed to "http-one".
istioctl proxy-config routes ispn-perf-test-558666c5c6-g9jb5 -n my-istio-namespace --name 7800 -o json
With the port named "one" it will return no routes, so TCP routing will apply, but in the "http-one" case, the routes will be restricted.
I don't know if there is a way to add additional remote pod IP addresses to the RDS domains in the HTTP case. I would suggest opening an Istio issue, to see if it's possible.
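For reference, the port naming that flips Istio (1.0) from plain TCP to HTTP routing lives in the Service spec; the HTTP-routed variant of the example above would look like this sketch (only the ports section shown):
ports:
- protocol: TCP
  port: 7800
  targetPort: 7800
  name: "http-one"   # the "http-" prefix makes Istio treat the port as HTTP and apply RDS routes
- protocol: TCP
  port: 7900
  targetPort: 7900
  name: "http-two"
- protocol: TCP
  port: 9000
  targetPort: 9000
  name: "http-three"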