All Kubernetes proxy targets down - Prometheus Operator - kubernetes

I have a k8s cluster deployed in openstack, and I have deployed the Prometheus operator on it to monitor the cluster. But I am getting a Kubernetes proxy down alert for all the nodes.
I would like to know the basics of how the Prometheus operator scrapes the Kubernetes proxy, and also what configuration needs to be done to fix it.
I can see that kube-proxy is running on all nodes on port 10249.
Error:
Get http://10.8.10.11:10249/metrics: dial tcp 10.8.10.11:10249: connect: connection refused
Helm values configuration
kubeProxy:
  enabled: true

  ## If your kube proxy is not deployed as a pod, specify IPs it can be found on
  ##
  endpoints: []
  # - 10.141.4.22
  # - 10.141.4.23
  # - 10.141.4.24

  service:
    port: 10249
    targetPort: 10249
    # selector:
    #   k8s-app: kube-proxy

  serviceMonitor:
    ## Scrape interval. If not set, the Prometheus default scrape interval is used.
    ##
    interval: ""

    ## Enable scraping kube-proxy over https.
    ## Requires proper certs (not self-signed) and delegated authentication/authorization checks
    ##
    https: false

Set metricsBindAddress in the kube-proxy ConfigMap so the metrics endpoint listens on all interfaces, then restart the kube-proxy pods:
$ kubectl edit cm/kube-proxy -n kube-system
...
kind: KubeProxyConfiguration
metricsBindAddress: 0.0.0.0:10249
...
$ kubectl delete pod -l k8s-app=kube-proxy -n kube-system
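Once the pods have come back up, you can check that the metrics endpoint now answers. A quick sanity check, assuming the node IP from the alert is reachable from wherever you run it:
$ curl -s http://10.8.10.11:10249/metrics | head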

Related

From kubernetes cluster how to have access to external service with host file?

We need to connect to nginx-test.dup.com on port 80 from our kubernetes cluster.
We are using ExternalName but nginx-test.dup.com is only defined in /etc/hosts.
Is there a way to make that service available from within the kubernetes cluster? We also tried adding hostNetwork: true
as described in How do I access external hosts from within my cluster?
and we got the following error:
error validating data: ValidationError(Service.spec): unknown field "hostNetwork" in io.k8s.api.core.v1.ServiceSpec
kind: Service
apiVersion: v1
metadata:
  name: nginx-test
spec:
  ports:
  - name: test-https
    port: 8080
    targetPort: 80
  type: ExternalName
  externalName: nginx-test.dup.com
CoreDNS doesn't take /etc/hosts into account. You can add a hosts section to the CoreDNS ConfigMap manually.
# kubectl edit cm coredns -n kube-system
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        # Add the hosts section from here
        hosts {
          xxx.xxx.xxx.xxx nginx-test.dup.com
          fallthrough
        }
        # to here
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        ...
    }
Please note that it will take some time for the new setting to be used.
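To check that the new entry is actually served, you can run a throwaway pod and resolve the name from inside the cluster (the dnsutils image below is just one option; any image with nslookup or dig will do):
kubectl run -it --rm --restart=Never dnsutils \
  --image=registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3 \
  -- nslookup nginx-test.dup.com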

cetic-nifi Invalid host header issue

Helm version: v3.5.2
Kubernetes version: v1.20.4
nifi chart version: 1.0.2 (latest)
Issue: [cetic/nifi]-issue
I'm trying to connect to the nifi UI deployed in kubernetes.
I have set the following properties in the values yaml:
properties:
  # use externalSecure for when inbound SSL is provided by nginx-ingress or other external mechanism
  sensitiveKey: changeMechangeMe # must have a key of at least 12 characters
  algorithm: NIFI_PBKDF2_AES_GCM_256
  externalSecure: false
  isNode: false
  httpsPort: 8443
  webProxyHost: 10.0.39.39:30666
  clusterPort: 6007

# ui service
service:
  type: NodePort
  httpsPort: 8443
  nodePort: 30666
  annotations: {}
  # loadBalancerIP:
  ## Load Balancer sources
  ## https://kubernetes.io/docs/tasks/access-application-cluster/configure-cloud-provider-firewall/#restrict-access-for-loadbalancer-service
  ##
  # loadBalancerSourceRanges:
  # - 10.10.10.0/24
  ## OIDC authentication requires "sticky" session on the LoadBalancer for JWT to work properly...but AWS doesn't like it on creation
  # sessionAffinity: ClientIP
  # sessionAffinityConfig:
  #   clientIP:
  #     timeoutSeconds: 10800
10.0.39.39 is the kubernetes master node's internal ip.
When nifi gets started I get the following:
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /home/k8sadmin/.kube/config
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /home/k8sadmin/.kube/config
NAME: nifi
LAST DEPLOYED: Thu Nov 25 12:38:00 2021
NAMESPACE: jeed-cluster
STATUS: deployed
REVISION: 1
NOTES:
Cluster endpoint IP address will be available at:
kubectl get svc nifi -n jeed-cluster -o jsonpath='{.status.loadBalancer.ingress[*].ip}'
Cluster endpoint domain name is: 10.0.39.39:30666 - please update your DNS or /etc/hosts accordingly!
Once you are done, your NiFi instance will be available at:
https://10.0.39.39:30666/nifi
and when I do a curl
curl https://10.0.39.39:30666 put sample.txt -k
<h1>System Error</h1>
<h2>The request contained an invalid host header [<code>10.0.39.39:30666</
the request [<code>/</code>]. Check for request manipulation or third-part
t.</h2>
<h3>Valid host headers are [<code>empty
<ul><li>127.0.0.1</li>
<li>127.0.0.1:8443</li>
<li>localhost</li>
<li>localhost:8443</li>
<li>[::1]</li>
<li>[::1]:8443</li>
<li>nifi-0.nifi-headless.jeed-cluste
<li>nifi-0.nifi-headless.jeed-cluste
<li>10.42.0.8</li>
<li>10.42.0.8:8443</li>
<li>0.0.0.0</li>
<li>0.0.0.0:8443</li>
</ul>
I've tried a lot of things but still cannot whitelist the master node ip in the proxy hosts.
Ingress is not used.
Edit: it looks like the properties set in values.yaml are not applied to nifi.properties inside the pod. Is there any reason for this?
Appreciate the help!
As a NodePort service you can also assign a port number from the 30000-32767 range.
You can apply the values when you install your chart with:
properties:
  webProxyHost: localhost
  httpsPort:
This should let nifi whitelist your https://localhost:
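Equivalently, the same properties can be passed on the command line at install time. A rough sketch, assuming the cetic/nifi chart and the release name and namespace used above (the webProxyHost value is the node ip:nodePort from the question):
helm upgrade --install nifi cetic/nifi \
  -n jeed-cluster \
  --set properties.webProxyHost=10.0.39.39:30666 \
  --set service.type=NodePort \
  --set service.nodePort=30666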

Kubernetes Health Checks Failing with Network Policies Enabled

When enabling only egress network policies, all readiness and liveness checks fail after pods are restarted.
This is what I see when describing the pod:
Warning Unhealthy 115s (x7 over 2m55s) kubelet, Readiness probe failed: Get http://10.202.158.105:80/health/ready: dial tcp 10.202.158.105:80: connect: connection refused
Warning Unhealthy 115s (x7 over 2m55s) kubelet, Liveness probe failed: Get http://10.202.158.105:80/health/live: dial tcp 10.202.158.105:80: connect: connection refused
If I disable the policies, the health checks immediately resume functioning. If a pod is already healthy before the network policies are applied, it continues to work.
I've also tried to whitelist every namespace with this policy:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector: {}
    ports:
    - protocol: TCP
      port: 80
    - protocol: TCP
      port: 8080
I'm having a hard time finding any guidance on how to resolve this. Is there an egress policy that would need to be enabled to allow the kubelet to monitor the pods' health checks?
The pod is running inside of Azure Kubernetes Services and using Calico networking.
It looks like the kube-probe uses the .1 address of each pod cidr in AKS. I believe that will be the address the linux bridge is assigned on the agent pool VM, so the host chooses it as the cheapest route to the pods.
There is no pod with this address so I can't see how it can be matched by a selector, unless AKS has some magic built in to their implementation.
kubectl get pods --all-namespaces -o json \
| jq -r '.items[] | [ .status.podIP, .metadata.name ] | join("\t")'
The policy could be made to work with a specific rule for the source .1 IP of all the pod CIDRs.
kubectl get nodes -o json \
| jq '.items[] | [ .metadata.name, .spec.podCIDR ]'
[
  "aks-agentpool-12345678-vmss000000",
  "10.212.0.0/24"
]
[
  "aks-agentpool-12345678-vmss000001",
  "10.212.1.0/24"
]
So that would be an ipBlock for each node:
ingress:
- from:
  - ipBlock:
      cidr: 10.212.0.1/32
  - ipBlock:
      cidr: 10.212.1.1/32
Which is a bit horrible as it's per-cluster and per-node-pool configuration. I only dabble with AKS so there might be a better solution. If you can't find anything else I'd file a bug on https://github.com/Azure/AKS/
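If you'd rather generate that list than maintain it by hand, a one-liner along these lines prints the .1/32 of each node's pod CIDR (a sketch that assumes every node podCIDR is a /24, as in the output above):
kubectl get nodes -o json \
  | jq -r '.items[].spec.podCIDR | sub("\\.0/24$"; ".1/32")'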

How to configure a prometheus target for kubelet metrics

I would like to plot in Grafana, the metrics for the readiness/liveness probes for some of my pods. Currently, the way I am deploying prometheus in my cluster is using:
helm install prometheus stable/prometheus -n prometheus
I am able to see all the standard metrics by going to the prometheus UI, but I am trying to figure out how to get the probe metrics. Apparently the kubelet exposes these metrics at /metrics/probes, but I don't know how to configure them. Moreover, I noted that apparently the "standard" metrics are grabbed from the kubernetes api-server on the /metrics/ path, but so far I haven't configured any path nor any config file (I just ran the above command to install prometheus). I am assuming that this /metrics/ path is hardcoded somewhere in the helm chart repo, but since I want to get the metrics for the kubelets, this might be trickier, as my understanding is that the api-server lives on the master k8s node and the kubelet only runs on the worker nodes (so I have no idea where to point the /metrics/probes path).
Use the Prometheus Operator and create a ServiceMonitor in which you can specify the endpoints and ports exposed by the kubelet or any other component. Prometheus will then start scraping those endpoints for metrics.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kubelet
  labels:
    k8s-app: kubelet
spec:
  jobLabel: k8s-app
  endpoints:
  - port: https-metrics
    scheme: https
    interval: 30s
    tlsConfig:
      insecureSkipVerify: true
    bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
  - port: https-metrics
    scheme: https
    path: /metrics/cadvisor
    interval: 30s
    honorLabels: true
    tlsConfig:
      insecureSkipVerify: true
    bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
  selector:
    matchLabels:
      k8s-app: kubelet
  namespaceSelector:
    matchNames:
    - kube-system
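Since the question is specifically about probe metrics, which the kubelet serves under /metrics/probes, an extra endpoint entry of the same shape should pick them up. A sketch mirroring the entries above (add it to the endpoints list):
  - port: https-metrics
    scheme: https
    path: /metrics/probes
    interval: 30s
    tlsConfig:
      insecureSkipVerify: true
    bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token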

redis-cluster on kubernetes: connection timed out

I have combined/followed the following manuals to create a redis cluster on kubernetes (GCP):
https://github.com/sanderploegsma/redis-cluster
https://rancher.com/blog/2019/deploying-redis-cluster
I have created 3 nodes, each with 2 pods on it. The problem is: I get a connection timeout when I connect from outside of the kubernetes cluster (through a load balancer external ip) to the redis-cluster.
$ redis-cli -h external_ip_lb -p 6379 -c
external_ip_lb:6379> set foo bar
-> Redirected to slot [12182] located at internal_ip_node:6379
Could not connect to Redis at internal_ip_node:6379: Operation timed out
When I get into the shell of a running container and do the redis-cli commands there, it works.
$ kubectl exec -it redis-cluster-0 -- redis-cli -c
127.0.0.1:6379> set foo bar
-> Redirected to slot [12182] located at internal_ip_node:6379
OK
internal_ip_node:6379> get foo
"bar"
I also tried to set up a cluster IP service and do a port-forward to my local machine on port 7000; this gives me the same error as with the external ip method.
$ kubectl port-forward pods/redis-cluster-0 7000:6379
Does anyone have an idea what could be wrong? Clearly it has something to do with my local machine not being part of the kubernetes cluster, so the connections to the internal IPs of the other nodes fail.
Edit: output of kubectl describe svc redis-cluster-lb
Name: redis-cluster-lb
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"redis-cluster-lb","namespace":"default"},"spec":{"ports":[{"port"...
Selector: app=redis-cluster
Type: LoadBalancer
IP: internal_ip_lb
LoadBalancer Ingress: external_ip_lb
Port: <unset> 6379/TCP
TargetPort: 6379/TCP
NodePort: <unset> 30631/TCP
Endpoints: internal_ip_node_1:6379,internal_ip_node_2:6379,internal_ip_node_3:6379 + 3 more...
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
I'm able to ping the external load balancer's IP.
I am not a Redis expert, but in the Redis documentation you can read:
Since cluster nodes are not able to proxy requests, clients may be redirected to other nodes using redirection errors
This is why you are having these issues with a redis cluster behind a LB, and this is also the reason why it is (most probably) not going to work.
You probably need to use some proxy (e.g. the official redis-cluster-proxy) that runs inside the k8s cluster, can reach all internal IPs of the redis cluster, and handles the redirects for you.
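A rough sketch of what that could look like (the image name, the headless service name redis-cluster, and the ports here are assumptions, since redis-cluster-proxy has no official image): run redis-cluster-proxy as its own Deployment pointed at one of the redis pods, and expose the proxy instead of the redis pods through the LoadBalancer:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-cluster-proxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-cluster-proxy
  template:
    metadata:
      labels:
        app: redis-cluster-proxy
    spec:
      containers:
      - name: proxy
        # hypothetical image; build one from RedisLabs/redis-cluster-proxy
        image: your-registry/redis-cluster-proxy:latest
        # entry point: any reachable cluster node; the proxy discovers the rest
        args: ["--port", "7777", "redis-cluster-0.redis-cluster:6379"]
        ports:
        - containerPort: 7777
---
apiVersion: v1
kind: Service
metadata:
  name: redis-cluster-proxy-lb
spec:
  type: LoadBalancer
  selector:
    app: redis-cluster-proxy
  ports:
  - port: 6379
    targetPort: 7777
Clients outside the cluster then talk to the proxy's external IP with a plain (non-cluster) redis-cli and never see the MOVED redirects to internal pod IPs.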