cert-manager-webhook: FailedDiscoveryCheck, namespace hangs in termination - kubernetes

I deleted a namespace containing a service exposed via nginx-ingress with a Let's Encrypt certificate managed by cert-manager. The namespace deletion hangs with status Terminating.
It is likely a problem with the internal API, as explained here. When I run:
kubectl api-resources
it returns that the certmanager webhook API isn't reachable:
error: unable to retrieve the complete list of server APIs: webhook.certmanager.k8s.io/v1beta1: the server is currently unable to handle the request
When I run kubectl get apiservices v1beta1.webhook.certmanager.k8s.io -o yaml, for checking its status conditions:
...
  service:
    name: cert-manager-webhook
    namespace: nginx-ingress
    port: 443
  version: v1beta1
  versionPriority: 15
status:
  conditions:
  - lastTransitionTime: "2020-01-21T15:02:23Z"
    message: 'failing or missing response from https://10.24.32.6:10250/apis/webhook.certmanager.k8s.io/v1beta1:
      bad status from https://10.24.32.6:10250/apis/webhook.certmanager.k8s.io/v1beta1:
      404'
    reason: FailedDiscoveryCheck
    status: "False"
    type: Available
All nginx-ingress and cert-manager pods are healthy. I updated cert-manager between deploying and deleting this namespace, which might explain the issue. How can this problem be solved?
versions:
Kubernetes: v1.15.4-gke.22
cert-manager v0.12.0
nginx-ingress: 1.29.3

A simple solution to the issue is presented here, but it does not describe how such a problem arises or how it can be prevented.
Create a temporary JSON file that describes the terminating namespace:
kubectl get namespace <terminating-namespace> -o json >tmp.json
Edit the file tmp.json by removing the kubernetes value from the finalizers field and save the file.
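For orientation, the part of tmp.json to edit looks roughly like this (a sketch; the rest of the namespace object is omitted). Remove the "kubernetes" entry so the finalizers array is empty:
"spec": {
    "finalizers": [
        "kubernetes"
    ]
}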
Set a temporary proxy IP and port:
kubectl proxy
From a new terminal window, make an API call with your temporary proxy IP and port:
curl -k -H "Content-Type: application/json" -X PUT --data-binary @tmp.json http://127.0.0.1:8001/api/v1/namespaces/<terminating-namespace>/finalize
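As for how the problem arises: the namespace controller cannot finish deletion while an aggregated API is unreachable, and the upgrade left a stale APIService registration behind. Deleting that registration (the name below is taken from the question's output) should also unblock the namespace without touching finalizers:
kubectl delete apiservice v1beta1.webhook.certmanager.k8s.io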

Related

ArgoCD CLI login with server running with --insecure

So I have installed ArgoCD onto my cluster. I then patched it with:
kubectl -n argocd patch deployment argocd-server --type json -p='[ { "op": "replace", "path":"/spec/template/spec/containers/0/command","value": ["argocd-server","--insecure"] }]'
so that I can host it with Contour handling the TLS / SSL cert. Here's the config for the ingress / Contour:
apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: argocd
  namespace: argocd
spec:
  virtualhost:
    fqdn: argo.xxx.com
    tls:
      secretName: default/cert
  routes:
  - requestHeadersPolicy:
      set:
      - name: l5d-dst-override
        value: argocd-server.argocd.svc.cluster.local:443
    services:
    - name: argocd-server
      port: 443
    conditions:
    - prefix: /
    loadBalancerPolicy:
      strategy: Cookie
But now I can't log in to the Argo server with the CLI, even using port-forward (which worked before I patched the server with the --insecure flag).
When trying to use the port-forward access, I get this:
error creating error stream for port 8080 -> 8080: EOF
Using:
kubectl port-forward svc/argocd-server -n argocd 8080:443
So I have tried as many options / flags as I can think of to log in via the ingress / Contour URL:
argocd login argo.xxx.com --plaintext --insecure --grpc-web
argocd login argo.xxx.com --plaintext --insecure
argocd login argo.xxx.com --plaintext
argocd login argo.xxx.com --insecure --grpc-web
I either get back a 404 or a 502, and sometimes an empty error code:
FATA[0007] rpc error: code = Unavailable desc =
FATA[0003] rpc error: code = Unknown desc = POST http://argo.xxx.com:443/session.SessionService/Create failed with status code 502
FATA[0002] rpc error: code = Unknown desc = POST https://argo.xxx.com:443/argocd/session.SessionService/Create failed with status code 404
Without any flags added to login, this is the error I get back:
FATA[0007] rpc error: code = Internal desc = transport: received the unexpected content-type "text/plain; charset=utf-8"
It has been a while and this may already be solved, but I had a similar issue.
With the ArgoCD version 2.4.14 I was able to resolve the issue using the command:
argocd login --insecure --port-forward --port-forward-namespace=argocd --plaintext
Username: admin
Password:
'admin:login' logged in successfully
Context 'port-forward' updated
I allowed the CLI to use its selector app.kubernetes.io/name=argocd-server to find the service in the argocd namespace.
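If you want the CLI login to work through Contour itself rather than a port-forward, one approach worth trying (a sketch, not verified against your setup: it assumes argocd-server serves plain HTTP/2 on its port 80 service port once --insecure is set) is to have Contour speak h2c to the backend, so the CLI's gRPC survives the hop:
# Hypothetical adjustment to the HTTPProxy route from the question:
# target the plain-HTTP service port and mark the protocol as h2c.
routes:
- services:
  - name: argocd-server
    port: 80
    protocol: h2c
  conditions:
  - prefix: /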

Using a custom certificate for the Kubernetes api server with minikube

I have been trying to find out how to do this but so far have found nothing; I am quite new to Kubernetes, so I might just have overlooked it. I want to use my own certificate for the Kubernetes API server. Is this possible? And if so, can someone perhaps give me a link?
OK, so here is my idea. We know we cannot change the cluster certs, but there is another way to do it: we should be able to proxy through an ingress.
First we enable the ingress addon:
➜ ~ minikube addons enable ingress
Given tls.crt and tls.key we create a secret (you don't need to do this if you are using cert-manager, but that requires some additional steps I am not going to describe here):
➜ ~ kubectl create secret tls my-tls --cert=tls.crt --key tls.key
and an ingress object:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-k8s
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  tls:
  - hosts:
    - foo.bar.com
    secretName: my-tls
  rules:
  - host: foo.bar.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: kubernetes
            port:
              number: 443
Notice what the k8s docs say about the CN and FQDN:
Referencing this secret in an Ingress tells the Ingress controller to secure the channel from the client to the load balancer using TLS. You need to make sure the TLS secret you created came from a certificate that contains a Common Name (CN), also known as a Fully Qualified Domain Name (FQDN) for https-example.foo.com.
The only issue with this approach is that we cannot use certificates for authentication when accessing from the outside.
But we can use tokens. Here is a page in k8s docs: https://kubernetes.io/docs/reference/access-authn-authz/authentication/ that lists all possible methods of authentication.
For testing I chose a service account token, but feel free to experiment with others.
Let's create a service account, bind a role to it, and try to access the cluster:
➜ ~ kubectl create sa cadmin
serviceaccount/cadmin created
➜ ~ kubectl create clusterrolebinding --clusterrole cluster-admin --serviceaccount default:cadmin cadminbinding
clusterrolebinding.rbac.authorization.k8s.io/cadminbinding created
Now we follow these instructions: access-cluster-api from docs to try to access the cluster with sa token.
➜ ~ APISERVER=https://$(minikube ip)
➜ ~ TOKEN=$(kubectl get secret $(kubectl get serviceaccount cadmin -o jsonpath='{.secrets[0].name}') -o jsonpath='{.data.token}' | base64 --decode )
➜ ~ curl $APISERVER/api --header "Authorization: Bearer $TOKEN" --insecure -H "Host: foo.bar.com"
{
  "kind": "APIVersions",
  "versions": [
    "v1"
  ],
  "serverAddressByClientCIDRs": [
    {
      "clientCIDR": "0.0.0.0/0",
      "serverAddress": "192.168.39.210:8443"
    }
  ]
}
note: I am testing with invalid/self-signed certificates and I don't own the foo.bar.com domain, so I need to pass the Host header by hand. For you it may look a bit different, so don't just copy-paste; try to understand what's happening and adjust it. If you have a domain, you should be able to access it directly (no $(minikube ip) necessary).
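One more adjustment you may need: since Kubernetes v1.24, token Secrets are no longer created automatically for service accounts, so the jsonpath lookup above will come back empty. On such clusters you can request a token directly instead:
➜ ~ TOKEN=$(kubectl create token cadmin)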
As you can see, it worked! We got a valid response from the API server.
But we probably don't want to use curl to access k8s.
Let's create a kubeconfig with the token.
kubectl config set-credentials cadmin --token $TOKEN --kubeconfig my-config
kubectl config set-cluster mini --kubeconfig my-config --server https://foo.bar.com
kubectl config set-context mini --kubeconfig my-config --cluster mini --user cadmin
kubectl config use-context --kubeconfig my-config mini
And now we can access k8s with this config:
➜ ~ kubectl get po --kubeconfig my-config
No resources found in default namespace.
Yes, you can use your own certificates and set them in the Kubernetes API server.
Suppose you have created the certificates; move and save them to a specific directory on the node:
{
sudo mkdir -p /var/lib/kubernetes/
sudo mv ca.pem ca-key.pem kubernetes-key.pem kubernetes.pem \
service-account-key.pem service-account.pem \
encryption-config.yaml /var/lib/kubernetes/
}
The instance internal IP address will be used to advertise the API Server to members of the cluster. Get the internal IP:
INTERNAL_IP=$(curl -s -H "Metadata-Flavor: Google" \
http://metadata.google.internal/computeMetadata/v1/instance/network-interfaces/0/ip)
You can then create the API server service and set it there.
Note: the above example specifically targets GCP instances, so you might have to change some commands, such as:
INTERNAL_IP=$(curl -s -H "Metadata-Flavor: Google" \
http://metadata.google.internal/computeMetadata/v1/instance/network-interfaces/0/ip)
For the above command, you can provide the bare-metal IP list manually instead of fetching it from the GCP instance metadata API if you are not on GCP.
Please refer to this link: https://github.com/kelseyhightower/kubernetes-the-hard-way/blob/master/docs/08-bootstrapping-kubernetes-controllers.md#configure-the-kubernetes-api-server
There you can find all the details for creating and setting up a whole Kubernetes cluster from scratch, along with detailed documentation and commands: https://github.com/kelseyhightower/kubernetes-the-hard-way
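For reference, the certificate-related flags in that kube-apiserver service unit look roughly like this (abridged from kubernetes-the-hard-way; adjust the paths to the directory used above):
# Abridged: only the certificate-related flags are shown.
/usr/local/bin/kube-apiserver \
  --advertise-address=${INTERNAL_IP} \
  --client-ca-file=/var/lib/kubernetes/ca.pem \
  --tls-cert-file=/var/lib/kubernetes/kubernetes.pem \
  --tls-private-key-file=/var/lib/kubernetes/kubernetes-key.pem \
  --service-account-key-file=/var/lib/kubernetes/service-account.pem \
  --encryption-provider-config=/var/lib/kubernetes/encryption-config.yaml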

kubernetes dashboard will not load

I am completely new to Kubernetes, so go easy on me.
I am running kubectl proxy but am only seeing the JSON output. Based on this discussion I attempted to set the memory limits by running:
kubectl edit deployment kubernetes-dashboard --namespace kube-system
I then changed the container memory limit:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  ...
spec:
  ...
  template:
    metadata:
      ...
    spec:
      containers:
      - image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.8.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          ...
        name: kubernetes-dashboard
        ports:
        - containerPort: 9090
          protocol: TCP
        resources:
          limits:
            memory: 1Gi
I still only get the JSON served when I save that and visit http://127.0.0.1:8001/ui
Running kubectl logs --namespace kube-system kubernetes-dashboard-665756d87d-jssd8 I see the following:
Starting overwatch
Using in-cluster config to connect to apiserver
Using service account token for csrf signing
No request provided. Skipping authorization
Successful initial request to the apiserver, version: v1.10.0
Generating JWE encryption key
New synchronizer has been registered: kubernetes-dashboard-key-holder-kube-system. Starting
Starting secret synchronizer for kubernetes-dashboard-key-holder in namespace kube-system
Initializing JWE encryption key from synchronized object
Creating in-cluster Heapster client
Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
Serving insecurely on HTTP port: 9090
I read through a bunch of links from a Google search on the error but nothing really worked.
Key components are:
Local: Ubuntu 18.04 LTS
minikube: v0.28.0
Kubernetes Dashboard: 1.8.3
Installed via:
kubectl create -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
Halp!
Have you considered using the minikube dashboard? You can reach it by:
minikube dashboard
Also, you get JSON at http://127.0.0.1:8001/ui because that endpoint is deprecated, so you have to use the full proxy URL, as stated on the dashboard GitHub page.
If you still want to use this 'external' dashboard for future non-minikube-related projects, or for some other reason I don't know about, you can reach it by:
kubectl proxy
and then:
http://localhost:8001/api/v1/namespaces/kube-system/services/http:kubernetes-dashboard:/proxy/
Note that the documentation uses https, which is not correct in this case (it might be a documentation error, or it might be clarified in a part of the documentation I suggest you read if you need further information on the web UI).
Hope this helps.

Grafana HTTP Error Bad Gateway and Templating init failed errors

I used Helm to install Prometheus and Grafana on a local minikube:
$ helm install stable/prometheus
$ helm install stable/grafana
The Prometheus server, Alertmanager, and Grafana all run after setting up port-forwards:
$ export POD_NAME=$(kubectl get pods --namespace default -l "app=prometheus,component=server" -o jsonpath="{.items[0].metadata.name}")
$ kubectl --namespace default port-forward $POD_NAME 9090
$ export POD_NAME=$(kubectl get pods --namespace default -l "app=prometheus,component=alertmanager" -o jsonpath="{.items[0].metadata.name}")
$ kubectl --namespace default port-forward $POD_NAME 9093
$ export POD_NAME=$(kubectl get pods --namespace default -l "app=excited-crocodile-grafana,component=grafana" -o jsonpath="{.items[0].metadata.name}")
$ kubectl --namespace default port-forward $POD_NAME 3000
Adding the data source in Grafana, I got an HTTP Error Bad Gateway error:
Import dashboard 315 from:
https://grafana.com/dashboards/315
Then, checking the Kubernetes cluster monitoring (via Prometheus) dashboard, I got a Templating init failed error:
Why?
In the HTTP settings of Grafana you set Access to Proxy, which means that Grafana wants to access Prometheus. Since Kubernetes uses an overlay network, it is a different IP.
There are two ways of solving this:
Set Access to Direct, so the browser directly connects to Prometheus.
Use the Kubernetes-internal IP or domain name. I don't know about the Prometheus Helm-chart, but assuming there is a Service named prometheus, something like http://prometheus:9090 should work.
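To check what the chart actually created, list the Services and use the printed name and port in the URL (the label below is a guess; release names and labels vary between chart versions):
kubectl get svc --namespace default -l app=prometheus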
I turned off the firewall on the appliance; after that, setting the URL to http://prometheus:9090 no longer threw the Bad Gateway error.
I was never able to find a "proper" fix, but I found a workaround:
apiVersion: v1
kind: Service
metadata:
  labels:
    prometheus: k8s
  name: prometheus-k8s
  namespace: monitoring
spec:
  selector:
    app: prometheus
    prometheus: k8s
  sessionAffinity: ClientIP
  clusterIP: None
By setting clusterIP to None, the Service switches to "headless" mode, which means requests are sent directly to one of the pods backing the service. More info here: https://kubernetes.io/docs/concepts/services-networking/service/#headless-services
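You can observe the headless behaviour from inside the cluster: the DNS name now resolves straight to pod IPs instead of a single cluster IP (a quick sketch; assumes the dnsutils test image is pullable in your cluster):
kubectl run -it --rm dnsutils --restart=Never \
  --image=registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3 \
  -- nslookup prometheus-k8s.monitoring.svc.cluster.local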
There's probably a better solution, but this is the only one I've found that actually works for me, with kube-prometheus. (I've tried docker-desktop, k3d, and kind, and all of them have the same issue, so I doubt it's the emulator's fault; and I stripped my config down to basically just kube-prometheus, so it's hard to understand where the problem lies, but oh well.)

Istio Ingress resulting in "no healthy upstream"

I am deploying an outward-facing service, exposed behind a NodePort and then an Istio ingress. The deployment uses manual sidecar injection. Once the deployment, NodePort, and ingress are running, I can make a request to the Istio ingress.
For some unknown reason, the request does not route through to my deployment and instead displays the text "no healthy upstream". Why is this, and what is causing it?
I can see in the HTTP response that the status code is 503 (Service Unavailable) and the server is "envoy". The deployment is functioning, as I can map a port-forward to it and everything works as expected.
Just in case you get curious like me... even though in my scenario the cause of the error was clear...
Error cause: I had two versions of the same service (v1 and v2), and an Istio VirtualService with a weighted traffic route destination: 95% goes to v1 and 5% goes to v2. As I hadn't deployed v1 (yet), the error "503 - no healthy upstream" of course showed up for 95% of the requests.
Even so, I knew the problem and how to fix it (just deploy v1), but I was wondering: how can I get more information about this error? How could I analyse it more deeply to find out what was happening?
Here is a way of investigating using istioctl, Istio's configuration command-line utility:
# 1) Check the proxies status -->
$ istioctl proxy-status
# Result -->
NAME CDS LDS EDS RDS PILOT VERSION
...
teachstore-course-v1-74f965bd84-8lmnf.development SYNCED SYNCED SYNCED SYNCED istiod-86798869b8-bqw7c 1.5.0
...
...
# 2) Get the outbound cluster names from the JSON result, using the proxy of the service with the problem -->
$ istioctl proxy-config cluster teachstore-course-v1-74f965bd84-8lmnf.development --fqdn teachstore-student.development.svc.cluster.local -o json
# 2b) If you have jq installed locally (extracts only the names we need) -->
$ istioctl proxy-config cluster teachstore-course-v1-74f965bd84-8lmnf.development --fqdn teachstore-course.development.svc.cluster.local -o json | jq -r .[].name
# Result -->
outbound|80||teachstore-course.development.svc.cluster.local
inbound|80|9180-tcp|teachstore-course.development.svc.cluster.local
outbound|80|v1|teachstore-course.development.svc.cluster.local
outbound|80|v2|teachstore-course.development.svc.cluster.local
# 3) Check the endpoints of "outbound|80|v2|teachstore-course..." using v1 proxy -->
$ istioctl proxy-config endpoints teachstore-course-v1-74f965bd84-8lmnf.development --cluster "outbound|80|v2|teachstore-course.development.svc.cluster.local"
# Result (the v2, 5% of the traffic route is ok, there are healthy targets) -->
ENDPOINT STATUS OUTLIER CHECK CLUSTER
172.17.0.28:9180 HEALTHY OK outbound|80|v2|teachstore-course.development.svc.cluster.local
172.17.0.29:9180 HEALTHY OK outbound|80|v2|teachstore-course.development.svc.cluster.local
# 4) However, for the v1 version "outbound|80|v1|teachstore-course..." -->
$ istioctl proxy-config endpoints teachstore-course-v1-74f965bd84-8lmnf.development --cluster "outbound|80|v1|teachstore-course.development.svc.cluster.local"
ENDPOINT STATUS OUTLIER CHECK CLUSTER
# Nothing! Empty, no Pods; that explains the "no healthy upstream" 95% of the time.
Although this is a somewhat general error resulting from a routing issue within an improper Istio setup, I will provide a general solution/piece of advice to anyone coming across the same issue.
In my case the issue was due to incorrect route rule configuration, the Kubernetes native services were functioning however the Istio routing rules were incorrectly configured so Istio could not route from the ingress into the service.
I faced the issue when my pod was in the ContainerCreating state, which resulted in the 503 error. Also, as @pegaldon explained, it can occur due to incorrect route configuration or because no gateways were created by the user.
Delete the destinationrules.networking.istio.io and recreate the virtualservice.networking.istio.io:
[root@10-20-10-110 ~]# curl http://dprovider.example.com:31400/dw/provider/beat
no healthy upstream[root@10-20-10-110 ~]#
[root@10-20-10-110 ~]# curl http://10.210.11.221:10100/dw/provider/beat
"该服务节点 10.210.11.221 心跳正常!"[root@10-20-10-110 ~]#
[root@10-20-10-110 ~]#
[root@10-20-10-110 ~]# cat /home/example_service_yaml/vs/dw-provider-service.yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: dw-provider-service
  namespace: example
spec:
  hosts:
  - "dprovider.example.com"
  gateways:
  - example-gateway
  http:
  - route:
    - destination:
        host: dw-provider-service
        port:
          number: 10100
        subset: "v1-0-0"
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: dw-provider-service
  namespace: example
spec:
  host: dw-provider-service
  subsets:
  - name: "v1-0-0"
    labels:
      version: 1.0.0
[root@10-20-10-110 ~]# vi /home/example_service_yaml/vs/dw-provider-service.yaml
[root@10-20-10-110 ~]# kubectl -n example get vs -o wide | grep dw
dw-collection-service   [example-gateway]   [dw.collection.example.com]           72d
dw-platform-service     [example-gateway]   [dplatform.example.com]               81d
dw-provider-service     [example-gateway]   [dprovider.example.com]               21m
dw-sync-service         [example-gateway]   [dw-sync-service dsync.example.com]   34d
[root@10-20-10-110 ~]# kubectl -n example delete vs dw-provider-service
virtualservice.networking.istio.io "dw-provider-service" deleted
[root@10-20-10-110 ~]# kubectl -n example delete d dw-provider-service
daemonsets.apps          deniers.config.istio.io   deployments.extensions                  dogstatsds.config.istio.io
daemonsets.extensions    deployments.apps          destinationrules.networking.istio.io
[root@10-20-10-110 ~]# kubectl -n example delete destinationrules.networking.istio.io dw-provider-service
destinationrule.networking.istio.io "dw-provider-service" deleted
[root@10-20-10-110 ~]# kubectl apply -f /home/example_service_yaml/vs/dw-provider-service.yaml
virtualservice.networking.istio.io/dw-provider-service created
[root@10-20-10-110 ~]# curl http://dprovider.example.com:31400/dw/provider/beat
"该服务节点 10.210.11.221 心跳正常!"[root@10-20-10-110 ~]#
From my experience, the "no healthy upstream" error can have different causes. Usually, Istio has received ingress traffic that should be forwarded (the client request, or Istio downstream), but the destination is unavailable (Istio upstream / Kubernetes service). This results in an HTTP 503 "no healthy upstream" error.
1.) Broken Virtualservice definitions
If you have a destination in your VirtualService context where the traffic should be routed, ensure this destination exists (i.e. the hostname is correct and the service is reachable from this namespace).
2.) ImagePullBackOff / Terminating / Service is not available
Ensure your destination is available in general. Sometimes no pod is available, so no upstream will be available too.
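A quick check for this case (generic commands; substitute your own namespace and service names) is whether any pods actually back the destination Service:
kubectl -n <namespace> get pods
kubectl -n <namespace> get endpoints <service-name>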
3.) ServiceEntry - same destination in 2 lists, but lists with different DNS Rules
Check your namespace for ServiceEntry objects with:
kubectl -n <namespace> get serviceentry
If the result has more than one entry (multiple lines in one ServiceEntry object), check if a destination address (e.g. foo.com) is available in various lines.
If the same destination address (e.g. foo.com) appears in multiple lines, ensure that the "DNS" column does not show different resolution settings (e.g. one line uses DNS, the other has NONE). If it does, this is an indicator that you are applying different DNS settings to the same destination address.
A solution is:
a) Unify the DNS setting, setting all lines to NONE or DNS, but do not mix them.
b) Ensure the destination address (foo.com) appears in only one line, so that no collision of different DNS rules occurs.
Option a) involves restarting the istio-ingressgateway pods (data plane) to make it work.
Option b) involves no restart of the Istio data plane or control plane.
Basically: it helps to check the status between the control plane (istiod) and the data plane (istio-ingressgateway) with
istioctl proxy-status
The output of istioctl proxy-status should show "SYNCED" in all columns; this confirms that the control plane and the data plane are in sync. If not, you can restart the istio-ingressgateway deployment or the istiod deployment to force "fresh" processes.
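A non-destructive way to do that restart (standard kubectl; adjust the namespace if Istio is installed elsewhere):
kubectl -n istio-system rollout restart deployment istio-ingressgateway
kubectl -n istio-system rollout restart deployment istiod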
Further, it helped to run
istioctl analyze -A
to ensure that targets are checked in the VirtualService context and do exist. If a virtual service definition exists with routing definitions whose destination is unavailable, istioctl analyze -A can detect these unavailable destinations.
Furthermore, reading the logfiles of the istiod container helps. The istiod error messages often indicate the context of the error in the routing (which namespace and service or istio setting). You can use the default way with
kubectl -n istio-system logs <nameOfIstioDPod>
References:
https://istio.io/latest/docs/reference/config/networking/service-entry/
https://istio.io/latest/docs/reference/config/networking/virtual-service/
https://istio.io/latest/docs/ops/diagnostic-tools/proxy-cmd/