Getting "x509: certificate signed by unknown authority" even with "--insecure-skip-tls-verify" option in Kubernetes - kubernetes

I have a private Docker image registry running on a Linux VM (10.78.0.228:5000) and a Kubernetes master running on a different VM running Centos Linux 7.
I used the below command to create a POD:
kubectl create --insecure-skip-tls-verify -f monitorms-rc.yml
I get this:
sample monitorms-mmqhm 0/1 ImagePullBackOff 0 8m
and upon running:
kubectl describe pod monitorms-mmqhm --namespace=sample
Warning Failed Failed to pull image "10.78.0.228:5000/monitorms":
Error response from daemon: {"message":"Get
https://10.78.0.228:5000/v1/_ping: x509: certificate signed by unknown
authority"}
Isn't Kubernetes supposed to ignore the server certificate for all operations during pod creation when --insecure-skip-tls-verify is passed?
If not, how do I make it skip TLS verification while pulling the Docker image?
PS:
Kubernetes version :
Client Version: v1.5.2
Server Version: v1.5.2
I have raised this issue here: https://github.com/kubernetes/kubernetes/issues/43924

The issue you're seeing is actually a Docker issue. --insecure-skip-tls-verify is a valid argument to kubectl, but it only affects the connection between kubectl and the Kubernetes API server. The error you're seeing occurs because the Docker daemon on the node cannot log in to the private registry: the certificate the registry presents is not signed by a trusted authority.
Have a look at the Docker insecure registry docs and this should solve your problem.
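For reference, a minimal sketch of what that usually looks like on every node that pulls from the registry (the registry address is taken from the question; adjust it to your setup):
# /etc/docker/daemon.json
{
  "insecure-registries": ["10.78.0.228:5000"]
}
# then restart the daemon so the setting takes effect
sudo systemctl restart docker
Alternatively, you can keep TLS verification and instead copy the registry's CA certificate to /etc/docker/certs.d/10.78.0.228:5000/ca.crt on each node.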

Related

Failed to pull image "velero/velero-plugin-for-gcp:v1.1.0" while installing Velero in GKE Cluster

I'm trying to install and configure Velero for kubernetes backup. I have followed the link to configure it in my GKE cluster. The installation went fine, but velero is not working.
I am using google cloud shell for running all my commands (I have installed and configured velero client in my google cloud shell)
On further inspection on velero deployment and velero pods, I found out that it is not able to pull the image from the docker repository.
kubectl get pods -n velero
NAME READY STATUS RESTARTS AGE
velero-5489b955f6-kqb7z 0/1 Init:ErrImagePull 0 20s
Error from velero pod (kubectl describe pod) (output redacted for readability - only relevant info shown below)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 38s default-scheduler Successfully assigned velero/velero-5489b955f6-kqb7z to gke-gke-cluster1-default-pool-a354fba3-8674
Warning Failed 22s kubelet, gke-gke-cluster1-default-pool-a354fba3-8674 Failed to pull image "velero/velero-plugin-for-gcp:v1.1.0": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Failed 22s kubelet, gke-gke-cluster1-default-pool-a354fba3-8674 Error: ErrImagePull
Normal BackOff 21s kubelet, gke-gke-cluster1-default-pool-a354fba3-8674 Back-off pulling image "velero/velero-plugin-for-gcp:v1.1.0"
Warning Failed 21s kubelet, gke-gke-cluster1-default-pool-a354fba3-8674 Error: ImagePullBackOff
Normal Pulling 8s (x2 over 37s) kubelet, gke-gke-cluster1-default-pool-a354fba3-8674 Pulling image "velero/velero-plugin-for-gcp:v1.1.0"
Command used to install velero: (some of the values are given as variables)
velero install \
--provider gcp \
--plugins velero/velero-plugin-for-gcp:v1.1.0 \
--bucket $storagebucket \
--secret-file ~/velero-backup-storage-sa-key.json
Velero Version
velero version
Client:
Version: v1.4.2
Git commit: 56a08a4d695d893f0863f697c2f926e27d70c0c5
<error getting server version: timed out waiting for server status request to be processed>
GKE version
v1.15.12-gke.2
Isn't this a Private Cluster? – mario 31 mins ago
@mario this is a private cluster, but I can deploy other services without any issues (e.g. I have deployed nginx successfully) – Sreesan 15 mins ago
Well, this is a known limitation of GKE Private Clusters. As you can read in the documentation:
Can't pull image from public Docker Hub
Symptoms
A Pod running in your cluster displays a warning in kubectl describe such as Failed to pull image: rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Potential causes
Nodes in a private cluster do not have outbound access to the public
internet. They have limited access to Google APIs and services,
including Container Registry.
Resolution
You cannot fetch images directly from Docker Hub. Instead, use images hosted on Container Registry. Note that while Container Registry's Docker Hub mirror is accessible from a private cluster, it should not be exclusively relied upon. The mirror is only a cache, so images are periodically removed, and a private cluster is not able to fall back to Docker Hub.
You can also compare it with this answer.
It can easily be verified on your own with a simple experiment: try to run two different nginx deployments, the first based on the image nginx (which is equivalent to nginx:latest) and the second based on nginx:1.14.2.
While the first scenario is perfectly feasible, because the nginx:latest image can be pulled from Container Registry's Docker Hub mirror, which is accessible from a private cluster, any attempt to pull nginx:1.14.2 will fail, as you'll see in the Pod events. It happens because the kubelet cannot find this version of the image in GCR and tries to pull it from the public Docker registry (https://registry-1.docker.io/v2/), which is not possible in Private Clusters. "The mirror is only a cache, so images are periodically removed, and a private cluster is not able to fall back to Docker Hub." - as you can read in the docs.
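A quick way to reproduce this (the deployment names below are just illustrative):
kubectl create deployment nginx-latest --image=nginx        # resolved via the Docker Hub mirror in GCR, pulls fine
kubectl create deployment nginx-old --image=nginx:1.14.2    # not cached in the mirror, falls back to registry-1.docker.io and times out
kubectl get pods -w                                         # the second deployment's Pod ends up in ImagePullBackOff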
If you still have doubts, just ssh into your node and try to run the following commands:
curl https://cloud.google.com/container-registry/
curl https://registry-1.docker.io/v2/
While the first one works perfectly, the second one will eventually fail:
curl: (7) Failed to connect to registry-1.docker.io port 443: Connection timed out
Reason? "Nodes in a private cluster do not have outbound access to the public internet."
Solution?
You can search what is currently available in GCR here.
In many cases you should be able to get the required image if you don't specify its exact version (by default the latest tag is used). While this can help with nginx, unfortunately no version of velero/velero-plugin-for-gcp is currently available in Google Container Registry's Docker Hub mirror.
Granting private nodes outbound internet access by using Cloud NAT seems to be the only reasonable solution that can be applied in your case.
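If you go down that road, the setup is roughly the following (a sketch only; nat-router and nat-config are placeholder names, and you need to substitute your own VPC network and region):
gcloud compute routers create nat-router --network=NETWORK --region=REGION
gcloud compute routers nats create nat-config --router=nat-router --region=REGION \
    --auto-allocate-nat-external-ips --nat-all-subnet-ip-ranges
After that, nodes in the private cluster can reach registry-1.docker.io through the NAT gateway.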
I solved this problem by realizing that the version of:
velero/velero-plugin-for-gcp
does not follow the version of:
velero/velero
For example, the latest versions at the moment are:
velero/velero:v1.9.1 and velero/velero-plugin-for-gcp:v1.5.0
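So if you hit this today, the install command from the question would look something like the sketch below (the plugin tag comes from the versions quoted above; check the plugin's compatibility matrix for the Velero release you actually run):
velero install \
  --provider gcp \
  --plugins velero/velero-plugin-for-gcp:v1.5.0 \
  --bucket $storagebucket \
  --secret-file ~/velero-backup-storage-sa-key.json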

Cert-manager fails on kubernetes with webhooks

I'm following the Kubernetes install instructions for Helm: https://docs.cert-manager.io/en/latest/getting-started/install/kubernetes.html
With cert-manager v0.8.1 on k8s v1.15, Ubuntu 18.04, on-premise.
When I get to testing the installation, I get these errors:
error when creating "test-resources.yaml": Internal error occurred: failed calling webhook "issuers.admission.certmanager.k8s.io": the server is currently unable to handle the request
Error from server (InternalError): error when creating "test-resources.yaml": Internal error occurred: failed calling webhook "certificates.admission.certmanager.k8s.io": the server is currently unable to handle the request
If I apply the test-resources.yaml before installing with Helm, I'm not getting the errors but it is still not working.
These errors are new to me, as Cert-manager used to work for me on my previous install about a month ago, following the same installation instructions.
I've tried with cert-manager v0.7.2 (CRD 0.7) as well, as I think that was the last version I managed to get installed, but it's not working either.
What do these errors mean?
Update: It turned out to be an internal CoreDNS issue on my cluster; it was somehow not configured correctly, possibly related to a wrong POD_CIDR configuration.
If you experience this problem, check the logs of CoreDNS (or KubeDNS) and you may see lots of errors related to contacting services. Unfortunately, I no longer have the errors.
But this is how I figured out that my network setup was invalid.
I'm using Calico (this will apply to other network plugins as well) and its network was not set to the same range as the POD_CIDR network that I initialized my Kubernetes cluster with.
Example
1. Set up K8:
kubeadm init --pod-network-cidr=10.244.0.0/16
2. Configure Calico.yaml:
- name: CALICO_IPV4POOL_CIDR
  value: "10.244.0.0/16"
I also tried cert-manager v0.8.0 on a very similar setup (Ubuntu 18.04 and k8s v1.14.1), and I began to get the same error when I tore down cert-manager using kubectl delete and reinstalled it, after experiencing some network issues on the cluster.
I stumbled on a solution that worked. On the master node, simply restart the apiserver container:
$ sudo docker ps -a | grep apiserver
af99f816c7ec gcr.io/google_containers/kube-apiserver@sha256:53b987e5a2932bdaff88497081b488e3b56af5b6a14891895b08703129477d85 "/bin/sh -c '/usr/loc" 15 months ago Up 19 hours k8s_kube-apiserver_kube-apiserver-ip-xxxxxc_0
40f3a18050c3 gcr.io/google_containers/pause-amd64:3.0 "/pause" 15 months ago Up 15 months k8s_POD_kube-apiserver-ip-xxxc_0
$ sudo docker restart af99f816c7ec
af99f816c7ec
$
Then try applying the test-resources.yaml again:
$ kubectl apply -f test-resources.yaml
namespace/cert-manager-test unchanged
issuer.certmanager.k8s.io/test-selfsigned created
certificate.certmanager.k8s.io/selfsigned-cert created
If that does not work, this github issue mentions that the master node might need firewall rules to be able to reach the cert-manager-webhook pod. The exact steps to do so will depend on which cloud platform you are on.
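For completeness, on GKE such a firewall rule typically looks something like the sketch below (the rule name, network, master CIDR and webhook port are all placeholders, not values from this question):
gcloud compute firewall-rules create allow-apiserver-to-cert-manager-webhook \
    --network=NETWORK \
    --source-ranges=MASTER_CIDR \
    --allow=tcp:WEBHOOK_PORT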

Installation error with Istio 1.1.3 and Kubernetes 1.13.5

I'm trying to install Istio 1.1.3 on Kubernetes 1.13.5, deployed on minikube 1.0.0, but I get some errors at the end. Here is the log of the installation:
$ minikube start --memory=4096 --disk-size=30g --kubernetes-version=v1.13.5 --profile=istio
😄 minikube v1.0.0 on darwin (amd64)
🤹 Downloading Kubernetes v1.13.5 images in the background ...
🔥 Creating virtualbox VM (CPUs=2, Memory=4096MB, Disk=30000MB) ...
2019/04/19 19:51:56 No matching credentials were found, falling back on anonymous
2019/04/19 19:51:56 No matching credentials were found, falling back on anonymous
2019/04/19 19:51:56 No matching credentials were found, falling back on anonymous
2019/04/19 19:51:56 No matching credentials were found, falling back on anonymous
📶 "istio" IP address is 192.168.99.104
🐳 Configuring Docker as the container runtime ...
🐳 Version of container runtime is 18.06.2-ce
⌛ Waiting for image downloads to complete ...
✨ Preparing Kubernetes environment ...
💾 Downloading kubeadm v1.13.5
💾 Downloading kubelet v1.13.5
🚜 Pulling images required by Kubernetes v1.13.5 ...
🚀 Launching Kubernetes v1.13.5 using kubeadm ...
⌛ Waiting for pods: apiserver proxy etcd scheduler controller dns
🔑 Configuring cluster permissions ...
🤔 Verifying component health .....
💗 kubectl is now configured to use "istio"
🏄 Done! Thank you for using minikube!
$ ./bin/istioctl version
version.BuildInfo{Version:"1.1.3", GitRevision:"d19179769183541c5db473ae8d062ca899abb3be", User:"root", Host:"fbd493e1-5d72-11e9-b00d-0a580a2c0205", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Clean", GitTag:"1.1.2-56-gd191797"}
$ kubectl create -f install/kubernetes/istio-demo.yaml
namespace/istio-system created
customresourcedefinition.apiextensions.k8s.io/virtualservices.networking.istio.io created
customresourcedefinition.apiextensions.k8s.io/destinationrules.networking.istio.io created
customresourcedefinition.apiextensions.k8s.io/serviceentries.networking.istio.io created
customresourcedefinition.apiextensions.k8s.io/gateways.networking.istio.io created
customresourcedefinition.apiextensions.k8s.io/envoyfilters.networking.istio.io created
customresourcedefinition.apiextensions.k8s.io/clusterrbacconfigs.rbac.istio.io created
customresourcedefinition.apiextensions.k8s.io/policies.authentication.istio.io created
customresourcedefinition.apiextensions.k8s.io/meshpolicies.authentication.istio.io created
customresourcedefinition.apiextensions.k8s.io/httpapispecbindings.config.istio.io created
customresourcedefinition.apiextensions.k8s.io/httpapispecs.config.istio.io created
customresourcedefinition.apiextensions.k8s.io/quotaspecbindings.config.istio.io created
customresourcedefinition.apiextensions.k8s.io/quotaspecs.config.istio.io created
customresourcedefinition.apiextensions.k8s.io/rules.config.istio.io created
customresourcedefinition.apiextensions.k8s.io/attributemanifests.config.istio.io created
...
unable to recognize "install/kubernetes/istio-demo.yaml": no matches for kind "attributemanifest" in version "config.istio.io/v1alpha2"
unable to recognize "install/kubernetes/istio-demo.yaml": no matches for kind "attributemanifest" in version
That seems strange, as the CRDs seem to have been created successfully, but then when they are referenced to create objects whose type is one of these CRDs, it fails.
I omitted other errors, but the same happens for "handler", "logentry", "rule", "metric", "kubernetes", and "DestinationRule".
On the documentation page https://istio.io/docs/setup/kubernetes/, it is stated that
Istio 1.1 has been tested with these Kubernetes releases: 1.11, 1.12, 1.13.
Does anyone have an idea?
In the docs there is a step to install the Istio CRDs first (istio-init). I don't see that in your snippet; it seems that's what you're missing.
So:
$ for i in install/kubernetes/helm/istio-init/files/crd*yaml; do kubectl apply -f $i; done
Your missing CRD seems to be defined in this exact file: https://github.com/istio/istio/blob/master/install/kubernetes/helm/istio-init/files/crd-10.yaml but you should install all of them.
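You can verify that the CRDs registered before re-applying istio-demo.yaml; a quick check (the expected count varies between Istio 1.1.x point releases, so treat the number as indicative rather than exact):
kubectl get crds | grep 'istio.io' | wc -l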
My bad, it seems that I have skipped the first step:
Install all the Istio Custom Resource Definitions (CRDs) using kubectl apply, and wait a few seconds for the CRDs to be committed in the Kubernetes API-server:
$ for i in install/kubernetes/helm/istio-init/files/crd*yaml; do kubectl apply -f $i; done

Kubernetes Master Server is failing to become up and running

Installed kubeadm v1.6.0-alpha, kubectl v1.5.3, kubelet v1.5.3
Executed the command kubeadm init to bring the Kubernetes master up.
Issue observed: it gets stuck with the log message below:
Created API client, waiting for the control plane to become ready
How do I get the Kubernetes master server up and running, or how do I debug the issue?
Could you try using kubelet and kubectl 1.6 to see if it is a version mismatch?
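In general, when kubeadm init hangs at that step, these commands on the master are a reasonable starting point for debugging (a generic sketch, not specific to this report):
journalctl -xeu kubelet                    # kubelet logs usually explain why the control-plane static pods don't start
docker ps -a | grep -E 'apiserver|etcd'    # check whether the API server and etcd containers are running or crash-looping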

Kubernetes ssh into pods fails

I'm trying to ssh into my pod with this command
kubectl --namespace=default exec -ti pod-name /bin/bash
I get this error:
Content-Type specified (plain/text) must be 'application/json'
The process gets stuck and I have to close the terminal.
I was able to ssh into my pods before I reinstalled Kubernetes on my machine. Is this an issue with the latest Kubernetes releases?
You're not trying to "ssh"; you're forwarding your standard input and receiving a standard output over HTTP through the Kubernetes API.
That said, you're using Docker 1.10, which Kubernetes doesn't support yet. Check this out: https://github.com/kubernetes/kubernetes/issues/19720
edit:
Kubernetes supports Docker 1.10+ since the 1.3.0 release.
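If you want to confirm what the node is actually running, something like this on the node itself is usually enough (assuming the Docker CLI is available there):
docker version --format '{{.Server.Version}}'   # prints just the daemon version, e.g. 1.10.3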