TLS handshake timeout with kubernetes in GKE - kubernetes

I've created a cluster on Google Kubernetes Engine (previously Google Container Engine) and installed the Google Cloud SDK and the Kubernetes tools with it on my Windows machine.
It worked well for some time, then out of nowhere it stopped working. Every command I issue with kubectl produces the following error:
Unable to connect to the server: net/http: TLS handshake timeout
I've searched Google, the Kubernetes Github Issues, Stack Overflow, Server Fault ... without success.
I've tried the following:
Restart my computer
Change wifi connection
Check that I'm not somehow using a proxy
Delete and re-create my cluster
Uninstall the Google Cloud SDK (and kubectl) from my machine and re-install them
Delete my .kube folder (config and cache)
Check my .kube/config
Change my cluster's version (tried 1.8.3-gke.0 and 1.7.8-gke.0)
Retry several hours later
Tried both on PowerShell and cmd.exe
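The proxy check in the list above can be made concrete; a minimal sketch for a Unix-like shell (on Windows, the equivalents are `$env:HTTP_PROXY` in PowerShell or `echo %HTTP_PROXY%` in cmd.exe):

```shell
# kubectl and gcloud honor HTTP_PROXY / HTTPS_PROXY / NO_PROXY; a stale
# proxy setting is a common cause of TLS handshake timeouts.
# Empty grep output means no proxy variables are set in this shell.
env | grep -i proxy || echo "no proxy variables set"
```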
Note that the cluster itself seems to work perfectly, since my application is running on it and I can interact with it normally through the Google Cloud Shell.
Running:
gcloud container clusters get-credentials cluster-2 --zone europe-west1-b --project ___
kubectl get pods
works on Google Cloud Shell but produces the TLS handshake timeout on my machine.
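One way to narrow down where the handshake stalls is to re-run the failing command with client-side verbose logging (a sketch; `-v=8` is a standard kubectl verbosity flag):

```shell
# -v=8 makes kubectl print the raw HTTP requests, including the exact
# server endpoint it dials -- the one the TLS handshake times out against.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pods -v=8 || echo "request failed; inspect the trace above"
else
  echo "kubectl not found on PATH"
fi
```

Comparing the endpoint in that trace between the local machine and Cloud Shell shows whether both are dialing the same server.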

For others seeing this issue, there is another cause to consider.
After doing:
gcloud config set project $PROJECT_NAME
gcloud config set container/cluster $CLUSTER_NAME
gcloud config set compute/zone europe-west2
gcloud beta container clusters get-credentials $CLUSTER_NAME --region europe-west2 --project $PROJECT_NAME
I was then seeing:
kubectl cluster-info
Unable to connect to the server: net/http: TLS handshake timeout
I tried everything suggested here and elsewhere. When the same commands worked without issue from my home desktop, I discovered that the shared-workspace wifi was interfering with TLS/VPN traffic in order to control internet access!

This is what I did to solve the above problem.
I simply ran the following commands:
> gcloud container clusters get-credentials {cluster_name} --zone {zone_name} --project {project_name}
> gcloud auth application-default login
Replace the placeholders appropriately.

So this MAY NOT work for you on GKE, but Azure AKS (managed Kubernetes) has a similar problem with the same error message, so this might be helpful to someone.
The solution to this for me was to scale the nodes in my Cluster from the Azure Kubernetes service blade web console.
Workaround / Solution
Log into the Azure (or GKE) Console — Kubernetes Service UI.
Scale your cluster up by 1 node.
Wait for scale to complete and attempt to connect (you should be able to).
Scale your cluster back down to the normal size to avoid cost increases.
Total time it took me ~2 mins.
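The same scale-up/scale-down cycle can be driven from the Azure CLI instead of the web console; a sketch, where the resource-group and cluster names are placeholders to replace with your own:

```shell
# Placeholder names -- substitute your own.
RESOURCE_GROUP="my-resource-group"
CLUSTER_NAME="my-aks-cluster"

# Scale the cluster's node pool to the count given as the first argument.
scale_cluster() {
  az aks scale --resource-group "$RESOURCE_GROUP" \
    --name "$CLUSTER_NAME" --node-count "$1"
}

# Step 1: scale up by one node (e.g. from 3 to 4):  scale_cluster 4
# Step 2: once kubectl connects again, scale back:  scale_cluster 3
echo "defined scale_cluster"
```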
More Background Info on the Issue
I added this to the full ticket description write-up that I posted over here (if you want more info, have a read):
'Unable to connect Net/http: TLS handshake timeout' — Why can't Kubectl connect to Azure AKS server?

Related

Kubernetes suddenly stopped being able to connect to server

I was successfully able to connect to the Kubernetes cluster and work with the services and pods. At one point this changed, and every time I try to connect to the cluster I get the following error:
PS C:\Users\xxx> kubectl get pods
Unable to connect to the server: error parsing output for access token command "C:\\Program Files (x86)\\Google\\Cloud SDK\\google-cloud-sdk\\bin\\gcloud.cmd config config-helper --format=json": yaml: line 4: could not find expected ':'
I am unsure of what the issue is. Google unfortunately doesn't yield any results for me either.
I have not changed any config files or anything. It was a matter of it working one second and not working the next.
Thanks.
It looks like the default auth plugin for GKE might be buggy on Windows. kubectl is trying to run gcloud to get a token to authenticate to your cluster. If you run kubectl config view you can see the command it tried to run; run it yourself to see if/why it fails.
As Alexandru said, a workaround is to use Google Application Default Credentials. Actually, gcloud container has built in support for doing this, which you can toggle by setting a property:
gcloud config set container/use_application_default_credentials true
Try running that, or set the environment variable CLOUDSDK_CONTAINER_USE_APPLICATION_DEFAULT_CREDENTIALS to true.
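For reference, a sketch of setting that variable in each shell mentioned in this thread:

```shell
# cmd.exe:
#   set CLOUDSDK_CONTAINER_USE_APPLICATION_DEFAULT_CREDENTIALS=true
# PowerShell:
#   $env:CLOUDSDK_CONTAINER_USE_APPLICATION_DEFAULT_CREDENTIALS = "true"
# Unix shell (also what Cloud Shell uses):
export CLOUDSDK_CONTAINER_USE_APPLICATION_DEFAULT_CREDENTIALS=true
echo "$CLOUDSDK_CONTAINER_USE_APPLICATION_DEFAULT_CREDENTIALS"
```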
Referenced from here
The workaround for this issue being:
gcloud container clusters get-credentials <cluster-name>
If you don't know your cluster name, find it with gcloud container clusters list. Finally, if those run without issues, do gcloud auth application-default login and log in with the relevant details.

Problems in connection when I try to connect GKE cluster using kubectl

I have a running cluster on Google Cloud Kubernetes engine and I want to access that using kubectl from my local system.
I tried installing kubectl with gcloud but it didn't work. Then I installed kubectl using apt-get. When I try to check its version using kubectl version, it says
Unable to connect to the server: EOF. I also don't have the file ~/.kube/config, and I am not sure why. Can someone please tell me what I am missing here? How can I connect to the already-running cluster in GKE?
gcloud container clusters get-credentials ... will authenticate you against the cluster using your gcloud credentials.
If successful, the command adds the appropriate configuration to ~/.kube/config so that you can use kubectl.
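To confirm what get-credentials wrote, the resulting context can be inspected (a sketch; `--minify` limits the output to the current context):

```shell
# Shows the cluster endpoint, certificate-authority data, and the auth
# entry kubectl will use -- useful when the EOF error persists.
if command -v kubectl >/dev/null 2>&1; then
  kubectl config view --minify || echo "no current context configured yet"
else
  echo "kubectl not found on PATH"
fi
```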

Azure Service Fabric Cluster returns nothing for code-versions and config-versions

In short: both the "sfctl cluster code-versions" and "sfctl cluster config-versions" return empty arrays. Is this a symptom of a problem with the cluster?
Background: I am attempting to follow the Create a Linux container app tutorial, for learning about Service Fabric; but I have run into a problem when the application upload fails with a timeout.
On investigating this, I found that the other sfctl cluster commands (e.g. sfctl cluster health) all worked and returned useful data - except code-versions and config-versions, which both return an empty array:
$ sfctl cluster code-versions
[]
$ sfctl cluster config-versions
[]
I'm not sure if that's unhealthy, or what kind of data they might be returning.
Other notes:
The cluster is secured with a self-signed certificate; this is installed locally and works correctly, but both the above commands also log a warning:
~/.local/lib/python3.5/site-packages/urllib3/connectionpool.py:847: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings InsecureRequestWarning)
However, the same warning is logged for the other commands (e.g. sfctl cluster health) and doesn't stop them from working.
The cluster is at version 6.4.634.1, on Linux
Service Fabric Explorer shows everything as Healthy: Cluster Health State, System Application Health State, and the 3 nodes.
The Azure portal shows the cluster status as "Baseline upgrade"
Explorer shows the cluster as having Code Version "0.0.0.0"

Google Cloud Kubernetes Engine troubleshooting

Is anyone experiencing issues with Google Kubernetes Engine (specifically in the us-central1-b region, April 3rd 11pm EST)?
I'm not able to see my workload or any of my cluster configurations in the Google Kubernetes Engine section; it is intermittent (one minute it's there, then it disappears).
I also can't connect to the Kubernetes section of the Google Cloud Console to check on my pods!
I see no information about any outages as far as I can tell. It could have been a network connection issue on your side; please check.
It appears to me that this is some issue with getting metadata from the cluster. You could check the following to help you troubleshoot or see if there are any underlying problems:
$ docker info
$ docker version
$ kubectl version
$ gcloud container clusters describe <your-cluster-name> --zone <your-cluster-zone>
$ kubectl get componentstatuses

Kubernetes unable to pull images from gcr.io

I am trying to setup Kubernetes for the first time. I am following the Fedora Manual installation guide: http://kubernetes.io/v1.0/docs/getting-started-guides/fedora/fedora_manual_config.html
I am trying to get the Kubernetes add-ons running, specifically the kube-ui. I created the service and replication controller like so:
kubectl create -f cluster/addons/kube-ui/kube-ui-rc.yaml --namespace=kube-system
kubectl create -f cluster/addons/kube-ui/kube-ui-svc.yaml --namespace=kube-system
When I run
kubectl get events --namespace=kube-system
I see errors such as this:
Failed to pull image "gcr.io/google_containers/pause:0.8.0": image pull failed for gcr.io/google_containers/pause:0.8.0, this may be because there are no credentials on this request. details: (Authentication is required.)
How am I supposed to tell Kubernetes to authenticate? This isn't covered in the documentation. So how do I fix this?
This happened due to a recent outage in GCE storage, as a result of which everyone saw this error while pulling images from gcr.io (which uses GCE storage on the backend).
Are you still seeing this error?
As the message says, you need credentials. Are you using Google Container Engine? Then you need to run
gcloud config set project <your-project>
gcloud config set compute/zone <your-zone, like us-central1-f>
gcloud beta container clusters get-credentials --cluster <your-cluster-name>
Then your cluster will have the credentials.
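For clusters outside GKE (like the Fedora manual setup in the question), one route is an image pull secret built from a GCP service-account key. A sketch; the secret name and key-file path are hypothetical:

```shell
# Hypothetical names: gcr-pull-secret and key.json (a downloaded GCP
# service-account key with read access to the project's GCR storage).
create_gcr_secret() {
  kubectl create secret docker-registry gcr-pull-secret \
    --docker-server=gcr.io \
    --docker-username=_json_key \
    --docker-password="$(cat key.json)" \
    --namespace=kube-system
}
# Then reference the secret from the pod spec:
#   imagePullSecrets:
#   - name: gcr-pull-secret
echo "defined create_gcr_secret"
```

The `_json_key` username tells the registry that the password is the full JSON key file.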