Hosting: Azure CentOS VMs running RKE1
Rancher Version: v2.6.2
Kubernetes Version: 1.18.6
Looking for help diagnosing this issue; I get two error messages
From Rancher:
Cluster health check failed: Failed to communicate with API server during namespace check: Get "https://<NODE_IP>:6443/api/v1/namespace/kube-system?timeout=45s": write tcp 172.16.0.2:443 -> <NODE_IP>:7832: i/o timeout
From kubectl:
unable to create impersonator account: error setting up impersonation for user user-sd7q9: Put "https://<NODE_IP>:6443/apis/rbac.authorization.k8s.io/v1/clusterroles/cattle-impersonation-user-sd7q9": write tcp 172.17.0.2:443 -> <NODE_IP>:7832: i/o timeout
Nothing appears to be broken and my applications are still available via ingress.
https://github.com/rancher/rancher/issues/34671
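A couple of diagnostic checks that may help narrow this down (a sketch, assuming a standard Rancher setup where the cluster agent runs in cattle-system; substitute the real node IP):
curl -k --max-time 10 https://<NODE_IP>:6443/healthz    # does the downstream API server answer at all?
kubectl -n cattle-system logs -l app=cattle-cluster-agent --tail=50    # look for the same timeouts on the agent side
If the health endpoint responds from the node but not from where Rancher runs, the problem is most likely network/firewall between the Rancher server and the downstream node rather than the cluster itself.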
I am trying to implement CI/CD using GitLab + Terraform against a K8s cluster; the K8s control plane (master node) was set up on CentOS.
However, the pipeline job fails with the following error:
Error: Failed to get existing workspaces: Get "https://192.xx.xx.xx/api/v1/namespaces/default/secrets?labelSelector=tfstate%3Dtrue": dial tcp 192.xx.xx.xx:443: i/o timeout
From the error mentioned above (default/secrets?labelSelector=tfstate%3Dtrue), I assume it is related to a missing Terraform state secret in the default namespace.
Example (Terraform state secrets listed from my Windows machine):
PS C:\> kubectl get secret
NAME                    TYPE                                  DATA   AGE
default-token-7mzv6     kubernetes.io/service-account-token   3      27d
tfstate-default-state   Opaque                                1      15h
However, I am not sure which process creates the tfstate secret, or whether we should create it manually.
Kindly let me know if my understanding is wrong or if I have missed anything else.
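For reference, the lookup the backend performs can be reproduced directly with kubectl (the label selector comes straight from the URL in the error), for example:
kubectl get secrets -n default -l tfstate=true
If the state secret is listed there, the problem is more likely connectivity or credentials than a missing secret.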
EDIT
The issue mentioned above occurred because the existing GitLab Runner was on a different subnet (e.g. 172.xx.xx.xx instead of 192.xx.xx.xx).
I was asked to use a different GitLab Runner that runs on the same subnet, and now it throws the following error:
Error: Failed to get existing workspaces: Get "https://192.xx.xx.xx:6443/api/v1/namespaces/default/secrets?labelSelector=tfstate%3Dtrue": x509: certificate signed by unknown authority
Now I am a bit confused whether the certificate issue is between the GitLab Runner and the GitLab server, between the GitLab server and the K8s cluster, or something else.
You have configured Kubernetes as the remote state backend for your Terraform configuration. The error means that the backend is trying to query existing secrets to determine which workspaces are configured. The x509: certificate signed by unknown authority error indicates that the KUBECONFIG the remote state backend uses does not match the CA of the API server you are connecting to.
If the runners are K8s pods themselves, make sure you provide a KUBECONFIG that matches your target cluster, and that the remote state does not configure itself as in-cluster by reading the service account token every K8s pod has - which in most cases will only work for the cluster the pod is running on.
You don't provide enough information to be more specific, but big picture: you have to configure the state backend, and any provider that connects to K8s. Theoretically, the state backend secrets and the K8s resources do not have to be on the same cluster, meaning you may need different configurations for the state backend and the K8s providers.
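As a rough way to confirm the CA mismatch (a diagnostic sketch, assuming you can run kubectl and openssl on the runner with the same kubeconfig the pipeline uses, and that the first cluster entry is the relevant one):
kubectl config view --raw -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' | base64 -d | openssl x509 -noout -subject    # CA the kubeconfig trusts
openssl s_client -connect 192.xx.xx.xx:6443 -showcerts </dev/null 2>/dev/null | openssl x509 -noout -issuer    # issuer of the certificate the API server presents
If the subject of the first command does not match the issuer from the second, the kubeconfig on the runner was generated for a different cluster or an outdated CA.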
We have an on-premise Kubernetes deployment in our data center. I just finished deploying the pods for Dex, configured it to hook up with our LDAP server to allow LDAP-based authentication via Dex, ran tests, and was able to retrieve the OpenID Connect token for authentication.
Now I would like to change our on-premise k8s API server startup parameters to enable OIDC and point it to the Dex container.
How do I add OIDC to the API server startup command without downtime to our k8s cluster? I was reading this doc https://kubernetes.io/docs/reference/access-authn-authz/authentication/ but it just says "enable the required flags" without the steps.
Thanks!
I installed Dex + Active Directory integration a few months ago on a cluster installed by kubeadm.
Let's assume that Dex is now running and is accessible at
https://dex.example.com.
In that case, enabling OIDC at the API server level has 3 steps:
These steps have to be done on each of your Kubernetes master nodes.
1- SSH to your master node.
$ ssh root@master-ip
2- Edit the Kubernetes API configuration.
Add the OIDC parameters and modify the issuer URL accordingly.
$ sudo vim /etc/kubernetes/manifests/kube-apiserver.yaml
...
    command:
    - /hyperkube
    - apiserver
    - --advertise-address=x.x.x.x
    ...
    - --oidc-issuer-url=https://dex.example.com   # <-- 🔴 Please focus here
    - --oidc-client-id=oidc-auth-client           # <-- 🔴 Please focus here
    - --oidc-username-claim=email                 # <-- 🔴 Please focus here
    - --oidc-groups-claim=groups                  # <-- 🔴 Please focus here
...
3- The Kubernetes API will restart by itself.
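To confirm the restarted API server actually picked up the new flags, something like this should work on a setup where the static pod carries the usual component=kube-apiserver label (not part of the original steps, just a sanity check):
kubectl -n kube-system get pods -l component=kube-apiserver
kubectl -n kube-system get pod -l component=kube-apiserver -o yaml | grep oidc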
I also recommend checking a full guide like this tutorial.
The OIDC flags are for the Kubernetes API server. You have not mentioned how you installed Kubernetes on-prem. Ideally, you should have multiple master nodes fronted by a load balancer.
So you would disable traffic to one master node from the load balancer, log in to that master node, edit the API server manifest in /etc/kubernetes/manifests, and add the OIDC flags. Once you change the manifest, the API server pod will be restarted automatically.
You repeat the same process for all master nodes, and since at any given point in time you have at least one master node available, there should not be any downtime.
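Before re-enabling a master in the load balancer, a quick way to verify its API server came back healthy (a sketch, assuming the default secure port 6443; the health endpoints are served anonymously on standard setups):
curl -k https://<master-node-ip>:6443/healthz    # should return: ok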
After deploying the Bitnami Helm chart for Airflow on a Kubernetes cluster, although everything works, the logs are still unreachable.
It turns out that the Helm chart used to deploy relies on a headless service for communication between the Celery workers, and it is not able to show me the logs.
I have set the hostname_callable setting correctly, and yet the logs always pick up the name of the headless service as their hostname, not the DNS name.
*** Log file does not exist: /opt/bitnami/airflow/logs/secondone/s3files/2020-06-19T10:35:00+00:00/1.log
*** Fetching from: http://mypr-afw-worker-1.mypr-afw-headless.mynamespace.svc.cluster.local:8793/log/secondone/s3files/2020-06-19T10:35:00+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='mypr-afw-worker-1.mypr-afw-headless.mynamespace.svc.cluster.local', port=8793): Max retries exceeded with url: /log/secondone/s3files/2020-06-19T10:35:00+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f12917f5630>: Failed to establish a new connection: [Errno 111] Connection refused',))
Any help in this regard would be appreciated! thanks!
How are you setting the hostname? It seems you need to pass the hosts as an array:
## The list of hostnames to be covered with this ingress record.
## Most likely this will be just one host, but in the event more hosts are needed, this is an array
##
hosts:
- name: airflow.local
  path: /
or --set ingress.hosts[0].name=airflow.local --set ingress.hosts[0].path=/ in the helm install command
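Put together, a hedged example of applying that to an existing release (the release name mypr-afw is inferred from the pod names in the question and the bitnami repo is assumed to be added; adjust both to your setup):
helm upgrade --install mypr-afw bitnami/airflow \
  --set ingress.hosts[0].name=airflow.local \
  --set ingress.hosts[0].path=/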
I am working on Azure Kubernetes, where we can store Docker images in Azure. Here I am trying to check my kubectl version, and I am getting:
Unable to connect to the server: dial tcp [::1]:8080: connectex: No connection could be made because the target machine actively refused it.
For this I followed MSDN: Building Microservices with AKS and VSTS – Part 2 and MS Docs: Kubernetes on Windows.
So, can you please suggest how to resolve this issue?
I am on Windows 10, and in my case I had not enabled Kubernetes in Docker Desktop.
As you can see here, there are no contexts available.
So go to the settings of Docker Desktop and enable it as follows.
Now run a command as follows.
kubectl config get-contexts
Ensure you see something like this.
You can also try listing the nodes as follows.
kubectl get nodes
I think you might have missed configuring the cluster; for that you need to run the below command in your command prompt.
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster
The above CLI command creates the config file, with complete cluster and node details, on your local machine.
After that, run the kubectl get nodes command in your command prompt, and you will get the list of nodes inside the cluster, like in the below image.
For reference follow this Deploy an Azure Kubernetes Service (AKS) cluster.
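Putting the whole check together (same placeholder resource group and cluster name as above; the context name usually matches the cluster name):
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster
kubectl config current-context
kubectl get nodes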
If you can see that your config file is correctly configured by going to $HOME/.kube/config (Linux) or %UserProfile%\.kube\config (Windows) but you are still receiving the error message, try running the command line as an administrator.
More information on the config file can be found here: https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/
In my case, I was shuffling between an az aks K8s cluster and local docker-desktop.
So every time I change the cluster context I need to restart Docker, or else I get the same error described:
Unable to connect to the server: dial tcp 127.0.0.1:6443: connectex: No connection could be made because the target machine actively refused it.
PS: make sure your cluster is started, as shown in this picture (showing "Stop local cluster").
For me it appeared to be due to Windows not having a HOME environment variable set. According to the docs, kubectl will use the config file $HOME/.kube/config. But since this variable isn't set on Windows, it can't locate the file.
I created a HOME variable with the same value as USERPROFILE and it started working.
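One way to do that from a Command Prompt (then open a new terminal so the change is picked up):
setx HOME "%USERPROFILE%"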
I'm using Hyper-V on local Windows, and I hit this error because I hadn't configured minikube.
(I know the question is about Azure, not minikube. But this article is on the top for the error message. So, I've put the solution here.)
1. Enable Hyper-V.
Type systeminfo in your terminal. If you can find the line below,
Hyper-V Requirements: A hypervisor has been detected. Features required for Hyper-V will not be displayed.
then Hyper-V is working correctly.
If you can't, enable it from settings.
2. Create a Hyper-V network switch
Open Hyper-V Manager. (Searching for it is the fastest way.)
Next, click your PC name on the left.
Then you can find the Virtual Switch Manager menu on the right.
Click it and create an External virtual switch named "Minikube Switch".
Click Apply to create it.
3. Start minikube
Go back to the terminal and type:
minikube start --vm-driver hyperv --hyperv-virtual-switch "Minikube Switch"
For more information, check the steps in this article.
Check that Docker is running and that you have started minikube, or whichever cloud Kubernetes you are using.
My issue was resolved after running "minikube start --driver=docker".
Essentially, this problem occurs if your minikube or kind cluster isn't configured. Just try restarting your minikube or kind cluster. If that doesn't solve your problem, then try restarting the hypervisor that minikube uses.
minikube start
This command solved my issue.
I was facing the same error while running the command "kubectl get pods".
The issue was resolved by following the steps below:
a) First, find out the current context:
kubectl config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
b) If no context is set, then select one manually using:
kubectl config use-context <your-context>
Hope this will help you.
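For example (docker-desktop is just an illustrative context name; pick one of the names shown by get-contexts):
kubectl config get-contexts
kubectl config use-context docker-desktop
kubectl get pods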
If you're facing this error on Windows, it's possible that your Docker instance is not running.
These are the steps I followed to replicate the above error:
I stopped Docker and then tried to start up an nginx deployment. Doing this caused the error mentioned above.
How did I solve it?
Check if minikube is running; in my case it was not.
Start minikube.
Retry applying your configuration; in my case, see the screenshot below.
When you see that your deployment has been created, then all should be fine.
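As a rough sketch of those steps (nginx-deployment.yaml is a hypothetical manifest name; use your own):
minikube status                           # check whether the cluster is running
minikube start                            # start it if it is not
kubectl apply -f nginx-deployment.yaml    # re-apply your deployment
kubectl get deployments                   # confirm it was created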
I had exactly the same problem even after having the correct config (created by running an Azure CLI command).
It seems that kubectl expects the HOME environment variable to be set, but it did not exist for me. There is, however, a solution:
If you add a KUBECONFIG environment variable that points to the config, it will start working.
Example:
setx KUBECONFIG %UserProfile%\.kube\config
When the variable is present, kubectl has no trouble reading the file.
P.S. It is an alternative to setting a HOME variable as suggested in another answer.
The Azure self-hosted agent doesn't have permission to access the Kubernetes cluster:
Remove the Azure self-hosted agent: .\config.cmd remove
Configure it again (.\config.cmd) with a user that has permission to access the Kubernetes cluster.
I encountered a similar problem:
> kubectl cluster-info
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Unable to connect to the server: dial tcp xxx.x.x.x:8080: connectex: No connection could be made because the target machine actively refused it.
> kubectl cluster-info dump
Unable to connect to the server: dial tcp xxx.0.0.x:8080: connectex: No connection could be made because the target machine actively refused it.
This setup was working fine until Docker for Desktop brought its own copy of kubectl. There are two ways to overcome this situation:
1 - Quit/stop Docker for Desktop while using the cluster
2 - Set the KUBECONFIG file path
I tried both options and they worked.
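For option 2, a minimal sketch of pointing kubectl at the right file for the current PowerShell session (adjust the path to wherever your cluster's kubeconfig actually lives):
$env:KUBECONFIG = "$HOME\.kube\config"
kubectl cluster-info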
I found a good example of a .kube/config file; sharing it here for quick reference:
apiVersion: v1
clusters:
- cluster:
    certificate-authority: fake-ca-file
    server: https://1.2.3.4
  name: development
- cluster:
    insecure-skip-tls-verify: true
    server: https://5.6.7.8
  name: scratch
contexts:
- context:
    cluster: development
    namespace: frontend
    user: developer
  name: dev-frontend
- context:
    cluster: development
    namespace: storage
    user: developer
  name: dev-storage
- context:
    cluster: scratch
    namespace: default
    user: experimenter
  name: exp-scratch
current-context: ""
kind: Config
preferences: {}
users:
- name: developer
  user:
    client-certificate: fake-cert-file
    client-key: fake-key-file
- name: experimenter
  user:
    password: some-password
    username: exp
Reference: https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/
Following @ilya-chernomordik,
I've added my config path to the system variables by doing
setx KUBECONFIG "D:\Minikube\Minikube.minikube\config"
I have changed the default location from the C: drive to the D: drive, as I have less space on C.
Now the problem is fixed.
Edit: after 5 minutes the API server stopped again. It's been more than 5-6 hours that I've been trying to solve this issue. I'm not sure why this problem is happening, even after adding the correct path.
On Rancher Desktop, make sure the context is correctly chosen.
In my situation, I'm on Windows with Docker Desktop in a simple scenario just for studying, but the case is this:
Docker version 20.10 or above comes with Kubernetes installed, so it isn't necessary to install a cluster manager like minikube; you just need to enable Kubernetes in the Docker Desktop configuration, like this:
Go to Docker Desktop: Settings > Kubernetes > check the box in the Enable Kubernetes section, and then click Restart Kubernetes Cluster.
When we do this, Docker provides everything needed for Kubernetes to work properly.
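After the restart you can verify that kubectl is talking to the built-in cluster, assuming the usual docker-desktop context name:
kubectl config use-context docker-desktop
kubectl get nodes    # the single docker-desktop node should show as Ready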
Referenced by: Blog