Waiting for pods: apiserver gets stuck - Kubernetes

I am trying to implement an audit policy.
My YAML:
~/.minikube/addons$ cat audit-policy.yaml
# Log all requests at the Metadata level.
apiVersion: audit.k8s.io/v1beta1
kind: Policy
rules:
- level: Metadata
The pods get stuck when I start minikube with these extra configs:
minikube start --extra-config=apiserver.Authorization.Mode=RBAC --extra-config=apiserver.Audit.LogOptions.Path=/var/logs/audit.log --extra-config=apiserver.Audit.PolicyFile=/etc/kubernetes/addons/audit-policy.yaml
πŸ˜„ minikube v0.35.0 on linux (amd64)
πŸ’‘ Tip: Use 'minikube start -p <name>' to create a new cluster, or 'minikube delete' to delete this one.
πŸ”„ Restarting existing virtualbox VM for "minikube" ...
βŒ› Waiting for SSH access ...
πŸ“Ά "minikube" IP address is 192.168.99.101
🐳 Configuring Docker as the container runtime ...
✨ Preparing Kubernetes environment ...
β–ͺ apiserver.Authorization.Mode=RBAC
β–ͺ apiserver.Audit.LogOptions.Path=/var/logs/audit.log
β–ͺ apiserver.Audit.PolicyFile=/etc/kubernetes/addons/audit-policy.yaml
🚜 Pulling images required by Kubernetes v1.13.4 ...
πŸ”„ Relaunching Kubernetes v1.13.4 using kubeadm ...
βŒ› Waiting for pods: apiserver
Why?
I can do this:
minikube start
Then I run minikube ssh:
$ sudo bash
$ cd /var/logs
bash: cd: /var/logs: No such file or directory
ls
cache empty lib lock log run spool tmp
How do I apply the extra-config correctly?

I don't have good news. Although you made a small mistake with /var/logs (the directory in the VM is /var/log), it does not matter in this case, as there seems to be no way to implement an audit policy in Minikube right now (there are a few documented approaches, but they all seem to fail).
You can try the approaches presented in the GitHub issues and other links below, but I tried probably all of them and they do not work with the current Minikube version. You might be able to make this work with an earlier version, as it seems that at some point the way you used in your question was possible, but it is not in the current release. I spent some time on the ways from the links and a couple of ideas of my own without success; maybe you will be able to find the missing piece. A rough sketch of what the newer minikube documentation suggests follows the links.
You can find more information in these documents:
Audit Logfile Not Created
Service Accounts and Auditing in Kubernetes
fails with -extra-config=apiserver.authorization-mode=RBAC and audit logging: timed out waiting for kube-proxy
How do I enable an audit log on minikube?
Enable Advanced Auditing Webhook Backend Configuration
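For completeness, here is roughly what the newer minikube documentation (the audit-policy tutorial) suggests; I have not verified it against the 0.35.0 release from the question, so treat it as a sketch. Files placed under ~/.minikube/files are synced into the VM, and the standard kube-apiserver audit flags are passed through --extra-config:

# copy the policy to a path that is synced into the VM and mounted by the apiserver pod
mkdir -p ~/.minikube/files/etc/ssl/certs
cp ~/.minikube/addons/audit-policy.yaml ~/.minikube/files/etc/ssl/certs/audit-policy.yaml

# pass the apiserver audit flags via extra-config
minikube start \
  --extra-config=apiserver.audit-policy-file=/etc/ssl/certs/audit-policy.yaml \
  --extra-config=apiserver.audit-log-path=-

With audit-log-path=- the audit events go to the apiserver's stdout, so they can be read with kubectl logs on the kube-apiserver pod in kube-system.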

Kubectl connection refused existing cluster

Hope someone can help me.
To describe the situation in short, I have a self managed k8s cluster, running on 3 machines (1 master, 2 worker nodes). In order to make it HA, I attempted to add a second master to the cluster.
After some failed attempts, I found out that I needed to add a controlPlaneEndpoint setting to the kubeadm-config ConfigMap. So I did, setting it to masternodeHostname:6443.
I generated the certificate and join command for the second master, and after running it on the second master machine, it failed with
error execution phase control-plane-join/etcd: error creating local etcd static pod manifest file: timeout waiting for etcd cluster to be available
Checking the first master now, I get connection refused for the IP on port 6443. So I cannot run any kubectl commands.
Tried recreating the .kube folder, with all the config copied there, no luck.
Restarted kubelet, docker.
The containers running on the cluster seem ok, but I am locked out of any cluster configuration (dashboard is down, kubectl commands not working).
Is there any way I make it work again? Not losing any of the configuration or the deployments already present?
Thanks! Sorry if it’s a noob question.
Cluster information:
Kubernetes version: 1.15.3
Cloud being used: (put bare-metal if not on a public cloud) bare-metal
Installation method: kubeadm
Host OS: RHEL 7
CNI and version: weave 0.3.0
CRI and version: containerd 1.2.6
This is an old, known problem with Kubernetes 1.15 [1,2].
It is caused by a short etcd timeout period. As far as I'm aware it is a hard-coded value in the source and cannot be changed (a feature request to make it configurable is open for version 1.22).
Your best bet would be to upgrade to a newer version, and recreate your cluster.
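For reference, on a current kubeadm release the control-plane join you attempted looks roughly like this (a sketch; the token, hash and certificate key are placeholders printed by the commands on the existing master, and on 1.15 the flags were still named --experimental-upload-certs / --experimental-control-plane):

# on the existing control-plane node: re-upload the control-plane certificates and note the key it prints
sudo kubeadm init phase upload-certs --upload-certs

# on the existing node: print a fresh join command (token + CA cert hash)
kubeadm token create --print-join-command

# on the new control-plane node: combine the two outputs
sudo kubeadm join <controlPlaneEndpoint>:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane --certificate-key <key>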

Unable to connect to the server: net/http: TLS handshake timeout

On minikube for Windows I created a deployment on the Kubernetes cluster, then tried to scale it by changing replicas from 1 to 2. After that kubectl hangs and my disk usage is at 100%.
I only have one container in my deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: first-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      run: app
  template:
    metadata:
      labels:
        run: app
    spec:
      containers:
      - name: demo
        image: ner_app
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 5000
All I did was run this after the pods were successfully deployed and running:
kubectl scale --replicas=2 deployment first-deployment
In another terminal I was watching the pods using
kubectl get pods --watch
But everything is unresponsive and I'm not sure how to recover from this.
When I run kubectl get pods again it gives the following message
PS D:\docker\ner> kubectl get pods
Unable to connect to the server: net/http: TLS handshake timeout
Is there a way to recover, or cancel whatever process is running?
Also, my VMs are on Hyper-V for Windows 10 Pro (minikube and Docker Desktop); both have the default RAM allocated - 2048MB.
The container in my pod is a machine learning process, and the model it loads could be large, in the order of 200MB to 300MB.
You may have some proxy problems. Try the following commands:
$ unset http_proxy
$ unset https_proxy
and repeat your kubectl call.
For me, the problem was that Docker ran out of memory. (EDIT: Possibly, anyway; I wrote this post a while ago and am now not so sure that is the root cause, but I did not write down my rationale.)
Anyway, to fix:
Fully close your k8s emulator. (docker desktop, minikube, etc.)
Shutdown WSL2. (wsl --shutdown) [EDIT: This step is apparently not necessary -- at least not always, since this time I skipped it, and the problem still resolved.]
Restart your k8s emulator.
Rerun the commands you wanted.
Sometimes it also works to simply:
Right-click the Docker Desktop tray icon, press "Restart Docker", and wait a few minutes for things to restart. (Sometimes this fails, with Docker Desktop saying "Docker failed to start", so I'd generally recommend the more thorough process above.)
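For reference, the thorough sequence above condenses to something like this on Windows (a sketch assuming Docker Desktop with the WSL2 backend; run the first command in an elevated PowerShell):

wsl --shutdown      # stop the WSL2 VM that backs Docker Desktop
# start Docker Desktop again from the Start menu or tray icon and wait for it to report "running"
minikube start      # bring the cluster back up
kubectl get pods    # verify the API server answers again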
This just happened to me on a new Windows 10 install with an Ubuntu distro in WSL2. I solved the problem by running:
$ sudo ifconfig eth0 mtu 1350
(BTW, I was on a VPN connection when trying the 'kubectl get pods' command)
You can set up resource limits on deployments so that pods will not use all of the available resources on the node.
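A hedged sketch of what that could look like in the Deployment from the question (the request and limit values are illustrative and should be sized to your model, which you said is 200MB to 300MB):

    spec:
      containers:
      - name: demo
        image: ner_app
        imagePullPolicy: IfNotPresent
        resources:
          requests:
            memory: "512Mi"   # illustrative: enough headroom for the model
            cpu: "250m"
          limits:
            memory: "1Gi"     # illustrative upper bound
            cpu: "500m"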
In my case I have a private EKS cluster and port 443 (HTTPS) was not enabled in the security groups.
My issue was solved after enabling port 443 (HTTPS) in the security groups.
Refer to the AWS documentation for more details: "You must ensure that your Amazon EKS control plane security group contains rules to allow ingress traffic on port 443 from your connected network"
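If you manage the security group yourself, the rule can be added with the AWS CLI along these lines (a sketch; the group ID and CIDR are placeholders for your cluster's control plane security group and your connected network):

aws ec2 authorize-security-group-ingress \
  --group-id <control-plane-security-group-id> \
  --protocol tcp --port 443 \
  --cidr <connected-network-cidr>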
I solved this problem by executing the following command:
minikube delete
and then starting it again:
minikube start --vm-driver="virtualbox"
Note that if you use this approach your pods will be deleted, and when you run kubectl get pods you will see this result:
No resources found in default namespace.
You could try $ unset all_proxy to reset the socket proxy.
Also, if you're connected to a VPN, try disconnecting - it seems that can interfere with connecting to a cluster.
I think the other answers don't really mention or refer to the vpn and proxy documentation for minikube: https://minikube.sigs.k8s.io/docs/handbook/vpn_and_proxy/
The NO_PROXY variable here is important: without setting it, minikube may not be able to access resources within the VM. minikube uses the following IP ranges, which should not go through the proxy:
192.168.99.0/24: Used by the minikube VM. Configurable for some hypervisors via --host-only-cidr
192.168.39.0/24: Used by the minikube kvm2 driver.
192.168.49.0/24: Used by the minikube docker driver’s first cluster.
10.96.0.0/12: Used by service cluster IPs. Configurable via --service-cluster-ip-range
So adding those IP ranges to your NO_PROXY environment variable should fix the issue.
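Concretely, that means exporting something along these lines before starting minikube (a sketch; trim the list to the driver you actually use):

export NO_PROXY=localhost,127.0.0.1,192.168.99.0/24,192.168.39.0/24,192.168.49.0/24,10.96.0.0/12
export no_proxy=$NO_PROXY   # some tools read the lowercase variant
minikube start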
Simply closing cmd, opening it again, then running
minikube start
and then executing the commands again solved this issue for me.
P.S.: minikube start took less than a minute.
Adding the IP address to the no_proxy list worked for me.
Obtain the IP address from ip addr output.
export no_proxy=localhost,127.0.0.1,<IP_ADDRESS>
Then restart minikube and it will work.
But if you don't want to delete it, you can just switch to another cluster and then switch back. I just click another Kubernetes cluster (e.g. docker-desktop) and then click back to the cluster I want to run (e.g. minikube).
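The same switch can be done from the command line, assuming the usual context names for Docker Desktop and minikube:

kubectl config get-contexts            # list the available contexts
kubectl config use-context docker-desktop
kubectl config use-context minikube    # switch back to the cluster you want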
If you're on Linux or Mac, go to VirtualBox and on the toolbar choose 'Global Tools'. If you see two machines using the same IP address, you should remove one of them.
Since this answer comes up first when searching for the net/http: TLS handshake timeout error: for those having the issue with AWS EKS (and likely any Kubernetes cluster), NO_PROXY solves the problem by adding the related IP/host to the environment variable, as suggested in the comments on the first answer.
For AWS EKS (when seeing this intermittently after a vpc-cni addon upgrade), replace with the specific region or a single URL for your use case:
NO_PROXY=$NO_PROXY;eks.amazonaws.com
At least for Windows 10 and 11:
PS C:\> oc rollback dc/my-app
Unable to connect to the server: net/http: TLS handshake timeout
For OpenShift 4.x the problem is that for some reason you are logged out:
PS C:\> oc status
error: You must be logged in to the server (Unauthorized)
Logging in again, e.g.
oc login -u developer
resolves the problem.
Open PowerShell as an administrator and run the command "wsl --shutdown". You will see the same notification in your open Ubuntu terminal.
Open Docker Desktop.
Open a new terminal.
Run the command "minikube status" in the Ubuntu terminal.
Run the Minikube container. You can do this in Docker Desktop.
Run the command "minikube start".
That's it! You don't need to restart your computer after this, and Minikube should work fine.

Installation error with Istio 1.1.3 and Kubernetes 1.13.5

I'm trying to install Istio 1.1.3 on Kubernetes 1.13.5 deployed on minikube 1.0.0, but I get some errors at the end. Here is the log of the installation:
$ minikube start --memory=4096 --disk-size=30g --kubernetes-version=v1.13.5 --profile=istio
πŸ˜„ minikube v1.0.0 on darwin (amd64)
🀹 Downloading Kubernetes v1.13.5 images in the background ...
πŸ”₯ Creating virtualbox VM (CPUs=2, Memory=4096MB, Disk=30000MB) ...
2019/04/19 19:51:56 No matching credentials were found, falling back on anonymous
2019/04/19 19:51:56 No matching credentials were found, falling back on anonymous
2019/04/19 19:51:56 No matching credentials were found, falling back on anonymous
2019/04/19 19:51:56 No matching credentials were found, falling back on anonymous
πŸ“Ά "istio" IP address is 192.168.99.104
🐳 Configuring Docker as the container runtime ...
🐳 Version of container runtime is 18.06.2-ce
βŒ› Waiting for image downloads to complete ...
✨ Preparing Kubernetes environment ...
πŸ’Ύ Downloading kubeadm v1.13.5
πŸ’Ύ Downloading kubelet v1.13.5
🚜 Pulling images required by Kubernetes v1.13.5 ...
πŸš€ Launching Kubernetes v1.13.5 using kubeadm ...
βŒ› Waiting for pods: apiserver proxy etcd scheduler controller dns
πŸ”‘ Configuring cluster permissions ...
πŸ€” Verifying component health .....
πŸ’— kubectl is now configured to use "istio"
πŸ„ Done! Thank you for using minikube!
$ ./bin/istioctl version
version.BuildInfo{Version:"1.1.3", GitRevision:"d19179769183541c5db473ae8d062ca899abb3be", User:"root", Host:"fbd493e1-5d72-11e9-b00d-0a580a2c0205", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Clean", GitTag:"1.1.2-56-gd191797"}
$ kubectl create -f install/kubernetes/istio-demo.yaml
namespace/istio-system created
customresourcedefinition.apiextensions.k8s.io/virtualservices.networking.istio.io created
customresourcedefinition.apiextensions.k8s.io/destinationrules.networking.istio.io created
customresourcedefinition.apiextensions.k8s.io/serviceentries.networking.istio.io created
customresourcedefinition.apiextensions.k8s.io/gateways.networking.istio.io created
customresourcedefinition.apiextensions.k8s.io/envoyfilters.networking.istio.io created
customresourcedefinition.apiextensions.k8s.io/clusterrbacconfigs.rbac.istio.io created
customresourcedefinition.apiextensions.k8s.io/policies.authentication.istio.io created
customresourcedefinition.apiextensions.k8s.io/meshpolicies.authentication.istio.io created
customresourcedefinition.apiextensions.k8s.io/httpapispecbindings.config.istio.io created
customresourcedefinition.apiextensions.k8s.io/httpapispecs.config.istio.io created
customresourcedefinition.apiextensions.k8s.io/quotaspecbindings.config.istio.io created
customresourcedefinition.apiextensions.k8s.io/quotaspecs.config.istio.io created
customresourcedefinition.apiextensions.k8s.io/rules.config.istio.io created
customresourcedefinition.apiextensions.k8s.io/attributemanifests.config.istio.io created
...
unable to recognize "install/kubernetes/istio-demo.yaml": no matches for kind "attributemanifest" in version "config.istio.io/v1alpha2"
That seems strange: the CRDs appear to have been created successfully, but when they are referenced to create objects of those types, it fails.
I omitted the other errors, but the same happens for "handler", "logentry", "rule", "metric", "kubernetes" and "DestinationRule".
On the documentation page https://istio.io/docs/setup/kubernetes/, it is stated that
Istio 1.1 has been tested with these Kubernetes releases: 1.11, 1.12, 1.13.
Does anyone have an idea?
In the docs there is a step to install the Istio CRDs first (the istio-init step). I don't see that in your snippet, so it seems like that's what you're missing.
So:
$ for i in install/kubernetes/helm/istio-init/files/crd*yaml; do kubectl apply -f $i; done
Your missing CRD seems to be defined in this exact file: https://github.com/istio/istio/blob/master/install/kubernetes/helm/istio-init/files/crd-10.yaml but you should install all of them.
My bad, it seems that I have skipped the first step:
Install all the Istio Custom Resource Definitions (CRDs) using kubectl apply, and wait a few seconds for the CRDs to be committed in the Kubernetes API-server:
$ for i in install/kubernetes/helm/istio-init/files/crd*yaml; do kubectl apply -f $i; done
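As a sanity check after running that loop, wait a few seconds and confirm the CRDs are registered before re-applying the demo manifest (the exact count depends on the Istio 1.1.x release, so treat the grep as a rough check rather than a precise number):

kubectl get crds | grep 'istio.io' | wc -l
kubectl apply -f install/kubernetes/istio-demo.yaml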

Cannot get fabric8 to start on local development machine (OSX or Linux)

I'm trying to give fabric8 a shot, but I'm having issues getting it to start on a local machine running minikube and VirtualBox (I've attempted this on both Linux and OSX). I'm able to get all but one of the pods to start (after manually increasing minikube's VM RAM to 8GB). The expose controller won't start and gives me the following error in the logs:
I0415 14:29:43.431944 1 exposecontroller.go:47] Using build: '2.3.2'
F0415 14:29:43.492059 1 exposecontroller.go:66] failed to create new strategy: failed to create node port expose strategy: failed to list nodes: nodes is forbidden: User "system:serviceaccount:fabric8:exposecontroller" cannot list nodes at the cluster scope
Here are the commands I'm running:
minikube start --cpus=5 --disk-size=50g --memory=8000
curl -sS http://get.fabric8.io/download.txt | bash
gofabric8 start
I also tried creating an OAuth secret via GitHub (using bogus IP address info for the redirect URL), but this doesn't make sense to me because I don't have a domain... Then I ran these:
minikube start --vm-driver=xhyve --cpus=5 --disk-size=50g --memory=8000
minikube addons enable ingress
gofabric8 deploy --package system -n fabric8
That resulted in the exposecontroller working, but then additional pods (keycloak, for example) were created and failed to start.
I've spent hours trying to get this to work and am about to give up. The documentation on GitHub differs from fabric8's site documentation and I just can't get it to work. If someone is able to help, I would greatly appreciate it.
Note:
I've attempted to follow the instructions here:
http://fabric8.io/guide/getStarted/gofabric8.html
Additionally, I attempted to follow this:
https://github.com/fabric8io/fabric8-platform/blob/master/INSTALL.md
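The "cannot list nodes at the cluster scope" message is an RBAC denial for the fabric8:exposecontroller service account. A minimal sketch of a ClusterRole and ClusterRoleBinding that would grant that permission looks like this; the object names are illustrative, gofabric8 normally manages its own RBAC objects, and on older clusters the RBAC API group may still be rbac.authorization.k8s.io/v1beta1, so treat this as something to experiment with rather than a definitive fix:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: exposecontroller-nodes   # illustrative name
rules:
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: exposecontroller-nodes   # illustrative name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: exposecontroller-nodes
subjects:
- kind: ServiceAccount
  name: exposecontroller
  namespace: fabric8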

Cannot create privileged containers

I am using the instructions from https://github.com/kubernetes/kubernetes/blob/master/docs/getting-started-guides/docker-multinode.md to set up a multi-node Kubernetes cluster on VMware vCloud infrastructure.
I was able to get the cluster working, but when I tried the NFS example I was not able to create the NFS container. So I recreated all the VMs and rebuilt Kubernetes from source using:
git clone https://github.com/kubernetes/kubernetes.git
cd kubernetes
sed -i 's/allow_privileged: .*/allow_privileged: true/g' cluster/saltbase/pillar/privilege.sls
./build/run.sh hack/build-cross.sh
cp _output/dockerized/bin/linux/$(dpkg --print-architecture)/kubectl /usr/local/bin
chmod +x /usr/local/bin/kubectl
and continued to set up the Kubernetes cluster, retried the NFS example, and got the following error:
kubectl create -f nfs-server-pod.yaml
The Pod "nfs-server" is invalid.
spec.containers[0].securityContext.privileged: forbidden '<*>(0xc20931650c)true'
I tried with both master and the 1.0.3 release and had the same result.
Can you please tell me how to resolve this issue? Thanks for your support.
We thought that turning privileged containers off by default would be good for security. It turns out to just be a pain point for a lot of people, so we're working to turn it on by default in Kubernetes v1.1.
The --allow-privileged flag has to be set on both the kubelet and the apiserver - please check that.
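On a release that old (1.0.x) that means passing the flag wherever the two binaries are launched - systemd units, static manifests, or the docker-multinode scripts from the guide. A rough sketch (the other flags stand in for whatever your setup already passes):

# wherever the API server is started:
kube-apiserver --allow-privileged=true <existing flags>
# and on every node, wherever the kubelet is started:
kubelet --allow-privileged=true <existing flags>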